CN103516469A

CN103516469A - Device and method of sending and receiving speech frame

Info

Publication number: CN103516469A
Application number: CN201210210328.7A
Authority: CN
Inventors: 严成安; 李加周; 阮亚平; 包乐辉
Original assignee: ZTE Corp
Current assignee: ZTE Corp
Priority date: 2012-06-25
Filing date: 2012-06-25
Publication date: 2014-01-15
Anticipated expiration: 2032-06-25
Also published as: CN103516469B

Abstract

The invention discloses a device and method for sending and receiving speech frames. The device for sending speech frames comprises: a calculating module, which calculates the starting positions of coding synchronization of a first preset coding algorithm and a second preset coding algorithm respectively, based on a first coding delay of the first preset coding algorithm and a second coding delay of the second preset coding algorithm; a coding module, which codes the collected speech frames with the first preset coding algorithm and the second preset coding algorithm respectively, based on the calculated starting positions; and a sending module, which integrates and sends the current speech frame among the collected speech frames coded with the first preset coding algorithm and the previous speech frame among the collected speech frames coded with the second preset coding algorithm. With this technical scheme, primary and secondary coding adapts better to different coding and decoding algorithms, so that the packet-loss resistance of speech transmission is enhanced with low delay and low bandwidth cost.

Description

Device and method for sending and receiving speech frames

Technical field

The present invention relates to the field of communications, and in particular to a device and method for sending and receiving speech frames.

Background art

The move to all-IP (Internet Protocol, IP) transmission is the current trend in multimedia communication. However, a packet-based IP network can only provide best-effort service, so packet loss, delay and jitter always accompany multimedia such as voice and images when they are carried over packet networks. At present, the technical solutions for network packet loss in the related art mainly include interleaving, retransmission and forward error correction.

1. Interleaving divides several data packets into a group and sends them reordered according to a certain rule. It is mainly used to resist bursts of consecutive packet loss. Its advantage is that it adds no bandwidth overhead. Its drawback is that, because packets must be grouped, coded data cannot be sent and decoded immediately, which introduces a large delay and makes it unsuitable for interactive communication services with strict real-time requirements.

2. Retransmission works as follows: when the receiving end detects packet loss, it sends a feedback request for retransmission, and the sending end retransmits the lost packets according to the feedback information. The drawbacks of retransmission are that it only suits services with a feedback channel and that its delay is relatively high. The delay depends mainly on the size and jitter of the network round-trip time (Round Trip Time, RTT) between the sending end and the receiving end, and on the time both ends take to process the relevant information. In addition, both the retransmitted data packets and the feedback messages introduce extra bandwidth overhead.

3. Forward error correction can be divided, according to its relationship to the media content, into media-independent and media-dependent forward error correction. Media-independent forward error correction divides several media packets into a group and uses a forward error correction algorithm, such as a Reed-Solomon code or a convolutional code, to generate redundant data packets. When one or several media packets in the group are lost, the receiving end can use the media packets and redundant packets of the group that were not lost to recover the lost media packets. The advantages of this technique are that it is independent of the specific media content, requires little computation and is easy to implement. One drawback, as with interleaving, is that media packets must be grouped, so coded data cannot be sent and decoded immediately, which introduces a large delay. It also increases bandwidth overhead.

Media-dependent forward error correction, also called primary and secondary coding, encodes the same media unit with two different coding algorithms, or with different levels of the same coding algorithm, to construct two copies that are transmitted separately. The first transmitted copy is called the primary encoding and the second the secondary encoding. When the primary encoding is lost in the network, the receiving end can recover the media unit from the corresponding secondary encoding that was not lost. The advantage of media-dependent forward error correction is its small delay: the primary encoded data is usually transmitted together with the secondary encoding of the previous frame's media unit, so a delay of only one frame is enough to compensate for a lost primary encoding.

However, the primary and secondary coding in the related art has the following drawbacks when the two encodings use different coding algorithms. Because the coding delays of different algorithms may differ, the media recovered from the data that different algorithms produce for the same media unit may not be aligned in time. Moreover, even when aligned in time, the media signals recovered by different codecs can differ considerably in phase or amplitude. Therefore, when the result recovered from the secondary encoded data of the same frame replaces the lost primary encoding, consecutive media units do not join up, which may cause audible glitches ("stuttering") during playback. In addition, common coding algorithms use inter-frame reference, so when the primary encoding is lost, even if the lost media unit is recovered from the secondary encoding, the decoding of primary encoded data is still affected by the loss for some time afterwards.
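As a rough illustration of the media-independent scheme in item 3, the sketch below protects a group of packets with a single XOR parity packet. XOR parity is a deliberately simple stand-in for the Reed-Solomon or convolutional codes named above and repairs at most one loss per group; the function names and the assumption that all packets have equal length are illustrative only.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(group: List[bytes]) -> bytes:
    """Build one redundant packet as the XOR of every media packet in the group."""
    return reduce(xor_bytes, group)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Repair at most one lost packet (marked None) using the survivors and the parity."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("a single XOR parity packet can repair only one loss per group")
    repaired = list(received)
    if missing:
        survivors = [p for p in received if p is not None]
        # parity XOR all survivors leaves exactly the missing packet
        repaired[missing[0]] = reduce(xor_bytes, survivors, parity)
    return repaired
```

The parity packet can only be computed once the whole group is available, which is the grouping delay described above.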

Summary of the invention

The present invention provides a device and method for sending and receiving speech frames, so as to at least solve the problem that, with the primary and secondary coding methods of the related art, a media unit decoded from primary encoded data cannot be seamlessly joined with a media unit recovered from the secondary encoding.

According to one aspect of the present invention, a device for sending speech frames is provided.

The device for sending speech frames according to the present invention comprises: a calculating module, configured to calculate the starting positions of coding synchronization of a first preset coding algorithm and a second preset coding algorithm respectively, according to a first coding delay of the first preset coding algorithm and a second coding delay of the second preset coding algorithm; a coding module, configured to code the collected speech frames with the first preset coding algorithm and the second preset coding algorithm respectively, according to the calculated starting positions; and a sending module, configured to integrate and send the current speech frame among the collected speech frames coded with the first preset coding algorithm and the previous speech frame among the collected speech frames coded with the second preset coding algorithm.

Preferably, the calculating module comprises: an acquiring unit, configured to acquire the first coding delay and the second coding delay respectively; a computing unit, configured to calculate a delay difference from the first coding delay and the second coding delay; and a filling unit, configured to compare the first coding delay with the second coding delay and, before the collected speech frames are coded with the preset coding algorithm having the smaller coding delay, to pad silence data whose length equals the delay difference in front of the collected speech frames.

According to another aspect of the present invention, a device for receiving speech frames is provided.

The device for receiving speech frames according to the present invention comprises: a receiving module, configured to receive an integrated speech frame, wherein the integrated speech frame comprises a first speech frame coded with the first preset coding algorithm starting from the starting position of coding synchronization of the first preset coding algorithm, and a second speech frame coded with the second preset coding algorithm starting from the starting position of coding synchronization of the second preset coding algorithm, the second speech frame being the speech frame preceding the first speech frame; a parsing module, configured to parse the integrated speech frame and separate the first speech frame from the second speech frame; and an output module, configured to decode the separated first speech frame and second speech frame respectively and to output the decoded first speech frame, or, when the previously received integrated speech frame was lost, to output the second speech frame.

Preferably, the output module comprises: a first decoding unit, configured to decode the first speech frame; a second decoding unit, configured to decode the second speech frame; a detecting unit, configured to detect whether the currently decoded first speech frame is continuous with the previously decoded first speech frame; and a switching unit, configured to output the currently decoded first speech frame when the output of the detecting unit is yes, or to switch to outputting the currently decoded second speech frame when the output of the detecting unit is no.

Preferably, the switching unit is further configured to switch to outputting the currently decoded first speech frame when the second speech frame being output is one or more silence frames.

Preferably, the output module further comprises: a smoothing unit, configured to smooth the decoded second speech frame against the most recently output speech frame.

According to a further aspect of the present invention, a method for sending speech frames is provided.

The method for sending speech frames according to the present invention comprises: calculating the starting positions of coding synchronization of a first preset coding algorithm and a second preset coding algorithm respectively, according to a first coding delay of the first preset coding algorithm and a second coding delay of the second preset coding algorithm; coding the collected speech frames with the first preset coding algorithm and the second preset coding algorithm respectively, according to the calculated starting positions; and integrating and sending the current speech frame among the collected speech frames coded with the first preset coding algorithm and the previous speech frame among the collected speech frames coded with the second preset coding algorithm.

Preferably, calculating the starting positions of coding synchronization of the first preset coding algorithm and the second preset coding algorithm respectively, according to the first coding delay and the second coding delay, comprises: acquiring the first coding delay and the second coding delay respectively; calculating a delay difference from the first coding delay and the second coding delay; and comparing the first coding delay with the second coding delay and, before the collected speech frames are coded with the preset coding algorithm having the smaller coding delay, padding silence data whose length equals the delay difference in front of the collected speech frames.

According to yet another aspect of the present invention, a method for receiving speech frames is provided.

The method for receiving speech frames according to the present invention comprises: receiving an integrated speech frame, wherein the integrated speech frame comprises a first speech frame coded with the first preset coding algorithm starting from the starting position of coding synchronization of the first preset coding algorithm, and a second speech frame coded with the second preset coding algorithm starting from the starting position of coding synchronization of the second preset coding algorithm, the second speech frame being the speech frame preceding the first speech frame; parsing the integrated speech frame and separating the first speech frame from the second speech frame; and decoding the separated first speech frame and second speech frame respectively and outputting the decoded first speech frame, or, when the previously received integrated speech frame was lost, outputting the second speech frame.

Preferably, decoding the separated first speech frame and second speech frame respectively and outputting the decoded first speech frame, or outputting the second speech frame when the previously received integrated speech frame was lost, comprises: decoding the first speech frame; decoding the second speech frame; detecting whether the currently decoded first speech frame is continuous with the previously decoded first speech frame; outputting the currently decoded first speech frame if the detection result is yes; or switching to outputting the currently decoded second speech frame if the detection result is no.

Preferably, after switching to outputting the currently decoded second speech frame, the method further comprises: switching to outputting the currently decoded first speech frame when the second speech frame being output is one or more silence frames.

Preferably, decoding the separated first speech frame and second speech frame respectively and outputting the decoded first speech frame, or outputting the second speech frame when the previously received integrated speech frame was lost, further comprises: smoothing the decoded second speech frame against the most recently output speech frame.

With the present invention, the starting position of coding synchronization of each coding algorithm is calculated from the coding delays of the different coding algorithms; the collected speech frames are coded with each coding algorithm according to the calculated starting positions; and the current speech frame among the collected speech frames coded with one coding algorithm is integrated with the previous speech frame among the collected speech frames coded with the other preset coding algorithm and sent. This solves the problem that, with the primary and secondary coding methods of the related art, a media unit decoded from primary encoded data cannot be seamlessly joined with a media unit recovered from the secondary encoding. As a result, primary and secondary coding adapts better to different codec algorithms, and the packet-loss resilience of speech transmission is enhanced at the cost of low delay and low bandwidth overhead.

Brief description of the drawings

The accompanying drawings described here provide a further understanding of the present invention and form a part of this application. The exemplary embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:

Fig. 1 is a structural block diagram of a device for sending speech frames according to an embodiment of the present invention;

Fig. 2 is a structural block diagram of a device for sending speech frames according to a preferred embodiment of the present invention;

Fig. 3 is a schematic diagram of obtaining the primary and secondary coding synchronization starting points and integrating the primary and secondary encoded data according to a preferred embodiment of the present invention;

Fig. 4 is a structural schematic diagram of a device for sending speech frames according to a preferred embodiment of the present invention;

Fig. 5 is a structural block diagram of a device for receiving speech frames according to an embodiment of the present invention;

Fig. 6 is a structural block diagram of a device for receiving speech frames according to a preferred embodiment of the present invention;

Fig. 7 is a schematic diagram of speech frame splicing and smoothing according to a preferred embodiment of the present invention;

Fig. 8 is a structural schematic diagram of a device for receiving speech frames according to a preferred embodiment of the present invention;

Fig. 9 is a flowchart of a method for sending speech frames according to an embodiment of the present invention;

Fig. 10 is a flowchart of a method for sending speech frames according to a preferred embodiment of the present invention;

Fig. 11 is a flowchart of a method for receiving speech frames according to an embodiment of the present invention; and

Fig. 12 is a flowchart of a method for receiving speech frames according to a preferred embodiment of the present invention.

Detailed description of the embodiments

The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments. It should be noted that, where no conflict arises, the embodiments of this application and the features in the embodiments may be combined with one another.

Fig. 1 is a structural block diagram of a device for sending speech frames according to an embodiment of the present invention. As shown in Fig. 1, the device for sending speech frames comprises: a calculating module 10, configured to calculate the starting positions of coding synchronization of a first preset coding algorithm and a second preset coding algorithm respectively, according to a first coding delay of the first preset coding algorithm and a second coding delay of the second preset coding algorithm; a coding module 20, configured to code the collected speech frames with the first preset coding algorithm and the second preset coding algorithm respectively, according to the calculated starting positions; and a sending module 30, configured to integrate and send the current speech frame among the collected speech frames coded with the first preset coding algorithm and the previous speech frame among the collected speech frames coded with the second preset coding algorithm.

In the related art, the primary and secondary coding method cannot seamlessly join a media unit decoded from primary encoded data with a media unit recovered from the secondary encoding. With the device shown in Fig. 1, the starting position of coding synchronization of each coding algorithm is calculated from the coding delays of the different coding algorithms; the collected speech frames are coded with each coding algorithm according to the calculated starting positions; and the current speech frame coded with one coding algorithm is integrated with the previous speech frame coded with the other preset coding algorithm and sent. This solves the above problem, so that primary and secondary coding adapts better to different codec algorithms and the packet-loss resilience of speech transmission is enhanced at the cost of low delay and low bandwidth overhead.

In a preferred implementation, primary and secondary coding can adapt better to different codec algorithms. For example, the primary encoding may use a wideband, high-bit-rate algorithm with good speech quality but weak packet-loss resilience, while the secondary encoding may use a narrowband, low-bit-rate algorithm with strong packet-loss resilience.

In a preferred embodiment, the sending end first calculates the coding synchronization starting points from the respective codec delays of the primary and secondary coding algorithms. It then encodes the collected media data in turn from the respective synchronization starting points and caches the results. Finally, it packs the primary encoded data of the currently collected speech frame together with the secondary encoded data of a historical frame and sends them.
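The encode, cache and pack sequence described above can be sketched as a short loop. This is a minimal sketch under stated assumptions: `encode_primary` and `encode_secondary` are hypothetical callables standing in for real codecs, the cached history is exactly one frame deep, and the first packet carries an empty secondary encoding because no earlier frame exists.

```python
from typing import Callable, Iterable, List, Tuple


def build_packets(
    frames: Iterable[object],
    encode_primary: Callable[[object], bytes],
    encode_secondary: Callable[[object], bytes],
) -> List[Tuple[bytes, bytes]]:
    """Pair the primary encoding of frame n with the cached secondary encoding of frame n-1."""
    packets = []
    cached_secondary = b""  # no history exists before the first frame
    for frame in frames:
        primary = encode_primary(frame)
        packets.append((primary, cached_secondary))
        # cache this frame's secondary encoding for the next packet
        cached_secondary = encode_secondary(frame)
    return packets
```

Synchronization of the two encoder inputs is not modelled here; the loop only shows how the previous frame's secondary encoding rides along with the current primary encoding.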

Preferably, as shown in Fig. 2, the calculating module 10 may comprise: an acquiring unit 100, configured to acquire the first coding delay and the second coding delay respectively; a computing unit 102, configured to calculate a delay difference from the first coding delay and the second coding delay; and a filling unit 104, configured to compare the first coding delay with the second coding delay and, before the collected speech frames are coded with the preset coding algorithm having the smaller coding delay, to pad silence data whose length equals the delay difference in front of the collected speech frames.

In a preferred embodiment, Fig. 3 is a schematic diagram of obtaining the primary and secondary coding synchronization starting points and integrating the primary and secondary encoded data according to a preferred embodiment of the present invention. As shown in Fig. 3, suppose the delay of the primary coding algorithm is L_main and the delay of the secondary coding algorithm is L_inferior, and let L = |L_main - L_inferior|. L is used as the offset between the primary and secondary coding starting points. When L_inferior > L_main, silence data of length L must be padded before the captured data input to the primary encoder, and the synchronization point for coding each subsequent frame is n*F - L, where F is the frame length and n*F - L < 0 indicates that n-1 complete silence frames have been padded; the synchronization point of the secondary encoding is n*F. When L_inferior < L_main, the process is the reverse and is not repeated here.
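The synchronization points above reduce to simple arithmetic, sketched below. The function name, the use of samples as the unit for delays and frame length, and a frame index n that starts at 0 are assumptions made for illustration; the source does not state them.

```python
from typing import List, Tuple


def sync_points(l_main: int, l_inferior: int, frame_len: int, n_frames: int) -> Tuple[List[int], List[int]]:
    """Return (primary, secondary) per-frame start offsets relative to the first captured sample.

    A negative offset means the frame starts inside the padded silence.
    """
    delay_diff = abs(l_main - l_inferior)  # L = |L_main - L_inferior|
    shifted = [n * frame_len - delay_diff for n in range(n_frames)]  # n*F - L
    unshifted = [n * frame_len for n in range(n_frames)]  # n*F
    if l_inferior > l_main:
        # the primary encoder has the smaller delay, so its input is the padded one
        return shifted, unshifted
    # reverse case: the secondary encoder's input is padded (no shift if delays are equal)
    return unshifted, shifted
```

For example, with a primary delay of 80, a secondary delay of 240 and a frame length of 320, the primary encoder's first frame starts 160 samples before the first captured sample.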

The preferred implementation above is further described below with reference to the preferred embodiment shown in Fig. 4.

Fig. 4 is a structural schematic diagram of a device for sending speech frames according to a preferred embodiment of the present invention. As shown in Fig. 4, the device may comprise a collector, primary and secondary encoders, an integrator and a packet sender, where the network may be an IP packet network. The collector captures speech frames. The primary and secondary encoders encode the captured speech data in turn from their respective calculated coding synchronization starting points. The integrator integrates the primary and secondary encoded data into the payload of a Real-time Transport Protocol (RTP) packet and constructs the header of the RTP packet. The packet sender sends the integrated voice packets to the IP packet network.
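The integrator's job of placing both encodings in one payload can be sketched as follows. The layout used here (a 2-byte big-endian length of the primary encoding, then the primary bytes, then the secondary bytes) is purely hypothetical and is not taken from the source; constructing the RTP header is omitted.

```python
import struct
from typing import Tuple


def integrate(primary: bytes, secondary: bytes) -> bytes:
    """Pack primary(n) and secondary(n-1) into one payload using a hypothetical layout."""
    if len(primary) > 0xFFFF:
        raise ValueError("primary encoding too long for a 2-byte length field")
    return struct.pack("!H", len(primary)) + primary + secondary


def parse(payload: bytes) -> Tuple[bytes, bytes]:
    """Split a payload built by integrate() back into (primary, secondary)."""
    if len(payload) < 2:
        raise ValueError("payload shorter than the length field")
    (primary_len,) = struct.unpack_from("!H", payload, 0)
    if len(payload) < 2 + primary_len:
        raise ValueError("payload truncated")
    return payload[2:2 + primary_len], payload[2 + primary_len:]
```

The `parse` function mirrors what a receiving-side parser would do with the same assumed layout.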

Fig. 5 is a structural block diagram of a device for receiving speech frames according to an embodiment of the present invention. As shown in Fig. 5, the device for receiving speech frames may comprise: a receiving module 40, configured to receive an integrated speech frame, wherein the integrated speech frame comprises a first speech frame coded with the first preset coding algorithm starting from the starting position of coding synchronization of the first preset coding algorithm, and a second speech frame coded with the second preset coding algorithm starting from the starting position of coding synchronization of the second preset coding algorithm, the second speech frame being the speech frame preceding the first speech frame; a parsing module 50, configured to parse the integrated speech frame and separate the first speech frame from the second speech frame; and an output module 60, configured to decode the separated first speech frame and second speech frame respectively and to output the decoded first speech frame, or, when the previously received integrated speech frame was lost, to output the second speech frame.

The device shown in Fig. 5 solves the problem that, with the primary and secondary coding methods of the related art, a media unit decoded from primary encoded data cannot be seamlessly joined with a media unit recovered from the secondary encoding. As a result, primary and secondary coding adapts better to different codec algorithms, and the packet-loss resilience of speech transmission is enhanced at the cost of low delay and low bandwidth overhead.

In a preferred embodiment, after receiving the speech frames sent by the sending end, the receiving end first places them in an anti-jitter buffer. It then fetches packets from the anti-jitter buffer and at the same time detects whether packet loss has occurred. If primary encoded data has been lost, it searches the anti-jitter buffer for the packet whose secondary encoding is synchronized with the lost primary encoding; if not, it parses the primary and secondary encoded data out of the voice packet. The primary and secondary decoders then decode their respective encoded data.
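The lookup described above can be sketched with a dictionary standing in for the anti-jitter buffer. The buffer structure, the use of consecutive sequence numbers, and the function name are assumptions for illustration; the sketch relies only on the stated property that packet n+1 carries the secondary encoding of frame n.

```python
from typing import Dict, Optional, Tuple

# Hypothetical buffer: sequence number -> (primary of frame n, secondary of frame n-1)
JitterBuffer = Dict[int, Tuple[bytes, bytes]]


def fetch_frame(buffer: JitterBuffer, seq: int) -> Tuple[Optional[str], Optional[bytes]]:
    """Return which encoding is available for frame `seq`, together with its data."""
    if seq in buffer:
        # packet arrived: use its primary encoding
        return "primary", buffer[seq][0]
    following = buffer.get(seq + 1)
    if following is not None:
        # packet lost: the next packet carries the secondary encoding of this frame
        return "secondary", following[1]
    # neither the packet nor its successor is buffered
    return None, None
```

A real receiver would also bound how long it waits for the following packet before giving up; that timing policy is left out here.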

Preferably, as shown in Fig. 6, the output module 60 may comprise: a first decoding unit 600, configured to decode the first speech frame; a second decoding unit 602, configured to decode the second speech frame; a detecting unit 604, configured to detect whether the currently decoded first speech frame is continuous with the previously decoded first speech frame; and a switching unit 606, configured to output the currently decoded first speech frame when the output of the detecting unit is yes, or to switch to outputting the currently decoded second speech frame when the output of the detecting unit is no.

In a preferred embodiment, if packet loss is detected, or if it is determined by policy that the primary decoder cannot compensate for the data lost to packet loss, the device switches to the state of playing the secondary decoder's output data; otherwise, the current primary decoder output data is fed into the playout buffer.

Preferably, the switching unit 606 is further configured to switch to outputting the currently decoded first speech frame when the second speech frame being output is one or more silence frames.

In a preferred embodiment, voice activity detection (VAD) can be performed on the second speech frame decoded by the secondary decoder. If the result is a silence frame or consecutive silence frames, the device switches to the state of playing the primary decoder's output data; otherwise, the secondary decoder's output data is fed into the playout buffer.
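The switching behaviour described in the preceding paragraphs can be sketched as a two-state controller. This is only an illustration: the energy threshold used as a VAD stand-in, the class and method names, and the representation of frames as lists of float samples are all assumptions, and the one-frame offset between the two decoded streams is not modelled.

```python
from typing import List, Optional


def is_silence(frame: List[float], threshold: float = 1e-4) -> bool:
    """Crude VAD stand-in: a frame whose mean square energy is below the threshold counts as silence."""
    if not frame:
        return True
    return sum(s * s for s in frame) / len(frame) < threshold


class SwitchController:
    """Chooses which decoder's output goes to the playout buffer."""

    def __init__(self) -> None:
        self.playing_secondary = False

    def select(self, primary: Optional[List[float]], secondary: List[float], loss_detected: bool) -> Optional[List[float]]:
        if loss_detected:
            # packet loss: switch to the secondary decoder's output
            self.playing_secondary = True
        elif self.playing_secondary and is_silence(secondary):
            # silence in the secondary stream: switch back to the primary decoder
            self.playing_secondary = False
        return secondary if self.playing_secondary else primary
```

Switching back during silence is a natural choice because a change of decoder is least audible when no speech is present.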

Preferably, as shown in Fig. 6, the output module 60 may further comprise: a smoothing unit 608, configured to smooth the decoded second speech frame against the most recently output speech frame.

In a preferred embodiment, if packet loss is detected, or if it is determined by policy that the primary decoder cannot compensate for the data lost to packet loss, the speech frame decoded by the secondary decoder and the speech frame previously output to the playout buffer are smoothed at the join, and the processed frame is output to the playout buffer; otherwise, the current primary decoder output data is fed into the playout buffer.

In a preferred embodiment, Fig. 7 is a schematic diagram of speech frame splicing and smoothing according to a preferred embodiment of the present invention. As shown in Fig. 7, the secondary decoded data and the data previously fed into the playout buffer are smoothed. Although the primary and secondary encoded data are synchronized in time at the sending end, the speech frames recovered from each differ in phase and amplitude, so smoothing is required.
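The source does not specify the smoothing method. One simple possibility, shown here purely as an assumption, is a linear ramp that blends the first few samples of the new frame toward the last sample already played, which removes the amplitude step at the join. The function name and the `ramp_len` parameter are hypothetical.

```python
from typing import List


def smooth_join(prev_frame: List[float], new_frame: List[float], ramp_len: int = 4) -> List[float]:
    """Blend the head of new_frame toward the last sample of prev_frame to soften the join."""
    ramp_len = min(ramp_len, len(new_frame))
    out = list(new_frame)
    if not prev_frame or ramp_len == 0:
        return out
    last = prev_frame[-1]
    for i in range(ramp_len):
        # weight of the new frame rises linearly across the ramp
        w = (i + 1) / (ramp_len + 1)
        out[i] = (1 - w) * last + w * new_frame[i]
    return out
```

This addresses only the amplitude discontinuity; aligning phase between the two decoders' outputs would need a more elaborate technique than this sketch.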

The preferred implementation above is further described below with reference to the preferred embodiment shown in Fig. 8.

Fig. 8 is a structural schematic diagram of a device for receiving speech frames according to a preferred embodiment of the present invention. As shown in Fig. 8, the device may comprise an anti-jitter buffer, a parser, primary and secondary decoders, a switch controller and a playout buffer. The anti-jitter buffer buffers the received speech frames and removes jitter. The parser parses the RTP payload and separates the primary and secondary encoded data. The primary and secondary decoders decode the primary and secondary encoded data respectively. The switch controller switches between the primary and secondary decoders and outputs to the playout buffer.

Fig. 9 is a flowchart of a method for sending speech frames according to an embodiment of the present invention. As shown in Fig. 9, the method may comprise the following processing steps:

Step S902: Calculate the starting positions of coding synchronization of a first preset coding algorithm and a second preset coding algorithm respectively, according to a first coding delay of the first preset coding algorithm and a second coding delay of the second preset coding algorithm;

Step S904: Code the collected speech frames with the first preset coding algorithm and the second preset coding algorithm respectively, according to the calculated starting positions;

Step S906: Integrate and send the current speech frame among the collected speech frames coded with the first preset coding algorithm and the previous speech frame among the collected speech frames coded with the second preset coding algorithm.

Preferably, in step S902, calculating the initial positions of coding synchronization of the first preset coding algorithm and the second preset coding algorithm according to the first coding delay of the first preset coding algorithm and the second coding delay of the second preset coding algorithm may comprise the following operations:

Step S1: acquiring the first coding delay and the second coding delay respectively;

Step S2: calculating a delay difference according to the first coding delay and the second coding delay;

Step S3: comparing the first coding delay with the second coding delay and, before the collected speech frames are coded with the preset coding algorithm having the smaller coding delay, padding silence data whose length is the delay difference in front of the collected speech frames.
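Steps S1 to S3 can be sketched as follows. This is an illustrative sketch only: it assumes that both delays are expressed in samples and that silence is represented by zero-valued samples; the function name `align_inputs` is hypothetical.

```python
def align_inputs(samples, delay_first, delay_second):
    """Return (input_for_first, input_for_second, delay_difference).

    The encoder with the smaller coding delay receives silence padding
    whose length equals the delay difference (delays assumed in samples).
    """
    diff = abs(delay_first - delay_second)
    silence = [0] * diff  # zero samples stand in for silence data
    if delay_first < delay_second:
        return silence + list(samples), list(samples), diff
    if delay_second < delay_first:
        return list(samples), silence + list(samples), diff
    # Equal delays: no padding is needed.
    return list(samples), list(samples), 0
```

For example, with a first delay of 5 samples and a second delay of 8 samples, the input of the first encoder is preceded by 3 zero samples and the input of the second encoder is left unchanged.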

The above preferred implementation process is further described below with reference to the preferred embodiment shown in Fig. 10.

Fig. 10 is a flowchart of a method of sending speech frames according to a preferred embodiment of the present invention. As shown in Fig. 10, the method may comprise the following steps:

Step S1002: the collector collects one frame of speech data; the sampling rate of the collected data is consistent with that of the primary coding algorithm, and the frame length is assumed to be F;

Step S1004: obtain the primary coding synchronization point. Assume that the delay of the primary coding algorithm is Lmain and the delay of the secondary coding algorithm is Lsecondary, and let L = |Lmain - Lsecondary|. L is taken as the difference between the primary and secondary coding start points. When Lsecondary > Lmain, silence data of length L must be padded before the collected data is input to the primary encoder; the synchronization point of each subsequent coded frame is n*F-L, where n*F-L<0 indicates that n-1 complete silence frames are padded, and the synchronization point of secondary coding is n*F. When Lsecondary < Lmain, the process is the reverse of the above and is not repeated here;

Step S1006: code data of length F with the primary coding algorithm;

Step S1008: buffer the primary coded data;

Step S1010: obtain the secondary coding synchronization point;

Step S1012: judge whether the primary coding sampling rate is consistent with the secondary coding sampling rate; if so, go to step S1016; otherwise, continue with step S1014;

Step S1014: convert the collected data into data at the secondary coding sampling rate;

Step S1016: code data of length F with the secondary coding algorithm;

Step S1018: buffer the secondary coded data;

Step S1020: integrate the primary and secondary coded data. The integration strategy may be determined according to the network packet loss model; preferably, the current primary coded data and the secondary coded data of the previous frame may be combined into one RTP payload;

Step S1022: send the RTP packet to the network.
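The preferred integration strategy of step S1020 can be sketched as follows. The description does not define the payload layout, so the layout used here (a 2-byte big-endian length of the primary data, then the primary data, then the secondary data of the previous frame) is purely an assumption, as is the class name `PrimarySecondaryPacker`.

```python
import struct


class PrimarySecondaryPacker:
    """Combine the current primary frame with the previous secondary frame."""

    def __init__(self):
        # Secondary coded data of the previous frame (empty before frame 1).
        self._prev_secondary = b""

    def pack(self, primary, secondary):
        # Hypothetical layout: [len(primary): 2 bytes][primary][prev secondary]
        payload = struct.pack("!H", len(primary)) + primary + self._prev_secondary
        # Keep this frame's secondary data for the next payload.
        self._prev_secondary = secondary
        return payload
```

With this arrangement, the loss of one packet still leaves a secondary copy of the lost frame in the next packet, at the cost of one frame of extra delay on the secondary path.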

Fig. 11 is a flowchart of a method of receiving speech frames according to an embodiment of the present invention. As shown in Fig. 11, the method may comprise the following processing steps:

Step S1102: receiving an integrated speech frame, wherein the integrated speech frame comprises: a first speech frame coded with a first preset coding algorithm starting from the initial position of coding synchronization of the first preset coding algorithm, and a second speech frame coded with a second preset coding algorithm starting from the initial position of coding synchronization of the second preset coding algorithm, the second speech frame being the speech frame preceding the first speech frame;

Step S1104: parsing the integrated speech frame and separating the first speech frame from the second speech frame;

Step S1106: decoding the separated first speech frame and second speech frame respectively, and outputting the decoded first speech frame; or, when packet loss occurs in the previously received integrated speech frame, outputting the second speech frame.
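Steps S1104 and S1106 can be illustrated with the following sketch. It assumes a hypothetical payload layout that this description does not define: a 2-byte big-endian length of the first (primary) frame, followed by the first frame and then the second (secondary) frame. Decoding is omitted, and the function names are illustrative only.

```python
import struct


def unpack(payload):
    """Separate the first (primary) and second (secondary) coded frames."""
    (primary_len,) = struct.unpack_from("!H", payload, 0)
    primary = payload[2:2 + primary_len]
    secondary = payload[2 + primary_len:]
    return primary, secondary


def select_frame(payload, previous_lost):
    """Pick the coded frame to output for this payload.

    If the previous integrated frame was lost, the second frame (a copy of
    the previous speech frame) is chosen; otherwise the first frame is.
    """
    primary, secondary = unpack(payload)
    return secondary if previous_lost else primary
```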

Preferably, in step S1106, decoding the separated first speech frame and second speech frame respectively, and outputting the decoded first speech frame, or, when packet loss occurs in the previously received integrated speech frame, outputting the second speech frame, may comprise the following steps:

Step S4: decoding the first speech frame;

Step S5: decoding the second speech frame;

Step S6: detecting whether the first speech frame after the current decoding is continuous with the first speech frame after the previous decoding;

Step S7: when the output of the detecting unit is yes, outputting the first speech frame after the current decoding; or, when the output of the detecting unit is no, switching to outputting the second speech frame after the current decoding.

Preferably, after switching to outputting the second speech frame after the current decoding in step S7, the method may further comprise the following processing: when the output second speech frame is one or more silence frames, switching to outputting the first speech frame after the current decoding.
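The continuity detection of step S6 could, for instance, rely on RTP sequence numbers, which are 16-bit values that wrap around to 0 after 65535. The following is a sketch under that assumption; the function names are hypothetical and are not taken from this description.

```python
SEQ_MODULO = 1 << 16  # RTP sequence numbers are 16 bits wide


def is_continuous(prev_seq, cur_seq):
    """True if cur_seq directly follows prev_seq, allowing wrap-around."""
    return cur_seq == (prev_seq + 1) % SEQ_MODULO


def lost_between(prev_seq, cur_seq):
    """Number of packets missing between two received sequence numbers."""
    return (cur_seq - prev_seq - 1) % SEQ_MODULO
```

This sketch does not handle reordered or duplicated packets; it assumes that the anti-jitter buffer delivers packets in order.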

Preferably, in step S1106, decoding the separated first speech frame and second speech frame respectively, and outputting the decoded first speech frame, or, when packet loss occurs in the previously received integrated speech frame, outputting the second speech frame, may further comprise the following operation: smoothing the decoded second speech frame with the previously output speech frame.

The above preferred implementation process is further described below with reference to the preferred embodiment shown in Fig. 12.

Fig. 12 is a flowchart of a method of receiving speech frames according to a preferred embodiment of the present invention. As shown in Fig. 12, the method may comprise the following steps:

Step S1202: obtain one RTP packet from the anti-jitter buffer;

Step S1204: check whether there is packet loss according to the RTP sequence number obtained the previous time; if there is packet loss, continue with step S1206; otherwise, go to step S1208;

Step S1206: notify the primary and secondary decoders of the packet loss, and perform packet loss compensation;

Step S1208: parse the payload of the RTP packet and separate the primary and secondary coded data;

Step S1210: decode the primary coded data;

Step S1212: buffer the primary decoded data;

Step S1214: decode the secondary coded data;

Step S1216: judge whether the sampling rate of the primary decoded data is consistent with that of the secondary decoded data; if so, go to step S1220; otherwise, continue with step S1218;

Step S1218: convert the sampling rate of the secondary decoded data to the sampling rate of the primary decoded data;

Step S1220: buffer the secondary decoded data;

Step S1222: judge whether the current playback state is playing primary decoded data; if so, continue with step S1224; otherwise, go to step S1232;

Step S1224: judge whether there is currently packet loss; if so, continue with step S1226; otherwise, go to step S1240;

Step S1226: smooth the secondary decoded data with the data most recently input to the playout buffer. Although the primary and secondary coded data are synchronized in time at the sending end, the speech frames restored from each of them may differ in phase and amplitude, so smoothing is required;

Step S1228: switch the playback state to playing secondary decoded data;

Step S1230: input the secondary decoded data to the playout buffer for playback, and the flow ends;

Step S1232: perform silence detection on the secondary decoded data;

Step S1234: judge whether the secondary decoded data is a silence frame or consecutive silence frames; if so, continue with step S1236; otherwise, go to step S1230;

Step S1236: smooth the primary decoded data with the data most recently input to the playout buffer;

Step S1238: switch the playback state to playing primary decoded data;

Step S1240: input the primary decoded data to the playout buffer for playback, and the flow ends.
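The playback switching of steps S1222 to S1240 amounts to a two-state machine, which can be sketched as follows. The smoothing of steps S1226 and S1236 is omitted for brevity, and the silence test (all samples at or below a threshold) is an assumption, since this description does not specify how silence is detected. The names `SwitchController` and `is_silence` are illustrative.

```python
PRIMARY, SECONDARY = "primary", "secondary"


def is_silence(frame, threshold=0):
    """Assumed silence test: every sample magnitude is within threshold."""
    return all(abs(sample) <= threshold for sample in frame)


class SwitchController:
    """Chooses which decoded frame is fed to the playout buffer."""

    def __init__(self):
        self.state = PRIMARY

    def select(self, primary_frame, secondary_frame, packet_lost):
        if self.state == PRIMARY:
            if packet_lost:
                # S1226 to S1230: switch to the secondary decoded data.
                self.state = SECONDARY
                return secondary_frame
            # S1240: keep playing the primary decoded data.
            return primary_frame
        # State is SECONDARY: S1232 to S1234, silence detection.
        if is_silence(secondary_frame):
            # S1236 to S1240: switch back during silence.
            self.state = PRIMARY
            return primary_frame
        # S1230: keep playing the secondary decoded data.
        return secondary_frame
```

Switching back only during silence keeps the one-frame offset between the primary and secondary streams from being heard as a discontinuity.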

As can be seen from the above description, the above embodiments achieve the following technical effects (it should be noted that these are effects that certain preferred embodiments can achieve): the present invention takes full account of the delays of the primary and secondary coding algorithms and combines smooth splicing with algorithm switching, achieving strong resistance to packet loss and thereby improving the user experience of real-time multimedia communication services.

Obviously, those skilled in the art should understand that the above modules or steps of the present invention may be implemented with general-purpose computing devices; they may be concentrated on a single computing device or distributed over a network formed by multiple computing devices; optionally, they may be implemented with program code executable by computing devices, so that they may be stored in a storage device and executed by a computing device; in some cases, the steps shown or described may be performed in an order different from that herein, or they may be made into individual integrated circuit modules, or several of the modules or steps may be made into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.

The above are only preferred embodiments of the present invention and are not intended to limit the present invention; for those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims

1. A device for sending speech frames, characterized by comprising:

a calculating module, configured to calculate, according to a first coding delay of a first preset coding algorithm and a second coding delay of a second preset coding algorithm, the initial positions of coding synchronization of the first preset coding algorithm and the second preset coding algorithm respectively;

a coding module, configured to code collected speech frames with the first preset coding algorithm and the second preset coding algorithm respectively, according to the calculated initial positions;

a sending module, configured to integrate and send a current speech frame among the collected speech frames coded with the first preset coding algorithm and a previous speech frame among the collected speech frames coded with the second preset coding algorithm.

2. The device according to claim 1, characterized in that the calculating module comprises:

an acquiring unit, configured to acquire the first coding delay and the second coding delay respectively;

a calculating unit, configured to calculate a delay difference according to the first coding delay and the second coding delay;

a padding unit, configured to compare the first coding delay with the second coding delay and, before the collected speech frames are coded with the preset coding algorithm having the smaller coding delay, to pad silence data whose length is the delay difference in front of the collected speech frames.

3. A device for receiving speech frames, characterized by comprising:

a receiving module, configured to receive an integrated speech frame, wherein the integrated speech frame comprises: a first speech frame coded with a first preset coding algorithm starting from the initial position of coding synchronization of the first preset coding algorithm, and a second speech frame coded with a second preset coding algorithm starting from the initial position of coding synchronization of the second preset coding algorithm, the second speech frame being the speech frame preceding the first speech frame;

a parsing module, configured to parse the integrated speech frame and separate the first speech frame from the second speech frame;

an output module, configured to decode the separated first speech frame and second speech frame respectively, and to output the decoded first speech frame; or, when packet loss occurs in the previously received integrated speech frame, to output the second speech frame.

4. The device according to claim 3, characterized in that the output module comprises:

a first decoding unit, configured to decode the first speech frame;

a second decoding unit, configured to decode the second speech frame;

a detecting unit, configured to detect whether the first speech frame after the current decoding is continuous with the first speech frame after the previous decoding;

a switching unit, configured to output the first speech frame after the current decoding when the output of the detecting unit is yes; or, to switch to outputting the second speech frame after the current decoding when the output of the detecting unit is no.

5. The device according to claim 4, characterized in that the switching unit is further configured to switch to outputting the first speech frame after the current decoding when the output second speech frame is one or more silence frames.

6. The device according to any one of claims 3 to 5, characterized in that the output module further comprises:

a smoothing unit, configured to smooth the decoded second speech frame with the previously output speech frame.

7. A method of sending speech frames, characterized by comprising:

calculating, according to a first coding delay of a first preset coding algorithm and a second coding delay of a second preset coding algorithm, the initial positions of coding synchronization of the first preset coding algorithm and the second preset coding algorithm respectively;

coding collected speech frames with the first preset coding algorithm and the second preset coding algorithm respectively, according to the calculated initial positions;

integrating and sending a current speech frame among the collected speech frames coded with the first preset coding algorithm and a previous speech frame among the collected speech frames coded with the second preset coding algorithm.

8. The method according to claim 7, characterized in that calculating the initial positions of coding synchronization of the first preset coding algorithm and the second preset coding algorithm according to the first coding delay of the first preset coding algorithm and the second coding delay of the second preset coding algorithm comprises:

acquiring the first coding delay and the second coding delay respectively;

calculating a delay difference according to the first coding delay and the second coding delay;

comparing the first coding delay with the second coding delay and, before the collected speech frames are coded with the preset coding algorithm having the smaller coding delay, padding silence data whose length is the delay difference in front of the collected speech frames.

9. A method of receiving speech frames, characterized by comprising:

receiving an integrated speech frame, wherein the integrated speech frame comprises: a first speech frame coded with a first preset coding algorithm starting from the initial position of coding synchronization of the first preset coding algorithm, and a second speech frame coded with a second preset coding algorithm starting from the initial position of coding synchronization of the second preset coding algorithm, the second speech frame being the speech frame preceding the first speech frame;

parsing the integrated speech frame and separating the first speech frame from the second speech frame;

decoding the separated first speech frame and second speech frame respectively, and outputting the decoded first speech frame; or, when packet loss occurs in the previously received integrated speech frame, outputting the second speech frame.

10. The method according to claim 9, characterized in that decoding the separated first speech frame and second speech frame respectively, and outputting the decoded first speech frame, or, when packet loss occurs in the previously received integrated speech frame, outputting the second speech frame, comprises:

decoding the first speech frame;

decoding the second speech frame;

detecting whether the first speech frame after the current decoding is continuous with the first speech frame after the previous decoding;

when the output of the detecting unit is yes, outputting the first speech frame after the current decoding; or, when the output of the detecting unit is no, switching to outputting the second speech frame after the current decoding.

11. The method according to claim 10, characterized in that, after switching to outputting the second speech frame after the current decoding, the method further comprises:

when the output second speech frame is one or more silence frames, switching to outputting the first speech frame after the current decoding.

12. The method according to any one of claims 9 to 11, characterized in that decoding the separated first speech frame and second speech frame respectively, and outputting the decoded first speech frame, or, when packet loss occurs in the previously received integrated speech frame, outputting the second speech frame, further comprises:

smoothing the decoded second speech frame with the previously output speech frame.