CN100459696C - Audio mixed processing method and processor - Google Patents

Publication number
CN100459696C
Authority
CN
China
Prior art keywords
terminal
encoder
audio
volume maximum
maximum
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CNB2006100629521A
Other languages
Chinese (zh)
Other versions
CN1941891A (en)
Inventor
梁丽燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CNB2006100629521A
Publication of CN1941891A
Application granted
Publication of CN100459696C

Abstract

The method comprises: when the terminals with the maximum volume change, separately controlling the encoding of the audio signals output to the maximum-volume terminals before and after the change. The invention also discloses an audio mixing apparatus comprising a decoder, a mixing module, an encoding module and an encoder switching module.

Description

An audio mixing processing method and device
Technical field
The present invention relates to the field of audio signal processing, and in particular to an audio mixing processing method and a device therefor.
Background
As video conferencing becomes more widely used, the demand on the processing resources of the MCU (multipoint control unit) in a video conference system keeps rising. Given limited network bandwidth and a requirement not to reduce audio quality, cutting the audio processing resources used per terminal makes it easier to handle high-quality audio and video protocols, or lets the same audio processing resources serve more audio connections. In the MCU mixing of a traditional video conference, most connected terminals hear the same sound in many situations, so these terminals can be processed together rather than individually. This leaves considerable room for saving audio processing resources.
In a traditional video conference, as shown in Fig. 1, the MCU processes the audio and video media so that the participating terminals can hear and see one another. The audio part mainly mixes audio among the connected sites: every site in the conference can hear the speaking sites, and the speaking sites can hear each other, which achieves remote communication.
Existing solution one:
Audio processing consists of three parts: decoding, mixing and encoding. Decoding performs audio decoding for every connected site to obtain each site's raw audio data. Mixing first computes and compares the envelopes of the sites' data to find the sites that are speaking (by convention here, the three loudest parties, meaning that only the three sites with the highest captured speech volume can be heard by the other sites). It then mixes the sound of those three sites. The audio data of all three loudest sites is summed and sent to every site outside the three, so those sites hear all three. For each of the three loudest sites, the data of the other two is summed and sent to it, so each of the three hears the other two. Encoding encodes the decoded and mixed audio data for each site and outputs it to that site.
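The envelope comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how the envelope is computed, so the mean absolute amplitude of one decoded PCM frame is used as a stand-in, and the names `frame_envelope` and `loudest_terminals` are hypothetical.

```python
from typing import Dict, List, Sequence


def frame_envelope(samples: Sequence[int]) -> float:
    """Mean absolute amplitude of one decoded PCM frame.

    Assumption: the patent only says "envelope calculation"; this measure
    is a simple placeholder for whatever the MCU actually uses.
    """
    if not samples:
        return 0.0
    return sum(abs(s) for s in samples) / len(samples)


def loudest_terminals(frames: Dict[int, Sequence[int]], k: int = 3) -> List[int]:
    """Return the ids of the k terminals with the largest envelope.

    Ties are broken by the lower terminal id; the result is sorted by id.
    """
    ranked = sorted(frames, key=lambda t: (-frame_envelope(frames[t]), t))
    return sorted(ranked[:k])


if __name__ == "__main__":
    decoded = {
        1: [1000, -1000],
        2: [800, -900],
        3: [500, 500],
        4: [10, -10],
        5: [0, 0],
    }
    print(loudest_terminals(decoded))
```

A real MCU would probably smooth the envelope over several frames so that the selected set does not flicker; that smoothing is outside what the text describes.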
As shown in Fig. 2, suppose the conference has terminals 1, 2, 3, 4, 5, ..., N, and the three loudest parties are terminals 1, 2 and 3. During audio processing, the data received from all terminals is decoded first. In mixing, the envelopes of all sites are computed and compared, which gives terminals 1, 2 and 3 as the three loudest. The data output to terminal 1 is then the sum of the data of sites 2 and 3, the data output to terminal 2 is the sum of sites 1 and 3, the data output to terminal 3 is the sum of sites 1 and 2, and every other terminal receives the sum of terminals 1, 2 and 3. If at the next moment the three loudest become terminals 2, 3 and 5, then terminal 2 hears terminals 3 and 5, terminal 3 hears terminals 2 and 5, terminal 5 hears terminals 2 and 3, and the other terminals hear terminals 2, 3 and 5. Other cases follow by analogy. Finally, the encoding stage encodes each terminal's data and outputs it to the corresponding terminal. This completes the mixing function for one conference.
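The routing in this example (each loud terminal hears the other loud terminals, everyone else hears all of them) can be written down as a small sketch. The function names are hypothetical, frames are assumed to be equal-length lists of 16-bit PCM samples, and the clipping step is an assumption added for safety; the text itself only speaks of superposing the data.

```python
from typing import Dict, FrozenSet, Iterable, List, Sequence


def mix_sources(terminals: Iterable[int], loudest: Iterable[int]) -> Dict[int, FrozenSet[int]]:
    """For every terminal, list which loud terminals it should hear.

    A loud terminal hears the loud set minus itself; any other terminal
    hears the whole loud set.
    """
    loud = set(loudest)
    return {t: frozenset(loud - {t}) for t in terminals}


def mix_frame(frames: Dict[int, Sequence[int]], sources: Iterable[int]) -> List[int]:
    """Sum the source frames sample by sample, clipped to the 16-bit range."""
    selected = [frames[s] for s in sorted(sources)]
    if not selected:
        return []
    return [max(-32768, min(32767, sum(col))) for col in zip(*selected)]


if __name__ == "__main__":
    routing = mix_sources(range(1, 6), [1, 2, 3])
    print(sorted(routing[1]), sorted(routing[4]))  # who terminals 1 and 4 hear
    routing = mix_sources(range(1, 6), [2, 3, 5])
    print(sorted(routing[5]), sorted(routing[1]))  # after the loud set changes
```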
Shortcomings of existing solution one:
In the above technique, the sites that speak in a conference are often relatively fixed, especially in large conferences, so most terminals hear the same mix of the three loudest parties most of the time. If an encoder is allocated to every terminal even though many of them encode identical data, the number of encoders needed equals the number of connected terminals N. When N is large, this wastes resources and increases cost.
Existing solution two:
Solution two improves on solution one. Its core idea is to merge encoders that perform identical processing wherever possible, so that resource utilization is as high as possible. As shown in Fig. 3, suppose terminals 1, 2 and 3 are the three loudest and remain so for some time (assumed to be more than 2 s). Terminals 4, 5, ..., N all need the sum of the audio data of terminals 1, 2 and 3, so a single encoder that encodes this sum can serve all of terminals 4, 5, ..., N. Three other encoders encode the data output to the three loudest terminals 1, 2 and 3. That is, encoders C1, C2 and C3 encode for the three loudest sites in the conference, and encoder C4 encodes for all sites outside the three loudest. The number of encoders needed is then 1 + 3 = 4, and when N is large this scheme saves a large share of the resources.
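The 1 + 3 = 4 count can be checked by grouping terminals according to the set of sources they must hear: terminals with the same set can share one encoder. The following is a minimal sketch with a hypothetical function name, not the patent's implementation.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List


def group_by_mix(terminals: Iterable[int], loudest: Iterable[int]) -> Dict[FrozenSet[int], List[int]]:
    """Group terminals by the set of loud terminals they need to hear.

    Each group needs only one encoder, because every terminal in the
    group receives exactly the same mixed signal.
    """
    loud = frozenset(loudest)
    groups: Dict[FrozenSet[int], List[int]] = defaultdict(list)
    for t in terminals:
        groups[loud - {t}].append(t)
    return dict(groups)


if __name__ == "__main__":
    for n in (10, 100):
        groups = group_by_mix(range(1, n + 1), [1, 2, 3])
        print(n, "terminals ->", len(groups), "encoders")
```

With three loud terminals the number of groups stays at four no matter how large N becomes, whereas solution one needs N encoders.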
Shortcomings of existing solution two:
The case above assumes a single, fixed set of three loudest parties. If the speaking sites change during the conference, the three loudest parties used in mixing change accordingly. For example, if the three loudest become terminals 1, 4 and 5, then terminal 1 receives the sum of the audio of terminals 4 and 5, terminal 4 receives the sum of terminals 1 and 5, terminal 5 receives the sum of terminals 1 and 4, and the other terminals receive the sum of terminals 1, 4 and 5.
However, when the three loudest parties change and the encoder feeding a terminal is switched, sound quality suffers, because an encoder's state depends on the data it has already encoded and a direct switch breaks that continuity. For example, in Fig. 3, suppose the three loudest change from 1, 2, 3 to 1, 4, 5. Encoder C2 originally encoded the data sent to terminal 2. After the switch, the data sent to terminal 2 is encoded by encoder C4, so the sound heard at terminal 2 degrades for a period after the switch. Terminals 3, 4 and 5 have the same problem.
Summary of the invention
To solve the above problem, the invention provides an audio mixing processing method and device that avoid the degradation of the sound heard at terminals when the three loudest parties in a conference change.
The audio mixing processing method provided by the invention comprises: when the terminals with the maximum volume change, separately controlling the encoding of the audio signals output to the maximum-volume terminals before and after the change.
Here, separately controlling the encoding comprises allocating an independent encoder to each maximum-volume terminal before and after the change to encode the audio signal output to it.
The invention further comprises merging encoders that perform the same processing after the maximum-volume terminals have remained unchanged for longer than a certain threshold.
Here, the same processing means having identical input and output signals.
The maximum-volume terminals are the terminals corresponding to the one or more parties whose audio signals input to the multipoint control unit (MCU) are strongest.
The audio mixing processing device disclosed by the invention comprises a decoder, a mixing module, an encoder and an encoder switching module, wherein:
Decoder: decodes the received audio to obtain the raw audio data;
Mixing module: computes the envelope of the audio data processed by the decoder and mixes the several loudest parties;
Encoder: encodes the audio data after mixing;
Encoder switching module: controls the number of encoders performing encoding and the switching between them.
The number and switching of encoders are controlled as follows: when the maximum-volume terminals change, independent encoders are allocated to separately encode the audio signals output to the maximum-volume terminals before and after the change; after a certain time, encoders performing the same processing are merged. In addition, when the maximum-volume terminals change, the encoder switching module controls the exchange of information between the encoders corresponding to those terminals, so that encoder information and state remain continuous.
With the invention, when the maximum-volume terminals change, after the audio data of the connected terminals has been decoded and mixed, an independent encoder is allocated to each maximum-volume terminal before and after the change, and the audio signal output to it is encoded under this control and sent to the corresponding terminal. Speech quality is thus guaranteed while the number of encoders is kept under control.
Description of drawings
Fig. 1 is a networking diagram of a video conference;
Fig. 2 is a schematic diagram of audio processing;
Fig. 3 is a schematic diagram of audio processing with merged encoders;
Fig. 4 is a block diagram of the audio processing system of the invention;
Fig. 5 is a flow chart of audio processing in an embodiment of the invention.
Embodiment
The core idea of the invention is to merge encoders performing identical processing wherever possible and, when the several loudest parties change, to control the number of encoders that process the audio signals sent to terminals and the switching between them, so that terminal audio output quality is guaranteed while encoders are saved.
The audio mixing system provided by the invention decodes the received terminal data, mixes it, and then outputs the audio signals after encoding them with controlled encoders. The system comprises a decoding module, a mixing module, an encoding module and an encoder switching module, as shown in Fig. 4, wherein:
Decoding module: decodes the received audio to obtain the raw audio data;
Mixing module: computes the envelope of the audio data and mixes the several loudest parties;
Encoding module: encodes the raw audio data after mixing;
Encoder switching module: controls the number of encoders feeding the terminals and the switching between them.
In the encoder switching method adopted by the invention, when the maximum-volume terminals change, the encoding of the audio signals output to the maximum-volume terminals before and after the change is controlled separately. After the new state has been held for a period of time, encoders performing identical processing are merged.
The invention is described below with a specific embodiment, as shown in Fig. 5:
Suppose a conference has sites 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10. At a certain moment the three loudest terminals are 1, 2 and 3; at the next moment the three loudest terminals are 1, 5 and 6, and this state is held for more than 2 s.
While the three loudest are terminals 1, 2 and 3, four encoders are in use. Three serve the three loudest parties: encoder C1 is allocated to terminal 1 and encodes the sum of terminals 2 and 3; encoder C2 is allocated to terminal 2 and encodes the sum of terminals 1 and 3; encoder C3 is allocated to terminal 3 and encodes the sum of terminals 1 and 2. The fourth serves the terminals outside the three loudest: terminals 4, 5, 6, 7, 8, 9 and 10 share encoder C4, which encodes the sum of terminals 1, 2 and 3.
When the three loudest change to terminals 1, 5 and 6, encoders C5 and C6 are additionally allocated to terminals 5 and 6, which newly join the mix, in order to keep encoding continuous. At the same time, the information of encoder C4, which previously served terminals 5 and 6, is copied to C5 and C6, so that the coding information and encoder state for terminals 5 and 6 remain continuous. Terminals 2 and 3 now receive the same data as terminals 4, 7, 8, 9 and 10, but to reduce the effect of an encoder switch on the sound, their encoders are temporarily retained. The other terminals 4, 7, 8, 9 and 10 keep using encoder C4, and terminal 1 keeps its original encoder C1. Thus, when the three loudest change to terminals 1, 5 and 6, six encoders are in use in total.
If the three loudest remain terminals 1, 5 and 6 for more than 2 s (2 s is an assumed value, chosen so that the encoder switch affects the sound as little as possible), then for terminals 2 and 3 the data encoded by C2 and C3 is the same as the data encoded by C4. After this period of synchronization (2 s), the states of C2 and C3 can be considered essentially consistent with the state of C4, so C2 and C3 can be reclaimed and the output of C4 sent to terminals 2 and 3 as well. Terminals 2, 3, 4, 7, 8, 9 and 10 then share encoder C4, each of the three loudest terminals uses its own encoder, and the number of encoders returns to 4.
If the three loudest terminals change again within the 2 s, a new encoder is allocated to any newly loudest terminal that does not already have an independent encoder; otherwise no new encoder is needed. For a terminal outside the three loudest that still has its own encoder, once that encoder's input has been the same as that of the shared encoder C4 for 2 s or more, the terminal's encoder can be reclaimed and the terminal served with the output of the shared encoder C4. Other cases follow by analogy.
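The bookkeeping in this embodiment (allocate on promotion, copy the shared encoder's state, retain on demotion, reclaim after the hold time) can be sketched as a small state tracker. Everything named here is hypothetical: `EncoderSwitcher`, `Encoder`, the `state` list standing in for real codec history, and the choice to reclaim once the elapsed time is at least `hold_s` (the text says both "more than 2 s" and "2 s or more").

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable


@dataclass
class Encoder:
    name: str
    state: list = field(default_factory=list)  # placeholder for codec history


class EncoderSwitcher:
    """Tracks which terminals own a dedicated encoder (illustrative only)."""

    def __init__(self, hold_s: float = 2.0) -> None:
        self.hold_s = hold_s
        self.shared = Encoder("shared")          # plays the role of C4
        self.own: Dict[int, Encoder] = {}        # terminal -> dedicated encoder
        self.loudest: FrozenSet[int] = frozenset()
        self.demoted_at: Dict[int, float] = {}   # terminal -> time it left the loud set

    def update(self, loudest: Iterable[int], now: float) -> None:
        new = frozenset(loudest)
        for t in new - self.loudest:
            self.demoted_at.pop(t, None)
            if t not in self.own:
                # Newly loud terminal: start from a copy of the shared
                # encoder's state so its stream stays continuous.
                self.own[t] = Encoder(f"own-{t}", copy.deepcopy(self.shared.state))
        for t in self.loudest - new:
            # No longer loud: keep the dedicated encoder for now.
            self.demoted_at[t] = now
        self.loudest = new
        # Reclaim encoders that have encoded the same input as the shared
        # encoder for at least hold_s seconds.
        for t, since in list(self.demoted_at.items()):
            if now - since >= self.hold_s:
                del self.own[t]
                del self.demoted_at[t]

    def encoder_count(self) -> int:
        return len(self.own) + 1  # dedicated encoders plus the shared one

    def encoder_for(self, terminal: int) -> str:
        return self.own.get(terminal, self.shared).name


if __name__ == "__main__":
    sw = EncoderSwitcher(hold_s=2.0)
    sw.update([1, 2, 3], now=0.0)
    print(sw.encoder_count())   # loud set 1, 2, 3
    sw.update([1, 5, 6], now=10.0)
    print(sw.encoder_count())   # just after the change to 1, 5, 6
    sw.update([1, 5, 6], now=12.0)
    print(sw.encoder_count())   # after the hold time has elapsed
```

A terminal that is promoted again before its encoder is reclaimed simply keeps that encoder, which matches the rule that no new encoder is needed in that case. Actually feeding audio through the encoders is omitted here.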
In summary, the audio mixing system in a video conference can be divided into decoding, mixing, encoder switching and encoding. After the audio data of the connected terminals is decoded and mixed, the data to be encoded is output according to the encoder switching method above, then encoded and sent to the corresponding terminals. Speech quality is guaranteed while the number of encoders is kept under control.
The above is only a preferred embodiment of the invention, and the scope of protection of the invention is not limited to it. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be determined by the scope of the claims.

Claims (6)

1. An audio mixing processing method, characterized in that, when the terminals with the maximum volume change, an independent encoder is allocated to each maximum-volume terminal before and after the change, and the encoding of the audio signals output to the maximum-volume terminals before and after the change is controlled independently;
and after the maximum-volume terminals have remained unchanged for longer than a certain threshold, encoders performing the same processing are merged.
2. The method according to claim 1, characterized in that the same processing means having identical input and output signals.
3. The method according to claim 1, characterized in that the maximum-volume terminals are the terminals corresponding to the one or more parties whose audio signals input to the multipoint control unit (MCU) are strongest.
4. An audio mixing processing device, characterized in that the device comprises a decoder, a mixing module, an encoder and an encoder switching module, wherein:
Decoder: decodes the received audio to obtain the raw audio data;
Mixing module: computes the envelope of the audio data processed by the decoder and mixes the several loudest parties;
Encoder: encodes the audio data after mixing;
Encoder switching module: when the maximum-volume terminals change, allocates independent encoders to separately encode the audio signals output to the maximum-volume terminals before and after the change, and, after the maximum-volume terminals have remained unchanged for longer than a certain threshold, merges encoders performing the same processing.
5. The device according to claim 4, characterized in that the same processing means having identical input and output signals.
6. The device according to claim 4, characterized in that, when the maximum-volume terminals change, the encoder switching module controls the exchange of information between the encoders corresponding to those terminals, so that encoder information and state remain continuous.
CNB2006100629521A 2006-09-29 2006-09-29 Audio mixed processing method and processor Active CN100459696C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2006100629521A CN100459696C (en) 2006-09-29 2006-09-29 Audio mixed processing method and processor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2006100629521A CN100459696C (en) 2006-09-29 2006-09-29 Audio mixed processing method and processor

Publications (2)

Publication Number Publication Date
CN1941891A CN1941891A (en) 2007-04-04
CN100459696C true CN100459696C (en) 2009-02-04

Family

ID=37959612

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006100629521A Active CN100459696C (en) 2006-09-29 2006-09-29 Audio mixed processing method and processor

Country Status (1)

Country Link
CN (1) CN100459696C (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101466043B (en) * 2008-12-30 2010-12-15 华为终端有限公司 Method, equipment and system for processing multipath audio signal
CN102118523A (en) * 2009-12-30 2011-07-06 北京大唐高鸿数据网络技术有限公司 Mixing control method for centralized teleconference

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH1013556A (en) * 1996-06-21 1998-01-16 Oki Electric Ind Co Ltd Video conference system
US6008838A (en) * 1996-08-29 1999-12-28 Nec Corporation Multi-point video conference system
CN2483899Y (en) * 2001-02-16 2002-03-27 成都大志科技有限公司 Multi-point video telephone meeting controllor
CN1510898A (en) * 2002-12-23 2004-07-07 华为技术有限公司 Mixed speech processing method
US7007098B1 (en) * 2000-08-17 2006-02-28 Nortel Networks Limited Methods of controlling video signals in a video conference
CN1805006A (en) * 2006-01-24 2006-07-19 北京邮电大学 Quick and real-time sound mixing method for multimedia conference

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Research on an audio multipoint processor in video conferencing. 涂卫平, 胡瑞敏, 艾浩军, 谢兄. 武汉大学学报信息科学版 (Journal of Wuhan University, Information Science Edition), Vol. 27, No. 1. 2002 *

Also Published As

Publication number Publication date
CN1941891A (en) 2007-04-04

Similar Documents

Publication Publication Date Title
CN101466043B (en) Method, equipment and system for processing multipath audio signal
US7054820B2 (en) Control unit for multipoint multimedia/audio conference
CN101448152B (en) Multipath video processing method and system, terminal and medium server thereof
CN112104836A (en) Audio mixing method, system, storage medium and equipment for audio server
CN101656863A (en) Conference control method, device and system
CN101370139A (en) Method and device for switching channels
CN109479113A (en) For using the method and apparatus for compressing parallel codec in multimedia communication
CN101051465B (en) Method and apparatus for decoding layer encoded data
CN101146208A (en) Terminal with built-in multiple-point control unit and its communication method
CN102915736B (en) Mixed audio processing method and stereo process system
CN102118523A (en) Mixing control method for centralized teleconference
CN100459696C (en) Audio mixed processing method and processor
CN102624743A (en) Resource allocation method of media server
CN101502043B (en) Method for carrying out a voice conference, and voice conference system
CN100466671C (en) Method and device for switching speeches
CN102223569A (en) Multi-path video processing method
CN110971862B (en) Video conference broadcasting method and device
CN112019488B (en) Voice processing method, device, equipment and storage medium
CN100388780C (en) Code flow bandwidth equalizing method
CN103051556A (en) Stream media data control system and method thereof
CN203206388U (en) Multi-point control unit used for video conferences
CN105049889A (en) Video online transcoding system and transcoding method thereof
CN102158917A (en) Handoffs between different voice encoder systems
CN209659457U (en) A kind of device with the direct-connected recorded video meeting of video conference terminal
CN1581969A (en) Rate adapting method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant