CN103310789B - Sound event recognition method based on improved parallel model combination - Google Patents

Sound event recognition method based on improved parallel model combination

Info

Publication number
CN103310789B
CN103310789B (Application CN201310239724.7A)
Authority
CN
China
Prior art keywords
sound event
model
noise
template
spectral domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310239724.7A
Other languages
Chinese (zh)
Other versions
CN103310789A (en)
Inventor
刘宏
王一
李晓飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University Shenzhen Graduate School
Original Assignee
Peking University Shenzhen Graduate School
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Shenzhen Graduate School filed Critical Peking University Shenzhen Graduate School
Priority to CN201310239724.7A priority Critical patent/CN103310789B/en
Publication of CN103310789A publication Critical patent/CN103310789A/en
Application granted granted Critical
Publication of CN103310789B publication Critical patent/CN103310789B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The present invention relates to a sound event recognition method based on improved parallel model combination (PMC). Its steps comprise: 1) recording sound event data, training a GMM (Gaussian mixture model) on clean sound events, and establishing clean sound event templates; 2) acquiring noise data in the current real indoor noisy environment, training a GMM on the noise data, and establishing a noise template; 3) fusing the noise template and the clean sound event templates with the improved parallel model combination method to obtain noisy sound event templates; 4) sampling a noisy sound event signal and recognizing it according to the parameters of the noisy sound event templates. The invention uses a GMM that better describes the distribution of background noise features as one input to the PMC method, and the clean GMMs of 5 kinds of sound events as the other input, thereby ensuring the robustness of the recognition system to noise.

Description

Sound event recognition method based on improved parallel model combination
Technical field
The invention belongs to the field of audio signal processing in intelligent monitoring, relates to sound event recognition in indoor environments, and specifically relates to a sound event recognition method based on improved parallel model combination.
Background technology
Compared with the relatively mature speech recognition methods in artificial intelligence, using computers to recognize sound events is a comparatively recent research direction. Sound event recognition automatically judges and classifies sounds occurring in the physical environment that carry a certain meaning or reflect human behaviour. In a home intelligent monitoring system, sound event recognition can help people remotely monitor what is happening in the home environment and inform the user in time of what kind of event has occurred, which helps the user respond promptly. However, real environments contain complex noise; to achieve effective monitoring in real environments, handling noise is both necessary and urgent.
First, sound event recognition is a pattern recognition problem similar to automatic speech recognition; its basic methods are signal processing and pattern recognition. Existing sound event recognition methods comprise the following steps:
(1) Recording, pre-filtering and analog-to-digital conversion of the sound event signal. The recorded analog sound signal is first pre-filtered: high-pass filtering suppresses the 50 Hz mains noise, and low-pass filtering removes the frequency components above half the sampling frequency to prevent aliasing. The analog sound signal is then sampled and quantized to obtain a digital signal.
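For illustration only, a minimal pre-filtering sketch in Python follows. The filter family (Butterworth), order and exact cutoff frequencies are assumptions made for the example; the patent only specifies suppressing the 50 Hz mains component and removing components above half the sampling frequency.

import numpy as np
from scipy.signal import butter, filtfilt

def pre_filter(signal, fs=11025):
    # High-pass filtering to attenuate the 50 Hz mains component
    # (cutoff placed slightly above 50 Hz; the exact value is an assumption).
    b_hp, a_hp = butter(4, 60.0, btype="highpass", fs=fs)
    x = filtfilt(b_hp, a_hp, signal)
    # Low-pass filtering to remove components above half the sampling
    # frequency and so prevent aliasing.
    b_lp, a_lp = butter(4, 0.45 * fs, btype="lowpass", fs=fs)
    return filtfilt(b_lp, a_lp, x)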
(2) Framing and windowing. Like speech signals, sound signals are globally non-stationary but locally stationary in the short term. By analogy with speech, a sound signal can be regarded as stationary within 10-30 ms, so it can be divided into frames of about 30 ms. A window function is used to extract each frame; the choice of window function (shape and length) strongly influences the short-time analysis parameters. Common window functions include the rectangular, Hanning and Hamming windows; the Hamming window is generally chosen because it reflects the characteristic variations of the sound signal well.
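For illustration, a minimal framing-and-windowing sketch, assuming NumPy and the frame length and shift used later in the embodiment (256-sample frames, 128-sample shift, Hamming window):

import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    # Split a 1-D signal into overlapping frames and apply a Hamming window.
    # Assumes len(x) >= frame_len; trailing samples that do not fill a frame are dropped.
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hamming(frame_len)
    frames = np.stack([x[i * hop: i * hop + frame_len] for i in range(n_frames)])
    return frames * window

At an 11025 Hz sampling rate a 256-sample frame is about 23 ms, which falls inside the 10-30 ms quasi-stationary range mentioned above.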
(3) Feature extraction. Different sound events have different features; to distinguish different sound signals, their features must be described mathematically. Commonly used features for sound event recognition include time-domain features (short-time energy, short-time zero-crossing rate), frequency-domain features (sub-band energy, wavelet time-frequency features) and cepstral-domain features (linear prediction cepstral coefficients, LPCC; mel-frequency cepstral coefficients, MFCC).
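For illustration, a short sketch of the two time-domain features named above (short-time energy and short-time zero-crossing rate), computed on windowed frames produced by a framing step such as the one sketched earlier; the exact normalization is our choice, not specified here:

import numpy as np

def short_time_energy(frames):
    # Per-frame energy: sum of squared samples in each frame.
    return np.sum(frames ** 2, axis=1)

def short_time_zero_crossing_rate(frames):
    # Per-frame fraction of adjacent sample pairs whose sign changes.
    signs = np.sign(frames)
    return np.mean(np.abs(np.diff(signs, axis=1)) > 0, axis=1)

Cepstral features such as MFCC, which the rest of the method relies on, are typically computed with a mel filter bank followed by a logarithm and a discrete cosine transform, or taken from an existing speech-feature library.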
(4) Recognition. Sound event recognition adopts algorithms similar to those of speech recognition. Commonly used methods include support vector machine (SVM) classification, Gaussian mixture model (GMM) clustering, hidden Markov model (HMM) methods and Bayesian classification.
Second, noise handling. When the above recognition methods are applied in real environments, the performance of the recognition system deteriorates sharply with the mismatch between training data and test data, and the cause of this mismatch is precisely the influence of environmental noise. The training/testing mismatch caused by noise can be analysed in three spaces: the signal space, the feature space and the model space. Common countermeasures include sound enhancement similar to speech enhancement, robust feature extraction, feature compensation, and model compensation methods such as parallel model combination (PMC), all of which improve the robustness of the system.
Most existing methods still follow the toolkit of speech recognition, and noise handling is limited to the methods above. Among them, the PMC-based methods, which can describe the environmental noise and are widely adopted, fully exploit the information in the environment and improve the robustness of recognition. However, existing PMC methods describe the noise characteristics with a single Gaussian model (SGM); when the noise is relatively complex, an SGM cannot characterize the noise well, so the recognition rate is not ideal under complex noise conditions.
Summary of the invention
In order to solve the above technical problems, the object of the present invention is to provide a method that obtains, through improved model parameter fusion, noisy sound event models matching the noise environment, so that sound events to be recognized in a real noisy environment can be identified.
In order to achieve the above object, the technical solution of the present invention is a sound event recognition method based on improved parallel model combination, whose steps comprise:
1) training a GMM (Gaussian mixture model) on clean sound events and establishing clean sound event templates;
2) training a GMM on noise data and establishing a noise template;
3) fusing the noise template and the clean sound event templates with the parallel model combination method to obtain noisy sound event templates;
4) sampling a noisy sound event signal and recognizing it according to the parameters of the noisy sound event templates.
Further, the method for establishing the clean sound event templates is as follows:
1) recording sound event data in a quiet, noise-free indoor environment, pre-filtering and analog-to-digital converting the recorded sound events, then framing and windowing;
2) extracting MFCC (mel-frequency cepstral coefficient) features and training the GMM (Gaussian mixture) template of each sound event.
Further, the Gaussian mixture models are trained and their parameters updated with the EM algorithm, and the GMM parameters of a trained clean sound event are λ_x = {w_xk, μ_xk, Σ_xk}, k = 1, 2, ..., M, where w_xk is the mixture weight, μ_xk the mean and Σ_xk the variance of the clean sound event model, and M is the order of the Gaussian mixture.
Further, the noise data in the current environment is acquired under the real indoor noisy environment, and the noise template is established by extracting MFCC features and training the GMM template of the noise; the noise template GMM parameters are λ_n = {w_nk, μ_nk, Σ_nk}, k = 1, 2, ..., M, where w_nk is the mixture weight, μ_nk the mean and Σ_nk the variance of the noise model, and M is the order of the Gaussian mixture.
Further, the improved parallel model fusion applied to the noise template and the clean sound event templates is as follows:
(1) The inverse discrete cosine transform maps any model parameters from the cepstral domain to the log-spectral domain, giving the log-spectral mean μ^log = C⁻¹μ and variance Σ^log = C⁻¹Σ(C⁻¹)^T, where C is the discrete cosine transform matrix and μ, Σ are respectively the cepstral-domain mean and variance of the model;
(2) The log-spectral mean and variance of the log-spectral-domain model are transformed to the linear spectral domain through the exponential function: μ_i^lin = exp(μ_i^log + Σ_ii^log/2) is the i-th element of the linear-spectral mean vector and Σ_ij^lin = μ_i^lin μ_j^lin [exp(Σ_ij^log) − 1] is the element in row i, column j of the linear-spectral covariance matrix, where μ_i^log is the i-th element of the log-spectral mean vector and Σ_ij^log is the element in row i, column j of the log-spectral covariance matrix;
(3) The improved parallel model combination fuses the clean sound event model parameters and the noise model parameters in the linear spectral domain, giving the mean μ_yk^lin and variance Σ_yk^lin of the fused noisy sound event model in the linear spectral domain, where μ_xk^lin and Σ_xk^lin are the mean and variance of the clean sound event model in the linear spectral domain after the transformations of steps (1) and (2), and μ_nk^lin and Σ_nk^lin are the mean and variance of the noise model in the linear spectral domain after the transformations of steps (1) and (2);
(4) The mean and variance of the fused noisy sound event model in the linear spectral domain are transformed back to log-spectral parameters through the inverse of step (2), and then to cepstral-domain parameters through the inverse of step (1), yielding the mean vector and variance of the noisy sound event model.
Further, the parameters of the noisy sound event model are λ_y = {w_yk, μ_yk, Σ_yk}, k = 1, 2, ..., M, where w_yk, μ_yk and Σ_yk are respectively the mixture weight, mean and variance of the noisy template. Since the mixture weights do not differ between the linear spectral, log-spectral and cepstral domains, the mixture weight w_yk of the noisy sound event model equals the weight w_xk of the clean sound event template; M is the order of the Gaussian mixture.
Further, the method for recognizing a sample signal according to the parameters of the noisy sound event models is as follows:
1) pre-filtering and analog-to-digital converting the sample signal, framing and windowing, then extracting multi-dimensional MFCC features to obtain the sample feature sequence;
2) matching the feature vector sequence of the sample signal against the noisy sound event models, computing the matching likelihood, and taking the template with the maximum likelihood as the recognition result.
Further, the noise data adopts the babble noise of NoiseX-92 and/or air-conditioning noise in an indoor environment.
Technical effects of the present invention:
Under a complex noise background, the present invention establishes a background GMM that better describes the distribution of background noise features and uses it as one input to the PMC method, with the clean GMMs of 5 kinds of sound events as the other input. The improved model parameter fusion yields noisy sound event models matching the noise environment, giving good recognition results for the sound events to be recognized in a real noisy environment. The invention thus ensures the robustness of the recognition system to noise.
Brief description of the drawings
Fig. 1 is a schematic diagram of the overall recognition process of the sound event recognition method based on improved parallel model combination of the present invention.
Fig. 2 is a schematic diagram of the fusion method in an embodiment of the sound event recognition method based on improved parallel model combination of the present invention.
Fig. 3 is a schematic diagram of the recognition results for 5 kinds of sound events in an embodiment of the sound event recognition method based on improved parallel model combination of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings of the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention; all other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative work fall within the protection scope of the present invention.
The present invention targets 5 kinds of sound events that frequently occur in indoor environments and need attention. In addition, it fully considers the presence of complex noise (air-conditioning noise recorded in an indoor environment and the babble noise of the public noise database NoiseX-92) and describes the environmental noise signal with a GMM (Gaussian mixture model; see "Speech Signal Processing", 2nd edition, by Zhao Li, China Machine Press, pp. 228-230). The weighted combination of multiple Gaussian distributions in a GMM describes the feature distribution of the background and can more fully capture the information of the background noise. At the model level, the clean sound event model parameters are compensated with the described background noise model parameters to obtain noisy sound event models, preventing the mismatch between training data and test data caused by noise.
The present invention is a sound event recognition method based on improved parallel model combination, specifically:
First, establish the templates of the clean sound events.
(1) In a quiet environment, record the data of the 5 kinds of sound events and apply the preprocessing (windowing, framing, etc.) according to the aforementioned sound signal processing steps.
(2) Then, as described above, extract the robust MFCC features and train the Gaussian mixture models of the 5 kinds of sound events. The Gaussian mixture models are trained with the EM algorithm, which updates the Gaussian model parameters. Suppose the GMM parameters of one of the trained clean sound events are:
λ_x = {w_xk, μ_xk, Σ_xk}, k = 1, 2, ..., M    (1)
Second, acquire the noise data in the current environment, extract MFCC features, and establish the GMM template of the noise. The noise template parameters are:
λ_n = {w_nk, μ_nk, Σ_nk}, k = 1, 2, ..., M    (2)
Third, perform model fusion. Since the data used to train the GMMs in the present invention are all MFCC features, which belong to the cepstral domain, while the background noise and sound event model parameters can only be added in the linear spectral domain, both models are processed as follows (λ = {w_k, μ_k, Σ_k}, k = 1, 2, ..., M, uniformly denotes the GMM of a clean sound event or of the background noise):
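For illustration, a small container for this unified parameter set λ = {w_k, μ_k, Σ_k}; the field names and array shapes are our own choices:

from dataclasses import dataclass
import numpy as np

@dataclass
class GMMParams:
    # Unified GMM parameter set λ = {w_k, μ_k, Σ_k}, k = 1..M.
    weights: np.ndarray  # shape (M,), mixture weights w_k
    means: np.ndarray    # shape (M, D), one mean vector μ_k per component
    covars: np.ndarray   # shape (M, D, D), one covariance matrix Σ_k per component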
1) The model parameters are mapped from the cepstral domain to the log-spectral domain, specifically by the inverse of the discrete cosine transform. Here the difference (delta) coefficients of the MFCC are not extracted. The computation follows formulas (3) and (4):
μ^log = C⁻¹ μ    (3)
Σ^log = C⁻¹ Σ (C⁻¹)^T    (4)
where μ^log and Σ^log are the mean and variance of the log-spectral-domain model, μ and Σ are the mean and variance of the cepstral-domain model, and C is the discrete cosine transform matrix.
2) The normally distributed random variables of the log-spectral domain are transformed to the linear spectral domain through the exponential function, as in formulas (5) and (6):
μ_i^lin = exp(μ_i^log + Σ_ii^log / 2)    (5)
Σ_ij^lin = μ_i^lin μ_j^lin [exp(Σ_ij^log) − 1]    (6)
where μ_i^lin and Σ_ij^lin are respectively the i-th element of the linear-spectral mean vector and the element in row i, column j of the linear-spectral covariance matrix, and μ_i^log and Σ_ij^log are respectively the i-th element of the log-spectral mean vector and the element in row i, column j of the log-spectral covariance matrix.
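The following Python sketch illustrates formulas (3)-(6), mapping a single cepstral-domain Gaussian component to the linear spectral domain via the log-spectral domain. The unnormalized DCT-II construction and the use of a pseudo-inverse for C⁻¹ (C is generally rectangular, e.g. 13 cepstra from 26 filter-bank channels) are assumptions made for the example; the patent does not specify the filter-bank size or the DCT normalization.

import numpy as np

def dct_matrix(n_ceps=13, n_filters=26):
    # DCT-II matrix C mapping log filter-bank energies to cepstral coefficients.
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_filters)[None, :]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))

def cepstral_to_linear(mu_c, cov_c, C):
    # Cepstral-domain Gaussian (μ, Σ) -> log-spectral domain -> linear spectral domain.
    C_inv = np.linalg.pinv(C)              # pseudo-inverse stands in for C⁻¹
    mu_log = C_inv @ mu_c                  # formula (3)
    cov_log = C_inv @ cov_c @ C_inv.T      # formula (4)
    mu_lin = np.exp(mu_log + np.diag(cov_log) / 2.0)              # formula (5)
    cov_lin = np.outer(mu_lin, mu_lin) * (np.exp(cov_log) - 1.0)  # formula (6)
    return mu_lin, cov_lin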
3) Suppose the linear-spectral mean vector and variance of the clean sound event model obtained from the formulas above are μ_xk^lin and Σ_xk^lin, and the linear-spectral mean vector and variance of the noise model are μ_nk^lin and Σ_nk^lin; the two models are fused with formulas (7) and (8):
μ_yk^lin = g μ_xk^lin + (1 − g) Σ_{k=1}^{K} w_nk μ_nk^lin    (7)
Σ_yk^lin = g² Σ_xk^lin + (1 − g)² Σ_{k=1}^{K} w_nk Σ_nk^lin    (8)
where μ_yk^lin and Σ_yk^lin represent the mean vector and variance of the fused noisy sound event model, and g represents a gain factor.
4) The fused linear-spectral-domain model parameters are transformed back through the inverse of formulas (5) and (6) to obtain the log-spectral parameters of the model, and then through the inverse of formulas (3) and (4) to obtain the cepstral-domain parameters of the model. Applying this process to all 5 kinds of sound event models finally yields the 5 fused noisy sound event model parameter sets.
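A sketch of the fusion of formulas (7) and (8) and of the inverse mapping back to the cepstral domain, under the same illustrative assumptions as the previous sketch; the function and argument names are ours. pmc_fuse compensates one clean-event Gaussian component with the weighted sum over all noise-GMM components, and linear_to_cepstral inverts formulas (5)-(6) and then (3)-(4).

import numpy as np

def pmc_fuse(mu_x_lin, cov_x_lin, noise_w, noise_mu_lin, noise_cov_lin, g=0.5):
    # Improved PMC fusion in the linear spectral domain, formulas (7) and (8).
    noise_mu = np.einsum("k,kd->d", noise_w, noise_mu_lin)       # Σ_k w_nk μ_nk^lin
    noise_cov = np.einsum("k,kij->ij", noise_w, noise_cov_lin)   # Σ_k w_nk Σ_nk^lin
    mu_y = g * mu_x_lin + (1.0 - g) * noise_mu                   # formula (7)
    cov_y = g ** 2 * cov_x_lin + (1.0 - g) ** 2 * noise_cov      # formula (8)
    return mu_y, cov_y

def linear_to_cepstral(mu_lin, cov_lin, C):
    # Inverse of formulas (5)-(6): linear spectral domain back to the log-spectral domain.
    cov_log = np.log(cov_lin / np.outer(mu_lin, mu_lin) + 1.0)
    mu_log = np.log(mu_lin) - np.diag(cov_log) / 2.0
    # Inverse of formulas (3)-(4): log-spectral domain back to the cepstral domain.
    return C @ mu_log, C @ cov_log @ C.T

Applying cepstral_to_linear, pmc_fuse and linear_to_cepstral to every component of every clean sound event GMM (with the noise GMM parameters mapped to the linear spectral domain in the same way) reproduces the per-component compensation described above.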
Fourth, for a noisy sound event signal sample recorded in the indoor noise environment, the purpose of recognition is to determine which of the 5 kinds of sound events the current sample belongs to, i.e. to compute the posterior probability of the sample under the 5 models and take the model with the maximum posterior probability as the class of the sample. According to Bayes' formula, since the prior probabilities of the 5 kinds of sound events are identical, for a given observation vector the maximum a posteriori computation is equivalent to computing the likelihood of the observation under the 5 compensated sound event models and taking the model with the maximum likelihood as the class of the sample.
Fig. 1 shows the overall recognition process of the sound event recognition method based on improved parallel model combination of the present invention, comprising a training part and a recognition part.
The present invention considers 5 kinds of sound events that often occur in indoor environments and need attention: the sound of a door closing, tapping, clapping, voice, and a birdie sound. The training processes for the 5 sound event templates and the noise template are as follows:
1. In a quiet environment, record a database of the 5 kinds of sound events and label it. There are 100 recordings per sound event type, produced respectively by 5 male and 5 female subjects through vocalization or the corresponding action. The noise adopts the babble noise of NoiseX-92 and air-conditioning noise in an indoor environment.
2. Pre-filtering: high-pass filtering suppresses the 50 Hz mains noise, and low-pass filtering removes the frequency components above half the sampling frequency. Analog-to-digital conversion: the sampling frequency is 11025 Hz and the sampling precision is 16 bits.
3. For each complete sound segment, framing and windowing: the frame length is 256 samples, the frame shift is 128 samples, and the window function is the Hamming window.
4. Feature extraction: extract 13-dimensional MFCC features.
5. Each kind of sound event uses 60 feature vector sequences and the noise uses 10 feature vector sequences; the GMM templates λ_xk, k = 1, 2, ..., 5, of the 5 kinds of sounds and the noise template λ_n are trained with the expectation-maximization (EM) algorithm. Each template is a Gaussian mixture model with 8 Gaussian components.
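As an illustration, a minimal training sketch using scikit-learn's EM-based GaussianMixture; the diagonal covariance type and the EM settings are assumptions made for the example (the patent only specifies 8 Gaussian components, 13-dimensional MFCC features and EM training):

import numpy as np
from sklearn.mixture import GaussianMixture

def train_gmm(feature_sequences, n_components=8):
    # Fit an 8-component GMM with EM on stacked 13-dimensional MFCC frame vectors.
    # feature_sequences: list of (n_frames, 13) arrays, one per recording.
    X = np.vstack(feature_sequences)
    gmm = GaussianMixture(n_components=n_components, covariance_type="diag",
                          max_iter=200, random_state=0)
    return gmm.fit(X)

# One model per sound event plus one for the background noise, e.g.:
#   event_models = {name: train_gmm(seqs) for name, seqs in event_training_data.items()}
#   noise_model = train_gmm(noise_sequences)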
The model fusion process of the present invention is shown in Fig. 2, a schematic diagram of the fusion method in an embodiment of the sound event recognition method based on improved parallel model combination.
The concrete steps are as follows:
1. Using formulas (3), (4), (5) and (6), convert the background noise model parameters and the clean sound event model parameters from the cepstral domain to the linear spectral domain.
2. Using formulas (7) and (8), fuse the linear-spectral-domain parameters of each clean sound event with the linear-spectral-domain parameters of the noise; here g = 0.5.
3. Transform the fused linear-spectral-domain parameters of the noisy sound event models back through the inverse of formulas (5) and (6) and then the inverse of formulas (3) and (4), obtaining the 5 noisy sound event GMM models λ_yk, k = 1, 2, ..., 5.
The recognition process of the present invention is as follows:
1. Under the above two noise conditions, record 110 noisy sound event signals of the 5 kinds in total. Pre-filter them and perform analog-to-digital conversion; the sampling frequency is 11025 Hz and the sampling precision is 16 bits.
2. Framing and windowing: the frame length is 256 samples, the frame shift is 128 samples, and the window function is the Hamming window. Extract 13-dimensional MFCC features.
3. Template matching: the feature vector sequence of the current sound signal is matched against the 5 kinds of noisy sound event templates. The feature vector sequence is X_k, k = 1, ..., N, and the 5 templates are λ_yk, k = 1, 2, ..., 5. The matching likelihood is computed and the template with the maximum likelihood is the recognition result. Fig. 3 shows the recognition results for the 5 kinds of sound events in this embodiment of the sound event recognition method based on improved parallel model combination.
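For illustration, a matching sketch that assumes each noisy sound event template is available as an object with a scikit-learn-style score_samples method returning per-frame log-likelihoods; summing these over the frames of the sample gives the matching likelihood, and the template with the maximum value is taken as the recognition result:

def classify(sample_features, noisy_event_models):
    # sample_features: (n_frames, 13) array of MFCC vectors of the noisy sample.
    # noisy_event_models: dict mapping event name -> fitted/compensated GMM object.
    scores = {name: gmm.score_samples(sample_features).sum()
              for name, gmm in noisy_event_models.items()}
    best = max(scores, key=scores.get)
    return best, scores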
The above embodiment is an example of the present invention. Although an example of the present invention is disclosed for the purpose of illustration, those skilled in the art will appreciate that various substitutions, changes and modifications are possible without departing from the spirit and scope of the invention and the appended claims. Therefore, the present invention should not be limited to the content of this example.

Claims (7)

1. A sound event recognition method based on improved parallel model combination, whose steps comprise:
1) training a GMM (Gaussian mixture model) on clean sound events and establishing clean sound event templates;
2) training a GMM on noise data and establishing a noise template;
3) fusing the noise template and the clean sound event templates with the parallel model combination method to obtain noisy sound event templates, comprising the following sub-steps:
3-1) mapping any model parameters from the cepstral domain to the log-spectral domain with the inverse discrete cosine transform, obtaining the log-spectral mean μ^log = C⁻¹μ and variance Σ^log = C⁻¹Σ(C⁻¹)^T, where C is the discrete cosine transform matrix and μ, Σ are respectively the cepstral-domain mean and variance of the model;
3-2) transforming the log-spectral mean and variance of the log-spectral-domain model to the linear spectral domain through the exponential function, where μ_i^lin = exp(μ_i^log + Σ_ii^log/2) is the i-th element of the linear-spectral mean vector and Σ_ij^lin = μ_i^lin μ_j^lin [exp(Σ_ij^log) − 1] is the element in row i, column j of the linear-spectral covariance matrix, μ_i^log is the i-th element of the log-spectral mean vector, and Σ_ij^log is the element in row i, column j of the log-spectral covariance matrix;
3-3) fusing the clean sound event model parameters and the noise model parameters in the linear spectral domain with the improved parallel model combination method, obtaining the mean μ_yk^lin and variance Σ_yk^lin of the fused noisy sound event model in the linear spectral domain, where μ_xk^lin and Σ_xk^lin are the mean and variance of the clean sound event model in the linear spectral domain after the transformations of steps 3-1) and 3-2), and μ_nk^lin and Σ_nk^lin are the mean and variance of the noise model in the linear spectral domain after the transformations of steps 3-1) and 3-2);
3-4) transforming the mean and variance of the fused noisy sound event model in the linear spectral domain back through the inverse of step 3-2) to obtain the log-spectral parameters, and then through the inverse of step 3-1) to obtain the cepstral-domain model parameters, yielding the mean vector and variance of the noisy sound event model;
4) sampling a noisy sound event signal and recognizing it according to the parameters of the noisy sound event templates.
2. The sound event recognition method based on improved parallel model combination of claim 1, characterized in that the method for establishing the clean sound event templates is as follows:
1) recording sound event data in a quiet, noise-free indoor environment, pre-filtering and analog-to-digital converting the recorded sound events, then framing and windowing;
2) extracting MFCC (mel-frequency cepstral coefficient) features and training the GMM (Gaussian mixture) template of each sound event.
3. The sound event recognition method based on improved parallel model combination of claim 1, characterized in that the Gaussian mixture models are trained and their parameters updated with the EM algorithm, and the GMM parameters of a trained clean sound event are λ_x = {w_xk, μ_xk, Σ_xk}, k = 1, 2, ..., M, where w_xk is the mixture weight, μ_xk the mean and Σ_xk the variance of the clean sound event model, and M is the order of the Gaussian mixture.
4. The sound event recognition method based on improved parallel model combination of claim 1, characterized in that the noise data in the current environment is acquired under the real indoor noisy environment, and the noise template is established by extracting MFCC features and training the GMM template of the noise, the noise template GMM parameters being λ_n = {w_nk, μ_nk, Σ_nk}, k = 1, 2, ..., M, where w_nk is the mixture weight, μ_nk the mean and Σ_nk the variance of the noise model, and M is the order of the Gaussian mixture.
5. The sound event recognition method based on improved parallel model combination of claim 1, characterized in that the parameters of the noisy sound event model are λ_y = {w_yk, μ_yk, Σ_yk}, k = 1, 2, ..., M, where w_yk, μ_yk and Σ_yk are respectively the mixture weight, mean and variance of the noisy template; since the mixture weights do not differ across the linear spectral, log-spectral and cepstral domains, the mixture weight w_yk of the noisy sound event model equals the weight w_xk of the clean sound event template; M is the order of the Gaussian mixture.
6. The sound event recognition method based on improved parallel model combination of any one of claims 1-5, characterized in that the noise data adopts the babble noise of NoiseX-92 and/or air-conditioning noise in an indoor environment.
7. The sound event recognition method based on improved parallel model combination of claim 1, characterized in that the method for recognizing a sample signal according to the parameters of the noisy sound event models is as follows:
1) pre-filtering and analog-to-digital converting the sample signal, framing and windowing, then extracting multi-dimensional MFCC features to obtain the sample feature sequence;
2) matching the feature vector sequence of the sample signal against the noisy sound event models, computing the matching likelihood, and taking the template with the maximum likelihood as the recognition result.
CN201310239724.7A 2013-05-08 2013-06-17 Sound event recognition method based on improved parallel model combination Expired - Fee Related CN103310789B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310239724.7A CN103310789B (en) 2013-05-08 2013-06-17 Sound event recognition method based on improved parallel model combination

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201310166660 2013-05-08
CN2013101666602 2013-05-08
CN201310166660.2 2013-05-08
CN201310239724.7A CN103310789B (en) 2013-05-08 2013-06-17 Sound event recognition method based on improved parallel model combination

Publications (2)

Publication Number Publication Date
CN103310789A CN103310789A (en) 2013-09-18
CN103310789B true CN103310789B (en) 2016-04-06

Family

ID=49135932

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310239724.7A Expired - Fee Related CN103310789B (en) Sound event recognition method based on improved parallel model combination

Country Status (1)

Country Link
CN (1) CN103310789B (en)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104485108A (en) * 2014-11-26 2015-04-01 河海大学 Noise and speaker combined compensation method based on multi-speaker model
CN105657338A (en) * 2014-12-02 2016-06-08 深圳大学 Internet based remote mobile terminal control system and control method
CN104408440B (en) * 2014-12-10 2017-10-17 重庆邮电大学 A kind of facial expression recognizing method merged based on two step dimensionality reductions and Concurrent Feature
CN105118516A (en) * 2015-09-29 2015-12-02 浙江图维电力科技有限公司 Identification method of engineering machinery based on sound linear prediction cepstrum coefficients (LPCC)
CN105405447B (en) * 2015-10-27 2019-05-24 航宇救生装备有限公司 One kind sending words respiratory noise screen method
CN107492153B (en) * 2016-06-07 2020-04-07 腾讯科技(深圳)有限公司 Attendance system, method, attendance server and attendance terminal
CN106340292B (en) * 2016-09-08 2019-08-20 河海大学 A kind of sound enhancement method based on continuing noise estimation
CN108922518B (en) * 2018-07-18 2020-10-23 苏州思必驰信息科技有限公司 Voice data amplification method and system
CN109273021B (en) * 2018-08-09 2021-11-30 厦门亿联网络技术股份有限公司 RNN-based real-time conference noise reduction method and device
CN109631104A (en) * 2018-11-01 2019-04-16 广东万和热能科技有限公司 Air quantity Automatic adjustment method, device, equipment and the storage medium of kitchen ventilator
CN109472311A (en) * 2018-11-13 2019-03-15 北京物灵智能科技有限公司 A kind of user behavior recognition method and device
CN110120230B (en) * 2019-01-08 2021-06-01 国家计算机网络与信息安全管理中心 Acoustic event detection method and device
CN110544469B (en) * 2019-09-04 2022-04-19 秒针信息技术有限公司 Training method and device of voice recognition model, storage medium and electronic device
CN110838306B (en) * 2019-11-12 2022-05-13 广州视源电子科技股份有限公司 Voice signal detection method, computer storage medium and related equipment
CN113112681A (en) * 2020-01-13 2021-07-13 阿里健康信息技术有限公司 Vending equipment, and shipment detection method and device
CN111028841B (en) * 2020-03-10 2020-07-07 深圳市友杰智新科技有限公司 Method and device for awakening system to adjust parameters, computer equipment and storage medium
CN111711881B (en) * 2020-06-29 2022-02-18 深圳市科奈信科技有限公司 Self-adaptive volume adjustment method according to environmental sound and wireless earphone
CN112820318A (en) * 2020-12-31 2021-05-18 西安合谱声学科技有限公司 Impact sound model establishment and impact sound detection method and system based on GMM-UBM

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1397929A (en) * 2002-07-12 2003-02-19 清华大学 Speech intensifying-characteristic weighing-logrithmic spectrum addition method for anti-noise speech recognization
US6876966B1 (en) * 2000-10-16 2005-04-05 Microsoft Corporation Pattern recognition training method and apparatus using inserted noise followed by noise reduction
CN1819019A (en) * 2006-03-13 2006-08-16 华南理工大学 Phonetic identifier based on matrix characteristic vector function and identification thereof
CN102426837A (en) * 2011-12-30 2012-04-25 中国农业科学院农业信息研究所 Robustness method used for voice recognition on mobile equipment during agricultural field data acquisition

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4352790B2 (en) * 2002-10-31 2009-10-28 セイコーエプソン株式会社 Acoustic model creation method, speech recognition device, and vehicle having speech recognition device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6876966B1 (en) * 2000-10-16 2005-04-05 Microsoft Corporation Pattern recognition training method and apparatus using inserted noise followed by noise reduction
CN1397929A (en) * 2002-07-12 2003-02-19 清华大学 Speech intensifying-characteristic weighing-logrithmic spectrum addition method for anti-noise speech recognization
CN1819019A (en) * 2006-03-13 2006-08-16 华南理工大学 Phonetic identifier based on matrix characteristic vector function and identification thereof
CN102426837A (en) * 2011-12-30 2012-04-25 中国农业科学院农业信息研究所 Robustness method used for voice recognition on mobile equipment during agricultural field data acquisition

Also Published As

Publication number Publication date
CN103310789A (en) 2013-09-18

Similar Documents

Publication Publication Date Title
CN103310789B (en) Sound event recognition method based on improved parallel model combination
CN103280220B (en) A kind of real-time recognition method for baby cry
CN103177722B (en) A kind of song retrieval method based on tone color similarity
WO2019019252A1 (en) Acoustic model training method, speech recognition method and apparatus, device and medium
CN102968990B (en) Speaker identifying method and system
CN102436809B (en) Network speech recognition method in English oral language machine examination system
CN104900229A (en) Method for extracting mixed characteristic parameters of voice signals
CN103345923A (en) Sparse representation based short-voice speaker recognition method
CN103065629A (en) Speech recognition system of humanoid robot
CN103117059A (en) Voice signal characteristics extracting method based on tensor decomposition
CN101923855A (en) Test-irrelevant voice print identifying system
CN104078039A (en) Voice recognition system of domestic service robot on basis of hidden Markov model
CN104900235A (en) Voiceprint recognition method based on pitch period mixed characteristic parameters
CN102789779A (en) Speech recognition system and recognition method thereof
CN109256144A (en) Sound enhancement method based on integrated study and noise perception training
CN109949823A (en) A kind of interior abnormal sound recognition methods based on DWPT-MFCC and GMM
CN104795064A (en) Recognition method for sound event under scene of low signal to noise ratio
CN106024010A (en) Speech signal dynamic characteristic extraction method based on formant curves
CN102592593B (en) Emotional-characteristic extraction method implemented through considering sparsity of multilinear group in speech
CN109192200A (en) A kind of audio recognition method
CN104887263A (en) Identity recognition algorithm based on heart sound multi-dimension feature extraction and system thereof
CN111341319B (en) Audio scene identification method and system based on local texture features
CN105845149A (en) Predominant pitch acquisition method in acoustical signal and system thereof
CN103456302A (en) Emotion speaker recognition method based on emotion GMM model weight synthesis
Li et al. Multi-level attention model with deep scattering spectrum for acoustic scene classification

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160406

Termination date: 20170617