WO2006004050A1 - Specific acoustic signal containing section detection system, method, and program - Google Patents
Specific acoustic signal containing section detection system, method, and program
- Publication number
- WO2006004050A1 (PCT/JP2005/012223)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- small region
- spectrogram
- signal
- reference signal
- small
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/10—Speech classification or search using distance or distortion measures between unknown speech and reference templates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/63—Querying
- G06F16/632—Query formulation
- G06F16/634—Query by example, e.g. query by humming
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/68—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/683—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H1/00—Details of electrophonic musical instruments
- G10H1/0008—Associated control or indicating means
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2240/00—Data organisation or data communication aspects, specifically adapted for electrophonic musical tools or instruments
- G10H2240/121—Musical libraries, i.e. musical databases indexed by musical parameters, wavetables, indexing schemes using musical parameters, musical rule bases or knowledge bases, e.g. for automatic composing methods
- G10H2240/131—Library retrieval, i.e. searching a database or selecting a specific musical piece, segment, pattern, rule or parameter set
- G10H2240/141—Library retrieval matching, i.e. any of the steps of matching an inputted segment or phrase with musical database contents, e.g. query by humming, singing or playing; the steps may include, e.g. musical analysis of the input, musical feature extraction, query formulation, or details of the retrieval process
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H2250/00—Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
- G10H2250/025—Envelope processing of music signals in, e.g. time domain, transform domain or cepstrum domain
- G10H2250/031—Spectrum envelope processing
Definitions
- The present invention relates to signal detection that finds, within a stored acoustic signal that is longer than a reference acoustic signal, the position of a signal similar to the reference signal.
- More specifically, it relates to a specific acoustic signal containing section detection system that detects acoustic signals using, for example, part of a piece of music on a CD (Compact Disc) as the reference signal.
- For example, the invention takes part of a specific piece of music recorded on a music CD as the reference signal and detects the sections of the stored signal that contain it.
- In this way, sections in which the music is used as BGM (background music) can be found in a very large database, for example recorded TV broadcasts.
- Detecting a section containing a specific acoustic signal means detecting, in a longer acoustic signal (the stored signal, also called the accumulated signal below), the similar sections that contain a sound similar to a specific acoustic signal called the reference signal.
- Here, detecting a similar section is defined as detecting the time at which that section begins.
- Section detection for a specific acoustic signal, aimed at finding music used as BGM, has rarely been addressed in the past. One existing approach is the self-optimized spectral correlation method (for example, M. Abe and M. Nishiguchi: Self-optimized Spectral Correlation Method for Background Music Identification, Proc. IEEE ICME '02, Lausanne, vol. 1, pp. 333-336 (2002)).
- However, the self-optimized spectral correlation method requires a large amount of computation, so detection takes a very long time.
- For this reason, the division match search method has been proposed as a faster way to detect sections containing a specific acoustic signal (for example, JP 2004-102023, "Specific acoustic signal detection method, signal detection apparatus, signal detection program, and recording medium").
- Fig. 7 shows an overview of the above-described division match search method, and the processing procedure of the division match search method is described below.
- In step (a) of Fig. 7, a power spectrum is extracted from the acoustic waveform of both the reference signal and the accumulated signal to obtain a spectrogram of each.
- Small regions of a fixed size are then cut out of the reference signal spectrogram at equal intervals.
- Each small region is obtained by cutting a fixed number of points out of the original spectrogram along both the frequency axis and the time axis. The small regions may overlap.
- A spectrogram cut out in this way is called a small-region spectrogram.
- A small-region spectrogram cut out of the reference signal spectrogram is denoted F.
- Each small-region spectrogram is normalized individually in order to absorb fluctuations in volume.
- In step (b) of Fig. 7, for each F in the reference signal, the accumulated signal is searched within frequency band ωm for time points that are similar to F.
- The time-series active search method (TAS) is used for this search.
- The search returns the small-region spectrograms G of the accumulated signal, and their time points t, whose small-region similarity to F exceeds the small-region search threshold.
- The histogram overlap rate between F and G is used as the small-region similarity s_p(F, G).
- This small-region similarity based on the histogram overlap rate is called the small-region histogram similarity.
- Fig. 8 shows an overview of the time-series active search method (TAS).
- TAS searches the spectrogram of the accumulated signal for sections of the same length as the reference signal whose histogram overlap rate with the reference signal spectrogram is larger than a threshold θ.
- X and Y are spectrograms of the same size in the time axis direction and the frequency axis direction.
- The spectral features at each time point on the spectrogram are normalized and then encoded by vector quantization.
- The code obtained by this encoding is called a vector quantization code.
- To calculate the histogram overlap rate, a histogram (histogram feature) is created for each spectrogram by counting the number of appearances of each vector quantization code. If h^X and h^Y are the histogram features of X and Y, the histogram overlap rate S(h^X, h^Y) of X and Y is calculated by equation (1).
- Here h_l^X and h_l^Y are the frequencies (counts) contained in the l-th bin of h^X and h^Y, respectively, L is the number of histogram bins, and D is the total frequency of one histogram.
- The histogram overlap rate is used as the similarity between spectrograms.
- floor(x) is the largest integer that does not exceed x.
- TAS performs the search by repeating the above processing. If the histogram overlap rate in the checked section is larger than θ, that section is detected as a section similar to the reference signal.
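- Equation (1) did not survive in this text. The following is a minimal sketch that assumes it has the usual bin-wise-minimum form, S(h^X, h^Y) = (1/D) Σ_{l=1..L} min(h_l^X, h_l^Y), which matches the definitions of h_l^X, h_l^Y, L and D given above. The function name and the example histograms are hypothetical.

```python
def histogram_overlap(h_x, h_y):
    """Overlap rate of two histograms with the same bin count L and total count D.

    Assumed form of equation (1): sum of bin-wise minima divided by D.
    """
    if len(h_x) != len(h_y):
        raise ValueError("histograms must have the same number of bins L")
    total = sum(h_x)  # D: total count of one histogram
    if total == 0 or total != sum(h_y):
        raise ValueError("histograms must share the same non-zero total count D")
    # Sum the bin-wise minima and normalize by D.
    return sum(min(a, b) for a, b in zip(h_x, h_y)) / total


# Hypothetical 3-bin histograms of vector quantization code counts (D = 4).
print(histogram_overlap([2, 1, 1], [1, 2, 1]))  # 0.75
```

- Identical histograms give an overlap rate of 1 and histograms with no shared bins give 0, so the value can be compared directly against the threshold θ.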
- In step (c) of Fig. 7, based on the search results for each small-region spectrogram F, the small-region similarities are integrated to calculate the section similarity for each time point t in the accumulated signal.
- |TR| represents the number of elements of TR.
- G is a small region spectrogram similar to F.
- Equation 5 Therefore, in an actual search, G is a small region spectrum similar to F.
- The frequency band ωm that maximizes this value is selected from the set W of all frequency bands.
- If the section similarity S(t) is greater than the search threshold, the reference signal is judged to be present in the section beginning at time t.
- However, because the division match search method calculates the histogram overlap rate to find the small-region spectrograms similar to F, the calculation takes time, and the overlap rate is also calculated for combinations of F and G that are not similar.
- The purpose of the present invention is to provide a specific acoustic signal containing section detection system that is faster than the conventional method. In the detection of similar small-region spectrograms, which takes a long time in the known method, it determines quickly whether two small-region spectrograms of the reference signal and the accumulated signal are similar, and it omits the similarity determination for combinations of small-region spectrograms that cannot be similar.
- The specific acoustic signal containing section detection system of the present invention detects, in an accumulated signal that is a longer acoustic signal than the reference signal, a section containing a sound similar to the reference signal, which is a specific acoustic signal.
- The system includes a reference signal spectrogram dividing unit that divides the reference signal spectrogram, which is the time-frequency spectrogram of the reference signal, into small-region spectrograms called small-region reference signal spectrograms; a small-region reference signal spectrogram encoding unit that encodes each small-region reference signal spectrogram into a reference signal small-region code; a small-region accumulated signal spectrogram encoding unit that encodes each small-region accumulated signal spectrogram into an accumulated signal small-region code; and a similar small-region spectrogram detecting unit that detects, based on the similarity of the small-region codes, the small-region accumulated signal spectrograms similar to each small-region reference signal spectrogram.
- As described above, the present invention encodes the two small-region spectrograms and determines their similarity from the codes alone.
- As a result, the amount of calculation is greatly reduced compared with the conventional example, and sections containing a specific acoustic signal can be detected at high speed.
- In the present invention, the small-region reference signal spectrogram encoding unit and the small-region accumulated signal spectrogram encoding unit encode each small-region spectrogram (the result is called a small-region code), and the similar small-region spectrogram detecting unit detects the small-region accumulated signal spectrograms similar to each small-region reference signal spectrogram based on the similarity of the small-region codes. That is, the similarity of two small-region spectrograms is determined solely from the similarity of their small-region codes.
- Compared with the conventional example, which calculates the histogram overlap rate, the system therefore needs no histogram counting or similar processing, and the amount of computation is significantly reduced.
- In one configuration, the two encoding units create a small-region code for each small-region spectrogram, and the similar small-region spectrogram detecting unit, for each small-region reference signal spectrogram, lists the small-region accumulated signal spectrograms of the corresponding band in time order, compares them one by one based on the similarity of the small-region codes, and detects only the similar small-region accumulated signal spectrograms.
- This configuration likewise needs no histogram counting or similar processing, so the amount of computation is significantly reduced compared with the conventional example in which the histogram overlap rate is calculated.
- In another configuration, the two encoding units generate a small-region code for each small-region spectrogram, and the similar small-region spectrogram detecting unit uses an index of the small-region accumulated signal spectrograms organized by small-region code. The similarities between all pairs of small-region codes are calculated in advance and stored in a table. By referring to this table, the small-region codes similar to the code of a small-region reference signal spectrogram are extracted, and by then referring to the index, the small-region accumulated signal spectrograms similar to that small-region reference signal spectrogram are detected.
- The system can therefore determine the similarity of two small-region spectrograms faster than by calculating the histogram overlap rate. Moreover, unlike the conventional example, two small-region spectrograms that are not similar are never collated, so the similarity determination for pairs that cannot be similar is omitted and sections containing a specific acoustic signal are detected even faster.
- FIG. 1 is a block diagram showing a configuration example of a specific acoustic signal containing section detection system according to an embodiment of the present invention.
- FIG. 2 is a conceptual diagram illustrating the processing of the specific acoustic signal containing section detection system of FIG. 1.
- FIG. 3 is a conceptual diagram showing the configuration of a small area code similarity table in which the similarity is associated with each small area code pair.
- FIG. 4 is a conceptual diagram showing an index listing the appearance times of the small-region accumulated signal spectrograms for each small-region code.
- FIG. 5 is a flowchart showing an operation example of the specific acoustic signal containing section detection system shown in FIG. 1.
- FIG. 6 is a conceptual diagram for explaining the outline of the specific acoustic signal containing section detection.
- FIG. 7 is a conceptual diagram illustrating an outline of a division match search method in a conventional example.
- FIG. 8 is a conceptual diagram for explaining TAS (Time Series Active Search Method).
- FIG. 1 is a block diagram showing an embodiment of a specific acoustic signal containing section detection system according to the present invention.
- the specific acoustic signal containing section detection system shown in Fig. 1 is a system that detects a section containing a sound similar to a specific acoustic signal called a reference signal in an acoustic signal longer than the reference signal called a stored signal.
- The system is implemented on a general-purpose computer having a CPU (Central Processing Unit) and memory.
- The small-region accumulated signal spectrogram encoding unit 101 encodes each small-region accumulated signal spectrogram, that is, each small-region spectrogram in the accumulated signal spectrogram (the time-frequency spectrogram of the accumulated signal), and outputs the result as an accumulated signal small-region code.
- The similar small-region spectrogram detection unit 102 has two functions: indexing the appearance times of the small-region accumulated signal spectrograms, and detecting, by referring to that index, the small-region accumulated signal spectrograms similar to a small-region reference signal spectrogram.
- The former is preprocessing. Using the accumulated signal small-region codes input from the encoding unit 101, it does not perform detailed section detection but prepares for detecting similar small-region spectrograms and extracting the time points at which sections are to be checked. Specifically, it generates an index like the one shown in Fig. 4.
- The latter extracts the small-region codes similar to a reference signal small-region code using a small-region code similarity table created in advance (Fig. 3), finds the small-region accumulated signal spectrograms having those codes by index lookup, and outputs their appearance times and small-region similarities.
- the reference signal spectrogram dividing unit 103 divides the reference signal spectrogram, which is a time frequency spectrogram of the reference signal (detected signal), into a small region spectrogram called a small region reference signal spectrogram.
- the small region reference signal spectrogram encoding unit 104 encodes the small region reference signal spectrogram and outputs it as a reference signal small region code.
- The interval similarity calculation unit 105 uses the similarity (small-region similarity) between each small-region reference signal spectrogram and the similar small-region accumulated signal spectrogram detected by the similar small-region spectrogram detection unit 102 to calculate the similarity (section similarity) between the reference signal and the section of the accumulated signal that contains the similar small-region accumulated signal spectrogram.
- the similar section detection unit 106 detects a section including a sound similar to the reference signal in the accumulated signal based on the section similarity.
- FIG. 2 is a conceptual diagram for explaining the process of detecting a specific acoustic signal containing section according to the present invention.
- An accumulated signal spectrogram extraction unit and a reference signal spectrogram extraction unit read the acoustic waveforms of the accumulated signal and the reference signal, respectively, extract the power spectra, and output them as the accumulated signal spectrogram and the reference signal spectrogram.
- The reference signal spectrogram dividing unit 103 cuts small regions of a fixed size (constant time width) out of the reference signal spectrogram at equal intervals and outputs them as small-region reference signal spectrograms.
- Each small-region reference signal spectrogram is obtained by cutting a fixed number of points out of the original spectrogram along both the frequency axis and the time axis.
- A spectrogram of a small region obtained in this way is called a small-region spectrogram.
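- The cutting step can be sketched as follows. This is only an illustration under stated assumptions: the spectrogram is taken to be a list of time frames, each a list of power values per frequency channel, and the function name, the `bands` channel ranges, and the `width` and `step` parameters are hypothetical rather than taken from the source.

```python
def cut_small_regions(spectrogram, bands, width, step):
    """Cut fixed-size small-region spectrograms at equal time intervals.

    spectrogram: list of frames, each a list of power values per channel.
    bands: list of (low, high) channel index ranges, one per frequency band.
    width: number of frames per small region (constant time width).
    step: spacing in frames between leading time points (regions may overlap).
    Returns {(leading_time, band_index): small-region spectrogram}.
    """
    regions = {}
    for ti in range(0, len(spectrogram) - width + 1, step):
        frames = spectrogram[ti:ti + width]
        for band_index, (low, high) in enumerate(bands):
            regions[(ti, band_index)] = [frame[low:high] for frame in frames]
    return regions


# Hypothetical 4-frame, 4-channel spectrogram split into two bands.
spec = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
small = cut_small_regions(spec, bands=[(0, 2), (2, 4)], width=2, step=2)
print(sorted(small))  # [(0, 0), (0, 1), (2, 0), (2, 1)]
```

- Choosing `step` smaller than `width` produces overlapping small regions, which the text allows.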
- Below, the small-region reference signal spectrogram with leading time point ti and frequency band ωm in the reference signal is denoted F.
- Likewise, the small-region accumulated signal spectrogram of the same size as F, with leading time point t in frequency band ωm, is denoted G.
- The number of elements of W and the number of elements of TR may each be 1.
- The power spectrum of each small-region spectrogram (both small-region accumulated signal spectrograms and small-region reference signal spectrograms) is normalized per small region in order to absorb volume fluctuation.
- The power spectrum values at each time point in the small region are normalized by the average of the power spectrum values at that time point within the small region's frequency band.
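- A minimal sketch of this normalization, assuming a small region is stored as a list of time frames, each a list of power values for the channels in the band. The function name and the handling of an all-zero frame are assumptions made for illustration.

```python
def normalize_small_region(region):
    """Divide each frame's power values by that frame's mean over the band."""
    normalized = []
    for frame in region:
        mean = sum(frame) / len(frame)
        if mean == 0:
            # Assumption: a silent frame is left as zeros to avoid dividing by zero.
            normalized.append([0.0] * len(frame))
        else:
            normalized.append([value / mean for value in frame])
    return normalized


quiet = [[1.0, 3.0], [2.0, 2.0]]
loud = [[10.0, 30.0], [20.0, 20.0]]  # same sound at ten times the power
print(normalize_small_region(quiet) == normalize_small_region(loud))  # True
```

- Because every frame is scaled by its own mean, a uniform change in volume leaves the normalized small region unchanged.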
- The small-region reference signal spectrogram encoding unit 104 extracts histogram features from the small-region reference signal spectrogram F in the same way as the division match search method described for the conventional example: the spectral features at each time point on the spectrogram are normalized and encoded by vector quantization, and the number of occurrences of each code is counted in the bin corresponding to that code to obtain the histogram feature.
- This histogram feature is a feature vector whose components are the values of the histogram bins (the number of occurrences of each vector quantization code in the small-region spectrogram).
- The encoding unit 104 then encodes each small-region reference signal spectrogram by vector-quantizing this histogram feature for each band.
- Vector quantization here means the procedure of assigning one code to a given vector.
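- As an illustration of assigning one code to a vector, the sketch below picks the index of the nearest representative vector in a codebook. The nearest-neighbor rule with squared Euclidean distance is an assumption (the source only states that a code is assigned), and the function name and codebook values are hypothetical.

```python
def vector_quantize(vector, codebook):
    """Return the index of the codebook entry closest to the given vector."""
    best_code = None
    best_distance = float("inf")
    for code, representative in enumerate(codebook):
        # Squared Euclidean distance to this representative vector.
        distance = sum((v - r) ** 2 for v, r in zip(vector, representative))
        if distance < best_distance:
            best_code, best_distance = code, distance
    return best_code


# Hypothetical codebook with two representative histogram features.
codebook = [[0.0, 0.0], [10.0, 10.0]]
print(vector_quantize([1.0, 2.0], codebook))  # 0
print(vector_quantize([9.0, 8.0], codebook))  # 1
```

- Using one shared codebook per band for both signals, as the text requires, is what makes the resulting codes directly comparable.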
- The small-region accumulated signal spectrogram encoding unit 101 likewise encodes each small-region accumulated signal spectrogram for each band.
- The small-region accumulated signal spectrogram encoding unit 101 and the small-region reference signal spectrogram encoding unit 104 use the same codebook when encoding the small-region spectrograms in each band.
- The code obtained by encoding the histogram feature of a small-region spectrogram is called a small-region code (reference signal small-region code or accumulated signal small-region code; these are vector quantization codes of the per-band histograms). The reference signal small-region code of the small-region reference signal spectrogram F is written c(F), and the accumulated signal small-region code of the small-region accumulated signal spectrogram G is written c(G).
- Alternatively, without using a histogram, the power spectrum values at each time point on the small-region reference signal spectrogram and the small-region accumulated signal spectrogram can be used directly as a feature vector, which is then encoded by vector quantization to give the reference signal small-region code and the accumulated signal small-region code (corresponding to the configuration of claim 2).
- The similar small-region spectrogram detecting unit 102 uses the similarity between the reference signal small-region code and the accumulated signal small-region code as the similarity between the small-region reference signal spectrogram and the small-region accumulated signal spectrogram. As shown in step (b) of Fig. 2, for each small-region reference signal spectrogram F it detects the similar small-region accumulated signal spectrograms from the accumulated signal spectrogram.
- The similarity between small-region codes is defined in advance in a table for every pair of small-region codes (the detecting unit 102 stores the table in its internal storage unit), so the similarity between a reference signal small-region code and an accumulated signal small-region code can be obtained by referring to this table, called the small-region code similarity table.
- FIG. 3 shows the configuration of the above-mentioned small area code similarity table.
- V(ωm, j, k) indicates the similarity between small-region code q(ωm, j) and small-region code q(ωm, k) in band ωm.
- The small-region codes in band ωm are written q(ωm, 1), q(ωm, 2), ...
- The similar small-region spectrogram detecting unit 102 calculates V(ωm, j, k) from the distance between the representative vectors of the two codes, giving a large value when the distance is small and a small value when the distance is large. For example, the Euclidean distance can be used as the distance between the representative vectors.
- V(ωm, j, k) is defined as a real value from 0 to 1. That is, in each band ωm, it is calculated so that V(ωm, j, k) is 0 when the distance is largest and 1 when the distance is smallest.
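- One way to build such a table for a single band is sketched below. The linear mapping from distance to similarity is an assumption: the source only fixes the endpoints (1 at the smallest distance, 0 at the largest) and mentions the Euclidean distance as one option. The function name and codebook are hypothetical.

```python
import math


def build_similarity_table(codebook):
    """Return table[j][k], the similarity in [0, 1] between codes j and k of one band."""
    n = len(codebook)
    distances = [[math.dist(codebook[j], codebook[k]) for k in range(n)]
                 for j in range(n)]
    d_max = max(max(row) for row in distances)
    if d_max == 0:
        # All representative vectors coincide, so every pair is maximally similar.
        return [[1.0] * n for _ in range(n)]
    # The smallest distance is 0 (a code against itself), so this maps
    # distance 0 to similarity 1 and the largest distance to similarity 0.
    return [[1.0 - d / d_max for d in row] for row in distances]


# Hypothetical representative vectors for three small-region codes.
table = build_similarity_table([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
print(table[0])  # [1.0, 0.5, 0.0]
```

- Since the table depends only on the codebook, it can be computed once in advance and reused for every search.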
- A small-region accumulated signal spectrogram similar to F is a small-region accumulated signal spectrogram G whose small-region similarity to F exceeds the small-region search threshold.
- The small-region search threshold is set by running experiments on a number of reference signals and accumulated signals in advance and choosing a value that causes no, or very few, missed detections of similar sections.
- It may be set to the same value for all bands in W or to a different value for each band.
- The similar small-region spectrogram detecting unit 102 uses an index that classifies the small-region accumulated signal spectrograms by the small-region code of the accumulated signal spectrogram, as shown in Fig. 4.
- By referring to the index in Fig. 4, the list of appearance positions (time points) of the small-region accumulated signal spectrograms having a given small-region code can be obtained.
- The list pointed to by q(ωm, j) (a sequence of time points; a horizontal row) stores, as an array in time order, the time points of all small-region accumulated signal spectrograms whose accumulated signal small-region code is q(ωm, j).
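- Such an index is essentially an inverted index from (band, code) to time points. A minimal sketch follows, assuming the accumulated signal small-region codes are available as one list per band ordered by time; the function name and data layout are hypothetical.

```python
from collections import defaultdict


def build_index(stored_codes):
    """Map (band, code) to the time-ordered list of time points where the code appears.

    stored_codes: {band: [code at time 0, code at time 1, ...]}
    """
    index = defaultdict(list)
    for band, codes in stored_codes.items():
        for time_point, code in enumerate(codes):
            # Times are visited in increasing order, so each list stays sorted.
            index[(band, code)].append(time_point)
    return dict(index)


# Hypothetical codes for one band of a short accumulated signal.
print(build_index({0: [2, 1, 2]}))  # {(0, 2): [0, 2], (0, 1): [1]}
```

- With this layout, finding every position of a given code is a single dictionary lookup rather than a scan over the whole accumulated signal.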
- Alternatively, for each small-region reference signal spectrogram, the similar small-region spectrogram detecting unit 102 may arrange the small-region accumulated signal spectrograms of the corresponding band in a list in time order, compare each entry in the list in turn based on the similarity of the small-region codes, and detect only the similar small-region accumulated signal spectrograms (configuration according to claim 4).
- The interval similarity calculation unit 105 obtains the section start time t from the positional relationship between the appearance position of a small-region reference signal spectrogram in the reference signal and the appearance time in the accumulated signal of the small-region accumulated signal spectrogram similar to it. This t is the start of the section of the accumulated signal, containing that small-region accumulated signal spectrogram, whose similarity to the reference signal (section similarity) is to be calculated.
- The small-region similarities are then integrated, and the similarity (section similarity) S(t) with the reference signal spectrogram at t is given by equation (7).
- |TR| represents the number of elements in the set TR of time points ti, and |W| represents the number of elements in the set W of frequency bands.
- Equation (8) covers the case where G is not detected as a similar small-region spectrogram, that is, where the small-region similarity s_p(F, G) is less than or equal to the small-region search threshold.
- For each G detected as a small-region spectrogram similar to F in the lookup using Fig. 3 and Fig. 4, the interval similarity calculation unit 105 adds the small-region similarity to the section similarity at the corresponding time t. It then divides the accumulated value by |TR|·|W| to normalize it and obtain the section similarity S(t) at t.
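- Equations (7) and (8) are not reproduced in this text. The sketch below assumes that S(t) is the sum of the small-region similarities of all detected pairs aligned at start time t, divided by |TR|·|W|, and that pairs at or below the small-region search threshold contribute nothing. All names and data layouts are hypothetical.

```python
from collections import defaultdict


def section_similarities(ref_codes, index, tables, threshold):
    """Accumulate small-region similarities per section start time t and normalize.

    ref_codes: {(ti, band): reference small-region code}, one entry per (ti, band) pair.
    index: {(band, code): [time points in the accumulated signal]}.
    tables: {band: similarity table, tables[band][j][k] in [0, 1]}.
    threshold: small-region search threshold.
    """
    votes = defaultdict(float)
    for (ti, band), ref_code in ref_codes.items():
        for candidate, similarity in enumerate(tables[band][ref_code]):
            if similarity <= threshold:
                continue  # assumption: undetected pairs add nothing
            for position in index.get((band, candidate), []):
                start = position - ti  # the code appears at t + ti, so t = position - ti
                if start >= 0:
                    votes[start] += similarity
    count = len(ref_codes)  # equals |TR| * |W| when every (ti, band) pair is present
    return {t: total / count for t, total in votes.items()}


tables = {0: [[1.0, 0.5], [0.5, 1.0]]}
index = {(0, 0): [3], (0, 1): [4]}
ref_codes = {(0, 0): 0, (1, 0): 1}
print(section_similarities(ref_codes, index, tables, threshold=0.8))  # {3: 1.0}
```

- Only start times that receive at least one vote are ever touched, which is how the table and index lookups avoid comparing pairs that cannot be similar.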
- The similar section detection unit 106 detects, in the accumulated signal spectrogram, each section starting at a time t whose section similarity S(t) is greater than the search threshold as a section similar to the reference signal spectrogram.
- The search threshold can be set to a value obtained experimentally or empirically. Alternatively, the distribution of the obtained section similarities S(t) can be examined, its standard deviation σ calculated, and similar sections selected using a search threshold based on 3σ relative to the maximum section similarity.
- FIG. 5 is a flowchart showing an operation example of the specific acoustic signal containing section detection system shown in FIG. 1.
- The small-region accumulated signal spectrogram encoding unit 101 in Fig. 1 reads the accumulated signal spectrogram from an accumulated signal spectrogram extracting unit (not shown).
- It sequentially encodes the small-region spectrograms of the accumulated signal spectrogram.
- The accumulated signal small-region codes obtained in this way are supplied from the encoding unit 101 to the similar small-region spectrogram detecting unit 102 (step S1).
- The similar small-region spectrogram detecting unit 102 classifies the supplied accumulated signal small-region codes and generates the index shown in Fig. 4 (step S2).
- the reference signal spectrogram dividing unit 103 reads the reference signal spectrogram from, for example, a file (a file in which a reference signal spectrogram generated by a reference signal spectrogram extracting unit (not shown) is recorded).
- The reference signal spectrogram dividing unit 103 divides it into small-region reference signal spectrograms and supplies them sequentially to the small-region reference signal spectrogram encoding unit 104 (step S3).
- The encoding unit 104 sequentially encodes the small-region reference signal spectrograms and supplies each resulting reference signal small-region code c(F), together with its time point ti on the reference signal, to the similar small-region spectrogram detection unit 102 (step S4).
- the similar small region spectrogram detecting unit 102 refers to the small region inter-code similarity table in FIG. 3 stored internally for the small region reference signal spectrogram, and corresponds to the corresponding small region inter-code similarity (small region similarity). Degree) and the small region search threshold value are compared, and a small region code exceeding the small region search threshold value is extracted. Then, the time point t + ti at which the small area code appears in the accumulated signal is searched using the index of FIG.
- an interval start time t of the accumulated signal similar to the reference signal is obtained, and the small region code class is obtained.
- the similarity (that is, the small area similarity) is supplied to the interval similarity calculation unit 105 in association with t (step S5).
- the interval similarity calculation unit 105 generates a small region reference signal spectrogram (F) and
- the interval similarity calculation unit 105 is supplied with the reference signal small region codes for all the small region reference signal extra programs from the small region reference signal extra program code unit 104, and steps S5 and S6 are performed. It is determined whether or not the above process has been completed (step S7).
- step S8 determines that it has not ended, the process proceeds to step S8. Proceed to S5.
- The interval similarity calculation unit 105 normalizes the accumulated interval similarity at each time point by dividing it by the number of supplied small region reference signal spectrograms, using equation (7) (step S8). The similar section detection unit 106 then assumes that the reference signal is contained in each section starting at a time t whose normalized interval similarity is greater than the search threshold S, outputs the time t, and ends the process (step S9).
- The similar section detection unit 106 may output only the section with the largest section similarity among those exceeding the search threshold, instead of outputting every section that exceeds the search threshold.
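The accumulation, normalization, and thresholding of steps S6 to S9 can be sketched as below. Equation (7) is not reproduced in this excerpt, so the normalization here is simply the accumulated similarity divided by the number of small region reference signal spectrograms, as the text describes; `detect_sections`, `per_region_hits`, and `best_only` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def detect_sections(
    per_region_hits: List[Dict[int, float]],
    search_threshold: float,
    best_only: bool = False,
) -> List[Tuple[int, float]]:
    """Accumulate small region similarities per start time t, normalize, threshold.

    per_region_hits holds one {t: small region similarity} mapping per
    small region reference signal spectrogram.
    """
    n_regions = len(per_region_hits)
    accumulated: Dict[int, float] = defaultdict(float)
    for hits in per_region_hits:
        for t, sim in hits.items():
            accumulated[t] += sim  # accumulate the interval similarity at t
    # Normalize by the number of small region spectrograms (step S8).
    normalized = {t: total / n_regions for t, total in accumulated.items()}
    # Keep start times whose normalized similarity exceeds the threshold (step S9).
    found = sorted((t, s) for t, s in normalized.items() if s > search_threshold)
    if best_only and found:
        return [max(found, key=lambda pair: pair[1])]
    return found


per_region_hits = [{3: 1.0, 5: 0.6}, {3: 0.8}, {3: 0.9, 5: 0.6}, {7: 1.0}]
all_sections = detect_sections(per_region_hits, search_threshold=0.2)
best_section = detect_sections(per_region_hits, search_threshold=0.2, best_only=True)
```

With `best_only=True` the sketch mirrors the variant in which only the section with the largest section similarity is output.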
- The above embodiment and the conventional split matching search method were each implemented on a personal computer with the following specifications, the detection speed was measured, and the embodiment was compared with the conventional example.
- The OS was Red Hat (registered trademark) Linux (registered trademark), and GNU gcc was used as the compiler.
- The executable file was compiled with the compiler optimization option “-O3”.
- The number of frequency bands |W| was set to 4: the spectrogram formed from the outputs, taken every 2 milliseconds, of 28 band-pass filters arranged at equal intervals on a logarithmic axis from 525 Hz to 2000 Hz was divided into four frequency bands in the frequency axis direction.
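The filter bank layout described above can be illustrated with a short sketch. It assumes that 525 Hz and 2000 Hz are the center frequencies of the lowest and highest of the 28 filters and that the four frequency bands hold 7 filters each; neither detail is stated in the source, and the function names are hypothetical.

```python
from typing import List


def log_spaced_centers(f_low: float, f_high: float, n_filters: int) -> List[float]:
    """Center frequencies equally spaced on a logarithmic axis (endpoints included)."""
    ratio = (f_high / f_low) ** (1.0 / (n_filters - 1))
    return [f_low * ratio ** k for k in range(n_filters)]


def split_bands(values: List[float], n_bands: int) -> List[List[float]]:
    """Split the filters into n_bands contiguous groups of equal size."""
    size = len(values) // n_bands
    return [values[i * size:(i + 1) * size] for i in range(n_bands)]


centers = log_spaced_centers(525.0, 2000.0, 28)
bands = split_bands(centers, 4)
frames_per_second = 1000 // 2  # one output frame every 2 milliseconds

for w, band in enumerate(bands):
    print(f"band {w}: {band[0]:.1f} Hz to {band[-1]:.1f} Hz ({len(band)} filters)")
```

Equal spacing on a logarithmic axis means that adjacent center frequencies share a constant ratio rather than a constant difference.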
- The average detection time was about 0.58 seconds with the conventional method and less than 0.01 seconds with the above-described embodiment of the present invention, so by simple calculation detection was about 70 times faster.
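As a quick check of these figures (the 0.0083-second value below is derived here from the quoted ratio and is not stated in the source): a detection time of exactly 0.01 seconds would give a 58-fold speed-up, so "about 70 times" corresponds to an average detection time of roughly 0.0083 seconds, which is consistent with "less than 0.01 seconds".

```latex
\frac{0.58\ \text{s}}{0.01\ \text{s}} = 58,
\qquad
\frac{0.58\ \text{s}}{70} \approx 0.0083\ \text{s} < 0.01\ \text{s}
```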
- The stored signal was a mixture of a music signal and an audio signal with an average power ratio (music signal power / audio signal power) of 5 dB. Under these conditions the search accuracy of the conventional method was 99.9% (refer to Japanese Unexamined Patent Application Publication No. 2004-102023, “Specific Acoustic Signal Detection Method, Signal Detection Device, Signal Detection Program and Recording Medium”), while in this example it was 99.0%, so the search accuracy is judged to be equivalent.
- A program for realizing the functions of the specific acoustic signal containing section detection system in FIG. 1 may be recorded on a computer-readable recording medium, and the processing for detecting a section containing a specific acoustic signal may be performed by having a computer system read and execute the program recorded on that medium.
- The “computer system” here includes the OS and hardware such as peripheral devices.
- The “computer system” also includes a WWW system equipped with a homepage provision environment (or display environment).
- The “computer-readable recording medium” means a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or a storage device such as a hard disk built into the computer system.
- The “computer-readable recording medium” also includes media that hold the program for a certain period of time, such as a volatile memory (RAM) inside a computer system that serves as a server or a client when the program is transmitted via a network such as the Internet or a communication line such as a telephone line.
- Further, the program may be transmitted from a computer system that stores the program in a storage device or the like to another computer system via a transmission medium or by a transmission wave in the transmission medium.
- The “transmission medium” for transmitting the program refers to a medium that has the function of transmitting information, such as a network (communication network) like the Internet or a communication line like a telephone line.
- The program may realize only a part of the functions described above. Furthermore, it may be a so-called differential file (differential program), that is, one that realizes the above-mentioned functions in combination with a program already recorded in the computer system.
- The present invention encodes two small region spectrograms and uses similarity to index them. The amount of calculation can therefore be greatly reduced compared with the conventional example, and a section containing a specific acoustic signal can be detected at high speed.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/586,192 US7860714B2 (en) | 2004-07-01 | 2005-07-01 | Detection system for segment including specific sound signal, method and program for the same |
DE602005018776T DE602005018776D1 (de) | 2004-07-01 | 2005-07-01 | System für detektionssektion mit einem bestimmten akustischen signal, verfahren und programm dafür |
JP2006523770A JP4327202B2 (ja) | 2004-07-01 | 2005-07-01 | 特定音響信号含有区間検出システム及びその方法並びにプログラム |
EP05765265A EP1763018B1 (en) | 2004-07-01 | 2005-07-01 | System for detection section including particular acoustic signal, method and program thereof |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2004-195995 | 2004-07-01 | ||
JP2004195995 | 2004-07-01 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2006004050A1 (ja) | 2006-01-12 |
Family
ID=35782854
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2005/012223 WO2006004050A1 (ja) | 2004-07-01 | 2005-07-01 | 特定音響信号含有区間検出システム及びその方法並びにプログラム |
Country Status (6)
Country | Link |
---|---|
US (1) | US7860714B2 (ja) |
EP (1) | EP1763018B1 (ja) |
JP (1) | JP4327202B2 (ja) |
CN (1) | CN100592386C (ja) |
DE (1) | DE602005018776D1 (ja) |
WO (1) | WO2006004050A1 (ja) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2007298607A (ja) * | 2006-04-28 | 2007-11-15 | Victor Co Of Japan Ltd | 音響信号分析装置、音響信号分析方法、及び音響信号分析用プログラム |
JP2008065393A (ja) * | 2006-09-04 | 2008-03-21 | Research Organization Of Information & Systems | グループ判別装置及びグループ判別方法 |
JP2009541869A (ja) * | 2006-07-03 | 2009-11-26 | インテル・コーポレーション | 高速音声検索の方法および装置 |
JP2011209592A (ja) * | 2010-03-30 | 2011-10-20 | Brother Industries Ltd | 楽器音分離装置、及びプログラム |
JP2011209593A (ja) * | 2010-03-30 | 2011-10-20 | Brother Industries Ltd | 歌声分離装置、及びプログラム |
JP2012133371A (ja) * | 2012-01-04 | 2012-07-12 | Intel Corp | 高速音声検索の方法および装置 |
JP2012203382A (ja) * | 2011-03-28 | 2012-10-22 | Nippon Telegr & Teleph Corp <Ntt> | 特定音響信号含有区間検出装置、方法、及びプログラム |
JP2015031927A (ja) * | 2013-08-06 | 2015-02-16 | 日本電信電話株式会社 | 共通信号含有区間有無判定装置、方法、及びプログラム |
JP2015200857A (ja) * | 2014-04-10 | 2015-11-12 | 日本電信電話株式会社 | 系列信号特定方法、装置、及びプログラム |
JP7143327B2 (ja) | 2017-10-03 | 2022-09-28 | グーグル エルエルシー | コンピューティング装置によって実施される方法、コンピュータシステム、コンピューティングシステム、およびプログラム |
JP7283375B2 (ja) | 2019-02-01 | 2023-05-30 | 富士通株式会社 | 信号処理方法及び情報処理装置 |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7885420B2 (en) * | 2003-02-21 | 2011-02-08 | Qnx Software Systems Co. | Wind noise suppression system |
US8271279B2 (en) * | 2003-02-21 | 2012-09-18 | Qnx Software Systems Limited | Signature noise removal |
US7827179B2 (en) * | 2005-09-02 | 2010-11-02 | Nec Corporation | Data clustering system, data clustering method, and data clustering program |
JP4665836B2 (ja) * | 2006-05-31 | 2011-04-06 | 日本ビクター株式会社 | 楽曲分類装置、楽曲分類方法、及び楽曲分類プログラム |
JP2009008823A (ja) * | 2007-06-27 | 2009-01-15 | Fujitsu Ltd | 音響認識装置、音響認識方法、及び、音響認識プログラム |
CN101887720A (zh) * | 2009-05-13 | 2010-11-17 | 鸿富锦精密工业(深圳)有限公司 | 声讯语义辨识系统及方法 |
US8886531B2 (en) * | 2010-01-13 | 2014-11-11 | Rovi Technologies Corporation | Apparatus and method for generating an audio fingerprint and using a two-stage query |
TWI403304B (zh) | 2010-08-27 | 2013-08-01 | Ind Tech Res Inst | 隨身語能偵知方法及其裝置 |
WO2012071442A1 (en) * | 2010-11-22 | 2012-05-31 | Listening Methods, Llc | System and method for pattern recognition and analysis |
CN111462553B (zh) * | 2020-04-17 | 2021-03-30 | 杭州菲助科技有限公司 | 一种基于视频配音和纠音训练的语言学习方法及系统 |
CN113744765B (zh) * | 2021-08-19 | 2023-12-29 | 深圳市新国都股份有限公司 | Pos机语音播报检测方法、装置及存储介质 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH10508391A (ja) * | 1995-08-28 | 1998-08-18 | フィリップス エレクトロニクス ネムローゼ フェンノートシャップ | 基準ベクトルの部分集合の動的な形成に基づくパターン認識の方法及びシステム |
JP2002236496A (ja) * | 2001-02-07 | 2002-08-23 | Nippon Telegr & Teleph Corp <Ntt> | 信号検出方法、信号検出装置、記録媒体及びプログラム |
JP2003242510A (ja) * | 2002-02-13 | 2003-08-29 | Nippon Telegr & Teleph Corp <Ntt> | 信号検索装置、信号検索方法、信号検索プログラム及び信号検索プログラムを記録した記録媒体 |
JP2004102023A (ja) * | 2002-09-11 | 2004-04-02 | Nippon Telegr & Teleph Corp <Ntt> | 特定音響信号検出方法、信号検出装置、信号検出プログラム及び記録媒体 |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH06105394B2 (ja) * | 1986-03-19 | 1994-12-21 | 株式会社東芝 | 音声認識方式 |
JPH06332492A (ja) * | 1993-05-19 | 1994-12-02 | Matsushita Electric Ind Co Ltd | 音声検出方法および検出装置 |
US5749073A (en) * | 1996-03-15 | 1998-05-05 | Interval Research Corporation | System for automatically morphing audio information |
JP3065314B1 (ja) | 1998-06-01 | 2000-07-17 | 日本電信電話株式会社 | 高速信号探索方法、装置およびその記録媒体 |
US6263311B1 (en) * | 1999-01-11 | 2001-07-17 | Advanced Micro Devices, Inc. | Method and system for providing security using voice recognition |
US6138089A (en) * | 1999-03-10 | 2000-10-24 | Infolio, Inc. | Apparatus system and method for speech compression and decompression |
JP3408800B2 (ja) | 2000-04-27 | 2003-05-19 | 日本電信電話株式会社 | 信号検出方法、装置及びそのプログラム、記録媒体 |
EP1161098B1 (en) * | 2000-04-27 | 2011-06-22 | Nippon Telegraph And Telephone Corporation | Signal detection method and apparatus |
US6990453B2 (en) | 2000-07-31 | 2006-01-24 | Landmark Digital Services Llc | System and methods for recognizing sound and music signals in high noise and distortion |
TW582022B (en) | 2001-03-14 | 2004-04-01 | Ibm | A method and system for the automatic detection of similar or identical segments in audio recordings |
JP3746690B2 (ja) | 2001-07-10 | 2006-02-15 | 日本電信電話株式会社 | 信号検出方法及び装置、プログラムならびに記録媒体 |
2005
- 2005-07-01 EP EP05765265A: EP1763018B1 (Active)
- 2005-07-01 DE DE602005018776T: DE602005018776D1 (Active)
- 2005-07-01 WO PCT/JP2005/012223: WO2006004050A1 (Application Filing)
- 2005-07-01 JP JP2006523770A: JP4327202B2 (Active)
- 2005-07-01 US US10/586,192: US7860714B2 (Active)
- 2005-07-01 CN CN200580002496A: CN100592386C (Active)
Non-Patent Citations (3)
Title |
---|
AKISATO KIMURA ET AL: "Global na edakari o donyu shita chojikan on kyo shingo no tansaku jikeiretsu active tansaku no koritsuka.(Quick searching of long audio signals using global pruning accelerating time series active search)", THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS GIJUTSU KENKYU HOKOKU, vol. 100, no. 634, 16 February 2001 (2001-02-16), pages 53 - 60, XP002998252 * |
NAGANO H ET AL: "Tasu no shoryoiki spectrogram no kensaku ni motozuku haikei ongaku no kosku tansakuho. (A Fast Search Algorithm for Background Music Signals Based on the Searches for Numerous Small-Region Spectograms)", THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS D-II, vol. 87, no. 5, 1 May 2004 (2004-05-01), pages 1179 - 1188, XP002998251 * |
See also references of EP1763018A4 * |
Also Published As
Publication number | Publication date |
---|---|
CN100592386C (zh) | 2010-02-24 |
EP1763018A1 (en) | 2007-03-14 |
DE602005018776D1 (de) | 2010-02-25 |
JP4327202B2 (ja) | 2009-09-09 |
US7860714B2 (en) | 2010-12-28 |
CN1910651A (zh) | 2007-02-07 |
US20070156401A1 (en) | 2007-07-05 |
JPWO2006004050A1 (ja) | 2007-08-16 |
EP1763018A4 (en) | 2008-08-27 |
EP1763018B1 (en) | 2010-01-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2006004050A1 (ja) | 特定音響信号含有区間検出システム及びその方法並びにプログラム | |
US9589283B2 (en) | Device, method, and medium for generating audio fingerprint and retrieving audio data | |
US9313593B2 (en) | Ranking representative segments in media data | |
EP2791935B1 (en) | Low complexity repetition detection in media data | |
KR100717387B1 (ko) | 유사곡 검색 방법 및 그 장치 | |
KR100749045B1 (ko) | 음악 내용 요약본을 이용한 유사곡 검색 방법 및 그 장치 | |
JP5362178B2 (ja) | オーディオ信号からの特徴的な指紋の抽出とマッチング | |
US7081581B2 (en) | Method and device for characterizing a signal and method and device for producing an indexed signal | |
KR100725018B1 (ko) | 음악 내용 자동 요약 방법 및 그 장치 | |
US8073684B2 (en) | Apparatus and method for automatic classification/identification of similar compressed audio files | |
US20070107584A1 (en) | Method and apparatus for classifying mood of music at high speed | |
US20060155399A1 (en) | Method and system for generating acoustic fingerprints | |
KR100888804B1 (ko) | 동영상 데이터의 동일성 판단 및 동일 구간 검출 방법 및장치 | |
US8301284B2 (en) | Feature extraction apparatus, feature extraction method, and program thereof | |
JP5462827B2 (ja) | 特定音響信号含有区間検出装置、方法、及びプログラム | |
Htun | Analytical approach to MFCC based space-saving audio fingerprinting system |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | WWE | Wipo information: entry into national phase | Ref document number: 2006523770; Country of ref document: JP |
| | AK | Designated states | Kind code of ref document: A1; Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
| | AL | Designated countries for regional patents | Kind code of ref document: A1; Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| | WWE | Wipo information: entry into national phase | Ref document number: 2005765265; Country of ref document: EP |
| | WWE | Wipo information: entry into national phase | Ref document number: 200580002496.5; Country of ref document: CN |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWW | Wipo information: withdrawn in national office | Country of ref document: DE |
| | WWE | Wipo information: entry into national phase | Ref document number: 10586192; Country of ref document: US |
| | WWP | Wipo information: published in national office | Ref document number: 2005765265; Country of ref document: EP |
| | WWP | Wipo information: published in national office | Ref document number: 10586192; Country of ref document: US |