US3825685A - Helium environment vocoder - Google Patents

Helium environment vocoder

Info

Publication number
US3825685A
Authority
US
United States
Prior art keywords
coupled
speech
inputs
output
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US00250534A
Inventor
D Roworth
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
STC PLC
INT STANDARD CORP
Original Assignee
INT STANDARD CORP
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by INT STANDARD CORP
Application granted
Publication of US3825685A
Assigned to STC PLC. ASSIGNMENT OF ASSIGNORS INTEREST. Assignors: INTERNATIONAL STANDARD ELECTRIC CORPORATION, A DE CORP.
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders

Definitions

  • a vocoder system having a multichannel speech analyzer, a multichannel speech synthesizer and an excitation system for extracting time-varying characteristics of the excitation function of the input speech signal and reproducing this function for the excitation of the synthesizer.
  • the excitation system includes a speech extractor, a voiced/unvoiced detector, a pitch generator, a noise generator, a voiced/unvoiced switch and an output pulse generator.
  • the pitch extractor operates on the short term envelope of the speech and is derived from a signal which is the sum of the signals of all analyzer channels. Constant width pulses are generated and integrated to form a varying d.c. signal proportional to the frequency of the extracted pitch. During voiced speech a multivibrator controlled by the d.c. signal produces the input to the output pulse generator.
  • the output pulse generator produces an output pulse train of constant energy and is applied for excitation of the synthesizer channels.
  • during unvoiced speech the noise generator drives the output pulse generator.
  • This invention relates to a speech processor, such as a vocoder, and in particular to an excitation system therefor.
  • Such processors are especially useful in the processing of helium speech, which suffers severe distortion due to the speaker breathing an exotic gas mixture at abnormal pressures. This distortion is sometimes referred to as the Donald Duck effect.
  • second means coupled to the first means for producing a sequence of pulses at a rate having a predetermined relationship to the fundamental frequency; a noise generator to produce noise; third means coupled to the analyzer for determining whether the input speech is voiced or unvoiced; fourth means coupled to the second means, the noise generator and the third means responsive to the output signal of the third means for selecting the sequence of pulses if said input speech is voiced and for selecting the noise from the noise generator if the input speech is unvoiced; and
  • Still another feature of the present invention is the provision of a vocoder as defined above wherein the second means includes a voltage controlled multivibrator, and an input circuit for the multivibrator having a capacitor, and a pair of isolating diodes coupled to the capacitor, the pair of diodes having coupled thereto for applying to the capacitor through the pair of diodes a direct current voltage bearing a predetermined relationship to the fundamental frequency.
  • the second means includes a voltage controlled multivibrator, and an input circuit for the multivibrator having a capacitor, and a pair of isolating diodes coupled to the capacitor, the pair of diodes having coupled thereto for applying to the capacitor through the pair of diodes a direct current voltage bearing a predetermined relationship to the fundamental frequency.
  • a further feature of the present invention is the provision of a vocoder as defined above and further including switching means coupled to the second means, the third means and the input circuit for the multivibrator, the switching means being responsive to the detection of unvoiced speech by the third means to disconnect the second means from the input circuit for the multivibrator.
  • Still a further feature of the present invention is the provision of a vocoder as defined above wherein the third means includes a comparator having two inputs and an output, one of the inputs being coupled to a first group of channels of the analyzer having a passband related to voiced speech and the other of the inputs being coupled to a second group of channels of the analyzer having a passband related to unvoiced speech, the second group of channels being different than the first group of channels, and a two level clamp circuit coupled to one of the two inputs to hold the output in one condition when the input exceeds a predetermined first threshold level and to hold the output in the other condition when that input falls below a second predetermined threshold level lower than the first threshold level.
  • each channel of the analyzer includes a bandpass filter coupled to the input speech signal, the passband of the bandpass filter being different for each channel, and a rectifier.
  • the first means includes sixth means coupled in parallel to the output of a given number of the rectifiers to produce a sum of the output signals passed by each of the given number of the rectifiers, a first differential amplifier having an output and two inputs, one of the two inputs of the first differential amplifier being coupled to the sixth means to receive the sum of the output signals,
  • a seventh means coupled between the sixth means and the other of the two inputs of the first differential amplifier to smooth the sum of the output signals prior to being applied to the other of the two inputs of the first differential amplifier, eighth means coupled to the output of the first differential amplifier to square the output signal of the differential amplifier, a monostable circuit coupled to the eighth means driven by a squared output signal from the eighth means, and a low pass filter means having gain coupled to the output of the monostable circuit to operate on the output signal of the monostable circuit.
  • FIG. 1 is a block diagram of a helium environment vocoder arrangement in accordance with the principles of the present invention.
  • FIG. 2 illustrates the block diagram of the multichannel analyzer and synthesizer channels of FIG. 1;
  • FIG. 3 illustrates the schematic diagram of the first and second stages of the pitch extractor section of the excitation system of FIG. 1;
  • FIG. 4 illustrates the frequency response of the second stage of FIG. 3;
  • FIG. 5 illustrates the schematic diagram of the third and fourth stages of the pitch extraction section of the excitation system of FIG. 1;
  • FIG. 6 illustrates the schematic diagram of the pitch and noise generators of the excitation system of FIG. 1;
  • FIG. 7 illustrates the schematic diagram of the voiced/unvoiced detector for the excitation system of FIG. 1;
  • FIG. 8 illustrates the schematic diagram of the output pulse generator for the excitation system of FIG. 1.
  • the input analyzer channel also produces a channel signal which is the input for the corresponding channel in the synthesizer 12, the outputs of which are summed to give the processed speech output.
  • the excitation system can be broadly divided into two sections, namely, pitch extraction from analyzer and pitch generation for the synthesizer.
  • the pitch extraction section operates upon the short-term envelope of the speech which is obtained from all the analyzer channels via summing network 11.
  • the d.c. component and unwanted high frequency a.c. components are eliminated by d.c. eliminator 13 and squaring circuit 14, respectively, to give a square wave at the fundamental frequency.
  • This square wave is then converted to a d.c. voltage whose amplitude is proportional to the instantaneous frequency averaged over a short period by pulse generator and integrator 16, respectively.
  • a voiced/unvoiced decision is made by voiced/unvoiced detector 17, based on the relative high and low frequency energies in the multichannel inputs to the synthesizer, as derived by two groups of diode gates 18 and 19.
  • Detector 17 controls the application of the d.c. voltage representing pitch to a voltage controlled multivibrator 20 via gate 21 and backlash circuit 22.
  • detector 17 is responsible for selecting either the output of multivibrator 20, or the output of noise generator 23 by way of the changeover switch 24. The selected output is fed via a squaring circuit 25 to an output pulse generator 26 which provides the excitation signal for the synthesizer channels.
  • the analyzer 10 consists of a number of similar channels, say 22 in all, each of which has a bandpass filter 27, FIG. 2, followed by a rectifier 28.
  • Each bandpass filter 27 covers a different portion of the speech spectrum, and provides an output which, during voiced speech, is approximately equivalent to a pure tone at the filter center frequency amplitude modulated with a sawtooth waveform at the fundamental frequency.
  • the result is a d.c. signal with a sawtooth type of waveform and a superimposed a.c. ripple component at the channel input filter center frequency.
  • the sawtooth component, which is approximately the same for all the channels, is enhanced and the ripple components, being different for each channel, are diminished.
  • the result is a d.c. signal whose amplitude is varying in accordance with the dynamics of the speech signal, and upon which is imposed a sawtooth waveform at the fundamental frequency.
  • the first stage of the pitch extraction section uses a differential amplifier 30, FIG. 3, to compare the input waveform from summing network 11 with a smoothed version of itself.
  • the input waveform is applied to one input terminal of the amplifier and also to a tank circuit 31, from which the other input is derived.
  • the rise time and d.c. components of the two signals are the same, but one signal has a smaller sawtooth component and a longer decay time than the other signal.
  • the d.c. component is eliminated for rising and steady signals, but falling signals (such as at the end of a word) produce a temporary d.c. offset due to one input to the differential amplifier taking longer to decay than the other.
  • This remaining d.c. component is eliminated, the derived a.c. component low pass filtered and the remainder squared by the second stage of FIG. 3.
  • the signal is applied to both inputs of a differential amplifier 32 with RC integrating networks at each input.
  • the two integrating networks 33 and 34 are arranged to have slightly different integrating times (about 20 percent difference) and this results in a frequency response as shown in FIG. 4: the gain falls away at about 6 dB/octave in both directions from a center frequency which is set by the network time constants.
  • the d.c. components of the signal are completely eliminated without introducing any high pass networks: effectively the operation relies on the difference in phase shift between the two integrating networks 33 and 34.
  • This stage is also arranged to square the output by utilizing the feedback network 35, which provides d.c. feedback but no a.c. feedback.
  • the square wave is applied to a monostable circuit 40, FIG. 5, which produces a pulse of fixed width each time it is triggered by a negative going edge in the square wave input.
  • These pulses are then passed through a two stage low pass filter.
  • the first stage 41 produces gain, set by the potentiometer for unity frequency ratio.
  • the second stage 42 has only unity gain and its output is a varying d.c. with minimal ripple at the pulse repetition frequency.
  • the gain is set so that the succeeding voltage controlled multivibrator 20, FIG. 1, produces pulses at the correct rate.
  • the second stage of the filter is arranged to have a d.c. shift sufficient to compensate exactly for the d.c. threshold exhibited in the control characteristic of multivibrator 20, due to the need to overcome the V of the control transistors.
  • the varying d.c. voltage from the transistor of stage 42 is used to control the pitch generator multivibrator 43, FIG. 6.
  • This is a conventional cross-coupled multivibrator in which the charging current for the coupling capacitors is provided by current-source transistors 44 and 45 to whose bases is applied the control voltage.
  • a backlash circuit 22, FIG. 1 is interposed in the control circuit.
  • This backlash circuit takes the form of two germanium diodes 46 and 47 and a tank capacitor 48, as shown in FIG. 6. Changes in control voltage must be greater than the forward voltage of the diodes to be transferred to the control point of the multivibrator, so that significant changes in control voltage are transmitted while ripple components are rejected.
  • Gate 21 of FIG. 1 is realized by an FET switch 50 placed in the control line.
  • the control voltage effectively falls to zero and without the switch the multivibrator output would cease. Then, when voicing recommenced, the system would take a finite and not insignificant time to build up to the correct frequency. This is avoided by arranging for the switch 50 to be opened during the absence of voicing and then the effect of the bias network and tank capacitor 48 is to keep the multivibrator running.
  • the multivibrator frequency will, however, drift slowly towards a pre-set median value. In this way the multivibrator is already operating in approximately the correct condition when voicing resumes.
  • the switch 50 is controlled by detector 17, FIG. 1.
  • Noise is generated by amplifying the input noise of an operational amplifier 51, FIG. 6, operating with a high source impedance.
  • the feedback network 52 shown provides feedback only at d.c. and low frequencies so that the generator is operating essentially with open-loop gain for medium and high frequencies.
  • the terms low, medium and high are here used in the context of the speech frequency range.
  • the feedback configuration is designed to provide d.c. stability and a 6 dB/octave bass roll-off in the noise output. Frequency compensation provides a 6 dB/octave high frequency roll-off in open-loop gain, so that the net result is a broad band of noise.
  • the center frequency of the band can be set so that it is centered at the optimum value for unvoiced speech synthesis. Since the noise amplitude obtained varies from one amplifier to another, the output signal is clipped to a standard amplitude by a pair of reversed diodes 53 and 54 in parallel. The signal from the pitch generator 43 is reduced to the same amplitude for application to the voiced/unvoiced switch by a suitable tap on the multivibrator.
  • the voiced/unvoiced gate or switch 24, FIG. 1, is essentially a single-pole change-over switch, to pass the output of either the pitch generator, multivibrator 43, FIG. 6, or the noise generator 51, FIG. 6.
  • This is achieved electronically by the arrangement shown in FIG. 6, which is a two-channel analog switch using bipolar transistors 55 and 56 in the saturated mode.
  • the control voltage swings between volts and -10 volts, and the transistors are so biased that one switches from a conducting state to a non-conducting state as the other switches in the reverse direction.
  • a dead space is avoided by the pair of diodes 57 and 58 in the bias chain.
  • the two outputs from the switch are fed to the squaring circuit 25, FIG. 1, which is realized by the differential amplifier 60, FIG. 6. This also rejects the spurious pulses which are introduced into both lines via the switching transistors when the switching signal changes polarity.
  • This stage is arranged to square the signal by providing d.c. feedback only. Thus, at the output of this stage the signal consists of an infinitely clipped waveform which is either periodic (i.e., at the fundamental frequency during voicing), or random (during unvoiced speech).
  • the switching signal for switch 24, FIG. 1, is provided by detector 17, which is illustrated in detail in FIG. 7.
  • Each of the analyzer channels, FIG. 2, has a low pass filter 29 following the rectifier 28. These low pass filtered channel signals reflect the energy levels in each of the frequency ranges covered by the channels.
  • the two signals to be compared are derived from the two groups of channels by diode gate networks 18 and 19, FIG. 1. As shown in detail in FIG. 7, these two groups of channels consist of the six lowest frequency channels and the four highest frequency channels. The largest signal in each group is transmitted by the relevant diode in the two gate groups through resistors 70 and 71.
  • the decision circuit is a comparator 72 so arranged that the output holds the switch in the unvoiced state when the upper group of channels has a larger signal than the lower. When the upper group has a smaller signal, the switch is in the voiced state. This decision relies upon the fact that during fricatives there is a fairly strong energy component at high frequencies but little energy at low frequencies, and vice versa.
  • a two level clamp 73 and 74 is applied to the high frequency bus input to the comparator to ensure the correct input under all conditions for a helium environment where, in helium speech, the amplitude of the unvoiced components relative to the voiced components is relatively depressed as compared to normal speech due to the Donald Duck effect.
  • One clamp ensures that for h.f. signal levels above about 3 volts the output is always voiced, to cater to those vowel sounds which have a relatively high amount of energy at high frequencies.
  • the other clamp ensures that for signal levels lower than about 1 volt the switch is always in the unvoiced condition. This ensures that a steady background noise is produced at the output during the absence of speech, rather than allowing erratic excitation depending upon the characteristics of the input noise.
  • Positive feedback is applied to the comparator to provide a small amount of hysteresis in order to avoid spurious transitions when the two inputs are similar.
  • the output pulse generator 26, FIG. 1, receives the rectangular waveform from the switch via a level detector 80, FIG. 8, which sharpens the edges and removes spurious ripples around zero by infinitely clipping the signal. The positive going edges of the signal are then used to trigger the monostable circuit 81, which has a power stage 82 and 83 added to provide the necessary output drive capability.
  • the charge and discharge paths for the monostable timing capacitor are separated by a diode 84 in such a way that the charge on the capacitor (and hence the pulse width) is dependent upon the elapsed time since the previous pulse. As the time decreases the pulse becomes shorter, and the effect is to tend to equalize the average power of the pulse train. The correction is not exact, and the power tends to increase somewhat as the pulse rate rises. This is desirable in the case of a helium speech processor.
  • the pulse rate is highest during random excitation, so that the energy level is somewhat higher for unvoiced speech than for voiced speech and this helps to compensate for the fact that the energy in the fricatives is depressed during helium speech.
  • a helium environment vocoder having a multichannel speech analyzer, a multichannel speech synthesizer and an excitation system for extracting time varying characteristics of the excitation function of input helium speech and reproducing this function for the excitation of said synthesizer, said excitation system comprising:
  • first means coupled to said analyzer for extracting continuously the fundamental frequency from said input speech
  • second means coupled to said first means for producing a sequence of pulses at a rate having predetermined relationship to said fundamental frequency
  • noise generator to produce noise
  • third means coupled to said analyzer for determining whether said input speech is voiced or unvoiced;
  • fourth means coupled to said second means, said noise generator and said third means responsive to the output signal of said third means for selecting said sequence of pulses if said input speech is voiced and for selecting said noise from said noise generator if said input speech is unvoiced;
  • fifth means coupled to said fourth means and said synthesizer for applying the selected one of said sequence of pulses and said noise as an excitation input pulse stream to said synthesizer, said input pulse stream having a constant energy level;
  • said third means including a comparator having two inputs and an output, one of said inputs being coupled to a first group of channels of said analyzer having a passband related to voiced speech and the other of said inputs being coupled to a second group of channels of said analyzer having a passband related to unvoiced speech, said second group of channels being different than said first group of channels, and
  • a two level clamp circuit coupled to said other of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
  • a helium environment vocoder having a multichannel speech analyzer, a multichannel speech synthesizer and an excitation system for extracting time varying characteristics of the excitation function of input helium speech and reproducing this function for the excitation of said synthesizer, said excitation system comprising:
  • first means coupled to said analyzer for extracting continuously the fundamental frequency from said input speech
  • second means coupled to said first means for producing a sequence of pulses at a rate having a predetermined relationship to said fundamental frequency
  • noise generator and said third means responsive to the output signal of said third means for selecting said sequence of pulses if said input speech is voiced and for selecting said noise from said noise generator if said input speech is unvoiced; and fifth means coupled to said fourth means and said synthesizer for applying the selected one of said sequence of pulses and said noise as an excitation input pulse stream to said synthesizer, said input pulse stream having a constant energy level; said second means including a voltage controlled multivibrator, and an input circuit for said multivibrator having a capacitor, and a pair of isolating diodes coupled to said capacitor, said pair of diodes having coupled thereto for applying to said capacitor through said pair of diodes a direct current voltage bearing a predetermined relationship to said fundamental frequency.
  • a vocoder according to claim 2, wherein said third means includes a comparator having two inputs and an output, one of said inputs being coupled to a first group of channels of said analyzer having a passband related to voiced speech and the other of said inputs being coupled to a second group of channels of said analyzer having a passband related to unvoiced speech, said second group of channels being different than said first group of channels,
  • a two level clamp circuit coupled to one of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
  • a helium environment vocoder having a multia noise generator to produce noise
  • third means coupled to said analyzer for determining whether said input speech is voiced or unvoiced;
  • fourth means coupled to said second means, said noise generator and said third means responsive to the output signal of said third means for selecting said sequence of pulses if said input speech is voiced and for selecting said noise from said noise generator if said input speech is unvoiced;
  • fifth means coupled to said fourth means and said synthesizer for applying the selected one of said sequence of pulses and said noise as an excitation input pulse stream to said synthesizer, said input pulse stream having a constant energy level;
  • each channel of said analyzer including said first means including sixth means coupled in parallel to the output of a given number of said rectifiers to produce a sum of the output signals passed by each of said given number of said rectifiers, a first differential amplifier having an output and two inputs, one of said two inputs of said first differential amplifier being coupled to said sixth means to receive said sum of said output signals,
  • a seventh means coupled between said sixth means and the other of said two inputs of said first differential amplifier to smooth said sum of said output signals prior to being applied to said other of said two inputs of said first differential amplifier
  • a monostable circuit coupled to said eighth means driven by a squared output signal from said eighth means, and a low pass filter means having gain coupled to the output of said monostable circuit to operate on the output signal of said monostable circuit.
  • a vocoder according to claim 4, wherein said third means includes a comparator having two inputs and an output, one
  • a two level clamp circuit coupled to one of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
  • a vocoder includes a voltage controlled multivibrator, and an input circuit for said multivibrator having a capacitor, and
  • a pair of isolating diodes coupled to said capacitor, said pair of diodes having coupled thereto for applying to said capacitor through said pair of diodes a direct current voltage bearing a predetermined relationship to said fundamental frequency.
  • a vocoder according to claim 6, wherein said third means includes a two level clamp circuit coupled to one of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
  • a vocoder according to claim 4, wherein said eighth means includes a second differential amplifier having an output and two inputs,
  • a first integrating network coupled between said output of said first differential amplifier and one of said two inputs of said second differential amplifier, said first integrating network having a first time constant
  • a second integrating network coupled between said output of said first differential amplifier and the other of said two inputs of said second differential amplifier, said second integrating network having a second time constant different than said first time constant
  • a vocoder includes a comparator having two inputs and an output, one of said inputs being coupled to a first group of channels of said analyzer having a passband related to voiced speech and the other of said inputs being coupled to a second group of channels of said analyzer having a passband related to unvoiced speech, said second group of channels being different than said first group of channels, and
  • a two level clamp circuit coupled to one of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
  • a vocoder according to claim 8 wherein said second means includes a voltage controlled multivibrator, and
  • an input circuit for said multivibrator having a capacitor, and a pair of isolating diodes coupled to said capacitor, said pair of diodes having coupled thereto,
  • a vocoder for applying to said capacitor through said pair of diodes a direct current voltage bearing a predetermined relationship to said fundamental frequency.
  • said third means includes a comparator having two inputs and an output, one of said inputs being coupled to a first group of channels of said analyzer having a passband related to voiced speech and the other of said inputs being coupled to a second group of channels of said analyzer having a passband related to unvoiced speech, said second group of channels being different than said first group of channels, and a two level clamp circuit coupled to one of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
  • a vocoder according to claim 10 further including switching means coupled to said second means, said third means and said input circuit for said multivibrator, said switching means being responsive to the detection of unvoiced speech by said third means to disconnect said second means from said input circuit for said multivibrator.
  • a vocoder according to claim 12 wherein said third means includes a comparator having two inputs and an output, one of said inputs being coupled to a first group of channels of said analyzer having a passband related to voiced speech and the other of said inputs being coupled to a second group of channels
  • and a two level clamp circuit coupled to one of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
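
The voiced/unvoiced decision described for detector 17 (comparator 72 with the two level clamp 73 and 74) can be approximated in software. The following Python sketch is only an illustration of the logic stated in the description, not the patented circuit: the function name, the 22-element channel list, and the hysteresis value of 0.05 V are assumptions, and the thresholds of about 3 V and about 1 V are taken from the text.

```python
def voiced_unvoiced(channel_levels, prev_voiced,
                    hi_clamp=3.0, lo_clamp=1.0, hysteresis=0.05):
    """Return True for voiced, False for unvoiced.

    channel_levels: low-pass filtered channel signals in volts, ordered from
    the lowest to the highest frequency channel (the text suggests 22).
    prev_voiced: previous decision, used for the assumed hysteresis.
    """
    # Diode gates pass the largest signal in each group of channels.
    low_bus = max(channel_levels[:6])    # six lowest frequency channels
    high_bus = max(channel_levels[-4:])  # four highest frequency channels

    # Two level clamp on the high frequency bus.
    if high_bus > hi_clamp:
        return True    # strong h.f. energy: treated as a vowel, always voiced
    if high_bus < lo_clamp:
        return False   # weak h.f. energy: steady noise excitation, unvoiced

    # Comparator with a small amount of positive feedback (hysteresis):
    # the current state is favoured when the two buses are similar.
    margin = hysteresis if prev_voiced else -hysteresis
    return low_bus + margin > high_bus
```

For example, a frame with 2.0 V in the low group and 1.5 V in the high group is classed as voiced, while 0.2 V in the low group and 2.0 V in the high group is classed as unvoiced.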

Abstract

There is disclosed herein a vocoder system having a multichannel speech analyzer, a multichannel speech synthesizer and an excitation system for extracting time-varying characteristics of the excitation function of the input speech signal and reproducing this function for the excitation of the synthesizer. The excitation system includes a speech extractor, a voiced/unvoiced detector, a pitch generator, a noise generator, a voiced/unvoiced switch and an output pulse generator. The pitch extractor operates on the short term envelope of the speech and is derived from a signal which is the sum of the signals of all analyzer channels. Constant width pulses are generated and integrated to form a varying d.c. signal proportional to the frequency of the extracted pitch. During voiced speech a multivibrator controlled by the d.c. signal produces the input to the output pulse generator. The output pulse generator produces an output pulse train of constant energy and is applied for excitation of the synthesizer channels. During unvoiced speech the noise generator drives the output pulse generator.
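
The statement that constant width pulses are integrated to form a d.c. signal proportional to pitch frequency can be checked numerically. The sketch below is a simplified digital model, not the analog circuit of the patent: the function name, the 1 ms pulse width, the 48 kHz sample rate, and the use of a plain average in place of the two stage low pass filter are all assumptions made for illustration.

```python
import numpy as np

def pitch_to_dc(f0_hz, pulse_width_s=1e-3, duration_s=1.0, fs=48_000):
    """Average level of a unit-height, fixed-width pulse train at rate f0_hz."""
    n = int(duration_s * fs)
    period_samples = round(fs / f0_hz)          # one trigger per pitch period
    width_samples = round(pulse_width_s * fs)   # monostable pulse width
    # Position of each sample inside its pitch period.
    position = np.arange(n) % period_samples
    pulses = (position < width_samples).astype(float)
    # An ideal integrator / low pass filter is modelled by the mean value.
    return pulses.mean()

dc_100 = pitch_to_dc(100.0)
dc_200 = pitch_to_dc(200.0)
print(dc_100, dc_200)
```

Because every pulse has the same area, the average equals pulse width times pulse rate, so doubling the pitch frequency doubles the d.c. level in this model.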

Description

United States Patent
Roworth
HELIUM ENVIRONMENT VOCODER
[75] Inventor: Donald Anthony Acott Roworth, Sawbridgeworth, England
[73] Assignee: International Standard Corporation, New York, N.Y.
[22] Filed: May 5, 1972
[21] Appl. No.: 250,534
[30] Foreign Application Priority Data: June 10, 1971, Great Britain, 19957/71
[52] U.S. Cl.: 179/1 SA
[51] Int. Cl.: G10l 1/00
[58] Field of Search: 179/1 SA, 15.55 R
[45] July 23, 1974
[56] References Cited
UNITED STATES PATENTS
3,012,098 12/1961 Riesz 179/1 SA
3,071,652 1/1963 Schroeder 179/1 SA
3,176,073 3/1965 Samuelson 179/1 SA
3,377,428 4/1968 Dersch 179/1 SA
3,418,662 12/1968 Bottomley 179/1 SA
3,431,362 3/1969 Miller 179/1 SA
3,573,374 4/1971 Focht 179/1 SA
3,624,302 11/1971 Atal 179/1 SA
3,676,596 7/1972 Willems 179/1 SA
Primary Examiner: Kathleen H. Claffy
Assistant Examiner: Jon Bradford Leaheey
Attorney, Agent, or Firm: John T. O'Halloran; Menotti J. Lombardi, Jr.; Alfred C. Hill
13 Claims, 8 Drawing Figures
HELIUM ENVIRONMENT VOCODER
BACKGROUND OF THE INVENTION
This invention relates to a speech processor, such as a vocoder, and in particular to an excitation system therefor.
Such processors are especially useful in the processing of helium speech, which suffers severe distortion due to the speaker breathing an exotic gas mixture at abnormal pressures. This distortion is sometimes referred to as the Donald Duck effect.
SUMMARY OF THE INVENTION
speech; second means coupled to the first means for producing a sequence of pulses at a rate having a predetermined relationship to the fundamental frequency; a noise generator to produce noise; third means coupled to the analyzer for determining whether the input speech is voiced or unvoiced; fourth means coupled to the second means, the noise generator and the third means responsive to the output signal of the third means for selecting the sequence of pulses if said input speech is voiced and for selecting the noise from the noise generator if the input speech is unvoiced; and fifth means coupled to the fourth means and the synthesizer for applying the selected one of the sequence of pulses and the noise as an excitation input pulse stream to the synthesizer, the input pulse stream having a constant energy level.
Another feature of the present invention is the provision of a vocoder as defined above wherein each channel of the analyzer includes a bandpass filter coupled to the input speech signal, the passband of the bandpass filter being different for each channel, and a rectifier coupled to the output of the bandpass filter; and the first means includes sixth means coupled in parallel to the output of a given number of the rectifiers to produce a sum of the output signals passed by each of the given number of the rectifiers, a first differential amplifier having an output and two inputs, one of the two inputs of the first differential amplifier being coupled to the sixth means to receive the sum of the output signals, a seventh means coupled between the sixth means and the other of the two inputs of the first differential amplifier to smooth the sum of the output signals prior to being applied to the other of the two inputs of the first differential amplifier, eighth means coupled to the output of the first differential amplifier to square the output signal of the first differential amplifier, a monostable circuit coupled to the eighth means driven by a squared output signal from the eighth means, and a low pass filter means having gain coupled to the output of the monostable circuit to operate on the output signal of the monostable circuit.
Still another feature of the present invention is the provision of a vocoder as defined above wherein the second means includes a voltage controlled multivibrator, and an input circuit for the multivibrator having a capacitor, and a pair of isolating diodes coupled to the capacitor, the pair of diodes having coupled thereto for applying to the capacitor through the pair of diodes a direct current voltage bearing a predetermined relationship to the fundamental frequency.
A further feature of the present invention is the provision of a vocoder as defined above and further including switching means coupled to the second means, the third means and the input circuit for the multivibrator, the switching means being responsive to the detection of unvoiced speech by the third means to disconnect the second means from the input circuit for the multivibrator.
Still a further feature of the present invention is the provision of a vocoder as defined above wherein the third means includes a comparator having two inputs and an output, one of the inputs being coupled to a first group of channels of the analyzer having a passband related to voiced speech and the other of the inputs being coupled to a second group of channels of the analyzer having a passband related to unvoiced speech, the second group of channels being different than the first group of channels, and a two level clamp circuit coupled to one of the two inputs to hold the output in one condition when the input exceeds a predetermined first threshold level and to hold the output in the other condition when that input falls below a second predetermined threshold level lower than the first threshold level.
this invention will become more apparent by reference to the following description taken in conjunction with the accompanying drawing in which:
FIG. 1 is a block diagram of a helium environment vocoder arrangement in accordance with the principles of the present invention;
FIG. 2 illustrates the block diagram of the multichannel analyzer and synthesizer channels of FIG. 1;
FIG. 3 illustrates the schematic diagram of the first and second stages of the pitch extractor section of the excitation system of FIG. 1;
FIG. 4 illustrates the frequency response of the second stage of FIG. 3;
FIG. 5 illustrates the schematic diagram of the third and fourth stages of the pitch extraction section of the excitation system of FIG. 1;
FIG. 6 illustrates the schematic diagram of the pitch and noise generators of the excitation system of FIG. 1;
FIG. 7 illustrates the schematic diagram of the voiced/unvoiced detector for the excitation system of FIG. 1; and
FIG. 8 illustrates the schematic diagram of the output pulse generator for the excitation system of FIG. 1.
DESCRIPTION OF THE PREFERRED EMBODIMENT
In the general arrangement shown in FIG. 1 the input analyzer channel also produces a channel signal which is the input for the corresponding channel in the synthesizer 12, the outputs of which are summed to give the processed speech output. The excitation system can be broadly divided into two sections, namely, pitch extraction from the analyzer and pitch generation for the synthesizer. The pitch extraction section operates upon the short-term envelope of the speech which is obtained from all the analyzer channels via summing network 11. The d.c. component and unwanted high frequency a.c. components are eliminated by d.c. eliminator 13 and squaring circuit 14, respectively, to give a square wave at the fundamental frequency. This square wave is then converted to a d.c. voltage whose amplitude is proportional to the instantaneous frequency averaged over a short period by pulse generator and integrator 16, respectively. Meanwhile, a voiced/unvoiced decision is made by voiced/unvoiced detector 17, based on the relative high and low frequency energies in the multichannel inputs to the synthesizer, as derived by two groups of diode gates 18 and 19. Detector 17 controls the application of the d.c. voltage representing pitch to a voltage controlled multivibrator 20 via gate 21 and backlash circuit 22. At the same time detector 17 is responsible for selecting either the output of multivibrator 20, or the output of noise generator 23 by way of the changeover switch 24. The selected output is fed via a squaring circuit 25 to an output pulse generator 26 which provides the excitation signal for the synthesizer channels.
Turning now to the details of the arrangement of FIG. 1, the analyzer 10 consists of a number of similar channels, say 22 in all, each of which has a bandpass filter 27, FIG. 2, followed by a rectifier 28. Each bandpass filter 27 covers a different portion of the speech spectrum, and provides an output which, during voiced speech, is approximately equivalent to a pure tone at the filter center frequency amplitude modulated with a sawtooth waveform at the fundamental frequency. After rectification, with a short time constant tank capacitor (not shown), the result is a d.c. signal with a sawtooth type of waveform and a superimposed a.c. ripple component at the channel input filter center frequency. By summing this signal from all the channels in parallel in summing network 11, the sawtooth component, which is approximately the same for all the channels, is enhanced and the ripple components, being different for each channel, are diminished. The result is a d.c. signal whose amplitude is varying in accordance with the dynamics of the speech signal, and upon which is imposed a sawtooth waveform at the fundamental frequency.
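The effect of summing network 11 can be illustrated numerically. The following is a minimal sketch only, assuming an idealized unit sawtooth common to every channel and a sinusoidal ripple of arbitrary amplitude at each filter center frequency; the function names, the ripple amplitude and the center frequencies are hypothetical and are not taken from the specification.

```python
import math

def channel_envelope(t, f0, fc, ripple=0.3):
    """Idealized rectified channel output: a sawtooth at the pitch f0
    plus a ripple at the (assumed) filter center frequency fc."""
    saw = 1.0 - (t * f0) % 1.0            # one decaying tooth per pitch period
    return saw + ripple * math.sin(2 * math.pi * fc * t)

def summed_envelope(t, f0, centres):
    """Parallel sum of all channel envelopes, as in a summing network."""
    return sum(channel_envelope(t, f0, fc) for fc in centres)

def ripple_to_saw_ratio(f0, centres, fs=48000, n=4800):
    """RMS of the residual ripple in the sum, relative to the summed sawtooth."""
    k = len(centres)
    acc = 0.0
    for i in range(n):
        t = i / fs
        saw = 1.0 - (t * f0) % 1.0
        resid = summed_envelope(t, f0, centres) - k * saw
        acc += resid * resid
    return math.sqrt(acc / n) / k

single = ripple_to_saw_ratio(100.0, [500.0])
many = ripple_to_saw_ratio(100.0, [300.0 + 250.0 * i for i in range(22)])
print(single, many)
```

Under these assumptions the sawtooth adds in proportion to the number of channels while the unrelated ripples add only incoherently, so the relative ripple in the 22 channel sum is smaller than in a single channel.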
It is now necessary to eliminate the d.c. component and unwanted high frequency a.c. components to produce a square wave at the fundamental frequency. Care is required in this operation: rapid rises in speech energy can produce a sharply rising d.c. waveform and this transient must be eliminated without losing the information from the wanted low frequency a.c. component. This could normally be achieved by a high pass filter, but if the filter has a sufficiently rapid roll-off to provide good discrimination its impulse response will produce spurious signals which distort the required a.c. information.
The first stage of the pitch extraction section uses a differential amplifier 30, FIG. 3, to compare the input waveform from summing network 11 with a smoothed version of itself. The input waveform is applied to one input terminal of the amplifier and also to a tank circuit 31, from which the other input is derived. The rise time and d.c. components of the two signals are the same, but one signal has a smaller sawtooth component and a longer decay time than the other signal. Thus, the d.c. component is eliminated for rising and steady signals, but falling signals (such as at the end of a word) produce a temporary d.c. offset due to one input to the differential amplifier taking longer to decay than the other.
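The behaviour of this first stage may be sketched in discrete time. The sketch below assumes that tank circuit 31 can be modelled as a peak follower that rises with the input at once and decays by a fixed factor per sample; that model, the decay factor and the function names are illustrative assumptions, not the circuit of FIG. 3.

```python
def tank(samples, decay=0.9995):
    """Smoothed copy of the input: follows rises immediately and decays
    slowly (hypothetical per-sample decay factor)."""
    out, level = [], 0.0
    for x in samples:
        level = max(x, level * decay)
        out.append(level)
    return out

def first_stage(samples, decay=0.9995):
    """Differential amplifier: the input minus its tank-smoothed version."""
    return [x - s for x, s in zip(samples, tank(samples, decay))]

# A sawtooth (period 100 samples) on a pedestal that drops from 5 to 1,
# loosely imitating the end of a word.
samples = [(5.0 if i < 2000 else 1.0) + 1.0 - (i % 100) / 100 for i in range(7000)]
out = first_stage(samples)
print(min(out[1000:2000]), min(out[2000:2200]), min(out[6000:7000]))
```

In this model the output carries the sawtooth without the steady pedestal, and the step down in level produces a larger temporary offset that dies away as the tank decays, which corresponds to the falling-signal case described above.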
This remaining d.c. component is eliminated, the derived a.c. component low pass filtered and the remainder squared by the second stage of FIG. 3. Again the signal is applied to both inputs of a differential amplifier 32 with RC integrating networks at each input. The two integrating networks 33 and 34 are arranged to have slightly different integrating times (about 20 percent difference) and this results in a frequency response as shown in FIG. 4: the gain falls away at about 6 dB/octave in both directions from a center frequency which is set by the network time constants. The d.c. components of the signal are completely eliminated without introducing any high pass networks: effectively the operation relies on the difference in phase shift between the two integrating networks 33 and 34. This stage is also arranged to square the output by utilizing the feedback network 35, which provides d.c. feedback but no a.c. feedback.
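The shape of the response of FIG. 4 can be checked with a short calculation. The sketch assumes that each integrating network is an ideal first-order RC low pass and that the amplifier simply takes the difference of the two; the time constants used (1 ms and 1.2 ms, a 20 percent difference) and the function names are illustrative only.

```python
import math

def stage_gain(f, tau1, tau2):
    """Magnitude of 1/(1 + jw*tau1) - 1/(1 + jw*tau2) at frequency f (Hz)."""
    w = 2 * math.pi * f
    h1 = 1 / (1 + 1j * w * tau1)
    h2 = 1 / (1 + 1j * w * tau2)
    return abs(h1 - h2)

def centre_frequency(tau1, tau2):
    """Frequency of maximum gain for the idealized difference network."""
    return 1 / (2 * math.pi * math.sqrt(tau1 * tau2))

tau1, tau2 = 1.0e-3, 1.2e-3
f0 = centre_frequency(tau1, tau2)
print(f0, stage_gain(0.0, tau1, tau2), stage_gain(f0, tau1, tau2))
```

With these assumptions the gain is exactly zero at d.c., peaks where the angular frequency equals the reciprocal of the geometric mean of the two time constants, and halves for each octave moved far away from the peak in either direction, that is, about 6 dB/octave.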
Next it is necessary to convert the square wave into a d.c. voltage whose amplitude is proportional to the instantaneous frequency averaged over a short period. The square wave is applied to a monostable circuit 40, FIG. 5, which produces a pulse of fixed width each time it is triggered by a negative going edge in the square wave input. These pulses are then passed through a two stage low pass filter. The first stage 41 produces gain, set by the potentiometer for unity frequency ratio. The second stage 42 has only unity gain and its output is a varying d.c. with minimal ripple at the pulse repetition frequency. The gain is set so that the succeeding voltage controlled multivibrator 20, FIG. 1, produces pulses at the correct rate. The second stage of the filter is arranged to have a d.c. shift sufficient to compensate exactly for the d.c. threshold exhibited in the control characteristic of multivibrator 20, due to the need to overcome the V_BE of the control transistors.
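The frequency to voltage conversion performed by the monostable circuit and low pass filter can be sketched as the mean of a fixed-width pulse train. The pulse width, amplitude, sample rate and function name below are illustrative assumptions, and an ideal average stands in for the two stage filter.

```python
def mean_pulse_voltage(freq_hz, width_s=0.001, amplitude=1.0,
                       fs=100000, duration_s=1.0):
    """Average level of a train of fixed-width pulses, one per input cycle."""
    n = int(fs * duration_s)
    width_n = int(width_s * fs)            # pulse width in samples
    period_n = int(round(fs / freq_hz))    # trigger spacing in samples
    high = sum(1 for i in range(n) if i % period_n < width_n)
    return amplitude * high / n

print(mean_pulse_voltage(100.0), mean_pulse_voltage(200.0))
```

Because every pulse has the same width, the average equals width times repetition rate times amplitude, so doubling the pitch frequency doubles the d.c. level.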
The varying d.c. voltage from the transistor of stage 42 is used to control the pitch generator multivibrator 43, FIG. 6. This is a conventional cross-coupled multivibrator in which the charging current for the coupling capacitors is provided by current-source transistors 44 and 45 to whose bases is applied the control voltage.
There may be still some ripple remaining on the d.c. control voltage and to prevent this from frequency modulating the multivibrator output a backlash circuit 22, FIG. 1, is interposed in the control circuit. This backlash circuit takes the form of two germanium diodes 46 and 47 and a tank capacitor 48, as shown in FIG. 6. Changes in control voltage must be greater than the forward voltage of the diodes to be transferred to the control point of the multivibrator, so that significant changes in control voltage are transmitted while ripple components are rejected.
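The backlash action can be modelled as a dead band around the voltage held on the tank capacitor. The sketch below assumes ideal diodes with a fixed forward voltage and a capacitor that holds its charge perfectly; the 0.3 volt figure is an illustrative value for a germanium diode and, like the function name, is an assumption rather than a value from the specification.

```python
def backlash(control, v_forward=0.3, start=0.0):
    """Dead-band follower: the held voltage moves only when the input
    differs from it by more than the (assumed) diode forward voltage."""
    out, held = [], start
    for v in control:
        if v - held > v_forward:
            held = v - v_forward      # one diode conducts, capacitor charges
        elif held - v > v_forward:
            held = v + v_forward      # the other diode conducts, capacitor discharges
        out.append(held)
    return out

ripple = [1.0 + 0.1 * (-1) ** i for i in range(50)]   # small ripple about 1 V
step = [1.0] * 5 + [2.0] * 5                          # genuine change in pitch
print(backlash(ripple, start=1.0)[-1], backlash(step, start=1.0)[-1])
```

In this model a ripple smaller than the forward voltage leaves the control point unchanged, while a larger change is passed on, less one diode drop.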
Gate 21 of FIG. 1 is realized by an FET switch 50 placed in the control line. During unvoiced speech the control voltage effectively falls to zero and without the switch the multivibrator output would cease. Then, when voicing recommenced, the system would take a finite and not insignificant time to build up to the correct frequency. This is avoided by arranging for the switch 50 to be opened during the absence of voicing and then the effect of the bias network and tank capacitor 48 is to keep the multivibrator running. The multivibrator frequency will, however, drift slowly towards a pre-set median value. In this way the multivibrator is already operating in approximately the correct condition when voicing resumes. The switch 50 is controlled by detector 17, FIG. 1.
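The hold and drift behaviour of the control point can be sketched as a single update rule. This is a simplified illustration only: the backlash diodes are ignored, the drift is assumed to be a first-order approach to the median, and the median value, drift rate and function name are hypothetical.

```python
def control_point(v_in, voiced, v_hold, median=1.0, drift=0.01):
    """One time step of the multivibrator control voltage."""
    if voiced:
        return v_in                               # switch closed: follow the pitch voltage
    return v_hold + drift * (median - v_hold)     # switch open: drift towards the median

v = 2.0
for _ in range(1000):                             # a long unvoiced interval
    v = control_point(0.0, False, v)
print(v)
```

During the unvoiced interval the held voltage never collapses to zero, so the multivibrator in this model keeps running near its median rate until voicing resumes.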
For unvoiced speech a noise generator 23, FIG. 1, is required. Noise is generated by amplifying the input noise of an operational amplifier 51, FIG. 6, operating with a high source impedance. The feedback network 52 shown provides feedback only at d.c. and low frequencies so that the generator is operating essentially with open-loop gain for medium and high frequencies. The terms low, medium and high are here used in the context of the speech frequency range. The feedback configuration is designed to provide d.c. stability and a 6 dB/octave bass roll-off in the noise output. Frequency compensation provides a 6 dB/octave high frequency roll-off in open-loop gain, so that the net result is a broad band of noise. By adjustment of the value of the capacitor in the feedback loop the center frequency of the band can be set so that it is centered at the optimum value for unvoiced speech synthesis. Since the noise amplitude obtained varies from one amplifier to another the output signal is clipped to a standard amplitude by a pair of reversed diodes 53 and 54 in parallel. The signal from the pitch generator 43 is reduced to the same amplitude for application to the voiced/unvoiced switch by a suitable tap on the multivibrator.
The voiced/unvoiced gate or switch 24, FIG. 1, is essentially a single-pole change-over switch, to pass the output of either the pitch generator, multivibrator 43, FIG. 6, or the noise generator 51, FIG. 6. This is achieved electronically by the arrangement shown in FIG. 6, which is a two-channel analog switch using bipolar transistors 55 and 56 in the saturated mode. The control voltage swings between volts and -10 volts, and the transistors are so biased that one switches from a conducting state to a non-conducting state as the other switches in the reverse direction. A dead space is avoided by the pair of diodes 57 and 58 in the bias chain. When one of the transistors is conducting it acts as a low impedance shunt and adequately attenuates the signal on that line.
The two outputs from the switch are fed to the squaring circuit 25, FIG. 1, which is realized by the differential amplifier 60, FIG. 6. This also rejects the spurious pulses which are introduced into both lines via the switching transistors when the switching signal changes polarity. This stage is arranged to square the signal by providing d.c. feedback only. Thus, at the output of this stage the signal consists of an infinitely clipped waveform which is either periodic (i.e., at the fundamental frequency during voicing), or random (during unvoiced speech).
The switching signal for switch 24, FIG. 1, is provided by detector 17, which is illustrated in detail in FIG. 7. Each of the analyzer channels, FIG. 2, has a low pass filter 29 following the rectifier 28. These low pass filtered channel signals reflect the energy levels in each of the frequency ranges covered by the channels. To make the voiced/unvoiced decision a comparison is made of the maximum signal amplitude in two groups of channels, one group covering the upper frequencies of the speech spectrum and the other group covering the lower frequencies. The circuit of FIG. 7 will, depending on certain rules built into it, make the decision as to whether the speech is voiced or unvoiced.
The two signals to be compared are derived from the two groups of channels by diode gate networks 18 and 19, FIG. 1. As shown in detail in FIG. 7 these two groups of channels consist of the six lowest frequency channels and the four highest frequency channels. The largest signal in each group is transmitted by the relevant diode in the two gate groups through resistors 70 and 71. The decision circuit is a comparator 72 so arranged that the output holds the switch in the unvoiced state when the upper group of channels has a larger signal than the lower. When the upper group has a smaller signal, the switch is in the voiced state. This decision relies upon the fact that during fricatives there is a fairly strong energy component at high frequencies but little energy at low frequencies, and vice versa.
A two level clamp 73 and 74 is applied to the high frequency bus input to the comparator to ensure the correct input under all conditions for a helium environment where, in helium speech, the amplitude of the unvoiced components relative to the voiced components is relatively depressed as compared to normal speech due to the Donald Duck effect. One clamp ensures that for h.f. signal levels above about 3 volts the output is always voiced, to cater to those vowel sounds which have a relatively high amount of energy at high frequencies. The other clamp ensures that for signal levels lower than about 1 volt the switch is always in the unvoiced condition. This ensures that a steady background noise is produced at the output during the absence of speech, rather than allowing erratic excitation depending upon the characteristics of the input noise. Positive feedback is applied to the comparator to provide a small amount of hysteresis in order to avoid spurious transitions when the two inputs are similar.
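The decision rules of the comparator and two level clamp can be summarized in a few lines. The sketch takes the largest signal in each group, as the diode gates do, and uses the approximate 3 volt and 1 volt clamp levels stated above; the function and parameter names are hypothetical, the channel levels are assumed to be in volts, and the comparator hysteresis is omitted for brevity.

```python
def is_voiced(lf_levels, hf_levels, upper_clamp=3.0, lower_clamp=1.0):
    """Voiced/unvoiced decision from low pass filtered channel levels."""
    lf = max(lf_levels)            # largest of the low frequency group
    hf = max(hf_levels)            # largest of the high frequency group
    if hf > upper_clamp:
        return True                # strong h.f. energy: always voiced
    if hf < lower_clamp:
        return False               # very low level: always unvoiced
    return hf < lf                 # otherwise compare the two groups

print(is_voiced([2.5] * 6, [1.5] * 4))   # vowel-like: low group dominates
print(is_voiced([1.2] * 6, [2.0] * 4))   # fricative-like: high group dominates
```

A fuller model would retain the previous decision and require the inputs to differ by a small margin before changing state, to represent the positive feedback.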
The output pulse generator 26, FIG. 1, receives the rectangular waveform from the switch via a level detector 80, FIG. 8, which sharpens the edges and removes spurious ripples around zero by infinitely clipping the signal. The positive going edges of the signal are then used to trigger the monostable circuit 81, which has a power stage 82 and 83 added to provide the necessary output drive capability.
The charge and discharge paths for the monostable timing capacitor are separated by a diode 84 in such a way that the charge on the capacitor (and hence the pulse width) is dependent upon the elapsed time since the previous pulse. As the time decreases the pulse becomes shorter, and the effect is to tend to equalize the average power of the pulse train. The correction is not exact, and the power tends to increase somewhat as the pulse rate rises. This is desirable in the case of a helium speech processor. The pulse rate is highest during random excitation, so that the energy level is somewhat higher for unvoiced speech than for voiced speech and this helps to compensate for the fact that the energy in the fricatives is depressed during helium speech.
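The partial equalization of power can be illustrated with a simple model. The sketch assumes that the timing capacitor recharges exponentially between pulses and that the pulse width is proportional to the charge reached; the maximum width, the recharge time constant and the function names are hypothetical values chosen for illustration.

```python
import math

def pulse_width(elapsed_s, w_max=0.5e-3, tau=5e-3):
    """Pulse width as a function of the time since the previous pulse,
    assuming exponential recharge of the timing capacitor."""
    return w_max * (1.0 - math.exp(-elapsed_s / tau))

def mean_power(rate_hz, **kw):
    """Relative average power of a unit-amplitude pulse train: width times rate."""
    return pulse_width(1.0 / rate_hz, **kw) * rate_hz

print(pulse_width(1 / 100), pulse_width(1 / 200))
print(mean_power(100), mean_power(200))
```

Under these assumptions doubling the pulse rate shortens each pulse, so the average power rises by less than a factor of two; the correction is inexact in the same direction as that described above.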
While I have described above the principles of my invention in connection with specific apparatus it is to be clearly understood that this description is made only by way of example and not as a limitation to the scope of my invention as set forth in the objects thereof and in the accompanying claims.
I claim:
1. A helium environment vocoder having a multichannel speech analyzer, a multichannel speech synthesizer and an excitation system for extracting time varying characteristics of the excitation function of input helium speech and reproducing this function for the excitation of said synthesizer, said excitation system comprising:
first means coupled to said analyzer for extracting continuously the fundamental frequency from said input speech;
second means coupled to said first means for producing a sequence of pulses at a rate having predetermined relationship to said fundamental frequency;
a noise generator to produce noise;
third means coupled to said analyzer for determining whether said input speech is voiced or unvoiced;
fourth means coupled to said second means, said noise generator and said third means responsive to the output signal of said third means for selecting said sequence of pulses if said input speech is voiced and for selecting said noise from said noise generator if said input speech is unvoiced; and
fifth means coupled to said fourth means and said synthesizer for applying the selected one of said sequence of pulses and said noise as an excitation input pulse stream to said synthesizer, said input pulse stream having a constant energy level;
said third means including a comparator having two inputs and an output, one of said inputs being coupled to a first group of channels of said analyzer having a passband related to voiced speech and the other of said inputs being coupled to a second group of channels of said analyzer having a passband related to unvoiced speech, said second group of channels being different than said first group of channels, and
a two level clamp circuit coupled to said other of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
2. A helium environment vocoder having a multichannel speech analyzer, a multichannel speech synthesizer and an excitation system for extracting time varying characteristics of the excitation function of input helium speech and reproducing this function for the excitation of said synthesizer, said excitation system comprising:
first means coupled to said analyzer for extracting continuously the fundamental frequency from said input speech;
second means coupled to said first means for producing a sequence of pulses at a rate having a predetermined relationship to said fundamental frequency;
a noise generator to produce noise;
third means coupled to said analyzer for determining whether said input speech is voiced or unvoiced;
fourth means coupled to said second means, said noise generator and said third means responsive to the output signal of said third means for selecting said sequence of pulses if said input speech is voiced and for selecting said noise from said noise generator if said input speech is unvoiced; and
fifth means coupled to said fourth means and said synthesizer for applying the selected one of said sequence of pulses and said noise as an excitation input pulse stream to said synthesizer, said input pulse stream having a constant energy level;
said second means including a voltage controlled multivibrator, and an input circuit for said multivibrator having a capacitor, and a pair of isolating diodes coupled to said capacitor, said pair of diodes having coupled thereto for applying to said capacitor through said pair of diodes a direct current voltage bearing a predetermined relationship to said fundamental frequency.
3. A vocoder according to claim 2, wherein said third means includes a comparator having two inputs and an output, one of said inputs being coupled to a first group of channels of said analyzer having a passband related to voiced speech and the other of said inputs being coupled to a second group of channels of said analyzer having a passband related to unvoiced speech, said second group of channels being different than said first group of channels, and
a two level clamp circuit coupled to one of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
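4. A helium environment vocoder having a multichannel speech analyzer, a multichannel speech synthesizer and an excitation system for extracting time varying characteristics of the excitation function of input helium speech and reproducing this function for the excitation of said synthesizer, said excitation system comprising:
first means coupled to said analyzer for extracting continuously the fundamental frequency from said input speech;
second means coupled to said first means for producing a sequence of pulses at a rate having a predetermined relationship to said fundamental frequency;
a noise generator to produce noise;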
third means coupled to said analyzer for determining whether said input speech is voiced or unvoiced;
fourth means coupled to said second means, said noise generator and said third means responsive to the output signal of said third means for selecting said sequence of pulses if said input speech is voiced and for selecting said noise from said noise generator if said input speech is unvoiced; and
fifth means coupled to said fourth means and said synthesizer for applying the selected one of said sequence of pulses and said noise as an excitation input pulse stream to said synthesizer, said input pulse stream having a constant energy level;
each channel of said analyzer including said first means including sixth means coupled in parallel to the output of a given number of said rectifiers to produce a sum of the output signals passed by each of said given number of said rectifiers, a first differential amplifier having an output and two inputs, one of said two inputs of said first differential amplifier being coupled to said sixth means to receive said sum of said output signals,
a seventh means coupled between said sixth means and the other of said two inputs of said first differential amplifier to smooth said sum of said output signals prior to being applied to said other of said two inputs of said first differential amplifier,
eighth means coupled to said output of said first differential amplifier to square the output signal of said first differential amplifier,
a monostable circuit coupled to said eighth means driven by a squared output signal from said eighth means, and a low pass filter means having gain coupled to the output of said monostable circuit to operate on the output signal of said monostable circuit.
5. A vocoder according to claim 4, wherein said third means includes a comparator having two inputs and an output, one of said inputs being coupled to a first group of channels of said analyzer having a passband related to voiced speech and the other of said inputs being coupled to a second group of channels of said analyzer having a passband related to unvoiced speech, said second group of channels being different than said first group of channels, and
a two level clamp circuit coupled to one of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
6. A vocoder according to claim 4, wherein said second means includes a voltage controlled multivibrator, and an input circuit for said multivibrator having a capacitor, and
a pair of isolating diodes coupled to said capacitor, said pair of diodes having coupled thereto for applying to said capacitor through said pair of diodes a direct current voltage bearing a predetermined relationship to said fundamental frequency.
7. A vocoder according to claim 6, wherein said third means includes a two level clamp circuit coupled to one of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
8. A vocoder according to claim 4, wherein said eighth means includes a second differential amplifier having an output and two inputs,
a first integrating network coupled between said output of said first differential amplifier and one of said two inputs of said second differential amplifier, said first integrating network having a first time constant,
a second integrating network coupled between said output of said first differential amplifier and the other of said two inputs of said second differential amplifier, said second integrating network having a second time constant different than said first time constant, and
a feedback network coupled between said output of said second differential amplifier and one of said two inputs of said second differential amplifier, said feedback network providing direct current negative feedback and no alternating current feedback.
9. A vocoder according to claim 8, wherein said third means includes a comparator having two inputs and an output, one of said inputs being coupled to a first group of channels of said analyzer having a passband related to voiced speech and the other of said inputs being coupled to a second group of channels of said analyzer having a passband related to unvoiced speech, said second group of channels being different than said first group of channels, and
a two level clamp circuit coupled to one of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
10. A vocoder according to claim 8, wherein said second means includes a voltage controlled multivibrator, and
an input circuit for said multivibrator having a capacitor, and a pair of isolating diodes coupled to said capacitor, said pair of diodes having coupled thereto,
for applying to said capacitor through said pair of diodes a direct current voltage bearing a predetermined relationship to said fundamental frequency.
11. A vocoder according to claim 10, wherein said third means includes a comparator having two inputs and an output, one of said inputs being coupled to a first group of channels of said analyzer having a passband related to voiced speech and the other of said inputs being coupled to a second group of channels of said analyzer having a passband related to unvoiced speech, said second group of channels being different than said first group of channels, and
a two level clamp circuit coupled to one of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
12. A vocoder according to claim 10, further including switching means coupled to said second means, said third means and said input circuit for said multivibrator, said switching means being responsive to the detection of unvoiced speech by said third means to disconnect said second means from said input circuit for said multivibrator.
13. A vocoder according to claim 12, wherein said third means includes a comparator having two inputs and an output, one of said inputs being coupled to a first group of channels of said analyzer having a passband related to voiced speech and the other of said inputs being coupled to a second group of channels of said analyzer having a passband related to unvoiced speech, said second group of channels being different than said first group of channels, and
a two level clamp circuit coupled to one of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.

Claims (13)

1. A helium environment vocoder having a multichannel speech analyzer, a multichannel speech synthesizer and an excitation system for extracting time varying characteristics of the excitation function of input helium speech and reproducing this function for the excitation of said synthesizer, said excitation system comprising: first means coupled to said analyzer for extracting continuously the fundamental frequency from said input speech; second means coupled to said first means for producing a sequence of pulses at a rate having a predetermined relationship to said fundamental frequency; a noise generator to produce noise; third means coupled to said analyzer for determining whether said input speech is voiced or unvoiced; fourth means coupled to said second means, said noise generator and said third means responsive to the output signal of said third means for selecting said sequence of pulses if said input speech is voiced and for selecting said noise from said noise generator if said input speech is unvoiced; and fifth means coupled to said fourth means and said synthesizer for applying the selected one of said sequence of pulses and said noise as an excitation input pulse stream to said synthesizer, said input pulse stream having a constant energy level; said third means including a comparator having two inputs and an output, one of said inputs being coupled to a first group of channels of said analyzer having a passband related to voiced speech and the other of said inputs being coupled to a second group of channels of said analyzer having a passband related to unvoiced speech, said second group of channels being different than said first group of channels, and a two level clamp circuit coupled to said other of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
2. A helium environment vocoder having a multichannel speech analyzer, a multichannel speech synthesizer and an excitation system for extracting time varying characteristics of the excitation function of input helium speech and reproducing this function for the excitation of said synthesizer, said excitation system comprising: first means coupled to said analyzer for extracting continuously the fundamental frequency from said input speech; second means coupled to said first means for producing a sequence of pulses at a rate having a predetermined relationship to said fundamental frequency; a noise generator to produce noise; third means coupled to said analyzer for determining whether said input speech is voiced or unvoiced; fourth means coupled to said second means, said noise generator and said third means responsive to the output signal of said third means for selecting said sequence of pulses if said input speech is voiced and for selecting said noise from said noise generator if said input speech is unvoiced; and fifth means coupled to said fourth means and said synthesizer for applying the selected one of said sequence of pulses and said noise as an excitation input pulse stream to said synthesizer, said input pulse stream having a constant energy level; said second means including a voltage controlled multivibrator, and an input circuit for said multivibrator having a capacitor, and a pair of isolating diodes coupled to said capacitor, said pair of diodes having coupled thereto for applying to said capacitor through said pair of diodes a direct current voltage bearing a predetermined relationship to said fundamental frequency.
3. A vocoder according to claim 2, wherein said third means includes a comparator having two inputs and an output, one of said inputs being coupled to a first group of channels of said analyzer having a passband related to voiced speech and the other of said inputs being coupled to a second group of channels of said analyzer having a passband related to unvoiced speech, said second group of channels being different than said first group of channels, and a two level clamp circuit coupled to one of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
4. A helium environment vocoder having a multichannel speech analyzer, a multichannel speech synthesizer and an excitation system for extracting time varying characteristics of the excitation function of input helium speech and reproducing this function for the excitation of said synthesizer, said excitation system comprising: first means coupled to said analyzer for extracting continuously the fundamental frequency from said input speech; second means coupled to said first means for producing a sequence of pulses at a rate having a predetermined relationship to said fundamental frequency; a noise generator to produce noise; third means coupled to said analyzer for determining whether said input speech is voiced or unvoiced; fourth means coupled to said second means, said noise generator and said third means responsive to the output signal of said third means for selecting said sequence of pulses if said input speech is voiced and for selecting said noise from said noise generator if said input speech is unvoiced; and fifth means coupled to said fourth means and said synthesizer for applying the selected one of said sequence of pulses and said noise as an excitation input pulse stream to said synthesizer, said input pulse stream having a constant energy level; each channel of said analyzer including a bandpass filter coupled to said input speech signal, the passband of said bandpass filter being different for each channel, and a rectifier coupled to the output of said bandpass filter; and said first means including sixth means coupled in parallel to the output of a given number of said rectifiers to produce a sum of the output signals passed by each of said given number of said rectifiers, a first differential amplifier having an output and two inputs, one of said two inputs of said first differential amplifier being coupled to said sixth means to receive said sum of said output signals, a seventh means coupled between said sixth means and the other of said two inputs of said first differential amplifier to smooth said sum of said output signals prior to being applied to said other of said two inputs of said first differential amplifier, eighth means coupled to said output of said first differential amplifier to square the output signal of said first differential amplifier, a monostable circuit coupled to said eighth means driven by a squared output signal from said eighth means, and a low pass filter means having gain coupled to the output of said monostable circuit to operate on the output signal of said monostable circuit.
5. A vocoder according to claim 4, wherein said third means includes a comparator having two inputs and an output, one of said inputs being coupled to a first group of channels of said analyzer having a passband related to voiced speech and the other of said inputs being coupled to a second group of channels of said analyzer having a passband related to unvoiced speech, said second group of channels being different than said first group of channels, and a two level clamp circuit coupled to one of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
6. A vocoder according to claim 4, wherein said second means includes a voltage controlled multivibrator, and an input circuit for said multivibrator having a capacitor, and a pair of isolating diodes coupled to said capacitor, said pair of diodes having coupled thereto for applying to said capacitor through said pair of diodes a direct current voltage bearing a predetermined relationship to said fundamental frequency.
7. A vocoder according to claim 6, wherein said third means includes a comparator having two inputs and an output, one of said inputs being coupled to a first group of channels of said analyzer having a passband related to voiced speech and the other of said inputs being coupled to a second group of channels of said analyzer having a passband related to unvoiced speech, said second group of channels being different than said first group of channels, and a two level clamp circuit coupled to one of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
8. A vocoder according to claim 4, wherein said eighth means includes a second differential amplifier having an output and two inputs, a first integrating network coupled between said output of said first differential amplifier and one of said two inputs of said second differential amplifier, said first integrating network having a first time constant, a second integrating network coupled between said output of said first differential amplifier and the other of said two inputs of said second differential amplifier, said second integrating network having a second time constant different than said first time constant, and a feedback network coupled between said output of said second differential amplifier and one of said two inputs of said second differential amplifier, said feedback network providing direct current negative feedback and no alternating current feedback.
9. A vocoder according to claim 8, wherein said third means includes a comparator having two inputs and an output, one of said inputs being coupled to a first group of channels of said analyzer having a passband related to voiced speech and the other of said inputs being coupled to a second group of channels of said analyzer having a passband related to unvoiced speech, said second group of channels being different than said first group of channels, and a two level clamp circuit coupled to one of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
10. A vocoder according to claim 8, wherein said second means includes a voltage controlled multivibrator, and an input circuit for said multivibrator having a capacitor, and a pair of isolating diodes coupled to said capacitor, said pair of diodes having coupled thereto for applying to said capacitor through said pair of diodes a direct current voltage bearing a predetermined relationship to said fundamental frequency.
11. A vocoder according to claim 10, wherein said third means includes a comparator having two inputs and an output, one of said inputs being coupled to a first group of channels of said analyzer having a passband related to voiced speech and the other of said inputs being coupled to a second group of channels of said analyzer having a passband related to unvoiced speech, said second group of channels being different than said first group of channels, and a two level clamp circuit coupled to one of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
12. A vocoder according to claim 10, further including switching means coupled to said second means, said third means and said input circuit for said multivibrator, said switching means being responsive to the detection of unvoiced speech by said third means to disconnect said second means from said input circuit for said multivibrator.
13. A vocoder according to claim 12, wherein said third means includes a comparator having two inputs and an output, one of said inputs being coupled to a first group of channels of said analyzer having a passband related to voiced speech and the other of said inputs being coupled to a second group of channels of said analyzer having a passband related to unvoiced speech, said second group of channels being different than said first group of channels, and a two level clamp circuit coupled to one of said two inputs to hold said output in one condition when the input exceeds a predetermined first threshold level and to hold said output in the other condition when that input falls below a second predetermined threshold level lower than said first threshold level.
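The voiced/unvoiced decision and excitation selection recited in the claims above can be illustrated with a short software sketch. This is not the claimed analog circuit, only a rough digital analogy under stated assumptions: the function names, the threshold values, the frame length, and the choice to apply the two thresholds to the voiced-band level are all hypothetical and are not taken from the patent.

```python
import math
import random


def is_voiced(low_band, high_band, upper=0.6, lower=0.3):
    """Voiced/unvoiced decision with a two-level clamp (all values assumed).

    low_band  -- summed level of the channels associated with voiced speech
    high_band -- summed level of the channels associated with unvoiced speech
    """
    if low_band > upper:      # clamp holds the output in the "voiced" condition
        return True
    if low_band < lower:      # clamp holds the output in the "unvoiced" condition
        return False
    return low_band > high_band   # otherwise the plain comparator decides


def pulse_train(n, period):
    """Unit pulses every `period` samples (stands in for the multivibrator)."""
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def noise(n, rng):
    """Uniform noise samples (stands in for the noise generator)."""
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def normalise(frame, energy=1.0):
    """Scale a frame so that its total energy equals `energy`."""
    total = sum(x * x for x in frame)
    if total == 0.0:
        return list(frame)
    scale = math.sqrt(energy / total)
    return [x * scale for x in frame]


def excitation(low_band, high_band, period, n=80, rng=None):
    """Select pulses for voiced input or noise for unvoiced input,
    then normalise so every frame carries the same energy."""
    rng = rng or random.Random(0)
    if is_voiced(low_band, high_band):
        source = pulse_train(n, period)
    else:
        source = noise(n, rng)
    return normalise(source)
```

Calling `excitation(0.9, 0.1, period=20)` yields a pulse frame, while `excitation(0.1, 0.5, period=20)` yields a noise frame; both are scaled to the same total energy, mirroring the constant energy level recited for the excitation input.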
US00250534A 1971-06-10 1972-05-05 Helium environment vocoder Expired - Lifetime US3825685A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
GB1995771 1971-06-10

Publications (1)

Publication Number Publication Date
US3825685A true US3825685A (en) 1974-07-23

Family

ID=10137952

Family Applications (1)

Application Number Title Priority Date Filing Date
US00250534A Expired - Lifetime US3825685A (en) 1971-06-10 1972-05-05 Helium environment vocoder

Country Status (4)

Country Link
US (1) US3825685A (en)
FR (1) FR2141363A5 (en)
GB (1) GB1334177A (en)
NL (1) NL7207374A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3012098A (en) * 1941-11-22 1961-12-05 Bell Telephone Labor Inc Telephone privacy
US3071652A (en) * 1959-05-08 1963-01-01 Bell Telephone Labor Inc Time domain vocoder
US3176073A (en) * 1961-12-04 1965-03-30 Gen Dynamics Corp Buzz-hiss decision system for a channel vocoder
US3377428A (en) * 1960-12-29 1968-04-09 Ibm Voiced sound detector circuits and systems
US3418662A (en) * 1965-03-31 1968-12-31 Nat Res Dev Prosthetic hand with improved control system for activation by electromyogram signals
US3431362A (en) * 1966-04-22 1969-03-04 Bell Telephone Labor Inc Voice-excited,bandwidth reduction system employing pitch frequency pulses generated by unencoded baseband signal
US3573374A (en) * 1968-01-25 1971-04-06 Philco Ford Corp Formant vocoder utilizing resonator damping
US3624302A (en) * 1969-10-29 1971-11-30 Bell Telephone Labor Inc Speech analysis and synthesis by the use of the linear prediction of a speech wave
US3676596A (en) * 1969-07-09 1972-07-11 Philips Corp Periodicity analyser for a quasi-periodic signal including a detection circuit

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5113449A (en) * 1982-08-16 1992-05-12 Texas Instruments Incorporated Method and apparatus for altering voice characteristics of synthesized speech
US6928406B1 (en) * 1999-03-05 2005-08-09 Matsushita Electric Industrial Co., Ltd. Excitation vector generating apparatus and speech coding/decoding apparatus
US20060111908A1 (en) * 2004-11-25 2006-05-25 Casio Computer Co., Ltd. Data synthesis apparatus and program
US7523037B2 (en) * 2004-11-25 2009-04-21 Casio Computer Co., Ltd. Data synthesis apparatus and program

Also Published As

Publication number Publication date
FR2141363A5 (en) 1973-01-19
GB1334177A (en) 1973-10-17
NL7207374A (en) 1972-12-12

Similar Documents

Publication Publication Date Title
KR100495718B1 (en) Audio signal processing circuits and methods and audio systems
Noll et al. Short‐Time “Cepstrum” Pitch Detection
US3825685A (en) Helium environment vocoder
US4630300A (en) Front-end processor for narrowband transmission
US4506379A (en) Method and system for discriminating human voice signal
US2928902A (en) Signal transmission
US4158751A (en) Analog speech encoder and decoder
US3479599A (en) Signal sensitive depressed threshold detector
Kretsinger et al. The use of fast limiting to improve the intelligibility of speech in noise
GB664401A (en) Improvements in thermionic valve circuits
US3125723A (en) shaver
GB1008565A (en) Improvements in or relating to voiced sound detection circuits
US3573374A (en) Formant vocoder utilizing resonator damping
US3974336A (en) Speech processing system
JPH10173455A (en) Automatic dynamic range control circuit
CA1091163A (en) Sound processing method and apparatus
FR2423089A1 (en) NOISE LIMITING CIRCUIT ELIMINATING DISTORTION AT HIGH AMPLITUDES
US4600915A (en) Digital-to-analog converter circuit
US3499986A (en) Speech synthesizer
US3465102A (en) Readout analyzer
RU2032232C1 (en) Device for registration of speech pause in vocoder path
SU900467A1 (en) Noise suppressor
SU1443182A1 (en) Device for processing acoustic signals
JPH0632537B2 (en) Howling suppressor
SU1499525A1 (en) Compensator for multiplicative distortion of signals for systems with acoustic feedback

Legal Events

Date Code Title Description
AS Assignment

Owner name: STC PLC, 10 MALTRAVERS STREET, LONDON, WC2R 3HA, E

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST.;ASSIGNOR:INTERNATIONAL STANDARD ELECTRIC CORPORATION, A DE CORP.;REEL/FRAME:004761/0721

Effective date: 19870423

Owner name: STC PLC,ENGLAND

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:INTERNATIONAL STANDARD ELECTRIC CORPORATION, A DE CORP.;REEL/FRAME:004761/0721

Effective date: 19870423