WO2015067741A1 - Apparatus and method for active voice training - Google Patents

Apparatus and method for active voice training

Publication number
WO2015067741A1
Authority
WO
WIPO (PCT)
Prior art keywords
electrical signal
voice
input electrical
processing unit
voice processing
Application number
PCT/EP2014/074016
Other languages
French (fr)
Inventor
Thierry GAUJARENGUES
Klaus LOHMANN
Original Assignee
Soundev Holding Sa
Application filed by Soundev Holding Sa
Publication of WO2015067741A1

Classifications

    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B19/00 Teaching not covered by other main groups of this subclass
    • G09B19/04 Speaking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/10 Earpieces; Attachments therefor; Earphones; Monophonic headphones
    • H04R1/1091 Details not provided for in groups H04R1/1008 - H04R1/1083
    • H04R2201/00 Details of transducers, loudspeakers or microphones covered by H04R1/00 but not provided for in any of its subgroups
    • H04R2201/10 Details of earpieces, attachments therefor, earphones or monophonic headphones covered by H04R1/10 but not provided for in any of its subgroups
    • H04R2201/107 Monophonic and stereophonic headphones with microphone for two-way hands free communication
    • H04R2460/00 Details of hearing devices, i.e. of ear- or headphones covered by H04R1/10 or H04R5/033 but not provided for in any of their subgroups, or of hearing aids covered by H04R25/00 but not provided for in any of its subgroups
    • H04R2460/13 Hearing devices using bone conduction transducers

Definitions

  • the present invention generally relates to speech and/or voice training and more particularly to an apparatus and a method to actively support the user's learning efforts.
  • the present invention proposes in one of its aspects an apparatus comprising a microphone capable of being placed essentially in front of a user's mouth for capturing the sound of the user's voice and converting said sound into an input electrical signal. It further comprises a voice processing unit for processing said input electrical signal into an output electrical signal.
  • the apparatus also includes at least two electromechanical vibration transducers for applying vibrations to the user's skull, which electromechanical vibration transducers are capable of converting said output electrical signal into vibrations.
  • the voice processing unit of the apparatus comprises an electronic circuit assembly having a tone control unit to alter the input electrical signal depending on the amplitude of the input electrical signal, wherein the tone control unit alters the input electrical signal into said output electrical signal by applying a differential raising of a treble portion H if the input electrical signal is above a defined speech intensity threshold T for a defined trigger time t1.
  • the invention also provides a voice processing unit for processing an input electrical signal from the voice of a user captured by a microphone into an output electrical signal for producing vibrations by at least two electromechanical vibration transducers.
  • a voice processing unit comprises an electronic circuit assembly having a tone control unit to alter the input electrical signal depending on the amplitude of the input electrical signal, wherein the tone control unit alters the input electrical signal into said output electrical signal by applying a differential raising of a treble portion H if the input electrical signal is above a defined speech intensity threshold T for a defined trigger time t1.
  • the present invention concerns the use of an apparatus or of a voice processing unit as described herein for dynamic, frequency-selective and active voice-training.
  • the invention proposes a method for dynamic, frequency-selective and active voice-training using the apparatus or the voice processing unit as described herein.
  • a microphone is placed essentially in front of a user's mouth for capturing the sound of his voice and at least two electromechanical vibration transducers are placed against the user's skull, preferably against both temporal bones, more preferably the mastoid parts of the temporal bones.
  • the tone control unit alters the input electrical signal into said output electrical signal by applying a differential raising of a bass portion B if the input electrical signal is below said threshold T for a defined holding time t2.
  • the apparatus or the voice processing unit therefore further comprises dynamic voice-detection preferably based on intensity, including specified hold-delays, hysteresis, and signal ramping.
  • the apparatus or the voice processing unit presented herein preferably also comprises an automatic, preferably voice-coupled, equalizer section and/or high pass-rumble-protection.
  • the apparatus of the present invention does not comprise conventional loudspeakers to be placed over the user's ears. Hence, the user is able to normally hear all the sounds of the surroundings. The user is therefore able to use the apparatus in all circumstances and does not need to take the apparatus off for communication with other persons.
  • the audio-vocal loop is the natural process by which everyone receives, analyzes, assimilates and continuously adjusts the information transmitted by the voice. To properly function, the audio-vocal loop is based on the capacity of auditory discrimination, phonological awareness and integration of rhythm that each individual must be able to perform effortlessly. These skills are far beyond the word and are necessary in any learning process.
  • the apparatus of the present invention significantly improves attention, concentration, verbal working memory, short-term memory, speech and pronunciation, and/or rhythm and balance.
  • the basic principle is founded on being closer to the sensory message through bone conduction. Indeed, when we make a sound, we vibrate our vocal cords. This vibration is transmitted primarily by bone conduction and secondarily by air to the inner ear. Thus in our ears, we hear our own voice without difficulty. This bone transmission is fundamental because it avoids loss and alteration of the message related to air and ambient noise.
  • the apparatus of the invention thus allows the transmission through bone conduction to be specifically amplified.
  • the effect is an easier analysis and enhanced assimilation of the information the user emits.
  • the user starting in an optimal self-listening position is simply more aware about what he says.
  • the ear is not blocked by a loudspeaker, it continues to capture the ambient noise in the normal way.
  • the individual can concentrate on his own voice without being isolated from external sounds, so he learns to discriminate relevant sounds from those which are not.
  • he can hear another person (parents, teachers, therapists, coaches, ...) normally and interact with them without removing the headset.
  • the present apparatus is equipped with a voice processing unit implementing a dynamic filter that is responsive to the intensity of the voice.
  • the user hears his own voice by bone conduction but filtered with alternating contrast. This alternation will occur mainly on the attack words and long vowels.
  • "higher sounds" are more profoundly involved in the construction of language.
  • By amplifying these higher sounds and thus promoting their perception, the user will instantly and unconsciously improve the quality and speed of his voice significantly. His voice will progressively become richer in harmonics, smoother, warmer, more fluent and dynamic.
  • the present alternating filter allows the treble portion of the voice to be raised and, optionally, the bass portion to be correspondingly reduced. It thus enhances the transmission of high frequency harmonics, which play a fundamental role in the cortical stimulation and revitalization.
  • the voice processing unit splits the input electrical signal into a first dynamic voice detection signal and a second audio equalizer signal, the first dynamic voice detection signal being converted into a high voltage signal if the input electrical signal is above threshold T for said time t1 and into a defined low voltage signal if the input electrical signal is below threshold T, preferably after a defined holding time t2.
  • the speech intensity threshold T is between -40 and -80 dBV (measured at the microphone input) at a frequency of 1 kHz, preferably between -54 and -70 dBV at a frequency of 1 kHz.
  • threshold T implements a hysteresis function: the threshold level (dBV) for raising the treble portion (upswitch) is preferably different from (such as about 8-10 dB higher than) the threshold for raising the bass portion (backswitch).
  • the trigger time t1 is chosen to allow fast capturing of the voice while preventing surrounding noise from triggering the capturing when the user is not speaking.
  • trigger time t1 is generally selected between 1 and 1000 ms, preferably between 5 and 500 ms, particularly preferably between 10 and 50 ms.
  • the treble portion H to be raised in intensity during speech by the voice processing unit is generally situated between 500 Hz and 20 kHz, preferably between 800 Hz and 15 kHz.
  • the bass portion B in the context of the invention usually is between 5 Hz and 1 kHz, preferably between 20 Hz and 900 Hz, particularly preferably between 100 Hz and 800 Hz.
  • holding time t2 is selected between 1 and 2000 ms, preferably between 2 and 1000 ms, particularly preferably between 5 and 500 ms, and in particular between 20 and 200 ms.
  • the apparatus takes the form of a headset for being placed on the user's head including a microphone and the at least two transducers, while the voice processing unit itself is integrated in a separate casing and connected to the headset either by wires or wirelessly.
  • the apparatus is in the form of a headset for being placed on the user's head with all the components, such as the microphone, the transducers and the voice processing unit integrated to the headset.
  • Fig. 1 is a schematic diagram of a preferred embodiment of an apparatus of the present invention.
  • a preferred embodiment of the apparatus is a dynamic, frequency-selective, active voice training unit, and therefore advantageously contains several special (psychoacoustic, audiophile) circuits:
  • the apparatus or voice processing unit has the following features and functions, in which the pre-amplified microphone signal is split into two sections: a dynamic voice detection section and an audio-equalizer section.
  • the DC-voltage output of the dynamic voice detection affects the audio-equalizer section, and therefore the frequency content at the apparatus' output.
  • the apparatus affects only the frequency response of the audio signal itself, but not the average sound level at 1 kHz.
  • the dynamic voice detection causes a defined "low" DC voltage output, if there is no or only a weak signal (below defined threshold) present at the microphone.
  • the dynamic voice detection causes a defined "high" DC voltage output once the user reaches a defined speech intensity threshold (such as -58 dBV at the microphone input) and holds this intensity above the threshold for a defined time (e.g. 25 ms).
  • the DC output of the dynamic voice detection is thereafter "ramped" by a specified time, both, up and down, such as from 10 to 20 ms, e.g. about 15 ms.
  • the "ramped" DC output of the dynamic voice detection is preferably forwarded to voltage controlled resistors, which affect the audio-equalizer section in a specified manner, according to the training effect and psychoacoustic needs.
  • the resulting equalized (mono) audio signal is forwarded to a two-channel headphone amplifier with separate Lo-Z outputs (e.g. 10 Ohms) for left and right bone conduction.
  • the present apparatus will preferably be placed at the mastoid bone left and right of the user's head. Therefore, there is no isolation of the ears, and the user can continue to perceive acoustic signals from the surroundings.
  • the apparatus is thus a small and lightweight bone conduction headset combined with a voice processing unit, either as a separate clip-box or directly affixed to or integrated into the headset, preferably powered by a 3.7 V Li-ion battery with USB charging.
  • the user gets the training effect by listening to his own speech.
  • the audio path is fully analog.
  • the headset microphone position will most preferably be in front of the user's mouth, to reduce the risk for feedback.
  • the apparatus is very versatile and may be used
  • the method making use of the present apparatus is generally practiced in sessions, each of which should preferably last between about 20 minutes and 1.5 hours a day.
  • the sessions should be repeated during several weeks, preferably however with so-called "integration" breaks.
  • Fig. 1 is a schematic diagram of a preferred embodiment of the apparatus according to the invention comprising a microphone 10 providing an input electrical signal (Mic Input), a voice processing unit 20, as well as two transducers 30 (Bone Conduction Headphone).
  • the reference numbers between brackets in Fig. 1 refer to the paragraph number in which the relevant feature is further described.
  • the input electrical signal is generally preamplified by preamplifier 201.
  • the preamplified signal is then split into a (first) dynamic voice detection signal and a (second) audio equalizer signal.
  • the dynamic voice detection signal is processed by converting it into a high voltage signal 234 if the input electrical signal is above threshold 231 (T) for a trigger time 232 (t1) and into a defined low voltage signal 234 if the input electrical signal is below threshold 231 (T), preferably for a defined holding time 233 (t2).
  • the DC output 234 of the dynamic voice detection is thereafter ramped 235 by a specified time, both, up and down, such as for 15 ms.
  • the ramped DC output of the dynamic voice detection signal is thereafter sent to voltage controlled resistors 236 (VCR), which affect the audio-equalizer section in a predetermined manner, either through treble tone control unit 240 or bass tone control unit 260.
  • Buffer 250 is merely a pull-up amplifier.
  • the resulting equalized (mono) audio signal is forwarded to a two-channel headphone amplifier 270 with separate Lo-Z outputs for left and right bone conduction transducers 30.

Abstract

An apparatus comprises a microphone capable of being placed essentially in front of a user's mouth for capturing the sound of the user's voice and converting said sound into an input electrical signal, a voice processing unit for processing said input electrical signal into an output electrical signal, at least two electromechanical vibration transducers for applying vibrations to the user's skull, which electromechanical vibration transducers are capable of converting said output electrical signal into vibrations. The apparatus is characterized in that the voice processing unit comprises an electronic circuit assembly having a tone control unit to alter the input electrical signal depending on the amplitude of the input electrical signal, wherein the tone control unit alters the input electrical signal into said output electrical signal by applying a differential raising of a treble portion (H) if the input electrical signal is above a defined speech intensity threshold (T) for a defined trigger time (t1).

Description

APPARATUS AND METHOD FOR ACTIVE VOICE TRAINING
Technical field
[0001] The present invention generally relates to speech and/or voice training and more particularly to an apparatus and a method to actively support the user's learning efforts.
Background Art
[0002] Although a number of solutions exist for allegedly or truly supporting speech and/or voice training, these solutions are generally subject to one or more drawbacks. Some integrate conventional headphones and thus sequester the user from the (sounds of the) surroundings and/or (at least partly) prevent communication with a helping person, such as with a coach or trainer. Others combine aerial sound transmission with bone conduction, which makes these solutions subject to the same disadvantages of acoustic user isolation. Furthermore, it has been noted that the use of bone conduction alone does not always yield the positive training results expected or claimed.
Technical problem
[0003] It is therefore an object of the present invention to provide an (alternative) apparatus and method which effectively allow speech and voice to be actively trained without isolating the user from the surrounding sounds and noises.
General Description of the Invention
[0004] In order to overcome (at least some of) the above-mentioned problems, the present invention proposes in one of its aspects an apparatus comprising a microphone capable of being placed essentially in front of a user's mouth for capturing the sound of the user's voice and converting said sound into an input electrical signal. It further comprises a voice processing unit for processing said input electrical signal into an output electrical signal. The apparatus also includes at least two electromechanical vibration transducers for applying vibrations to the user's skull, which electromechanical vibration transducers are capable of converting said output electrical signal into vibrations. The voice processing unit of the apparatus according to the invention comprises an electronic circuit assembly having a tone control unit to alter the input electrical signal depending on the amplitude of the input electrical signal, wherein the tone control unit alters the input electrical signal into said output electrical signal by applying a differential raising of a treble portion H if the input electrical signal is above a defined speech intensity threshold T for a defined trigger time t1.
[0005] In a further aspect, the invention also provides a voice processing unit for processing an input electrical signal from the voice of a user captured by a microphone into an output electrical signal for producing vibrations by at least two electromechanical vibration transducers. Such a voice processing unit comprises an electronic circuit assembly having a tone control unit to alter the input electrical signal depending on the amplitude of the input electrical signal, wherein the tone control unit alters the input electrical signal into said output electrical signal by applying a differential raising of a treble portion H if the input electrical signal is above a defined speech intensity threshold T for a defined trigger time t1.
[0006] In a still further aspect the present invention concerns the use of an apparatus or of a voice processing unit as described herein for dynamic, frequency-selective and active voice-training.
[0007] In yet another aspect, the invention proposes a method for dynamic, frequency-selective and active voice-training using the apparatus or the voice processing unit as described herein. In such a method a microphone is placed essentially in front of a user's mouth for capturing the sound of his voice and at least two electromechanical vibration transducers are placed against the user's skull, preferably against both temporal bones, more preferably the mastoid parts of the temporal bones.
[0008] Preferably, the tone control unit alters the input electrical signal into said output electrical signal by applying a differential raising of a bass portion B if the input electrical signal is below said threshold T for a defined holding time t2.
[0009] In a preferred embodiment, the apparatus or the voice processing unit therefore further comprises dynamic voice-detection preferably based on intensity, including specified hold-delays, hysteresis, and signal ramping.
[0010] The apparatus or the voice processing unit presented herein preferably also comprises an automatic, preferably voice-coupled, equalizer section and/or high pass-rumble-protection.
[0011] It is important to note that the apparatus of the present invention does not comprise conventional loudspeakers to be placed over the user's ears. Hence, the user is able to normally hear all the sounds of the surroundings. The user is therefore able to use the apparatus in all circumstances and does not need to take the apparatus off for communication with other persons.
[0012] The audio-vocal loop is the natural process by which everyone receives, analyzes, assimilates and continuously adjusts the information transmitted by the voice. To properly function, the audio-vocal loop is based on the capacity of auditory discrimination, phonological awareness and integration of rhythm that each individual must be able to perform effortlessly. These skills are far beyond the word and are necessary in any learning process.
[0013] For many reasons, this loop may be disrupted. By optimizing its operating capacity, the apparatus of the present invention significantly improves attention, concentration, verbal working memory, short-term memory, speech and pronunciation, and/or rhythm and balance.
[0014] The basic principle is founded on being closer to the sensory message through bone conduction. Indeed, when we make a sound, we vibrate our vocal cords. This vibration is transmitted primarily by bone conduction and secondarily by air to the inner ear. Thus in our ears, we hear our own voice without difficulty. This bone transmission is fundamental because it avoids loss and alteration of the message related to air and ambient noise. The apparatus of the invention thus allows the transmission through bone conduction to be specifically amplified.
[0015] The effect is an easier analysis and enhanced assimilation of the information the user emits. The user, starting in an optimal self-listening position, is simply more aware of what he says. Because the ear is not blocked by a loudspeaker, it continues to capture the ambient noise in the normal way. The individual can concentrate on his own voice without being isolated from external sounds, so he learns to discriminate relevant sounds from those which are not. In addition, he can hear another person (parents, teachers, therapists, coaches, ...) normally and interact with them without removing the headset.
[0016] Even if bone conduction enhances self-listening, an important aspect is the quality of the message emitted by the voice. The present apparatus is equipped with a voice processing unit implementing a dynamic filter that is responsive to the intensity of the voice. The user hears his own voice by bone conduction but filtered with alternating contrast. This alternation will occur mainly on the attack words and long vowels. In fact, it has been found that "higher sounds" are more profoundly involved in the construction of language. By amplifying these sounds and thus promoting their perception, the user will instantly and unconsciously improve the quality and speed of his voice significantly. His voice will progressively become richer in harmonics, smoother, warmer, more fluent and dynamic.
[0017] Moreover, the present alternating filter allows the treble portion of the voice to be raised and, optionally, the bass portion to be correspondingly reduced. It thus enhances the transmission of high frequency harmonics, which play a fundamental role in the cortical stimulation and revitalization.
[0018] In a particularly preferred embodiment, the voice processing unit (of the apparatus) splits the input electrical signal into a first dynamic voice detection signal and a second audio equalizer signal, the first dynamic voice detection signal being converted into a high voltage signal if the input electrical signal is above threshold T for said time t1 and into a defined low voltage signal if the input electrical signal is below threshold T, preferably after a defined holding time t2.
[0019] In the context of the present invention, the speech intensity threshold T is between -40 and -80 dBV (measured at the microphone input) at a frequency of 1 kHz, preferably between -54 and -70 dBV at a frequency of 1 kHz. In a further preferred embodiment, threshold T implements a hysteresis function: the threshold level (dBV) for raising the treble portion (upswitch) is preferably different from (such as about 8-10 dB higher than) the threshold for raising the bass portion (backswitch).
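To make the threshold figures concrete, the following minimal sketch converts dBV levels (decibels relative to 1 V RMS) into voltages and derives a backswitch threshold from an upswitch threshold and a hysteresis width. The function and constant names are hypothetical, and the -58 dBV upswitch value is simply the example given in paragraph [0033]; nothing here is taken from the actual circuit.

```python
import math

def dbv_to_volts(level_dbv: float) -> float:
    """Convert a level in dBV (dB relative to 1 V RMS) to volts RMS."""
    return 10.0 ** (level_dbv / 20.0)

def volts_to_dbv(volts_rms: float) -> float:
    """Convert a positive RMS voltage to a level in dBV."""
    return 20.0 * math.log10(volts_rms)

# Illustrative values only (assumptions, not a specification):
UPSWITCH_DBV = -58.0    # threshold for switching to treble raising
HYSTERESIS_DB = 8.0     # lower end of the "about 8-10 dB" range
BACKSWITCH_DBV = UPSWITCH_DBV - HYSTERESIS_DB  # threshold for switching back

if __name__ == "__main__":
    for name, level in (("upswitch", UPSWITCH_DBV), ("backswitch", BACKSWITCH_DBV)):
        print(f"{name}: {level} dBV = {dbv_to_volts(level) * 1e3:.3f} mV RMS")
```

Both thresholds correspond to microphone-input voltages in the millivolt range or below, which is consistent with the pre-amplification mentioned in paragraph [0023].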
[0020] The trigger time t1 is chosen to allow fast capturing of the voice while preventing surrounding noise from triggering the capturing when the user is not speaking. In practice, trigger time t1 is generally selected between 1 and 1000 ms, preferably between 5 and 500 ms, particularly preferably between 10 and 50 ms.
[0021] The treble portion H to be raised in intensity during speech by the voice processing unit is generally situated between 500 Hz and 20 kHz, preferably between 800 Hz and 15 kHz. The bass portion B in the context of the invention usually is between 5 Hz and 1 kHz, preferably between 20 Hz and 900 Hz, particularly preferably between 100 Hz and 800 Hz. In the context of the present invention there may be a certain overlap between bass and treble portions without departing from the object. However, the intersection between both portions only represents a small fraction of the used spectra.
[0022] It is generally advantageous to keep the capturing of the microphone and processing by the voice processing unit even after the end of the last sound of the user in order to allow for appropriate handling of pausing between words or sentences. Hence, it is generally desirable to provide for a certain hysteresis in order to avoid too rapid switching between treble raising and bass raising. Therefore, in a still further preferred variant of the apparatus or of the voice processing unit, holding time t2 is selected between 1 and 2000 ms, preferably between 2 and 1000 ms, particularly preferably between 5 and 500 ms, and in particular between 20 and 200 ms.
[0023] In practice, it may be useful to pre-amplify the input electrical signal provided by the microphone before its further processing in the voice processing unit.
[0024] In one embodiment of the apparatus presented and described herein, the apparatus takes the form of a headset for being placed on the user's head including a microphone and the at least two transducers, while the voice processing unit itself is integrated in a separate casing and connected to the headset either by wires or wirelessly.
[0025] Alternatively and preferably, the apparatus is in the form of a headset for being placed on the user's head with all the components, such as the microphone, the transducers and the voice processing unit, integrated into the headset.
Brief Description of the Drawing
[0026] Preferred embodiments of the invention will now be described, by way of example, with reference to the accompanying drawing in which:
Fig. 1 is a schematic diagram of a preferred embodiment of an apparatus of the present invention.
[0027] Further details and advantages of the present invention will be apparent from the following detailed description of several non-limiting embodiments with reference to the attached drawing.
Description of Preferred Embodiments
[0028] Unlike usual voice amplifiers (like hearing aid amplifiers and products) a preferred embodiment of the apparatus is a dynamic, frequency-selective, active voice training unit, and therefore advantageously contains several special (psychoacoustic, audiophile) circuits:
• dynamic voice-detection (intensity), including specified hold-delays, hysteresis, and signal ramping
• automatic (voice-coupled) shelving equalizer section (fine-tuned for the training effect), additional high pass rumble protection
• special amplifier-section to drive a customized Lo-Z bone conductor headphone
• a small "one piece" ergonomic design, customized for the use of a special (lightweight) bone conductor headphone/headset microphone.
[0029] In a preferred variant, the apparatus or voice processing unit has the following features and functions, in which the pre-amplified microphone signal is split into two sections: a dynamic voice detection section and an audio-equalizer section.
[0030] Finally, the DC-voltage output of the dynamic voice detection affects the audio-equalizer section, and therefore the frequency content at the apparatus' output.
[0031] Unlike usual dynamic audio units such as a compressor, limiter or noise gate, the apparatus affects only the frequency response of the audio signal itself, but not the average sound level at 1 kHz.
[0032] The dynamic voice detection causes a defined "low" DC voltage output, if there is no or only a weak signal (below defined threshold) present at the microphone.
[0033] The dynamic voice detection causes a defined "high" DC voltage output once the user reaches a defined speech intensity threshold (such as -58 dBV at the microphone input) and holds this intensity above the threshold for a defined time (e.g. 25 ms).
[0034] In case the user's speech intensity drops below the threshold (such as -66 dBV at the microphone input because of hysteresis), and after a defined holding time (such as 70 ms), the output of the dynamic voice detection becomes "low" again.
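The detection behaviour of paragraphs [0032] to [0034] can be pictured as a two-state machine with a trigger time, a holding time and hysteresis. The sketch below is a discrete-time software illustration only, under the assumption that an input level in dBV is available at regular intervals; the class name `VoiceDetector`, its fields and the `step` method are hypothetical, and the described apparatus realizes this function in an analog circuit.

```python
from dataclasses import dataclass

@dataclass
class VoiceDetector:
    """Illustrative gate with hysteresis, trigger time and holding time."""
    up_dbv: float = -58.0      # upswitch threshold (example from [0033])
    down_dbv: float = -66.0    # backswitch threshold (example from [0034])
    trigger_ms: float = 25.0   # time above up_dbv before going "high"
    hold_ms: float = 70.0      # time below down_dbv before going "low"
    high: bool = False         # current detector output
    _timer_ms: float = 0.0     # time spent so far in the pending condition

    def step(self, level_dbv: float, dt_ms: float) -> bool:
        """Advance the detector by dt_ms given the current input level."""
        if not self.high:
            if level_dbv >= self.up_dbv:
                self._timer_ms += dt_ms
                if self._timer_ms >= self.trigger_ms:
                    self.high = True
                    self._timer_ms = 0.0
            else:
                self._timer_ms = 0.0  # level fell back: restart trigger timing
        else:
            if level_dbv < self.down_dbv:
                self._timer_ms += dt_ms
                if self._timer_ms >= self.hold_ms:
                    self.high = False
                    self._timer_ms = 0.0
            else:
                self._timer_ms = 0.0  # speech resumed: restart hold timing
        return self.high
```

With these example values, a short noise burst of less than 25 ms does not switch the output to "high", and a level between -66 and -58 dBV keeps the output in whichever state it already has.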
[0035] The DC output of the dynamic voice detection is thereafter "ramped" by a specified time, both, up and down, such as from 10 to 20 ms, e.g. about 15 ms.
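As a rough illustration of the ramping step, the sketch below turns a binary detector output into a control value that moves linearly between 0 and 1 over the ramp time, in both directions. A linear ramp and the function name `ramp_control` are assumptions made for this example; the text only states that the DC output is ramped up and down over roughly 10 to 20 ms.

```python
def ramp_control(gate, dt_ms, ramp_ms=15.0):
    """Return a 0..1 control value per step, ramped linearly toward the gate.

    gate    -- iterable of booleans (detector output per time step)
    dt_ms   -- duration of one time step in milliseconds
    ramp_ms -- time for a full 0-to-1 (or 1-to-0) transition
    """
    step = dt_ms / ramp_ms
    value = 0.0
    out = []
    for g in gate:
        target = 1.0 if g else 0.0
        if value < target:
            value = min(target, value + step)   # ramp up, clamp at target
        elif value > target:
            value = max(target, value - step)   # ramp down, clamp at target
        out.append(value)
    return out
```

Ramping avoids an abrupt change of the equalizer setting, which would otherwise be audible as a click when the detector switches.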
[0036] The "ramped" DC output of the dynamic voice detection is preferably forwarded to voltage controlled resistors, which affect the audio-equalizer section in a specified manner, according to the training effect and psychoacoustic needs.
[0037] The resulting equalized (mono) audio signal is forwarded to a two-channel headphone amplifier with separate Lo-Z outputs (e.g. 10 Ohms) for left and right bone conduction.
[0038] Unlike usual headphones, the present apparatus will preferably be placed at the mastoid bone left and right of the user's head. Therefore, there is no isolation of the ears, and the user can continue to perceive acoustic signals from the surroundings.
[0039] In a preferred embodiment of the invention, the apparatus is thus a small and lightweight bone conduction headset combined with a voice processing unit, either as a separate clip-box or directly affixed to or integrated into the headset, preferably powered by a 3.7 V Li-ion battery with USB charging.
[0040] The user gets the training effect by listening to his own speech. The audio path is fully analog. The headset microphone position will most preferably be in front of the user's mouth, to reduce the risk of feedback.
[0041] The utilization of the present apparatus is easy: it is sufficient to place the apparatus such that the microphone is in front of the mouth and the transducers rest against both temporal bones, and to speak normally into the microphone.
[0042] The apparatus is very versatile and may be used
- alone: reading aloud, singing, rehearsing, pronunciation and expression exercises, repetition exercises,
- with parents or any other helping person: reading, homework
- with a professional (therapist, teacher, coach),
- in everyday conversation, oral presentation, discussions.
[0043] To be effective, the method making use of the present apparatus (or voice processing unit) is generally practiced in sessions, each of which should preferably last between about 20 minutes and 1.5 hours a day. The sessions should be repeated during several weeks, preferably however with so-called "integration" breaks.
[0044] Fig. 1 is a schematic diagram of a preferred embodiment of the apparatus according to the invention comprising a microphone 10 providing an input electrical signal (Mic Input), a voice processing unit 20, as well as two transducers 30 (Bone Conduction Headphone). The reference numbers between brackets in Fig. 1 refer to the paragraph number in which the relevant feature is further described.
[0045] The input electrical signal is generally preamplified by preamplifier 201. The preamplified signal is then split into a (first) dynamic voice detection signal and a (second) audio equalizer signal.
[0046] The dynamic voice detection signal is processed by converting it into a defined high voltage signal 234 if the input electrical signal is above threshold 231 (T) for a trigger time 232 (t1) and into a defined low voltage signal 234 if the input electrical signal is below threshold 231 (T), preferably for a defined holding time 233 (t2). The DC output 234 of the dynamic voice detection is thereafter ramped 235 over a specified time, both up and down, such as 15 ms. The ramped DC output of the dynamic voice detection signal is thereafter sent to voltage controlled resistors 236 (VCR), which affect the audio-equalizer section in a predetermined manner, either through treble tone control unit 240 or bass tone control unit 260. Buffer 250 is merely a pull-up amplifier. The resulting equalized (mono) audio signal is forwarded to a two-channel headphone amplifier 270 with separate Lo-Z outputs for left and right bone conduction transducers 30.
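The comparison against threshold 231 presupposes some measure of the signal level, and the description does not say how that level is derived in the analog circuit. Purely as a digital stand-in, and under the assumption that an RMS measurement over a short frame is an acceptable proxy, the level of a frame of samples (in volts) could be expressed in dBV as follows. The function name `frame_level_dbv` is hypothetical.

```python
import math


def frame_level_dbv(samples_v):
    """RMS level of a frame of samples (in volts), expressed in dBV (0 dBV = 1 V RMS)."""
    if not samples_v:
        raise ValueError("empty frame")
    mean_square = sum(s * s for s in samples_v) / len(samples_v)
    if mean_square == 0.0:
        return float("-inf")  # digital silence has no finite level
    # 10*log10(mean square) equals 20*log10(RMS).
    return 10.0 * math.log10(mean_square)
```

A level obtained this way could then be compared with a threshold such as -58 dBV; the frame length trades responsiveness against stability of the estimate and is not specified in the source.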
Legend:
10 Microphone
20 Voice Processing Unit
201 Preamplifier
210 Volume control
220 High pass filter
230 Dynamic voice detection
231 Threshold T
232 Trigger time t1
233 Hold time t2
234 DC output/Hysteresis
235 Ramping
236 Voltage controlled resistor(s)
240 Tone control: treble
250 Buffer
260 Tone control: bass
270 Amplifier(s)
30 Transducer(s)

Claims

1. An apparatus comprising a microphone capable of being placed essentially in front of a user's mouth for capturing the sound of the user's voice and converting said sound into an input electrical signal, a voice processing unit for processing said input electrical signal into an output electrical signal, at least two electromechanical vibration transducers for applying vibrations to the user's skull, which electromechanical vibration transducers are capable of converting said output electrical signal into vibrations, characterized in that the voice processing unit comprises an electronic circuit assembly having a tone control unit to alter the input electrical signal depending on the amplitude of the input electrical signal, wherein the tone control unit alters the input electrical signal into said output electrical signal by applying a differential raising of a treble portion (H) if the input electrical signal is above a defined speech intensity threshold (T) for a defined trigger time (t1).
2. A voice processing unit for processing an input electrical signal from the voice of a user captured by a microphone into an output electrical signal for producing vibrations by at least two electromechanical vibration transducers, characterized in that the voice processing unit comprises an electronic circuit assembly having a tone control unit to alter the input electrical signal depending on the amplitude of the input electrical signal, wherein the tone control unit alters the input electrical signal into said output electrical signal by applying a differential raising of a treble portion (H) if the input electrical signal is above a defined speech intensity threshold (T) for a defined trigger time (t1).
3. The apparatus as claimed in claim 1 or the voice processing unit as claimed in claim 2, wherein the tone control unit alters the input electrical signal into said output electrical signal by applying a differential raising of a bass portion (B) if the input electrical signal is below said threshold (T) for a defined holding time (t2).
4. The apparatus as claimed in claim 1 or 3 or the voice processing unit as claimed in claim 2 or 3, further comprising dynamic voice-detection preferably based on intensity, including specified hold-delays and signal ramping.
5. The apparatus as claimed in claim 1 or 3 to 4 or the voice processing unit as claimed in claim 2 to 4, further comprising an automatic, preferably voice-coupled, shelving equalizer section and/or high-pass rumble protection.
6. The apparatus as claimed in claim 1 or 3 to 5 or the voice processing unit as claimed in claim 2 to 5, wherein the voice processing unit splits the input electrical signal into a first dynamic voice detection signal and a second audio equalizer signal, the first dynamic voice detection signal being converted into a high voltage signal if the input electrical signal is above threshold (T) for said time (t1) and into a defined low voltage signal if the input electrical signal is below threshold (T), preferably for a defined holding time (t2).
7. The apparatus as claimed in claim 1 or 3 to 6 or the voice processing unit as claimed in claim 2 to 6, wherein speech intensity threshold (T) is between -40 and -80 dBV at microphone input at a frequency of 1 kHz, preferably between -54 and -70 dBV at microphone input at a frequency of 1 kHz, trigger time (t1) is between 1 and 1000 ms, preferably between 5 and 500 ms, particularly preferably between 10 and 50 ms, and/or treble portion (H) is between 500 Hz and 20 kHz, preferably between 800 Hz and 15 kHz.
8. The apparatus as claimed in claim 1 or 3 to 7 or the voice processing unit as claimed in claim 2 to 7, wherein holding time (t2) is between 1 and 1000 ms, preferably between 5 and 500 ms, particularly preferably between 20 and 200 ms, and/or bass portion (B) is between 5 Hz and 1 kHz, preferably between 20 Hz and 900 Hz, particularly preferably between 100 and 800 Hz.
9. The apparatus as claimed in claim 1 or 3 to 8 or the voice processing unit as claimed in claim 2 to 8, wherein the input electrical signal is pre-amplified.
10. The apparatus as claimed in any one of claims 1 or 3 to 9, wherein: the apparatus is in the form of a headset for being placed on the user's head with the voice processing unit integrated in a separate casing and connected to the headset by wires or wirelessly.
11. The apparatus as claimed in any one of claims 1 or 3 to 9, wherein: the apparatus is in the form of a headset for being placed on the user's head with the voice processing unit integrated to the headset.
12. A method for dynamic, frequency-selective and active voice training wherein an apparatus as claimed in any of claims 1 or 3 to 11 is used by placing the microphone essentially in front of a user's mouth for capturing the sound of his voice and by placing the at least two electromechanical vibration transducers against the user's skull, preferably against both temporal bones, more preferably the mastoid parts of the temporal bones.
13. Use of an apparatus as claimed in claims 1 or 3 to 11 or of a voice processing unit as claimed in claims 2 to 9 for dynamic, frequency-selective and active voice training.
PCT/EP2014/074016 2013-11-08 2014-11-07 Apparatus and method for active voice training WO2015067741A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
LU92306A LU92306B1 (en) 2013-11-08 2013-11-08 Apparatus and method for active voice training
LU92306 2013-11-08

Publications (1)

Publication Number Publication Date
WO2015067741A1 (en) 2015-05-14

Family

ID=49667538

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2014/074016 WO2015067741A1 (en) 2013-11-08 2014-11-07 Apparatus and method for active voice training

Country Status (2)

Country Link
LU (1) LU92306B1 (en)
WO (1) WO2015067741A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB974332A (en) * 1960-02-15 1964-11-04 Ile D Etudes Et De Brevets Mot Improvements in and relating to apparatus for audio-vocal conditioning and to a method and apparatus for modifying the acoustic properties of a room
EP0226333A1 (en) * 1985-11-14 1987-06-24 Vocaltech, Inc. Vocal tactile feedback apparatus
US6456721B1 (en) * 1998-05-11 2002-09-24 Temco Japan Co., Ltd. Headset with bone conduction speaker and microphone
US20090060231A1 (en) * 2007-07-06 2009-03-05 Thomas William Buroojy Bone Conduction Headphones
US20130202135A1 (en) * 2009-07-10 2013-08-08 Atlantic Signal, Llc Bone conduction communications headset with hearing protection

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ANONYMOUS: "Direct Auditory Feedback & Bone Conduction Enhancement Course Manual", 31 May 2012 (2012-05-31), pages 1 - 28, XP055125511, Retrieved from the Internet <URL:http://www.integratedlistening.com/wp-content/ils-files/2012/05/auditory-feedback-bc-enhancement-manual.pdf> [retrieved on 20140626] *
THOMPSON V M ET AL: "THE EMERGING FIELD OF SOUND TRAINING", IEEE ENGINEERING IN MEDICINE AND BIOLOGY MAGAZINE, IEEE SERVICE CENTER, PISCATAWAY, NJ, US, vol. 18, no. 2, 30 April 1999 (1999-04-30), pages 89 - 96, XP000804966, ISSN: 0739-5175, DOI: 10.1109/51.752984 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3329591A1 (en) * 2015-07-27 2018-06-06 TDK Corporation Electronic circuit for a microphone and microphone
EP3329591B1 (en) * 2015-07-27 2021-12-01 TDK Corporation Electronic circuit for a microphone and microphone

Also Published As

Publication number Publication date
LU92306B1 (en) 2015-05-11

Similar Documents

Publication Publication Date Title
US10904664B2 (en) Device for generating chest-chamber acoustic resonance and delivering the resultant audio and haptic to headphones
US20210050030A1 (en) System and apparatus for real-time speech enhancement in noisy environments
US4685448A (en) Vocal tactile feedback method and associated apparatus
US6644973B2 (en) System for improving reading and speaking
CN106888414A (en) The control of the own voices experience of the speaker with inaccessible ear
KR101056079B1 (en) Frequency Change Feedback for Treating Non-stuttering Beds
US20050095564A1 (en) Methods and devices for treating non-stuttering speech-language disorders using delayed auditory feedback
KR20200138050A (en) Ambient sound enhancement and acoustic noise cancellation based on context
JP6400796B2 (en) Listening assistance device to inform the wearer's condition
US4212119A (en) Audio-vocal integrator apparatus
SE528409C2 (en) Electronic sound monitor for use as stethoscope, has vibration transducer adapted to transform vibrations to electrical signals and arranged in bell-shaped vibration collecting structure
US20150305920A1 (en) Methods and system to reduce stuttering using vibration detection
US20160293041A1 (en) Apparatus and Method to Reduce Tone Deafness
US3453749A (en) Teaching by sound application
AU2010347009B2 (en) Method for training speech recognition, and training device
WO2015067741A1 (en) Apparatus and method for active voice training
CN209951556U (en) Hearing auxiliary rehabilitation system
RU126592U1 (en) SIMPLEX SPEECH REPAIR DEVICE
KR101821671B1 (en) Bluetooth sound device with music listening function and surroundings hearing function
KR20230172804A (en) Bone conduction earphone with noise reduction function
JPH0193298A (en) Self voice sensitivity suppression type hearing aid
CN109925122A (en) A kind of hearing recovering aid system and recovery training method
KR20230172802A (en) Bone conduction earphones with automatic volume control
JP2021117359A (en) Voice clarification device and voice clarifying method
SUZUKI et al. Nonlinear Tone Control Method by ThACE for Evaluation of Minute Sounds

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14798764

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14798764

Country of ref document: EP

Kind code of ref document: A1