WO2000001200A1 - Method and apparatus for processing sound - Google Patents

Method and apparatus for processing sound

Info

Publication number
WO2000001200A1
WO2000001200A1 PCT/GB1999/002063
Authority
WO
WIPO (PCT)
Prior art keywords
sounds
detected
angular relation
determined
time differences
Prior art date
Application number
PCT/GB1999/002063
Other languages
French (fr)
Inventor
Leslie Samuel Smith
Original Assignee
University Of Stirling
Priority date
Filing date
Publication date
Application filed by University Of Stirling filed Critical University Of Stirling
Priority to AU45258/99A priority Critical patent/AU4525899A/en
Priority to EP99928142A priority patent/EP1090531A1/en
Priority to JP2000557662A priority patent/JP2002519973A/en
Publication of WO2000001200A1 publication Critical patent/WO2000001200A1/en

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R25/00Deaf-aid sets, i.e. electro-acoustic or electro-mechanical hearing aids; Electric tinnitus maskers providing an auditory perception
    • H04R25/40Arrangements for obtaining a desired directivity characteristic
    • H04R25/407Circuits for combining signals of a plurality of transducers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2201/00Details of transducers, loudspeakers or microphones covered by H04R1/00 but not provided for in any of its subgroups
    • H04R2201/40Details of arrangements for obtaining desired directional characteristic by combining a number of identical transducers covered by H04R1/40 but not provided for in any of its subgroups
    • H04R2201/403Linear arrays of transducers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R25/00Deaf-aid sets, i.e. electro-acoustic or electro-mechanical hearing aids; Electric tinnitus maskers providing an auditory perception
    • H04R25/50Customised settings for obtaining desired overall acoustical characteristics
    • H04R25/502Customised settings for obtaining desired overall acoustical characteristics using analog signal processing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04SSTEREOPHONIC SYSTEMS 
    • H04S2420/00Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/03Application of parametric coding in stereophonic audio systems

Definitions

  • the present invention relates to method and apparatus for processing sound, and in particular but not exclusively to a hearing aid, and in particular to an interactive directional hearing aid.
  • Most current hearing aids tackle the problem of hearing loss by (i) detecting the sound using a single microphone, (ii) selectively transforming the incoming sound, possibly initially converting the sound to a digital form so that more sophisticated digital signal processing techniques can be used, and (iii) re-transmitting the sound in the ear canal (or, in the case of cochlear implants, directly stimulating the nerves of the spiral ganglion in the organ of Corti) .
  • the use of a single microphone means that selectively amplifying sounds coming from a particular direction can only be achieved by taking advantage of the shape of the directional response of the microphone.
  • using a highly directional microphone leads to a different problem: the inability to detect sounds from certain other directions.
  • IIDs interaural intensity differences
  • ITDs interaural time differences
  • US patent 3946168, and US patent 3975599 disclose use of two microphones in a single housing, pointing in different directions, and switched between the directional input signals, essentially taking advantage of the different directional characteristics of the two microphones.
  • a similar approach, using both omni- and unidirectional microphones and including some adaptive equalisation is taken in US patent 5524056.
  • a more sophisticated approach uses a number of microphones in pairs, spaced one half-wavelength (of the frequencies of interest) apart across the user's body. The signals from these microphones are summed, bandpassed, and amplified. This provides directionality in the region of the chosen frequencies in the direction that the user is facing. This is extended in US patent 5737430 to include wireless connection to an ear-placed hearing aid.
  • DSP portable digital signal processing
  • cochlear implant techniques are often used in place of auditory retransmission. These excite the neurons of the spiral ganglion directly. Unfortunately, it is not possible to stimulate all of the auditory nerve in this way, as the shape of the cochlea precludes this. Thus, only the high frequency (basal) end of the cochlea can be stimulated so that the user is presented with a much impoverished signal.
  • the auditory system appears to use a number of different cues in performing streaming.
  • these include relative intensity, relative timings of sudden increases in intensity in parts of the spectrum (onsets) , relative timing of features of the envelope of bandpassed sound (most notably amplitude modulation peaks) , and relative timings of the peaks and troughs of the bandpassed signal (waveform- synchronous features) .
  • Relative intensity is most used at low frequencies where the head shadow results in sounds from the sides being much stronger in one ear than the other: this is less pronounced at higher frequencies due to the sound waves diffracting round the head.
  • IID is more effective at low frequencies (due to the shadow effect of the head)
  • ITD at medium and high frequencies (because the signal period is large compared with the difference in signal path times to each of the two ears) (see figure 1) .
  • Because of the way in which the inner hair cells of the organ of Corti transduce the pressure wave on the basilar membrane of the cochlea, they are more likely to cause a spike on the auditory nerve at one part of the phase of the incoming signal, at least for frequencies between about 20Hz and 4KHz.
  • This phase locking is believed to be important in the detection of the inter-aural time difference later in the processing of auditory signals. So long as the period of the signal is long compared to the ITD, this provides an important and unambiguous cue.
  • the period is 400 microseconds. If the ITD is 150 microseconds
  • the ITD is the result of the combination of these multiple paths, and this may cause incorrect estimates of the direction of the sound source.
  • the direct path is always the fastest, and generally the least attenuated.
  • the sound direction computed from initial onsets is not affected by the existence of multiple paths.
  • onsets generated by the arrival of signals from reflected paths are attempting to generate responses from the same onset cells that just fired because of the signal from the direct path. These reflected onsets will not be as strong as the original onset, and will be attempting to stimulate cells which are likely to be in their refractory period.
  • C. J. Darwin and V. Ciocca, Grouping in pitch perception: Effects of onset asynchrony and ear of presentation of a mistuned component, J. Acoustical Soc. of America, 91, 6, 3381-3390, 1992; C. J. Darwin and R. W. Hukin, Perceptual segregation of a harmonic from a vowel by interaural time difference and frequency proximity, J. Acoustical Soc. of America, 102 (4), 2316-2324, 1997.
  • Most real environmental sounds are complex, containing energy at many different frequencies. Some are unpitched, and some are pitched. But we do not normally notice any particular difficulty in determining the direction of different types of sound. We suggest this is because the auditory system uses all of the techniques above, plus IID
  • IID is useful particularly with low frequency sounds
  • onsets are useful with sounds which start suddenly, whether pitched or unpitched (such as a handclap)
  • waveform-synchronous techniques are useful with medium frequency sounds
  • amplitude modulation based techniques are useful with high frequency sounds which display amplitude modulation when bandpass-filtered.
  • Hearing loss causes loss of information on all aspects of the fine time structure. Clearly for those bands of the signal not detected at all at the inner hair cells there will be no fine timing information available at all. Additionally, for those bands of the signal for which an area of the organ of Corti is non- functional , detection will occur only on neighbouring areas of the organ of Corti. There may be a loss of synchronisation of the auditory nerve signal to both the features of the envelope modulation, and to the peaks and troughs of the bandpassed signal itself. We suggest that this loss of fine timing information is one of the primary reasons for hearing impaired people finding sound streaming difficult.
  • An object of the embodiment of the present invention is to provide an interactive system whereby the classes of features of sound may be detected synthetically, and used to find the direction of incoming sounds. This directional information can then be used to stream sounds.
  • the preferred embodiment of the present invention provides an interactive system in which the precise timing of the signals produced by bandpassing incoming sounds at two microphones is detected using techniques based on what appears to happen in the early auditory system. This, along with the IID in each channel, is used to determine the direction of the sound source, and this directional information is displayed to the user. The user selects which elements of the sound should be presented by interactively selecting some of the elements displayed. User interaction may consist of the user pointing their head in a particular direction, or may take place using some form of display and graphics tablet. The final presentation of the selected auditory information may use the auditory modality (using selective amplification, attenuation and possibly resynthesis), or the visual modality.
  • the system described herein may also be used as a part of an auditory system for a robot.
  • Sound sources from particular directions may be selected, making the later interpretation of the incoming sound field much simpler than if the whole sound field (from many sources simultaneously) must be interpreted at once.
  • a method of processing sound comprising the steps of: detecting sounds at at least two spaced detecting locations; analysing the detected sounds to identify the angular relation between respective sound sources and said detecting locations; permitting selection of an angular relation associated with a particular sound source; and processing the detected sounds in response to said selection to highlight a stream of sound associated with said particular sound source.
  • the angular relation between the respective sound sources is determined at least in part by reference to time differences between the sounds from the respective sound sources as detected at the spaced detecting locations.
  • the angular relation between the respective sound sources is determined with reference to time differences determined with reference to at least one feature of the detected sounds, the feature being selected from: pitch phase; amplitude modulation phase; and onset.
  • time differences are determined with reference to a plurality of features of the detected sounds.
  • the angular relation between the respective sound sources is determined at least in part by reference to intensity differences between the sounds from the respective sound sources as detected at the spaced detecting locations.
  • the sounds are detected at locations corresponding to the ears of a user, and the angular relation between the respective sound sources may be determined by reference to interaural time differences (ITDs) and interaural intensity differences (IIDs).
  • the method further comprises selectively filtering the detected sounds from said spaced locations into a plurality of channels and then comparing features of sound of each channel from one location with features of sound from a corresponding channel associated with the other location.
  • a method of processing sounds emanating from a plurality of sound sources comprising the steps of: detecting sounds at at least two spaced detecting locations; analysing the detected sounds to determine the angular relation between the respective sound sources and said detecting locations by reference to at least one of intensity differences between the sounds from the respective sound sources as detected at the spaced detecting locations and time differences between the sounds from the respective sound sources as detected at the spaced detecting locations; and streaming the sounds associated with at least one sound source on the basis of said determined angular relation.
  • the apparatus comprising: means for detecting sounds at at least two spaced detecting locations; means for analysing the detected sounds to identify the angular relation between respective sound sources and said detecting locations; means for permitting selection of an identified angular relation associated with a particular sound source; and means for processing the detected sounds in response to said selection to highlight a stream of sound associated with said particular sound source.
  • apparatus for processing sounds emanating from a plurality of sound sources comprising: means for detecting sounds at at least two spaced detecting locations; means for analysing the detected sounds to determine the angular relation between the respective sound sources and said detecting locations by reference to at least one of intensity differences between the sounds from the respective sound sources as detected at the spaced detecting locations and time differences between the sounds from the respective sound sources as detected at the spaced detecting locations; and means for streaming the sounds associated with at least one sound source on the basis of said determined angular relation.
  • Figure 1 is a graph of interaural time delay (ITD) as a function of angle from a source of sound;
  • Figure 2 is a schematic overview, in block diagram form, of apparatus for processing sound in accordance with a preferred embodiment of the present invention;
  • Figure 2a illustrates the outputs from the bandpass filters of Figure 2 in greater detail;
  • Figure 3 is a block diagram illustrating the determination of interaural intensity difference (IID) in the apparatus of Figure 2;
  • Figures 4, 5 and 6 are block diagrams illustrating the determination of interaural time differences (ITDs) based on onset in the apparatus of Figure 2;
  • Figures 7 to 11 are block diagrams illustrating the determination of ITDs based on amplitude modulated (AM) signals in the apparatus of Figure 2;
  • Figure 12 is a block diagram illustrating the determination of simple ITDs in the apparatus of Figure 2;
  • Figure 13 is a block diagram illustrating the display of IIDs and ITDs in the apparatus of Figure 2;
  • Figures 14 to 16 are block diagrams illustrating the processing of the user's interaction with the display of Figure 13.
  • Figure 1 shows interaural time delay (ITD) graphed as a function of the angle between a source of sound and straight ahead, for an inter-ear separation (signal path difference) of 150mm.
  • the source distance is assumed to be large compared to the distance between the ears.
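  • A minimal sketch of this relationship, assuming the simplest far-field model (straight-line path difference d·sin(θ)) and a speed of sound of about 343 m/s; both are assumptions not stated in the patent, and the curve of Figure 1 may be based on a more detailed head model:

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed; not stated in the patent
PATH_DIFFERENCE = 0.150  # inter-ear signal path difference of 150 mm, as in Figure 1

def itd_seconds(angle_rad: float) -> float:
    """Far-field interaural time delay for a source angle_rad off straight ahead."""
    return PATH_DIFFERENCE * math.sin(angle_rad) / SPEED_OF_SOUND

# Example: a source 25 degrees off-centre gives roughly 185 microseconds under this
# simple model; the patent's later example quotes 150 microseconds for that angle,
# so the model used for Figure 1 clearly differs in detail.
print(round(itd_seconds(math.radians(25)) * 1e6), "microseconds")
```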
  • the apparatus of the preferred embodiment of the present invention determines ITD by a number of different routes, which information is then utilised to allow the apparatus to be used to stream sounds for a user.
  • Figure 2 is an overview of apparatus 10 for processing sound in accordance with a preferred embodiment of the present invention.
  • the Figure shows the two input transducers 12, 14 (Microphone L and Microphone R), and two multiple channel bandpass filters 16, 18.
  • the microphones 12, 14 may be placed in the ear, or on the back of the ear, or at any suitable point, separated by an appropriate distance.
  • the microphones 12, 14 are of an omnidirectional type: the directivity of the system is not achieved through microphone directional sensitivity. Further, the microphones are matched, though this is not crucial.
  • Each bandpass filter 16, 18 separates the incoming electrical signal from the respective microphone 12, 14 into a number of bands, as illustrated in Figure 2a.
  • These bands may overlap, and have a broad tuning: that is, they have a characteristic roughly similar to the bands found in the sensitivity analysis of real animal cochleae. As an approximation, they have a bandwidth of about 10% of the centre frequency at 6dB.
  • the bandpass filters 16, 18 are matched to each other.
  • the filters 16, 18 have a fixed and known delay characteristic, and the delay characteristic is the same (or very close to the same) for the two filters 16, 18.
  • Both bandpass filters 16, 18 will have the same number of outputs: the precise number is not material, but the performance of the system improves as the number of filters increases.
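  • A minimal sketch of such a matched filter bank, using Butterworth band-pass sections of roughly 10% relative bandwidth; the sample rate, channel centre frequencies, filter order and the use of scipy are illustrative assumptions rather than the patent's implementation (the band edges here are at -3dB rather than the -6dB quoted above):

```python
import numpy as np
from scipy.signal import butter, sosfilt

FS = 16000  # sample rate in Hz (assumed)

def make_filterbank(centres_hz, rel_bandwidth=0.10, order=2):
    """One band-pass section per channel, about rel_bandwidth of the centre frequency wide."""
    bank = []
    for fc in centres_hz:
        lo, hi = fc * (1 - rel_bandwidth / 2), fc * (1 + rel_bandwidth / 2)
        bank.append(butter(order, [lo, hi], btype="bandpass", fs=FS, output="sos"))
    return bank

def filter_channels(signal, bank):
    """Apply every band to the same input; identical banks are used for left and right."""
    return np.stack([sosfilt(sos, signal) for sos in bank])

centres = np.geomspace(100, 6000, 16)   # illustrative channel centre frequencies
bank = make_filterbank(centres)         # the same bank serves both microphones
left_channels = filter_channels(np.random.randn(FS), bank)   # shape: (16, FS)
```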
  • the features of the signals of the individual channels are processed to provide information on interaural intensity differences (IIDs) and interaural time differences (ITDs).
  • the resulting information is presented to the user in a format which allows the user to identify and select sources of sounds, based on the direction of the sound reaching the user.
  • the signals from the channel or channels primarily associated with the selected source are then processed to suit the user's particular requirements, thereby effectively streaming the sound from the selected source, and minimising the effect of sounds or "noise" from other sources.
  • the outputs from the filters 16, 18 are subject to four different forms of analysis, the necessary hardware being presented in the Figures in the form of blocks.
  • Each form of analysis is described below briefly, in turn.
  • the intensity of sound in each channel is computed, at 20, 22, and the determined intensities from the two microphones 12, 14 compared on a channel-by-channel basis, at 24, to provide a measure of interaural intensity difference (IID) for each channel.
  • Each IID indicates a particular angle between the microphones 12, 14 and the source of sound, and this information is stored, at 26, and also relayed to an interactive display 28.
  • as will be described, this display receives similar inputs from the results of the other forms of analysis, to provide more complete information for the user.
  • the user may then interact with the display to select a particular "angle", or more particularly may select to be presented with sound from the source corresponding to that angle.
  • the signal phases for each channel are detected and grouped, at 44 and 46, and the resulting information used to compute signal phase ITD, at 48. Again the resulting signal phase ITDs are stored, at 50, and relayed to the interactive display 28 and the resynthesis substation 30 in a somewhat similar manner to the IID, onset ITD and AM ITD as described above.
  • the IID is computed repeatedly (for example, every 25ms) for each channel, at 25.
  • the IID computed is then turned into an estimate of the angle of incidence of the sound, at 24, using an estimate of the head-related transfer function. Note that this function is itself a complex function of the frequency of the sound.
  • the angles thus estimated for each channel are grouped together, at 27, and a number of estimates of the incident angle of sounds made. These are then sent to the display subsystem 28.
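  • A minimal sketch of this per-channel intensity comparison: intensity is computed over short frames (every 25ms, as above), the left/right ratio gives an IID per channel, and a deliberately simplified inverse mapping stands in for the frequency-dependent head-related transfer function mentioned in the text; the 20dB scaling and the arcsin mapping are assumptions:

```python
import numpy as np

FS = 16000                 # sample rate in Hz (assumed)
FRAME = int(0.025 * FS)    # recompute the IID every 25 ms, as in the text

def frame_intensity(channel, frame=FRAME):
    """Mean power per frame for one bandpassed channel."""
    n = len(channel) // frame
    return (channel[:n * frame].reshape(n, frame) ** 2).mean(axis=1)

def iid_db(left_ch, right_ch):
    """Interaural intensity difference per frame, in dB, for one channel pair."""
    eps = 1e-12
    return 10 * np.log10((frame_intensity(left_ch) + eps) /
                         (frame_intensity(right_ch) + eps))

def iid_to_angle(iid, max_iid_db=20.0):
    """Toy inverse mapping from IID to incident angle (radians); a real system would use
    an estimate of the head-related transfer function, which varies with frequency."""
    return np.arcsin(np.clip(iid / max_iid_db, -1.0, 1.0))
```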
  • Figure 4 of the drawings shows the onset detector 32 and the onset clustering detector 32a. There is one onset detector for each of the bands produced by the bandpass filters of figure 2. Those from the left side are processed separately from those from the right side.
  • Each onset detector 32, 33 detects onsets (sudden increases in energy) in a single channel.
  • the output of the onset detector is written onset(x,i), where x is either L or R (for left or right side), and i identifies the bandpass channel.
  • the onset detector 32 outputs from a single side are fed to the onset cluster detector 32a.
  • the onset cluster detector 32a groups together those onsets which have occurred within a short time (taking into account the differences in delay time across the filter bank).
  • Each signal is, at any time, either 1 (signifying that this channel is currently part of an onset cluster) or 0 (signifying that this channel is not part of an onset cluster).
  • Figure 5 shows the left and right onset cluster signals being composed, at 34a, to form a composite onset cluster signal.
  • This composition 34a may take the form of a set of n AND gates or a set of OR gates.
  • Figure 6 shows the onset signals from the left and right from each channel being compared in time, and a value for the time differential computed. There are n such values computed. For channels for which there was no onset signal, no output (null) will be produced: similarly for channels in which the lowest value for the time difference between the left and right onset signal is too large (that is, has a value which could not be produced from any signal direction), no output (null) will be produced.
  • the values produced will be gated, at 34b, by the composite onset cluster signal. This signal will select clusters of onsets (generally one at a time). There are n outputs produced: each is either a value for the time differential, or null.
  • One onset cluster is produced at a time, and the onset store 36 stores the channel sets of recent grouped onsets, indexed by grouped ITD, that is according to the determined direction of sound for the channel.
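  • A minimal sketch of one onset-based ITD estimate for a single channel pair: onsets are taken as upward threshold crossings of the smoothed channel energy, and the estimate is nulled when either side has no onset or when the delay is larger than any real source direction could produce; the threshold and the 0.7 ms ceiling are assumptions:

```python
import numpy as np

FS = 16000        # sample rate in Hz (assumed)
MAX_ITD = 0.0007  # about 0.7 ms: largest delay a real direction could produce (assumed)

def onset_times(envelope, threshold):
    """Sample indices where the smoothed channel energy first rises above threshold."""
    above = envelope > threshold
    return np.flatnonzero(above[1:] & ~above[:-1]) + 1

def onset_itd(env_left, env_right, threshold):
    """Time difference between the first left and right onsets, or None (a null output)."""
    l, r = onset_times(env_left, threshold), onset_times(env_right, threshold)
    if len(l) == 0 or len(r) == 0:
        return None                                   # no onset in this channel
    itd = (l[0] - r[0]) / FS
    return itd if abs(itd) <= MAX_ITD else None       # too large for any real direction
```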
  • Figure 7 and figure 8 show how the amplitude modulation is detected.
  • the outputs from the bandpass filters 16, 18 are rectified and smoothed, at 60, and the rectified smoothed output is supplied to an AM mapping network 62.
  • This network 62 (as shown in figure 8) has a number of excitatory neurons (m) 64, and one inhibitory neuron 66 (shaded).
  • the input to all the excitatory neurons is the same: the input to the inhibitory neuron is the (delayed) output of the excitatory neurons.
  • the excitatory neurons are arranged so that they are each particularly sensitive to amplitude modulation at some small range of frequencies.
  • the effect of the network is that, for amplitude modulated input, one of the excitatory neurons (the mapping neurons) fires in phase with the amplitude modulation.
  • the inhibitory neuron pulses whenever there is a sufficient amount of amplitude modulation.
  • the AM selection stage takes the output from all the excitatory neurons. This is gated by the output from the inhibitory neuron (so that null output is produced in the absence of amplitude modulated input). It reduces the pulse output to a single amplitude modulated channel, by selecting only the active excitatory neuron output. Additionally, it codes the identity of the excitatory neuron producing this output: this supplies information on the frequency of the amplitude modulation.
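  • The patent describes this stage as a small network of spiking excitatory and inhibitory neurons; the sketch below is only a functional stand-in that picks the dominant modulation frequency of a channel envelope and returns null when modulation is weak (playing the role of the inhibitory gate). The candidate AM frequencies and the power threshold are assumptions:

```python
import numpy as np

def dominant_am_band(envelope, fs, am_centres=(50, 75, 100, 150, 200, 300), min_power=1e-3):
    """Index of the AM frequency band with the most envelope energy (standing in for the
    excitatory mapping neurons), or None when modulation is too weak (inhibitory gate)."""
    env = envelope - envelope.mean()
    spectrum = np.abs(np.fft.rfft(env)) ** 2
    freqs = np.fft.rfftfreq(len(env), 1.0 / fs)
    powers = [spectrum[np.abs(freqs - fc).argmin()] for fc in am_centres]
    best = int(np.argmax(powers))
    return best if powers[best] > min_power else None
```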
  • Figure 9 shows the production of the table 68 used in grouping amplitude modulated signals.
  • For the amplitude modulation signals we produce a table with an entry of 1 for each AM frequency output from a bandpassed channel, and 0 otherwise.
  • each row of the table may contain at most one 1 entry. If the same AM frequency is found in more than one channel, then there will be columns with more than one 1 entry.
  • the illustration shows a situation with 15 bandpassed channels, and 12 distinguishable AM frequency bands.
  • bandpassed bands 2, 3, and 9 have found AM in AM channel 3
  • bandpassed bands 6, 7, 8, and 11 have found AM in AM channel 7
  • bandpassed bands 10, 12, and 14 have found AM in AM frequency channel 11.
  • One table is produced for each side (left, right); a composite table is produced by ANDing the left and right tables.
  • Figure 10 illustrates how the columns of the table are used to gate the AM signals at 40, selecting only those with the same AM frequency for comparison, and generation of interaural time differences (ITDs).
  • pulse signals (which will be at the same frequency - namely frequency band 7, and whose pulse times reflect the phase of the amplitude modulation, that is the pulses are in phase with the amplitude modulation) are then fed in pairs (left, right) to circuitry 70 which computes the time difference between these signals.
  • the values across the different selected bands are then processed at 72 (for example, averaged, or the modal or median value selected) to produce the AM time differential signal for this AM frequency band.
  • An AM time difference signal will be produced for each nonzero column in figure 9: that is, for each AM frequency band detected in both Left and Right channels.
  • Figure 11 shows how recent time difference signals produced for each nonzero column are stored: that is, the set of channels associated with each signal is stored, indexed by the (grouped) ITD, as input from interactive display 28.
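  • A minimal sketch of the grouping step of Figures 9 to 11: left and right 0/1 tables (one row per bandpassed channel, one column per AM frequency band) are ANDed, and for every nonzero column a representative ITD is taken over the selected channels (the median is used here; averaging or the modal value are the other options named above). Function and variable names are assumptions:

```python
import numpy as np

def am_table(am_band_per_channel, n_am_bands):
    """0/1 table with one row per bandpassed channel; at most one 1 entry per row."""
    table = np.zeros((len(am_band_per_channel), n_am_bands), dtype=int)
    for ch, band in enumerate(am_band_per_channel):
        if band is not None:
            table[ch, band] = 1
    return table

def am_itds(left_bands, right_bands, channel_itds, n_am_bands):
    """AND the left and right tables, then take the median time difference over the
    channels selected by each nonzero column."""
    both = am_table(left_bands, n_am_bands) & am_table(right_bands, n_am_bands)
    result = {}
    for band in range(n_am_bands):
        chans = np.flatnonzero(both[:, band])
        vals = [channel_itds[c] for c in chans if channel_itds[c] is not None]
        if vals:
            result[band] = float(np.median(vals))
    return result
```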
  • Figure 12 shows how simple (ungrouped) ITDs are computed from the output of each pair (left, right) of bandpassed channels.
  • This is achieved by using a phase-locked pulse generator 74, 75 (which may, for example, generate a pulse on each positive-going zero-crossing), and then calculating the time difference between these pulses. For low frequencies, these estimates tend to be unreliable, and for high frequencies, they can be ambiguous. However, there is a range of medium frequencies for which good estimates can be made. One time difference estimate will be produced for each (medium-frequency) channel. These may be grouped together prior to further usage.
  • Recent time difference estimates are stored at 50: that is, the channels associated with each grouped time difference estimate are stored, indexed by the time difference (ITD) itself.
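  • A minimal sketch of the waveform-synchronous estimate for one channel pair, using positive-going zero-crossings as the phase-locked pulses; the medium-frequency range outside which the estimate is treated as unreliable or ambiguous is an assumption, as the patent gives no figures:

```python
import numpy as np

def zero_crossing_itd(left, right, fs, centre_hz=None, f_lo=300.0, f_hi=1500.0):
    """Time difference between the first positive-going zero-crossings of a left/right
    channel pair; only trusted for medium-frequency channels, as the text notes."""
    if centre_hz is not None and not (f_lo <= centre_hz <= f_hi):
        return None                            # unreliable (low) or ambiguous (high)

    def first_crossing(x):
        idx = np.flatnonzero((x[:-1] < 0) & (x[1:] >= 0))
        return idx[0] if len(idx) else None

    l, r = first_crossing(left), first_crossing(right)
    if l is None or r is None:
        return None
    return (l - r) / fs
```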
  • Figure 13 shows how the time difference signals from the three sources (onsets (Figure 6), amplitude modulation (Figure 11) and waveform-synchronous processing (Figure 12)) are displayed on the display 28, in the form of a direction display. That is, the time differences between the left and right channels are interpreted as angles. The angles computed from the IIDs (figure 3) are also displayed.
  • the display takes the form of a semicircle, because the system cannot distinguish between sounds from in front and behind; darker areas correspond to estimated directions of sound sources.
  • the user interacts with the display, selecting a particular direction (e.g. by touching the display) from which they wish to be presented with sounds.
  • the display returns the angle selected, and this is then processed.
  • a low power flat touch panel display (such as those used in colour portable computers) may be utilised.
  • Figure 14 illustrates how the signals controlling the signal to be presented to the user are generated from the information recovered from the interactive display 28, and the stored information at the onset store 36 (Figure 6), the AM difference signal store 42 (Figure 11) and the stored waveform-based time difference signal store 50 (Figure 12).
  • the angle output from the interactive display 28 is computed, at 76, from the user's interaction with the display.
  • This is converted into an estimate of the IID and ITD expected for signals from that direction.
  • the channel contributions from the low, medium and high frequency channels are normalised, at 82, 84, to provide mixing signals.
  • Figure 15 shows how the angle computed from the user's interaction with the display is used to index into the store 26 of low-frequency channel contributions, to estimate which of the low frequency (LF) channels gave rise to IIDs which were likely to have been produced by signals from that direction. This will use the head-related transfer function, which is different at different frequencies.
  • Figure 16 shows how the final signal for representation to the user is generated.
  • the mixing signals ControlLMix and ControlRMix are generated as described in figure 14, and control left and right channel mixers 86, 88.
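  • A minimal sketch of how such mixing signals might be formed and applied: channels whose stored direction estimate lies close to the angle selected on the display contribute, the rest are suppressed, and each ear's mixer is a weighted sum of its bandpassed channels. The tolerance, the hard 0/1 weighting and all names are assumptions; the patent's normalisation at 82, 84 may differ:

```python
import numpy as np

def control_mix(selected_angle, channel_angles, tolerance=0.15):
    """Normalised per-channel mixing signal: 1 for channels whose estimated angle (radians)
    lies within `tolerance` of the selected angle, 0 otherwise, then normalised to sum to 1."""
    channel_angles = np.asarray(channel_angles, dtype=float)
    weights = np.where(np.abs(channel_angles - selected_angle) <= tolerance, 1.0, 0.0)
    total = weights.sum()
    return weights / total if total > 0 else weights

def mix(channels, control):
    """One mixer per ear: weight each bandpassed channel by its mixing signal and sum."""
    return np.tensordot(control, channels, axes=(0, 0))
```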
  • the present invention is intended to mimic, to a certain extent, the processing of sounds in the early auditory system of a human (or other mammal), as discussed below.
  • the input to the system comes from two microphones 12, 14 (L, for left, and R, for right in the figures), which are placed a distance apart.
  • the microphones may be, for example, placed at the end of the auditory canal, or elsewhere on the pinna. Placing the microphones at the end of the auditory canal allows the pinna transfer characteristic to alter the relative strengths of different frequencies. If final presentation is binaural, this information is useful to the user, allowing them to place the sound in space better (J. Blauert, Spatial Hearing, MIT Press, revised edition, 1996).
  • the microphones 12, 14 transduce the acoustic signal into an electrical signal. These electrical signals are amplified (maintaining the same frequency/phase response in both channels), and fed into identical bandpass filter banks 16, 18. These filter banks 16, 18 perform a similar task to that of the cochlea. Each of these filter banks 16, 18 produces a large number of outputs, one for each channel. These channel outputs are used as input to modules which emulate the onset detectors, waveform-synchrony detectors and amplitude modulation detectors of the neurobiological system. However, not all channels will use all three modules.
  • the synthetic onset detector must have a very short, but constant latency: this latency needs to be constant over a wide range both of intensities and of rates of increase. Since onsets may be used in location of both pitched and unpitched sounds, each onset detector may receive input from a range of bandpassed channels. Unlike the biological system, we use one precise onset detector per channel, rather than rely on population coding.
  • Waveform-synchrony is primarily of use at low to medium frequencies, as discussed earlier.
  • the synthetic waveform-synchrony detector will provide an output at a specific part of the phase of the signal (for example, at each positive-going zero-crossing) .
  • Amplitude modulation is primarily useful at medium to high frequencies. Note that effective use of AM is predicated on the bandpass filter having a wide-band response such as the response of the real cochlea. Again, the detector must provide an output at a particular point in the envelope, for example at peaks, and again, jitter needs to be minimised.
  • the display shows the azimuthal direction of the different incoming sounds (though not whether the sound is ahead of or behind the user) as computed from IIDs and head-related transfer functions, and from ITDs.
  • This on its own may be used to draw attention to features of the auditory environment. However, it may be rendered more useful to the hearing impaired by permitting them to interact with it to select the information to be presented to them by the hearing aid itself. How this is best achieved in a particular application will depend on factors which will vary from user to user, such as whether they are willing to use their hands to interact with the system, or would prefer to interact only by turning their heads.
  • Two main modes of sound selection are likely to be utilised.
  • the user turns to face the (known) source of the sounds in which they are interested.
  • the sounds to be selected are then those with low ITD and IID.
  • a map of the incoming sounds is produced and displayed, and the user selects the sounds to be presented.
  • the information to be presented to the user may be presented monaurally or binaurally.
  • the result of the user's interaction with the interactive display is an angle, θ, between -π/2 and +π/2, or 0 if the user requests only those sources directly ahead.
  • This angle is used to compute the expected IID and ITD for signals from that direction.
  • OutDataL and OutDataR are used for medium and high frequencies: for lower frequencies, the same approach is made using the IID.
  • the resultant output, OutData, is a multichannel signal, suitable for visual display. For auditory presentation, the signals in the different channels are added together in a manner which reflects the user's hearing deficit.
  • Exactly how the selected sounds are presented to the user depends very much on the sensory faculties of the user. If there is sufficient residual hearing, then selective amplification may be most suitable: if the residual hearing is restricted to particular frequency bands, then resynthesis may be more appropriate. It may also be possible to mix these two techniques. Alternatively, presentation may use the visual modality.
  • the selected sound, produced as outlined in Figures 14 to 16, may have some channels selectively amplified to make up for the hearing deficit.
  • the resulting sound may be presented (a) monaurally, if there is only sufficient residual hearing in one ear, or (b) binaurally if there is sufficient residual hearing in both ears. In this case, we would present the data from OutDataL to the left ear, and from OutDataR to the right ear.
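  • A minimal sketch of this presentation step: each channel of OutDataL/OutDataR is given a gain chosen to suit the user's hearing deficit, and the channels are summed into the signal for each ear (or for one ear in the monaural case). The per-channel gains are illustrative assumptions:

```python
import numpy as np

def present_auditory(out_data_l, out_data_r, channel_gain_db, binaural=True):
    """Selectively amplify channels to make up for the hearing deficit, then sum each
    ear's channels into a single audio signal."""
    gains = 10.0 ** (np.asarray(channel_gain_db) / 20.0)
    left = np.tensordot(gains, out_data_l, axes=(0, 0))
    if not binaural:
        return left                        # monaural presentation to the better ear
    right = np.tensordot(gains, out_data_r, axes=(0, 0))
    return left, right                     # OutDataL to the left ear, OutDataR to the right
```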
  • the ControlLRMix signal is used to alter the gain on the signals from the two ears.
  • the signal we start from is the OutData signal.
  • the information from the sound in one particular direction is presented visually, utilising a colour display to present information about how the power of the sound is distributed over the spectrum.
  • One possibility (which does not use the interactive display, but displays all the incoming sound) is to choose the colour to match the ITD, and to make the intensity reflect the strength of the signal.
  • a subthreshold analog VLSI transconductance amplifier mirrors the characteristics of the biological system rather better than either digital on/off switches, or more linear analogue devices.
  • Sound detection may be through microphones.
  • direct silicon transducers for pressure waves may be used.
  • the microphones are omnidirectional: we need to receive signals from all directions so we can estimate the directions of sound sources .
  • This processing takes place in stages.
  • the first stage (after transduction) is cochlear filtering, and this is followed (in each bandpassed channel) by (in parallel) intensity computation, pitch phase detection, and envelope processing (that is, amplitude modulation phase detection and onset detection).
  • the results of this processing (for all channels, and for both ears) are used to generate ITD estimates for each feature type for each channel. This information is then used in determining what should be presented to the user.
  • Pitch phase detection in animals relies on population coding by spiking neurons which are more likely to spike at a particular phase of the movement of the basilar membrane.
  • Neuromorphic implementations of this are discussed by Liu et al (W. Liu, A.G. Andreou, and M.H. Goldstein Jr., Voiced speech representation by an analog silicon model of the auditory periphery, IEEE Trans. Neural Networks, 3(3):477--487, 1993) and in techniques by Van Schaik (A. van Schaik, Analogue VLSI Building Blocks for an Electronic Auditory Pathway, PhD thesis, Ecole Polytechnique Federale de Lausanne, 1997), where a version of Meddis's hair cell model (M.J.
  • OutData signal (or OutDataL and OutDataR signals in the case of binaural presentation) .
  • Auditory presentation technology may, for example, utilise remote generation of the signal, and transmission of the signal to the in-ear transducers by wireless technology.
  • it may be necessary to adjust the spectral energy distribution and compress the signal to take best advantage of the residual hearing present .
  • Since the bandpass characteristic of current neuromorphic filters is not as sharp as is preferred, we may counterbalance this by (i) selectively amplifying those channels for which the chosen ITD is most strongly represented and (ii) subtracting the content of those channels in which the chosen ITD is under-represented.

Abstract

A method of processing sound comprises the steps of: detecting sounds at at least two spaced detecting locations; analysing the detected sounds to identify the angular relation between respective sound sources and the detecting locations; permitting selection of an angular relation associated with a particular sound source; and processing the detected sounds in response to the selection to highlight a stream of sound associated with the particular sound source. The method may be utilised in a hearing aid, to allow a user to stream sounds by interactively selecting a particular source, and thus minimise background 'noise'.

Description

METHOD AND APPARATUS FOR PROCESSING SOUND
FIELD OF THE INVENTION
The present invention relates to method and apparatus for processing sound, and in particular but not exclusively to a hearing aid, and in particular to an interactive directional hearing aid.
BACKGROUND
Most current hearing aids tackle the problem of hearing loss by (i) detecting the sound using a single microphone, (ii) selectively transforming the incoming sound, possibly initially converting the sound to a digital form so that more sophisticated digital signal processing techniques can be used, and (iii) re-transmitting the sound in the ear canal (or, in the case of cochlear implants, directly stimulating the nerves of the spiral ganglion in the organ of Corti) . The use of a single microphone means that selectively amplifying sounds coming from a particular direction can only be achieved by taking advantage of the shape of the directional response of the microphone. However, using a highly directional microphone leads to a different problem: the inability to detect sounds from certain other directions. In embodiments of the present invention, we use two microphones and incorporate methods apparently used by animal auditory systems to separate out the different sources (or streams) of sound present in incident sound.
Even where two or more microphones are used, few existing systems permit the user to interact with the system to select the most appropriate sound processing. This is a disadvantage because the nature of the problem the user faces in sound interpretation depends strongly on the user's environment; this may vary from a quiet environment, with only one source of sound, to a noisy room, with many different sources of sound. The primary information used by the auditory system for determining the direction of a sound source is interaural intensity differences (IIDs) , and interaural time differences (ITDs) . To be able to estimate IID or ITD, a hearing aid must have more than one microphone.
US patent 3946168, and US patent 3975599 disclose use of two microphones in a single housing, pointing in different directions, and switched between the directional input signals, essentially taking advantage of the different directional characteristics of the two microphones. A similar approach, using both omni- and unidirectional microphones and including some adaptive equalisation is taken in US patent 5524056. A more sophisticated approach (US patent 4751738) uses a number of microphones in pairs, spaced one half-wavelength (of the frequencies of interest) apart across the user's body. The signals from these microphones are summed, bandpassed, and amplified. This provides directionality in the region of the chosen frequencies in the direction that the user is facing. This is extended in US patent 5737430 to include wireless connection to an ear-placed hearing aid.
The advent of portable digital signal processing (DSP) has meant that more sophisticated signal processing strategies can be adopted. DSP techniques have been applied to binaural systems (which have two microphones and two output transducers, one per ear) (US patent 5479522, US patent 5651071), resulting in a system which selectively amplifies signals characteristic of speech, while maintaining the precise timing of the signals so as to permit the user to detect sound source direction. In this way the systems also perform noise reduction. Directionality has been added using DSP techniques to perform beamforming (US patent 5511128). Implementation of these entities using wireless communication is described in US patent 5757932. Techniques which attempt to compromise between the conflicting goals of maximally directional response, and preservation of IID and ITD binaural cues for source direction finding are compared by Desloge et al (J.G. Desloge, W.M. Rabinowitz, and P.M. Zurek, Microphone-array hearing aids with binaural output - Part I: fixed-processing systems. IEEE Transactions on Speech and Audio Processing, 5(6):529--542, 1997). In Kollmeier et al (B. Kollmeier, J. Peissig, and V. Hohmann, Binaural noise-reduction hearing aid scheme with real-time processing in the frequency domain. Scandinavian Audiology: Supplement, 38:28--28, 1993), an algorithm which attempts to amplify only those sounds with the appropriate IID and ITD for sources straight ahead is described.
Another multiple microphone technique pioneered at the University of Paisley, Scotland uses sub-band adaptive processing, and this allows statistically different signals (such as speech and car noise) to be separated, improving the signal-to-noise ratio (SNR) (P.W. Shields and D. Campbell, Multi -microphone sub-band adaptive signal processing for improvement of hearing aid performance: preliminary results using normal hearing volunteers; Proc ICASSP97, pages I, 415--418, 1997; P. Shields, M. Girolami, D. Campbell, and C. Fyfe, Adaptive processing schemes inspired by binaural unmasking for enhancement of speech corrupted with noise and reverberation; in L.S. Smith and A. Hamilton, editors, Neuromorphic Systems: engineering silicon from neurobiology, pages 61--74. World Scientific, 1998; A Hussain and D. Campbell, Binaural sub- band adaptive speech enhancement using a human cochlear model and artificial neural networks, in L.S. Smith and A. Hamilton, editors, Neuromorphic Systems: engineering silicon from neurobiology, pages 75--86. World Scientific, 1998) . Additionally, anti-Hebbian learning techniques from the blind signal deconvolution (independent components analysis) school can be used, allowing different sound streams to be recovered.
Both normal and hearing- impaired listeners can move their heads to assist in picking out the source in which they are interested. The earliest hearing aids were mechanical, and the user could move them to alter the direction in which they pointed. Most early electronic hearing aids had little additional user interfacing: they could be switched on or off, and had, perhaps, a number of settings, and/or a volume control. Most modern hearing aids are configurable, and are set at dispensing to compensate for the particular hearing loss of the user: however, they are not usually user-reconfigurable thereafter. Some recent hearing aids have additional alterable settings. In US patent 5524056, the particular microphones to be used may be altered. Similarly in US patent 5636285 a voice-controlled re-settable technique is described.
Most hearing aids retransmit sound in the auditory canal. The provision of appropriate selective transformations between the incoming sound and the transmitted sound is normally based on making up the hearing that the user appears to have lost. However, partial deafness is not simply a decrease in sensitivity in parts of the spectrum: if it were, then this approach would be entirely successful . The problem is that much of the loss of sensitivity is due to failure of the hair cell transduction system, particularly at the basal (high frequency) end of the cochlea. Simply amplifying high frequency signals will not result in stimulation of the auditory nerve cells that these inner hair cells innervate. Instead, other (undamaged) hair cells from elsewhere in the cochlea will respond, mixing their response to the amplified high frequency sounds with their original response. Selective amplification thus results in increases in the response on the auditory nerve to a wider range of the spectrum, but at the expense of place-based frequency resolution. Further, for many hearing impaired subjects, the distance between audible sounds and painful sounds is small, making the procedure of adjusting a frequency-sensitive amplifier difficult.
For subjects with little or no residual hearing, cochlear implant techniques are often used in place of auditory retransmission. These excite the neurons of the spiral ganglion directly. Unfortunately, it is not possible to stimulate all of the auditory nerve in this way, as the shape of the cochlea precludes this. Thus, only the high frequency (basal) end of the cochlea can be stimulated so that the user is presented with a much impoverished signal.
Another possibility when there is little or no residual hearing, (and in particular where there is damage to the auditory nerve or brainstem) is to use a different modality, such as the visual modality. This is suggested in US patent 5029216, where a spectacle-mounted system which can give warning to a hearing- impaired driver that there is an emergency vehicle approaching is described. Whether the sound is retransmitted, whether the auditory nerves are directly stimulated, or whether the visual domain is used, both the sounds of interest and noise are likely to be presented. One can concentrate on areas of the spectrum in which speech is most likely to occur, but then, one will be amplifying all the speakers talking at once, leading to the commonly found problem that users of hearing aids can pick out speech when one speaker talks, but not when there are a number of speakers. Separating out what is of interest and what is noise is difficult because it is likely to vary in time and as the listener moves around. The result is that the signal in which the user is interested and noise tend to both be amplified. It was these problems that the multi-microphone techniques discussed above aimed to solve: however, their directionality is often restricted to the direction in which the user is facing.
The physiology of the early auditory system is well known, and well described in, for example, J.O. Pickles, An Introduction to the Physiology of Hearing; Academic Press, 2nd edition, 1988. This physiology is very similar across a wide range of mammals and this suggests that whatever is happening at this stage is (i) effective, and (ii) not predicated on specifically human aspects of auditory processing. We suggest that what is going on is that the sound is being streamed (A.S. Bregman. Auditory scene analysis MIT Press, 1990) both monaurally and binaurally. This seems likely (i) because the same problems of streaming are found across the animal kingdom and (ii) because logically, streaming of sounds should precede interpretation.
The auditory system appears to use a number of different cues in performing streaming. For binaural streaming, these include relative intensity, relative timings of sudden increases in intensity in parts of the spectrum (onsets) , relative timing of features of the envelope of bandpassed sound (most notably amplitude modulation peaks) , and relative timings of the peaks and troughs of the bandpassed signal (waveform- synchronous features) . Relative intensity is most used at low frequencies where the head shadow results in sounds from the sides being much stronger in one ear than the other: this is less pronounced at higher frequencies due to the sound waves diffracting round the head. For monaural streaming, the co-occurrence and relative timing across the spectrum of onsets, and the co-occurrence of same-frequency amplitude modulation in medium and high frequency areas of the spectrum appear to be used. This list is not intended to be exhaustive, but to give examples of the range of techniques in simultaneous use by the early auditory system. Apart from relative intensity, all the features above have their roots in the fine time- structure of the sound. These features may be grouped into three classes (S. Rosen. Temporal information in speech: acoustic, auditory and linguistic aspects. Phil. Trans. R. Soc . London B, 336:367-373, 1992; L.S. Smith. Extracting features from the short-term structure of cochlear filtered sound. In J.A. Bullinaria, D.W. Glasspool, and H. Houghton, editors, 4th Neural Computation and Psychology Workshop, London, 9- 11 April 1997, pages 113--125. Springer Verlag, 1998) and features from all three classes contain information which may be used in monaural grouping and sound direction finding. The primary source of the differences in these features between the two ear systems are the inter-aural time difference (ITD) and inter-aural intensity difference (IID) . The exact forms that these take are described in for example J. Blauert, Spatial Hearing, MIT Press, revised edition, 1996. From this discussion, it is clear that IID is more effective at low frequencies (due to the shadow effect of the head) , and ITD at medium and high frequencies (because the signal period is large compared with the difference in signal path times to each of the two ears) (see figure 1) .
Because of the way in which the inner hair cells of the organ of Corti transduce the pressure wave on the basilar membrane of the cochlea, they are more likely to cause a spike on the auditory nerve at one part of the phase of the incoming signal, at least for frequencies between about 20Hz and 4KHz . This phase locking is believed to be important in the detection of the inter-aural time difference later in the processing of auditory signals. So long as the period of the signal is long compared to the ITD, this provides an important and unambiguous cue.
However, at, for example, 2.5 KHz, the period is 400 microseconds. If the ITD is 150 microseconds
(corresponding to θ=25 degrees) , then we would expect the signals reaching the two ears to be 3π/4 out of phase. But this is not distinguishable from the signals being 2π - 3π/4 out of phase, corresponding to an ITD of 250 microseconds, which is θ = 37 degrees. So, medium to high frequencies result in ambiguous directions.
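The arithmetic of this ambiguity can be checked directly; the short sketch below uses only the values quoted above (a 400 microsecond period and a 150 microsecond ITD) and is not part of the patent's disclosure.

```python
import math

period_us = 400.0   # period of a 2.5 kHz tone, in microseconds
itd_us = 150.0      # interaural time difference from the example above

phase = 2 * math.pi * itd_us / period_us                            # 3*pi/4, as stated
alias_itd_us = (2 * math.pi - phase) * period_us / (2 * math.pi)    # 250 microseconds
print(phase / math.pi, alias_itd_us)   # 0.75 250.0: the same measured phase difference
                                       # is consistent with two different ITDs
```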
At high frequencies (above about 4KHz) , the phase- locking breaks down, so that waveform synchronous ITD cannot be used to locate constant high frequency sounds .
Sudden increases in intensity of sound result in onset cells in the cochlear nucleus firing (J.O. Pickles. An Introduction to the Physiology of Hearing. Academic Press, 2nd edition, 1988). These onset cells have a short latency, and are relatively insensitive to intensity (J.S. Rothman, E.D. Young, and P.D. Manis. Convergence of auditory nerve fibers onto bushy cells in the ventral cochlear nucleus: Implications of a computational model. Journal of Neurophysiology, 70(6):2562--2583, 1993; J.S. Rothman and E.D. Young. Enhancement of neural synchronization in computational models of ventral cochlear nucleus bushy cells. Auditory Neuroscience, 2:47--62, 1996). Since the intensity is likely to be different at the two detectors this is important. The latency is very short, and the population coding of a number of cells is believed to permit the timing of the onset to be precisely measured (D.C. Fitzpatrick, R. Batra, T.R. Stanford, and S. Kuwada. A neuronal population code for sound localization, Nature, 388:871-874, 1997; B.C. Skottun. Sound localization and neurons. Nature, 393:531, 1998). As a result, the ITD of the onset envelope can provide useful directional information, even for signals at high frequencies.
In a normal environment , most sounds reach the ears by many different paths, due to the presence of reflecting surfaces. One result of this is that the ITD is the result of the combination of these multiple paths, and this may cause incorrect estimates of the direction of the sound source. However, the direct path is always the fastest, and generally the least attenuated. Thus, the sound direction computed from initial onsets is not affected by the existence of multiple paths. Additionally, onsets generated by the arrival of signals from reflected paths are attempting to generate responses from the same onset cells that just fired because of the signal from the direct path. These reflected onsets will not be as strong as the original onset, and will be attempting to stimulate cells which are likely to be in their refractory period.
Many pitched sounds consist of multiple harmonics of a low-frequency fundamental. This is true of many animal noises, including voiced sounds in speech. As a result, being able to find the direction of these is particularly important. The nature of the bandpass filtering which occurs on the cochlea is such that for these sounds, a number of adjacent harmonics are present at many higher frequency locations of the cochlea. This results in the energy of the movement of the basilar membrane of the cochlea being modulated in amplitude at the frequency of the fundamental. The result is that the auditory nerve output is similarly modulated. Stellate (chopper) cells in the cochlear nucleus appear to be particularly sensitive to amplitude modulated signals, and can amplify this modulation (A.R. Palmer and I.M. Winter. Cochlear nerve and cochlear nucleus responses to the fundamental frequency of voiced speech sounds and harmonic complex tones. Advances in the Biosciences, 83:231--239, 1992). Using the difference in the timings of the peaks and troughs of this amplitude modulation, the auditory system can find the direction of certain high frequency sounds, even though waveform-synchronous phase locking is absent.
Recent work in auditory psychophysics suggests that the ITDs detected from onsets, amplitude modulation, and waveform synchronous processing are not used directly, but are grouped monaurally first (S. Carlile, The Physical and Psychophysical Basis of Sound Localization, in Virtual Auditory Space: Generation and Applications, edited by S. Carlile, R. G. Landes Company, 1996; J. F. Culling and Q. Summerfield, Perceptual separation of concurrent speech sounds: Absence of across-frequency grouping by common interaural delay, J. Acoustical Soc. of America, 98, Vol. 2, 785-796, 1995; C. J. Darwin and V. Ciocca, Grouping in pitch perception: Effects of onset asynchrony and ear of presentation of a mistuned component, J. Acoustical Soc. of America, 91, 6, 3381-3390, 1992; C. J. Darwin and R. W. Hukin, Perceptual segregation of a harmonic from a vowel by interaural time difference and frequency proximity, J. Acoustical Soc. of America, 102 (4), 2316-2324, 1997). This appears to be most effective because the ITDs are computed from a number of channels simultaneously, reducing the likelihood of errors. Most real environmental sounds are complex, containing energy at many different frequencies. Some are unpitched, and some are pitched. But we do not normally notice any particular difficulty in determining the direction of different types of sound. We suggest this is because the auditory system uses all of the techniques above, plus IID
(and perhaps some other techniques of which we are not aware). Certainly, (i) IID is useful particularly with low frequency sounds; (ii) onsets are useful with sounds which start suddenly, whether pitched or unpitched (such as a handclap); (iii) waveform-synchronous techniques are useful with medium frequency sounds; and (iv) amplitude modulation based techniques are useful with high frequency sounds which display amplitude modulation when bandpass-filtered.
(Note that people find it difficult to find the precise direction of pure constant high frequency tones: the IID is small, and the waveform-synchronous processing breaks down.)
Hearing loss causes loss of information on all aspects of the fine time structure. Clearly, for those bands of the signal not detected at all at the inner hair cells, there will be no fine timing information available at all. Additionally, for those bands of the signal for which an area of the organ of Corti is non-functional, detection will occur only on neighbouring areas of the organ of Corti. There may be a loss of synchronisation of the auditory nerve signal to both the features of the envelope modulation, and to the peaks and troughs of the bandpassed signal itself. We suggest that this loss of fine timing information is one of the primary reasons why hearing-impaired people find sound streaming difficult.
An object of the embodiments of the present invention is to provide an interactive system in which these classes of sound features may be detected synthetically and used to find the direction of incoming sounds. This directional information can then be used to stream sounds.
SUMMARY OF THE INVENTION
Accordingly, the preferred embodiment of the present invention provides an interactive system in which the precise timing of the signals produced by bandpassing incoming sounds at two microphones is detected using techniques based on what appears to happen in the early auditory system. This, along with the IID in each channel, is used to determine the direction of the sound source, and this directional information is displayed to the user. The user selects which elements of the sound should be presented by interactively selecting some of the elements displayed. User interaction may consist of the user pointing their head in a particular direction, or may take place using some form of display and graphics tablet. The final presentation of the selected auditory information may use the auditory modality (using selective amplification, attenuation and possibly resynthesis), or the visual modality.
The system described herein may also be used as a part of an auditory system for a robot. Sound sources from particular directions may be selected, making the later interpretation of the incoming sound field much simpler than if the whole sound field (from many sources simultaneously) must be interpreted at once. According to one aspect of the present invention there is provided a method of processing sound, the method comprising the steps of: detecting sounds at at least two spaced detecting locations; analysing the detected sounds to identify the angular relation between respective sound sources and said detecting locations; permitting selection of an angular relation associated with a particular sound source; and processing the detected sounds in response to said selection to highlight a stream of sound associated with said particular sound source.
Preferably, the angular relation between the respective sound sources is determined at least in part by reference to time differences between the sounds from the respective sound sources as detected at the spaced detecting locations. Most preferably, the angular relation between the respective sound sources is determined with reference to time differences determined with reference to at least one feature of the detected sounds, the feature being selected from: pitch phase; amplitude modulation phase; and onset. Ideally, time differences are determined with reference to a plurality of features of the detected sounds.
Preferably also, the angular relation between the respective sound sources is determined at least in part by reference to intensity differences between the sounds from the respective sound sources as detected at the spaced detecting locations.
Preferably also, the soundε are detected at locations corresponding to the ear of a user, and the angular relation between the respective sound sources may be determined by reference to interaural time differences
(ITDε) and interaural intenεity differences (IIDs) .
Preferably also, the method further comprises selectively filtering the detected sounds from said spaced locations into a plurality of channels and then comparing features of sound of each channel from one location with features of sound from a corresponding channel associated with the other location.
According to another aspect of the present invention, there is provided a method of processing sounds emanating from a plurality of sound sources, the method comprising the steps of: detecting sounds at at least two spaced detecting locations; analysing the detected sounds to determine the angular relation between the respective sound sources and said detecting locations by reference to at least one of intensity differences between the sounds from the respective sound sources as detected at the spaced detecting locations and time differences between the sounds from the respective sound sources as detected at the spaced detecting locations; and streaming the sounds associated with at least one sound source on the basis of said determined angular relation.
According to a further aspect of the present invention there is provided apparatus for processing sound, the apparatus comprising: means for detecting sounds at at least two spaced detecting locations; means for analysing the detected sounds to identify the angular relation between respective sound sources and said detecting locations; means for permitting selection of an identified angular relation associated with a particular sound source; and means for processing the detected sounds in response to said selection to highlight a stream of sound associated with said particular sound source.
According to a still further aspect of the present invention there is provided apparatus for processing sounds emanating from a plurality of sound sources, the apparatus comprising: means for detecting sounds at at least two spaced detecting locations; means for analysing the detected sounds to determine the angular relation between the respective sound sources and said detecting locations by reference to at least one of intensity differences between the sounds from the respective sound sources as detected at the spaced detecting locations and time differences between the sounds from the respective sound sources as detected at the spaced detecting locations; and means for streaming the sounds associated with at least one sound source on the basis of said determined angular relation. These and other aspects of the present invention will become apparent from the following description when taken in combination with the accompanying drawings, in which:
Figure 1 is a graph of interaural time delay (ITD) as a function of angle from a source of sound; Figure 2 is a schematic overview, in block diagram form, of apparatus for processing sound in accordance with a preferred embodiment of the present invention;
Figure 2a illustrates the outputs from the bandpass filters of Figure 2 in greater detail; Figure 3 is a block diagram illustrating the determination of interaural intensity difference (IID) in the apparatus of Figure 2;
Figures 4, 5 and 6 are block diagrams illustrating the determination of interaural time differences (ITDs) based on onset in the apparatus of Figure 2;
Figures 7 to 11 are block diagrams illustrating the determination of ITDs based on amplitude modulated (AM) signals in the apparatus of Figure 2;
Figure 12 is a block diagram illustrating the determination of simple ITDs in the apparatus of Figure 2; Figure 13 is a block diagram illustrating the display of IIDs and ITDs in the apparatus of Figure 2; and Figures 14 to 16 are block diagrams illustrating the processing of the user's interaction with the display of Figure 13.
Reference is first made to Figure 1 of the drawings, in which interaural time delay (ITD) is graphed as a function of the angle between a source of sound and straight ahead, for an inter-ear separation (signal path difference) of 150mm. The source distance is assumed to be large compared to the distance between the ears. As will be described, the apparatus of the preferred embodiment of the present invention determines ITD by a number of different routes, which information is then utilised to allow the apparatus to be used to stream sounds for a user.
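By way of illustration only, the relationship plotted in Figure 1 may be approximated in software using the standard far-field path-difference model; the function name, the assumed speed of sound and the use of Python are illustrative assumptions and do not form part of the described apparatus.

```python
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s, assumed value in air at room temperature
EAR_SEPARATION = 0.15    # m, the 150 mm signal path difference of Figure 1

def itd_from_angle(theta_rad):
    """Far-field interaural time delay for a source at azimuth theta.

    theta = 0 corresponds to straight ahead; the delay grows with the sine
    of the angle, as in the curve of Figure 1.
    """
    return EAR_SEPARATION * np.sin(theta_rad) / SPEED_OF_SOUND

# Example: a source 30 degrees off-axis gives an ITD of roughly 0.22 ms.
print(itd_from_angle(np.radians(30.0)) * 1e3, "ms")
```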
Figure 2 is an overview of apparatus 10 for processing sound in accordance with a preferred embodiment of the present invention. The Figure shows the two input transducers 12, 14 (Microphone L and Microphone R), and two multiple channel bandpass filters 16, 18. The microphones
12, 14 may be placed in the ear, or on the back of the ear, or at any suitable point, separated by an appropriate distance. The microphones 12, 14 are of an omnidirectional type: the directivity of the system is not achieved through microphone directional sensitivity. Further, the microphones are matched, though this is not crucial.
Each bandpass filter 16, 18 separates the incoming electrical signal from the respective microphone 12, 14 into a number of bands, as illustrated in Figure 2a. These bands may overlap, and have a broad tuning: that is, they have a characteristic roughly similar to the bands found in the sensitivity analysis of real animal cochleae. As an approximation, they have a bandwidth of about 10% of the centre frequency at 6dB.
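Purely as a non-limiting sketch, such a filter bank might be approximated in software as a set of overlapping second-order bandpass sections; the filter order, channel count, centre frequencies and sample rate below are assumptions, since the description above specifies only the broad (roughly 10% at 6 dB) tuning.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def make_filterbank(fs, n_channels=15, f_lo=100.0, f_hi=6000.0, rel_bw=0.10):
    """Build broadly tuned, overlapping bandpass sections.

    Each channel has a bandwidth of roughly rel_bw times its centre
    frequency, loosely mirroring the tuning described above.
    """
    centres = np.geomspace(f_lo, f_hi, n_channels)
    sections = []
    for fc in centres:
        half_bw = 0.5 * rel_bw * fc
        sos = butter(2, [fc - half_bw, fc + half_bw],
                     btype="bandpass", fs=fs, output="sos")
        sections.append(sos)
    return centres, sections

def filter_channels(signal, sections):
    """Return an (n_channels, n_samples) array of band-limited signals."""
    return np.stack([sosfilt(sos, signal) for sos in sections])
```

A sample rate of at least 16 kHz is assumed so that the highest channel remains below the Nyquist frequency.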
The bandpass filters 16, 18 are matched to each other. In addition, the filters 16, 18 have a fixed and known delay characteristic, and the delay characteristic is the same (or very close to the same) for the two filters 16, 18.
Both bandpass filters 16, 18 will have the same number of outputs: the precise number is not material, but the performance of the system improves as the number of filters increases.
From the bandpass filters 16, 18, the features of the signals of the individual channels are processed to provide information on interaural intensity differences (IIDs) and interaural time differences (ITDs). The resulting information is presented to the user in a format which allows the user to identify and select sources of sounds, based on the direction of the sound reaching the user. The signals from the channel or channels primarily associated with the selected source are then processed to suit the user's particular requirements, thereby effectively streaming the sound from the selected source, and minimising the effect of sounds or "noise" from other sources.
The operation of the apparatus 10 will be described initially with reference to Figure 2, followed by more detailed descriptions of the manner in which individual features of the sound are detected, analysed and processed.
In the illustrated embodiment, the outputs from the filters 16, 18 are subject to four different forms of analysis, the necessary hardware being presented in the Figures in the form of blocks. Each form of analysis is described below briefly, in turn.
The intensity of sound in each channel is computed, at 20, 22, and the determined intensities from the two microphones 12, 14 compared on a channel-by-channel basis, at 24, to provide a measure of interaural intensity difference (IID) for each channel. Each IID indicates a particular angle between the microphones 12, 14 and the source of sound, and this information is stored, at 26, and also relayed to an interactive display 28. As will be described, this display receives similar inputs from the results of the other forms of analysis, to provide more complete information for the user. The user may then interact with the display to select a particular "angle", or more particularly may select to be presented with sound from the source corresponding to that angle. The output from the display 28 is then relayed to the respective stores 26, from which details of the channels "contributing" to or indicative of the selected angle are extracted and relayed to a resynthesis substation 30. The channel-related information received from the stores 26 by the substation 30 is used to select and process signals directly from the filters 16, 18, which signals are then selectively processed to highlight the appropriate channel inputs and to present the signals to the user in an appropriate form, for example by selectively amplifying the selected channels.
In addition to IID, the sensitivity and accuracy of the apparatus is improved by detecting and processing interaural time differences (ITDs) for each channel by three different methods, as described briefly below. Firstly, onset is detected and computed for each microphone, at 32 and 33, and the ITD computed and gated, at 34, for each channel. The resulting onset ITDs are stored, at 36, and relayed to the interactive display 28 and the resynthesis substation 30 in a somewhat similar manner to the IID as described above.
Secondly, the amplitude modulation (AM) for each channel is detected and grouped, at 38 and 39, and the resulting information used to compute and gate AM ITD, at 40. Again the resulting AM ITDs are stored, at 42, and relayed to the interactive display 28 and the resynthesis substation 30 in a somewhat similar manner to the IID and onset ITD as described above.
Thirdly, the signal phases for each channel are detected and grouped, at 44 and 46, and the resulting information used to compute signal phase ITD, at 48. Again the resulting signal phase ITDs are stored, at 50, and relayed to the interactive display 28 and the resynthesis substation 30 in a somewhat similar manner to the IID, onset ITD and AM ITD as described above. These operations will now be described in greater detail, firstly by reference to Figure 3 of the drawings, which illustrates the computation of the estimate of the angle from which the dominant incoming sound originates using the interaural intensity difference between the left and right inputs, one estimate being made per channel.
The IID is computed repeatedly (for example, every 25 ms) for each channel, at 25. The computed IID is then turned into an estimate of the angle of incidence of the sound, at 24, using an estimate of the head-related transfer function. Note that this function is itself a complex function of the frequency of the sound. The angles thus estimated for each channel are grouped together, at 27, and a number of estimates of the incident angle of sounds made. These are then sent to the display subsystem 28.
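As a non-limiting illustration, the per-channel IID computation might be sketched as follows; replacing the frequency-dependent head-related transfer function with a single scaling constant is a deliberate simplification, and the frame length and constant are assumptions.

```python
import numpy as np

FRAME_S = 0.025  # the 25 ms analysis interval mentioned above

def frame_rms(x, fs, frame_s=FRAME_S):
    """Root-mean-square level of one channel in consecutive frames."""
    n = int(frame_s * fs)
    usable = len(x) - len(x) % n
    frames = x[:usable].reshape(-1, n)
    return np.sqrt(np.mean(frames ** 2, axis=1) + 1e-12)

def iid_to_angle(left_ch, right_ch, fs, db_per_radian=10.0):
    """Per-frame IID (dB) and a crude angle-of-incidence estimate.

    db_per_radian stands in for the head-related transfer function, which
    in reality varies strongly with frequency.
    """
    iid_db = 20.0 * np.log10(frame_rms(left_ch, fs) / frame_rms(right_ch, fs))
    angle = np.clip(iid_db / db_per_radian, -np.pi / 2, np.pi / 2)
    return iid_db, angle
```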
Figure 4 of the drawings shows the onset detector 32 and the onset clustering detector 32a. There is one onset detector for each of the bands produced by the bandpass filters of Figure 2. Onsets from the left side are processed separately from those from the right side.
Each onset detector 32, 33 detects onsets (sudden increases in energy) in a single channel. The output of the onset detector is, in this implementation, a pulse. This pulse is produced very quickly after the increase in energy begins; in addition, the delay before the pulse is produced is independent of the size of the onset. Another way of saying this is to say that the latency of the pulse is low, and independent of the onset size. The output of the onset detector is written onset(x,i), where x is either L or R (for left or right side), and i identifies the bandpass channel.
The onset detector 32 outputs from a single side are fed to the onset cluster detector 32a. There is one onset cluster detector for each microphone (i.e. for each side). The onset cluster detector 32a groups together those onsets which have occurred within a short time (taking into account the differences in delay time across the filter bank).
The output of the onset cluster detector 32a is an n-channel signal (where n is the number of bandpass bands). Each signal is, at any time, either 1 (signifying that this channel is currently part of an onset cluster) or 0 (signifying that this channel is not part of an onset cluster).
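By way of illustration only, a software stand-in for the onset detector and the onset cluster detector might take the following form; the envelope smoothing, rise threshold, refractory period and clustering window are all assumptions, chosen so that the trigger depends on the relative rise in energy rather than on its absolute level.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def detect_onsets(channel, fs, smooth_hz=50.0, win_s=0.005,
                  rise_ratio=3.0, refractory_s=0.05):
    """Return onset times (s) for one bandpassed channel."""
    sos = butter(2, smooth_hz, btype="lowpass", fs=fs, output="sos")
    env = np.maximum(sosfilt(sos, np.abs(channel)), 1e-9)   # smoothed envelope
    lag = max(1, int(win_s * fs))
    jump = env[lag:] / env[:-lag]            # relative rise over a short window
    onsets, last = [], -np.inf
    for i in np.nonzero(jump > rise_ratio)[0]:
        t = (i + lag) / fs
        if t - last > refractory_s:          # crude refractory period
            onsets.append(t)
            last = t
    return np.array(onsets)

def onset_clusters(onset_lists, window_s=0.01):
    """Group per-channel onsets that fall within window_s of one another.

    Returns (time, set of channel indices) pairs, a stand-in for the
    n-channel 0/1 cluster signal described above.
    """
    events = sorted((t, ch) for ch, ts in enumerate(onset_lists) for t in ts)
    clusters = []
    for t, ch in events:
        if clusters and t - clusters[-1][0] <= window_s:
            clusters[-1][1].add(ch)
        else:
            clusters.append((t, {ch}))
    return clusters
```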
Figure 5 shows the left and right onset cluster signals being composed, at 34a, to form a composite onset cluster signal. This composition 34a may take the form of a set of n AND gates or a set of OR gates.
Figure 6 shows the onset signals from the left and right from each channel being compared in time, and a value for the time differential computed. There are n such values computed. For channels for which there was no onset signal, no output (null) will be produced; similarly, for channels in which the lowest value for the time difference between the left and right onset signal is too large (that is, has a value which could not be produced from any signal direction), no output (null) will be produced.
The values produced will be gated, at 34b, by the composite onset cluster signal. This signal will select clusters of onsets (generally one at a time). There are n outputs produced: each is either a value for the time differential, or null.
One onset cluster is produced at a time, and the onset store 36 stores the channel sets of recent grouped onsets, indexed by grouped ITD, that is, according to the determined direction of sound for the channel.
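By way of illustration only, the per-channel comparison and gating of Figure 6 might be sketched as follows; the 0.5 ms bound on physically possible delays is an assumption derived from the 150 mm separation used in Figure 1.

```python
import numpy as np

MAX_ITD = 0.0005  # s; roughly the largest delay a 150 mm separation can produce

def onset_itds(left_onsets, right_onsets, max_itd=MAX_ITD):
    """Per-channel onset ITD, or None (null) where no valid match exists.

    left_onsets / right_onsets are per-channel arrays of onset times, such
    as those produced by the detector sketched earlier.
    """
    itds = []
    for lt, rt in zip(left_onsets, right_onsets):
        if len(lt) == 0 or len(rt) == 0:
            itds.append(None)                 # no onset in this channel
            continue
        diffs = lt[:, None] - rt[None, :]     # all left/right pairings
        best = diffs.flat[np.abs(diffs).argmin()]
        itds.append(float(best) if abs(best) <= max_itd else None)
    return itds
```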
Figure 7 and Figure 8 show how the amplitude modulation is detected. The outputs from the bandpass filters 16, 18 are rectified and smoothed, at 60, and the rectified smoothed output is supplied to an AM mapping network 62. This implementation is based on Smith L. S., A one-dimensional frequency map implemented using a network of integrate-and-fire neurons, in ICANN 98, Volume 2, pages 991-995, Springer Verlag, 1998. This network 62 (as shown in Figure 8) has a number of excitatory neurons (m) 64, and one inhibitory neuron 66 (shaded). The input to all the excitatory neurons is the same: the input to the inhibitory neuron is the (delayed) output of the excitatory neurons. The excitatory neurons are arranged so that they are each particularly sensitive to amplitude modulation at some small range of frequencies. The effect of the network is that, for amplitude modulated input, one of the excitatory neurons (the mapping neurons) fires in phase with the amplitude modulation. In addition, the inhibitory neuron pulses whenever there is a sufficient amount of amplitude modulation.
The AM selection stage takes the output from all the excitatory neurons. This is gated by the output from the inhibitory neuron (so that null output is produced in the absence of amplitude modulated input). It reduces the pulse output to a single amplitude modulated channel, by selecting only the active excitatory neuron output. Additionally, it codes the identity of the excitatory neuron producing this output: this supplies information on the frequency of the amplitude modulation.
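Purely as an illustrative sketch, the combined effect of the rectification, smoothing, mapping and selection stages could be imitated in software by a spectral peak search on the channel envelope; this deliberately replaces the integrate-and-fire network of Figure 8 with a conventional FFT-based estimate, and the modulation range and peak threshold are assumptions.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def am_frequency(channel, fs, f_lo=50.0, f_hi=400.0, peak_factor=5.0):
    """Dominant amplitude-modulation frequency of one band, or None.

    None is the counterpart of the gated (null) output produced when there
    is no sufficiently strong modulation.
    """
    sos = butter(2, 2.0 * f_hi, btype="lowpass", fs=fs, output="sos")
    env = sosfilt(sos, np.abs(channel))          # rectify and smooth
    env = env - env.mean()
    spec = np.abs(np.fft.rfft(env))
    freqs = np.fft.rfftfreq(len(env), 1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    if not band.any() or spec[band].max() < peak_factor * spec.mean():
        return None
    return float(freqs[band][spec[band].argmax()])
```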
Figure 9 shows the production of the table 68 used in grouping amplitude modulated signals. In order to group the amplitude modulation signals (so that they can be used to compute grouped ITDs), we produce a table with an entry of 1 for each AM frequency output from a bandpassed channel, and 0 otherwise. Thus each row of the table may contain at most one 1 entry. If the same AM frequency is found in more than one channel, then there will be columns with more than one 1 entry. The illustration shows a situation with 15 bandpassed channels and 12 distinguishable AM frequency bands. In the table shown, bandpassed bands 2, 3, and 9 have found AM in AM channel 3, bandpassed bands 6, 7, 8, and 11 have found AM in AM channel 7, and bandpassed bands 10, 12, and 14 have found AM in AM frequency channel 11.
One table is produced for each side (left, right): a composite table is produced by ANDing the left and right tables.
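As a non-limiting illustration, the grouping table of Figure 9 and the composite table might be built as follows; the band edges of the distinguishable AM frequency channels are assumptions.

```python
import numpy as np

def am_grouping_table(am_freqs, am_band_edges):
    """Build the bandpassed-channel x AM-frequency table for one side.

    am_freqs      : per-channel AM frequency (or None), e.g. as estimated above.
    am_band_edges : ascending edges of the distinguishable AM frequency bands.
    """
    table = np.zeros((len(am_freqs), len(am_band_edges) - 1), dtype=int)
    for ch, f in enumerate(am_freqs):
        if f is not None:
            col = np.searchsorted(am_band_edges, f) - 1
            if 0 <= col < table.shape[1]:
                table[ch, col] = 1            # at most one 1 entry per row
    return table

# The composite table is the element-wise AND of the two sides:
#   composite = left_table & right_table
```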
Figure 10 illustrates how the columns of the table are used to gate the AM signals at 40, selecting only those with the same AM frequency for comparison, and generation of interaural time differences (ITDs). In the figure, we show a system with 15 bandpassed channels. The output of column 7 of the table above has been used to gate these signals. Thus, only bands 6, 7, 8, and 11 have been selected. These pulse signals (which will be at the same frequency - namely frequency band 7 - and whose pulse times reflect the phase of the amplitude modulation, that is, the pulses are in phase with the amplitude modulation) are then fed in pairs (left, right) to circuitry 70 which computes the time difference between these signals. The values across the different selected bands are then processed at 72 (for example, averaged, or the modal or median value selected) to produce the AM time differential signal for this AM frequency band.
An AM time difference signal will be produced for each nonzero column in Figure 9: that is, for each AM frequency band detected in both left and right channels. Figure 11 shows how recent time difference signals produced for each nonzero column are stored: that is, the set of channels associated with each signal is stored, indexed by the (grouped) ITD, as input from the interactive display 28. Figure 12 shows how simple (ungrouped) ITDs are computed from the output of each pair (left, right) of bandpassed channels. This is achieved by using a phase-locked pulse generator 74, 75 (which may, for example, generate a pulse on each positive-going zero-crossing), and then calculating the time difference between these pulses. For low frequencies, these estimates tend to be unreliable, and for high frequencies, they can be ambiguous. However, there is a range of medium frequencies for which good estimates can be made. One time difference estimate will be produced for each
(medium-frequency) channel. These may be grouped together prior to further usage.
Recent time difference estimates are stored at 50: that is, the channels associated with each grouped time difference estimate are stored, indexed by the time difference (ITD) itself.
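By way of illustration only, the waveform-synchronous (simple) ITD of Figure 12 might be sketched as follows; generating a pulse on each positive-going zero crossing follows the example given above, while the 0.5 ms bound and the use of the median are assumptions.

```python
import numpy as np

def zero_crossing_itd(left_ch, right_ch, fs, max_itd=0.0005):
    """Waveform-synchronous ITD estimate for one medium-frequency channel."""
    def pulses(x):
        idx = np.nonzero((x[:-1] < 0) & (x[1:] >= 0))[0]   # positive-going crossings
        return (idx + 1) / fs

    lt, rt = pulses(left_ch), pulses(right_ch)
    if len(lt) == 0 or len(rt) == 0:
        return None
    diffs = lt[:, None] - rt[None, :]
    nearest = diffs[np.arange(len(lt)), np.abs(diffs).argmin(axis=1)]
    nearest = nearest[np.abs(nearest) <= max_itd]   # discard impossible delays
    return float(np.median(nearest)) if len(nearest) else None
```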
Figure 13 shows how the time difference signals from the three sources (onsets (Figure 6), amplitude modulation (Figure 11) and waveform-synchronous processing (Figure 12)) are displayed on the display 28, in the form of a direction display. That is, the time differences between the left and right channels are interpreted as angles. The angles computed from the IIDs (Figure 3) are also displayed.
In this embodiment, the display takes the form of a semicircle, because the system cannot distinguish between sounds from in front and behind; darker areas correspond to estimated directions of sound sources. The user interacts with the display, selecting a particular direction (e.g. by touching the display) from which they wish to be presented with sounds. The display returns the angle selected, and this is then processed. A low power flat touch panel display (such as those used in colour portable computers) may be utilised.
Figure 14 illustrates how the signals controlling the signal to be presented to the user are generated from the information recovered from the interactive display 28, and the stored information at the onset store 36 (Figure 6), the AM difference signal store 42 (Figure 11) and the waveform-based time difference signal store 50 (Figure 12).
The angle output from the interactive display 28 is computed, at 76, from the user's interaction with the display. This is converted into an estimate of the IID and ITD, at 78 and 80, which sound from that direction would lead to. The channel contributions from the low, medium and high frequency channels are normalised, at 82, 84, to provide mixing signals.
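As a non-limiting illustration, the conversion of the selected angle into per-channel mixing signals might be sketched as follows; the tolerance, the exponential weighting and the far-field ITD model are assumptions standing in for the stored ITD/channel lists and head-related transfer function of Figure 14.

```python
import numpy as np

def mixing_weights(theta, channel_itds, itd_tolerance=5e-5,
                   ear_sep=0.15, c=343.0):
    """Per-channel mixing vector for one ear, normalised to length 1.

    channel_itds holds one ITD estimate per channel (None where unavailable),
    drawn from the onset, AM and waveform-synchronous stores.
    """
    expected_itd = ear_sep * np.sin(theta) / c
    weights = np.array([
        0.0 if itd is None else np.exp(-abs(itd - expected_itd) / itd_tolerance)
        for itd in channel_itds
    ])
    norm = np.linalg.norm(weights)
    return weights / norm if norm > 0 else weights
```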
Figure 15 shows how the angle computed from the user's interaction with the display is used to index into the store 26 of low-frequency channel contributions, to estimate which of the low frequency (LF) channels gave rise to IIDs which were likely to have been produced by signals from that direction. This will use the head-related transfer function, which is different at different frequencies.
Figure 16 shows how the final signal for presentation to the user is generated. The mixing signals ControlLMix and ControlRMix are generated as described in Figure 14, and control left and right channel mixers 86, 88. The final mixing of the signals for the two ears, in mixers 90, 92, is controlled by signals ControlLRMix, and these will depend on the nature of the user's hearing loss.
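By way of illustration only, the channel mixers and the final left/right mix might be sketched as follows; the simple scalar stand-in for ControlLRMix is an assumption, since in practice that signal would be set according to the user's hearing loss.

```python
import numpy as np

def mix_output(left_bands, right_bands, control_l, control_r,
               control_lr=(0.5, 0.5)):
    """Weighted recombination of the selected channels for each ear.

    left_bands / right_bands : (n_channels, n_samples) bandpassed signals.
    control_l / control_r    : per-channel mixing vectors (ControlLMix, ControlRMix).
    control_lr               : scalar gains standing in for ControlLRMix.
    """
    out_l = control_l @ left_bands        # OutDataL: weighted sum over channels
    out_r = control_r @ right_bands       # OutDataR
    return control_lr[0] * out_l, control_lr[1] * out_r
```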
As noted previously, the present invention is intended to mimic, to a certain extent, the processing of sounds in the early auditory system of a human (or other mammal), as discussed below. The input to the system comes from two microphones 12, 14 (L, for left, and R, for right in the figures), which are placed a distance apart. The microphones may be, for example, placed at the end of the auditory canal, or elsewhere on the pinna. Placing the microphones at the end of the auditory canal allows the pinna transfer characteristic to alter the relative strengths of different frequencies. If final presentation is binaural, this information is useful to the user, allowing them to place the sound in space better (J. Blauert, Spatial Hearing, MIT Press, revised edition, 1996). The microphones 12, 14 transduce the acoustic signal into an electrical signal. These electrical signals are amplified (maintaining the same frequency/phase response in both channels), and fed into identical bandpass filter banks 16, 18. These filter banks 16, 18 perform a similar task to that of the cochlea. Each of these filter banks 16, 18 produces a large number of outputs, one for each channel. These channel outputs are used as input to modules which emulate the onset detectors, waveform-synchrony detectors and amplitude modulation detectors of the neurobiological system. However, not all channels will use all three modules.
Earlier work (L.S. Smith. Onset-based sound segmentation. In D.S. Touretzky, M.C. Mozer, and M.E. Hasselmo, editors, Advances in Neural Information Processing Systems 8, pages 729--735. MIT Press, 1996) has used onsets in different channels for sound segmentation; however, the integrate-and-fire neuron models used there have a latency (time from the sudden increase occurring to the neuron firing) which is dependent on the volume of the sound, and the speed of increase, unlike those of the real onset cells (J.S. Rothman, E.D. Young, and P.D. Manis. Convergence of auditory nerve fibers onto bushy cells in the ventral cochlear nucleus: Implications of a computational model. Journal of Neurophysiology, 70(6):2562--2583, 1993; J.S. Rothman and E.D. Young. Enhancement of neural synchronization in computational models of ventral cochlear nucleus bushy cells. Auditory Neuroscience, 2:47--62, 1996). The synthetic onset detector must have a very short, but constant latency: this latency needs to be constant over a wide range both of intensities and of rates of increase. Since onsets may be used in locating both pitched and unpitched sounds, each onset detector may receive input from a range of bandpassed channels. Unlike the biological system, we use one precise onset detector per channel, rather than relying on population coding.
Waveform-synchrony is primarily of use at low to medium frequencies, as discussed earlier. The synthetic waveform-synchrony detector will provide an output at a specific part of the phase of the signal (for example, at each positive-going zero-crossing). For precise measurement of the ITD, there needs to be as little jitter as possible. Amplitude modulation is primarily useful at medium to high frequencies. Note that effective use of AM is predicated on the bandpass filter having a wide-band response such as the response of the real cochlea. Again, the detector must provide an output at a particular point in the envelope, for example at peaks, and again, jitter needs to be minimised.
The display shows the azimuthal direction of the different incoming sounds (though not whether the sound is ahead of or behind the user) as computed from IIDs and head-related transfer functions, and from ITDs. This on its own may be used to draw attention to features of the auditory environment. However, it may be rendered more useful to the hearing impaired by permitting them to interact with it to select the information to be presented to them by the hearing aid itself. How this is best achieved in a particular application will depend on factors which will vary from user to user, such as whether they are willing to use their hands to interact with the system, or would prefer to interact only by turning their heads.
Two main modes of sound selection are likely to be utilised. In the embodiment likely to be preferred by most users, the user turns to face the (known) source of the sounds in which they are interested. The sounds to be selected are then those with low ITD and IID. In another embodiment, as described above, a map of the incoming sounds is produced and displayed, and the user selects the sounds to be presented. The information to be presented to the user may be presented monaurally or binaurally. The following discussion, as illustrated in Figures 14-16, refers to binaural presentation. The result of the user's interaction with the interactive display is an angle, θ, between -π/2 and +π/2, or 0 if the user requests only those sources directly ahead. This angle is used to compute the expected IID and ITD for signals from that direction. In the illustrated embodiment, the ITD estimate is used to index into the stored ITD/channel list for the onset, amplitude modulation, and waveform synchronous subsystems. For each channel, these three values are used to compute an estimate of that channel's contribution produced by sources from that direction. For each ear, these estimates are normalised (to a length of 1), and this vector (ControlLMix for the left ear, and ControlRMix for the right ear) is used to control a mixer (Figure 16). This produces two multichannel outputs, OutDataL and OutDataR. These are used for medium and high frequencies; for lower frequencies, the same approach is taken using the IID. The resultant output, OutData, is a multichannel signal, suitable for visual display. For auditory presentation, the signals in the different channels are added together in a manner which reflects the user's hearing deficit.
Exactly how the selected sounds are presented to the user depends very much on the sensory faculties of the user. If there is sufficient residual hearing, then selective amplification may be most suitable; if the residual hearing is restricted to particular frequency bands, then resynthesis may be more appropriate. It may also be possible to mix these two techniques. Alternatively, presentation may use the visual modality.
The selected sound, produced as outlined in Figures 14 to 16, may have some channels selectively amplified to make up for the hearing deficit. The resulting sound may be presented (a) monaurally, if there is only sufficient residual hearing in one ear, or (b) binaurally, if there is sufficient residual hearing in both ears. In this case, we would present the data from OutDataL to the left ear, and from OutDataR to the right ear.
Additionally, we can still use the ControlLRMix signal to alter the gain on the signals from the two ears.
Where the residual hearing is restricted to a small part of the auditory spectrum, or, indeed, where the presentation takes place through an implant, it may be more appropriate to resynthesize the sound to take advantage of whatever hearing is available. Again, the signal we start from is the OutData signal.
Where there is little or no residual hearing, the information from the sound in one particular direction is presented visually, utilising a colour display to present information about how the power of the sound is distributed over the spectrum.
One possibility (which does not use the interactive display, but displays all the incoming sound) is to choose the colour to match the ITD, and to make the intensity reflect the strength of the signal.
Alternatively, one may use the interactive display to select the direction of the sources to be presented, and use the colour of the display to show the presence and pitch of amplitude modulation, keeping white for non-amplitude-modulated areas of the spectrum, and again using the intensity to show the signal strength. This would use information present in Figures 8 to 10, but not used in the auditory presentation. The interactive system operates under strict real-time constraints. In addition, the system is ideally light, and wearable, and runs with a low power consumption. Most current sophisticated hearing aids use digital signal processing (DSP) techniques. DSP circuits are generally organised as reconfigurable fast parallel multipliers and adders. This is highly appropriate for convolution computation, and is highly effective for digital filtering. Non-linear operations are also possible on such circuits. However, although DSP technology is very fast, it is not inherently parallel, and we wish to process multiple channels simultaneously. In addition, there is a speed/power tradeoff.
An alternative technology is subthreshold analog VLSI (C. Mead. Analog VLSI and Neural Systems. Addison-Wesley, 1989). This technology works at extremely low power levels, and this allows highly parallel circuits which operate at low power to be utilised. In addition, the exponential characteristic of one of the basic components, the transconductance amplifier, mirrors the characteristics of the biological system rather better than either digital on/off switches, or more linear analogue devices.
Sound detection may be through microphones. Alternatively, direct silicon transducers for pressure waves may be used. In the preferred embodiment there are two microphones, mounted on the user's ears (either behind the ear, or in the auditory canal). The microphones are omnidirectional: we need to receive signals from all directions so that we can estimate the directions of sound sources.
We describe below one possible neuromorphic implementation of some of the processing described above, though it is understood that this implementation is given by way of example only and is not intended to limit the scope of the invention. This processing takes place in stages. For the input from each ear, the first stage (after transduction) is cochlear filtering, and this is followed (in each bandpassed channel) by (in parallel) intensity computation, pitch phase detection, and envelope processing (that is, amplitude modulation phase detection and onset detection). The results of this processing (for all channels, and for both ears) are used to generate ITD estimates for each feature type for each channel. This information is then used in determining what should be presented to the user.
The use of neuromorphic technology for real-time cochlear filtering was initially proposed by Lyon et al (R.F. Lyon and C. Mead. An analog electronic cochlea. IEEE Transactions on Acoustics, Speech and Signal Processing, 36(7):1119--1134, 1988) and has been extended by Lazzaro (J. Lazzaro and C. Mead. Silicon modeling of pitch perception. Proceedings of the National Academy of Sciences of the United States, 86(23):9597--9601, 1989), Liu and Andreou (W. Liu, A.G. Andreou, and M.H. Goldstein Jr. Analog cochlear model for multiresolution speech analysis. In Advances in Neural Information Processing Systems 5, pages 666--673, 1993, and W. Liu, A.G. Andreou, and M.H. Goldstein Jr. Voiced speech representation by an analog silicon model of the auditory periphery. IEEE Trans. Neural Networks, 3(3):477--487, 1993), Watts (L. Watts. Cochlear Mechanics: Analysis and Analog VLSI. PhD thesis, California Institute of Technology, 1993), and more recently by Fragniere and van Schaik (E. Fragniere, A. van Schaik, and E.A. Vittoz. Design of an analogue VLSI model of an active cochlea. Analog Integrated Circuits and Signal Processing, 12:19--35, 1997). The advantages of the neuromorphic solution are that it is inherently real-time, and low power, unlike DSP implementations. At present it is not yet possible to achieve as high a quality factor (Q) or as many stages as achieved by the human cochlea, but the most recent techniques (A. van Schaik. Analogue VLSI Building Blocks for an Electronic Auditory Pathway. PhD thesis, Ecole Polytechnique Federale de Lausanne, 1997) can provide 104 stages using a second order low-pass filter cascade.
Pitch phase detection in animals relies on population coding by spiking neurons which are more likely to spike at a particular phase of the movement of the basilar membrane. Neuromorphic implementations of this are discussed by Liu et al (W. Liu, A.G. Andreou, and M.H. Goldstein Jr. Voiced speech representation by an analog silicon model of the auditory periphery. IEEE Trans. Neural Networks, 3(3):477--487, 1993) and in techniques by van Schaik (A. van Schaik. Analogue VLSI Building Blocks for an Electronic Auditory Pathway. PhD thesis, Ecole Polytechnique Federale de Lausanne, 1997), where a version of Meddis's hair cell model (M.J. Hewitt and R. Meddis. An evaluation of eight computer models of mammalian inner hair-cell function. Journal of the Acoustical Society of America, 90(2):904--917, 1991) is implemented. In both these cases, both the tendency to synchronize with the input signal below about 4 kHz, and the rapid and short-term adaptation are modelled. However, if the aim is simply to encode the phase of the signal emanating from each bandpass filter, then a further available technique would be rectification followed by peak detection, or alternatively, simple positive-going zero crossing detection. Either of these can be easily accomplished using neuromorphic techniques. Lazzaro et al
(J. Lazzaro and C.A. Mead. A silicon model of auditory localization. Neural Computation, 1(1):47--57, 1989) have implemented neuromorphically a model of the barn owl's auditory localisation system using a detector sensitive to zero-crossings of the derivative of the half-wave rectified bandpass filter output. Although it would be possible for a neuromorphic system to retain waveform-synchronous operation at high frequencies, source direction detection is difficult because of the short period of these signals: matching the peaks leads to ambiguity in the source direction. However, if the result of bandpassing the signal at high frequencies is that there is amplitude modulation at a lower frequency, then the difference in the phase of the modulation between the two detectors may be used. Neuromorphic detection of amplitude modulation (modelling stellate cells in the cochlear nucleus) is discussed by van Schaik (A. van Schaik. Analogue VLSI Building Blocks for an Electronic Auditory Pathway. PhD thesis, Ecole Polytechnique Federale de Lausanne, 1997) in the context of periodicity extraction. Although the same techniques could be used for ITD estimation, it is perhaps simpler to low-pass filter the waveform-synchronous phase detector output, and generate a pulse on each peak (or on each positive-going zero-crossing). Neuromorphic implementation of the onset detector may be achieved using a neuromorphic spiking neuron.
Since there are three independent techniques for ITD computation in each channel (although amplitude modulation would not be used below about 1 kHz, and waveform synchrony would not be used above about 4 kHz), we are liable to have both a number of estimates at different parts of the spectrum, and even a number of estimates at each part of the spectrum. There may be many sound sources at any one time, so that all these estimates may well be correct. A mixture of subthreshold analogue, supra-threshold analogue and digital techniques may be applied to the production of the neuromorphic implementation of the control signal generation and of the mixers.
What is to be presented will be produced from the OutData signal (or the OutDataL and OutDataR signals in the case of binaural presentation). Auditory presentation technology may, for example, utilise remote generation of the signal, and transmission of the signal to the in-ear transducers by wireless technology. In addition, it may be necessary to adjust the spectral energy distribution and compress the signal to take best advantage of the residual hearing present. Noting that the bandpass characteristic of current neuromorphic filters is not as sharp as is preferred, we may counterbalance this by (i) selectively amplifying those channels for which the chosen ITD is most strongly represented and (ii) subtracting the content of those channels in which the chosen ITD is under-represented.
It will be understood that the embodiments of the invention hereinbefore described are given by way of example only and are not meant to limit the scope thereof in any way.

Claims

1. A method of processing sound, the method comprising the steps of: detecting sounds at at least two spaced detecting locations; analysing the detected sounds to identify the angular relation between respective sound sources and said detecting locations; permitting selection of an angular relation associated with a particular sound source; and processing the detected sounds in response to said selection to highlight a stream of sound associated with said particular sound source.
2. The method of claim 1, wherein the angular relation between the respective sound sources is determined at least in part by reference to time differences between the sounds from the respective sound sources as detected at the spaced detecting locations.
3. The method of claim 2, wherein the angular relation between the respective sound sources is determined with reference to time differences determined with reference to at least one feature of the detected sounds, the feature being selected from: waveform phase; amplitude modulation phase; and onset.
4. The method of claim 3, wherein time differences are determined with reference to a plurality of features of the detected sounds.
5. The method of any of the preceding claims, wherein the angular relation between the respective sound sources is determined at least in part by reference to intensity differences between the sounds from the respective sound sources as detected at the spaced detecting locations.
6. The method of any of the preceding claims, wherein the sounds are detected at locations corresponding to the ears of a user.
7. The method of claim 6, wherein the angular relation between the respective sound sources is determined at least in part by reference to interaural time differences (ITDs).
8. The method of claims 6 or 7, wherein the angular relation between the respective sound sources is determined at least in part by reference to interaural intensity differences (IIDs).
9. The method of any of the preceding claims, further comprising selectively filtering the detected sounds from said spaced locations into a plurality of channels and then comparing features of sound of each channel from one location with features of sound from a corresponding channel associated with the other location.
10. The method of claim 9 when the angular relation between the respective sound sources is determined with reference to time differences determined with reference to waveform phase of the detected sounds and wherein the time differences are grouped by clustering values of the time differences .
11. The method of claim 9 when the angular relation between the respective sound sources is determined with reference to time differences determined with reference to onset of the detected sounds and wherein onsets are grouped monaurally prior to determination of the time differences.
12. The method of claim 9 when the angular relation between the respective sound sources is determined with reference to time differences determined with reference to amplitude modulation phase of the detected sounds and wherein the amplitude modulation channels are grouped by amplitude modulation frequency prior to determination of the time differences.
13. A method of processing sounds emanating from a plurality of sound sources, the method comprising the steps of: detecting sounds at at least two spaced detecting locations ; analysing the detected sounds to determine the angular relation between the respective sound sources and said detecting locations by reference to at least one of intensity differences between the sounds from the respective sound sources as detected at the spaced detecting locations and time differences between the sounds from the respective sound sources as detected at the spaced detecting locations; and streaming the sounds associated with at least one sound source on the basis of said determined angular relation.
14. Apparatus for processing sound, the apparatus comprising: means for detecting sounds at at least two spaced detecting locations; means for analysing the detected sounds to identify the angular relation between respective sound sources and said detecting locations; means for permitting selection of an identified angular relation associated with a particular sound source; and means for processing the detected sounds in response to said selection to highlight a stream of sound associated with said particular sound source.
15. The apparatus of claim 14, wherein said analysing means includes means for determining the angular relation between the respective sound sources by determining time differences between the sounds from the respective sound sources as detected at the spaced detecting locations.
16. The apparatus of claim 15, wherein said analysing means includes means for determining the angular relation between the respective sound sources by determining time differences between the sounds from the respective sound sources as detected at the spaced detecting locations by reference to at least one of waveform phase; amplitude modulation phase and onset.
17. The apparatus of claim 16, wherein said analysing means includes means for determining the angular relation between the respective sound sources by determining time differences between the sounds from the respective sound sources as detected at the spaced detecting locations by reference to a plurality of features of the detected sounds.
18. The apparatus of any of claims 13 to 17, wherein said analysing means includes means for determining the angular relation between the respective sound sources by determining intensity differences between the sounds from the respective sound sources as detected at the spaced detecting locations.
19. The apparatus of any of claims 13 to 18, wherein the apparatus is a hearing aid and said means for detecting sounds are adapted for positioning at locations corresponding to the ears of a user.
20. The hearing aid of claim 19, wherein said analysing means includes means for determining the angular relation between the respective sound sources by determining interaural time differences (ITDs) between the sounds from the respective sound sources.
21. The hearing aid of claims 19 or 20, wherein said analysing means includes means for determining the angular relation between the respective sound sources by determining interaural intensity differences (IIDs) between the sounds from the respective sound sources.
22. The apparatus of any of claims 14 to 21, further comprising means for selectively filtering the detected sounds from said spaced locations into a plurality of channels, the analysing means comprising means for comparing features of sound of each channel from one location with features of sound from a corresponding channel associated with the other location.
23. The method of claim 22 when the angular relation between the respective sound sources is determined with reference to time differences determined with reference to waveform phase of the detected sounds and wherein the time differences are grouped by clustering values of the time differences.
24. The method of claim 22 when the angular relation between the respective sound sources is determined with reference to time differences determined with reference to onset of the detected sounds and wherein onsets are grouped monaurally prior to determination of the time differences.
25. The method of claim 22 when the angular relation between the respective sound sources is determined with reference to time differences determined with reference to amplitude modulation phase of the detected sounds and wherein the amplitude modulation channels are grouped by amplitude modulation frequency prior to determination of the time differences.
26. Apparatus for processing sounds emanating from a plurality of sound sources, the apparatus comprising: means for detecting sounds at at least two spaced detecting locations; means for analysing the detected sounds to determine the angular relation between the respective sound sources and said detecting locations by reference to at least one of intensity differences between the sounds from the respective sound sources as detected at the spaced detecting locations and time differences between the sounds from the respective sound sources as detected at the spaced detecting locations; and means for streaming the sounds associated with at least one sound source on the basis of said determined angular relation.
PCT/GB1999/002063 1998-06-30 1999-06-30 Method and apparatus for processing sound WO2000001200A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
AU45258/99A AU4525899A (en) 1998-06-30 1999-06-30 Method and apparatus for processing sound
EP99928142A EP1090531A1 (en) 1998-06-30 1999-06-30 Method and apparatus for processing sound
JP2000557662A JP2002519973A (en) 1998-06-30 1999-06-30 Audio processing method and apparatus

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB9813973.6 1998-06-30
GBGB9813973.6A GB9813973D0 (en) 1998-06-30 1998-06-30 Interactive directional hearing aid

Publications (1)

Publication Number Publication Date
WO2000001200A1 true WO2000001200A1 (en) 2000-01-06

Family

ID=10834550

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/GB1999/002063 WO2000001200A1 (en) 1998-06-30 1999-06-30 Method and apparatus for processing sound

Country Status (5)

Country Link
EP (1) EP1090531A1 (en)
JP (1) JP2002519973A (en)
AU (1) AU4525899A (en)
GB (1) GB9813973D0 (en)
WO (1) WO2000001200A1 (en)


Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5004094B2 (en) * 2008-03-04 2012-08-22 国立大学法人北陸先端科学技術大学院大学 Digital watermark embedding apparatus, digital watermark detection apparatus, digital watermark embedding method, and digital watermark detection method
US8525868B2 (en) * 2011-01-13 2013-09-03 Qualcomm Incorporated Variable beamforming with a mobile platform
KR101934999B1 (en) 2012-05-22 2019-01-03 삼성전자주식회사 Apparatus for removing noise and method for performing thereof


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE1566857A1 (en) * 1967-11-18 1970-04-30 Krupp Gmbh Device for binaural signal reception in sonar systems
US4904078A (en) * 1984-03-22 1990-02-27 Rudolf Gorike Eyeglass frame with electroacoustic device for the enhancement of sound intelligibility
JPH0739000A (en) * 1992-12-05 1995-02-07 Kazumoto Suzuki Selective extract method for sound wave in optional direction
US5757932A (en) * 1993-09-17 1998-05-26 Audiologic, Inc. Digital hearing aid system
JPH08285674A (en) * 1995-04-11 1996-11-01 Takayoshi Hirata Directive wave receiving system using anharmonic frequency analyzing method
JPH09247800A (en) * 1996-03-12 1997-09-19 Matsushita Electric Ind Co Ltd Method for extracting left right sound image direction

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
PATENT ABSTRACTS OF JAPAN vol. 1995, no. 5 30 June 1995 (1995-06-30) *
PATENT ABSTRACTS OF JAPAN vol. 1997, no. 3 31 March 1997 (1997-03-31) *
PATENT ABSTRACTS OF JAPAN vol. 1998, no. 1 30 January 1998 (1998-01-30) *

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1380028A2 (en) * 2001-04-11 2004-01-14 Phonak Ag Method for the elimination of noise signal components in an input signal for an auditory system, use of said method and a hearing aid
EP1423988B2 (en) 2001-08-08 2015-03-18 Semiconductor Components Industries, LLC Directional audio signal processing using an oversampled filterbank
US7317945B2 (en) 2002-11-13 2008-01-08 Advanced Bionics Corporation Method and system to convey the within-channel fine structure with a cochlear implant
WO2004043537A1 (en) * 2002-11-13 2004-05-27 Advanced Bionics Corporation Method and system to convey the within-channel fine structure with a cochlear implant
US7512245B2 (en) 2003-02-25 2009-03-31 Oticon A/S Method for detection of own voice activity in a communication device
US7149583B1 (en) 2003-04-09 2006-12-12 Advanced Bionics Corporation Method of using non-simultaneous stimulation to represent the within-channel fine structure
US8620445B2 (en) 2003-11-21 2013-12-31 Advanced Bionics Ag Optimizing pitch allocation in a cochlear implant
US7702396B2 (en) 2003-11-21 2010-04-20 Advanced Bionics, Llc Optimizing pitch allocation in a cochlear implant
US8180455B2 (en) 2003-11-21 2012-05-15 Advanced Bionics, LLC Optimizing pitch allocation in a cochlear implant
EP1653768A3 (en) * 2004-11-02 2010-06-02 Siemens Audiologische Technik GmbH Method for reducing interference power in a directional microphone and corresponding acoustical system
US7277760B1 (en) 2004-11-05 2007-10-02 Advanced Bionics Corporation Encoding fine time structure in presence of substantial interaction across an electrode array
US8965519B2 (en) 2004-11-05 2015-02-24 Advanced Bionics Ag Encoding fine time structure in presence of substantial interaction across an electrode array
US8027733B1 (en) 2005-10-28 2011-09-27 Advanced Bionics, Llc Optimizing pitch allocation in a cochlear stimulation system
US8295937B2 (en) 2005-10-28 2012-10-23 Advanced Bionics, Llc Optimizing pitch allocation in a cochlear stimulation system
US8199945B2 (en) * 2006-04-21 2012-06-12 Siemens Audiologische Technik Gmbh Hearing instrument with source separation and corresponding method
WO2010148169A1 (en) * 2009-06-17 2010-12-23 Med-El Elektromedizinische Geraete Gmbh Spatial audio object coding (saoc) decoder and postprocessor for hearing aids
US9393412B2 (en) 2009-06-17 2016-07-19 Med-El Elektromedizinische Geraete Gmbh Multi-channel object-oriented audio bitstream processor for cochlear implants
US9147157B2 (en) 2012-11-06 2015-09-29 Qualcomm Incorporated Methods and apparatus for identifying spectral peaks in neuronal spiking representation of a signal
US10142761B2 (en) 2014-03-06 2018-11-27 Dolby Laboratories Licensing Corporation Structural modeling of the head related impulse response
CN113556660A (en) * 2021-08-01 2021-10-26 武汉左点科技有限公司 Hearing-aid method and device based on virtual surround sound technology
CN113556660B (en) * 2021-08-01 2022-07-19 武汉左点科技有限公司 Hearing-aid method and device based on virtual surround sound technology

Also Published As

Publication number Publication date
JP2002519973A (en) 2002-07-02
GB9813973D0 (en) 1998-08-26
EP1090531A1 (en) 2001-04-11
AU4525899A (en) 2000-01-17

Similar Documents

Publication Publication Date Title
WO2000001200A1 (en) Method and apparatus for processing sound
CA2621940C (en) Method and device for binaural signal enhancement
EP1522868B1 (en) System for determining the position of a sound source and method therefor
Dietz et al. Auditory model based direction estimation of concurrent speakers from binaural signals
Roman et al. Speech segregation based on sound localization
US10469961B2 (en) Binaural hearing systems and methods for preserving an interaural level difference between signals generated for each ear of a user
JP3521914B2 (en) Super directional microphone array
US20040252849A1 (en) Microphone array for preserving soundfield perceptual cues
CN108122559B (en) Binaural sound source positioning method based on deep learning in digital hearing aid
DE102019129330A1 (en) Conference system with a microphone array system and method for voice recording in a conference system
CN106999710B (en) Bilateral hearing implant matching of ILD based on measured ITD
JP2008543143A (en) Acoustic transducer assembly, system and method
CN101754081A (en) Improvements in hearing aid algorithms
Culling et al. Spatial hearing
US20170080228A1 (en) Interaural Coherence Based Cochlear Stimulation Using Adapted Fine Structure Processing
Ricketts The impact of head angle on monaural and binaural performance with directional and omnidirectional hearing aids
Grabke et al. Cocktail party processors based on binaural models
CN106658319A (en) Sound processing for a bilateral cochlear implant system
De Sena et al. Localization uncertainty in time-amplitude stereophonic reproduction
Goldsworthy et al. Two-microphone spatial filtering provides speech reception benefits for cochlear implant users in difficult acoustic environments
Bissmeyer et al. Adaptive spatial filtering improves speech reception in noise while preserving binaural cues
Ozimek et al. Speech intelligibility for different spatial configurations of target speech and competing noise source in a horizontal and median plane
Colburn et al. Binaural directional hearing—Impairments and aids
WO2018106567A1 (en) Interaural coherence based cochlear stimulation using adapted fine structure processing
Wightman et al. Reassessment of the role of head movements in human sound localization

Legal Events

Date Code Title Description
AK Designated states
Kind code of ref document: A1
Designated state(s): AE AL AM AT AU AZ BA BB BG BR BY CA CH CN CU CZ DE DK EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents
Kind code of ref document: A1
Designated state(s): GH GM KE LS MW SD SL SZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: The EPO has been informed by WIPO that EP was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
ENP Entry into the national phase
Ref country code: JP
Ref document number: 2000 557662
Kind code of ref document: A
Format of ref document f/p: F

WWE Wipo information: entry into national phase
Ref document number: 1999928142
Country of ref document: EP

WWE Wipo information: entry into national phase
Ref document number: 09720766
Country of ref document: US

WWP Wipo information: published in national office
Ref document number: 1999928142
Country of ref document: EP

REG Reference to national code
Ref country code: DE
Ref legal event code: 8642

WWW Wipo information: withdrawn in national office
Ref document number: 1999928142
Country of ref document: EP