EP2094032A1 - Audio signal, method and apparatus for encoding or transmitting the same and method and apparatus for processing the same - Google Patents

Audio signal, method and apparatus for encoding or transmitting the same and method and apparatus for processing the same

Info

Publication number
EP2094032A1
EP2094032A1 EP08101732A EP08101732A EP2094032A1 EP 2094032 A1 EP2094032 A1 EP 2094032A1 EP 08101732 A EP08101732 A EP 08101732A EP 08101732 A EP08101732 A EP 08101732A EP 2094032 A1 EP2094032 A1 EP 2094032A1
Authority
EP
European Patent Office
Prior art keywords
audio
channels
encoding
src
ambisonics
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP08101732A
Other languages
German (de)
French (fr)
Inventor
Johann-Markus Batke
Klaus Eilts-Grimm
Jürgen Schmidt
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Deutsche Thomson OHG
Original Assignee
Deutsche Thomson OHG
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Deutsche Thomson OHG
Priority to EP08101732A
Publication of EP2094032A1
Legal status: Withdrawn

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S3/00 Systems employing more than two channels, e.g. quadraphonic
    • H04S3/002 Non-adaptive circuits, e.g. manually adjustable or static, for enhancing the sound image or the spatial distribution
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2420/00 Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/03 Application of parametric coding in stereophonic audio systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2420/00 Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/11 Application of ambisonics in stereophonic audio systems

Definitions

  • This invention relates to a generic audio signal format, a method and an apparatus for encoding or transmitting and a method and an apparatus for processing the same.
  • Audio formats relate directly to the audio channels, i.e. the audio information is stored like it is fed to the loudspeaker.
  • the playback of such files is strictly bound to correct positioning of the loudspeakers.
  • Audio formats like 2.0 (stereo) and 5.1 (surround sound) are able to reproduce a spatial impression of the audio content. However, this spatial impression is strictly two-dimensional (2D).
  • extensions of these audio formats with higher number of audio channels like 7.1 or 9.1 stick to 2D sound field representation.
  • the newly developed 22.2 format [1] is a format capable of representing audio content with height information. [1] K. Hamasaki, T. Nishiguchi, R. Okumaura, Y. Nakayama: "Wide listening area with exceptional spatial sound quality of a 22.2 multichannel sound system" (Audio Engineering Society Preprints, Vienna, Austria, May 2007)
  • a new audio format may use the description of the spatial sound field.
  • a known solution for spatial sound field description is based on Higher Order Ambisonics (HOA), a technology that describes spatial sound fields using the coefficients of the FOURIER-BESSEL series (also known under different names [2]). The possible spatial resolution using this description is determined by the order N of the series.
  • This representation is very flexible and can hold any type of audio information, e.g. traditional stereo signals or surround sound.
  • Loudspeaker channels are treated as a source at a distinct position (e.g. the loudspeaker's position).
  • an Ambisonics representation and particularly a Higher Order Ambisonics representation would enable such optimized playback of sound. It would also be desirable to minimize the effort for encoding, decoding and transcoding to and from Ambisonics representations, which also would minimize a loss of quality.
  • a conventional audio signal is enhanced by additional data or metadata, wherein the additional data comprise sound source position information that enables conversion of the conventional audio signal into a Higher Order Ambisonics (HOA) representation of the sound field.
  • a method for encoding or transmitting an audio signal comprises steps of providing one or more audio source signals, determining respective position information and encoding or transmitting the audio source signals together with metadata that comprise the determined audio source position information.
  • the method for encoding or transmitting an audio signal further comprises steps of determining the number of source channels that are required for Ambisonics encoding, said number being (2N+1) for the 2D case and (N+1)^2 for the 3D case (N is the order of Ambisonics encoding), determining the number of available transmission or storage channels, and comparing the number of source channels required for Ambisonics encoding of the order N with the number of available transmission or storage channels, depending on said comparison, generating a mode decision information having either a first value if the number of source channels required for Ambisonics encoding of the order N is not less than the number of available transmission or storage channels, or having a different second value otherwise, and generating, and storing or transmitting, the Ambisonics encoded version of the audio signal if the mode decision information has said first value, or otherwise storing or transmitting the received or retrieved audio signal.
  • In one embodiment where the Ambisonics encoded version of the audio signal is transmitted, the mode decision information will also be transmitted.
  • a method for processing an audio signal comprises steps of receiving or retrieving from storage an encoded audio signal, extracting first audio source signals and additional information from the received or retrieved signal, wherein the first audio source signals relate to first audio source positions provided by the additional information, transforming the first audio source signals relating to first audio source positions into second audio source signals relating to different second audio source positions, and supplying said second audio source signals for storage or playback.
  • the step of transforming the first audio source signals into second audio source signals comprises generating an Ambisonics representation of the sound field from a conventional audio source signal and said additional information describing the positions of sound sources, wherein the Ambisonics signal can be of higher order (HOA).
  • In a general audio production chain, as shown in Fig.1, acquisition of audio signals is achieved by one or more microphones M.
  • the audio signals are encoded E, stored S and later decoded D for reproduction via one or more loudspeakers LS.
  • each of the audio signals in the decoded signal relates to a particular loudspeaker.
  • the audio signals relate to the left and right microphones and loudspeakers.
  • Fig.2 a) shows a conventional loudspeaker setup for playback in stereo, and Fig.2 b) a conventional loudspeaker setup for surround sound, also known as 5.0 format.
  • the optimal angles between loudspeakers are subject to convention.
  • the respective audio signals relate to specific relative positions that cannot be changed for a given reproduction system. That is, loudspeaker boxes need to be positioned according to these fixed relative positions in order to optimize the sound reproduction.
  • Ambisonics and particularly HOA
  • the audio content is independent from the loudspeaker setup. Thus, it has to be processed to match a given setup, wherein it will be optimized to match this setup.
  • Second, 3D representation and high spatial resolution of audio content is fully supported.
  • a general Ambisonics based system is shown in Fig.3 .
  • a microphone array MA acquires the signals in a spatial manner.
  • Position information P describing the microphone positions is added in an encoder E, which generates an Ambisonics representation 30 of given order N of the signals.
  • This signal can be transcoded TR into a conventional audio signal having a desired number of channels that relate to desired positions of the loudspeakers.
  • the higher the order N of the Ambisonics representation, the better the localization of the different loudspeaker channels.
  • the order N and the spatial positions (2-dimensional or 3-dimensional) have also an impact on the number of channels that the Ambisonics signal 30 requires, as described below.
  • the conventional audio signal can be reproduced LSA on a loudspeaker array that may but need not correspond to the microphone array. However, positions of the loudspeakers must be known for transcoding the signal.
  • a receiver performs a conversion from a given audio source arrangement to a required audio target arrangement, such as an individual loudspeaker arrangement.
  • An encoder or transmitter provides a conventional audio signal with one or more microphone/loudspeaker related channels (such as 5.0) and attached position information that defines the positions of the microphone/loudspeaker of each channel.
  • this signal can be converted to an Ambisonics representation, and in particular to a HOA (Higher Order Ambisonics) representation.
  • This signal can be stored or transcoded/re-mapped for a desired channel and position configuration (e.g. according to an actual loudspeaker configuration or a particular configuration desired for other reasons). It is possible to select at the receiver side whether the Ambisonics representation or the conventional representation with additional position information shall be stored or further processed.
  • the transmitted/received signal is backward compatible, since it can be decoded by conventional receivers that ignore the additional metadata information, and it uses practically the same bandwidth as a conventional audio signal, since the additional metadata information is very little compared to the audio information (although it may be transmitted frequently, e.g. in fixed time intervals such as once every second, or every k audio frames).
  • the conversion from a given audio source arrangement into a HOA representation can be performed before transmitting, so that either the conventional audio signal plus position information or the generic HOA representation of the audio signal is transmitted.
  • the latter is preferred if the required number of transmission channels is equal for both formats.
  • the transmission signal comprises a mode indication showing whether the audio format is HOA or conventional, because two different formats are possible.
  • the receiver extracts and evaluates the indication and performs the further processing according to the mode indication.
  • An audio processing system according to one embodiment of the invention is shown in Fig.6 a).
  • a signal as described above, having one or more audio source signals X src and position information r src giving the positions of the audio source signals is received, and multiplexed 62 into a common signal 60.
  • This signal can be stored (not shown) or transmitted, e.g. between different devices in a network.
  • the signal is demultiplexed 63, wherein the audio source signals X' src and position information r' src are regained, and these are input into an Ambisonics encoder 64 that generates an Ambisonics signal 61.
  • the order of this signal may be determined according to the number of available positions r' src , but can also be influenced by available storage area and/or processing bandwidth.
  • the Ambisonics encoder 64 can be a HOA encoder, i.e. N>1.
  • the Ambisonics signal is fed into a transcoder TR and there re-mapped to a given loudspeaker configuration, and output to conventional multi-channel audio processing and loudspeakers LSA.
  • An advantage of this processing system is that the step of re-mapping can easily be adapted to the actual loudspeaker configuration, so that after a change in this configuration the optimization can also be changed according to the new loudspeaker number and/or positions.
  • If a new loudspeaker is added to the reproduction system, its position information is provided to the transcoder and the Ambisonics signal can be re-mapped to match the new configuration.
  • the position information can be provided by user input (e.g. using a GUI), or by automatic loudspeaker position measuring systems.
  • E.g. relative loudspeaker positions can be determined by reproducing a reference signal at a known position and measuring the different signal run-times. For the 3D case, reproduction of three reference signals at three distinct known positions can be used for automatic loudspeaker location.
  • An audio processing system according to another embodiment of the invention is shown in Fig.6 b).
  • the signal 60 being composed of one or more audio source signals X src and position information r src giving the positions of the audio source signals is received and demultiplexed 62 into its components. It is possible to select S 6 whether these components or the HOA encoded signal representation 61 shall be used for the further processing, such as optional storing 66, transcoding and multi-channel audio processing. As described above, it may be advantageous to store the generic HOA representation, depending on the application.
  • the selection signal 65 may depend on the parameters mentioned below, such as required order N, number of source positions or number of target (loudspeaker) positions.
  • the spatial positions refer to a spherical coordinate system.
  • the distance of sources (audio signals on the encoding side, loudspeakers on the playback side) is not taken into account for the sake of clarity. However, it is easily integrated to this encoding scheme using a known distance coding scheme, e.g. that of Jerome Daniel (reference cited above).
  • the directional position of the individual speakers is given by $\theta_i, \phi_i$ in spherical coordinates, and $Y_n^m(\theta_i, \phi_i)$ is the spherical harmonic function.
  • the HOA coefficient transmission is usually done by transmission of the individual vector elements of A resulting from the Fourier-Bessel representation. This results in a possibly higher number of channels than formerly required and very high numbers of channels for high orders.
  • the invention results in a new audio format that is backward compatible to existing audio content. It is capable of holding audio content with full 3D information and any high spatial resolution, and therefore it is forward compatible with any audio content.
  • the invention aims to provide an efficient solution to this problem: for 2.0 and 5.1 audio content it is generally less expensive in terms of channel/storage capacity to transmit/store the original audio information and additionally the locations of the sources.
  • the result is a parameterised HOA signal representation at lowest cost.
  • the full HOA representation is calculated on the receiver side, if necessary.
  • a smooth mode selection is proposed that allows selection of the best possible format.
  • the HOA representation also provides the advantage of scalability.
  • a format like 22.2 requires 24 audio channels, as stated above.
  • the HOA representation is scalable in terms of the spatial resolution.
  • An audio channel of a traditional audio format like stereo is viewed upon here as an audio source with a distinct position.
  • An exemplary audio file format for HOA coefficients carrying several audio sources can be generated as follows:
  • O s source channels are required for HOA coefficient representation (see eq.3).
  • decoding of this new signal is done as follows:
  • the goal is to use a given number of channels available for transport in an optimal way. It is assumed that the number of source channels O src is higher than the number of available channels.
  • the integer number O chan defines the number of available channels.
  • Audio source signals X src and position information r src are provided, as described above, and can be encoded in two different modes: either they are multiplexed MX1 into a common data stream, so that a receiver is enabled to generate a HOA representation (since all the data necessary for a HOA representation are included), or a HOA representation is generated HOA e1 before transmission.
  • By means of a mode selection signal MD it is possible to select (S1, S2, S3) one of the two modes.
  • This mode decision signal is obtained by the above-described comparison CMP between the required number of channels O S and the available number of channels O C . The latter may be fixed or given.
  • the required number of channels O S is determined in a block eq4 according to equation 4 above, using the result of the block eq3 that performs equation 3 above, and a spatial arrangement information 2D3D indicating whether the spatial arrangement is 2-dimensional or 3-dimensional.
  • the spatial arrangement information 2D3D is also an input to the block eq3 that performs equation 3.
  • the mode decision information MD is multiplexed MX2 into the output data stream A enc so that all necessary information for proper decoding is contained.
  • a decoder according to one embodiment of the invention is shown in Fig.5 .
  • a signal A enc as encoded by the encoder of Fig.4 is demultiplexed DMX2 so that the mode decision information MD' is obtained.
  • the remaining signal is either demultiplexed into its audio and position components X' src ,r' src and then HOA encoded HOA e (if it was not HOA encoded), or it is directly used if it is already HOA encoded.
  • Switching means S 4 ,S 5 controlled by the mode decision information MD' switch between these modes.
  • the HOA encoded signal is provided to a transcoder, as described above.
  • a device for encoding or transmitting an audio signal comprises means for providing one or more audio source signals, means for determining for each of said audio source signals a specific position to which it relates, means for generating data sets containing the determined positions of the audio sources, and means for encoding or transmitting the data sets together with said audio source signals.
  • said data sets are suitable for calculating a generic audio field representation based on Ambisonics representation.
  • the device further comprises means for determining the number (O_s) of source channels that are required for Ambisonics encoding, said number being (2N+1) for the 2D case and (N+1)^2 for the 3D case, where N is the order of Ambisonics encoding, means for determining the number (O_c) of available transmission or storage channels, means for comparing the number (O_s) of source channels required for Ambisonics encoding of the order N with the number (O_c) of available transmission or storage channels; means for generating, depending on said comparison, a mode decision information (MD), having a first value if the number (O_s) of source channels required for Ambisonics encoding of the order N is not less than the number of available transmission or storage channels (O_c), or having a different second value otherwise, and means HOA e for generating, and means for storing or transmitting, the Ambisonics encoded version of the audio signal if the mode decision information has said first value, or
  • a device for processing audio signals comprises means for receiving or retrieving from storage encoded audio signals, means for extracting first audio source signals and additional information from the received or retrieved signals, wherein the first audio source signals relate to first audio source positions provided by the additional information, means (TRC) for transforming the first audio source signals relating to first audio source positions into second audio source signals relating to different second audio source positions, and means (LSA) for supplying said second audio source signals for storage or playback.
  • the invention can be used for all kinds of audio processing devices. These may be targeting music reproduction, but also voice reproduction, such as multi-channel teleconferencing systems.
  • spatial information can be added to conventional multi-channel audio signals, and scalability in terms of spatial resolution can be provided.

Abstract

Commonly used audio file formats relate directly to the audio channels, i.e. the audio information is stored like it is fed to the loudspeaker. However, for obtaining a general description of audio information, a paradigm shift is necessary. Instead of storing the loudspeaker channels, a new audio format may use the description of the spatial sound field. A method for encoding or transmitting an audio signal comprises the steps of providing one or more audio source signals (Xsrc), determining for each of said audio source signals a specific position to which it relates, generating data sets containing the determined positions (rsrc) of the audio sources, and encoding or transmitting (62) the data sets together with said audio source signals. Advantageously, this allows conversion into Higher Order Ambisonics (HOA) format and subsequent re-conversion into conventional audio data that can be adapted to a given loudspeaker configuration.

Description

    Field of the invention
  • This invention relates to a generic audio signal format, a method and an apparatus for encoding or transmitting and a method and an apparatus for processing the same.
    Background
  • Commonly used audio file formats relate directly to the audio channels, i.e. the audio information is stored as it is fed to the loudspeaker. The playback of such files is strictly bound to correct positioning of the loudspeakers. Audio formats like 2.0 (stereo) and 5.1 (surround sound) are able to reproduce a spatial impression of the audio content. However, this spatial impression is strictly two-dimensional (2D). Extensions of these audio formats with a higher number of audio channels, like 7.1 or 9.1, also stick to a 2D sound field representation. The newly developed 22.2 format [1] is capable of representing audio content with height information.
    [1] K. Hamasaki, T. Nishiguchi, R. Okumaura, Y. Nakayama: "Wide listening area with exceptional spatial sound quality of a 22.2 multichannel sound system" (Audio Engineering Society Preprints, Vienna, Austria, May 2007)
  • For obtaining a general description of audio information, a paradigm shift is necessary. Instead of storing the loudspeaker channels, a new audio format may use the description of the spatial sound field. A known solution for spatial sound field description is based on Higher Order Ambisonics (HOA), a technology that describes spatial sound fields using the coefficients of the FOURIER-BESSEL series (also known under different names [2]). The possible spatial resolution using this description is determined by the order N of the series. This representation is very flexible and can hold any type of audio information, e.g. traditional stereo signals or surround sound. Loudspeaker channels are treated as a source at a distinct position (e.g. the loudspeaker's position). It is generally known in the art how to convert conventional audio signals into HOA representations and vice versa, for which the positioning information of the audio signals is required. However, HOA files containing traditional audio formats may result in a higher number of audio channels than the original file. Therefore, traditional sound representations are usually used instead of HOA.
    [2] e.g. Jerome Daniel: Spatial Sound Encoding Including Near Field Effect: Introducing distance Coding Filters and a Viable, New Ambisonic Format. AES 23rd International Conference, Copenhagen, Denmark, 2003
    Summary of the Invention
  • It would be desirable to reproduce sound as close as possible to the original sound source, using a given loudspeaker configuration, and optimizing the possibilities of the given loudspeaker configuration. Further, it would be desirable to have a sound representation that can be adapted to different actual loudspeaker configurations, so that sound can be reproduced optimally in any case. According to one aspect of the invention, an Ambisonics representation and particularly a Higher Order Ambisonics representation would enable such optimized playback of sound. It would also be desirable to minimize the effort for encoding, decoding and transcoding to and from Ambisonics representations, which also would minimize a loss of quality.
  • According to one aspect of the invention, a conventional audio signal is enhanced by additional data or metadata, wherein the additional data comprise sound source position information that enables conversion of the conventional audio signal into a Higher Order Ambisonics (HOA) representation of the sound field. Advantageously, this allows subsequent re-conversion into conventional audio data that can be adapted to a given loudspeaker configuration (e.g. during said re-conversion).
  • According to another aspect of the invention, a method for encoding or transmitting an audio signal comprises steps of providing one or more audio source signals, determining respective position information and encoding or transmitting the audio source signals together with metadata that comprise the determined audio source position information.
  • According to yet another aspect of the invention, the method for encoding or transmitting an audio signal further comprises steps of determining the number of source channels that are required for Ambisonics encoding, said number being (2N+1) for the 2D case and (N+1)^2 for the 3D case (N is the order of Ambisonics encoding), determining the number of available transmission or storage channels, and comparing the number of source channels required for Ambisonics encoding of the order N with the number of available transmission or storage channels, depending on said comparison, generating a mode decision information having either a first value if the number of source channels required for Ambisonics encoding of the order N is not less than the number of available transmission or storage channels, or having a different second value otherwise, and generating, and storing or transmitting, the Ambisonics encoded version of the audio signal if the mode decision information has said first value, or otherwise storing or transmitting the received or retrieved audio signal. In one embodiment where the Ambisonics encoded version of the audio signal is transmitted, the mode decision information will also be transmitted. The order N may be determined as a function of a target number of reproduction channels, or from the number of available transmission or storage channels.
  • According to a further aspect of the invention, a method for processing an audio signal comprises steps of receiving or retrieving from storage an encoded audio signal, extracting first audio source signals and additional information from the received or retrieved signal, wherein the first audio source signals relate to first audio source positions provided by the additional information, transforming the first audio source signals relating to first audio source positions into second audio source signals relating to different second audio source positions, and supplying said second audio source signals for storage or playback.
  • According to one aspect of the invention, in the method for processing an audio signal, the step of transforming the first audio source signals into second audio source signals comprises generating an Ambisonics representation of the sound field from a conventional audio source signal and said additional information describing the positions of sound sources, wherein the Ambisonics signal can be of higher order (HOA).
  • Corresponding apparatuses that utilize the methods are disclosed in the following detailed description.
  • Advantageous embodiments of the invention are disclosed in the dependent claims, the following description and the figures.
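    The channel count and mode decision described in this summary can be illustrated with a minimal Python sketch. The function names, the constants MODE_HOA and MODE_CONVENTIONAL, and the rule used in max_order_for_channels are hypothetical and not taken from the application; the comparison direction simply follows the wording of the summary (first value if the required number is not less than the available number).

```python
import math

# Hypothetical labels for the two values of the mode decision information.
MODE_HOA = 1           # "first value": the Ambisonics encoded version is sent
MODE_CONVENTIONAL = 0  # "second value": the received audio signal is sent


def required_hoa_channels(order: int, three_d: bool) -> int:
    """Channels needed for Ambisonics order N: 2N+1 in 2D, (N+1)^2 in 3D."""
    if order < 0:
        raise ValueError("Ambisonics order must be non-negative")
    return (order + 1) ** 2 if three_d else 2 * order + 1


def mode_decision(order: int, three_d: bool, available_channels: int) -> int:
    """Compare required and available channels as worded in the summary."""
    required = required_hoa_channels(order, three_d)
    # Taken literally from the text: first value if required is
    # "not less than" the number of available channels.
    return MODE_HOA if required >= available_channels else MODE_CONVENTIONAL


def max_order_for_channels(available_channels: int, three_d: bool) -> int:
    """Assumption: pick the largest order whose channel count still fits."""
    if available_channels < 1:
        raise ValueError("at least one channel is needed")
    if three_d:
        return math.isqrt(available_channels) - 1
    return (available_channels - 1) // 2
```

    For order N = 2, for example, this gives 5 channels in the 2D case and 9 channels in the 3D case.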
    Brief description of the drawings
  • Exemplary embodiments of the invention are described with reference to the accompanying drawings, which show in
    • Fig.1 a general audio production chain;
    • Fig.2 conventional loudspeaker setup for playback in stereo and surround sound;
    • Fig.3 the principle of Ambisonics encoding;
    • Fig.4 an encoder according to one embodiment of the invention;
    • Fig.5 a decoder according to one embodiment of the invention;
    • Fig.6a an audio processing system according to one embodiment;
    • Fig.6b an audio processing system according to another embodiment; and
    • Fig.7 an audio transmission system according to one embodiment.
    Detailed description of the invention
  • In a general audio production chain, as shown in Fig.1, acquisition of audio signals is achieved by one or more microphones M. The audio signals are encoded E, stored S and later decoded D for reproduction via one or more loudspeakers LS. Conventionally, each of the audio signals in the decoded signal relates to a particular loudspeaker. E.g. in a stereo setup, also denominated as 2.0 (since it has two direction-related audio channels and no audio channel that is not direction-related), the audio signals relate to the left and right microphones and loudspeakers.
  • Fig.2 a) shows a conventional loudspeaker setup for playback in stereo, and Fig.2 b) a conventional loudspeaker setup for surround sound, also known as 5.0 format. By convention, the angle between the two stereo loudspeaker boxes must be 60° in order to reproduce the audio signal in the best possible manner. Similarly, in the 5.0 format the optimal angles between loudspeakers are subject to convention. Thus, the respective audio signals relate to specific relative positions that cannot be changed for a given reproduction system. That is, loudspeaker boxes need to be positioned according to these fixed relative positions in order to optimize the sound reproduction.
  • Using Ambisonics, and particularly HOA, as a general representation of audio content has the following advantages. First, the audio content is independent of the loudspeaker setup; it is processed to match a given setup and is thereby optimized for that setup. Second, 3D representation and high spatial resolution of audio content are fully supported.
  • A general Ambisonics-based system is shown in Fig.3. A microphone array MA acquires the signals in a spatial manner. Position information P describing the microphone positions is added in an encoder E, which generates an Ambisonics representation 30 of given order N of the signals. This signal can be transcoded TR into a conventional audio signal having a desired number of channels that relate to desired positions of the loudspeakers. The higher the order N of the Ambisonics representation, the better the localization of the different loudspeaker channels. However, the order N and the spatial positions (2-dimensional or 3-dimensional) also have an impact on the number of channels that the Ambisonics signal 30 requires, as described below. The conventional audio signal can be reproduced LSA on a loudspeaker array that may but need not correspond to the microphone array. However, the positions of the loudspeakers must be known for transcoding the signal.
  • There are various aspects of the invention. In one aspect, a receiver performs a conversion from a given audio source arrangement to a required audio target arrangement, such as an individual loudspeaker arrangement. An encoder or transmitter provides a conventional audio signal with one or more microphone/loudspeaker related channels (such as 5.0) and attached position information that defines the positions of the microphone/loudspeaker of each channel. At the receiver side this signal can be converted to an Ambisonics representation, and in particular to a HOA (Higher Order Ambisonics) representation. This signal can be stored or transcoded/re-mapped for a desired channel and position configuration (e.g. according to an actual loudspeaker configuration or a particular configuration desired for other reasons). It is possible to select at the receiver side whether the Ambisonics representation or the conventional representation with additional position information shall be stored or further processed.
  • Advantages of this aspect are that the transmitted/received signal is backward compatible, since it can be decoded by conventional receivers that ignore the additional metadata information, and that the transmitted/received signal uses practically the same bandwidth as a conventional audio signal, since the additional metadata information is very small compared to the audio information (although it may be transmitted frequently, e.g. in fixed time intervals such as once every second, or every k audio frames).
  • In another aspect of the invention, the conversion from a given audio source arrangement into a HOA representation can be performed before transmitting, so that either the conventional audio signal plus position information or the generic HOA representation of the audio signal is transmitted. The latter is preferred if the required number of transmission channels is equal for both formats. The transmission signal comprises a mode indication showing whether the audio format is HOA or conventional, because two different formats are possible. The receiver extracts and evaluates the indication and performs the further processing according to the mode indication.
  • An audio processing system according to one embodiment of the invention is shown in Fig.6 a). A signal as described above, having one or more audio source signals Xsrc and position information rsrc giving the positions of the audio source signals is received, and multiplexed 62 into a common signal 60. This signal can be stored (not shown) or transmitted, e.g. between different devices in a network. The signal is demultiplexed 63, wherein the audio source signals X'src and position information r'src are regained, and these are input into an Ambisonics encoder 64 that generates an Ambisonics signal 61. The order of this signal may be determined according to the number of available positions r'src, but can also be influenced by available storage area and/or processing bandwidth. In particular, it is advantageous that the Ambisonics encoder 64 can be a HOA encoder, i.e. N>1. The Ambisonics signal is fed into a transcoder TR and there re-mapped to a given loudspeaker configuration, and output to conventional multi-channel audio processing and loudspeakers LSA.
  • One advantage of this processing system is that the step of re-mapping can easily be adapted to the actual loudspeaker configuration, so that after a change in this configuration the optimization can also be changed according to the new loudspeaker number and/or positions. E.g. when a new loudspeaker is added to the reproduction system, its position information is provided to the transcoder and the Ambisonics signal can be re-mapped to match the new configuration.
  • The position information can be provided by user input (e.g. using a GUI), or by automatic loudspeaker position measuring systems. E.g. relative loudspeaker positions can be determined by reproducing a reference signal at a known position and measuring the different signal run-times. For the 3D case, reproduction of three reference signals at three distinct known positions can be used for automatic loudspeaker location.
  • An audio processing system according to another embodiment of the invention is shown in Fig.6 b). The signal 60 being composed of one or more audio source signals Xsrc and position information rsrc giving the positions of the audio source signals is received and demultiplexed 62 into its components. It is possible to select S6 whether these components or the HOA encoded signal representation 61 shall be used for the further processing, such as optional storing 66, transcoding and multi-channel audio processing. As described above, it may be advantageous to store the generic HOA representation, depending on the application. The selection signal 65 may depend on the parameters mentioned below, such as required order N, number of source positions or number of target (loudspeaker) positions.
  • The following section gives a brief overview on encoding and decoding HOA signals.
  • In the following, the spatial positions refer to a spherical coordinate system. For the sake of clarity, the distance of sources (audio signals on the encoding side, loudspeakers on the playback side) is not taken into account. However, it can easily be integrated into this encoding scheme using a known distance coding scheme, e.g. that of Jerome Daniel (reference cited above).
  • HOA encoding of audio signals is done using
    $$\Psi\, w = A$$
    where Ψ is the mode matrix, w holds the speaker signals and A holds the resulting HOA coefficients. The HOA coefficients in A are arranged in this order:
    $$A = \left[A_0^0,\ A_1^{-1},\ A_1^0,\ A_1^1,\ \dots\right]^T$$
    Vector A holds
    $$O = \begin{cases} (N+1)^2 & \text{for 3D representation} \\ 2N+1 & \text{for 2D representation} \end{cases}$$
    elements. The speaker signals w are arranged as
    $$w = \left[w_1(t),\ w_2(t),\ \dots,\ w_L(t)\right]^T$$
    where L is the number of loudspeakers. As an example, a stereo signal is simply described as $w = [w_1(t),\ w_2(t)]^T$ with left and right channel respectively. Finally, the mode matrix Ψ contains
    $$\Psi = \left[\psi_1,\ \psi_2,\ \dots,\ \psi_L\right]^T$$
    where $\psi_i$ with $i = 1 \dots L$ are the mode vectors for the individual speaker positions, containing
    $$\psi_i = \left[Y_0^0(\theta_i,\phi_i),\ Y_1^{-1}(\theta_i,\phi_i),\ Y_1^0(\theta_i,\phi_i),\ Y_1^1(\theta_i,\phi_i),\ \dots\right]^T$$
  • The directional position of the individual speakers is given by $(\theta_i, \phi_i)$ in spherical coordinates, and $Y_n^m(\theta_i,\phi_i)$ is the spherical harmonic function. The position of the speaker is referred to as $r_i = (r_i, \theta_i, \phi_i)$. As an example, the stereo setup of two loudspeakers is described by $r_{left} = (r, 90^\circ, -30^\circ)$ and $r_{right} = (r, 90^\circ, 30^\circ)$, where r denotes the speaker distance in meters, 90° is the declination angle and ∓30° is the azimuth angle.
  • The decoding of the HOA coefficients A is done using
    $$w_{dec} = D\,A$$
    where D is the decoding matrix. It is chosen as the pseudo inverse matrix
    $$D = \Psi^{\dagger}\left(\Psi\,\Psi^{\dagger}\right)^{-1}$$
    where † denotes the conjugate transpose. The property of the pseudo inverse
    $$\Psi\,D = I$$
    with I denoting the identity matrix ensures the proper reconstruction of w.
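The encoding and decoding relations above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the description: the 2D case is used with real-valued circular harmonics (1, cos mφ, sin mφ) in place of the spherical harmonics, the mode matrix is arranged with one column per position (O rows, L columns) so that the product with w is defined, and all function names are illustrative. Because the matrix is real, the conjugate transpose reduces to the ordinary transpose.

```python
import math

def mode_vector_2d(azimuth_deg, order):
    """Real circular-harmonic mode vector for one direction (assumed 2D basis)."""
    phi = math.radians(azimuth_deg)
    vec = [1.0]
    for m in range(1, order + 1):
        vec += [math.cos(m * phi), math.sin(m * phi)]
    return vec  # length 2 * order + 1

def mode_matrix(azimuths_deg, order):
    """O x L matrix whose columns are the mode vectors of the L positions."""
    columns = [mode_vector_2d(a, order) for a in azimuths_deg]
    return [list(row) for row in zip(*columns)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse(m):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    n = len(m)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def decoding_matrix(psi):
    """D = Psi^T (Psi Psi^T)^-1, the pseudo inverse for a real mode matrix."""
    psi_t = transpose(psi)
    return matmul(psi_t, inverse(matmul(psi, psi_t)))

# Five loudspeakers at typical 5.0 azimuths (degrees), Ambisonics order N = 1.
azimuths = [0.0, -30.0, 30.0, -110.0, 110.0]
psi = mode_matrix(azimuths, order=1)        # 3 x 5
w = [[0.2], [1.0], [0.5], [0.0], [0.1]]     # one sample per channel
A = matmul(psi, w)                          # 3 x 1 HOA coefficients
D = decoding_matrix(psi)                    # 5 x 3 decoding matrix
w_dec = matmul(D, A)                        # decoded loudspeaker signals
```

In this sketch there are more loudspeakers than HOA coefficients, so the decoded signals are the minimum-norm set that reproduces the same coefficients A; they are not necessarily identical to the original w.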
  • The HOA coefficient transmission is usually done by transmission of the individual vector elements of A resulting from the Fourier-Bessel representation. This results in a possibly higher number of channels than formerly required and very high numbers of channels for high orders. Existing ideas for HOA usage inside audio formats are therefore limited to an order of N = 1.
  • The invention results in a new audio format that is backward compatible with existing audio content. It is capable of holding audio content with full 3D information and arbitrarily high spatial resolution, and therefore it is forward compatible with any audio content.
  • Using a pure HOA signal representation to carry standard audio formats like 2.0 and 5.1 and some others has the disadvantage of a higher number of channels required for sound field representation. Therefore, in one embodiment of the invention, a parametric approach is suggested driven by the following: Stereo (2.0) is 2D and two audio channels are required, transport using HOA however requires O = 3 channels. Surround sound (5.1) is 2D and 6 audio channels are required, transport using HOA however requires O = 7 channels.
    New formats are capable of representing 3D information. As an example, 22.2 format requires 24 audio channels. For 3D HOA representation a number of O = 25 is necessary. The invention aims to provide an efficient solution to this problem: for 2.0 and 5.1 audio content it is generally less expensive in terms of channel/storage capacity to transmit/store the original audio information and additionally the locations of the sources. The result is a parameterised HOA signal representation at lowest cost. The full HOA representation is calculated on the receiver side, if necessary. Thus, a smooth mode selection is proposed that allows selection of the best possible format. The HOA representation also provides the advantage of scalability. A format like 22.2 requires 24 audio channels, as stated above. The order N defines the spatial resolution. To reduce the number of channels, the spatial resolution could be diminished. E.g. a HOA representation of order N = 3 would require only O = 16 channels. Generally, the HOA representation is scalable in terms of the spatial resolution.
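The channel counts quoted above follow from the relation between the order N and the number of HOA coefficients O. The following Python check is a small sketch; the function name `hoa_channels` is illustrative, and the orders used are those implied by the quoted channel counts.

```python
def hoa_channels(order, three_d):
    """Number O of HOA coefficients (channels) for Ambisonics order N."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2 if three_d else 2 * order + 1

stereo = hoa_channels(1, three_d=False)        # 2.0 content, 2D, N = 1
surround = hoa_channels(3, three_d=False)      # 5.1 content, 2D, N = 3
format_22_2 = hoa_channels(4, three_d=True)    # 22.2 content, 3D, N = 4
reduced = hoa_channels(3, three_d=True)        # reduced resolution, 3D, N = 3
```

The four values are 3, 7, 25 and 16, matching the figures given in the text.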
  • An audio channel of a traditional audio format like stereo is regarded here as an audio source with a distinct position. An exemplary audio file format for HOA coefficients carrying several audio sources can be generated as follows:
    1. Determine whether the arrangement of signal positions is planar (2D) or spatial (3D). The listener's position can also be taken into account. The signal comprises Osrc channels, which is 2 in the stereo case. Depending on this information the necessary order N is calculated using
      $$N = \begin{cases} \left\lceil \sqrt{O_{src}} - 1 \right\rceil & \text{for 3D} \\ \left\lceil (O_{src} - 1)/2 \right\rceil & \text{for 2D} \end{cases}$$
      Using this order N in turn yields
      $$O_s = \begin{cases} (N+1)^2 & \text{for 3D representation} \\ 2N+1 & \text{for 2D representation} \end{cases}$$
  • Os source channels are required for HOA coefficient representation (see eq.3).
    2. The minimum number of channels for transport or storage is achieved as follows:
  • If Os < Oc, where Oc is the number of available transmission or storage channels, then store a file containing
    (a) the positions of the sources (see the example above)
    (b) whether the arrangement is 2D or 3D (implicitly given by the source positions)
    (c) the order N (implicitly given by the number of sources)
    (d) the unprocessed audio signals
      otherwise perform HOA encoding of the audio signals using order N as described above and store the HOA coefficients.
      The result is an audio file with the minimum number of required channels. Depending on the playback device, the audio content is HOA encoded using the additionally stored parameters and then transcoded to the loudspeaker setup, or it is played back unprocessed (e.g. stereo content on a stereo device). In one embodiment, the file is converted into a signal for transmission using the following steps:
    • 3. If Os < Oc, the signal is multiplexed using a first multiplexer MX1, else the signal is HOA encoded.
    • 4. The result of the former step is multiplexed with a mode indication indicating the result of condition Os < Oc (i.e. the encoding mode) using a multiplexer MX2.
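The encoder-side steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the order is rounded up to the next integer (which reproduces the channel counts 3, 7 and 25 quoted for 2.0, 5.1 and 22.2), the output stream is modelled as a plain dict, and `encode_stream`, its keys and the `hoa_encoder` callable are hypothetical names, not part of the described format.

```python
import math

def hoa_channels(order, three_d):
    """O = (N+1)^2 for 3D, 2N+1 for 2D."""
    return (order + 1) ** 2 if three_d else 2 * order + 1

def required_order(n_src, three_d):
    """Smallest order N whose HOA representation has at least n_src channels.

    Rounding up is an assumption; the integer expressions below equal
    ceil(sqrt(n_src) - 1) for 3D and ceil((n_src - 1) / 2) for 2D.
    """
    if n_src < 1:
        raise ValueError("at least one source channel is required")
    return math.isqrt(n_src - 1) if three_d else n_src // 2

def encode_stream(signals, positions, three_d, o_c, hoa_encoder):
    """Return a dict standing in for the multiplexed output stream.

    hoa_encoder is a hypothetical callable
    (signals, positions, order) -> HOA coefficients.
    """
    order = required_order(len(signals), three_d)
    o_s = hoa_channels(order, three_d)
    os_lt_oc = o_s < o_c
    if os_lt_oc:
        # MX1: keep the unprocessed signals and attach their positions.
        payload = {"signals": signals, "positions": positions, "order": order}
    else:
        # HOA encode before storage or transmission.
        payload = {"coefficients": hoa_encoder(signals, positions, order), "order": order}
    # MX2: attach the mode indication (the result of Os < Oc).
    return {"os_lt_oc": os_lt_oc, "payload": payload}

# Usage with a stand-in encoder that only produces the right number of entries.
def stub_encoder(signals, positions, order):
    return [[0.0] for _ in range(hoa_channels(order, False))]

stereo = encode_stream([[0.1], [0.2]],
                       [(1.0, 90.0, -30.0), (1.0, 90.0, 30.0)],
                       three_d=False, o_c=5, hoa_encoder=stub_encoder)
six = encode_stream([[0.0]] * 6,
                    [(1.0, 90.0, float(a)) for a in range(0, 360, 60)],
                    three_d=False, o_c=5, hoa_encoder=stub_encoder)
```

With five available channels, the stereo input (Os = 3) is passed on unprocessed together with its positions, while the six-channel input (Os = 7) is HOA encoded, following the condition Os < Oc as stated in the text.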
  • In one embodiment, decoding of this new signal is done as follows:
    1. To extract the encoding mode information (i.e. the result of Os < Oc), DMX2 is used.
    2. If Os < Oc is true (i.e. the signal was multiplexed by MX1 and is not yet HOA encoded), the signal is demultiplexed using DMX1 and then HOA encoded. Otherwise it is already HOA encoded and can be transcoded directly.
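The two decoding steps can be sketched in Python as follows. The stream layout (a dict with the keys `os_lt_oc` and `payload`) and the `hoa_encoder` callable are hypothetical stand-ins chosen for illustration; the description does not prescribe a container format.

```python
def decode_stream(stream, hoa_encoder):
    """Return HOA coefficients ready for transcoding.

    stream is a hypothetical dict {"os_lt_oc": bool, "payload": dict};
    hoa_encoder is a hypothetical callable (signals, positions) -> coefficients.
    """
    os_lt_oc = stream["os_lt_oc"]      # DMX2: extract the mode indication
    payload = stream["payload"]
    if os_lt_oc:
        # DMX1: split into audio signals and positions, then HOA encode.
        return hoa_encoder(payload["signals"], payload["positions"])
    # Already HOA encoded: pass the coefficients on to the transcoder.
    return payload["coefficients"]

# Usage with a stand-in encoder that only reports what it received.
def fake_encoder(signals, positions):
    return ("encoded", len(signals), len(positions))

raw = {"os_lt_oc": True,
       "payload": {"signals": [[0.1], [0.2]],
                   "positions": [(1.0, 90.0, -30.0), (1.0, 90.0, 30.0)]}}
hoa = {"os_lt_oc": False,
       "payload": {"coefficients": [[0.3], [0.0], [0.1]]}}

from_raw = decode_stream(raw, fake_encoder)
from_hoa = decode_stream(hoa, fake_encoder)
```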
  • Another aspect of usage of the invention is described in the following. The goal is to use a given number of channels available for transport in an optimal way. It is assumed that the number of source channels Osrc is higher than the number of available channels.
  • A vector rsrc holds all positions of source channels, e.g. L sources are described using the positions
    $$r_{src} = (r_1,\ r_2,\ \dots,\ r_L)$$
    with positions $r_i = (r_i, \theta_i, \phi_i)$ of source number i. All positions are assumed to be different from each other (otherwise the situation is trivial, since two sources with the same position can be added into one). Using spherical coordinates is not mandatory, though.
  • A vector xsrc holds all channels belonging to the source, e.g. L time signals
    $$x_{src} = (x_1(t),\ x_2(t),\ \dots,\ x_L(t))$$
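The remark that two sources with the same position can be added into one can be sketched in Python as follows. The function name `merge_coincident` is illustrative, positions are assumed to be tuples (r, θ, φ) that compare exactly equal when they coincide, and each signal is assumed to be a list of samples of equal length.

```python
def merge_coincident(positions, signals):
    """Sum the signals of sources that share exactly the same position."""
    merged = {}
    for pos, sig in zip(positions, signals):
        if pos in merged:
            # Same position seen before: add sample by sample.
            merged[pos] = [a + b for a, b in zip(merged[pos], sig)]
        else:
            merged[pos] = list(sig)
    # Dicts preserve insertion order, so the first occurrence fixes the order.
    return list(merged.keys()), list(merged.values())

positions = [(1.0, 90.0, -30.0), (1.0, 90.0, 30.0), (1.0, 90.0, -30.0)]
signals = [[1.0, 2.0], [0.5, 0.5], [0.25, 0.25]]
r_src, x_src = merge_coincident(positions, signals)
```

After merging, all remaining positions differ from each other, which is the assumption made above.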
  • The integer number Ochan defines the number of available channels. The order N available for a HOA description of signal vector xsrc is calculated using
    $$N = \begin{cases} \left\lfloor \sqrt{O_{chan}} - 1 \right\rfloor & \text{for 3D} \\ \left\lfloor (O_{chan} - 1)/2 \right\rfloor & \text{for 2D} \end{cases}$$
  • Using this order N in turn yields
    $$O_c = \begin{cases} (N+1)^2 & \text{for 3D representation} \\ 2N+1 & \text{for 2D representation} \end{cases}$$
  • This is the number of channels to use to describe a HOA signal with order N as calculated above. Encoding of a signal xsrc according to eq.11 with positional description rsrc, as described by eq.10, is done as follows. In this case the signal is adapted to the channel properties.
    1. Use all positions in rsrc to determine whether the arrangement is 2D or 3D.
    2. Using this result and the given number Ochan of transport channels, the HOA representation of the source with maximum spatial resolution requires Oc signals following eq.13.
      If Os > Oc, this encoding ensures that the given channels are used with the maximum possible spatial resolution of the audio sources.
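The adaptation to a given number of transport channels can be sketched in Python as follows. The sketch assumes that the order is rounded down, so that the resulting number of HOA channels Oc never exceeds Ochan; the function names are illustrative.

```python
import math

def hoa_channels(order, three_d):
    """O = (N+1)^2 for 3D, 2N+1 for 2D."""
    return (order + 1) ** 2 if three_d else 2 * order + 1

def max_order(n_chan, three_d):
    """Largest order N whose HOA representation fits into n_chan channels.

    Rounding down is an assumption; the integer expressions equal
    floor(sqrt(n_chan) - 1) for 3D and floor((n_chan - 1) / 2) for 2D.
    """
    if n_chan < 1:
        raise ValueError("at least one channel is required")
    return math.isqrt(n_chan) - 1 if three_d else (n_chan - 1) // 2

n_3d = max_order(24, three_d=True)     # 24 transport channels, 3D
o_c_3d = hoa_channels(n_3d, True)
n_2d = max_order(6, three_d=False)     # 6 transport channels, 2D
o_c_2d = hoa_channels(n_2d, False)
```

For 24 available channels in the 3D case this gives N = 3 and Oc = 16; for 6 available channels in the 2D case it gives N = 2 and Oc = 5.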
  • Mixing different HOA representations with different orders is possible. This is useful if different audio contents are encoded with different effort regarding spatial resolution. E.g. for a computer game, environment noise needs only low spatial resolution, whereas the audio information of the actor in the game should be encoded with high resolution.
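One simple way to mix HOA representations of different orders is sketched below in Python. It assumes that both coefficient vectors use the same dimensionality (both 2D or both 3D), the same normalization, and the ordering by increasing order shown for vector A, so that the lower-order vector can be zero-padded and added. The function name `mix_hoa` and the sample values are illustrative only.

```python
def mix_hoa(a, b):
    """Sum two HOA coefficient vectors, zero-padding the shorter one."""
    size = max(len(a), len(b))
    padded_a = list(a) + [0.0] * (size - len(a))
    padded_b = list(b) + [0.0] * (size - len(b))
    return [x + y for x, y in zip(padded_a, padded_b)]

# One time sample of each representation (3D case).
ambience = [0.5, 0.25, 0.0, -0.25]                           # order 1: 4 coefficients
actor = [0.25, 0.0, 0.5, 0.0, 0.125, 0.0, 0.0, 0.0, 0.0]     # order 2: 9 coefficients
mixed = mix_hoa(ambience, actor)                             # order 2: 9 coefficients
```

The mixed signal has the order of the higher-order input; the low-resolution content simply contributes nothing to the higher-order coefficients.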
  • An encoder according to one embodiment of the invention is shown in Fig.4. Audio source signals Xsrc and position information rsrc are provided, as described above, and can be encoded in two different modes: either they are multiplexed MX1 into a common data stream, so that a receiver is enabled to generate a HOA representation (since all the data necessary for a HOA representation are included), or a HOA representation is generated HOAe1 before transmission. Depending on a mode selection signal MD it is possible to select S1,S2,S3 one of the two modes. This mode decision signal is obtained by the above-described comparison CMP between the required number of channels OS and the available number of channels OC. The latter may be fixed or given. The required number of channels OS is determined in a block eq4 according to equation 4 above, using the result of the block eq3 that performs equation 3 above, and a spatial arrangement information 2D3D indicating whether the spatial arrangement is 2-dimensional or 3-dimensional. The spatial arrangement information 2D3D is also an input to the block eq3 that performs equation 3. Finally, the mode decision information MD is multiplexed MX2 into the output data stream Aenc so that all necessary information for proper decoding is contained.
  • A decoder according to one embodiment of the invention is shown in Fig.5. A signal Aenc as encoded by the encoder of Fig.4 is demultiplexed DMX2 so that the mode decision information MD' is obtained. Depending on this information MD' the remaining signal is either demultiplexed into its audio and position components X'src,r'src and then HOA encoded HOAe (if it was not HOA encoded), or it is directly used if it is already HOA encoded. Switching means S4,S5 controlled by the mode decision information MD' switch between these modes. The HOA encoded signal is provided to a transcoder, as described above.
  • In one embodiment, a device for encoding or transmitting an audio signal, comprises means for providing one or more audio source signals, means for determining for each of said audio source signals a specific position to which it relates, means for generating data sets containing the determined positions of the audio sources, and means for encoding or transmitting the data sets together with said audio source signals.
  • In one embodiment, said data sets are suitable for calculating a generic audio field representation based on Ambisonics representation.
  • In one embodiment, the device further comprises means for determining the number (Os) of source channels that are required for Ambisonics encoding, said number being (2N+1) for the 2D case and (N+1)² for the 3D case, where N is the order of Ambisonics encoding, means for determining the number (Oc) of available transmission or storage channels,
    means for comparing the number (Os) of source channels required for Ambisonics encoding of the order N with the number (Oc) of available transmission or storage channels; means for generating, depending on said comparison, a mode decision information (MD), having a first value if the number (Os) of source channels required for Ambisonics encoding of the order N is not less than the number of available transmission or storage channels (Oc), or having a different second value otherwise, and means HOAe for generating, and means for storing or transmitting, the Ambisonics encoded version of the audio signal if the mode decision information has said first value, or otherwise storing or transmitting the received or retrieved audio signal.
  • In another embodiment, a device for processing audio signals comprises means for receiving or retrieving from storage encoded audio signals, means for extracting first audio source signals and additional information from the received or retrieved signals, wherein the first audio source signals relate to first audio source positions provided by the additional information, means (TRC) for transforming the first audio source signals relating to first audio source positions into second audio source signals relating to different second audio source positions, and means (LSA) for supplying said second audio source signals for storage or playback.
  • The invention can be used for all kinds of audio processing devices. These may be targeting music reproduction, but also voice reproduction, such as multi-channel teleconferencing systems. Advantageously, spatial information can be added to conventional multi-channel audio signals, and scalability in terms of spatial resolution can be provided.
  • It will be understood that the present invention has been described purely by way of example, and modifications of detail can be made without departing from the scope of the invention.
    Each feature disclosed in the description and (where appropriate) the claims and drawings may be provided independently or in any appropriate combination. Features may, where appropriate be implemented in hardware, software, or a combination of the two. Connections may, where applicable, be implemented as wireless connections or wired, not necessarily direct or dedicated, connections. Reference numerals appearing in the claims are by way of illustration only and shall have no limiting effect on the scope of the claims.

Claims (13)

  1. Audio signal (60) comprising one or more audio source signals (Xsrc) relating to specific positions and additional data (rsrc), characterized in that the additional data define said specific positions to which the audio source signals relate.
  2. Audio signal according to claim 1, wherein said additional data (rsrc) are suitable for calculating a generic audio field representation based on Ambisonics representation.
  3. Method for encoding or transmitting an audio signal, comprising the steps of
    - providing one or more audio source signals (Xsrc);
    characterized in the further steps of
    - determining for each of said audio source signals a specific position to which it relates;
    - generating data sets containing the determined positions (rsrc) of the audio sources; and
    - encoding or transmitting (62) the data sets together with said audio source signals.
  4. Method according to claim 3, wherein said data sets are suitable for calculating a generic audio field representation based on Ambisonics representation.
  5. Method according to claim 3 or 4, further comprising the steps of
    - determining the number (Os) of source channels that are required for Ambisonics encoding, said number being (2N+1) for the 2D case and (N+1)² for the 3D case, where N is the order of Ambisonics encoding;
    - determining the number (Oc) of available transmission or storage channels; and
    - comparing the number (Os) of source channels required for Ambisonics encoding of the order N with the number (Oc) of available transmission or storage channels;
    - depending on said comparison, generating a mode decision information (MD), having a first value if the number (OS) of source channels required for Ambisonics encoding of the order N is not less than the number of available transmission or storage channels (OC), or having a different second value otherwise; and
    - generating (HOAe, 64), and storing or transmitting, the Ambisonics encoded version (61) of the audio signal if the mode decision information (MD,65) has said first value, or otherwise storing or transmitting the received or retrieved audio signal (60).
  6. Method according to claim 5, wherein the Ambisonics encoded version (61) of the audio signal is transmitted, further comprising the step of transmitting said mode decision information (MD).
  7. Method according to claim 5, further comprising the steps of
    - determining a number of sources, such as microphones, or target reproduction channels, such as loudspeakers;
    - selecting N being the order of HOA encoding as a function of the determined number of sources or reproduction channels.
  8. Method according to claim 5, wherein N is determined from the number (Oc) of available transmission or storage channels.
  9. Method for processing an audio signal, comprising the steps of
    - receiving or retrieving from storage an encoded audio signal (60);
    characterized in the further steps of
    - extracting (63) first audio source signals (X'src) and additional information (r'src) from the received or retrieved signal, wherein the first audio source signals (X'src) relate to first audio source positions provided by the additional information (r'src) ;
    - transforming (TRC) the first audio source signals relating to first audio source positions into second audio source signals relating to different second audio source positions; and
    - supplying (LSA) said second audio source signals for storage or playback.
  10. Method according to claim 9, wherein a generic audio field representation is calculated.
  11. Method according to claim 10, wherein the generic audio field representation is an Ambisonics representation (HOA) of an order higher than one.
  12. Apparatus for encoding or transmitting an audio signal, using a method according to claims 3-8.
  13. Apparatus for processing an audio signal, using a method according to any of the claims 9-11.
EP08101732A 2008-02-19 2008-02-19 Audio signal, method and apparatus for encoding or transmitting the same and method and apparatus for processing the same Withdrawn EP2094032A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP08101732A EP2094032A1 (en) 2008-02-19 2008-02-19 Audio signal, method and apparatus for encoding or transmitting the same and method and apparatus for processing the same

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP08101732A EP2094032A1 (en) 2008-02-19 2008-02-19 Audio signal, method and apparatus for encoding or transmitting the same and method and apparatus for processing the same

Publications (1)

Publication Number Publication Date
EP2094032A1 true EP2094032A1 (en) 2009-08-26

Family

ID=39543036

Family Applications (1)

Application Number Title Priority Date Filing Date
EP08101732A Withdrawn EP2094032A1 (en) 2008-02-19 2008-02-19 Audio signal, method and apparatus for encoding or transmitting the same and method and apparatus for processing the same

Country Status (1)

Country Link
EP (1) EP2094032A1 (en)

Cited By (215)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011117399A1 (en) * 2010-03-26 2011-09-29 Thomson Licensing Method and device for decoding an audio soundfield representation for audio playback
US8452037B2 (en) 2010-05-05 2013-05-28 Apple Inc. Speaker clip
US8560309B2 (en) 2009-12-29 2013-10-15 Apple Inc. Remote conferencing center
WO2014014891A1 (en) * 2012-07-16 2014-01-23 Qualcomm Incorporated Loudspeaker position compensation with 3d-audio hierarchical coding
US8644519B2 (en) 2010-09-30 2014-02-04 Apple Inc. Electronic devices with improved audio
US8811648B2 (en) 2011-03-31 2014-08-19 Apple Inc. Moving magnet audio transducer
US8858271B2 (en) 2012-10-18 2014-10-14 Apple Inc. Speaker interconnect
US8879761B2 (en) 2011-11-22 2014-11-04 Apple Inc. Orientation-based audio
US8892446B2 (en) 2010-01-18 2014-11-18 Apple Inc. Service orchestration for intelligent automated assistant
US8903108B2 (en) 2011-12-06 2014-12-02 Apple Inc. Near-field null and beamforming
AU2014265108A1 (en) * 2010-03-26 2014-12-11 Dolby International Ab Method and device for decoding an audio soundfield representation for audio playback
US8942410B2 (en) 2012-12-31 2015-01-27 Apple Inc. Magnetically biased electromagnet for audio applications
US8977584B2 (en) 2010-01-25 2015-03-10 Newvaluexchange Global Ai Llp Apparatuses, methods and systems for a digital conversation management platform
US8989428B2 (en) 2011-08-31 2015-03-24 Apple Inc. Acoustic systems in electronic devices
US9007871B2 (en) 2011-04-18 2015-04-14 Apple Inc. Passive proximity detection
US9020163B2 (en) 2011-12-06 2015-04-28 Apple Inc. Near-field null and beamforming
US9204236B2 (en) 2011-07-01 2015-12-01 Dolby Laboratories Licensing Corporation System and tools for enhanced 3D audio authoring and rendering
US9262612B2 (en) 2011-03-21 2016-02-16 Apple Inc. Device access using voice authentication
US9288603B2 (en) 2012-07-15 2016-03-15 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for backward-compatible audio coding
US9300784B2 (en) 2013-06-13 2016-03-29 Apple Inc. System and method for emergency calls initiated by voice command
US9330720B2 (en) 2008-01-03 2016-05-03 Apple Inc. Methods and apparatus for altering audio output signals
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US9357299B2 (en) 2012-11-16 2016-05-31 Apple Inc. Active protection for acoustic device
US9368114B2 (en) 2013-03-14 2016-06-14 Apple Inc. Context-sensitive handling of interruptions
JP2016520864A (en) * 2013-04-29 2016-07-14 トムソン ライセンシングThomson Licensing Method and apparatus for compressing and decompressing higher-order ambisonics representations
US9430463B2 (en) 2014-05-30 2016-08-30 Apple Inc. Exemplar-based natural language processing
US9451354B2 (en) 2014-05-12 2016-09-20 Apple Inc. Liquid expulsion from an orifice
US9483461B2 (en) 2012-03-06 2016-11-01 Apple Inc. Handling speech synthesis of content for multiple languages
US9495129B2 (en) 2012-06-29 2016-11-15 Apple Inc. Device, method, and user interface for voice-activated navigation and browsing of a document
US9502031B2 (en) 2014-05-27 2016-11-22 Apple Inc. Method for supporting dynamic grammars in WFST-based ASR
KR20160145646A (en) * 2014-04-11 2016-12-20 삼성전자주식회사 Method and apparatus for rendering sound signal, and computer-readable recording medium
US9525943B2 (en) 2014-11-24 2016-12-20 Apple Inc. Mechanically actuated panel acoustic system
US9535906B2 (en) 2008-07-31 2017-01-03 Apple Inc. Mobile device having human language translation capability with positional feedback
US9576574B2 (en) 2012-09-10 2017-02-21 Apple Inc. Context-sensitive handling of interruptions by intelligent digital assistant
US9582608B2 (en) 2013-06-07 2017-02-28 Apple Inc. Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
US9620104B2 (en) 2013-06-07 2017-04-11 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9620105B2 (en) 2014-05-15 2017-04-11 Apple Inc. Analyzing audio input for efficient speech and music recognition
US9626955B2 (en) 2008-04-05 2017-04-18 Apple Inc. Intelligent text-to-speech conversion
US9633004B2 (en) 2014-05-30 2017-04-25 Apple Inc. Better resolution when referencing to concepts
US9633674B2 (en) 2013-06-07 2017-04-25 Apple Inc. System and method for detecting errors in interactions with a voice-based digital assistant
US9633660B2 (en) 2010-02-25 2017-04-25 Apple Inc. User profiling for voice input processing
US9646614B2 (en) 2000-03-16 2017-05-09 Apple Inc. Fast, language-independent method for user authentication by voice
US9646609B2 (en) 2014-09-30 2017-05-09 Apple Inc. Caching apparatus for serving phonetic pronunciations
CN106663433A (en) * 2014-07-02 2017-05-10 高通股份有限公司 Reducing correlation between higher order ambisonic (HOA) background channels
US9668121B2 (en) 2014-09-30 2017-05-30 Apple Inc. Social reminders
US9697822B1 (en) 2013-03-15 2017-07-04 Apple Inc. System and method for updating an adaptive speech recognition model
US9697820B2 (en) 2015-09-24 2017-07-04 Apple Inc. Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks
US9711141B2 (en) 2014-12-09 2017-07-18 Apple Inc. Disambiguating heteronyms in speech synthesis
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US9721566B2 (en) 2015-03-08 2017-08-01 Apple Inc. Competing devices responding to voice triggers
US9734193B2 (en) 2014-05-30 2017-08-15 Apple Inc. Determining domain salience ranking from ambiguous words in natural speech
JP2017142520A (en) * 2013-05-29 2017-08-17 クゥアルコム・インコーポレイテッドQualcomm I Compression of decomposed representations of sound field
US9760559B2 (en) 2014-05-30 2017-09-12 Apple Inc. Predictive text input
US9785630B2 (en) 2014-05-30 2017-10-10 Apple Inc. Text prediction using combined word N-gram and unigram language models
US9798393B2 (en) 2011-08-29 2017-10-24 Apple Inc. Text correction processing
US9818400B2 (en) 2014-09-11 2017-11-14 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US9820033B2 (en) 2012-09-28 2017-11-14 Apple Inc. Speaker assembly
AU2016204408B2 (en) * 2010-03-26 2017-11-23 Dolby International Ab Method and device for decoding an audio soundfield representation for audio playback
CN107424618A (en) * 2012-07-16 2017-12-01 杜比国际公司 For the method, equipment and computer-readable medium decoded to HOA audio signals
US9842101B2 (en) 2014-05-30 2017-12-12 Apple Inc. Predictive conversion of language input
US9842105B2 (en) 2015-04-16 2017-12-12 Apple Inc. Parsimonious continuous-space phrase representations for natural language processing
US9858948B2 (en) 2015-09-29 2018-01-02 Apple Inc. Electronic equipment with ambient noise sensing input circuitry
US9858925B2 (en) 2009-06-05 2018-01-02 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US9865280B2 (en) 2015-03-06 2018-01-09 Apple Inc. Structured dictation using intelligent automated assistants
US9881628B2 (en) 2016-01-05 2018-01-30 Qualcomm Incorporated Mixed domain coding of audio
US9886432B2 (en) 2014-09-30 2018-02-06 Apple Inc. Parsimonious handling of word inflection via categorical stem + suffix N-gram language models
US9886953B2 (en) 2015-03-08 2018-02-06 Apple Inc. Virtual assistant activation
US9899019B2 (en) 2015-03-18 2018-02-20 Apple Inc. Systems and methods for structured stem and suffix language models
US9900698B2 (en) 2015-06-30 2018-02-20 Apple Inc. Graphene composite acoustic diaphragm
US9922656B2 (en) 2014-01-30 2018-03-20 Qualcomm Incorporated Transitioning of ambient higher-order ambisonic coefficients
US9922642B2 (en) 2013-03-15 2018-03-20 Apple Inc. Training an at least partial voice command system
US9934775B2 (en) 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
GB2554446A (en) * 2016-09-28 2018-04-04 Nokia Technologies Oy Spatial audio signal format generation from a microphone array using adaptive capture
US9953088B2 (en) 2012-05-14 2018-04-24 Apple Inc. Crowd sourcing information to fulfill user requests
US9961475B2 (en) 2015-10-08 2018-05-01 Qualcomm Incorporated Conversion from object-based audio to HOA
US9959870B2 (en) 2008-12-11 2018-05-01 Apple Inc. Speech recognition involving a mobile device
US9961467B2 (en) 2015-10-08 2018-05-01 Qualcomm Incorporated Conversion from channel-based audio to HOA
US9966068B2 (en) 2013-06-08 2018-05-08 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US9966065B2 (en) 2014-05-30 2018-05-08 Apple Inc. Multi-command single utterance input method
US9971774B2 (en) 2012-09-19 2018-05-15 Apple Inc. Voice-based media searching
US9972304B2 (en) 2016-06-03 2018-05-15 Apple Inc. Privacy preserving distributed evaluation framework for embedded personalized systems
US10043516B2 (en) 2016-09-23 2018-08-07 Apple Inc. Intelligent automated assistant
US10049663B2 (en) 2016-06-08 2018-08-14 Apple Inc. Intelligent automated assistant for media exploration
US10049668B2 (en) 2015-12-02 2018-08-14 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10057736B2 (en) 2011-06-03 2018-08-21 Apple Inc. Active transport based notifications
US10067938B2 (en) 2016-06-10 2018-09-04 Apple Inc. Multilingual word prediction
US10074360B2 (en) 2014-09-30 2018-09-11 Apple Inc. Providing an indication of the suitability of speech recognition
US10079014B2 (en) 2012-06-08 2018-09-18 Apple Inc. Name recognition system
US10078631B2 (en) 2014-05-30 2018-09-18 Apple Inc. Entropy-guided text prediction using combined word and character n-gram language models
US10083688B2 (en) 2015-05-27 2018-09-25 Apple Inc. Device voice control for selecting a displayed affordance
US10089072B2 (en) 2016-06-11 2018-10-02 Apple Inc. Intelligent device arbitration and control
US10101822B2 (en) 2015-06-05 2018-10-16 Apple Inc. Language input correction
US10127911B2 (en) 2014-09-30 2018-11-13 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
US10127220B2 (en) 2015-06-04 2018-11-13 Apple Inc. Language identification from short strings
US10134385B2 (en) 2012-03-02 2018-11-20 Apple Inc. Systems and methods for name pronunciation
US10170123B2 (en) 2014-05-30 2019-01-01 Apple Inc. Intelligent assistant for home automation
US10176167B2 (en) 2013-06-09 2019-01-08 Apple Inc. System and method for inferring user intent from speech inputs
US10185542B2 (en) 2013-06-09 2019-01-22 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
US10186254B2 (en) 2015-06-07 2019-01-22 Apple Inc. Context-based endpoint detection
US10192552B2 (en) 2016-06-10 2019-01-29 Apple Inc. Digital assistant providing whispered speech
US10199051B2 (en) 2013-02-07 2019-02-05 Apple Inc. Voice trigger for a digital assistant
US10223066B2 (en) 2015-12-23 2019-03-05 Apple Inc. Proactive assistance based on dialog communication between devices
US10241644B2 (en) 2011-06-03 2019-03-26 Apple Inc. Actionable reminder entries
US10241752B2 (en) 2011-09-30 2019-03-26 Apple Inc. Interface for a virtual digital assistant
US10249300B2 (en) 2016-06-06 2019-04-02 Apple Inc. Intelligent list reading
US10249312B2 (en) 2015-10-08 2019-04-02 Qualcomm Incorporated Quantization of spatial vectors
US10255907B2 (en) 2015-06-07 2019-04-09 Apple Inc. Automatic accent detection using acoustic models
US10269345B2 (en) 2016-06-11 2019-04-23 Apple Inc. Intelligent task discovery
US10276170B2 (en) 2010-01-18 2019-04-30 Apple Inc. Intelligent automated assistant
US10283110B2 (en) 2009-07-02 2019-05-07 Apple Inc. Methods and apparatuses for automatic speech recognition
US10289433B2 (en) 2014-05-30 2019-05-14 Apple Inc. Domain specific language for encoding assistant dialog
US10297253B2 (en) 2016-06-11 2019-05-21 Apple Inc. Application integration with a digital assistant
US10303715B2 (en) 2017-05-16 2019-05-28 Apple Inc. Intelligent automated assistant for media exploration
US10311144B2 (en) 2017-05-16 2019-06-04 Apple Inc. Emoji word sense disambiguation
US10318871B2 (en) 2005-09-08 2019-06-11 Apple Inc. Method and apparatus for building an intelligent automated assistant
US10332518B2 (en) 2017-05-09 2019-06-25 Apple Inc. User interface for correcting recognition errors
US10356243B2 (en) 2015-06-05 2019-07-16 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US10354011B2 (en) 2016-06-09 2019-07-16 Apple Inc. Intelligent automated assistant in a home environment
US10366158B2 (en) 2015-09-29 2019-07-30 Apple Inc. Efficient word encoding for recurrent neural network language models
US10395654B2 (en) 2017-05-11 2019-08-27 Apple Inc. Text normalization based on a data-driven learning network
US10403278B2 (en) 2017-05-16 2019-09-03 Apple Inc. Methods and systems for phonetic matching in digital assistant services
US10403283B1 (en) 2018-06-01 2019-09-03 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US10402151B2 (en) 2011-07-28 2019-09-03 Apple Inc. Devices with enhanced audio
US10410637B2 (en) 2017-05-12 2019-09-10 Apple Inc. User-specific acoustic models
US10417266B2 (en) 2017-05-09 2019-09-17 Apple Inc. Context-aware ranking of intelligent response suggestions
US10445429B2 (en) 2017-09-21 2019-10-15 Apple Inc. Natural language understanding using vocabularies with compressed serialized tries
US10446141B2 (en) 2014-08-28 2019-10-15 Apple Inc. Automatic speech recognition based on user feedback
US10446143B2 (en) 2016-03-14 2019-10-15 Apple Inc. Identification of voice inputs providing credentials
US10474753B2 (en) 2016-09-07 2019-11-12 Apple Inc. Language identification using recurrent neural networks
US10482874B2 (en) 2017-05-15 2019-11-19 Apple Inc. Hierarchical belief states for digital assistants
US10490187B2 (en) 2016-06-10 2019-11-26 Apple Inc. Digital assistant providing automated status report
US10496753B2 (en) 2010-01-18 2019-12-03 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US10496705B1 (en) 2018-06-03 2019-12-03 Apple Inc. Accelerated task performance
US10509862B2 (en) 2016-06-10 2019-12-17 Apple Inc. Dynamic phrase expansion of language input
US10521466B2 (en) 2016-06-11 2019-12-31 Apple Inc. Data driven natural language event detection and classification
US10553209B2 (en) 2010-01-18 2020-02-04 Apple Inc. Systems and methods for hands-free notification summaries
US10552013B2 (en) 2014-12-02 2020-02-04 Apple Inc. Data detection
US10567477B2 (en) 2015-03-08 2020-02-18 Apple Inc. Virtual assistant continuity
US10568032B2 (en) 2007-04-03 2020-02-18 Apple Inc. Method and system for operating a multi-function portable electronic device using voice-activation
CN110827840A (en) * 2014-01-30 2020-02-21 Qualcomm Incorporated Decoding independent frames of ambient higher order ambisonic coefficients
US10593346B2 (en) 2016-12-22 2020-03-17 Apple Inc. Rank-reduced token representation for automatic speech recognition
US10592604B2 (en) 2018-03-12 2020-03-17 Apple Inc. Inverse text normalization for automatic speech recognition
US10592095B2 (en) 2014-05-23 2020-03-17 Apple Inc. Instantaneous speaking of content on touch devices
US10636424B2 (en) 2017-11-30 2020-04-28 Apple Inc. Multi-turn canned dialog
US10643611B2 (en) 2008-10-02 2020-05-05 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US10657328B2 (en) 2017-06-02 2020-05-19 Apple Inc. Multi-task recurrent neural network architecture for efficient morphology handling in neural language modeling
US10659851B2 (en) 2014-06-30 2020-05-19 Apple Inc. Real-time digital assistant knowledge updates
US10671428B2 (en) 2015-09-08 2020-06-02 Apple Inc. Distributed personal assistant
US10679605B2 (en) 2010-01-18 2020-06-09 Apple Inc. Hands-free list-reading by intelligent automated assistant
US10684703B2 (en) 2018-06-01 2020-06-16 Apple Inc. Attention aware virtual assistant dismissal
US10691473B2 (en) 2015-11-06 2020-06-23 Apple Inc. Intelligent automated assistant in a messaging environment
US10705794B2 (en) 2010-01-18 2020-07-07 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US10706373B2 (en) 2011-06-03 2020-07-07 Apple Inc. Performing actions associated with task items that represent tasks to perform
US10726832B2 (en) 2017-05-11 2020-07-28 Apple Inc. Maintaining privacy of personal information
US10733993B2 (en) 2016-06-10 2020-08-04 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10733982B2 (en) 2018-01-08 2020-08-04 Apple Inc. Multi-directional dialog
US10733375B2 (en) 2018-01-31 2020-08-04 Apple Inc. Knowledge-based framework for improving natural language understanding
US10748546B2 (en) 2017-05-16 2020-08-18 Apple Inc. Digital assistant services based on device capabilities
US10747498B2 (en) 2015-09-08 2020-08-18 Apple Inc. Zero latency digital assistant
US10755703B2 (en) 2017-05-11 2020-08-25 Apple Inc. Offline personal assistant
US10755051B2 (en) 2017-09-29 2020-08-25 Apple Inc. Rule-based natural language processing
US10757491B1 (en) 2018-06-11 2020-08-25 Apple Inc. Wearable interactive audio device
US10762293B2 (en) 2010-12-22 2020-09-01 Apple Inc. Using parts-of-speech tagging and named entity recognition for spelling correction
US10770087B2 (en) 2014-05-16 2020-09-08 Qualcomm Incorporated Selecting codebooks for coding vectors decomposed from higher-order ambisonic audio signals
US10789959B2 (en) 2018-03-02 2020-09-29 Apple Inc. Training speaker recognition models for digital assistants
US10789945B2 (en) 2017-05-12 2020-09-29 Apple Inc. Low-latency intelligent automated assistant
US10791216B2 (en) 2013-08-06 2020-09-29 Apple Inc. Auto-activating smart responses based on activities from remote devices
US10791176B2 (en) 2017-05-12 2020-09-29 Apple Inc. Synchronization and task delegation of a digital assistant
US10789041B2 (en) 2014-09-12 2020-09-29 Apple Inc. Dynamic thresholds for always listening speech trigger
US10810274B2 (en) 2017-05-15 2020-10-20 Apple Inc. Optimizing dialogue policy decisions for digital assistants using implicit feedback
US10818288B2 (en) 2018-03-26 2020-10-27 Apple Inc. Natural assistant interaction
US10839159B2 (en) 2018-09-28 2020-11-17 Apple Inc. Named entity normalization in a spoken dialog system
US10873798B1 (en) 2018-06-11 2020-12-22 Apple Inc. Detecting through-body inputs at a wearable audio device
US10892996B2 (en) 2018-06-01 2021-01-12 Apple Inc. Variable latency device coordination
US10909331B2 (en) 2018-03-30 2021-02-02 Apple Inc. Implicit identification of translation payload with neural machine translation
US10928918B2 (en) 2018-05-07 2021-02-23 Apple Inc. Raise to speak
US10984780B2 (en) 2018-05-21 2021-04-20 Apple Inc. Global semantic word embeddings using bi-directional recurrent neural networks
US11010550B2 (en) 2015-09-29 2021-05-18 Apple Inc. Unified language modeling framework for word prediction, auto-completion and auto-correction
US11010127B2 (en) 2015-06-29 2021-05-18 Apple Inc. Virtual assistant for media playback
US11010561B2 (en) 2018-09-27 2021-05-18 Apple Inc. Sentiment prediction from textual data
US11025565B2 (en) 2015-06-07 2021-06-01 Apple Inc. Personalized prediction of responses for instant messaging
US11023513B2 (en) 2007-12-20 2021-06-01 Apple Inc. Method and apparatus for searching using an active ontology
US11140099B2 (en) 2019-05-21 2021-10-05 Apple Inc. Providing message response suggestions
US11145294B2 (en) 2018-05-07 2021-10-12 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US11170166B2 (en) 2018-09-28 2021-11-09 Apple Inc. Neural typographical error modeling via generative adversarial networks
US11204787B2 (en) 2017-01-09 2021-12-21 Apple Inc. Application integration with a digital assistant
US11217251B2 (en) 2019-05-06 2022-01-04 Apple Inc. Spoken notifications
US11227589B2 (en) 2016-06-06 2022-01-18 Apple Inc. Intelligent list reading
US11231904B2 (en) 2015-03-06 2022-01-25 Apple Inc. Reducing response latency of intelligent automated assistants
US11237797B2 (en) 2019-05-31 2022-02-01 Apple Inc. User activity shortcut suggestions
US11269678B2 (en) 2012-05-15 2022-03-08 Apple Inc. Systems and methods for integrating third party services with a digital assistant
US11281993B2 (en) 2016-12-05 2022-03-22 Apple Inc. Model and ensemble compression for metric learning
US11289073B2 (en) 2019-05-31 2022-03-29 Apple Inc. Device text to speech
US11301477B2 (en) 2017-05-12 2022-04-12 Apple Inc. Feedback analysis of a digital assistant
US11307752B2 (en) 2019-05-06 2022-04-19 Apple Inc. User configurable task triggers
US11307661B2 (en) 2017-09-25 2022-04-19 Apple Inc. Electronic device with actuators for producing haptic and audio output along a device housing
US11314370B2 (en) 2013-12-06 2022-04-26 Apple Inc. Method for extracting salient dialog usage from live data
US11334032B2 (en) 2018-08-30 2022-05-17 Apple Inc. Electronic watch with barometric vent
US11348573B2 (en) 2019-03-18 2022-05-31 Apple Inc. Multimodality in digital assistant systems
US11360641B2 (en) 2019-06-01 2022-06-14 Apple Inc. Increasing the relevance of new available information
US11386266B2 (en) 2018-06-01 2022-07-12 Apple Inc. Text correction
US11423908B2 (en) 2019-05-06 2022-08-23 Apple Inc. Interpreting spoken requests
US11462215B2 (en) 2018-09-28 2022-10-04 Apple Inc. Multi-modal inputs for voice commands
US11468282B2 (en) 2015-05-15 2022-10-11 Apple Inc. Virtual assistant in a communication session
US11475884B2 (en) 2019-05-06 2022-10-18 Apple Inc. Reducing digital assistant latency when a language is incorrectly determined
US11475898B2 (en) 2018-10-26 2022-10-18 Apple Inc. Low-latency multi-speaker speech recognition
US11488406B2 (en) 2019-09-25 2022-11-01 Apple Inc. Text detection using global geometry estimators
US11495218B2 (en) 2018-06-01 2022-11-08 Apple Inc. Virtual assistant operation in multi-device environments
US11496600B2 (en) 2019-05-31 2022-11-08 Apple Inc. Remote execution of machine-learned models
US11499255B2 (en) 2013-03-13 2022-11-15 Apple Inc. Textile product having reduced density
US11561144B1 (en) 2018-09-27 2023-01-24 Apple Inc. Wearable electronic device with fluid-based pressure sensing
US11587559B2 (en) 2015-09-30 2023-02-21 Apple Inc. Intelligent device identification
US11638059B2 (en) 2019-01-04 2023-04-25 Apple Inc. Content playback on multiple devices
JP7368563B2 (en) 2012-07-16 2023-10-24 Dolby International Ab Method and apparatus for rendering audio sound field representation for audio playback
US11857063B2 (en) 2019-04-17 2024-01-02 Apple Inc. Audio output system for a wirelessly locatable tag

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1416769A1 (en) * 2002-10-28 2004-05-06 Electronics and Telecommunications Research Institute Object-based three-dimensional audio system and method of controlling the same
FR2847376A1 (en) * 2002-11-19 2004-05-21 France Telecom Digital sound word processing/acquisition mechanism codes near distance three dimensional space sounds following spherical base and applies near field filtering compensation following loudspeaker distance/listening position
EP1515308A1 (en) * 2003-09-09 2005-03-16 Nokia Corporation Multi-rate coding
US20050117753A1 (en) * 2003-12-02 2005-06-02 Masayoshi Miura Sound field reproduction apparatus and sound field space reproduction system
US20050141728A1 (en) * 1997-09-24 2005-06-30 Sonic Solutions, A California Corporation Multi-channel surround sound mastering and reproduction techniques that preserve spatial harmonics in three dimensions
DE102005008366A1 (en) * 2005-02-23 2006-08-24 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Device for driving wave-field synthesis rendering device with audio objects, has unit for supplying scene description defining time sequence of audio objects
US20070269063A1 (en) * 2006-05-17 2007-11-22 Creative Technology Ltd Spatial audio coding based on universal spatial cues
WO2007136187A1 (en) * 2006-05-19 2007-11-29 Electronics And Telecommunications Research Institute Object-based 3-dimensional audio service system using preset audio scenes

Cited By (366)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9646614B2 (en) 2000-03-16 2017-05-09 Apple Inc. Fast, language-independent method for user authentication by voice
US10318871B2 (en) 2005-09-08 2019-06-11 Apple Inc. Method and apparatus for building an intelligent automated assistant
US11928604B2 (en) 2005-09-08 2024-03-12 Apple Inc. Method and apparatus for building an intelligent automated assistant
US8930191B2 (en) 2006-09-08 2015-01-06 Apple Inc. Paraphrasing of user requests and results by automated digital assistant
US8942986B2 (en) 2006-09-08 2015-01-27 Apple Inc. Determining user intent based on ontologies of domains
US9117447B2 (en) 2006-09-08 2015-08-25 Apple Inc. Using event alert text as input to an automated assistant
US10568032B2 (en) 2007-04-03 2020-02-18 Apple Inc. Method and system for operating a multi-function portable electronic device using voice-activation
US11023513B2 (en) 2007-12-20 2021-06-01 Apple Inc. Method and apparatus for searching using an active ontology
US10381016B2 (en) 2008-01-03 2019-08-13 Apple Inc. Methods and apparatus for altering audio output signals
US9330720B2 (en) 2008-01-03 2016-05-03 Apple Inc. Methods and apparatus for altering audio output signals
US9626955B2 (en) 2008-04-05 2017-04-18 Apple Inc. Intelligent text-to-speech conversion
US9865248B2 (en) 2008-04-05 2018-01-09 Apple Inc. Intelligent text-to-speech conversion
US10108612B2 (en) 2008-07-31 2018-10-23 Apple Inc. Mobile device having human language translation capability with positional feedback
US9535906B2 (en) 2008-07-31 2017-01-03 Apple Inc. Mobile device having human language translation capability with positional feedback
US11348582B2 (en) 2008-10-02 2022-05-31 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US10643611B2 (en) 2008-10-02 2020-05-05 Apple Inc. Electronic devices with voice command and contextual data processing capabilities
US9959870B2 (en) 2008-12-11 2018-05-01 Apple Inc. Speech recognition involving a mobile device
US10795541B2 (en) 2009-06-05 2020-10-06 Apple Inc. Intelligent organization of tasks items
US10475446B2 (en) 2009-06-05 2019-11-12 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US9858925B2 (en) 2009-06-05 2018-01-02 Apple Inc. Using context information to facilitate processing of commands in a virtual assistant
US11080012B2 (en) 2009-06-05 2021-08-03 Apple Inc. Interface for a virtual digital assistant
US10283110B2 (en) 2009-07-02 2019-05-07 Apple Inc. Methods and apparatuses for automatic speech recognition
US8560309B2 (en) 2009-12-29 2013-10-15 Apple Inc. Remote conferencing center
US10553209B2 (en) 2010-01-18 2020-02-04 Apple Inc. Systems and methods for hands-free notification summaries
US10276170B2 (en) 2010-01-18 2019-04-30 Apple Inc. Intelligent automated assistant
US8903716B2 (en) 2010-01-18 2014-12-02 Apple Inc. Personalized vocabulary for digital assistant
US10679605B2 (en) 2010-01-18 2020-06-09 Apple Inc. Hands-free list-reading by intelligent automated assistant
US9548050B2 (en) 2010-01-18 2017-01-17 Apple Inc. Intelligent automated assistant
US8892446B2 (en) 2010-01-18 2014-11-18 Apple Inc. Service orchestration for intelligent automated assistant
US9318108B2 (en) 2010-01-18 2016-04-19 Apple Inc. Intelligent automated assistant
US10705794B2 (en) 2010-01-18 2020-07-07 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US10706841B2 (en) 2010-01-18 2020-07-07 Apple Inc. Task flow identification based on user intent
US10741185B2 (en) 2010-01-18 2020-08-11 Apple Inc. Intelligent automated assistant
US10496753B2 (en) 2010-01-18 2019-12-03 Apple Inc. Automatically adapting user interfaces for hands-free interaction
US11423886B2 (en) 2010-01-18 2022-08-23 Apple Inc. Task flow identification based on user intent
US10607140B2 (en) 2010-01-25 2020-03-31 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US10984327B2 (en) 2010-01-25 2021-04-20 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US9424862B2 (en) 2010-01-25 2016-08-23 Newvaluexchange Ltd Apparatuses, methods and systems for a digital conversation management platform
US9424861B2 (en) 2010-01-25 2016-08-23 Newvaluexchange Ltd Apparatuses, methods and systems for a digital conversation management platform
US9431028B2 (en) 2010-01-25 2016-08-30 Newvaluexchange Ltd Apparatuses, methods and systems for a digital conversation management platform
US10984326B2 (en) 2010-01-25 2021-04-20 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US10607141B2 (en) 2010-01-25 2020-03-31 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US8977584B2 (en) 2010-01-25 2015-03-10 Newvaluexchange Global Ai Llp Apparatuses, methods and systems for a digital conversation management platform
US11410053B2 (en) 2010-01-25 2022-08-09 Newvaluexchange Ltd. Apparatuses, methods and systems for a digital conversation management platform
US10049675B2 (en) 2010-02-25 2018-08-14 Apple Inc. User profiling for voice input processing
US9633660B2 (en) 2010-02-25 2017-04-25 Apple Inc. User profiling for voice input processing
US10692504B2 (en) 2010-02-25 2020-06-23 Apple Inc. User profiling for voice input processing
US10037762B2 (en) 2010-03-26 2018-07-31 Dolby Laboratories Licensing Corporation Method and device for decoding an audio soundfield representation
US11217258B2 (en) 2010-03-26 2022-01-04 Dolby Laboratories Licensing Corporation Method and device for decoding an audio soundfield representation
AU2020201419B2 (en) * 2010-03-26 2021-09-16 Dolby International Ab Method and device for decoding an audio soundfield representation
US10522159B2 (en) 2010-03-26 2019-12-31 Dolby Laboratories Licensing Corporation Method and device for decoding an audio soundfield representation
WO2011117399A1 (en) * 2010-03-26 2011-09-29 Thomson Licensing Method and device for decoding an audio soundfield representation for audio playback
US11948583B2 (en) 2010-03-26 2024-04-02 Dolby Laboratories Licensing Corporation Method and device for decoding an audio soundfield representation
KR20210107165A (en) * 2010-03-26 2021-08-31 Dolby International Ab Method and device for decoding an audio soundfield representation for audio playback
US9100768B2 (en) 2010-03-26 2015-08-04 Thomson Licensing Method and device for decoding an audio soundfield representation for audio playback
CN102823277A (en) * 2010-03-26 2012-12-12 Thomson Licensing Method and device for decoding an audio soundfield representation for audio playback
US10629211B2 (en) 2010-03-26 2020-04-21 Dolby Laboratories Licensing Corporation Method and device for decoding an audio soundfield representation
US9767813B2 (en) 2010-03-26 2017-09-19 Dolby Laboratories Licensing Corporation Method and device for decoding an audio soundfield representation for audio playback
KR101755531B1 (en) 2010-03-26 2017-07-07 Dolby International Ab Method and device for decoding an audio soundfield representation for audio playback
AU2016204408B2 (en) * 2010-03-26 2017-11-23 Dolby International Ab Method and device for decoding an audio soundfield representation for audio playback
AU2011231565B2 (en) * 2010-03-26 2014-08-28 Dolby International Ab Method and device for decoding an audio soundfield representation for audio playback
US9460726B2 (en) 2010-03-26 2016-10-04 Dolby Laboratories Licensing Corporation Method and device for decoding an audio soundfield representation for audio playback
AU2014265108A1 (en) * 2010-03-26 2014-12-11 Dolby International Ab Method and device for decoding an audio soundfield representation for audio playback
AU2014265108B2 (en) * 2010-03-26 2016-06-30 Dolby International Ab Method and device for decoding an audio soundfield representation for audio playback
US10134405B2 (en) 2010-03-26 2018-11-20 Dolby Laboratories Licensing Corporation Method and device for decoding an audio soundfield representation
JP2021184611A (en) * 2010-03-26 2021-12-02 Dolby International Ab Method and device for decoding audio field representation for audio playback
AU2018201133B2 (en) * 2010-03-26 2019-12-05 Dolby International Ab Method and device for decoding an audio soundfield representation for audio playback
US10063951B2 (en) 2010-05-05 2018-08-28 Apple Inc. Speaker clip
US8452037B2 (en) 2010-05-05 2013-05-28 Apple Inc. Speaker clip
US9386362B2 (en) 2010-05-05 2016-07-05 Apple Inc. Speaker clip
US8644519B2 (en) 2010-09-30 2014-02-04 Apple Inc. Electronic devices with improved audio
US10762293B2 (en) 2010-12-22 2020-09-01 Apple Inc. Using parts-of-speech tagging and named entity recognition for spelling correction
US10102359B2 (en) 2011-03-21 2018-10-16 Apple Inc. Device access using voice authentication
US10417405B2 (en) 2011-03-21 2019-09-17 Apple Inc. Device access using voice authentication
US9262612B2 (en) 2011-03-21 2016-02-16 Apple Inc. Device access using voice authentication
US8811648B2 (en) 2011-03-31 2014-08-19 Apple Inc. Moving magnet audio transducer
US9674625B2 (en) 2011-04-18 2017-06-06 Apple Inc. Passive proximity detection
US9007871B2 (en) 2011-04-18 2015-04-14 Apple Inc. Passive proximity detection
US10241644B2 (en) 2011-06-03 2019-03-26 Apple Inc. Actionable reminder entries
US10057736B2 (en) 2011-06-03 2018-08-21 Apple Inc. Active transport based notifications
US11350253B2 (en) 2011-06-03 2022-05-31 Apple Inc. Active transport based notifications
US11120372B2 (en) 2011-06-03 2021-09-14 Apple Inc. Performing actions associated with task items that represent tasks to perform
US10706373B2 (en) 2011-06-03 2020-07-07 Apple Inc. Performing actions associated with task items that represent tasks to perform
US11057731B2 (en) 2011-07-01 2021-07-06 Dolby Laboratories Licensing Corporation System and tools for enhanced 3D audio authoring and rendering
US10244343B2 (en) 2011-07-01 2019-03-26 Dolby Laboratories Licensing Corporation System and tools for enhanced 3D audio authoring and rendering
US11641562B2 (en) 2011-07-01 2023-05-02 Dolby Laboratories Licensing Corporation System and tools for enhanced 3D audio authoring and rendering
US9204236B2 (en) 2011-07-01 2015-12-01 Dolby Laboratories Licensing Corporation System and tools for enhanced 3D audio authoring and rendering
US9838826B2 (en) 2011-07-01 2017-12-05 Dolby Laboratories Licensing Corporation System and tools for enhanced 3D audio authoring and rendering
US10609506B2 (en) 2011-07-01 2020-03-31 Dolby Laboratories Licensing Corporation System and tools for enhanced 3D audio authoring and rendering
US9549275B2 (en) 2011-07-01 2017-01-17 Dolby Laboratories Licensing Corporation System and tools for enhanced 3D audio authoring and rendering
US10771742B1 (en) 2011-07-28 2020-09-08 Apple Inc. Devices with enhanced audio
US10402151B2 (en) 2011-07-28 2019-09-03 Apple Inc. Devices with enhanced audio
US9798393B2 (en) 2011-08-29 2017-10-24 Apple Inc. Text correction processing
US8989428B2 (en) 2011-08-31 2015-03-24 Apple Inc. Acoustic systems in electronic devices
US10241752B2 (en) 2011-09-30 2019-03-26 Apple Inc. Interface for a virtual digital assistant
US8879761B2 (en) 2011-11-22 2014-11-04 Apple Inc. Orientation-based audio
US10284951B2 (en) 2011-11-22 2019-05-07 Apple Inc. Orientation-based audio
US8903108B2 (en) 2011-12-06 2014-12-02 Apple Inc. Near-field null and beamforming
US9020163B2 (en) 2011-12-06 2015-04-28 Apple Inc. Near-field null and beamforming
US11069336B2 (en) 2012-03-02 2021-07-20 Apple Inc. Systems and methods for name pronunciation
US10134385B2 (en) 2012-03-02 2018-11-20 Apple Inc. Systems and methods for name pronunciation
US9483461B2 (en) 2012-03-06 2016-11-01 Apple Inc. Handling speech synthesis of content for multiple languages
US9953088B2 (en) 2012-05-14 2018-04-24 Apple Inc. Crowd sourcing information to fulfill user requests
US11269678B2 (en) 2012-05-15 2022-03-08 Apple Inc. Systems and methods for integrating third party services with a digital assistant
US10079014B2 (en) 2012-06-08 2018-09-18 Apple Inc. Name recognition system
US9495129B2 (en) 2012-06-29 2016-11-15 Apple Inc. Device, method, and user interface for voice-activated navigation and browsing of a document
US9788133B2 (en) 2012-07-15 2017-10-10 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for backward-compatible audio coding
US9288603B2 (en) 2012-07-15 2016-03-15 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for backward-compatible audio coding
JP7368563B2 (en) 2012-07-16 2023-10-24 Dolby International AB Method and apparatus for rendering audio sound field representation for audio playback
CN107424618A (en) * 2012-07-16 2017-12-01 Dolby International AB Method, apparatus and computer-readable medium for decoding HOA audio signals
CN104429102B (en) * 2012-07-16 2017-12-15 Qualcomm Incorporated Loudspeaker position compensation with 3D-audio hierarchical coding
JP2015527821A (en) * 2012-07-16 2015-09-17 Qualcomm Incorporated Loudspeaker position compensation using 3D audio hierarchical coding
CN104429102A (en) * 2012-07-16 2015-03-18 Qualcomm Incorporated Loudspeaker position compensation with 3D-audio hierarchical coding
US9473870B2 (en) 2012-07-16 2016-10-18 Qualcomm Incorporated Loudspeaker position compensation with 3D-audio hierarchical coding
KR101759005B1 (en) 2012-07-16 2017-07-17 Qualcomm Incorporated Loudspeaker position compensation with 3D-audio hierarchical coding
WO2014014891A1 (en) * 2012-07-16 2014-01-23 Qualcomm Incorporated Loudspeaker position compensation with 3D-audio hierarchical coding
CN107424618B (en) * 2012-07-16 2021-01-08 Dolby International AB Method, apparatus and computer readable medium for decoding HOA audio signals
US9576574B2 (en) 2012-09-10 2017-02-21 Apple Inc. Context-sensitive handling of interruptions by intelligent digital assistant
US9971774B2 (en) 2012-09-19 2018-05-15 Apple Inc. Voice-based media searching
US9820033B2 (en) 2012-09-28 2017-11-14 Apple Inc. Speaker assembly
US8858271B2 (en) 2012-10-18 2014-10-14 Apple Inc. Speaker interconnect
US9357299B2 (en) 2012-11-16 2016-05-31 Apple Inc. Active protection for acoustic device
US8942410B2 (en) 2012-12-31 2015-01-27 Apple Inc. Magnetically biased electromagnet for audio applications
US10978090B2 (en) 2013-02-07 2021-04-13 Apple Inc. Voice trigger for a digital assistant
US10714117B2 (en) 2013-02-07 2020-07-14 Apple Inc. Voice trigger for a digital assistant
US10199051B2 (en) 2013-02-07 2019-02-05 Apple Inc. Voice trigger for a digital assistant
US11499255B2 (en) 2013-03-13 2022-11-15 Apple Inc. Textile product having reduced density
US9368114B2 (en) 2013-03-14 2016-06-14 Apple Inc. Context-sensitive handling of interruptions
US9697822B1 (en) 2013-03-15 2017-07-04 Apple Inc. System and method for updating an adaptive speech recognition model
US9922642B2 (en) 2013-03-15 2018-03-20 Apple Inc. Training an at least partial voice command system
US11758344B2 (en) 2013-04-29 2023-09-12 Dolby Laboratories Licensing Corporation Methods and apparatus for compressing and decompressing a higher order ambisonics representation
US11895477B2 (en) 2013-04-29 2024-02-06 Dolby Laboratories Licensing Corporation Methods and apparatus for compressing and decompressing a higher order ambisonics representation
US10623878B2 (en) 2013-04-29 2020-04-14 Dolby Laboratories Licensing Corporation Methods and apparatus for compressing and decompressing a higher order ambisonics representation
US10999688B2 (en) 2013-04-29 2021-05-04 Dolby Laboratories Licensing Corporation Methods and apparatus for compressing and decompressing a higher order ambisonics representation
CN107293304A (en) * 2013-04-29 2017-10-24 Dolby International AB Method and apparatus for compressing and decompressing higher order ambisonics representations
JP2016520864A (en) * 2013-04-29 2016-07-14 Thomson Licensing Method and apparatus for compressing and decompressing higher-order ambisonics representations
US10264382B2 (en) 2013-04-29 2019-04-16 Dolby Laboratories Licensing Corporation Methods and apparatus for compressing and decompressing a higher order ambisonics representation
US11284210B2 (en) 2013-04-29 2022-03-22 Dolby Laboratories Licensing Corporation Methods and apparatus for compressing and decompressing a higher order ambisonics representation
CN107293304B (en) * 2013-04-29 2021-01-05 Dolby International AB Method and apparatus for compressing and decompressing higher order ambisonics representations
US11962990B2 (en) 2013-05-29 2024-04-16 Qualcomm Incorporated Reordering of foreground audio objects in the ambisonics domain
JP2017142520A (en) * 2013-05-29 2017-08-17 Qualcomm Incorporated Compression of decomposed representations of sound field
US9883312B2 (en) 2013-05-29 2018-01-30 Qualcomm Incorporated Transformed higher order ambisonics audio data
US10499176B2 (en) 2013-05-29 2019-12-03 Qualcomm Incorporated Identifying codebooks to use when coding spatial components of a sound field
US11146903B2 (en) 2013-05-29 2021-10-12 Qualcomm Incorporated Compression of decomposed representations of a sound field
JP2017199013A (en) * 2013-05-29 2017-11-02 Qualcomm Incorporated Compression of decomposed representations of sound field
US9980074B2 (en) 2013-05-29 2018-05-22 Qualcomm Incorporated Quantization step sizes for compression of spatial components of a sound field
US9633674B2 (en) 2013-06-07 2017-04-25 Apple Inc. System and method for detecting errors in interactions with a voice-based digital assistant
US9966060B2 (en) 2013-06-07 2018-05-08 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9620104B2 (en) 2013-06-07 2017-04-11 Apple Inc. System and method for user-specified pronunciation of words for speech synthesis and recognition
US9582608B2 (en) 2013-06-07 2017-02-28 Apple Inc. Unified ranking with entropy-weighted information for phrase-based semantic auto-completion
US10657961B2 (en) 2013-06-08 2020-05-19 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US9966068B2 (en) 2013-06-08 2018-05-08 Apple Inc. Interpreting and acting upon commands that involve sharing information with remote devices
US10185542B2 (en) 2013-06-09 2019-01-22 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
US10176167B2 (en) 2013-06-09 2019-01-08 Apple Inc. System and method for inferring user intent from speech inputs
US11048473B2 (en) 2013-06-09 2021-06-29 Apple Inc. Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant
US10769385B2 (en) 2013-06-09 2020-09-08 Apple Inc. System and method for inferring user intent from speech inputs
US9300784B2 (en) 2013-06-13 2016-03-29 Apple Inc. System and method for emergency calls initiated by voice command
US10791216B2 (en) 2013-08-06 2020-09-29 Apple Inc. Auto-activating smart responses based on activities from remote devices
US11314370B2 (en) 2013-12-06 2022-04-26 Apple Inc. Method for extracting salient dialog usage from live data
US9922656B2 (en) 2014-01-30 2018-03-20 Qualcomm Incorporated Transitioning of ambient higher-order ambisonic coefficients
CN110827840B (en) * 2014-01-30 2023-09-12 Qualcomm Incorporated Coding independent frames of ambient higher order ambisonic coefficients
CN110827840A (en) * 2014-01-30 2020-02-21 Qualcomm Incorporated Decoding independent frames of ambient higher order ambisonic coefficients
KR102258784B1 (en) 2014-04-11 2021-05-31 Samsung Electronics Co., Ltd. Method and apparatus for rendering sound signal, and computer-readable recording medium
KR20210064421A (en) * 2014-04-11 2021-06-02 Samsung Electronics Co., Ltd. Method and apparatus for rendering sound signal, and computer-readable recording medium
US10873822B2 (en) 2014-04-11 2020-12-22 Samsung Electronics Co., Ltd. Method and apparatus for rendering sound signal, and computer-readable recording medium
CN106664500A (en) * 2014-04-11 2017-05-10 Samsung Electronics Co., Ltd. Method and apparatus for rendering sound signal, and computer-readable recording medium
AU2015244473B2 (en) * 2014-04-11 2018-05-10 Samsung Electronics Co., Ltd. Method and apparatus for rendering sound signal, and computer-readable recording medium
KR102302672B1 (en) 2014-04-11 2021-09-15 Samsung Electronics Co., Ltd. Method and apparatus for rendering sound signal, and computer-readable recording medium
US20170034639A1 (en) * 2014-04-11 2017-02-02 Samsung Electronics Co., Ltd. Method and apparatus for rendering sound signal, and computer-readable recording medium
KR20160145646A (en) * 2014-04-11 2016-12-20 Samsung Electronics Co., Ltd. Method and apparatus for rendering sound signal, and computer-readable recording medium
US11245998B2 (en) 2014-04-11 2022-02-08 Samsung Electronics Co., Ltd. Method and apparatus for rendering sound signal, and computer-readable recording medium
US11785407B2 (en) 2014-04-11 2023-10-10 Samsung Electronics Co., Ltd. Method and apparatus for rendering sound signal, and computer-readable recording medium
CN110610712A (en) * 2014-04-11 2019-12-24 Samsung Electronics Co., Ltd. Method and apparatus for rendering sound signal and computer-readable recording medium
AU2018208751B2 (en) * 2014-04-11 2019-11-28 Samsung Electronics Co., Ltd. Method and apparatus for rendering sound signal, and computer-readable recording medium
US10674299B2 (en) 2014-04-11 2020-06-02 Samsung Electronics Co., Ltd. Method and apparatus for rendering sound signal, and computer-readable recording medium
US9451354B2 (en) 2014-05-12 2016-09-20 Apple Inc. Liquid expulsion from an orifice
US10063977B2 (en) 2014-05-12 2018-08-28 Apple Inc. Liquid expulsion from an orifice
US9620105B2 (en) 2014-05-15 2017-04-11 Apple Inc. Analyzing audio input for efficient speech and music recognition
US10770087B2 (en) 2014-05-16 2020-09-08 Qualcomm Incorporated Selecting codebooks for coding vectors decomposed from higher-order ambisonic audio signals
US10592095B2 (en) 2014-05-23 2020-03-17 Apple Inc. Instantaneous speaking of content on touch devices
US9502031B2 (en) 2014-05-27 2016-11-22 Apple Inc. Method for supporting dynamic grammars in WFST-based ASR
US9966065B2 (en) 2014-05-30 2018-05-08 Apple Inc. Multi-command single utterance input method
US10657966B2 (en) 2014-05-30 2020-05-19 Apple Inc. Better resolution when referencing to concepts
US11257504B2 (en) 2014-05-30 2022-02-22 Apple Inc. Intelligent assistant for home automation
US9430463B2 (en) 2014-05-30 2016-08-30 Apple Inc. Exemplar-based natural language processing
US10699717B2 (en) 2014-05-30 2020-06-30 Apple Inc. Intelligent assistant for home automation
US10169329B2 (en) 2014-05-30 2019-01-01 Apple Inc. Exemplar-based natural language processing
US10170123B2 (en) 2014-05-30 2019-01-01 Apple Inc. Intelligent assistant for home automation
US9842101B2 (en) 2014-05-30 2017-12-12 Apple Inc. Predictive conversion of language input
US10083690B2 (en) 2014-05-30 2018-09-25 Apple Inc. Better resolution when referencing to concepts
US10078631B2 (en) 2014-05-30 2018-09-18 Apple Inc. Entropy-guided text prediction using combined word and character n-gram language models
US10289433B2 (en) 2014-05-30 2019-05-14 Apple Inc. Domain specific language for encoding assistant dialog
US9785630B2 (en) 2014-05-30 2017-10-10 Apple Inc. Text prediction using combined word N-gram and unigram language models
US9760559B2 (en) 2014-05-30 2017-09-12 Apple Inc. Predictive text input
US10497365B2 (en) 2014-05-30 2019-12-03 Apple Inc. Multi-command single utterance input method
US11133008B2 (en) 2014-05-30 2021-09-28 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US9734193B2 (en) 2014-05-30 2017-08-15 Apple Inc. Determining domain salience ranking from ambiguous words in natural speech
US9633004B2 (en) 2014-05-30 2017-04-25 Apple Inc. Better resolution when referencing to concepts
US9715875B2 (en) 2014-05-30 2017-07-25 Apple Inc. Reducing the need for manual start/end-pointing and trigger phrases
US10417344B2 (en) 2014-05-30 2019-09-17 Apple Inc. Exemplar-based natural language processing
US10714095B2 (en) 2014-05-30 2020-07-14 Apple Inc. Intelligent assistant for home automation
US10878809B2 (en) 2014-05-30 2020-12-29 Apple Inc. Multi-command single utterance input method
US10904611B2 (en) 2014-06-30 2021-01-26 Apple Inc. Intelligent automated assistant for TV user interactions
US9668024B2 (en) 2014-06-30 2017-05-30 Apple Inc. Intelligent automated assistant for TV user interactions
US9338493B2 (en) 2014-06-30 2016-05-10 Apple Inc. Intelligent automated assistant for TV user interactions
US10659851B2 (en) 2014-06-30 2020-05-19 Apple Inc. Real-time digital assistant knowledge updates
CN106663433A (en) * 2014-07-02 2017-05-10 Qualcomm Incorporated Reducing correlation between higher order ambisonic (HOA) background channels
US10446141B2 (en) 2014-08-28 2019-10-15 Apple Inc. Automatic speech recognition based on user feedback
US9818400B2 (en) 2014-09-11 2017-11-14 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US10431204B2 (en) 2014-09-11 2019-10-01 Apple Inc. Method and apparatus for discovering trending terms in speech requests
US10789041B2 (en) 2014-09-12 2020-09-29 Apple Inc. Dynamic thresholds for always listening speech trigger
US10127911B2 (en) 2014-09-30 2018-11-13 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
US10390213B2 (en) 2014-09-30 2019-08-20 Apple Inc. Social reminders
US9986419B2 (en) 2014-09-30 2018-05-29 Apple Inc. Social reminders
US9646609B2 (en) 2014-09-30 2017-05-09 Apple Inc. Caching apparatus for serving phonetic pronunciations
US9886432B2 (en) 2014-09-30 2018-02-06 Apple Inc. Parsimonious handling of word inflection via categorical stem + suffix N-gram language models
US10453443B2 (en) 2014-09-30 2019-10-22 Apple Inc. Providing an indication of the suitability of speech recognition
US9668121B2 (en) 2014-09-30 2017-05-30 Apple Inc. Social reminders
US10438595B2 (en) 2014-09-30 2019-10-08 Apple Inc. Speaker identification and unsupervised speaker adaptation techniques
US10074360B2 (en) 2014-09-30 2018-09-11 Apple Inc. Providing an indication of the suitability of speech recognition
US9525943B2 (en) 2014-11-24 2016-12-20 Apple Inc. Mechanically actuated panel acoustic system
US10362403B2 (en) 2014-11-24 2019-07-23 Apple Inc. Mechanically actuated panel acoustic system
US10552013B2 (en) 2014-12-02 2020-02-04 Apple Inc. Data detection
US11556230B2 (en) 2014-12-02 2023-01-17 Apple Inc. Data detection
US9711141B2 (en) 2014-12-09 2017-07-18 Apple Inc. Disambiguating heteronyms in speech synthesis
US9865280B2 (en) 2015-03-06 2018-01-09 Apple Inc. Structured dictation using intelligent automated assistants
US11231904B2 (en) 2015-03-06 2022-01-25 Apple Inc. Reducing response latency of intelligent automated assistants
US10930282B2 (en) 2015-03-08 2021-02-23 Apple Inc. Competing devices responding to voice triggers
US11087759B2 (en) 2015-03-08 2021-08-10 Apple Inc. Virtual assistant activation
US9721566B2 (en) 2015-03-08 2017-08-01 Apple Inc. Competing devices responding to voice triggers
US10529332B2 (en) 2015-03-08 2020-01-07 Apple Inc. Virtual assistant activation
US10567477B2 (en) 2015-03-08 2020-02-18 Apple Inc. Virtual assistant continuity
US10311871B2 (en) 2015-03-08 2019-06-04 Apple Inc. Competing devices responding to voice triggers
US9886953B2 (en) 2015-03-08 2018-02-06 Apple Inc. Virtual assistant activation
US9899019B2 (en) 2015-03-18 2018-02-20 Apple Inc. Systems and methods for structured stem and suffix language models
US9842105B2 (en) 2015-04-16 2017-12-12 Apple Inc. Parsimonious continuous-space phrase representations for natural language processing
US11468282B2 (en) 2015-05-15 2022-10-11 Apple Inc. Virtual assistant in a communication session
US10083688B2 (en) 2015-05-27 2018-09-25 Apple Inc. Device voice control for selecting a displayed affordance
US11127397B2 (en) 2015-05-27 2021-09-21 Apple Inc. Device voice control
US10127220B2 (en) 2015-06-04 2018-11-13 Apple Inc. Language identification from short strings
US10356243B2 (en) 2015-06-05 2019-07-16 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US10681212B2 (en) 2015-06-05 2020-06-09 Apple Inc. Virtual assistant aided communication with 3rd party service in a communication session
US10101822B2 (en) 2015-06-05 2018-10-16 Apple Inc. Language input correction
US11025565B2 (en) 2015-06-07 2021-06-01 Apple Inc. Personalized prediction of responses for instant messaging
US10255907B2 (en) 2015-06-07 2019-04-09 Apple Inc. Automatic accent detection using acoustic models
US10186254B2 (en) 2015-06-07 2019-01-22 Apple Inc. Context-based endpoint detection
US11010127B2 (en) 2015-06-29 2021-05-18 Apple Inc. Virtual assistant for media playback
US9900698B2 (en) 2015-06-30 2018-02-20 Apple Inc. Graphene composite acoustic diaphragm
US10747498B2 (en) 2015-09-08 2020-08-18 Apple Inc. Zero latency digital assistant
US11500672B2 (en) 2015-09-08 2022-11-15 Apple Inc. Distributed personal assistant
US10671428B2 (en) 2015-09-08 2020-06-02 Apple Inc. Distributed personal assistant
US9697820B2 (en) 2015-09-24 2017-07-04 Apple Inc. Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks
US10366158B2 (en) 2015-09-29 2019-07-30 Apple Inc. Efficient word encoding for recurrent neural network language models
US9858948B2 (en) 2015-09-29 2018-01-02 Apple Inc. Electronic equipment with ambient noise sensing input circuitry
US11010550B2 (en) 2015-09-29 2021-05-18 Apple Inc. Unified language modeling framework for word prediction, auto-completion and auto-correction
US11587559B2 (en) 2015-09-30 2023-02-21 Apple Inc. Intelligent device identification
US9961475B2 (en) 2015-10-08 2018-05-01 Qualcomm Incorporated Conversion from object-based audio to HOA
US10249312B2 (en) 2015-10-08 2019-04-02 Qualcomm Incorporated Quantization of spatial vectors
US9961467B2 (en) 2015-10-08 2018-05-01 Qualcomm Incorporated Conversion from channel-based audio to HOA
US11526368B2 (en) 2015-11-06 2022-12-13 Apple Inc. Intelligent automated assistant in a messaging environment
US10691473B2 (en) 2015-11-06 2020-06-23 Apple Inc. Intelligent automated assistant in a messaging environment
US10354652B2 (en) 2015-12-02 2019-07-16 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10049668B2 (en) 2015-12-02 2018-08-14 Apple Inc. Applying neural network language models to weighted finite state transducers for automatic speech recognition
US10942703B2 (en) 2015-12-23 2021-03-09 Apple Inc. Proactive assistance based on dialog communication between devices
US10223066B2 (en) 2015-12-23 2019-03-05 Apple Inc. Proactive assistance based on dialog communication between devices
US9881628B2 (en) 2016-01-05 2018-01-30 Qualcomm Incorporated Mixed domain coding of audio
US10446143B2 (en) 2016-03-14 2019-10-15 Apple Inc. Identification of voice inputs providing credentials
US9934775B2 (en) 2016-05-26 2018-04-03 Apple Inc. Unit-selection text-to-speech synthesis based on predicted concatenation parameters
US9972304B2 (en) 2016-06-03 2018-05-15 Apple Inc. Privacy preserving distributed evaluation framework for embedded personalized systems
US11227589B2 (en) 2016-06-06 2022-01-18 Apple Inc. Intelligent list reading
US10249300B2 (en) 2016-06-06 2019-04-02 Apple Inc. Intelligent list reading
US11069347B2 (en) 2016-06-08 2021-07-20 Apple Inc. Intelligent automated assistant for media exploration
US10049663B2 (en) 2016-06-08 2018-08-14 Apple Inc. Intelligent automated assistant for media exploration
US10354011B2 (en) 2016-06-09 2019-07-16 Apple Inc. Intelligent automated assistant in a home environment
US10067938B2 (en) 2016-06-10 2018-09-04 Apple Inc. Multilingual word prediction
US10509862B2 (en) 2016-06-10 2019-12-17 Apple Inc. Dynamic phrase expansion of language input
US10490187B2 (en) 2016-06-10 2019-11-26 Apple Inc. Digital assistant providing automated status report
US10733993B2 (en) 2016-06-10 2020-08-04 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10192552B2 (en) 2016-06-10 2019-01-29 Apple Inc. Digital assistant providing whispered speech
US11037565B2 (en) 2016-06-10 2021-06-15 Apple Inc. Intelligent digital assistant in a multi-tasking environment
US10521466B2 (en) 2016-06-11 2019-12-31 Apple Inc. Data driven natural language event detection and classification
US10297253B2 (en) 2016-06-11 2019-05-21 Apple Inc. Application integration with a digital assistant
US10269345B2 (en) 2016-06-11 2019-04-23 Apple Inc. Intelligent task discovery
US11152002B2 (en) 2016-06-11 2021-10-19 Apple Inc. Application integration with a digital assistant
US10942702B2 (en) 2016-06-11 2021-03-09 Apple Inc. Intelligent device arbitration and control
US10580409B2 (en) 2016-06-11 2020-03-03 Apple Inc. Application integration with a digital assistant
US10089072B2 (en) 2016-06-11 2018-10-02 Apple Inc. Intelligent device arbitration and control
US10474753B2 (en) 2016-09-07 2019-11-12 Apple Inc. Language identification using recurrent neural networks
US10553215B2 (en) 2016-09-23 2020-02-04 Apple Inc. Intelligent automated assistant
US10043516B2 (en) 2016-09-23 2018-08-07 Apple Inc. Intelligent automated assistant
GB2554446A (en) * 2016-09-28 2018-04-04 Nokia Technologies Oy Spatial audio signal format generation from a microphone array using adaptive capture
US11281993B2 (en) 2016-12-05 2022-03-22 Apple Inc. Model and ensemble compression for metric learning
US10593346B2 (en) 2016-12-22 2020-03-17 Apple Inc. Rank-reduced token representation for automatic speech recognition
US11204787B2 (en) 2017-01-09 2021-12-21 Apple Inc. Application integration with a digital assistant
US11656884B2 (en) 2017-01-09 2023-05-23 Apple Inc. Application integration with a digital assistant
US10417266B2 (en) 2017-05-09 2019-09-17 Apple Inc. Context-aware ranking of intelligent response suggestions
US10741181B2 (en) 2017-05-09 2020-08-11 Apple Inc. User interface for correcting recognition errors
US10332518B2 (en) 2017-05-09 2019-06-25 Apple Inc. User interface for correcting recognition errors
US10755703B2 (en) 2017-05-11 2020-08-25 Apple Inc. Offline personal assistant
US10726832B2 (en) 2017-05-11 2020-07-28 Apple Inc. Maintaining privacy of personal information
US10847142B2 (en) 2017-05-11 2020-11-24 Apple Inc. Maintaining privacy of personal information
US10395654B2 (en) 2017-05-11 2019-08-27 Apple Inc. Text normalization based on a data-driven learning network
US11405466B2 (en) 2017-05-12 2022-08-02 Apple Inc. Synchronization and task delegation of a digital assistant
US10791176B2 (en) 2017-05-12 2020-09-29 Apple Inc. Synchronization and task delegation of a digital assistant
US10789945B2 (en) 2017-05-12 2020-09-29 Apple Inc. Low-latency intelligent automated assistant
US11301477B2 (en) 2017-05-12 2022-04-12 Apple Inc. Feedback analysis of a digital assistant
US10410637B2 (en) 2017-05-12 2019-09-10 Apple Inc. User-specific acoustic models
US10482874B2 (en) 2017-05-15 2019-11-19 Apple Inc. Hierarchical belief states for digital assistants
US10810274B2 (en) 2017-05-15 2020-10-20 Apple Inc. Optimizing dialogue policy decisions for digital assistants using implicit feedback
US10311144B2 (en) 2017-05-16 2019-06-04 Apple Inc. Emoji word sense disambiguation
US10303715B2 (en) 2017-05-16 2019-05-28 Apple Inc. Intelligent automated assistant for media exploration
US11217255B2 (en) 2017-05-16 2022-01-04 Apple Inc. Far-field extension for digital assistant services
US10748546B2 (en) 2017-05-16 2020-08-18 Apple Inc. Digital assistant services based on device capabilities
US10403278B2 (en) 2017-05-16 2019-09-03 Apple Inc. Methods and systems for phonetic matching in digital assistant services
US10909171B2 (en) 2017-05-16 2021-02-02 Apple Inc. Intelligent automated assistant for media exploration
US10657328B2 (en) 2017-06-02 2020-05-19 Apple Inc. Multi-task recurrent neural network architecture for efficient morphology handling in neural language modeling
US10445429B2 (en) 2017-09-21 2019-10-15 Apple Inc. Natural language understanding using vocabularies with compressed serialized tries
US11907426B2 (en) 2017-09-25 2024-02-20 Apple Inc. Electronic device with actuators for producing haptic and audio output along a device housing
US11307661B2 (en) 2017-09-25 2022-04-19 Apple Inc. Electronic device with actuators for producing haptic and audio output along a device housing
US10755051B2 (en) 2017-09-29 2020-08-25 Apple Inc. Rule-based natural language processing
US10636424B2 (en) 2017-11-30 2020-04-28 Apple Inc. Multi-turn canned dialog
US10733982B2 (en) 2018-01-08 2020-08-04 Apple Inc. Multi-directional dialog
US10733375B2 (en) 2018-01-31 2020-08-04 Apple Inc. Knowledge-based framework for improving natural language understanding
US10789959B2 (en) 2018-03-02 2020-09-29 Apple Inc. Training speaker recognition models for digital assistants
US10592604B2 (en) 2018-03-12 2020-03-17 Apple Inc. Inverse text normalization for automatic speech recognition
US10818288B2 (en) 2018-03-26 2020-10-27 Apple Inc. Natural assistant interaction
US10909331B2 (en) 2018-03-30 2021-02-02 Apple Inc. Implicit identification of translation payload with neural machine translation
US11145294B2 (en) 2018-05-07 2021-10-12 Apple Inc. Intelligent automated assistant for delivering content from user experiences
US10928918B2 (en) 2018-05-07 2021-02-23 Apple Inc. Raise to speak
US10984780B2 (en) 2018-05-21 2021-04-20 Apple Inc. Global semantic word embeddings using bi-directional recurrent neural networks
US10892996B2 (en) 2018-06-01 2021-01-12 Apple Inc. Variable latency device coordination
US10684703B2 (en) 2018-06-01 2020-06-16 Apple Inc. Attention aware virtual assistant dismissal
US10984798B2 (en) 2018-06-01 2021-04-20 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US11386266B2 (en) 2018-06-01 2022-07-12 Apple Inc. Text correction
US10403283B1 (en) 2018-06-01 2019-09-03 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US11009970B2 (en) 2018-06-01 2021-05-18 Apple Inc. Attention aware virtual assistant dismissal
US11495218B2 (en) 2018-06-01 2022-11-08 Apple Inc. Virtual assistant operation in multi-device environments
US10720160B2 (en) 2018-06-01 2020-07-21 Apple Inc. Voice interaction at a primary device to access call functionality of a companion device
US10504518B1 (en) 2018-06-03 2019-12-10 Apple Inc. Accelerated task performance
US10496705B1 (en) 2018-06-03 2019-12-03 Apple Inc. Accelerated task performance
US10944859B2 (en) 2018-06-03 2021-03-09 Apple Inc. Accelerated task performance
US10757491B1 (en) 2018-06-11 2020-08-25 Apple Inc. Wearable interactive audio device
US11743623B2 (en) 2018-06-11 2023-08-29 Apple Inc. Wearable interactive audio device
US10873798B1 (en) 2018-06-11 2020-12-22 Apple Inc. Detecting through-body inputs at a wearable audio device
US11334032B2 (en) 2018-08-30 2022-05-17 Apple Inc. Electronic watch with barometric vent
US11740591B2 (en) 2018-08-30 2023-08-29 Apple Inc. Electronic watch with barometric vent
US11010561B2 (en) 2018-09-27 2021-05-18 Apple Inc. Sentiment prediction from textual data
US11561144B1 (en) 2018-09-27 2023-01-24 Apple Inc. Wearable electronic device with fluid-based pressure sensing
US10839159B2 (en) 2018-09-28 2020-11-17 Apple Inc. Named entity normalization in a spoken dialog system
US11170166B2 (en) 2018-09-28 2021-11-09 Apple Inc. Neural typographical error modeling via generative adversarial networks
US11462215B2 (en) 2018-09-28 2022-10-04 Apple Inc. Multi-modal inputs for voice commands
US11475898B2 (en) 2018-10-26 2022-10-18 Apple Inc. Low-latency multi-speaker speech recognition
US11638059B2 (en) 2019-01-04 2023-04-25 Apple Inc. Content playback on multiple devices
US11348573B2 (en) 2019-03-18 2022-05-31 Apple Inc. Multimodality in digital assistant systems
US11857063B2 (en) 2019-04-17 2024-01-02 Apple Inc. Audio output system for a wirelessly locatable tag
US11475884B2 (en) 2019-05-06 2022-10-18 Apple Inc. Reducing digital assistant latency when a language is incorrectly determined
US11307752B2 (en) 2019-05-06 2022-04-19 Apple Inc. User configurable task triggers
US11217251B2 (en) 2019-05-06 2022-01-04 Apple Inc. Spoken notifications
US11423908B2 (en) 2019-05-06 2022-08-23 Apple Inc. Interpreting spoken requests
US11140099B2 (en) 2019-05-21 2021-10-05 Apple Inc. Providing message response suggestions
US11496600B2 (en) 2019-05-31 2022-11-08 Apple Inc. Remote execution of machine-learned models
US11360739B2 (en) 2019-05-31 2022-06-14 Apple Inc. User activity shortcut suggestions
US11237797B2 (en) 2019-05-31 2022-02-01 Apple Inc. User activity shortcut suggestions
US11289073B2 (en) 2019-05-31 2022-03-29 Apple Inc. Device text to speech
US11360641B2 (en) 2019-06-01 2022-06-14 Apple Inc. Increasing the relevance of new available information
US11488406B2 (en) 2019-09-25 2022-11-01 Apple Inc. Text detection using global geometry estimators

Similar Documents

Publication Publication Date Title
EP2094032A1 (en) Audio signal, method and apparatus for encoding or transmitting the same and method and apparatus for processing the same
JP5032977B2 (en) Multi-channel encoder
US9794721B2 (en) System and method for capturing, encoding, distributing, and decoding immersive audio
KR101315077B1 (en) Scalable multi-channel audio coding
RU2431940C2 (en) Apparatus and method for multichannel parametric conversion
JP4601669B2 (en) Apparatus and method for generating a multi-channel signal or parameter data set
JP4322207B2 (en) Audio encoding method
CN101479786B (en) Method for encoding and decoding object-based audio signal and apparatus thereof
JP2013174891A (en) High quality multi-channel audio encoding and decoding apparatus
US20200013426A1 (en) Synchronizing enhanced audio transports with backward compatible audio transports
CN104054126A (en) Spatial audio rendering and encoding
NO340421B1 (en) Frequency-based coding of audio channels in parametric multi-channel coding system
KR102172279B1 (en) Encoding and decoding apparatus for supporting scalable multichannel audio signal, and method performed by the apparatus
JP5173811B2 (en) Audio signal decoding method and apparatus
KR102149411B1 (en) Apparatus and method for generating audio data, apparatus and method for playing audio data
US11081116B2 (en) Embedding enhanced audio transports in backward compatible audio bitstreams
KR100636145B1 (en) Extended high resolution audio signal encoder and decoder thereof
US9312971B2 (en) Apparatus and method for transmitting audio object
US11062713B2 (en) Spatially formatted enhanced audio data for backward compatible audio bitstreams
JP2014204322A (en) Acoustic signal reproducing device and acoustic signal preparation device
KR102370348B1 (en) Apparatus and method for providing the audio metadata, apparatus and method for providing the audio data, apparatus and method for playing the audio data
KR20220030983A (en) Apparatus and method for providing the audio metadata, apparatus and method for providing the audio data, apparatus and method for playing the audio data
JP2022553111A (en) System, method, and apparatus for conversion of channel-based audio to object-based audio

Legal Events

Date Code Title Description
PUAI Public reference made under Article 153(3) EPC to a published international application that has entered the European phase; Free format text: ORIGINAL CODE: 0009012
AK Designated contracting states; Kind code of ref document: A1; Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MT NL NO PL PT RO SE SI SK TR
AX Request for extension of the European patent; Extension state: AL BA MK RS
17P Request for examination filed; Effective date: 20100226
AKX Designation fees paid; Designated state(s): DE FR GB
17Q First examination report despatched; Effective date: 20100713
STAA Information on the status of an EP patent application or granted EP patent; Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN
18D Application deemed to be withdrawn; Effective date: 20120515