CN101523483B

CN101523483B - Method for the rendition of text information by speech in a vehicle

Info

Publication number: CN101523483B
Application number: CN2007800382076A
Authority: CN
Inventors: S·泽尔朔普
Original assignee: Audi AG
Current assignee: Audi AG
Priority date: 2006-11-29
Filing date: 2007-10-19
Publication date: 2013-07-24
Anticipated expiration: 2027-10-19
Also published as: DE102006056286A1; CN101523483A; WO2008064742A1; DE102006056286B4

Abstract

The invention relates to a method for the rendition of text information by speech in a vehicle according to which the following steps are carried out: a) Preparation of text elements in a unit external to the vehicle; b) Production and preparation of specific pronunciation information for the respective text elements; c) Transfer of the text elements and the specific pronunciation information to a processing unit in the vehicle; d) Assignment of the specific pronunciation information to the respective text elements; e) Rendition of the text elements, taking into consideration the specific pronunciation information, by an electronic speech device in the vehicle.

Description

Method for reproducing text information by speech in a vehicle

Technical field

The present invention relates to a method for reproducing text information by speech in a vehicle.

Background art

Existing systems in vehicles are known, for example navigation systems, that can reproduce information stored as text modules (Textbausteine) by means of acoustic speech signals. These systems are limited to the stored basic text elements (Basis-Textelemente); only these basic text elements can be reproduced by speech. Such a system cannot be extended.

In addition, systems are known in which text information received from outside can be reproduced by speech in the vehicle. A significant problem here is that this text information cannot be reproduced by speech unambiguously and intelligibly.

Summary of the invention

Therefore, the technical problem to be solved by the present invention is to provide a method with which the reproduction of text information by speech in a vehicle can be improved.

This technical problem is solved by a method for reproducing text information by speech in a vehicle, in which the following steps are carried out:

a) providing text elements in a unit external to the vehicle;

b) generating and providing specific pronunciation information for each text element;

c) transmitting the text elements and the specific pronunciation information to a processing unit inside the vehicle;

d) assigning the specific pronunciation information to the respective text elements;

e) reproducing the text elements, taking the specific pronunciation information into account, by an electronic speech device in the vehicle,

wherein basic text elements and corresponding basic pronunciation information are stored in a unit inside the vehicle before the speech output system is put into operation; and

the text elements transmitted into the vehicle are compared with the basic text elements and, where they differ, the specific pronunciation information of the text elements is taken into account for the speech output of the text.

Embodiment

In the method according to the invention, text information is reproduced in a vehicle by means of speech signals. The text information to be reproduced is provided as text elements in a unit external to the vehicle. In principle, the text elements can also be generated in this external unit.

In addition, specific pronunciation information is generated and provided for each text element. The text elements and the specific pronunciation information are transmitted to a processing unit inside the vehicle. The specific pronunciation information is assigned to the respective text elements. The text elements are then reproduced by an electronic speech device in the vehicle, taking the specific pronunciation information into account. In this way, a wide variety of different and individual text information can be conveyed in the vehicle with significantly improved speech reproduction. In particular, by optimizing the text information from outside with additional information provided as specific pronunciation information, the unambiguity and intelligibility of the speech signal can be improved significantly. Even very complex texts can thus be reproduced unambiguously and intelligibly.
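The sequence of steps a) to e) can be illustrated by the following minimal Python sketch. It is not part of the original disclosure: all function names, the `TextElement` type, and the respelled pronunciation string are hypothetical, the transmission is reduced to a simple copy, and the speech device is replaced by a function that returns the string that would be spoken.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextElement:
    """A text element: a word, a sentence element, or one or more sentences."""
    text: str


def external_unit_prepare(texts, lexicon):
    """Steps a) and b): provide text elements and their specific pronunciation information."""
    elements = [TextElement(t) for t in texts]
    pronunciations = {t: lexicon[t] for t in texts if t in lexicon}
    return elements, pronunciations


def transfer(elements, pronunciations):
    """Step c): stand-in for the (for example wireless) transmission into the vehicle."""
    return list(elements), dict(pronunciations)


def assign(elements, pronunciations):
    """Step d): pair each text element with its pronunciation information, if any."""
    return [(element, pronunciations.get(element.text)) for element in elements]


def render(assigned):
    """Step e): stand-in for the speech device; returns what would be spoken."""
    return " ".join(pron if pron is not None else element.text
                    for element, pron in assigned)


# Hypothetical lexicon held by the external unit (respelling is illustrative only).
lexicon = {"Ingolstadt": "ING-gol-shtat"}
elements, pronunciations = external_unit_prepare(["Exit", "Ingolstadt"], lexicon)
spoken = render(assign(*transfer(elements, pronunciations)))
```

In this sketch the assignment of step d) is performed after the transfer, that is, inside the vehicle; the description also allows the assignment to be made outside the vehicle.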

Preferably, the specific pronunciation information is assigned to the respective text elements outside the vehicle. This increases flexibility with respect to the text to be reproduced. In addition, the electronic storage space required in the vehicle can be reduced significantly.

However, the specific pronunciation information can also be assigned to the respective text elements inside the vehicle.

Preferably, the pronunciation information is stored in a database, which is searched as needed for the required individual items of information.
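A database of this kind could, for example, be sketched with Python's standard `sqlite3` module as follows. The table name, column names, and example entries are assumptions made purely for illustration; the description does not prescribe any particular database technology or schema.

```python
import sqlite3


def build_pronunciation_db(entries):
    """Create an in-memory database from (text_element, phonemes) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pronunciation ("
        "text_element TEXT PRIMARY KEY, phonemes TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO pronunciation VALUES (?, ?)", entries)
    conn.commit()
    return conn


def lookup(conn, text_element):
    """Search the database on demand; returns None if no entry exists."""
    row = conn.execute(
        "SELECT phonemes FROM pronunciation WHERE text_element = ?",
        (text_element,),
    ).fetchone()
    return row[0] if row else None


# Hypothetical entries; the respellings are illustrative only.
conn = build_pronunciation_db(
    [("Audi", "OW-dee"), ("Ingolstadt", "ING-gol-shtat")]
)
```

A call such as `lookup(conn, "Audi")` returns the stored entry, while a text element without an entry yields `None`, so that the caller can fall back to the default pronunciation of the speech synthesis.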

Preferably, the text elements and the specific pronunciation information are transmitted into the vehicle while the vehicle is in operation, in particular wirelessly.

Preferably, the specific pronunciation information and/or its assignment to the text information is generated in a standardized format. Preferably, the specific pronunciation information and/or its assignment to the text information can be generated in the Speech Synthesis Markup Language (SSML).
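As an illustration of how pronunciation information and its assignment to text elements might be expressed in SSML, the following Python sketch wraps each word that has specific pronunciation information in an SSML `phoneme` element. The function names, the example sentence, and the phonetic string are hypothetical and are not taken from the original disclosure; only the `speak` and `phoneme` elements and the namespace come from the W3C SSML recommendation.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML recommendation.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def _append_text(parent, last_child, text):
    """Append plain text after the last child, or to the parent if there is none."""
    if last_child is None:
        parent.text = (parent.text or "") + text
    else:
        last_child.tail = (last_child.tail or "") + text


def to_ssml(words, pronunciations, lang="de-DE"):
    """Build an SSML document; words with an entry get a <phoneme> element."""
    speak = ET.Element(
        "speak", {"version": "1.0", "xmlns": SSML_NS, "xml:lang": lang}
    )
    last = None
    for index, word in enumerate(words):
        separator = " " if index else ""
        if word in pronunciations:
            _append_text(speak, last, separator)
            last = ET.SubElement(
                speak, "phoneme", {"alphabet": "ipa", "ph": pronunciations[word]}
            )
            last.text = word
        else:
            _append_text(speak, last, separator + word)
    return ET.tostring(speak, encoding="unicode")


# Illustrative phonetic string, not an authoritative transcription.
ssml = to_ssml(["Ausfahrt", "Ingolstadt", "nehmen"], {"Ingolstadt": "ˈɪŋɡɔlʃtat"})
```

Because the pronunciation travels inside the markup together with the text, the assignment of step d) is already contained in the transmitted document in this variant.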

Preferably, basic text elements are stored together with corresponding basic pronunciation information in a unit or the processing unit inside the vehicle before initial commissioning, and thus before the speech output system is delivered to the end user.

Preferably, the text elements transmitted into the vehicle are compared with the basic text elements and, where they differ, the specific pronunciation information of the transmitted text elements is taken into account for the correct speech output of the text.

Preferably, the text elements and the specific pronunciation information are transmitted via a digital broadcast medium, in particular via a digital broadcast network.

A text element is understood to comprise individual words, sentence elements, or whole sentences. A text element can also comprise several sentences.

Speech synthesis generates a speech signal from text information by reading the text aloud according to stored templates and pronunciation schemes (Ausspracheschemata). The underlying software for speech output is called a speech synthesis or text-to-speech (TTS) engine. A TTS engine can be supported by adding pronunciation information to the text, for example on the pronunciation of individual words or, as grammar, on sentence structure. This can be used, for example, in navigation systems. A TTS engine has the advantage that no human speaker is needed and that new so-called prompts, i.e. text outputs, can also be generated later. The optimized audio files generated by the TTS engine are stored in the vehicle and are called up by events, as in current navigation output, where, for example, on reaching a specific distance from the next destination, a speech signal announces that the driver should turn left after 200 m. Sentence elements are combined dynamically from modules (Bausteinen) stored in the vehicle. These basic text elements are stored in the system as basic information so that basic functionality for the speech output of text information can generally be guaranteed. However, this is a fixed, predefined finite set of text elements, which is not sufficient for widely varying text information and expressions.
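The event-triggered composition of a prompt from stored modules, as in the navigation example above, can be sketched as follows. The module identifiers, the wording of the modules, and the trigger logic are assumptions chosen for illustration; a real system would play stored audio rather than return a string.

```python
# Hypothetical stored modules (basic text elements) held in the vehicle.
MODULES = {
    "after": "after",
    "meters": "meters",
    "turn_left": "turn left",
    "turn_right": "turn right",
}


def compose_prompt(maneuver, distance_m):
    """Combine stored modules dynamically into one prompt."""
    if maneuver not in ("turn_left", "turn_right"):
        raise ValueError(f"no stored module for maneuver {maneuver!r}")
    return f"{MODULES['after']} {distance_m} {MODULES['meters']} {MODULES[maneuver]}"


def on_distance_event(distance_to_maneuver_m, trigger_m, maneuver):
    """Return the prompt once the trigger distance is reached, otherwise None."""
    if distance_to_maneuver_m <= trigger_m:
        return compose_prompt(maneuver, trigger_m)
    return None
```

The sketch also shows the limitation named in the description: a maneuver or expression for which no module has been stored in advance cannot be announced at all.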

With the method according to the invention or its preferred embodiments, the speech output of widely varying text information can now be optimized, this optimization being carried out essentially outside the vehicle, in a unit external to the vehicle. During the optimization, a transcript (Transskript) is generated, i.e. a phonetic notation (Lautsprache) specific to the TTS engine. This transcript can be transmitted dynamically into the vehicle, or can be stored in the vehicle after transmission. The audio output then takes place in the vehicle. The text to be read aloud, together with the additional content or specific pronunciation information, can be converted into audio output in the vehicle by the TTS engine, in a manner similar to so-called offboard conversion (Offboardumsetzung). A significant advantage of this is that many new and different text information contents can be supplied to the vehicle later and reproduced by the system with improved speech output. Text information content can thus be transmitted via broadcast media, in particular wirelessly, and output unambiguously in the vehicle by speech signal. The additional content generated externally as specific pronunciation information can be used for unambiguous pronunciation in the vehicle and ensures a significant improvement in intelligibility. Content optimized for pronunciation can also be sent to the vehicle via communication services.

The TTS engine can interpret the optimization and produce a satisfactory output. In addition, the method significantly reduces the required storage space, because storing a large number of basic text elements with corresponding basic pronunciation information as a basic vocabulary (Wortbasis) in the system requires 10 to 100 times the storage space needed to store, in text format, a text containing the optimization. Therefore, preferably, the text information is optimized for speech output, the audio file is generated offboard, i.e. outside the vehicle, and the audio file is output in the vehicle.

Preferably, the speech optimization (Sprachoptimierung) is described in a standardized form, so that different TTS engines interpret the content in the same way. This is particularly advantageous when messages are introduced dynamically, because these messages must be processed by all receivers. One possible standard for speech optimization is SSML, from which, for example, a subset can be defined that the respective receiver systems support and that the transmitting unit provides.
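Whether a message stays within an agreed SSML subset could be checked, for example, with the following Python sketch. The choice of supported elements is purely hypothetical; the description only states that a subset can be defined, not which elements it contains.

```python
import xml.etree.ElementTree as ET

# Hypothetical subset agreed between the transmitting unit and the receivers.
SUPPORTED_ELEMENTS = {"speak", "phoneme", "break", "say-as"}


def local_name(tag):
    """Strip an XML namespace of the form '{uri}name' from a tag."""
    return tag.rsplit("}", 1)[-1]


def unsupported_elements(ssml, supported=SUPPORTED_ELEMENTS):
    """Return the sorted element names in the message that are outside the subset."""
    root = ET.fromstring(ssml)
    used = {local_name(element.tag) for element in root.iter()}
    return sorted(used - set(supported))


message = (
    '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis">'
    '<prosody rate="slow">Hallo</prosody><break/></speak>'
)
```

For the example message, `unsupported_elements(message)` reports the `prosody` element, so a transmitting unit could reject or simplify the message before broadcasting it to all receivers.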

It is particularly advantageous to use automatic optimization as the basis for the speech output of widely varying text information. For example, it can be provided that text information sent for communication services is updated continuously, so that manually checking the pronunciation characteristics of this content is very laborious. Automatic optimization can improve on this.

One exemplary approach to automatic optimization is as follows. First, the text is entered and the pronunciation database containing specific pronunciation information is loaded. Then the text elements of the transmitted text are compared with the basic text elements, and the corresponding pronunciation rules are attached to the text. Since there is both pronunciation information stored and assigned in advance for the basic text elements and pronunciation information specific to the text elements transmitted with the text, the whole text can draw on the respective pronunciation information and be spoken with the best possible pronunciation. Even if text portions are transmitted that cannot be understood from the basic text elements or are not covered by them, these largely unknown text elements are also rendered unambiguously and clearly by the speech signal, because specific pronunciation information has also been assigned to them; this specific pronunciation information is generated individually offboard and is transmitted along as additional information.
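The automatic optimization described above can be illustrated by the following Python sketch. It assumes, for simplicity, that text elements are single words obtained by a regular expression and that both the basic pronunciation information and the pronunciation database are plain dictionaries; the function name and all example entries are hypothetical.

```python
import re


def optimize_text(text, basic, database):
    """Attach pronunciation information to every text element of a text.

    basic:    basic text elements with their basic pronunciation information
    database: specific pronunciation information for further text elements
    Returns a list of (text_element, pronunciation or None) pairs.
    """
    # Split into words and single punctuation marks (a simplifying assumption).
    elements = re.findall(r"\w+|[^\w\s]", text)
    annotated = []
    for element in elements:
        if element in basic:
            # Covered by the basic text elements stored in the vehicle.
            pronunciation = basic[element]
        else:
            # Differs from the basic set: use the specific pronunciation information.
            pronunciation = database.get(element)
        annotated.append((element, pronunciation))
    return annotated


# Hypothetical data; the respellings are illustrative only.
basic = {"Stau": "shtow", "bei": "by"}
database = {"Ingolstadt": "ING-gol-shtat"}
annotated = optimize_text("Stau bei Ingolstadt.", basic, database)
```

Elements for which neither source holds an entry are returned with `None`, which marks them as candidates for the default pronunciation of the TTS engine.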

The output of the whole text can then take place at a time determined by the driver or be reproduced automatically. The driver can therefore determine the time and duration of the reproduction himself.

It can further be provided that post-processing, in particular manual post-processing, can be carried out by an editor. This allows further improvement and, as it were, starts a learning mode.

Claims

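1. A method for reproducing text information by speech in a vehicle, in which the following steps are carried out:

a) providing text elements in a unit external to the vehicle;

b) generating and providing specific pronunciation information for each text element;

c) transmitting the text elements and the specific pronunciation information to a processing unit inside the vehicle;

d) assigning the specific pronunciation information to the respective text elements;

e) reproducing the text elements, taking the specific pronunciation information into account, by an electronic speech device in the vehicle,

characterized in that basic text elements and corresponding basic pronunciation information are stored in a unit inside the vehicle before the speech output system is put into operation; and

the text elements transmitted into the vehicle are compared with the basic text elements and, where they differ, the specific pronunciation information of the text elements is taken into account for the speech output of the text.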

2. The method according to claim 1, characterized in that the specific pronunciation information is assigned to the respective text elements outside the vehicle.

3. The method according to claim 1, characterized in that the specific pronunciation information is assigned to the respective text elements inside the vehicle.

4. The method according to any one of claims 1 to 3, characterized in that the specific pronunciation information is stored in a database, the database being searched as needed.

5. The method according to any one of claims 1 to 3, characterized in that the text elements and the specific pronunciation information are transmitted into the vehicle while the vehicle is in operation.

6. The method according to claim 5, characterized in that the text elements and the specific pronunciation information are transmitted wirelessly into the vehicle while the vehicle is in operation.

7. The method according to any one of claims 1 to 3, characterized in that the specific pronunciation information and/or its assignment to the text elements is generated in a standardized format.

8. The method according to claim 7, characterized in that the specific pronunciation information and/or its assignment to the text elements is generated in the SSML language.

9. The method according to any one of claims 1 to 3, characterized in that the text elements and the specific pronunciation information are transmitted via a broadcast medium.

10. The method according to claim 9, characterized in that the text elements and the specific pronunciation information are transmitted via a digital broadcast network.