CN103268316A - Image recognition and voiced translation method and image recognition and voiced translation device

Info

Publication number: CN103268316A
Application number: CN2013102054637A
Authority: CN
Inventors: 于洋
Original assignee: JIANGSU YUANKUN TECHNOLOGY DEVELOPMENT Co Ltd
Current assignee: JIANGSU YUANKUN TECHNOLOGY DEVELOPMENT Co Ltd
Priority date: 2013-05-27
Filing date: 2013-05-27
Publication date: 2013-08-28

Abstract

The invention relates to an image recognition and voice translation method and an image recognition and voice translation device. The method comprises the steps of image acquisition, image recognition, language analysis, translation processing, translation result storage, speech conversion and speech output. The device comprises an image acquisition module, an image recognizer, a language analyzer, a semantic rule base, a translation processor, a storage device, an Internet interface and a speech recognition converter, wherein the image recognizer includes a lexicon and a language library. The device is not subject to the limitations of speech input, can translate as soon as a picture is taken, supports shooting with the cameras of various mobile terminals, lowers the pixel requirements on those cameras, supports recognition of multiple picture formats, outputs translation results as speech, and achieves high translation accuracy.

Description

An image recognition and voice translation method and a translation device thereof

Technical field

The present invention relates to image recognition and voice translation technology, and more specifically to an image recognition and voice translation method and a translation device thereof, for use in the field of translation.

Background technology

Automatic speech translation passes speech in one language through speech recognition and then translates it into speech in another language. The most common form of automatic speech translation translates standard foreign-language speech into standard Chinese speech.

Another direction of development in automatic speech translation is the translation of dialect speech into standard speech. Work on dialect translation engines that automatically translate Chinese dialect speech into Mandarin speech is gradually getting under way. The difficulty of dialect recognition, differences in speaking rate, and how standard the speaker's Mandarin is all directly affect the accuracy of the translation result. As regards speaking rate, current speech recognition systems use a single sensor, namely a sound sensor, for speech recognition. They cannot segment the captured speech information and can only track and recognize it at a constant rate, matching the speech information against the templates in the system library. If the rate of the captured information differs little from that of the templates stored in the system library, the system works normally; otherwise misjudgments occur. In real life, however, people do not always speak at the same rate but vary it constantly, which inevitably increases the error of the speech recognition system and so reduces its practicality.

Automatic speech translation works well for words and phrases, but it cannot accurately recognize and accurately translate longer sentences.

Automatic speech translation cannot translate words or sentences that the user is unable to read aloud.

Picture translation passes a picture through image recognition and then extracts the information in the picture; the translation result is mostly text. Existing picture translation translates the picture information captured by a camera and displays the translation on the mobile phone. Picture translation places high demands on picture resolution. The cameras of ordinary users' mobile phones cannot meet the pixel standard that picture translation requires, so pictures taken with such phones cannot be recognized and translated, and picture translation cannot be widely adopted. At present, picture translation is best suited to pictures carrying short text, such as labels and posters. For other pictures containing sentences, picture translation cannot translate directly and accurately. The better workaround is to scan and recognize the text first, and then translate it with a translation tool or by hand. This approach does not truly achieve picture translation.

Summary of the invention

The present invention overcomes the above defects of automatic speech translation technology and, in combination with picture translation, proposes an image recognition and voice translation method and a translation device thereof.

The image recognition and voice translation system has no speaking-rate problem and no dialect problem, and it is not restricted by spoken language: as long as a picture in a supported format (such as gif, jpeg or png) is provided, the text on the picture is translated into the target language. The main technical problem solved by the present invention is to recognize the language on a picture and output it as speech; the invention also addresses increasing the picture recognition rate and improving translation accuracy. The system places lower demands on picture resolution, avoids recognition failures caused by low pixel counts, and greatly improves translation efficiency and accuracy.
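The format requirement above can be illustrated with a minimal sketch. It assumes the picture is available as raw bytes and checks only the well-known leading signatures of the three formats named in the text; the function names `detect_picture_format` and `is_supported_picture` are hypothetical and do not come from the patent.

```python
from typing import Optional

# Leading byte signatures of the three formats named in the text.
_SIGNATURES = {
    "png": (b"\x89PNG\r\n\x1a\n",),
    "jpeg": (b"\xff\xd8\xff",),
    "gif": (b"GIF87a", b"GIF89a"),
}


def detect_picture_format(data: bytes) -> Optional[str]:
    """Return 'png', 'jpeg' or 'gif' if the bytes start with a known signature."""
    for name, prefixes in _SIGNATURES.items():
        if any(data.startswith(prefix) for prefix in prefixes):
            return name
    return None


def is_supported_picture(data: bytes) -> bool:
    """Hypothetical gate: accept only pictures in one of the named formats."""
    return detect_picture_format(data) is not None
```

A real device would probably also validate the rest of the file; the signature check only rejects inputs that are obviously in another format.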

The image recognition and voice translation method and the translation device of the present invention are as follows:

An image recognition and voice translation method comprises the following steps:

(1) an image acquisition module acquires an image of the target picture and passes the image acquisition result to an image recognizer;

(2) using its own lexicon and language library, the image recognizer performs image recognition on the image acquisition result and passes the recognition result to a language analyzer;

(3) using a semantic rule base, the language analyzer performs language analysis on the recognition result and passes the analysis result to a translation processor;

(4) the translation processor translates the analysis result to obtain a translation result; the translation processor checks the morphology, grammar and semantics of the translation result; if the translation result fails the check, the translation processor passes the translation result back to the image recognizer; if the translation result passes the check, the translation result is stored in a storage device and/or uploaded to the Internet; the translation processor passes the translation result to a speech recognition converter;

(5) the speech recognition converter converts the translation result into speech and outputs the speech.
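Steps (1) to (5) can be sketched as a small pipeline. This is an illustrative sketch only, not the patented implementation: each component is a caller-supplied function, the `Pipeline` class and its field names are hypothetical, and the `max_attempts` limit is an assumption, since the text does not say how many times a rejected result may be returned to the image recognizer.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Pipeline:
    # Each stage is injected; the second argument of `recognize` carries the
    # rejected translation result passed back in step (4), or None at first.
    recognize: Callable[[bytes, Optional[str]], str]
    analyze: Callable[[str], str]
    translate: Callable[[str], str]
    check: Callable[[str], bool]      # morphology / grammar / semantics check
    speak: Callable[[str], bytes]     # speech conversion
    store: List[str]                  # stands in for the storage device
    max_attempts: int = 3             # assumption: not specified in the text

    def run(self, image: bytes) -> bytes:
        feedback: Optional[str] = None
        for _ in range(self.max_attempts):
            text = self.recognize(image, feedback)   # step (2)
            analysis = self.analyze(text)            # step (3)
            result = self.translate(analysis)        # step (4)
            if self.check(result):
                self.store.append(result)            # store accepted result
                return self.speak(result)            # step (5)
            feedback = result                        # back to the recognizer
        raise RuntimeError("translation failed the check on every attempt")
```

Passing the rejected translation back as `feedback` mirrors the loop in step (4); how the recognizer would use that feedback is left open here, as it is in the text.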

Preferably, for an image acquisition result that exceeds a threshold, step (1) further comprises: the image acquisition module passes the image acquisition result to an image digitization module for digitization; the image digitization module passes the image digitization result to an image segmenter; the image segmenter splits the image digitization result into sentences and paragraphs, analyzes the segmentation result, and passes the comparative analysis result to the image recognizer. Step (2) further comprises: the image recognizer uses the comparative analysis result to perform image recognition on the image acquisition result.

The threshold must be set in advance according to the processing capacity of the image recognizer, and is related to the size and number of the paragraphs contained in the image acquisition result.
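One possible reading of the threshold rule is sketched below. It assumes that an estimate of each paragraph's size (for example, a character count from a rough layout pass) is available before recognition; the `Threshold` fields, the function names and the route labels are all hypothetical, since the text only states that the threshold depends on paragraph size and number.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Threshold:
    # Assumed to be preset from the processing capacity of the image recognizer.
    max_paragraphs: int
    max_paragraph_size: int


def exceeds_threshold(paragraph_sizes: List[int], threshold: Threshold) -> bool:
    """True if there are too many paragraphs or any single one is too large."""
    too_many = len(paragraph_sizes) > threshold.max_paragraphs
    too_large = any(size > threshold.max_paragraph_size for size in paragraph_sizes)
    return too_many or too_large


def route(paragraph_sizes: List[int], threshold: Threshold) -> str:
    """Choose the path for an image acquisition result."""
    if exceeds_threshold(paragraph_sizes, threshold):
        return "digitize_and_segment"   # digitization module, then segmenter
    return "recognize_directly"         # straight to the image recognizer
```

For example, with `Threshold(max_paragraphs=3, max_paragraph_size=200)`, a page estimated at four paragraphs would be sent through digitization and segmentation first.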

An image recognition and voice translation device comprises: an image acquisition module, an image digitization module, an image segmenter, an image recognizer that includes a lexicon and a language library, a language analyzer, a semantic rule base, a translation processor, a storage device, an Internet interface, and a speech recognition converter;

The image acquisition module passes the image acquisition result to the image recognizer, and passes any image acquisition result that exceeds the threshold to the image digitization module;

The image digitization module digitizes the image acquisition result and passes the image digitization result to the image segmenter;

The image segmenter splits the image acquisition result that exceeds the threshold into sentences and paragraphs, analyzes the segmentation result, and passes the comparative analysis result to the image recognizer;

The image recognizer, which includes the lexicon and the language library, matches the image acquisition result and the comparative analysis result against the lexicon and the language library, and passes the recognition result to the language analyzer;

The language analyzer performs language analysis on the recognition result, queries the semantic rule base, receives the retrieval result returned by the semantic rule base, and passes the analysis result to the translation processor;

The semantic rule base receives retrieval queries from the language analyzer and returns the retrieval result to the language analyzer;

The translation processor translates the analysis result to obtain a translation result, stores the translation result in the storage device and/or uploads it to the Internet through the Internet interface, and passes the translation result to the speech recognition converter;

The speech recognition converter converts the translation result into speech and outputs the speech.

The image acquisition module supports a variety of mobile terminals, such as the iPad and touch, and mobile phones running operating systems such as iOS and Android.

With the above technical solution, the image recognition and voice translation method and the translation device of the present invention have at least the following advantages and beneficial effects:

The present invention is not hampered by the limitations of speech, can translate as soon as a picture is taken, supports shooting with the cameras of a variety of mobile terminals, and lowers the pixel requirements that recognition places on mobile terminal cameras. The invention supports recognition of multiple picture formats, outputs translation results as speech, and achieves high translation accuracy.

The image recognition and voice translation device is easy to operate: any mobile terminal with a camera can be used to photograph and translate anytime and anywhere, meeting the needs of more users.

Description of drawings

Fig. 1 is a flow chart of the image recognition and voice translation method of the present invention.

Embodiment

The specific embodiments of the present invention are described in detail below with reference to Fig. 1.

As shown in Fig. 1, the software image acquisition module imports a document image (newspaper, magazine, book, etc.) photographed by the camera of the mobile terminal and digitizes it, or directly imports an image already stored on the mobile phone. The image acquisition module passes the image acquisition result to the image recognizer.

As shown in Fig. 1, for an acquisition result that exceeds the threshold, step (1) further comprises: the image acquisition module passes the image acquisition result to the image segmenter; the image segmenter splits the image acquisition result into sentences and paragraphs, analyzes the segmentation result, and also passes the comparative analysis result to the image recognizer.

As shown in Fig. 1, the image recognizer uses the lexicon and language library it contains to perform image recognition. Image recognition supports automatic recognition, box-selection recognition and line recognition, so that the recognition result better matches what the user has selected. The image recognizer passes the recognition result to the language analyzer.

As shown in Fig. 1, the language analyzer uses the semantic rule base to perform language analysis on the recognition result and passes the analysis result to the translation processor.

As shown in Fig. 1, the translation processor translates the analysis result to obtain a translation result. In addition to Chinese, multiple languages are supported, such as English, Japanese and Korean. The translation result can be stored in the storage device, or uploaded to the Internet for sharing and interaction. Translation results are saved, which provides a memory function: when the same vocabulary is translated again, analysis time is saved and the translation result is provided directly. The translation processor passes the translation result to the speech recognition converter.
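The memory function described above amounts to caching translation results by source text and target language. The sketch below assumes an in-memory dictionary stands in for the storage device; the `TranslationMemory` class, its `lookup` method and the `misses` counter are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, Tuple


class TranslationMemory:
    """Cache translation results so repeated vocabulary skips re-analysis."""

    def __init__(self, translate: Callable[[str, str], str]) -> None:
        self._translate = translate                    # the real translation step
        self._cache: Dict[Tuple[str, str], str] = {}   # stands in for storage
        self.misses = 0                                # times translation ran

    def lookup(self, text: str, target_language: str) -> str:
        key = (text, target_language)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._translate(text, target_language)
        return self._cache[key]
```

Keying on the target language as well as the text keeps, say, an English and a Japanese translation of the same word from overwriting each other.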

The translation processor receives the analysis result, completes the remaining translation, stores the translation result, shares the translation result for interaction, passes translation results that do not meet the requirements back to the image recognizer, and passes translation results that meet the requirements to the speech recognition converter.

As shown in Fig. 1, the speech recognition converter converts the translation result into speech and outputs the speech.

The preferred embodiments of the present invention are intended only to help explain the invention. The invention is not limited to the above embodiments; any variation, improvement or substitution that may occur to those skilled in the art without departing from the substance of the invention falls within the scope of the invention.

Claims

1. An image recognition and voice translation method, as follows:

2. The image recognition and voice translation method according to claim 1, wherein, for an image acquisition result that exceeds a threshold, step (1) further comprises: the image acquisition module passes the image acquisition result to an image digitization module for digitization; the image digitization module passes the image digitization result to an image segmenter; the image segmenter splits the image digitization result into sentences and paragraphs, analyzes the segmentation result, and passes the comparative analysis result to the image recognizer; and step (2) further comprises: the image recognizer uses the comparative analysis result to perform image recognition on the image acquisition result.

3. An image recognition and voice translation device, comprising: an image acquisition module, an image digitization module, an image segmenter, an image recognizer that includes a lexicon and a language library, a language analyzer, a semantic rule base, a translation processor, a storage device, an Internet interface, and a speech recognition converter;