CN103268316A - Image recognition and voiced translation method and image recognition and voiced translation device - Google Patents

Info

Publication number
CN103268316A
CN103268316A CN2013102054637A CN201310205463A CN103268316A CN 103268316 A CN103268316 A CN 103268316A CN 2013102054637 A CN2013102054637 A CN 2013102054637A CN 201310205463 A CN201310205463 A CN 201310205463A CN 103268316 A CN103268316 A CN 103268316A
Authority
CN
China
Prior art keywords: result, image, translation, recognition, speech
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2013102054637A
Other languages
Chinese (zh)
Inventor
于洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
JIANGSU YUANKUN TECHNOLOGY DEVELOPMENT Co Ltd
Original Assignee
JIANGSU YUANKUN TECHNOLOGY DEVELOPMENT Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.): 2013-05-27
Filing date: 2013-05-27
Publication date: 2013-08-28
Application filed by JIANGSU YUANKUN TECHNOLOGY DEVELOPMENT Co Ltd
Priority to CN2013102054637A
Publication of CN103268316A
Current legal status: Pending

Abstract

The invention relates to an image recognition and voiced translation method and an image recognition and voiced translation device. The method includes the steps of image collection, image recognition, language analysis, translation processing, translation result storage, voice conversion and voice output. The device comprises an image collection module, an image recognizer, a language analyzer, a semantic rule bank, a translation processor, a storage device, an Internet interface and a voice recognition converter, wherein the image recognizer includes a word bank and a language bank. The device is not limited by voice, can translate as soon as a picture is taken, supports shooting with the cameras of various mobile terminals, lowers the resolution required of those cameras, supports recognition of various picture formats, outputs voiced translation results, and achieves high translation accuracy.

Description

An image recognition and voiced translation method and translation device
Technical field
The present invention relates to image recognition and voiced translation technology, and more specifically to an image recognition and voiced translation method and a corresponding translation device, for use in the field of translation.
Background
Automatic speech translation passes speech in one language through speech recognition and then translates it into speech in another language. The most common form translates standard foreign-language speech into standard Chinese speech.
Another direction of development is translating dialect speech into standard speech. Work on dialect translation engines that automatically translate Chinese dialect speech into Mandarin is gradually getting under way. The difficulty of dialect recognition, differences in speaking rate, and how standard the speaker's Mandarin is all directly affect the accuracy of the translation result. As regards speaking rate, current speech recognition systems use a single sensor, namely a sound sensor, for recognition. They cannot segment the captured speech and can only track it at a constant rate, matching the speech against templates in the system library. If the rate of the captured speech is close to that of the stored templates, the system works normally; otherwise misrecognition occurs. In real life, however, people do not speak at a constant rate; the rate changes continually. This inevitably increases the error of the speech recognition system and reduces its practicality.
Automatic speech translation handles words and phrases well, but it cannot accurately recognize or translate longer sentences.
Automatic speech translation also cannot translate words or sentences that the user is unable to read aloud.
Image translation passes a picture through image recognition and extracts the information in it; the result is mostly text. Existing image translation translates the picture captured by a camera and displays the translation on the mobile phone. Image translation places high demands on picture resolution. The cameras of ordinary users' phones often cannot meet that requirement, so the photos they take cannot be recognized and translated, and image translation cannot be widely adopted. At present, image translation is best suited to pictures with short text, such as labels and posters. For other text in pictures, image translation cannot translate directly and accurately. The usual workaround is to scan and recognize the text first and then translate it with a translation tool or by hand, which is not truly image translation.
Summary of the invention
The present invention overcomes the above shortcomings of automatic speech translation and, in combination with image translation, proposes an image recognition and voiced translation method and a corresponding translation device.
An image recognition and voiced translation system has no speaking-rate problem and no dialect problem, and is not limited by speech: given a picture in a supported format (such as gif, jpeg or png), it translates the text in the picture into the target language. The main technical problem solved by the invention is recognizing the language in a picture and outputting speech; the invention also aims to raise the image recognition rate and improve translation accuracy. The system places lower demands on picture resolution, which avoids recognition failures caused by low resolution and greatly improves translation efficiency and accuracy.
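The format requirement can be illustrated with a small check. The following is only a sketch, not the patent's method: it assumes the picture arrives as raw bytes and identifies gif, jpeg and png by their well-known leading signature bytes. The names `detect_format` and `is_supported` are hypothetical.

```python
from typing import Optional

# Leading signature bytes of the three formats named in the description.
_SIGNATURES = {
    "png": (b"\x89PNG\r\n\x1a\n",),
    "jpeg": (b"\xff\xd8\xff",),
    "gif": (b"GIF87a", b"GIF89a"),
}


def detect_format(data: bytes) -> Optional[str]:
    """Return 'png', 'jpeg' or 'gif' if the header matches, else None."""
    for name, prefixes in _SIGNATURES.items():
        # bytes.startswith accepts a tuple of alternative prefixes.
        if data.startswith(prefixes):
            return name
    return None


def is_supported(data: bytes) -> bool:
    """True when the picture is in one of the supported formats."""
    return detect_format(data) is not None
```

Checking the header rather than the file extension means a misnamed file is still classified by its actual content.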
The image recognition and voiced translation method and the translation device of the present invention are as follows.
An image recognition and voiced translation method comprises the following steps:
(1) an image collection module acquires an image of the target picture and passes the image acquisition result to an image recognizer;
(2) using its own word bank and language bank, the image recognizer performs image recognition on the image acquisition result and passes the recognition result to a language analyzer;
(3) using a semantic rule bank, the language analyzer performs language analysis on the recognition result and passes the analysis result to a translation processor;
(4) the translation processor translates the analysis result to obtain a translation result and performs lexical, grammatical and semantic checks on the translation result; if the translation result fails the checks, the translation processor passes it back to the image recognizer; if the translation result passes the checks, it is stored in a storage device and/or sent to the Internet, and the translation processor passes it to a voice recognition converter;
(5) the voice recognition converter converts the translation result into speech and outputs the speech.
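Steps (1) to (5) can be sketched as a chain of pluggable stages. This is a minimal illustration under stated assumptions, not the patent's implementation: every stage is a caller-supplied function, the class name `Pipeline` and its fields are hypothetical, and the bounded retry (`max_attempts`) is an assumption, since the text only says that a failed translation result is passed back to the image recognizer.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Pipeline:
    recognize: Callable[[bytes], str]      # step (2): image -> recognized text
    analyze: Callable[[str], List[str]]    # step (3): text -> analysed units
    translate: Callable[[List[str]], str]  # step (4): units -> translation
    check: Callable[[str], bool]           # step (4): lexical/grammatical/semantic check
    speak: Callable[[str], bytes]          # step (5): translation -> audio
    store: List[str] = field(default_factory=list)  # stands in for the storage device
    max_attempts: int = 2                  # assumed bound; the source gives none

    def run(self, image: bytes) -> bytes:
        """Run the acquired image through steps (2) to (5)."""
        for _ in range(self.max_attempts):
            text = self.recognize(image)
            translation = self.translate(self.analyze(text))
            if self.check(translation):
                self.store.append(translation)
                return self.speak(translation)
            # Check failed: go back to recognition and try again.
        raise ValueError("translation failed the check")
```

Keeping each stage behind a plain callable makes it possible to swap in a real recognizer, translator or speech synthesizer without changing the control flow.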
Preferably, for an image acquisition result that exceeds a threshold, step (1) further comprises: the image collection module passes the image acquisition result to an image digitization module for digitization; the image digitization module passes the digitization result to an image segmenter; the image segmenter splits the digitization result into sentences and paragraphs, analyzes the segmentation result, and passes the comparative analysis result to the image recognizer. Step (2) then further comprises: the image recognizer uses the comparative analysis result to perform image recognition on the image acquisition result.
The threshold must be set in advance according to the processing capability of the image recognizer, and depends on the size and number of the paragraphs contained in the image acquisition result.
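The threshold test and the splitting into paragraphs and sentences might look like the sketch below. It makes simplifying assumptions that are not in the source: the digitization result is represented as plain text with blank lines between paragraphs, and the limits `MAX_PARAGRAPHS` and `MAX_PARAGRAPH_CHARS` are invented placeholder values. All function names are hypothetical.

```python
import re
from typing import List

MAX_PARAGRAPHS = 3         # assumed limit on the number of paragraphs
MAX_PARAGRAPH_CHARS = 200  # assumed limit on the size of one paragraph


def split_paragraphs(text: str) -> List[str]:
    """Split on blank lines, dropping empty pieces."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def exceeds_threshold(text: str) -> bool:
    """True when there are too many paragraphs or one is too long."""
    paragraphs = split_paragraphs(text)
    return (len(paragraphs) > MAX_PARAGRAPHS
            or any(len(p) > MAX_PARAGRAPH_CHARS for p in paragraphs))


def split_sentences(paragraph: str) -> List[str]:
    """Split after Western or CJK sentence-ending punctuation."""
    parts = re.split(r"(?<=[.!?。！？])\s*", paragraph)
    return [s for s in parts if s]


def segment(text: str) -> List[List[str]]:
    """Return one list of sentences per paragraph."""
    return [split_sentences(p) for p in split_paragraphs(text)]
```

Only input for which `exceeds_threshold` is true would take the segmentation path; smaller input goes straight to recognition, as in step (1).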
An image recognition and voiced translation device comprises: an image collection module, an image digitization module, an image segmenter, an image recognizer containing a word bank and a language bank, a language analyzer, a semantic rule bank, a translation processor, a storage device, an Internet interface and a voice recognition converter;
The image collection module passes the image acquisition result to the image recognizer, and passes any image acquisition result that exceeds the threshold to the image digitization module;
The image digitization module digitizes the image acquisition result and passes the digitization result to the image segmenter;
The image segmenter splits an image acquisition result that exceeds the threshold into sentences and paragraphs, analyzes the segmentation result, and passes the comparative analysis result to the image recognizer;
The image recognizer matches the image acquisition result and the comparative analysis result against its word bank and language bank, and passes the recognition result to the language analyzer;
The language analyzer performs language analysis on the recognition result, queries the semantic rule bank, receives the retrieval result that the semantic rule bank returns, and passes the analysis result to the translation processor;
The semantic rule bank accepts queries from the language analyzer and returns the retrieval result to it;
The translation processor translates the analysis result to obtain a translation result, stores the translation result in the storage device and/or sends it to the Internet through the Internet interface, and passes the translation result to the voice recognition converter;
The voice recognition converter converts the translation result into speech and outputs the speech.
The image collection module supports a variety of mobile terminals, such as the iPad, the touch, and mobile phones running operating systems such as iOS and Android.
With the above technical solution, the image recognition and voiced translation method and translation device of the present invention have at least the following advantages and beneficial effects:
The present invention is not constrained by speech, can translate as soon as a picture is taken, supports shooting with the cameras of various mobile terminals, and lowers the camera resolution that recognition requires of the mobile terminal. The present invention supports recognition of multiple picture formats, outputs voiced translation results, and achieves high translation accuracy.
The image recognition and voiced translation device is easy to operate: any mobile terminal with a camera can be used to take a picture and translate it anytime and anywhere, which meets the needs of more users.
Description of drawings
Fig. 1 is a flow chart of the image recognition and voiced translation method of the present invention.
Embodiment
A specific embodiment of the present invention is described in detail below with reference to Fig. 1.
As shown in Fig. 1, the software image collection module imports a document image (newspaper, magazine, book, etc.) taken with the mobile terminal's camera and digitizes it, or directly imports an image already stored on the phone. The image collection module passes the image acquisition result to the image recognizer.
As shown in Fig. 1, for an acquisition result that exceeds the threshold, step (1) further comprises: the image collection module passes the image acquisition result to the image segmenter, which splits it into sentences and paragraphs, analyzes the segmentation result, and also passes the comparative analysis result to the image recognizer.
As shown in Fig. 1, the image recognizer performs image recognition using the word bank and language bank it contains. Image recognition supports automatic recognition, box-selection recognition and line recognition, so that the recognition result better matches what the user selected. The image recognizer passes the recognition result to the language analyzer.
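One way to picture the three recognition modes is as different ways of choosing the image region handed to the recognizer. The sketch below is an interpretation rather than anything the source specifies: it assumes automatic mode uses the whole image, box mode uses the user's rectangle, and line mode widens a user-drawn stroke into a horizontal band of assumed half-height `band`. The names `Mode`, `Box` and `region_for` are hypothetical.

```python
from enum import Enum
from typing import Optional, Tuple

Box = Tuple[int, int, int, int]  # left, top, right, bottom in pixels


class Mode(Enum):
    AUTO = "auto"  # recognize the whole image
    BOX = "box"    # recognize inside a user-drawn rectangle
    LINE = "line"  # recognize around a user-drawn stroke (assumed meaning)


def region_for(mode: Mode, width: int, height: int,
               selection: Optional[Box] = None, band: int = 20) -> Box:
    """Return the pixel region to pass to the recognizer."""
    if mode is Mode.AUTO:
        return (0, 0, width, height)
    if selection is None:
        raise ValueError("box and line modes need a user selection")
    left, top, right, bottom = selection
    if mode is Mode.LINE:
        # Widen the stroke into a band centred on its vertical midpoint.
        mid = (top + bottom) // 2
        top, bottom = mid - band, mid + band
    # Clamp the region to the image bounds.
    return (max(0, left), max(0, top), min(width, right), min(height, bottom))
```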
As shown in Fig. 1, the language analyzer uses the semantic rule bank to perform language analysis on the recognition result and passes the analysis result to the translation processor.
As shown in Fig. 1, the translation processor translates the analysis result to obtain a translation result. Besides Chinese, multiple languages are supported, such as English, Japanese and Korean. The translation result can be stored in the storage device or shared interactively over the Internet. Translation results are saved and act as a memory: when the same vocabulary is translated again, analysis time is saved and the stored translation is provided directly. The translation processor passes the translation result to the voice recognition converter.
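The memory function described above behaves like a lookup table placed in front of the translator. The following is a rough sketch under assumptions: the class name `TranslationMemory`, the `(text, target language)` key and the whitespace normalization are all hypothetical, and the real translator is represented by a caller-supplied function.

```python
from typing import Callable, Dict, Tuple


class TranslationMemory:
    """Cache translations so repeated vocabulary skips re-analysis."""

    def __init__(self, translate: Callable[[str, str], str]) -> None:
        self._translate = translate  # underlying translator: (text, target) -> result
        self._store: Dict[Tuple[str, str], str] = {}
        self.hits = 0  # number of lookups answered from memory

    def lookup(self, text: str, target: str) -> str:
        # Assumed normalization: ignore surrounding whitespace only.
        key = (text.strip(), target)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        result = self._translate(text, target)
        self._store[key] = result
        return result
```

Including the target language in the key keeps, for example, an English and a Japanese translation of the same source text from overwriting each other.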
The translation processor receives the analysis result, completes the remaining translation, stores the translation result, shares it for interaction, passes translation results that do not meet the requirements to the image recognizer, and passes those that do to the voice recognition converter.
As shown in Fig. 1, the voice recognition converter converts the translation result into speech and outputs the speech.
The preferred embodiment of the present invention is provided only to help explain the invention. The invention is not limited to the embodiment described above; any variation, improvement or substitution that a person skilled in the art could conceive without departing from the substance of the invention falls within its scope.

Claims (3)

1. An image recognition and voiced translation method, comprising the following steps:
(1) an image collection module acquires an image of the target picture and passes the image acquisition result to an image recognizer;
(2) using its own word bank and language bank, the image recognizer performs image recognition on the image acquisition result and passes the recognition result to a language analyzer;
(3) using a semantic rule bank, the language analyzer performs language analysis on the recognition result and passes the analysis result to a translation processor;
(4) the translation processor translates the analysis result to obtain a translation result and performs lexical, grammatical and semantic checks on the translation result; if the translation result fails the checks, the translation processor passes it back to the image recognizer; if the translation result passes the checks, it is stored in a storage device and/or sent to the Internet, and the translation processor passes it to a voice recognition converter;
(5) the voice recognition converter converts the translation result into speech and outputs the speech.
2. The image recognition and voiced translation method according to claim 1, wherein, for an image acquisition result that exceeds a threshold, step (1) further comprises: the image collection module passes the image acquisition result to an image digitization module for digitization; the image digitization module passes the digitization result to an image segmenter; the image segmenter splits the digitization result into sentences and paragraphs, analyzes the segmentation result, and passes the comparative analysis result to the image recognizer; and step (2) further comprises: the image recognizer uses the comparative analysis result to perform image recognition on the image acquisition result.
3. An image recognition and voiced translation device, comprising: an image collection module, an image digitization module, an image segmenter, an image recognizer containing a word bank and a language bank, a language analyzer, a semantic rule bank, a translation processor, a storage device, an Internet interface and a voice recognition converter;
The image collection module passes the image acquisition result to the image recognizer, and passes any image acquisition result that exceeds the threshold to the image digitization module;
The image digitization module digitizes the image acquisition result and passes the digitization result to the image segmenter;
The image segmenter splits an image acquisition result that exceeds the threshold into sentences and paragraphs, analyzes the segmentation result, and passes the comparative analysis result to the image recognizer;
The image recognizer matches the image acquisition result and the comparative analysis result against its word bank and language bank, and passes the recognition result to the language analyzer;
The language analyzer performs language analysis on the recognition result, queries the semantic rule bank, receives the retrieval result that the semantic rule bank returns, and passes the analysis result to the translation processor;
The semantic rule bank accepts queries from the language analyzer and returns the retrieval result to it;
The translation processor translates the analysis result to obtain a translation result, stores the translation result in the storage device and/or sends it to the Internet through the Internet interface, and passes the translation result to the voice recognition converter;
The voice recognition converter converts the translation result into speech and outputs the speech.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2013102054637A CN103268316A (en) 2013-05-27 2013-05-27 Image recognition and voiced translation method and image recognition and voiced translation device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2013102054637A CN103268316A (en) 2013-05-27 2013-05-27 Image recognition and voiced translation method and image recognition and voiced translation device

Publications (1)

Publication Number Publication Date
CN103268316A true CN103268316A (en) 2013-08-28

Family

ID=49011946

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2013102054637A Pending CN103268316A (en) 2013-05-27 2013-05-27 Image recognition and voiced translation method and image recognition and voiced translation device

Country Status (1)

Country Link
CN (1) CN103268316A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5546538A (en) * 1993-12-14 1996-08-13 Intel Corporation System for processing handwriting written by user of portable computer by server or processing by the computer when the computer no longer communicate with server
JP2003178260A (en) * 2001-12-10 2003-06-27 Canon Inc Data processing method
CN1584874A (en) * 2004-06-15 2005-02-23 汪兰珍 Intelligent collecting, linguistic intertranslation, speech synthetic method and apparatus
CN1758671A (en) * 2004-10-09 2006-04-12 乐金电子(中国)研究开发中心有限公司 Mobile communication terminal with function for converting shooted letters into voice and method thereof
US20060285748A1 (en) * 2005-06-15 2006-12-21 Fuji Xerox Co., Ltd. Document processing device
CN101211335A (en) * 2006-12-27 2008-07-02 乐金电子(中国)研究开发中心有限公司 Mobile communication terminal with translation function, translation system and translation method
CN101354748A (en) * 2007-07-23 2009-01-28 英华达(上海)电子有限公司 Device, method and mobile terminal for recognizing character

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
OVER_88: "OCR", 《HTTP://WWW.BAIKE.COM/WIKDOC/SP/QR/HISTORY/VERSION.DO?VER=13&HISIDEN=KULEDX0VE,BFDRHGZTUG,FZQW》 *
ニ仴菏: "OCR(光学字符识别)", 《HTTP://BAIKE.SOGOU.COM/H609092.HTM?SP=L36854880》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106855854A (en) * 2016-12-29 2017-06-16 北京奇虎科技有限公司 A kind of recognition methods of english information and device
CN106874264A (en) * 2017-02-17 2017-06-20 郑州云海信息技术有限公司 A kind of intelligent real-time translation system based on cloud computing
CN106980482A (en) * 2017-03-31 2017-07-25 联想(北京)有限公司 A kind of information displaying method and the first electronic equipment
CN106980482B (en) * 2017-03-31 2020-03-24 联想(北京)有限公司 Information display method and first electronic equipment
US10685642B2 (en) 2017-03-31 2020-06-16 Lenovo (Beijing) Co., Ltd. Information processing method
CN107957994A (en) * 2017-10-30 2018-04-24 努比亚技术有限公司 A kind of interpretation method, terminal and computer-readable recording medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20130828