EP0138079A2 - Character recognition apparatus and method for recognising characters associated with diacritical marks - Google Patents
Character recognition apparatus and method for recognising characters associated with diacritical marks
- Publication number
- EP0138079A2, EP84111043A
- Authority
- EP
- European Patent Office
- Prior art keywords
- character
- characters
- unknown
- diacritical
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/10—Image acquisition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/15—Cutting or merging image elements, e.g. region growing, watershed or clustering-based techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Definitions
- the present invention relates to the field of character recognition and more particularly to the recognition of characters or symbols which may have associated therewith diacritical marks, using optical character recognition apparatus.
- the Japanese phonetic alphabet Katakana is of this type, and this alphabet will be used herein to describe the present invention, by way of example and not by way of limitation.
- Prior art character recognition apparatus has taken into account the need to provide special means to accommodate characters which may have an associated diacritical mark and to recognise the difference between a character with a diacritical mark and one without.
- United States Patent A-3,710,321 provides such an arrangement.
- a central, horizontal row area contains the major characters or base symbols of the alphabet. Certain of these major characters can have diacritical marks associated therewith, in areas above or below the major character.
- Each Katakana sound is a syllable formed by selectively adding each of the vowels A, I, U, E and O to each of the consonants K, S, T, N, H, M, Y, R and W.
- the combinations YI, YE, WI, WU and WE are excluded.
- the A, I, U, E, O and N sounds by themselves are also included.
- the pronunciation for certain of these basic symbols can be modified by adding diacritical marks, either two small lines collectively called a nigori or a small circle called a maru, immediately adjacent the upper right of the basic symbol.
- a long vowel symbol written as a dash follows certain basic symbols to alter their pronunciation.
- the two diacritical marks plus the long vowel symbol are combined with certain of the basic letters to expand the overall Katakana alphabet to include 72 characters. Those Katakana characters with a diacritical mark are often called sonants, and those without a diacritical mark are called non-sonants.
- the object of the present invention is to provide improved character recognition apparatus and a method of operating such apparatus for the identification of unknown characters and any diacritical marks associated therewith.
- a method of recognising unknown characters of a known character set, some of the characters having diacritical marks associated therewith, including the steps of
- Apparatus in accordance with the present invention can be used with Katakana characters when the diacritical marks are either written in their natural adjacent form, or separately written under the constraints required by the above-mentioned OCR machines. All seventy-two Katakana alphabet shapes plus the two diacritical marks as separate stand-alone symbols can be recognised utilising the present invention, without special rules, thereby allowing optical character recognition to be extended to include general public handwriting and writing done according to the rules of the Katakana language.
- Apparatus in accordance with the present invention for recognising unknown characters, where some of the characters may be associated with diacritical marks operates as follows.
- the character data of an unknown character which may be associated with a diacritical mark is stored.
- From the stored character data is extracted a portion of the data that corresponds to the predetermined area of the unknown character space which is the expected location of a diacritical mark.
- the extracted diacritical mark character data and the rest of the stored character data of the unknown character are examined to recognise the character.
- the diacritical marks are located to the upper right portion of the character.
- the portion of the character data of the unknown character that corresponds to the upper right portion of the character space is extracted and examined for recognition.
- This extracted diacritical mark character data is upper-right justified and remains un-normalised during the examination and recognition process.
- the remaining character data of the unknown character is normalised and also examined for recognition of the Katakana character.
- Characters modified by diacritical marks are normally wider than the same characters without diacritical marks and in implementing the present invention these differences are used to initially separate sonant characters from non-sonant characters.
- a row of unknown characters is scanned in their entirety parallel to the direction in which the lines are read, (i.e. horizontally read lines are horizontally scanned) and a horizontal profile is generated for the row of unknown characters.
- Logical tests are performed on the resulting profile data to separate the profile into individual segments representing an unknown character including any diacritical mark.
- a proposed segmentation point for the horizontal profile coincides with a gap in the profile, the segmentation point is established.
- the horizontal profile is continuous (i.e.
- the location of the proposed segmentation point is adjusted as follows. If the continuity of the profile extends less than a predetermined distance into the next-right-character position, the proposed segmentation point for the current character position is adjusted to the right to include the extended portion, which may be a diacritical mark. If the extension exceeds a predetermined distance, the position of the segmentation point can be adjusted to the left to account for a possible stroke extension of the next character (i.e. such as a diacritical mark or an adjacent character) written immediately to the right of the current character.
- the present invention permits recognition of characters that have been written under the usual handwritten rules or under the constraints of prior art OCR machines.
- Height and width parameters are used to initially separate a modified character from an unmodified character by analysing their relative size from measurements of their horizontal and/or vertical profiles.
- sonant characters are initially separated from non-sonant characters based upon differences in the horizontal width of the entire character, including any diacritical mark.
- the coding of the height and width of a character has been through exclusive-bit-coding, which is inferior to the inclusive-bit-coding used in one embodiment of the present invention.
- Exclusive-bit-coding sets a single unique bit for each specific value or range of height and width. With a single bit test, all characters with a specific height or width can be separated from all other characters which are either greater than or less than the specified value.
- the Japanese Industries Standard for assigning recognition results provides a unique one byte code point for each of the forty-six basic Katakana symbols, and for the two diacritical marks nigori and maru. These diacritical code points indicate that the sonant is present as a separate symbol in its own character space, but they do not distinguish between the diacritical mark written as a separate symbol from the more natural form where the diacritical mark is combined with the basic character. By assigning two additional code points one can indicate that the diacritical mark is included with the basic character.
- a particular procedure is used when a diacritical mark is recognised.
- the recognition result of the preceding character position is checked to verify that it is one of the twenty possible characters which can be modified by a nigori or that it is one of the five possible characters which can be modified by a maru. If the verification fails, the sonant and/or previous character recognitions can be rejected as invalid characters.
- FIG. 1 illustrates the use of an optical scanner 5 to generate data representing unknown characters scanned by the scanner.
- the scanner 5 is of a known construction and operation. It typically scans a document 6 on which are printed unknown characters, including any diacritical marks, in a direction parallel to the direction of reading of the characters.
- Figure 1 illustrates a horizontal scan of a line of characters 7. The scanner scans over the entire length of the document by moving either the document or scanner mechanism, and the entire width of the document by selecting a field of view of appropriate width and generates data representing the line of scanned characters.
- the generated character line data is preprocessed by generating a profile of the line of characters and dividing the profile into segments of individual characters, to thereby provide a block of character data for each individual character in the line.
- each block of character data has extracted therefrom the data representative of a predetermined localised area that corresponds to the expected location of a diacritical mark.
- the extracted character data and the rest of the character data are examined to recognise the respective diacritical mark and character.
- the preliminarily recognised character may then be subjected to a post processing verification procedure to ensure that any diacritical mark recognised is associated with a character that may properly be associated with a diacritical mark under the applicable language rules.
- the illustrated characters HE, I, WA, N, KU, and HA do not include diacritical marks and are referred to as non-sonants.
- the character BU includes the diacritical mark nigori (double line) in the upper right corner, and the character PE includes the diacritical mark maru (circle), also in the upper right corner. Both of these characters are referred to as sonants.
- the width of each of the characters in the line and the spacing between the characters is determined by generating a horizontal profile of the line of characters.
- the optical scanner illustrated in Figure 1 scans over the line of characters in a series of scan lines extending parallel to the line of characters and spaced apart vertically and generates information representative of the characters in the line.
- This information consists of a set of coded signals for each of the scan lines representing the characters on the document.
- For each scan line the scanner generates logical zeros (0's) to represent a blank or background space, and logical ones (1's) to represent the presence of a portion of a character on the scan line.
- the scanned characters are thereby divided into picture elements (PELs).
- One method of generating the horizontal profile is to sequentially provide the sets of binary data signals generated for each of the scan lines for a line of characters to a register having storage positions that correspond to the picture element (PEL) locations of the scanned characters.
- PEL picture element
- the binary data bits representing corresponding PELs in the sets of binary data are effectively logically combined using an OR function by successively providing them to the register.
- the corresponding register bit is set to a logical 1, and remains a logical 1 until the register is cleared after the entire line of characters has been scanned. For those PEL locations where there is only background space, the register bit will remain a logical zero.
- the horizontal character line profile such as the one illustrated at reference numeral 10 in Figure 2, appears as a series of black segments (logical 1's) separated by gaps of white (logical 0's).
- the black segments 11 correspond to the widths of the characters and any associated diacritical marks.
- the white gaps 12 correspond to the separation between adjacent characters.
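- The profile generation described above can be sketched in a few lines of Python. This is a minimal illustration only, assuming each scan line is supplied as a list of 0/1 PEL values of equal length; the function names `horizontal_profile` and `black_segments` are hypothetical and do not appear in the patent.

```python
from typing import List, Tuple


def horizontal_profile(scan_lines: List[List[int]]) -> List[int]:
    """OR together the corresponding PELs of every scan line of a character line."""
    if not scan_lines:
        return []
    # The "register": one storage position per PEL column, initially cleared.
    register = [0] * len(scan_lines[0])
    for line in scan_lines:
        for col, pel in enumerate(line):
            # Once a position has been set to 1 it stays 1 until the register is cleared.
            register[col] |= pel
    return register


def black_segments(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) column pairs, end exclusive, for each run of 1s."""
    segments = []
    start = None
    for col, bit in enumerate(profile):
        if bit and start is None:
            start = col
        elif not bit and start is not None:
            segments.append((start, col))
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments


scan = [
    [0, 1, 1, 0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
profile = horizontal_profile(scan)
segments = black_segments(profile)
```

Here the two runs of 1s in `profile` play the role of the black segments 11, and the zeros between them the white gaps 12.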
- the horizontal profile is next separated into segments that represent individual unknown characters, including any diacritical marks.
- an initial segmentation is made between characters based upon a given pitch (spacing between centres of adjacent characters), or the calculation of the pitch, which may be determined using known techniques.
- Logical tests are made to determine if an initial proposed segmentation point coincides with a natural segmentation point or gap between adjacent segments of the horizontal profile corresponding to gaps between adjacent characters.
- the proposed segmentation point 20 coincides with a gap 21 between adjacent horizontal profile segments 22, 23, the proposed segmentation point is established as a separation point between characters.
- the proposed segmentation point 20 does not coincide with a natural gap between adjacent horizontal profile segments, for example where the characters overlap or where a diacritical mark extends into the space occupied by the next-right-character, the horizontal profile is progressively tested by moving the proposed segmentation point up to a predetermined distance (for example 1.5mm) into the next character space to the right, as shown in Figure 4B. If the proposed segmentation point is moved by this testing operation into a position in which it coincides with a profile segment gap, the proposed segmentation point is established as a separation point between characters in its modified position, as illustrated by the dashed line in Figure 4B.
- the sonant characters BU and PE are examples of characters that may have wider than expected horizontal profile segments, and it would be necessary to employ the above proposed segmentation point position modification technique in order to establish the gap between a wider character of this type and an adjacent character.
- the proposed segmentation point cannot be made to coincide with a profile segment gap by moving the proposed segmentation point up to a predetermined distance to the right, as illustrated by the arrow labelled 1, the proposed segmentation point is progressively moved up to a predetermined distance (for example 1.5mm) to the left, as illustrated by the arrow labelled 2. If a proposed segmentation point moved to the left coincides with a profile segment gap, the proposed segmentation point in its modified position is established as a separation point between characters, as illustrated by the dashed line in Figure 4C.
- the initial position of the proposed segmentation point is established as a separation point between characters, as illustrated by the arrow labelled 3. This accommodates the instance of adjacent characters overlapping, as shown in Figure 2 by the letters N and KU.
- This segmentation technique enables each diacritical mark to be included with its proper base character, according to the usual language rules for writing Katakana sonants.
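- The three-stage adjustment of a proposed segmentation point (test in place, then to the right, then to the left, otherwise keep the initial position) might be sketched as follows. This is an illustrative reading of the text, not the patented logic itself: the function name `adjust_segmentation_point` is hypothetical, the profile is assumed to be a list of 0/1 values, and `max_shift` is the predetermined distance expressed in PEL columns (the conversion from 1.5mm is left to the caller).

```python
from typing import List


def adjust_segmentation_point(profile: List[int], proposed: int, max_shift: int) -> int:
    """Return the column at which to separate two characters."""

    def is_gap(col: int) -> bool:
        # A gap is a white (0) column that lies inside the profile.
        return 0 <= col < len(profile) and profile[col] == 0

    # The proposed point already coincides with a natural gap.
    if is_gap(proposed):
        return proposed
    # Arrow 1: move progressively to the right, into the next character space.
    for shift in range(1, max_shift + 1):
        if is_gap(proposed + shift):
            return proposed + shift
    # Arrow 2: no gap found to the right, so move progressively to the left.
    for shift in range(1, max_shift + 1):
        if is_gap(proposed - shift):
            return proposed - shift
    # Arrow 3: no gap within reach (overlapping characters); keep the initial point.
    return proposed


right_case = adjust_segmentation_point([1, 1, 1, 1, 1, 1, 0, 1, 1, 1], 4, 3)
left_case = adjust_segmentation_point([1, 1, 0, 1, 1, 1, 1, 1, 1, 1], 4, 3)
overlap_case = adjust_segmentation_point([1] * 10, 4, 3)
```

Testing to the right first favours keeping a diacritical mark, which extends to the upper right, with the base character it modifies.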
- Katakana characters with diacritical marks are generally taller and wider than those without.
- a plot of the width of the horizontal character line profile, which is directly related to actual character width, versus the frequency of character occurrence reflects that sonants are generally wider than non-sonants.
- width parameters can be used to reliably separate modified characters from unmodified characters.
- sonant identification begins, as illustrated in Figure 6, by encoding the character size and separating sonant characters from non-sonant characters based upon differences in the width of the horizontal character profiles, as illustrated in Figure 5.
- the character data corresponding to the localised area expected to contain the diacritical mark is extracted from each block of individual character data and justified. The extracted character data and the rest of the stored character data are then passed to the character recognition logic for identification of the unknown character.
- apparatus uses inclusive-bit-coding to logically store the height and width of the characters, including any diacritical marks.
- a 16-bit word is used to represent a character space that is 64 PELS wide, each bit corresponding to a width range of 4 PELS.
- Prior art techniques use exclusive-bit-encoding to set a single digit indicative of the width of each character. With this type of coding using a single bit test, all characters having a specific width can be separated from all others which are either greater or less than a specific value. However, to identify all characters greater than a specific value, a number of tests are required.
- inclusive-bit-encoding all bits of lower order than the bit set to represent a specific width are set to the same logic state (e.g. a logical 1).
- the width of a character is 38 PELS
- the inclusive-bit-encoded word associated with the character is "1111 1111 1100 0000".
- the exclusive-bit-encoded word is "0000 0000 0100 0000".
- An additional advantage of the inclusive-bit-coding is for automatic design of recognition logics.
- Automatic design programs usually use a statistical decision algorithm which selects one bit to separate two distinct classes of characters based on some minimum error criteria. This coding scheme provides optimum information for such an algorithm.
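- The difference between the two codings can be made concrete with a short sketch. It assumes the 16-bit word and 4-PEL ranges given above, with the leftmost bit covering widths 1 to 4, the next 5 to 8, and so on; that exact range boundary is an assumption, as are the hypothetical function names. The 38-PEL example reproduces the two words quoted in the text.

```python
WORD_BITS = 16      # one word per character, as in the text
PELS_PER_BIT = 4    # each bit covers a width range of 4 PELs


def bit_index(width_pels: int) -> int:
    """0-based position (from the left) of the bit whose range contains the width."""
    if not 1 <= width_pels <= WORD_BITS * PELS_PER_BIT:
        raise ValueError("width outside the 64-PEL character space")
    return (width_pels - 1) // PELS_PER_BIT


def _format(bits: list) -> str:
    """Group the 16 bits in fours, as the words are printed in the text."""
    return " ".join("".join(bits[i:i + 4]) for i in range(0, WORD_BITS, 4))


def exclusive_code(width_pels: int) -> str:
    """Set only the single bit that corresponds to the width range."""
    bits = ["0"] * WORD_BITS
    bits[bit_index(width_pels)] = "1"
    return _format(bits)


def inclusive_code(width_pels: int) -> str:
    """Set the bit for the width range and every bit before it."""
    idx = bit_index(width_pels)
    return _format(["1"] * (idx + 1) + ["0"] * (WORD_BITS - idx - 1))


def bit_is_set(code: str, index: int) -> bool:
    """Single-bit test on a formatted code word."""
    return code.replace(" ", "")[index] == "1"


inclusive_38 = inclusive_code(38)
exclusive_38 = exclusive_code(38)
# One test on bit 9 answers "is the character wider than 36 PELs?" only for inclusive coding.
wide_inclusive = bit_is_set(inclusive_code(50), 9)
wide_exclusive = bit_is_set(exclusive_code(50), 9)
```

With inclusive coding the single test on bit 9 is true for a 50-PEL character, whereas with exclusive coding the same test is false and bits 9 to 15 would all have to be examined.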
- the character data corresponding to a predetermined localised area of a character space which is the expected location of a diacritical mark is extracted and examined to recognise any diacritical mark present.
- the diacritical mark is expected to be located in the upper right portion of the character space, and a sonant cut-out or window 30, identified by the dashed lines in Figure 8, is defined and the character data corresponding to this localised area is placed into a sonant cut-out buffer.
- this cut-out is defined by the rightmost 13 PELS of the top 16 scan lines of the character image data (i.e. 1.6mm by 2.1mm). This size is believed to be optimum because it contains sufficient information for diacritical mark identification, it allows efficient data handling for microprocessor implementation, and is compatible with the recognition procedures for the base character.
- This location for the window was chosen because it provides the least interference with stroke segments of non-sonant characters or the base character of a sonant character.
- a fixed window location further provides a simple, efficient and reliable cutting process.
- the window size or location may be changed as necessary or desirable for other alphabets to accommodate different diacritical marks, but, for the Katakana alphabet, it is preferred to have the upper boundary level with the top of the base character and the right boundary level with the right-most portion of the base character.
- the character data within the window is registered to the upper right when it is placed into the sonant cut-out buffer, and it is stored in its un-normalised form. This locates the diacritical mark in the upper right corner of the window, which facilitates recognition of the diacritical mark, increases the reliability of the recognition, and reduces the storage required for the recognition logic. Character data within the window that corresponds to the base character is superfluous.
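- A minimal sketch of the cut-out and upper-right registration follows. It assumes the character block is a rectangular list of rows of 0/1 PELs, top row first, already trimmed so that its top and right edges coincide with the character; the names `cut_out` and `justify_upper_right` are hypothetical, and the justification shown (shift all black PELs until the topmost touches the top row and the rightmost touches the right column) is one plausible reading of "registered to the upper right".

```python
from typing import List

WINDOW_COLS = 13  # rightmost 13 PELs
WINDOW_ROWS = 16  # top 16 scan lines


def cut_out(image: List[List[int]]) -> List[List[int]]:
    """Copy the upper-right window of the un-normalised character block."""
    return [row[-WINDOW_COLS:] for row in image[:WINDOW_ROWS]]


def justify_upper_right(window: List[List[int]]) -> List[List[int]]:
    """Shift the black PELs so they touch the top row and the right column."""
    rows = len(window)
    cols = len(window[0]) if window else 0
    black = [(r, c) for r, row in enumerate(window) for c, v in enumerate(row) if v]
    if not black:
        return [row[:] for row in window]  # nothing to move
    up = min(r for r, _ in black)                 # rows to shift upwards
    right = cols - 1 - max(c for _, c in black)   # columns to shift rightwards
    out = [[0] * cols for _ in range(rows)]
    for r, c in black:
        out[r - up][c + right] = 1
    return out


# A 20 x 20 block with two black PELs standing in for a diacritical stroke.
block = [[0] * 20 for _ in range(20)]
block[3][15] = 1
block[4][16] = 1
sonant_buffer = justify_upper_right(cut_out(block))
```

No scaling is applied at any point, so the small strokes of the mark keep their original resolution.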
- the character data in the sonant cut-out buffer may also be used to determine the presence or absence of certain diacritical mark features that may be used in the recognition process. These features are implemented in a tree-type logical testing structure, for example to separate sonant characters from non-sonant characters and to separate the two mark nigori from the circle maru.
- Typical character recognition techniques include normalising or reducing the unknown character data for comparison with known sets of character data of a standardised size. Since diacritical marks are usually smaller than the base character, normalisation may result in a loss of small details necessary to recognise the diacritical mark, or a loss of resolution. To retain this information, the original un-normalised character data corresponding to the expected location area for the diacritical mark is extracted from the character data for the character and examined in its un-normalised condition. However, due to the larger size of the base character, its character data is often normalised. Final character recognition is then performed using the normalised character data of the unknown character, the un-normalised character data of the diacritical mark, and any diacritical mark features.
- the recognition result for the preceding character is examined to test if the preceding character is one of the twenty possible characters that may include a nigori or one of the five possible characters that may include a maru. If the preceding character turns out not to be a valid sonant base character, one or more reject codes will be generated.
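- The verification step can be illustrated as a simple membership test. The text gives only the counts (twenty and five); the membership of the two sets below is taken from ordinary Japanese orthography and is an assumption, as are the romanised labels, the `NIGORI`/`MARU` tokens and the function name `verify`.

```python
# Base characters that may take a nigori (K, S, T and H rows): twenty in all.
NIGORI_BASES = {
    "KA", "KI", "KU", "KE", "KO",
    "SA", "SI", "SU", "SE", "SO",
    "TA", "TI", "TU", "TE", "TO",
    "HA", "HI", "HU", "HE", "HO",
}
# Base characters that may take a maru (H row only): five in all.
MARU_BASES = {"HA", "HI", "HU", "HE", "HO"}


def verify(symbols):
    """Return the sorted positions to be rejected in a list of recognition results."""
    rejected = set()
    for i, symbol in enumerate(symbols):
        if symbol not in ("NIGORI", "MARU"):
            continue
        allowed = NIGORI_BASES if symbol == "NIGORI" else MARU_BASES
        if i == 0 or symbols[i - 1] not in allowed:
            # Reject the mark and, where there is one, the preceding character.
            rejected.add(i)
            if i > 0:
                rejected.add(i - 1)
    return sorted(rejected)


valid = verify(["HU", "NIGORI", "HE", "MARU"])   # BU followed by PE
invalid_nigori = verify(["N", "NIGORI"])
invalid_maru = verify(["KA", "MARU"])
```

Whether one or both positions are rejected is left open by the text; this sketch rejects both.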
- the aforesaid prior art OCR machines operate under the Japanese Industry Standard which assigns a unique hexadecimal byte to each of the 46 kana symbols, and assigns the hexadecimal bytes "BE" to the nigori diacritical mark and "BF" to the maru diacritical mark. These bytes identify the diacritical mark, and indicate that the diacritical mark is written as a symbol separate and apart from a base character in its own character space.
- the present invention assigns two new code points to identify adjacent, handwritten diacritical marks, yet includes the Japanese Industry Standard so that the output is compatible with existing OCR equipment. More specifically, "2F" is assigned to adjacent nigori, and "41" is assigned to adjacent maru.
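- The four code points can be held in a small lookup table. The byte values are those quoted above; the table layout and the helper names `mark_code` and `describe` are hypothetical, and how the mark byte is placed relative to the base character byte in the output stream is not stated in this section, so it is not modelled.

```python
# Hexadecimal bytes quoted in the text.
SEPARATE = {"nigori": 0xBE, "maru": 0xBF}   # mark written in its own character space
ADJACENT = {"nigori": 0x2F, "maru": 0x41}   # mark written adjacent to its base character


def mark_code(mark: str, adjacent: bool) -> int:
    """Return the code point for a recognised diacritical mark."""
    return (ADJACENT if adjacent else SEPARATE)[mark]


def describe(code: int):
    """Inverse lookup: (mark, adjacent) for a mark byte, or None for any other byte."""
    for adjacent, table in ((False, SEPARATE), (True, ADJACENT)):
        for mark, value in table.items():
            if value == code:
                return mark, adjacent
    return None


adjacent_nigori = mark_code("nigori", adjacent=True)
separate_maru = mark_code("maru", adjacent=False)
```

Equipment that understands only the two standard bytes is unaffected when marks are written separately, since those bytes are unchanged.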
Abstract
Description
- The present invention relates to the field of character recognition and more particularly to the recognition of characters or symbols which may have associated therewith diacritical marks, using optical character recognition apparatus.
- Languages by which human beings communicate employ a set of symbols which comprise an alphabet. Certain of these symbols can be modified by designated signs or marks, called diacritical marks, which are positioned as required by the rules of the language. These diacritical marks may, for example, require an altered pronunciation of the symbol or base character with which they are associated.
- The Japanese phonetic alphabet Katakana is of this type, and this alphabet will be used herein to describe the present invention, by way of example and not by way of limitation.
- Prior art character recognition apparatus has taken into account the need to provide special means to accommodate characters which may have an associated diacritical mark and to recognise the difference between a character with a diacritical mark and one without. United States Patent A-3,710,321 provides such an arrangement. In the device of this patent, a central, horizontal row area contains the major characters or base symbols of the alphabet. Certain of these major characters can have diacritical marks associated therewith, in areas above or below the major character. When a vertical scan of a major character, and the recognition of this character, indicates that this character is of the class with which a diacritical mark can be associated, then vertical scanning of the next major character is momentarily interrupted, and the scan is diverted to the diacritical area above or below (as the case may be) the just-recognised character. Thereafter, scanning of the major character continues, and special upper or lower diacritical recognition logic is enabled as such diacritical upper or lower areas are scanned.
- Of the two Japanese alphabets, Hiragana and Katakana, the latter is the accepted means of interfacing or representing the Japanese language to data processing equipment by means of character recognition apparatus. Each Katakana sound is a syllable formed by selectively adding each of the vowels A, I, U, E and O to each of the consonants K, S, T, N, H, M, Y, R and W. The combinations YI, YE, WI, WU and WE are excluded. The A, I, U, E, O and N sounds by themselves are also included. The pronunciation for certain of these basic symbols can be modified by adding diacritical marks, either two small lines collectively called a nigori or a small circle called a maru, immediately adjacent the upper right of the basic symbol. In addition, a long vowel symbol written as a dash follows certain basic symbols to alter their pronunciation. The two diacritical marks plus the long vowel symbol are combined with certain of the basic letters to expand the overall Katakana alphabet to include 72 characters. Those Katakana characters with a diacritical mark are often called sonants, and those without a diacritical mark are called non-sonants.
- This large symbol set makes manual keying a difficult, slow and costly means of entering data for processing in data processing apparatus. Several prior art optical character recognition (OCR) machines have been developed that automatically read handwritten Katakana symbols. However, due to the complexity of the sonant characters and the close location of the diacritical mark to the base symbol, these machines require that the diacritical mark be written as a separate mark, in its own character space, clearly separated from the base symbol which it modifies. As a result, only the 46 basic non-sonant character shapes plus the two separated and isolated diacritical marks are machine readable, and the 25 sonant characters written in their natural form with the diacritical mark located upper-right and adjacent to the basic symbol cannot be read. Thus, a special set of writing rules that differs from the usual rules of the Katakana language must be utilised for these OCR machines.
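- The character counts quoted above can be checked by enumeration. The sketch below builds the basic set from the stated vowels, consonants and exclusions; counting the long vowel symbol as a single additional character to reach 72 is an interpretation of the text rather than something it states outright, and all variable names are illustrative.

```python
VOWELS = "AIUEO"
CONSONANTS = "KSTNHMYRW"
EXCLUDED = {"YI", "YE", "WI", "WU", "WE"}

# Consonant + vowel syllables, less the five excluded combinations.
syllables = [c + v for c in CONSONANTS for v in VOWELS if c + v not in EXCLUDED]

# The five vowels and N also stand alone as basic symbols.
basic = list(VOWELS) + ["N"] + syllables

SONANTS = 25       # twenty nigori forms plus five maru forms, as stated in the text
LONG_VOWEL = 1     # assumption: the long vowel dash counts as one character

total = len(basic) + SONANTS + LONG_VOWEL
```

The enumeration gives 40 syllables and 46 basic symbols, and the assumed total agrees with the 72 characters quoted in the text.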
- The object of the present invention is to provide improved character recognition apparatus and a method of operating such apparatus for the identification of unknown characters and any diacritical marks associated therewith.
- According to the present invention, character recognition apparatus for recognising unknown characters of a known character set, some of the characters having diacritical marks associated therewith, including
- storage means for storing character data of an unknown character which may be associated with a diacritical mark, and
- recognition means for examining the stored character data in order to recognise unknown characters,
- said recognition means comprises
- extracting means for extracting from the stored character data the character data corresponding to a predetermined portion of the character which is the expected location of a diacritical mark, and
- examining means for examining the extracted diacritical mark character data and the rest of the stored character data of the unknown character in order to recognise the unknown character and any diacritical mark associated therewith.
- According to another aspect of the invention a method of recognising unknown characters of a known character set, some of the characters having diacritical marks associated therewith, including the steps of
- storing character data of an unknown character which may be associated with a diacritical mark, and
- examining the stored character data in order to recognise unknown characters,
- said examination process comprises
- extracting from said stored character data the character data corresponding to a predetermined localised area which is the expected location of a diacritical mark, and
- examining said extracted diacritical mark character data and the rest of the stored character data of said unknown character in order to recognise said unknown character and any diacritical mark associated therewith.
- Apparatus in accordance with the present invention can be used with Katakana characters when the diacritical marks are either written in their natural adjacent form, or separately written under the constraints required by the above-mentioned OCR machines. All seventy-two Katakana alphabet shapes plus the two diacritical marks as separate stand-alone symbols can be recognised utilising the present invention, without special rules, thereby allowing optical character recognition to be extended to include general public handwriting and writing done according to the rules of the Katakana language.
- Apparatus in accordance with the present invention for recognising unknown characters, where some of the characters may be associated with diacritical marks, operates as follows. The character data of an unknown character which may be associated with a diacritical mark is stored. From the stored character data is extracted a portion of the data that corresponds to the predetermined area of the unknown character space which is the expected location of a diacritical mark. The extracted diacritical mark character data and the rest of the stored character data of the unknown character are examined to recognise the character. For example, in the Katakana alphabet the diacritical marks are located to the upper right portion of the character. The portion of the character data of the unknown character that corresponds to the upper right portion of the character space is extracted and examined for recognition. This extracted diacritical mark character data is upper-right justified and remains un-normalised during the examination and recognition process. The remaining character data of the unknown character is normalised and also examined for recognition of the Katakana character.
- Characters modified by diacritical marks are normally wider than the same characters without diacritical marks and in implementing the present invention these differences are used to initially separate sonant characters from non-sonant characters. A row of unknown characters is scanned in their entirety parallel to the direction in which the lines are read, (i.e. horizontally read lines are horizontally scanned) and a horizontal profile is generated for the row of unknown characters. Logical tests are performed on the resulting profile data to separate the profile into individual segments representing an unknown character including any diacritical mark. When a proposed segmentation point for the horizontal profile coincides with a gap in the profile, the segmentation point is established. When the horizontal profile is continuous (i.e. when adjacent characters overlap) at a proposed segmentation point, the location of the proposed segmentation point is adjusted as follows. If the continuity of the profile extends less than a predetermined distance into the next-right-character position, the proposed segmentation point for the current character position is adjusted to the right to include the extended portion, which may be a diacritical mark. If the extension exceeds a predetermined distance, the position of the segmentation point can be adjusted to the left to account for a possible stroke extension of the next character (i.e. such as a diacritical mark or an adjacent character) written immediately to the right of the current character. Thus, for the Katakana alphabet, the present invention permits recognition of characters that have been written under the usual handwritten rules or under the constraints of prior art OCR machines.
- Height and width parameters are used to initially separate a modified character from an unmodified character by analysing their relative size from measurements of their horizontal and/or vertical profiles. For the Katakana alphabet, sonant characters are initially separated from non-sonant characters based upon differences in the horizontal width of the entire character, including any diacritical mark. In the past, the coding of the height and width of a character has been through exclusive-bit-coding, which is inferior to the inclusive-bit-coding used in one embodiment of the present invention. Exclusive-bit-coding sets a single unique bit for each specific value or range of height and width. With a single bit test, all characters with a specific height or width can be separated from all other characters which are either greater than or less than the specified value. To assist in identifying Katakana sonant characters it is desirable to separate all characters which are less than a specific height or width (i.e. non-sonants) from all other characters which are greater than that value (i.e. sonants). With exclusive-bit-coding this requires the testing of multiple bits. Using inclusive-bit-coding, a continuous string of bits is set to indicate that the character is at least as wide as the value represented by each bit that is set. By testing a single bit it is possible to separate all characters which are greater than a specific height or width from all those which are less than or equal to this value. Thus the usually wider and taller sonant characters having diacritical marks may be separated from the non-sonant characters that do not include a diacritical mark.
- With respect to the encoding of the recognition results for Katakana characters, the Japanese Industry Standard for assigning recognition results provides a unique one byte code point for each of the forty-six basic Katakana symbols, and for the two diacritical marks nigori and maru. These diacritical code points indicate that the sonant is present as a separate symbol in its own character space, but they do not distinguish between the diacritical mark written as a separate symbol and the more natural form where the diacritical mark is combined with the basic character. By assigning two additional code points one can indicate that the diacritical mark is included with the basic character.
- In accordance with one embodiment of the invention a particular procedure is used when a diacritical mark is recognised. The recognition result of the preceding character position is checked to verify that it is one of the twenty possible characters which can be modified by a nigori or that it is one of the five possible characters which can be modified by a maru. If the verification fails, the sonant and/or previous character recognitions can be rejected as invalid characters.
- In order that the invention may be more readily understood an embodiment of the invention will now be described with reference to the accompanying drawings, in which:
- Figure 1 is a schematic diagram reflecting the generation of data representing unknown characters by an optical scanner and the utilisation of the apparatus and method of the present invention for processing the data to obtain an output representing the unknown characters,
- Figure 2 illustrates a line of several Katakana characters and the horizontal profile for the line of characters,
- Figure 3 is a flowchart illustrating the pre-processing steps for segmenting a horizontal profile,
- Figures 4A-4D illustrate in greater detail the sequence of steps set forth in the flowchart of Figure 3,
- Figure 5 is a graph of the frequency of character occurrence versus the width of the horizontal profile for Katakana non-sonant characters and Katakana sonant characters,
- Figure 6 is a flowchart illustrating the steps for sonant identification,
- Figure 7 is a chart comparing exclusive-bit-coding and inclusive-bit-coding,
- Figure 8 illustrates a Katakana character associated with a diacritical mark, with the diacritical mark enclosed in a predetermined localised area, as indicated by dashed lines,
- Figure 9 illustrates the data in the predetermined localised area of Figure 8 justified to the upper right, and
- Figure 10 is a flowchart illustrating a post-processing verification technique to validate the proper association of a diacritical mark with a character.
- It is to be understood that although Japanese Katakana characters are illustrated, the present invention may be utilised with any alphabet having some characters that can include diacritical marks and other characters which cannot include diacritical marks. The diacritical marks may be located in any predetermined areas with respect to the characters, according to the rules of the language.
- The schematic diagram of Figure 1 illustrates the use of an optical scanner 5 to generate data representing unknown characters scanned by the scanner. The scanner 5 is of a known construction and operation. It typically scans a document 6 on which are printed unknown characters, including any diacritical marks, in a direction parallel to the direction of reading of the characters. Figure 1 illustrates a horizontal scan of a line of characters 7. The scanner scans over the entire length of the document by moving either the document or the scanner mechanism, and over the entire width of the document by selecting a field of view of appropriate width, and generates data representing the line of scanned characters. The generated character line data is preprocessed by generating a profile of the line of characters and dividing the profile into segments of individual characters, to thereby provide a block of character data for each individual character in the line. To classify the sonant characters, each block of character data has extracted therefrom the data representative of a predetermined localised area that corresponds to the expected location of a diacritical mark. The extracted character data and the rest of the character data are examined to recognise the respective diacritical mark and character. The preliminarily recognised character may then be subjected to a post-processing verification procedure to ensure that any diacritical mark recognised is associated with a character that may properly be associated with a diacritical mark under the applicable language rules.
- Referring to Figure 2, the illustrated characters HE, I, WA, N, KU, and HA do not include diacritical marks and are referred to as non-sonants. The character BU includes the diacritical mark nigori (double line) in the upper right corner, and the character PE includes the diacritical mark maru (circle), also in the upper right corner. Both of these characters are referred to as sonants. As a preliminary to the process of recognising the characters in the line of characters, the width of each of the characters in the line and the spacing between the characters is determined by generating a horizontal profile of the line of characters.
- In order to generate such a profile, the optical scanner illustrated in Figure 1 scans over the line of characters in a series of scan lines extending parallel to the line of characters and spaced apart vertically, and generates information representative of the characters in the line. This information consists of a set of coded signals for each of the scan lines representing the characters on the document. For each scan line the scanner generates logical zeros (0's) to represent a blank or background space, and logical ones (1's) to represent the presence of a portion of a character on the scan line. The scanned characters are thereby divided into picture elements (PELs). One method of generating the horizontal profile is to sequentially provide the sets of binary data signals generated for each of the scan lines for a line of characters to a register having storage positions that correspond to the picture element (PEL) locations of the scanned characters. Beginning with a clear register (i.e. all logical zeros), the binary data bits representing corresponding PELs in the sets of binary data are effectively logically combined using an OR function by successively providing them to the register. For each logical 1 in the sets of binary data, the corresponding register bit is set to a logical 1, and remains a logical 1 until the register is cleared after the entire line of characters has been scanned. For those PEL locations where there is only background space, the register bit will remain a logical zero. Having provided all of the sets of binary data to the register, it will reflect which horizontal positions have portions of character data present, and this data is reflected in the horizontal profile. Such a method of generating a horizontal character profile is described in United States Patent Application Serial No. 537280 filed 29 September 1983.
- The horizontal character line profile, such as the one illustrated at reference numeral 10 in Figure 2, appears as a series of black segments (logical 1's) separated by gaps of white (logical 0's). The black segments 11 correspond to the widths of the characters and any associated diacritical marks. The white gaps 12 correspond to the separation between adjacent characters.
- The horizontal profile is next separated into segments that represent individual unknown characters, including any diacritical marks.
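- The register-based profile generation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the apparatus itself: the function names `horizontal_profile` and `black_segments`, and the representation of each scan line as a list of 0/1 PEL values, are assumptions made for the example.

```python
def horizontal_profile(scan_lines):
    """OR all scan lines of one character line into a single register."""
    register = [0] * len(scan_lines[0])      # start from a cleared register
    for line in scan_lines:
        for position, pel in enumerate(line):
            register[position] |= pel        # a bit, once set, stays set
    return register


def black_segments(profile):
    """Return (first, last) PEL positions of each run of logical 1's."""
    segments = []
    start = None
    for position, bit in enumerate(profile):
        if bit and start is None:
            start = position                 # a black segment begins
        elif not bit and start is not None:
            segments.append((start, position - 1))
            start = None                     # a white gap begins
    if start is not None:                    # segment runs to the end
        segments.append((start, len(profile) - 1))
    return segments


if __name__ == "__main__":
    lines = [
        [0, 1, 1, 0, 0, 0, 1, 0],
        [0, 0, 1, 0, 0, 1, 1, 0],
        [0, 1, 0, 0, 0, 0, 0, 0],
    ]
    profile = horizontal_profile(lines)
    print(profile)
    print(black_segments(profile))
```

The black runs returned by `black_segments` play the role of the segments 11 of Figure 2, and the positions between them play the role of the white gaps 12.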
- Referring to Figures 3 and 4, an initial segmentation is made between characters based upon a given pitch (spacing between centres of adjacent characters), or the calculation of the pitch, which may be determined using known techniques. Logical tests are made to determine if an initial proposed segmentation point coincides with a natural segmentation point or gap between adjacent segments of the horizontal profile corresponding to gaps between adjacent characters.
- As illustrated in Figure 4A, if the proposed segmentation point 20 coincides with a gap 21 between adjacent horizontal profile segments, the proposed segmentation point is established as a separation point between characters.
- If the proposed segmentation point 20 does not coincide with a natural gap between adjacent horizontal profile segments, for example where the characters overlap or where a diacritical mark extends into the space occupied by the next-right-character, the horizontal profile is progressively tested by moving the proposed segmentation point up to a predetermined distance (for example 1.5mm) into the next character space to the right, as shown in Figure 4B. If the proposed segmentation point is moved by this testing operation into a position in which it coincides with a profile segment gap, the proposed segmentation point is established as a separation point between characters in its modified position, as illustrated by the dashed line in Figure 4B. Referring to Figure 2, the sonant characters BU and PE are examples of characters that may have wider than expected horizontal profile segments, and it would be necessary to employ the above proposed segmentation point position modification technique in order to establish the gap between a wider character of this type and an adjacent character.
- As shown in Figure 4C, if the proposed segmentation point cannot be made to coincide with a profile segment gap by moving the proposed segmentation point up to a predetermined distance to the right, as illustrated by the arrow labelled 1, the proposed segmentation point is progressively moved up to a predetermined distance (for example 1.5mm) to the left, as illustrated by the arrow labelled 2. If a proposed segmentation point moved to the left coincides with a profile segment gap, the proposed segmentation point in its modified position is established as a separation point between characters, as illustrated by the dashed line in Figure 4C.
- As shown in Figure 4D, if the proposed segmentation point could not be moved to the right (arrow 1) or to the left (arrow 2) to coincide with a natural separation point between characters, the initial position of the proposed segmentation point is established as a separation point between characters, as illustrated by the arrow labelled 3. This accommodates the instance of adjacent characters overlapping, as shown in Figure 2 by the letters N and KU.
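- The four cases of Figures 4A to 4D can be summarised as a short Python sketch. It is an illustration under stated assumptions, not the patented logic: the function name `adjust_segmentation_point` is hypothetical, the profile is assumed to be a list of 0/1 values, and the predetermined distance (1.5mm in the text) is assumed to have been converted to a whole number of PELs and passed in as `max_shift`.

```python
def adjust_segmentation_point(profile, proposed, max_shift):
    """Return the established segmentation point for one proposed point.

    profile   -- horizontal profile, 1 = character data, 0 = white gap
    proposed  -- PEL index of the pitch-based proposed segmentation point
    max_shift -- predetermined search distance, in PELs (assumed unit)
    """
    def is_gap(index):
        return 0 <= index < len(profile) and profile[index] == 0

    # Figure 4A: the proposed point already lies in a gap.
    if is_gap(proposed):
        return proposed

    # Figure 4B: search progressively to the right (arrow 1).
    for step in range(1, max_shift + 1):
        if is_gap(proposed + step):
            return proposed + step

    # Figure 4C: search progressively to the left (arrow 2).
    for step in range(1, max_shift + 1):
        if is_gap(proposed - step):
            return proposed - step

    # Figure 4D: no gap within reach, keep the initial position (arrow 3).
    return proposed


if __name__ == "__main__":
    profile = [1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1]
    for point in (3, 7, 5, 6):
        print(point, "->", adjust_segmentation_point(profile, point, 2))
```

Searching to the right first reflects the text's preference for keeping a diacritical mark, which sits at the upper right, with its base character.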
- This segmentation technique enables each diacritical mark to be included with its proper base character, according to the usual language rules for writing Katakana sonants. One may alternatively, or additionally, use a vertical character profile, which is generated by suitable manipulation of the data from the scanner, or by physically reorienting the scanner with respect to the scanned document.
- As noted earlier, Katakana characters with diacritical marks are generally taller and wider than those without. As illustrated in Figure 5, a plot of the width of the horizontal character line profile, which is directly related to actual character width, versus the frequency of character occurrence reflects that sonants are generally wider than non-sonants. Thus, width parameters can be used to reliably separate modified characters from unmodified characters.
- Having separated the character line data into blocks of individual character data by segmenting the horizontal character line profiles as defined above, sonant identification begins, as illustrated in Figure 6, by encoding the character size and separating sonant characters from non-sonant characters based upon differences in the width of the horizontal character profiles, as illustrated in Figure 5. Next, the character data corresponding to the localised area expected to contain the diacritical mark is extracted from each block of individual character data and justified. The extracted character data and the rest of the stored character data are then passed to the character recognition logic for identification of the unknown character.
- To encode character size, apparatus according to the present invention uses inclusive-bit-coding to logically store the height and width of the characters, including any diacritical marks. Referring to Figure 7, a 16-bit word is used to represent a character space that is 64 PELS wide, each bit corresponding to a width range of 4 PELS. Prior art techniques use exclusive-bit-encoding to set a single bit indicative of the width of each character. With this type of coding using a single bit test, all characters having a specific width can be separated from all others which are either greater or less than a specific value. However, to identify all characters greater than a specific value, a number of tests are required.
- Using inclusive-bit-encoding, all bits of lower order than the bit set to represent a specific width are set to the same logic state (e.g. a logical 1). By way of example, if the width of a character is 38 PELS, the inclusive-bit-encoded word associated with the character is "1111 1111 1100 0000". By comparison, the exclusive-bit-encoded word is "0000 0000 0100 0000". Using inclusive-bit-coding, it is necessary to test only a single bit of the inclusive-bit-encoded words to separate wider (or taller) characters from smaller characters. Again by way of example, by testing bit 10 of the inclusive-bit-encoded register of Figure 7, all characters less than or equal to 40 PELS wide can be separated from all characters which are greater than 40 PELS wide. To accomplish the same result by testing the exclusive-bit-encoded register would require a minimum of seven bit tests for each word.
- An additional advantage of the inclusive-bit-coding is for automatic design of recognition logics. Automatic design programs usually use a statistical decision algorithm which selects one bit to separate two distinct classes of characters based on some minimum error criteria. This coding scheme provides optimum information for such an algorithm.
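- The two coding schemes can be compared with a small Python sketch that reproduces the 38 PEL worked example. Several details are assumptions made for illustration: the words are held as 16-character strings, bits are numbered 0 to 15 from the left (which is the numbering under which testing bit 10 separates widths of 40 PELS or less from greater widths), and the function names are hypothetical.

```python
WORD_BITS = 16      # 16-bit word, as in Figure 7
PELS_PER_BIT = 4    # each bit covers a 4 PEL width range


def top_bit(width):
    """Index of the bit whose range contains `width` (1-4 -> 0, 5-8 -> 1, ...)."""
    if not 1 <= width <= WORD_BITS * PELS_PER_BIT:
        raise ValueError("width must be between 1 and 64 PELs")
    return (width - 1) // PELS_PER_BIT


def exclusive_code(width):
    """Exclusive-bit-coding: only the bit for the width range is set."""
    k = top_bit(width)
    return "0" * k + "1" + "0" * (WORD_BITS - k - 1)


def inclusive_code(width):
    """Inclusive-bit-coding: the range bit and every lower-order bit are set."""
    n = top_bit(width) + 1
    return "1" * n + "0" * (WORD_BITS - n)


def wider_than_40(inclusive_word):
    """Single bit test: bit 10 is set only for widths above 40 PELs."""
    return inclusive_word[10] == "1"


if __name__ == "__main__":
    print(inclusive_code(38))   # 1111111111000000
    print(exclusive_code(38))   # 0000000001000000
    print(wider_than_40(inclusive_code(40)), wider_than_40(inclusive_code(41)))
```

With the exclusive word, the same decision would need one test per candidate bit, which is the disadvantage the text describes.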
- Referring to Figures 8 and 9, the character data corresponding to a predetermined localised area of a character space which is the expected location of a diacritical mark is extracted and examined to recognise any diacritical mark present.
- For Katakana characters, the diacritical mark is expected to be located in the upper right portion of the character space, and a sonant cut-out or window 30, identified by the dashed lines in Figure 8, is defined and the character data corresponding to this localised area is placed into a sonant cut-out buffer. In a preferred embodiment this cut-out is defined by the rightmost 13 PELS of the top 16 scan lines of the character image data (i.e. 1.6mm by 2.1mm). This size is believed to be optimum because it contains sufficient information for diacritical mark identification, it allows efficient data handling for microprocessor implementation, and it is compatible with the recognition procedures for the base character. This location for the window was chosen because it provides the least interference with stroke segments of non-sonant characters or the base character of a sonant character. A fixed window location further provides a simple, efficient and reliable cutting process. The window size or location may be changed as necessary or desirable for other alphabets to accommodate different diacritical marks, but, for the Katakana alphabet, it is preferred to have the upper boundary level with the top of the base character and the right boundary level with the right-most portion of the base character.
- In the preferred embodiment the character data within the window is registered to the upper right when it is placed into the sonant cut-out buffer, and it is stored in its un-normalised form. This locates the diacritical mark in the upper right corner of the window, which facilitates recognition of the diacritical mark, increases the reliability of the recognition, and reduces the storage required for the recognition logic. Character data within the window that corresponds to the base character is superfluous.
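- The cut-out and upper-right registration can be illustrated with the following Python sketch. It assumes that the character image is a list of scan lines (top line first), each a list of 0/1 PEL values already cropped so that its top and right edges are level with the character; the function names `extract_window` and `justify_upper_right` are hypothetical.

```python
def extract_window(char_rows, height=16, width=13):
    """Copy the rightmost `width` PELs of the top `height` scan lines."""
    return [row[-width:] for row in char_rows[:height]]


def justify_upper_right(window):
    """Shift the window contents so they touch the top and right edges."""
    black_rows = [r for r, row in enumerate(window) if any(row)]
    if not black_rows:                        # empty window: nothing to move
        return [row[:] for row in window]
    n_cols = len(window[0])
    top = black_rows[0]                       # blank scan lines above the data
    right = max(c for row in window for c, pel in enumerate(row) if pel)
    shift = n_cols - 1 - right                # blank columns right of the data
    shifted = [[0] * shift + row[:n_cols - shift] for row in window[top:]]
    return shifted + [[0] * n_cols for _ in range(top)]


if __name__ == "__main__":
    window = [
        [0, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]
    for row in justify_upper_right(window):
        print(row)
```

The data is moved without any scaling, so the un-normalised resolution of the mark is preserved.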
- The character data in the sonant cut-out buffer may also be used to determine the presence or absence of certain diacritical mark features that may be used in the recognition process. These features are implemented in a tree-type logical testing structure, for example to separate sonant characters from non-sonant characters and to separate the double-line mark nigori from the circle maru.
- Typical character recognition techniques include normalising or reducing the unknown character data for comparison with known sets of character data of a standardised size. Since diacritical marks are usually smaller than the base character, normalisation may result in a loss of small details necessary to recognise the diacritical mark, or a loss of resolution. To retain this information, the original un-normalised character data corresponding to the expected location area for the diacritical mark is extracted from the character data for the character and examined in its un-normalised condition. However, due to the larger size of the base character, its character data is often normalised. Final character recognition is then performed using the normalised character data of the unknown character, the un-normalised character data of the diacritical mark, and any diacritical mark features.
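- The loss of detail caused by normalisation can be seen in a toy Python example. The text does not specify a normalisation method, so nearest-neighbour subsampling is used here purely as an assumed stand-in, and the function name `normalise` is hypothetical.

```python
def normalise(image, out_height, out_width):
    """Reduce a binary image by nearest-neighbour subsampling (assumed method)."""
    in_height, in_width = len(image), len(image[0])
    return [
        [image[r * in_height // out_height][c * in_width // out_width]
         for c in range(out_width)]
        for r in range(out_height)
    ]


if __name__ == "__main__":
    # 8 x 8 character: a two-PEL-wide base stroke and a one-PEL mark.
    image = [[0] * 8 for _ in range(8)]
    for row in range(2, 8):
        image[row][2] = image[row][3] = 1    # base character stroke
    image[1][7] = 1                          # small mark at the upper right

    reduced = normalise(image, 4, 4)
    for row in reduced:
        print(row)
    # The base stroke survives the reduction; the single-PEL mark does not,
    # which is why the mark window is examined in un-normalised form.
```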
- Following character recognition it may be desirable to verify that a separately identified diacritical mark should be associated with the preceding character. Referring to Figure 10, if a diacritical mark is present, the recognition result for the preceding character is examined to test if the preceding character is one of the twenty possible characters that may include a nigori or one of the five possible characters that may include a maru. If the preceding character turns out not to be a valid sonant base character, one or more reject codes will be generated.
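- A minimal Python sketch of this post-processing check follows. The representation of recognition results as romanised names, the tokens "NIGORI" and "MARU", the reject symbol "?", the choice to reject both the mark and the preceding character, and the function name `verify_marks` are all assumptions made for illustration; the two sets list the twenty and five Katakana base characters that take each mark.

```python
# Base characters that may carry each mark (romanised names, assumed labels).
NIGORI_BASES = frozenset(
    "KA KI KU KE KO SA SHI SU SE SO TA CHI TSU TE TO HA HI FU HE HO".split()
)
MARU_BASES = frozenset("HA HI FU HE HO".split())

ALLOWED_BASES = {"NIGORI": NIGORI_BASES, "MARU": MARU_BASES}
REJECT = "?"    # hypothetical reject code


def verify_marks(results):
    """Reject any mark, and its predecessor, that forms an invalid pair."""
    verified = list(results)
    for index, result in enumerate(results):
        allowed = ALLOWED_BASES.get(result)
        if allowed is None:
            continue                          # not a diacritical mark
        if index == 0 or results[index - 1] not in allowed:
            verified[index] = REJECT          # reject the mark
            if index > 0:
                verified[index - 1] = REJECT  # and the preceding character
    return verified


if __name__ == "__main__":
    print(verify_marks(["HE", "MARU", "I", "NIGORI", "FU", "NIGORI"]))
```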
- The aforesaid prior art OCR machines operate under the Japanese Industry Standard which assigns a unique hexadecimal byte to each of the 46 kana symbols, and assigns the hexadecimal bytes "BE" to the nigori diacritical mark and "BF" to the maru diacritical mark. These bytes identify the diacritical mark, and indicate that the diacritical mark is written as a symbol separate and apart from a base character in its own character space. The present invention assigns two new code points to identify adjacent, handwritten diacritical marks, yet includes the Japanese Industry Standard so that the output is compatible with existing OCR equipment. More specifically, "2F" is assigned to adjacent nigori, and "41" is assigned to adjacent maru.
- In the drawings and specification there has been set forth an exemplary embodiment of the invention. It should be understood that while specific terms are used, they are employed in a generic and descriptive sense only and are not for purposes of limitation.
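- The four mark code points named above can be collected in a small lookup table, sketched here in Python. Only the values BE, BF, 2F and 41 come from the text; the function name `annotate`, the labels, and the base character bytes used in the demonstration are hypothetical, and every other byte is simply treated as an opaque base character code.

```python
# Mark code points from the text: (mark, how it was written).
MARK_CODES = {
    0xBE: ("nigori", "separate"),   # standard code point, own character space
    0xBF: ("maru", "separate"),     # standard code point, own character space
    0x2F: ("nigori", "adjacent"),   # additional code point, written with base
    0x41: ("maru", "adjacent"),     # additional code point, written with base
}


def annotate(codes):
    """Label each output byte as a base character or a diacritical mark."""
    labelled = []
    for code in codes:
        if code in MARK_CODES:
            labelled.append(MARK_CODES[code])
        else:
            labelled.append(("base", format(code, "02X")))
    return labelled


if __name__ == "__main__":
    # 0x81 and 0x82 are placeholder base character bytes, not real code points.
    print(annotate([0x81, 0x2F, 0x82, 0xBF]))
```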
- The full details of the character recognition apparatus have not been described since it will be fully appreciated by one skilled in the art how to construct apparatus which will operate as described herein.
is characterised in that
is characterised in that
Claims (13)
characterised in that
said apparatus comprises segmentation means for segmenting said character line data to obtain character data of the individual unknown characters and any diacritical marks associated therewith.
said segmentation means comprises
whereby the individual portions of the profile between the segmentation points define the size of the separated individual segments of the unknown characters.
said examining means examines the un-normalised diacritical mark data in order to recognise said diacritical mark.
characterised in that said examination process comprises
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US06/537,279 US4611346A (en) | 1983-09-29 | 1983-09-29 | Method and apparatus for character recognition accommodating diacritical marks |
US537279 | 1983-09-29 |
Publications (3)
Publication Number | Publication Date |
---|---|
EP0138079A2 (en) | 1985-04-24 |
EP0138079A3 (en) | 1988-07-06 |
EP0138079B1 (en) | 1991-08-07 |
Family
ID=24141985
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP84111043A Expired - Lifetime EP0138079B1 (en) | 1983-09-29 | 1984-09-17 | Character recognition apparatus and method for recognising characters associated with diacritical marks |
Country Status (5)
Country | Link |
---|---|
US (1) | US4611346A (en) |
EP (1) | EP0138079B1 (en) |
JP (1) | JPS6077274A (en) |
CA (1) | CA1208784A (en) |
DE (1) | DE3484890D1 (en) |
US10083688B2 (en) | 2015-05-27 | 2018-09-25 | Apple Inc. | Device voice control for selecting a displayed affordance |
US10127220B2 (en) | 2015-06-04 | 2018-11-13 | Apple Inc. | Language identification from short strings |
US10101822B2 (en) | 2015-06-05 | 2018-10-16 | Apple Inc. | Language input correction |
US10186254B2 (en) | 2015-06-07 | 2019-01-22 | Apple Inc. | Context-based endpoint detection |
US10255907B2 (en) | 2015-06-07 | 2019-04-09 | Apple Inc. | Automatic accent detection using acoustic models |
US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US9697820B2 (en) | 2015-09-24 | 2017-07-04 | Apple Inc. | Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks |
US10366158B2 (en) | 2015-09-29 | 2019-07-30 | Apple Inc. | Efficient word encoding for recurrent neural network language models |
US11010550B2 (en) | 2015-09-29 | 2021-05-18 | Apple Inc. | Unified language modeling framework for word prediction, auto-completion and auto-correction |
US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US10446143B2 (en) | 2016-03-14 | 2019-10-15 | Apple Inc. | Identification of voice inputs providing credentials |
US9934775B2 (en) | 2016-05-26 | 2018-04-03 | Apple Inc. | Unit-selection text-to-speech synthesis based on predicted concatenation parameters |
US9972304B2 (en) | 2016-06-03 | 2018-05-15 | Apple Inc. | Privacy preserving distributed evaluation framework for embedded personalized systems |
US10249300B2 (en) | 2016-06-06 | 2019-04-02 | Apple Inc. | Intelligent list reading |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple, Inc. | Intelligent automated assistant for media exploration |
DK179309B1 (en) | 2016-06-09 | 2018-04-23 | Apple Inc | Intelligent automated assistant in a home environment |
US10490187B2 (en) | 2016-06-10 | 2019-11-26 | Apple Inc. | Digital assistant providing automated status report |
US10586535B2 (en) | 2016-06-10 | 2020-03-10 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10509862B2 (en) | 2016-06-10 | 2019-12-17 | Apple Inc. | Dynamic phrase expansion of language input |
US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
US10192552B2 (en) | 2016-06-10 | 2019-01-29 | Apple Inc. | Digital assistant providing whispered speech |
DK201670540A1 (en) | 2016-06-11 | 2018-01-08 | Apple Inc | Application integration with a digital assistant |
DK179343B1 (en) | 2016-06-11 | 2018-05-14 | Apple Inc | Intelligent task discovery |
DK179415B1 (en) | 2016-06-11 | 2018-06-14 | Apple Inc | Intelligent device arbitration and control |
DK179049B1 (en) | 2016-06-11 | 2017-09-18 | Apple Inc | Data driven natural language event detection and classification |
US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
DK179745B1 (en) | 2017-05-12 | 2019-05-01 | Apple Inc. | Synchronization and task delegation of a digital assistant |
DK201770431A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3710321A (en) | 1971-01-18 | 1973-01-09 | Ibm | Machine recognition of lexical symbols |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3638188A (en) * | 1969-10-17 | 1972-01-25 | Westinghouse Electric Corp | Classification method and apparatus for pattern recognition systems |
US4206442A (en) * | 1974-07-03 | 1980-06-03 | Nippon Electric Co., Ltd. | Letter segmenting apparatus for OCR comprising multi-level segmentor operable when binary segmenting fails |
JPS5156139A (en) * | 1974-11-13 | 1976-05-17 | Hitachi Ltd | Segmentation method in a character reading apparatus (Mojomitorisochi niokeru kiridashihoshiki) |
JPS6043555B2 (en) * | 1980-02-26 | 1985-09-28 | 株式会社トキメック | Printed character cutting device |
DE3380462D1 (en) * | 1982-12-28 | 1989-09-28 | Nec Corp | Character pitch detecting apparatus |
- 1983-09-29: US application US06/537,279, patent US4611346A, not active (Expired - Fee Related)
- 1984-08-14: CA application CA000461012A, patent CA1208784A, not active (Expired)
- 1984-08-18: JP application JP59171017A, patent JPS6077274A, active (Granted)
- 1984-09-17: EP application EP84111043A, patent EP0138079B1, not active (Expired - Lifetime)
- 1984-09-17: DE application DE8484111043T, patent DE3484890D1, not active (Expired - Fee Related)
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2222475A (en) * | 1988-08-10 | 1990-03-07 | Caere Corp | Optical character recognition |
US5131053A (en) * | 1988-08-10 | 1992-07-14 | Caere Corporation | Optical character recognition method and apparatus |
US5278920A (en) * | 1988-08-10 | 1994-01-11 | Caere Corporation | Optical character recognition method and apparatus |
US5278918A (en) * | 1988-08-10 | 1994-01-11 | Caere Corporation | Optical character recognition method and apparatus using context analysis and a parsing algorithm which constructs a text data tree |
US5381489A (en) * | 1988-08-10 | 1995-01-10 | Caere Corporation | Optical character recognition method and apparatus |
Also Published As
Publication number | Publication date |
---|---|
CA1208784A (en) | 1986-07-29 |
EP0138079B1 (en) | 1991-08-07 |
US4611346A (en) | 1986-09-09 |
DE3484890D1 (en) | 1991-09-12 |
JPS6077274A (en) | 1985-05-01 |
EP0138079A3 (en) | 1988-07-06 |
JPH0432430B2 (en) | 1992-05-29 |
Similar Documents
Publication | Title |
---|---|
EP0138079B1 (en) | Character recognition apparatus and method for recognising characters associated with diacritical marks |
US4926492A (en) | Optical character reading apparatus and method |
US5359673A (en) | Method and apparatus for converting bitmap image documents to editable coded data using a standard notation to record document recognition ambiguities |
EP0621542B1 (en) | Method and apparatus for automatic language determination of a script-type document |
EP0163377B1 (en) | Pattern recognition system |
EP0138445B1 (en) | Method and apparatus for segmenting character images |
JPH07200745A (en) | Comparison method of at least two image sections |
US4975974A (en) | Character recognition apparatus |
RU2259592C2 (en) | Method for recognizing graphic objects using integrity principle |
EP0516576A2 (en) | Method of discriminating between text and graphics |
US5077809A (en) | Optical character recognition |
JP3710164B2 (en) | Image processing apparatus and method |
KR910007032B1 (en) | A method for truncating strings of characters and each character in korean documents recognition system |
JPS59158482A (en) | Character recognizing device |
JP2578767B2 (en) | Image processing method |
JPH0514952B2 (en) | |
JPH06119497A (en) | Character recognizing method |
JPS60138689A (en) | Character recognizing method |
JPH11134439A (en) | Method for recognizing word |
JPH06187450A (en) | Pattern recognizing method and device therefor |
JPH1011542A (en) | Character recognition device |
JPS636686A (en) | Character recognizing device |
JPH04335487A (en) | Character segmenting method for character recognizing device |
JPS5914078A (en) | Reader of business form |
JPS62169285A (en) | Document processor |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under Article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
17P | Request for examination filed | Effective date: 19841123 |
AK | Designated contracting states | Designated state(s): CH DE FR GB IT LI NL |
PUAL | Search report despatched | Free format text: ORIGINAL CODE: 0009013 |
AK | Designated contracting states | Kind code of ref document: A3 Designated state(s): CH DE FR GB IT LI NL |
17Q | First examination report despatched | Effective date: 19900301 |
GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
AK | Designated contracting states | Kind code of ref document: B1 Designated state(s): CH DE FR GB IT LI NL |
REF | Corresponds to: | Ref document number: 3484890 Country of ref document: DE Date of ref document: 19910912 |
ET | FR: translation filed | |
ITF | IT: translation for an EP patent filed | Owner name: IBM - DR. ARRABITO MICHELANGELO |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Ref country code: CH Payment date: 19911217 Year of fee payment: 8 |
PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
26N | No opposition filed | |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: LI Effective date: 19920930 Ref country code: CH Effective date: 19920930 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Ref country code: NL Payment date: 19920930 Year of fee payment: 9 |
REG | Reference to a national code | Ref country code: CH Ref legal event code: PL |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: NL Effective date: 19940401 |
NLV4 | NL: lapsed or annulled due to non-payment of the annual fee | |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Ref country code: FR Payment date: 19950911 Year of fee payment: 12 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Ref country code: DE Payment date: 19950921 Year of fee payment: 12 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] | Ref country code: GB Payment date: 19960827 Year of fee payment: 13 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: FR Effective date: 19960930 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: DE Effective date: 19970603 |
REG | Reference to a national code | Ref country code: FR Ref legal event code: ST |
REG | Reference to a national code | Ref country code: FR Ref legal event code: ST |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] | Ref country code: GB Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 19970917 |
GBPC | GB: European patent ceased through non-payment of renewal fee | Effective date: 19970917 |