US3709525A - Character recognition - Google Patents

Info

Publication number
US3709525A
Authority
US
United States
Prior art keywords
editing
symbol
line
sectors
transistor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US00871550A
Inventor
A Frank
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Scan Data Corp
Original Assignee
Scan Data Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06K GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K19/00 Record carriers for use with machines and with at least a part designed to carry digital markings
    • G06K19/06 Record carriers for use with machines and with at least a part designed to carry digital markings characterised by the kind of the digital marking, e.g. shape, nature, code
    • G06K19/08 Record carriers for use with machines and with at least a part designed to carry digital markings characterised by the kind of the digital marking, e.g. shape, nature, code using markings of different kinds or more than one marking of the same kind in the same record carrier, e.g. one marking being sensed by optical and the other by magnetic means
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09F DISPLAYING; ADVERTISING; SIGNS; LABELS OR NAME-PLATES; SEALS
    • G09F23/00 Advertising on or in specific articles, e.g. ashtrays, letter-boxes
    • G09F2023/0016 Advertising on or in specific articles, e.g. ashtrays, letter-boxes on pens

Definitions

  • This invention relates to character recognition and more particularly to a method of editing a document prior to optical scanning thereof in a character recognition system. U.S. Cl. 283/1, 235/61.12 N. Int. Cl. G06k 19/00. Field of Search 283/1, 17; 340/146.3. 1 Claim, 12 Drawing Figures.
  • Sample text from the drawings (FIGS. 3 and 4): "the Planners are pondering various combinations of American help and foreign self-help. The choices finally made will hinge in part on how much room is left for welfare programs as Vietnam was speeding rises."
  • Another object of the invention is to provide a font of editing symbols which enable a sheet of textual material to be corrected or altered without requiring manual reproduction of the material for conversion by a character recognition system into machine language.
  • Another object of this invention is to provide a new and improved method of altering textual material for reading the altered material directly by an optical scanning device.
  • Another object of this invention is to provide a new and improved method of reading altered textual material into a machine.
  • each of said symbols being comprised of a portion of a symbol comprising a vertically extending upright bar, a pair of horizontally extending top bars which extend to opposite sides of said upright bar from the top thereof, a pair of horizontally extending center bars which extend to opposite sides of said upright bar from the center thereof, and a pair of horizontally extending bottom bars which extend to opposite sides of said upright bar from the bottom thereof, whereby each of said editing symbols is comprised of said upright bar and a combination of the presence and absence of said horizontal bars.
  • a font of editing symbols is provided which are easily recognizable by a character recognition system though handwritten.
  • the symbols are comprised of a vertical bar and the combinatorial presence and absence of six horizontal bars which extend from the vertical upright bar.
  • These editing symbols are used in conjunction with textual material by insertion of a proper one of the symbols underneath the portion of textual material which is in error.
  • the page of textual material and handwritten symbols may then be read by a character recognition system which will automatically edit the textual material in accordance with the editing symbols.
  • FIG. 1 is an enlarged plan view of the basic editing symbol from which the font of editing symbols is comprised
  • FIG. 2 is a font of editing symbols comprised of the editing symbol shown in FIG. 1;
  • FIG. 3 is a plan view of a sheet of textual material as edited by a standard editing technique
  • FIG. 4 is a plan view of a sheet of the same textual material as that shown in FIG. 3 as edited in accordance with the invention
  • FIG. 5 is a schematic block diagram of an optical scanning system embodying the invention.
  • FIG. 6 is a schematic block diagram of the shift register used in the system.
  • FIG. 7 is a schematic diagram of the flip-flop circuitry used throughout the shift register
  • FIG. 8 is a schematic diagram of a recognition circuit used in the Feature Extraction Mask unit
  • FIG. 9 is a pictorial diagram illustrative of the operation of the recognition circuits.
  • FIG. 10 is a schematic block diagram of the flow of data throughout the system.
  • FIG. 11 is a schematic block diagram of the flow of data within a computer after a document has been scanned.
  • FIG. 12 is a pictorial diagram illustrative of the recognition circuits for an editing symbol.
  • the editing symbol embodying the present invention is generally shown at 20 in FIG. 1.
  • the editing symbol 20 is basically comprised of a vertical upright bar 22 which extends from the bottom to the top of the editing symbol.
  • The symbol also includes six horizontal bars 24, 26, 28, 30, 32 and 34.
  • the first pair of horizontal bars 24 and 26 extend laterally from the top of upright bar 22 to the left and right sides, respectively.
  • the pair of bars 28 and 30 extend laterally from the center of upright bar 22 to the left and right, respectively.
  • the pair of bars 32 and 34 extend from the bottom end of upright bar 22 to the left and right sides, respectively.
  • The editing symbol 20 is the basic structure for a font of handwritten symbols which are formed as a combination of the upright bar 22 and a combination of the presence and absence of bars 24 to 34. This font of symbols is shown in FIG. 2. As can be seen, there are 64 symbols which can be formed from the editing symbol shown in FIG. 1. That is, the number of combinations which can be derived from the presence and absence of six bars is 2^6, or 64. As will be seen hereinafter, the provision of an editing symbol having only a single vertical bar and a plurality of horizontal bars facilitates the easy recognition of the symbol.
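  • The count of 64 can be checked by enumerating every combination of the six horizontal bars. The following is a minimal sketch, not part of the patent: the names `BARS`, `symbol_code` and `ALL_SYMBOLS`, and the choice to assign one bit per bar in the order of the reference numerals, are assumptions made purely for illustration.

```python
from itertools import combinations

# Reference numerals of the six horizontal bars in FIG. 1.
# The bit order (24 = bit 0 ... 34 = bit 5) is an arbitrary choice for this sketch.
BARS = (24, 26, 28, 30, 32, 34)


def symbol_code(present):
    """Map a set of present horizontal bars to an integer from 0 to 63."""
    unknown = set(present) - set(BARS)
    if unknown:
        raise ValueError(f"not a horizontal bar of the basic symbol: {sorted(unknown)}")
    return sum(1 << i for i, bar in enumerate(BARS) if bar in present)


# Every symbol is the upright bar 22 plus some subset of the six horizontal bars.
ALL_SYMBOLS = [
    frozenset(subset)
    for size in range(len(BARS) + 1)
    for subset in combinations(BARS, size)
]

print(len(ALL_SYMBOLS))  # 64
```

Each subset maps to a distinct code, so the font cannot hold more than 64 distinct symbols.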
  • Each of the symbols in FIG. 2 may be used to represent either an editing instruction, an alphabetic insertion or a linecasting instruction.
  • The editing symbol 36, which is comprised of vertical bar 22 and horizontal bars 24, 26, 32 and 34 and which is shown in the third column from the right and third row from the bottom in FIG. 2, may be used as an editing instruction to indicate a capital letter is required rather than a lower case. That is, where textual material is printed with a mistake such as not capitalizing the first letter of a proper noun, the editing symbol 36 may be written underneath the first letter of the word which needs capitalization. In this manner, when the textual material is inserted in the character recognition system of the invention, the editing symbol 36 instructs the system to alter the textual material in accordance with the instruction. Thus, rather than the machine printing a lower case letter, a capital letter is printed instead.
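  • (Table residue: a listing of alphabetic insertions and linecasting instructions such as point size and pica line length; the entries are not recoverable from the source text.)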
  • The same symbol may be used not only for an editing instruction, but also for an alphabetic insertion or linecasting instruction. That is, there are only 64 editing symbols which may be made from the vertical bar 22 and horizontal bars 24 through 34 of the basic editing symbol 20. Thus, if the total number of instructions, alphabetic insertions and linecasting instructions is greater than 64, it is necessary to use the editing symbols in more than one manner.
  • the manner in which the symbol is used can be determined by either its location or the adjacent editing symbols. For instance, an alphabetic insertion is always used after an editing instruction symbol. Therefore, the editing instruction symbol enables the computer to determine that the following symbol thereafter is an alphabetic insertion. It should also be understood that editing symbols may be used not only for alphabetic and graphic arts insertions, but numerical insertions and other symbols as well.
  • the sensing of the linecasting instructions in a position other than within the text as will be seen hereinafter enables determination by the computer that the symbol is specifically to be used as a graphic arts instruction as opposed to an alteration of the textual material.
  • the use of the editing symbols will be more clearly seen in conjunction with the example hereinafter shown.
  • In FIG. 3 and FIG. 4 there are shown sheets 38 and 40, respectively, of textual material which has been edited by the use of Government Printing Office (hereinafter abbreviated to GPO) symbols and by use of the editing symbols of the invention, respectively.
  • In FIG. 4 the same instruction is indicated in the top left-hand corner by editing symbols 42, 44 and 46.
  • Symbol 42 is comprised of the vertical upright bar 22 and the presence of horizontal bars 24, 26, 30 and 32 and the absence of the remaining bars (28 and 34).
  • Editing symbol 44 is comprised of the vertical upright bar 22 and the presence of horizontal bars 26, 28 and 34.
  • Editing symbol 46 is comprised of the vertical upright bar 22 and the presence of horizontal bars 26, 28, 30 and 32.
  • Editing symbol 42 indicates a linecasting instruction of Vogue Bold.
  • The editing symbol 44 is the instruction for point and editing symbol 46 is the linecasting instruction for 11 pica line.
  • The first line of textual material in sheet 38 is in error in that the small "s" in the abbreviation "U.S." should be a capital "S". This mistake is indicated by the GPO symbol of three parallel lines handwritten below the letter in error, which indicates that a capital letter "S" should replace the lower case letter "s".
  • The symbol 36 may be placed underneath the small "s" on the first line of sheet 40 in FIG. 4 to indicate that a capital "S" should be inserted therefor.
  • The "z" is corrected to an "o" by inserting the editing symbol 48 underneath the "z".
  • Editing symbol 48 indicates that the letter above it should be deleted.
  • the editing symbol 48 is followed by editing symbols 50 and 52.
  • Editing symbol 50 indicates that the letter "a" should be inserted and editing symbol 52 indicates that there are no further letters to be inserted in place of the deleted "z".
  • The word "worlds" should be "world".
  • The GPO symbol indicating a deletion is written through the "s" to indicate that the word should be "world".
  • The editing symbol 54, which indicates that a deletion only should be made, is inserted under the "s" in "worlds", thus indicating the deletion thereof.
  • The editing symbol 56 is inserted underneath the second "d" and the "f" in "neededfood" to indicate that a space should be inserted between the letter "d" and the letter "f".
  • The word "ahead" is in error in that the "a" and the "e" are not in the proper order.
  • The GPO symbol "tr" for transposing is placed underneath the "ae".
  • Editing symbol 58 is inserted underneath the "ae" to indicate that a transposition of the "a" and the "e" is necessary.
  • In the next line, which is the first line of the second paragraph, there is an error in that the word "of" should be inserted between "specter" and "starvation" and the first letter "P" in the word "Planners" should be lower case.
  • The first error on the line is corrected on sheet 38 by the insertion of the GPO symbol "^" for the start of an insertion between the words "specter" and "starvation" and the insertion of the word "of" above the symbol.
  • On sheet 40, the symbol 60 is placed below the space between the words "specter" and "starvation" to indicate the start of an insertion, and the symbols 62, 64 and 58 follow to indicate that the letters "o" and "f" should be inserted between "specter" and "starvation".
  • On sheet 38, the GPO symbol "lc" to indicate lower case is inserted above the "P" in "Planners" to correct the case error.
  • The editing symbol 48 is inserted underneath the letter "s" in the word "was" and underneath the letter "c" in the word "speeding".
  • the first symbol 48 is followed by editing symbols 68 and 52.
  • The second editing symbol 48 is followed by editing symbols 70 and 52. These symbols are inserted after the symbol 48, which indicates the start of a deletion, to indicate that the "s" and the "c" are to be replaced, respectively, by an "r" and an "n".
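  • The corrections walked through above amount to a small set of position-based operations on a line of text. The sketch below illustrates that idea only; it is not the patent's computer program. The operation names, the `(index, operation, payload)` tuple format and the function `apply_edits` are all hypothetical, and the misspelling "ahaed" is an assumed example of a transposition error.

```python
def apply_edits(text, edits):
    """Apply position-based edits to one line of text.

    `edits` is a list of (index, operation, payload) tuples, where `index` is
    the 0-based position of the character the symbol was written under.
    Edits are applied from right to left so earlier indices stay valid.
    This sketch assumes at most one edit per index.
    """
    chars = list(text)
    for index, operation, payload in sorted(edits, key=lambda e: e[0], reverse=True):
        if operation == "capitalize":        # e.g. symbol 36
            chars[index] = chars[index].upper()
        elif operation == "lowercase":
            chars[index] = chars[index].lower()
        elif operation == "delete":          # deletion only, e.g. symbol 54
            del chars[index]
        elif operation == "replace":         # deletion followed by inserted letters
            chars[index:index + 1] = list(payload)
        elif operation == "insert_space":    # space after the marked character
            chars.insert(index + 1, " ")
        elif operation == "transpose":       # swap the marked character with the next
            chars[index], chars[index + 1] = chars[index + 1], chars[index]
        elif operation == "insert":          # insert payload before the marked position
            chars[index:index] = list(payload)
        else:
            raise ValueError(f"unknown operation: {operation}")
    return "".join(chars)


print(apply_edits("the worlds neededfood", [(9, "delete", ""), (16, "insert_space", "")]))
print(apply_edits("was", [(2, "replace", "r")]))
```

Applying the edits from the highest index downward is a design choice of this sketch: it lets every index refer to the text as originally printed, which matches how the symbols are written under the original characters.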
  • the symbols are easy to write and are very flexible. Thus, the symbols may be used not only for instructions for altering or deleting, but for use to indicate alphabetic insertions, linecasting instructions as well as other insertions or instructions.
  • the schematic block diagram of a system which may be used to optically scan the edited sheet 40 is shown in FIG. 5.
  • The system includes a document handling unit 72, a scanner unit 74, an instruction control unit 76, a cross-correlation unit comprising a shift register 78, feature extraction masks 80 and logic circuitry comprised of the combination of features to characters circuits 82 and the code generator 84, a master control unit 86 and an input-output buffer unit 88.
  • the document handling unit 72 basically comprises a rotating cylindrical platen and a document input unit which feeds the incoming documents to the platen.
  • the rotating platen supports the documents and is adjacent the scanner unit 74.
  • Scanner unit 74 is a flying spot scanner and basically comprises a cathode ray tube, a photomultiplier tube and a pulse shaping circuit.
  • the cathode ray tube supplies a raster of light which is directed at the document which is presently in position for being read on the rotating platen of the document handling unit 72.
  • the size of the raster and the location thereof are determined by the inputs on lines 100 and 102. Lines 100 and 102 are connected between the scanner unit 74 and the instruction control unit 76.
  • The lines 100 and 102 actually indicate a plurality of lines, as indicated by their thickness. Throughout FIG. 5 those lines which are heavy indicate that the line is actually a cable having a plurality of input or output lines in multiple.
  • the horizontal positioning of the cathode ray tube raster is determined by the inputs on lines 100.
  • the horizontal size of the raster is also controlled by the instructions fed to the scanner unit on lines 100.
  • the size and location of the vertical position of the raster in the cathode ray tube of scanner unit 74 is determined by the inputs on lines 102 from the instruction control unit 76.
  • the horizontal and vertical locations of the raster are also fed back via lines 100 and 102 to the instruction control unit.
  • The locations are in turn fed to the master control unit via input-output buffer unit 88 so that the horizontal and vertical position or coordinates of a character are stored with the character when it is recognized, as will hereinafter be seen.
  • The cathode ray tube in the scanner unit 74 forms the output of the flying spot scanner system which emits a beam of light which is directed to the document being read on the document handling unit 72.
  • the beam is appropriately directed by a lens system between the cathode ray tube and the document.
  • The beam is scanned in a raster which is slightly larger than the largest character which is to be scanned. In the present embodiment, the preferred raster includes thirty vertical scans.
  • the photomultiplying tube in the scanner unit 74 is connected to a pulse shaper which samples the output from the photomultiplying tube at predetermined intervals.
  • The photomultiplier tube emits a signal in accordance with the reflection of the beam of light on the surface of the document.
  • When the beam falls on a white surface of the document, the output of the photomultiplier tube is at one level.
  • the location of the beam from the cathode ray tube on a black surface of the document such as a character produces a different signal level output from the photomultiplier.
  • the pulse shaper samples the photomultiplier output at discrete intervals so that pulses are produced indicative of either a white surface or a black surface as the cathode ray tube beam scans the surface of the document.
  • The pulse shaper samples the output of the photomultiplier tube forty times in each of the columns.
  • The pulse shaper will produce 1,200 (30 columns × 40 samples per column) discrete outputs.
  • the pulse shaper also includes appropriate gating so that unless a certain threshold of illumination is reflected to the photomultiplier, the output indicates that a black area has been scanned. In this manner, a digital output is produced.
  • The output of the pulse shaper is therefore either one of two levels: the first level indicating that the area scanned is predominantly black at the sampled location and a second level indicating the sampled location is predominantly white.
  • the threshold circuitry within the pulse shaper enables the generation of a discrete digital output of either one level or another.
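  • The sampling and thresholding just described can be modelled in a few lines. This is a sketch under stated assumptions rather than a description of the actual circuit: reflected light is represented as a number between 0.0 (dark) and 1.0 (bright), the threshold value of 0.5 is arbitrary, and the names `pulse_shape`, `COLUMNS` and `SAMPLES_PER_COLUMN` are invented for the example. A bit value of 1 stands for black.

```python
COLUMNS = 30             # vertical scans per raster, from the text
SAMPLES_PER_COLUMN = 40  # samples taken along each vertical scan, from the text


def pulse_shape(reflectance, threshold=0.5):
    """Turn a raster of reflectance samples into 1,200 bits in scan order.

    `reflectance` is a list of COLUMNS lists, each holding SAMPLES_PER_COLUMN
    values between 0.0 and 1.0. A sample that does not reach the threshold of
    illumination is reported as black (1); otherwise it is white (0).
    The numeric threshold is an assumption of this sketch.
    """
    if len(reflectance) != COLUMNS:
        raise ValueError("expected one list of samples per column")
    bits = []
    for column in reflectance:
        if len(column) != SAMPLES_PER_COLUMN:
            raise ValueError("expected 40 samples in every column")
        for level in column:
            bits.append(0 if level >= threshold else 1)
    return bits


# An all-white raster with a single dark sample in the second column, third row.
raster = [[1.0] * SAMPLES_PER_COLUMN for _ in range(COLUMNS)]
raster[1][2] = 0.1
bits = pulse_shape(raster)
print(len(bits), sum(bits))  # 1200 1
```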
  • the output from the pulse shaper is fed via line 104 of scanner unit 74 to the shift register 78.
  • Shift register 78 is capable of storing 1,200 bits. That is, the output from the scanner unit 74 for a complete character scan on a document in the document handling unit 72 may be stored in the shift register.
  • the shift register includes 1,200 flip-flops which are serially connected as shown in FIG. 6.
  • The flip-flops are shown in 30 vertical columns labeled C-1, C-2 through C-30 and 40 horizontal rows labeled R-1, R-2, R-3 through R-40 in accordance with the location of the samples in the scanning raster.
  • The first column C-1 is comprised of flip-flops FF-1, FF-2, FF-3 through FF-40. These flip-flops are serially connected. That is, the output of FF-1 is connected to the input of FF-2, the output of FF-2 is connected to the input of FF-3, and so on until the output of FF-39 is connected to the input of FF-40.
  • The output of flip-flop FF-40 is connected to the input of FF-41, which is located at the top of the second column C-2.
  • FF-41 through FF-80 comprise the second column and are similarly serially connected.
  • The output of FF-80 is connected to the input of FF-81 and so on through to the 30th column C-30.
  • Each of the forty flip-flops in a column corresponds to the points along a vertical column of a raster at which the output of the photomultiplier tube in the scanner unit 74 are sampled by the pulse shaper unit.
  • the pulse shaper unit samples the output of the photomultiplier in accordance with signals fed by a clock pulse source which also feeds shift pulses via line 106 to the shift register 78.
  • The line 106 is connected to the input of each of flip-flops FF-1 to FF-1,200.
  • the pulses on line 106 advance the information through the shift register.
  • the shift register 78 need not be physically positioned in 30 columns of 40 flip-flops.
  • The flip-flops FF-1 through FF-1,200 may be positioned so that the flip-flops are in a single line from FF-1 through FF-1,200 or in any other physical location. It is not necessary that the flip-flops be positioned in accordance with the location of the sampled raster.
  • the necessity of positioning the flip-flops in a rectangular pattern is obviated by use of electronic extraction masks which are connected to the output of the flip-flops irrespective of their locations.
  • Each of the flip-flops FF-1 through FF-1,200 is comprised of a flip-flop circuit 107 which includes a bi-stable flip-flop circuit having buffer amplifiers connected to the output thereof as shown in FIG. 7.
  • the bi-stable portion of the circuit shown in FIG. 7 is an Eccles-Jordan type flip-flop that is comprised of transistors 108 and 110 and the associated circuitry connected therebetween.
  • the emitters of transistors 108 and 110 are each connected to ground.
  • the collector of transistor 108 is connected to the base of transistor 110 via a resistor 112 and a capacitor 114 which are connected in parallel.
  • the collector of transistor 108 is also connected to a negative source of voltage (-V) via resistor 116 and to the input of the next stage via line 118.
  • the collector of transistor 110 is connected to the base of transistor 108 via resistor 120 and capacitor 122 which are connected in parallel and to the negative source of voltage (-V) via resistor 124.
  • the collectors of transistors 108 and 110 are also connected to the bases of transistors 126 and 128, respectively, which act as amplifiers to drive the feature extraction masks in the feature extraction masks unit 80.
  • the bases of transistors 108 and 110 are connected to a positive source of voltage (V) via resistors 127 and 129, respectively.
  • The base of transistor 108 is also connected to capacitor 130 which is connected to the output line from the previous transistor stage. That is, the output line 118 of a previous flip-flop 107 is connected to the input of capacitor 130, except that in the case of flip-flop FF-1 the capacitor 130 is connected to input line 104 from the scanner unit 74.
  • the base of transistor 110 is connected to capacitor 132.
  • the capacitor 132 is connected to the line 106 which receives the shift pulses and shifts the contents of the shift register 78 from one stage to the next.
  • the collector of transistor 126 is connected to an output line 134 which is fed to the various feature masks which are associated with a particular stage of the shift register 78.
  • the collector of transistor 128 is connected to output line 136 which is also connected to various feature masks which are associated with that particular stage of shift register 78.
  • the emitters of transistors 126 and 128 are connected via resistors 138 and 140, respectively, to a positive source of voltage (V).
  • the collectors of transistors 126 and 128 are also connected via resistors 142 and 144, respectively, to the negative source of voltage (-V).
  • The flip-flop comprised of transistors 108 and 110 is a bi-stable circuit. That is, either transistor 108 or transistor 110 conducts while the other is cut-off. Assuming transistor 108 is conducting, the transistor 110 is cut-off by the voltage on the collector of transistor 108 which is fed to the resistor divider comprised of resistors 112 and 129 which back biases the emitter-base junction of transistor 110. Similarly, when transistor 110 conducts, the collector voltage of transistor 110 back biases the emitter-base junction of transistor 108 so that it is cut-off. Assuming transistor 110 is conducting, an input pulse to capacitor 132 back biases the emitter-base junction of transistor 110 so that it is cut-off. The change in output voltage on the collector of transistor 110 thereby enables transistor 108 to begin conduction. If, however, transistor 110 were cut-off prior to reception of a pulse to capacitor 132, the transistor 110 would be driven.
  • If transistor 108 is conducting and an input pulse is applied to capacitor 130, the transistor 108 is cut-off and the rise in collector voltage turns on transistor 110.
  • Each of the flip-flops 107 in shift register 78 is driven to the condition where transistor 110 is cut-off. That is, if transistor 108 in one particular stage of the shift register is cut-off, it is caused to conduct by the input pulse on shift line 106. In those stages where the transistor 108 is conducting, the conditions or states remain unchanged.
  • The output from the previous stage is then received by each of the flip-flops 107 and, if a pulse is applied on line 118 from the previous stage indicative of the fact that transistor 108 of the previous stage had been cut-off prior to the shift pulse, the transistor 108 of the next stage is cut-off by the pulse applied to capacitor 130.
  • Appropriate pulse delay means are inserted between the output line 118 of the previous stage and the input to capacitor 130 of the next stage so that the flip-flops 107 which have been changed by a shift pulse have time to be stabilized prior to the reception of the output from the previous stage.
  • the output level on line 118 is indicative that a black portion of the document has been scanned to produce the pulse.
  • conduction of transistor 108 indicates that a white portion of the document has been scanned. It is, of course, to be understood that this may be reversed as the demands of the circuitry require.
  • When transistor 108 conducts and transistor 110 is cut-off in one of the stages of the shift register, the stage is considered to be in a "white" state;
  • when transistor 108 is cut-off and transistor 110 conducts, the stage is considered to be in a "black" state.
  • the output from the collector of transistor 108 also drives transistor 126 which produces an output on line 134 which is inverted and which drives the feature masks associated with a flip-flop stage 107.
  • The output voltage on the collector of transistor 110 is inverted by amplifier 128 and applied via line 136 to the feature masks associated with the stages of the flip-flop of the shift register.
  • The shift register 78 is connected via cable 145 to the feature extraction masks 80.
  • Cable 145 includes the output lines 134 and 136 from each of the 1,200 flip-flops in shift register 78.
  • the lines 134 and 136 are combinatorially applied to the plurality of masks which comprise the feature extraction masks unit 80.
  • Each feature extraction mask is connected to a plurality of flip-flops in shift register 78. The inputs may be either from the line 134 or line 136 of the flip-flops depending on the feature which is sought to be recognized.
  • The recognition gates for those features are connected to line 134 of the flip-flops of the shift register 78.
  • the detection for the absence of a segment may be recognized by sensing the lines 136 of the various flip-flops associated with the feature.
  • Each feature mask includes a plurality of resistors 146, the first end of which is connected to the output lines 134 or 136 of the various flip-flop stages of the shift register 78.
  • the feature mask also includes a threshold gate which is comprised of transistors 148 and 150 and their associated circuitry. It should be understood that various combinations and pluralities of resistors may be used for a feature mask. That is, there need not be four resistors as shown, but in fact, any number from 2 to 60 can be used for a feature mask. However, for the most part, the average feature mask contains from 4 to of such resistors.
  • Resistors 146 may be weighted in value so that certain portions of a feature which are more important are given more value as an input to the base of transistor 148.
  • Transistors 148 and 150 are preferably of the P-N-P type. The emitters of transistors 148 and 150 are both connected to ground via resistor 152.
  • The collector of transistor 148 is connected to a negative source of voltage (-E) via resistor 154 and to the base of transistor 150 via resistor 156.
  • the base of transistor 150 is also connected to a positive source of voltage (E) via resistor 158.
  • the collector of transistor 150 is also connected to the negative source of voltage (-E) via resistor 160.
  • the base of transistor 148 in addition to being connected via resistor 146 to the various outputs of shift register 78, is also connected to a positive source of voltage (E) via resistor 161.
  • the collector of transistor 150 is also connected to a positive source of voltage (E) via resistor 164.
  • The transistors 148 and 150 are so biased by the voltage sources (E) and (-E) that the transistors 148 and 150 do not conduct until a plurality of inputs are applied to resistors 146 which overcome a predetermined threshold.
  • the mask has a threshold which is exceeded by inputs to three of the four resistors.
  • When the threshold is exceeded, the emitter-base junction of transistor 148 is forward biased and the transistor therefore conducts. This enables the conduction of transistor 150 which produces an output signal on line 162.
  • Line 162 is connected to the collector of transistor 150 and the output signal is transmitted to the logic gates which are located in the combination of features to characters unit 82.
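  • Functionally, each feature mask behaves like a weighted threshold gate: the resistors 146 sum their inputs and the gate fires once the sum passes a preset level. The following sketch models that behaviour in software and makes no claim about the electrical details. The function name `feature_mask_fires`, the use of unit weights by default and the numeric threshold are assumptions chosen to reproduce the three-of-four example in the text.

```python
def feature_mask_fires(inputs, weights=None, threshold=3.0):
    """Return True when the weighted sum of active inputs reaches the threshold.

    `inputs` holds one boolean per resistor 146 (True means the connected
    flip-flop output is active). `weights` models resistors of differing
    value; equal weights of 1.0 are assumed when none are given.
    """
    if weights is None:
        weights = [1.0] * len(inputs)
    if len(weights) != len(inputs):
        raise ValueError("one weight is required per input")
    total = sum(weight for weight, active in zip(weights, inputs) if active)
    return total >= threshold


# Four equally weighted inputs with a gate that needs three of them.
print(feature_mask_fires([True, True, True, False]))   # True
print(feature_mask_fires([True, True, False, False]))  # False
```

Giving one input a larger weight lets a more important portion of a feature count for more, in the manner described for the weighted resistors.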
  • the feature extraction mask unit 80 is connected to the combination of features to characters unit 82 by a cable 166 which is comprised of the output lines 162 from each of the threshold gates in the feature extraction masks.
  • In FIG. 9 a character mask is diagrammatically illustrated.
  • the diagram represents each of the feature masks used for the identification of the letter H. That is, the illustration represents the manner in which the shift register 78 is sensed in order to recognize the letter H if it is on a document and is scanned by the cathode ray tube of scanner unit 74.
  • the diagram is comprised of 30 columns of 40 blocks 168.
  • Each block represents the stage of a flip-flop of shift register 78 to which the resistors 146 of the feature extraction masks are connected.
  • The labels C-1 through C-30 for the columns and R-1 through R-40 for the rows correspond to the columns and rows of the shift register 78 as shown in FIG. 6. That is, the block 168 in column C-1 and row R-1 corresponds to flip-flop FF-1 in FIG. 6.
  • If the feature mask for an H required the detection of either a white or black predominance in that particular area of the document, one of the resistors 146 of a feature mask would be connected to the line 134 or 136, respectively, of flip-flop FF-1.
  • Feature extraction masks are used in accordance with the pattern shown in FIG. 9. That is, the letter H is comprised of a plurality of sectors 170, 172, 174, 176, 178, 180, 182, 184, 186 and 188. Each of the sectors of the letter H is five blocks long and three blocks wide and thereby encompasses 15 blocks. This is representative of the fact that each feature mask which is represented by a sector in the letter H in FIG. 9 includes fifteen resistors 146 which are connected to the output lines 134 of 15 stages of shift register 78. The sectors 170 to 176 extend vertically and form the left vertical bar of the letter H. Sectors 178 and 180 extend horizontally and form the central bar of the letter H, and sectors 182 through 188 extend vertically and form the right vertical bar of the letter H.
  • the mask for the letter H further includes a pair of sectors 190 and 192 which are each two blocks square. That is, the feature masks which are represented by each of these sectors includes four resistors 146 which are connected to the output lines 136 of four stages of shift register 78.
  • the sectors 190 and 192 correspond to white areas on a document so that not only does the character H mask require detection of black areas on the document where the letter H would be, but also that there be white areas on the document between the vertical bars and the central horizontal bars of the H.
  • the sector 170 is diagrammatically illustrative of a feature mask as shown in FIG. 8 having fifteen resistors 146.
  • Each of the boxes 168 within sector 170 correspond to a resistor 146 in such a feature extraction mask.
  • the resistor corresponding to the box 168 which is disposed in column C-21 and row R-11 is connected to output line 134 of FF-811.
  • the box 168 which is disposed in both column C-21 and row R-12 indicates that a second resistor 146 of the feature extraction mask is connected to the output line 134 of flip-flop FF-812.
  • each of the sectors 172 through 188 indicate feature extraction masks having fifteen resistors 146 connected to the output lines 134 of various flip-flops throughout the shift register 78 in accordance with the location of the boxes in FIG. 9.
  • the sectors 190 and 192 are each illustrative of feature masks having four input resistors 146.
  • the box 168 which is disposed in both column C-15 and row R-12 indicates that the first resistor 146 in the feature mask corresponding to sector 190 is connected to output line 136 of flip-flop FF-572, which is in column C-15 and row R-12 of the shift register 78.
  • the signals formed by the scanning of the letter H are shifted through shift re-
  • various of the extraction masks be energized. That is, it is not necessary that all of the sectors of the letter H be recognized simultaneously. For example, if either of the sectors 170 or 172 is not recognized, the letter H may still be detected if the other is present. Thus, if the print on the document is sporadic at either portion of the H corresponding to sectors 170 or 172, the letter H can still be recognized. Similarly, as will be seen hereinafter, the absence of the recognition of other of the sectors will not completely prevent recognition of the letter H.
  • the outputs of the feature masks corresponding to sectors 170 through 192 are fed via cable 166 to the combination of features to characters unit 82.
  • combination of features to characters unit 82 includes a plurality of gating circuits, tree circuits or logic circuits to convert the outputs of the various feature masks to characters.
  • appropriate logic circuitry may be used to mechanize the following equation to recognize the letter H:
  • the detection of the character by the combination of features to characters unit 82 provides an output signal on cable 194 which is connected to the input of code generator 84.
  • Code generator 84 converts the input from cable 194 to a binary-coded representation of the character identified or recognized by the unit 82.
  • the output of code generator 84 is connected to the input-output buffer unit 88 via cable 196.
  • Cable 196 includes a plurality of lines which feed the character to the input-output buffer unit in parallel.
  • the input-output buffer unit 88 acts as a multiplexing unit for feeding information into and out of the master control unit 86 on a time sharing basis.
  • Master control unit 86 is preferably a general purpose digital computer which is programmed in accordance with the requirements of I buffer unit 88 provides these coordinates to master control unit 86 via cable 202. Thus, not only the character which is read but the location thereof is stored together therewith in a temporary storage area of the master control unit.
  • the input-output buffer unit 88 also provides instructions to instruction control unit 76 via line 204 which is connected therebetween.
  • the master control unit 86 provides instruction signals for distribution throughout the system via cable 202 which is connected between the input-output buffer unit and the master control unit.
  • the input-output buffer unit 88 is a multiplexing unit which controls traffic between the remainder of the system and the master control unit 86.
  • FIG. 12 diagrammatically illustrates, in the same manner as FIG. 9, the combination of feature extraction masks which comprise the means of detecting and recognizing the editing symbols on a document.
  • the vertical bar 22 of the editing symbol 20 is comprised of 15 sectors M1 through M15. Each of the sectors is five blocks long by one block wide. Each of the sectors M1 through M15 is vertically elongated and is positioned substantially at the center of the raster.
  • the top left horizontal bar 24 is comprised of sectors T1, T2 and T3.
  • the top right horizontal bar 26 is comprised of sectors T4, T5 and T6.
  • the left central horizontal bar 28 is comprised of sectors C1, C2 and C3.
  • the right central horizontal bar 30 is comprised of sectors C4, C5 and C6.
  • the bottom left horizontal bar 32 is comprised of sectors L1, L2 and L3.
  • the bottom right horizontal bar 34 is comprised of sectors L4, L5 and L6.
  • Each of the sectors T1 through L6 which comprise the horizontal bars 24 through 34, is horizontally elongated and is five blocks long by one block wide.
  • Each of sectors M1 through M15 represents a feature mask having five input resistors 146 which are connected to the output lines 134 of the stages of shift register 78 in accordance with the location of the boxes in FIG. 12.
  • Each of the sectors T1 through T6, C1 through C6 and L1 through L6 that form the horizontal bars of the editing symbol 20 is illustrative of a pair of feature extraction masks each having five resistors 146 connected to the base of transistor 148.
  • the resistors 146 of the first of each of the feature masks associated with these sectors of the symbol 20 are connected to the output lines 134 of the various stages of the shift register 78 with which they are associated.
  • the resistors 146 of the second of the feature masks associated with these sectors are connected to the output lines 136 of the associated stages of the shift register 78. Therefore, there is a feature mask to detect both the presence or absence of any of the sectors that comprise the horizontal and vertical bars which form the editing symbol 20.
  • the feature masks associated with the black sides or outputs 134 of the associated stages of the flip-flop produce an output to indicate the presence of a sector when the area of the sector on the document is predominantly black.
  • the output on line 162 of the feature mask is labeled for use in the equations, infra, in accordance with the sector detected.
  • the feature mask for sector T1 detects a black sector, the presence of the sector in the Boolean equation is labeled "T1".
  • the recognition masks which are connected to output lines 136 indicate the absence of a particular symbol or a predominantly white area on the document.
  • the output signal produced by the feature mask is labeled $\overline{T1}$.
  • the signals produced in the feature extraction masks unit 80 by the feature extraction masks which are used to determine the presence of a black sector are labeled by the sector which they represent, whereas the feature masks which detect the absence of a black area or the presence of a white area, emit a signal indicative thereof which is labeled by the sector which they represent with a bar above it. This terminology is used throughout the equations set forth, infra.
  • a horizontally extending sector TZ which is 13 blocks long and one block wide and is thus coextensive with the top of the editing symbol 20.
  • the sector TZ is also spaced one block above the editing symbol 20.
  • a horizontally elongated sector BZ which is also 13 blocks long by one block wide.
  • the sectors TZ and BZ are each associated with a feature extraction mask having 13 resistors 146 connected to the base of transistor 148. Each of the resistors is connected to the output line 136 of the associated stages of shift register 78.
  • the sector TZ extends from column C-10 to C-22 on row R-6 and thus the resistors 146 are connected to stages FF-366, FF-406, FF-446, FF-486, ... and FF-846.
  • the resistors 146 of the feature extraction mask associated with sector BZ are connected to the flip-flops FF-394, FF-434, ... and FF-874.
  • a sector LZ which is three blocks long and two blocks wide.
  • Another sector RZ is provided between horizontal bars 30 and 34 and vertical bar 22 which is also three blocks long by two blocks wide.
  • the sectors LZ and RZ are each associated with a feature mask having six resistors 146 which are connected to the lines 136 of the associated stages of shift register 78.
  • the sectors TZ, BZ, LZ and RZ, as will be seen hereinafter, insure that the signals emitted by scanning an editing symbol which are shifted through the shift register 78 are in the proper position to enable an accurate character identification.
  • the feature extraction masks may have weighted resistors for the characters. In the feature extraction masks used for the sectors of the horizontal and vertical bars of the editing symbol 20, the resistors 146 are substantially equal in resistance.
  • the threshold gate associated with each of the sectors is properly biased so that it may be operated by the receipt of three bits out of five from the shift register. That is, if the cathode ray tube scans a black area in any three of the five positions within a sector, the threshold gate of the feature mask is operated to produce a signal on line 162 of the threshold gate to indicate that a sector is present.
  • the threshold gate associated with the sector from being operated when a sector is present on the document.
  • TZ, LZ, RZ and BZ comprise a registration mask.
  • the feature masks associated with these sectors aid in the prevention of an inaccurate identification of a character in the editing symbol masks.
  • the feature masks associated with sectors TZ and BZ of the mask are set so that the thirteen resistors 146 are similar in weight and the circuit is operated upon receipt of ten or more signals from lines 136 of the 13 stages of the shift register 78 to which they are connected. That is, if the cathode ray tube scans ten white areas out of the thirteen areas of the sector, the feature mask is energized.
  • the feature masks associated with sectors LZ and RZ are set so that the presence of four or more white spots during the scan of the sector by the cathode ray tube energizes the threshold gate of the feature mask.
  • the feature masks of sectors LZ and RZ when energized indicate that the color of these areas on the document is predominantly white.
  • This condition for any of the registration feature masks is represented in the following equations by a bar (i.e., $\overline{TZ}$) over the top of the sector which has been scanned.
  • the use of a bar over the top of the sector label indicates that the feature mask associated therewith, which is connected to the lines 136 or the "white" sides of the flip-flop stages of shift register 78, has been energized due to the absence of the sector.
  • the bars 24 through 34 of the editing symbol 20 are recognized as present by the logic circuitry upon the recognition of any one of the three sectors in each of the horizontal bars. That is, the top left segment 24 (hereinafter referred to as TL) is recognized if the feature mask associated with either T1, T2 or T3 is energized. Similarly, the horizontal bars 26, 28, 30, 32 and 34 (hereinafter referred to as TR, CL, CR, LL, and LR, respectively) are recognized as present by the recognition of one or more of the sectors by their associated feature mask.
  • the vertical bar 22 is formed of five groups each including three sectors. Thus, five vertical portions of the bar are sensed for these portions of the bars and are hereinafter referred to as V1, V2, V3, V4 and V5. V1 is considered to be present if the recognition mask associated with either M1, M2 or M3 is energized. Similarly, the remaining vertical portions are recognized upon recognition of one or more of the vertical sectors comprising the portion. These portions are thus detected by logic circuitry in unit 82 which is mechanized in accordance with the following Boolean equations:
  • V the vertical bar 22
  • the registration mask should produce the registration signal R in accordance with the following Boolean equation:
  • Code generator 84 converts the signal from cable 194 to a binary-coded signal and feeds these signals via cable 196 to the input-output buffer unit 88.
  • the instruction control 76 provides via lines 198 and 200 the binary-coded signals representing the location at which the editing symbol is located on the document.
  • the input-output buffer unit 88 transmits both the representation of the symbol and the location (x and y coordinates) thereof to the master control unit 86 for storage therein.
  • the master control unit 86 also supplies instruction signals to the instruction control 76 via line 204 and the input-output buffer unit 88 so that the document scanning equipment may be controlled for location of scan as well as the size of the scan.
  • the size of the scan may also be varied where the editing symbol detected does not fall within specific size limits. Thus, if the symbol is written too small, the raster produced by the cathode ray tube is reduced. Similarly, if the editing symbol is too large, the raster is increased in size.
  • the document to be scanned is placed into the cylindrical rotating platen of the document handling unit 72.
  • the document handling unit emits a signal over line 206 to the scanner unit 74 to indicate that the document is in place. If the document is not in place, the feeding apparatus of the document handling unit 72 is operated until the document is properly disposed.
  • the cathode ray tube of the scanner unit begins to search for the first line of the document so that it can begin to optically scan the characters throughout the document.
  • the cathode ray tube scans in pattern to locate the first line of typewritten or printed information on the document. Until the first line is found, the cathode ray tube continues to scan in pattern.
  • the horizontal and vertical position of the cathode ray tube beam is transmitted to the master control via the instruction control 76 and the input-output buffer unit 88.
  • the control unit 86 instructs the scanner unit to start scanning in a character pattern at the given horizontal and vertical location (hereinafter referred to as the x/y coordinate).
  • the scanner unit then begins a character scan at the x/y coordinate. If the scanner does not recognize video, that is, when a character is not present at the first location, the character scan is moved further along the line by the instruction control 76.
  • the x/y coordinate is transmitted to the master control unit 86 via the instruction control unit 76 and the input-output buffer unit 88.
  • the character at that position is scanned by the cathode ray tube and the output of the photomultiplier tube is fed via line 104 to the shift register 78. If a character is identified and recognized by the units 80 and 82, the character and the x/y coordinate of the character are stored in the master control unit.
  • the instruction control 76 controls the scanner unit so that the scanner unit continues the character scans along the line until the end of the line. At the end of the line, the scanner is instructed by the instruction control 76 to scan in an editing scan at the x/y coordinate below the previous line. The scanner continues in an editing mode until a video interrupt. That is, if there is recognition that a character does exist on the editing line, then the x/y coordinate is fed to the master control unit and the scanner unit begins a character scan to provide the shift register with the output signals from the photomultiplier in scanner unit 74 for determination of the editing symbol located on the line. The recognition equipment thus sends the binary-coded representation of the symbol to the master control unit for the storage with the x/y coordinate thereof.
  • instruction control 76 instructs the scanner unit to continue scanning between the lines of textual material until the end thereof whereupon the instruction control instructs the scanner to index to the next line of textual material. The process is then repeated by the scanner unit 74 at the next line of textual material and the portion of the document underneath the line for detection of instructions. This process is repeated until the end of the document whereupon an end of document signal is generated in the master control unit 86 and the document handling unit 72 is instructed to put the next document in place for optical recognition.
  • the master control unit 86 is a general purpose digital computer.
  • the information concerning the characters on the lines of textual material and the instruction symbols is stored in temporary storage areas of memory.
  • Line merges are initiated by the program in the master control and the editing operation is accomplished.
  • the editing operation is diagrammatically illustrated in FIG. 11, which is a flow chart of the information in the computer for performing the merge.
  • the x/y coordinate of the first character in the first line is fetched from the temporary memory.
  • the x/y coordinate of the first editing character found is also fetched from the temporary storage associated therewith.
  • the coordinates of the textual character and the editing character are compared. If the coordinates do not compare, that is, it is determined that the x/y coordinate of the editing symbol is not adjacent to the x/y coordinate of the textual character, then the textual character is not in error and is not changed. Then the x/y coordinate of the next character is fetched and the coordinates of the edited character and the textual character are compared in the same manner that the coordinates were compared in the previous comparison.
  • the editing character is fetched and the editing operation indicated by the character is performed.
  • the results of the editing operation are stored in the final storage area along with the storage of the previous textual characters.
  • the final operations on the stored data are then performed in accordance with the instructions which are indicated by the editing symbols representative of the graphic arts instructions. Thus, the computer organizes the proper number of letters for a line and the width of the final columns that are used in the reproduction of the textual material before the textual material is read out of the computer.
  • the invention enables the editing of printed or typed textual material for direct insertion into a character recognition system.
  • the need for retyping or reprinting the entire sheet in perfect form is thus obviated.
  • the method of editing is no more time consuming than other forms of editing and the symbols used are easy to write while being machine recognizable.
  • the edited document is then ready to be placed directly in the character recognition system which can read the textual material as well as incorporate the alterations.
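
The sensing scheme described in the list above can be sketched in a few lines. The sketch is illustrative only: it assumes the 30-column by 40-row register is numbered column by column (which agrees with the flip-flop numbers FF-1, FF-572 and FF-811 quoted above), it models each feature mask as a plain count-and-threshold test rather than a resistor network, and the names `ff_number`, `mask_fires` and `bar_present` are hypothetical.

```python
ROWS_PER_COLUMN = 40  # the diagram has 30 columns of 40 blocks


def ff_number(column: int, row: int) -> int:
    """Flip-flop number for block (C-column, R-row), assuming column-major numbering."""
    return (column - 1) * ROWS_PER_COLUMN + row


def mask_fires(bits, threshold):
    """Model of a threshold gate: fires when at least `threshold` inputs are 1."""
    return sum(bits) >= threshold


def bar_present(sector_bits, threshold=3):
    """A bar is present if any one of its sectors fires (3 of 5 bits by default)."""
    return any(mask_fires(bits, threshold) for bits in sector_bits)


# Checks against the flip-flop numbers quoted in the text.
print(ff_number(1, 1), ff_number(15, 12), ff_number(21, 11))  # 1 572 811

# Sector TZ: row R-6, columns C-10 through C-22 (13 stages).
tz_stages = [ff_number(c, 6) for c in range(10, 23)]
print(tz_stages[0], tz_stages[-1], len(tz_stages))  # 366 846 13

# Top-left bar TL with spotty print: only sector T2 reaches 3 black bits of 5.
t1, t2, t3 = [1, 0, 0, 0, 1], [1, 1, 0, 1, 0], [0, 0, 0, 0, 0]
print(bar_present([t1, t2, t3]))  # True
```

The same `mask_fires` test covers the registration sectors by changing the threshold, for example 10 of 13 for TZ and BZ or 4 of 6 for LZ and RZ, applied to the white-side outputs.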

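The merge of textual characters with editing symbols, as outlined in the list above, can be sketched roughly as follows. Everything specific here is an assumption made for illustration: the record layout, the `LINE_PITCH` and `X_TOLERANCE` values used to decide that a symbol is "adjacent" to a character, and the two editing operations (`CAPS`, `DELETE`) chosen as examples. The patent text does not specify these details.

```python
from dataclasses import dataclass

LINE_PITCH = 10   # hypothetical distance from a text line down to its editing line
X_TOLERANCE = 2   # hypothetical horizontal slack when comparing coordinates


@dataclass
class Glyph:
    x: int
    y: int
    value: str  # a text character, or the name of an editing operation


def is_adjacent(char: Glyph, symbol: Glyph) -> bool:
    """True if the editing symbol sits directly underneath the text character."""
    return abs(char.x - symbol.x) <= X_TOLERANCE and symbol.y - char.y == LINE_PITCH


def merge(text, edits):
    """Walk the text in order and apply any editing symbol found under a character."""
    final = []
    for char in text:
        symbol = next((e for e in edits if is_adjacent(char, e)), None)
        op = symbol.value if symbol else None
        if op == "DELETE":
            continue  # drop the character
        # Capitalize on CAPS; anything else leaves the character unchanged.
        final.append(char.value.upper() if op == "CAPS" else char.value)
    return "".join(final)


# A mistyped word with a CAPS symbol under the first letter and a
# DELETE symbol under the doubled letter.
text = [Glyph(5 * i, 0, ch) for i, ch in enumerate("vietnnam")]
edits = [Glyph(0, 10, "CAPS"), Glyph(26, 10, "DELETE")]
print(merge(text, edits))  # Vietnam
```
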
Abstract

This invention relates to character recognition and more particularly to a method of editing a document prior to optical scanning thereof in a character recognition system.

Description

United States Patent 3,709,525
Frank
Jan. 9, 1973
[54] CHARACTER RECOGNITION
[75] Inventor: Alan I. Frank, Philadelphia, Pa.
[73] Assignee: Scan-Data Corp., Philadelphia, Pa.
[22] Filed: Nov. ..., 19...
[21] Appl. No.: 871,550
Related U.S. Application Data
[63] Continuation of Ser. No. 544,202, April 21, 1966, abandoned.
[52] U.S. Cl.: 283/1, 235/61.12 N
[51] Int. Cl.: G06k 19/00
[58] Field of Search: 283/1, 17; 340/146.3
[56] References Cited, UNITED STATES PATENTS
1,021,189, 3/1912, Hill, 283/17
1,267,640, 5/1910, Eggleston, 283/17
D198,191, 5/1964, Silsby, 283/9 X
Primary Examiner: Lawrence Charles
Attorney: Caesar, Rivise, Bernstein and Cohen
1 Claim, 12 Drawing Figures
[Drawing sheets 2 and 3 of 9: the sample sheet of textual material shown with handwritten editing marks. Drawing sheet 6 of 9: recognition circuit with inputs from single bits of the shift register.]
CHARACTER RECOGNITION
This application is a continuation of application Ser. No. 544,202, filed Apr. 21, 1966, now abandoned.
The use of character recognition techniques to recognize printed copy and read it into computers is becoming more and more commonplace. Optical scanning as well as other character recognition systems, however, have been fairly limited to the use of printed data. The reason is that handwriting varies greatly from one person to the next. Thus, where data is not in a printed or typewritten form, it is necessary to reproduce the data in such a form so that it may be recognized by character recognition equipment for insertion into the memory of a large scale computer or be used by printing machinery, etc. Similarly, where mistakes appear in printed data, it is necessary that the data be retyped in perfect form in order that the recognition equipment can receive the altered data. Thus, an entire sheet of data may be perfectly usable with the exception of a single line, yet the entire sheet of printed data must be retyped or printed to incorporate the amendment or deletion to the line.
It is, therefore, an object of this invention to provide a new and improved editing technique which enables the edited copy to be directly read into a machine by optical scanning techniques.
It is another object of this invention to provide a new and improved editing technique which utilizes an easily recognizable editing code.
Another object of the invention is to provide a font of editing symbols which enable a sheet of textual material to be corrected or altered without requiring manual reproduction of the material for conversion by a character recognition system into machine language.
Another object of this invention is to provide a new and improved method of altering textual material for reading the altered material directly by an optical scanning device.
Another object of this invention is to provide a new and improved method of reading altered textual material into a machine.
It is another object of the invention to provide a new and improved character recognition system which may read printed textual material having alterations and modifications handwritten therein.
These and other objects of the present invention are achieved by providing a font of editing symbols, each of said symbols being comprised of a portion of a symbol comprising a vertically extending upright bar, a pair of horizontally extending top bars which extend to opposite sides of said upright bar from the top thereof, a pair of horizontally extending center bars which extend to opposite sides of said upright bar from the center thereof, and a pair of horizontally extending bottom bars which extend to opposite sides of said upright bar from the bottom thereof, whereby each of said editing symbols is comprised of said upright bar and a combination of the presence and absence of said horizontal bars.
In accordance with the invention, a font of editing symbols is provided which are easily recognizable by a character recognition system though handwritten. The symbols are comprised of a vertical bar and the combinatorial presence and absence of six horizontal bars which extend from the vertical upright bar. These editing symbols are used in conjunction with textual material by insertion of a proper one of the symbols underneath the portion of textual material which is in error. After the appropriate symbols have been inserted throughout to amend or alter the textual material, the page of textual material and handwritten symbols may then be read by a character recognition system which will automatically edit the textual material in accordance with the editing symbols.
Other objects and many of the attendant advantages of this invention will be readily appreciated as the same becomes better understood by reference to the following detailed description when considered in connection with the accompanying drawings wherein:
FIG. 1 is an enlarged plan view of the basic editing symbol from which the font of editing symbols is comprised;
FIG. 2 is a font of editing symbols comprised of the editing symbol shown in FIG. 1;
FIG. 3 is a plan view of a sheet of textual material as edited by a standard editing technique;
FIG. 4 is a plan view of a sheet of the same textual material as that shown in FIG. 3 as edited in accordance with the invention;
FIG. 5 is a schematic block diagram of an optical scanning system embodying the invention;
FIG. 6 is a schematic block diagram of the shift register used in the system;
FIG. 7 is a schematic diagram of the flip-flop circuitry used throughout the shift register;
FIG. 8 is a schematic diagram of a recognition circuit used in the Feature Extraction Mask unit;
FIG. 9 is a pictorial diagram illustrative of the operation of the recognition circuits;
FIG. 10 is a schematic block diagram of the flow of data throughout the system;
FIG. 11 is a schematic block diagram of the flow of data within a computer after a document has been scanned; and
FIG. 12 is a pictorial diagram illustrative of the recognition circuits for an editing symbol.
Referring now in greater detail to the various figures of the drawings wherein similar reference characters refer to similar parts, the editing symbol embodying the present invention is generally shown at 20 in FIG. 1. The editing symbol 20 is basically comprised of a vertical upright bar 22 which extends from the bottom to the top of the editing symbol. The symbol also includes six horizontal bars 24, 26, 28, 30, 32 and 34. The first pair of horizontal bars 24 and 26 extend laterally from the top of upright bar 22 to the left and right sides, respectively. The pair of bars 28 and 30 extend laterally from the center of upright bar 22 to the left and right, respectively. Finally, the pair of bars 32 and 34 extend from the bottom end of upright bar 22 to the left and right sides, respectively.
The editing symbol 20 is the basic structure for a font of handwritten symbols which are formed as a combination of the upright bar 22 and a combination of the presence and absence of bars 24 to 34. This font of symbols is shown in FIG. 2. As can be seen, there are 64 symbols which can be comprised of the editing symbol shown in FIG. 1. That is, the number of combinations which can be derived from the presence and absence of six bars is 2^6, or 64. As will be seen hereinafter, the provision of an editing symbol having only a single vertical bar and a plurality of horizontal bars facilitates the easy recognition of the symbol.
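The count of 64 follows from treating each of the six horizontal bars as an independent present/absent choice. As a quick check, the sketch below enumerates every combination; the enumeration order is arbitrary and is not meant to match the layout of FIG. 2.

```python
from itertools import product

BARS = (24, 26, 28, 30, 32, 34)  # reference numerals of the six horizontal bars

# Each symbol is the upright bar 22 plus one present/absent choice per horizontal bar.
font = [
    tuple(bar for bar, present in zip(BARS, flags) if present)
    for flags in product((False, True), repeat=len(BARS))
]

print(len(font))          # 64
print(font[0], font[-1])  # () (24, 26, 28, 30, 32, 34)
```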
Each of the symbols in FIG. 2 may be used to represent either an editing instruction, an alphabetic insertion or a linecasting instruction. Thus, the editing symbol 36, which is comprised of vertical bar 22 and horizontal bars 24, 26, 32 and 34 and which is shown in the third column from the right and third row from the bottom in FIG. 2, may be used as an editing instruction to indicate a capital letter is required rather than a lower case. That is, where textual material is printed with a mistake such as not capitalizing the first letter of a proper noun, the editing symbol 36 may be written underneath the first letter of the word which needs capitalization. In this manner, when the textual material is inserted in the character recognition system of the invention, the editing symbol 36 instructs the system to alter the textual material in accordance with the instruction. Thus, rather than the machine printing a lower case letter, a capital letter is printed instead.
In standard systems for editing textual material, it has been necessary to edit all of the material and then retype or print the material incorporating the revisions prior to having the material read by character recognition equipment. Thus, in the following example, it can be seen that by use of the novel system of this invention, no extra work is required in editing textual material, yet the textual material may be read directly into a character recognition system without requiring a perfect sheet of copy.
In the example hereinafter cited comparing the present system to that presently used by the Government Printing Office, it can be seen that the manner in which the editing of textual material may be accomplished is fairly similar. The following is a chart of some of the symbols used by the Government Printing Office and those symbols embodying the invention which can be used to perform the same function:
SYMBOLS FOR EDITING AND LINECASTING
Columns: GPO Symbols; Editing Symbols; Function.
Functions: Caps, Deletion Only, Start of Deletion, Start of Insertion, End of Insertion, Insert Space, Transpose, Lower Case.
The following are examples of alphabetic insertions and graphic arts instructions which may be inserted with the editing symbols embodying the invention:
ALPHABETIC INSERTIONS
[Symbol drawings; legible labels: point, 11 pica line.]
It should be understood that the editing symbols shown above are exemplary only and other symbols may be used for the same functions and the editing symbols shown may be used for other functions. The same symbol may be used not only for an editing instruction, but also for an alphabetic insertion or linecasting instruction. That is, there are only 64 editing symbols which may be made from the vertical bar 22 and horizontal bars 24 through 34 of the basic editing symbol 20. Thus, if the total number of instructions, alphabetic insertions and linecasting instructions is greater than 64, it is necessary to use the editing symbols in more than one manner.
The manner in which the symbol is used can be determined by either its location or the adjacent editing symbols. For instance, an alphabetic insertion is always used after an editing instruction symbol. Therefore, the editing instruction symbol enables the computer to determine that the following symbol thereafter is an alphabetic insertion. It should also be understood that editing symbols may be used not only for alphabetic and graphic arts insertions, but numerical insertions and other symbols as well.
Also, the sensing of the linecasting instructions in a position other than within the text as will be seen hereinafter enables determination by the computer that the symbol is specifically to be used as a graphic arts instruction as opposed to an alteration of the textual material. The use of the editing symbols will be more clearly seen in conjunction with the example hereinafter shown.
In FIG. 3 and FIG. 4, there is shown a sheet 38 and 40, respectively, of textual material which has been edited by the use of Government Printing Office (hereinafter abbreviated to GPO) symbols and by use of the editing symbols of the invention, respectively. The textual material should read as follows:
With growing urgency, U.S. planners are grappling with a momentous international problem: An increasingly hungry world is turning more and more to this bountiful country for needed food; but the US, with its surpluses already shrinking, will be unable to fill the food gap that looms ahead.
To fend off the specter of starvation, the planners are pondering various combinations of American help and foreign self-help. The choices finally made will hinge in part on how much room is left for welfare programs as Vietnam war spending rises.
Thus, in FIG. 3, the GPO symbols are interspersed throughout the textual material in order to alter and amend the errors. In the upper left-hand corner of sheet 38, the notation "VB 20X11" is shown. This notation in GPO editing instructions indicates the following instructions to those in the graphic arts, such as linecasters:
Print the textual material in Vogue Bold with a 20 point and 11 pica line.
In FIG. 4, the same instruction is indicated in the top left-hand corner by editing symbols 42, 44 and 46. Symbol 42 is comprised of the vertical upright bar 22 and the presence of horizontal bars 24, 26, 30 and 32 and the absence of the remaining bars (28 and 34). Editing symbol 44 is comprised of the vertical upright bar 22 and the presence of horizontal bars 26, 28 and 34. Editing symbol 46 is comprised of the vertical upright bar 22 and the presence of horizontal bars 26, 28, 30 and 32.
Editing symbol 42, as previously mentioned, indicates a linecasting instruction of Vogue Bold. The editing symbol 44 is the linecasting instruction for 20 point and editing symbol 46 is the linecasting instruction for 11 pica line.
Thus, it can be seen that the editing symbols of the invention may be used similarly to GPO abbreviations to indicate the manner in which the textual material will be printed.
Referring to FIG. 3, the first line of textual material in sheet 38 is in error in that the small "s" in the abbreviation "U.S." should be a capital "S". This mistake is indicated by the GPO symbol of three parallel lines handwritten below the letter in error, which indicates that a capital letter "S" should replace the lower case letter "s".
As was previously indicated, the symbol 36 may be placed underneath the small "s" on the first line of sheet 40 in FIG. 4 to indicate that a capital S should be inserted therefor.
An error appears on the second line on sheets 38 and 40 in that the letter "z" should be an "o". The GPO method of editing such an error would be to pencil the GPO deletion symbol "/" through the letter "z" and to insert after the deleted letter the GPO start of insertion symbol "^". The letter "o" is then placed above the GPO symbol "^".
On the second line of sheet 40, the "z" is corrected to an "o" by inserting the editing symbol 48 underneath the "z". Editing symbol 48 indicates that the letter above it should be deleted. The editing symbol 48 is followed by editing symbols 50 and 52. Editing symbol 50 indicates that the letter "o" should be inserted and editing symbol 52 indicates that there are no further letters to be inserted in place of the deleted "z".
On the third line of sheets 38 and 40, the word "worlds" should be "world". The GPO symbol indicating a deletion is written through the "s" to indicate that the word should be "world".
On sheet 40, the editing symbol 54, which indicates that a deletion only should be made, is inserted under the "s" in "worlds", thus indicating the deletion thereof.
The fourth line of sheets 38 and 40 is in error in that "needed" and "food" should be separated. This is indicated by the GPO symbol "#" for inserting a space.
On sheet 40, the editing symbol 56 is inserted underneath the second "d" and the "f" in "neededfood" to indicate that a space should be inserted between the letter "d" and the letter "f".
The next line of the textual material on both sheets 38 and 40 does not contain any errors and therefore no editing symbol is necessary in either system.
On the next line, the word "ahead" is in error in that the "a" and the "e" are not in the proper order. On sheet 38, the GPO symbol "tr" for transposing is placed underneath the "ae".
On sheet 40, editing symbol 58 is inserted underneath the "ae" to indicate that a transposition of the "a" and the "e" is necessary.
On the next line, which is the first line of the second paragraph, there is an error in that the word "of" should be inserted between "specter" and "starvation" and the first letter "P" in the word "Planners" should be lower case. The first error on the line is corrected on sheet 38 by the insertion of the GPO symbol "^" for the start of an insertion between the words "specter" and "starvation" and the insertion of the word "of" above the symbol.
On sheet 40, the symbol 60 is placed below the space between the words "specter" and "starvation" to indicate the start of an insertion, and the symbols 62, 64 and 58 follow to indicate that the letters "o" and "f" should be inserted between "specter" and "starvation". On sheet 38, the GPO symbol "lc" to indicate lower case is inserted above the "P" in "Planners" to correct the case error.
On sheet 40, the correction of the capital "P" to a lower case "p" is indicated by editing symbol 66, which is placed beneath the "P" and indicates that a lower case "p" should be substituted therefor.
The next three lines do not contain any errors and are therefore not edited. However, on the last line of sheets 38 and 40, the words "was speeding" should read "war spending". On sheet 38, the line is corrected by writing the GPO deletion symbol through the "s" and through the "e" and inserting the GPO symbol "^" for the start of an insertion after each of these deletion symbols. The letters "r" and "n" are then inserted over the first and second start of insertion symbols, respectively, to indicate that they replace the "s" and the "e", respectively.
On the last line of sheet 40, the editing symbol 48 is inserted underneath the letter "s" in the word "was" and underneath the letter "e" in the word "speeding". The first symbol 48 is followed by editing symbols 68 and 52. The second editing symbol 48 is followed by editing symbols 70 and 52. These symbols are inserted after the symbol 48, which indicates the start of a deletion, to indicate that the "s" and the "e" are to be replaced, respectively, by an "r" and an "n".
It can thus be seen from the description of the editing of sheets 38 and 40 that the manner of editing textual material by use of the editing symbols of the invention is very similar to the use of the editing symbols in a conventional system such as that used by the Government Printing Office.
The symbols are easy to write and are very flexible. Thus, the symbols may be used not only for instructions for altering or deleting, but for use to indicate alphabetic insertions, linecasting instructions as well as other insertions or instructions.
The schematic block diagram of a system which may be used to optically scan the edited sheet 40 is shown in FIG. 5. The system includes a document handling unit 72, a scanner unit 74, an instruction control unit 76, a cross-correlation unit comprising a shift register 78, feature extraction masks 80 and logic circuitry comprised of the combination of features to characters circuits 82 and the code generator 84, a master control unit 86 and an input-output buffer unit 88.
The document handling unit 72 basically comprises a rotating cylindrical platen and a document input unit which feeds the incoming documents to the platen. The rotating platen supports the documents and is adjacent the scanner unit 74. Scanner unit 74 is a flying spot scanner and basically comprises a cathode ray tube, a photomultiplier tube and a pulse shaping circuit. The cathode ray tube supplies a raster of light which is directed at the document which is presently in position for being read on the rotating platen of the document handling unit 72. The size of the raster and the location thereof are determined by the inputs on lines 100 and 102. Lines 100 and 102 are connected between the scanner unit 74 and the instruction control unit 76.
The lines 100 and 102 actually indicate a plurality of lines as indicated by their thickness. Throughout FIG. 5 those lines which are heavy indicate that the line is actually a cable having a plurality of input or output lines in multiple. The horizontal positioning of the cathode ray tube raster is determined by the inputs on lines 100. The horizontal size of the raster is also controlled by the instructions fed to the scanner unit on lines 100.
The size and location of the vertical position of the raster in the cathode ray tube of scanner unit 74 is determined by the inputs on lines 102 from the instruction control unit 76. The horizontal and vertical locations of the raster are also fed back via lines 100 and 102 to the instruction control unit. The locations are in turn fed to the master control unit via input-output buffer unit 88 so that the horizontal and vertical position or coordinates of a character are stored with the character when it is recognized, as will hereinafter be seen.
The cathode ray tube in the scanner unit 74 forms the output of the flying spot scanner system, which emits a beam of light which is directed to the document being read on the document handling unit 72. The beam is appropriately directed by a lens system between the cathode ray tube and the document. The beam is scanned in a raster which is slightly larger than the largest character which is to be scanned. In the present embodiment, the preferred raster includes thirty vertical scans. The photomultiplier tube in the scanner unit 74 is connected to a pulse shaper which samples the output from the photomultiplier tube at predetermined intervals. That is, as the cathode ray tube emits a beam of light in a vertical column along the surface of a document, the photomultiplier tube emits a signal in accordance with the reflection of the beam of light on the surface of the document. Thus, if the beam is reflected off a white area of the document, the output of the photomultiplier tube is at one level, whereas the location of the beam from the cathode ray tube on a black surface of the document, such as a character, produces a different signal level output from the photomultiplier. The pulse shaper samples the photomultiplier output at discrete intervals so that pulses are produced indicative of either a white surface or a black surface as the cathode ray tube beam scans the surface of the document.
In the preferred embodiment, the pulse shaper samples the output of the photomultiplier tube forty times in each of the columns. Thus, for each raster of illumination that the cathode ray tube produces onto the surface of a document, the pulse shaper will produce 1,200 (30 columns X 40 samples per column) discrete outputs. The pulse shaper also includes appropriate gating so that unless a certain threshold of illumination is reflected to the photomultiplier, the output indicates that a black area has been scanned. In this manner, a digital output is produced. The output of the pulse shaper is therefore either one of two levels: the first level indicating that the area scanned is predominantly black at the sampled location and a second level indicating the sampled location is predominantly white.
Thus, if the beam from the cathode ray tube scans a surface which is partially black and partially white at the time that the photomultiplier output is sampled, then the threshold circuitry within the pulse shaper enables the generation of a discrete digital output of either one level or another. The output from the pulse shaper is fed via line 104 of scanner unit 74 to the shift register 78. Shift register 78 is capable of storing 1,200 bits. That is, the output from the scanner unit 74 for a complete character scan on a document in the document handling unit 72 may be stored in the shift register.
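The thresholding performed by the pulse shaper can be sketched in software as follows. This is only an illustrative model, not the circuit itself: the function name `shape_pulses`, the reflectance scale of 0.0 to 1.0 and the default threshold of 0.5 are assumptions, as the text gives no numeric threshold.

```python
COLUMNS = 30             # vertical scans per raster
SAMPLES_PER_COLUMN = 40  # samples taken along each vertical scan


def shape_pulses(reflectance, threshold=0.5):
    """Convert sampled reflectance values into 1,200 bits in scan order.

    reflectance: 30 columns, each a sequence of 40 values between 0.0
    (no light reflected) and 1.0 (fully reflected); hypothetical units.
    Returns a flat list where 1 means black and 0 means white. A sample
    counts as white only if the reflected light reaches the threshold.
    """
    if len(reflectance) != COLUMNS:
        raise ValueError("expected 30 columns")
    if any(len(column) != SAMPLES_PER_COLUMN for column in reflectance):
        raise ValueError("expected 40 samples per column")
    return [
        0 if sample >= threshold else 1
        for column in reflectance
        for sample in column
    ]
```

The bits come out column by column, which is the order in which they would be presented serially to the shift register on line 104.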
The shift register includes 1,200 flip-flops which are serially connected as shown in FIG. 6. The flip-flops are shown in 30 vertical columns labeled C-1, C-2 through C-30 and 40 horizontal rows labeled R-1, R-2, R-3 through R-40 in accordance with the location of the samples in the scanning raster. The first column C-1 is comprised of flip-flops FF-1, FF-2, FF-3 through FF-40. These flip-flops are serially connected. That is, the output of FF-1 is connected to the input of FF-2, the output of FF-2 is connected to the input of FF-3, and so on until the output of FF-39 is connected to the input of FF-40. The output of flip-flop FF-40 is connected to the input of FF-41, which is located at the top of the second column C-2. FF-41 through FF-80 comprise the second column and are similarly serially connected. The output of FF-80 is connected to the input of FF-81 and so on through to the 30th column C-30. There are, thus, 30 columns of 40 flip-flops. Each of the forty flip-flops in a column corresponds to one of the points along a vertical column of a raster at which the output of the photomultiplier tube in the scanner unit 74 is sampled by the pulse shaper unit.
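The numbering just described implies a simple relation between a flip-flop's column and row and its FF number. The sketch below expresses that relation; the function names are hypothetical and chosen only for illustration.

```python
ROWS_PER_COLUMN = 40
COLUMNS = 30


def flip_flop_number(column, row):
    """Return n for flip-flop FF-n located at column C-column, row R-row."""
    if not (1 <= column <= COLUMNS and 1 <= row <= ROWS_PER_COLUMN):
        raise ValueError("column must be 1..30 and row must be 1..40")
    return (column - 1) * ROWS_PER_COLUMN + row


def column_and_row(number):
    """Inverse mapping: return (column, row) for flip-flop FF-number."""
    if not 1 <= number <= COLUMNS * ROWS_PER_COLUMN:
        raise ValueError("flip-flop number must be 1..1200")
    column_index, row_index = divmod(number - 1, ROWS_PER_COLUMN)
    return column_index + 1, row_index + 1
```

For instance, the top of the second column (C-2, R-1) maps to FF-41, and the same formula reproduces the pairings cited later in the description, such as column C-21 and row R-11 for FF-811.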
The pulse shaper unit samples the output of the photomultiplier in accordance with signals fed by a clock pulse source which also feeds shift pulses via line 106 to the shift register 78. The line 106 is connected to the input of each of flip-flops FF-1 to FF-1,200. Thus, as the stream of pulses representing the sampled output of the photomultiplier is fed to line 104 of the shift register 78, the pulses on line 106 advance the information through the shift register. It should be understood that the shift register 78 need not be physically positioned in 30 columns of 40 flip-flops. The flip-flops FF-1 through FF-1,200 may be positioned so that the flip-flops are in a single line from FF-1 through FF-1,200 or in any other physical location. It is not necessary that the flip-flops be positioned in accordance with the location of the sampled raster. The necessity of positioning the flip-flops in a rectangular pattern is obviated by use of electronic extraction masks which are connected to the output of the flip-flops irrespective of their locations.
Each of the flip-flops FF-1 through FF-1,200 is comprised of a flip-flop circuit 107 which includes a bi-stable flip-flop circuit having buffer amplifiers connected to the output thereof as shown in FIG. 7. The bi-stable portion of the circuit shown in FIG. 7 is an Eccles-Jordan type flip-flop that is comprised of transistors 108 and 110 and the associated circuitry connected therebetween.
The emitters of transistors 108 and 110 are each connected to ground. The collector of transistor 108 is connected to the base of transistor 110 via a resistor 112 and a capacitor 114 which are connected in parallel. The collector of transistor 108 is also connected to a negative source of voltage (-V) via resistor 116 and to the input of the next stage via line 118. The collector of transistor 110 is connected to the base of transistor 108 via resistor 120 and capacitor 122 which are connected in parallel and to the negative source of voltage (-V) via resistor 124. The collectors of transistors 108 and 110 are also connected to the bases of transistors 126 and 128, respectively, which act as amplifiers to drive the feature extraction masks in the feature extraction masks unit 80. The bases of transistors 108 and 110 are connected to a positive source of voltage (+V) via resistors 127 and 129, respectively. The base of transistor 108 is also connected to capacitor 130, which is connected to the output line from the previous stage. That is, the output line 118 of a previous flip-flop 107 is connected to the input of capacitor 130, except that in the case of flip-flop FF-1 the capacitor 130 is connected to input line 104 from the scanner unit 74. The base of transistor 110 is connected to capacitor 132. The capacitor 132 is connected to the line 106, which receives the shift pulses and shifts the contents of the shift register 78 from one stage to the next.
The collector of transistor 126 is connected to an output line 134 which is fed to the various feature masks which are associated with a particular stage of the shift register 78. Similarly, the collector of transistor 128 is connected to output line 136 which is also connected to various feature masks which are associated with that particular stage of shift register 78. The emitters of transistors 126 and 128 are connected via resistors 138 and 140, respectively, to a positive source of voltage (+V). The collectors of transistors 126 and 128 are also connected via resistors 142 and 144, respectively, to the negative source of voltage (-V).
As previously mentioned, the flip-flop comprised of transistors 108 and 110 is a bi-stable circuit. That is, either transistor 108 or transistor 110 conducts while the other is cut-off. Assuming transistor 108 is conducting, the transistor 110 is cut-off by the voltage on the collector of transistor 108, which is fed to the resistor divider comprised of resistors 112 and 129 and which back biases the emitter-base junction of transistor 110. Similarly, when transistor 110 conducts, the collector voltage of transistor 110 back biases the emitter-base junction of transistor 108 so that it is cut-off. Assuming transistor 110 is conducting, an input pulse to capacitor 132 back biases the emitter-base junction of transistor 110 so that it is cut-off. The change in output voltage on the collector of transistor 110 thereby enables transistor 108 to begin conduction. If, however, transistor 110 were cut-off prior to reception of a pulse to capacitor 132, the transistor 110 would be driven further into a cut-off region and the state of the flip-flop would remain unchanged. Similarly, if transistor 108 is conducting and an input pulse is applied to capacitor 130, the transistor 108 is cut-off and the resulting change in its collector voltage turns on transistor 110. An input pulse to capacitor 130, when transistor 108 is cut-off, merely drives the transistor 108 further into cut-off and the state of the flip-flop is unchanged.
Thus, it can be seen that each time a shift pulse is applied on line 106, each of the flip-flops 107 in shift register 78 is driven to the condition where transistor 110 is cut-off. That is, if transistor 108 in one particular stage of the shift register is cut-off, it is caused to conduct by the input pulse on shift line 106. In those stages where the transistor 108 is conducting, the conditions or states remain unchanged.
The output from the previous stage is then received by each of the flip-flops 107 and, if a pulse is applied on line 118 from the previous stage indicative of the fact that transistor 108 of the previous stage had been cut-off prior to the shift pulse, the transistor 108 of the next stage is cut-off by the pulse applied to capacitor 130. It should be understood that appropriate pulse delay means are inserted between the output line 118 of the previous stage and the input to capacitor 130 of the next stage so that the flip-flops 107 which have been changed by a shift pulse have time to be stabilized prior to the reception of the output from the previous stage.
Whenever transistor 108 of a flip-flop circuit 107 is cut-off, the output level on line 118 is indicative that a black portion of the document has been scanned to produce the pulse, whereas conduction of transistor 108 indicates that a white portion of the document has been scanned. It is, of course, to be understood that this may be reversed as the demands of the circuitry require. Thus, for ease of reference, when transistor 108 conducts and transistor 110 is cut-off in one of the stages of the shift register, the stage is considered to be in a "white" state. When transistor 108 is cut-off and transistor 110 conducts, the stage is considered to be in a "black" state.
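The two-phase shifting behavior described above (a shift pulse that drives every stage to the white state, followed by a delayed pulse from the preceding stage that sets the black state where needed) can be modeled logically as follows. This is a sketch of the logic only, under the assumption that stage states are represented as 0 for white and 1 for black; the function name `shift` is hypothetical and the model ignores all circuit timing.

```python
WHITE, BLACK = 0, 1


def shift(stages, incoming):
    """Advance the register by one position and return the new states.

    stages: current states of the flip-flops, first stage first.
    incoming: the bit arriving on the scanner input line for the first stage.
    """
    previous = list(stages)  # states held before the shift pulse arrives

    # Phase 1: the shift pulse drives every stage to the white state.
    cleared = [WHITE] * len(previous)

    # Phase 2: after the delay, each stage receives the prior state of the
    # stage before it; the first stage receives the incoming scanner bit.
    sources = [incoming] + previous[:-1]
    return [BLACK if source == BLACK else state
            for state, source in zip(cleared, sources)]
```

Feeding the bits 1, 0, 1 into an all-white five-stage register in this model leaves the first and third stages black, showing that the net effect of the two phases is an ordinary one-position shift.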
The output from the collector of transistor 108 also drives transistor 126, which produces an inverted output on line 134 that drives the feature masks associated with a flip-flop stage 107. Similarly, the output voltage on the collector of transistor 110 is inverted by amplifier 128 and applied via line 136 to the feature masks associated with that stage of the shift register.
As best seen in FIG. 5, the shift register 78 is connected via cable 145 to the feature extraction masks 80. Cable 145 includes the output lines 134 and 136 from each of the 1,200 flip-flops in shift register 78. The lines 134 and 136 are combinatorially applied to the plurality of masks which comprise the feature extraction masks unit 80. There are as many feature masks as there are features which must be recognized in order to identify the character which is being shifted through the shift register 78. Each feature extraction mask is connected to a plurality of flip-flops in shift register 78. The inputs may be either from the line 134 or line 136 of the flip-flops depending on the feature which is sought to be recognized. That is, if a character is sought to be recognized by a combination of the presence of various features, the recognition gates for those features are connected to lines 134 of the flip-flops of the shift register 78, whereas the absence of a segment may be recognized by sensing the lines 136 of the various flip-flops associated with the feature.
A feature mask is shown in FIG. 8. Each feature mask includes a plurality of resistors 146, the first end of which is connected to the output lines 134 or 136 of the various flip-flop stages of the shift register 78. The feature mask also includes a threshold gate which is comprised of transistors 148 and 150 and their associated circuitry. It should be understood that various combinations and pluralities of resistors may be used for a feature mask. That is, there need not be four resistors as shown, but in fact, any number from 2 to 60 can be used for a feature mask. However, for the most part, the average feature mask contains from 4 to of such resistors. Resistors 146 may be weighted in value so that certain portions of a feature which are more important are given more value as an input to the base of transistor 148. Transistors 148 and 150 are preferably of the P-N-P type. The emitters of transistors 148 and 150 are both connected to ground via resistor 152.
The collector of transistor 148 is connected to a negative source of voltage (-E) via resistor 154 and to the base of transistor 150 via resistor 156. The base of transistor 150 is also connected to a positive source of voltage (+E) via resistor 158. The collector of transistor 150 is also connected to the negative source of voltage (-E) via resistor 160. The base of transistor 148, in addition to being connected via resistors 146 to the various outputs of shift register 78, is also connected to a positive source of voltage (+E) via resistor 161. The collector of transistor 150 is also connected to a positive source of voltage (+E) via resistor 164. The transistors 148 and 150 are so biased by voltage sources +E and -E that the transistors 148 and 150 do not conduct until a plurality of inputs are applied to resistors 146 which overcome a predetermined threshold. Thus, if a particular mask has four resistors and is adapted to be operated by inputs to any three of the four resistors 146, then the mask has a threshold which is exceeded by inputs to three of the four resistors. Thus, when the circuit receives three of the inputs, the emitter-base junction of transistor 148 is forwardly biased and the transistor therefore conducts. This enables the conduction of transistor 150, which produces an output signal on line 162. Line 162 is connected to the collector of transistor 150 and the output signal is transmitted to the logic gates which are located in the combination of features to characters unit 82. As seen in FIG. 5, the feature extraction masks unit 80 is connected to the combination of features to characters unit 82 by a cable 166 which is comprised of the output lines 162 from each of the threshold gates in the feature extraction masks.
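The behavior of a feature mask as a weighted threshold gate can be sketched as below. This is a logical abstraction rather than a model of the transistor circuit: the function name `feature_mask_output`, the unit weights and the numeric threshold are assumptions chosen to mirror the three-of-four example in the text.

```python
def feature_mask_output(inputs, weights=None, threshold=3.0):
    """Return True when the weighted sum of active inputs meets the threshold.

    inputs: one boolean per input resistor, True when the connected
    shift register output line is active.
    weights: optional relative importance of each input (assumed to be
    1.0 each when omitted), standing in for weighted resistor values.
    threshold: weighted sum required before the gate produces an output.
    """
    if weights is None:
        weights = [1.0] * len(inputs)
    if len(weights) != len(inputs):
        raise ValueError("one weight is required per input")
    total = sum(weight for active, weight in zip(inputs, weights) if active)
    return total >= threshold
```

With four equally weighted inputs and a threshold of 3.0, any three active inputs produce an output while two do not. Giving one input a larger weight lets a more important portion of the feature count for more, in the spirit of the weighted resistors mentioned earlier.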
Referring now to FIG. 9, a character mask is diagrammatically illustrated. The diagram represents each of the feature masks used for the identification of the letter H. That is, the illustration represents the manner in which the shift register 78 is sensed in order to recognize the letter H if it is on a document and is scanned by the cathode ray tube of scanner unit 74.
The diagram is comprised of 30 columns of 40 blocks 168. Each block represents the stage of a flip-flop of shift register 78 to which the resistors 146 of the feature extraction masks are connected. Thus, the labels C-1 through C-30 for the columns and R-1 through R-40 for the rows correspond to the columns and rows of the shift register 78 as shown in FIG. 6. That is, the block 168 in column C-1 and row R-1 corresponds to flip-flop FF-1 in FIG. 6. Thus, if the feature mask for an H required the detection of either a black or a white predominance in that particular area of the document, one of the resistors 146 of a feature mask is connected to the line 134 or 136, respectively, of flip-flop FF-1.
For the letter H, feature extraction masks are used in accordance with the pattern shown in FIG. 9. That is, the letter H is comprised of a plurality of sectors 170, 172, 174, 176, 178, 180, 182, 184, 186 and 188. Each of the sectors of the letter H is five blocks long and three blocks wide and thereby encompasses 15 blocks. This is representative of the fact that each feature mask which is represented by a sector in the letter H in FIG. 9 includes fifteen resistors 146 which are connected to the output lines 134 of 15 stages of shift register 78. The sectors 170 to 176 extend vertically and form the left vertical bar of the letter H. Sectors 178 and 180 extend horizontally and form the central bar of the letter H, and sectors 182 through 188 extend vertically and form the right vertical bar of the letter H.
The mask for the letter H further includes a pair of sectors 190 and 192 which are each two blocks square. That is, the feature masks which are represented by each of these sectors includes four resistors 146 which are connected to the output lines 136 of four stages of shift register 78. The sectors 190 and 192 correspond to white areas on a document so that not only does the character H mask require detection of black areas on the document where the letter H would be, but also that there be white areas on the document between the vertical bars and the central horizontal bars of the H.
For each sector illustrated in the letter H mask of FIG. 9, there is a feature mask in the feature extraction masks unit 80. For example, the sector 170 is diagrammatically illustrative of a feature mask as shown in FIG. 8 having fifteen resistors 146. Each of the boxes 168 within sector 170 corresponds to a resistor 146 in such a feature extraction mask. The resistor corresponding to the box 168 which is disposed in column C-21 and row R-11 is connected to output line 134 of FF-811. Similarly, the box 168 which is disposed in both column C-21 and row R-12 indicates that a second resistor 146 of the feature extraction mask is connected to the output line 134 of flip-flop FF-812. In the same manner, each of the sectors 172 through 188 indicates a feature extraction mask having fifteen resistors 146 connected to the output lines 134 of various flip-flops throughout the shift register 78 in accordance with the location of the boxes in FIG. 9. The sectors 190 and 192 are each illustrative of feature masks having four input resistors 146. Thus, the box 168 which is disposed in both column C-15 and row R-12 indicates that the first resistor 146 in the feature mask corresponding to sector 190 is connected to output line 136 of flip-flop FF-572, which is in column C-15 and row R-12 of the shift register 78.
As the cathode ray tube in scanner unit 74 scans a letter H on a document, the signals formed by the scanning of the letter H are shifted through shift register 78. However, it is enough that various of the extraction masks be energized. That is, it is not necessary that all of the sectors of the letter H be recognized simultaneously. For example, if either of the sectors 170 or 172 is not recognized, the letter H may still be detected if the other is present. Thus, if the print on the document is sporadic at either portion of the H corresponding to sectors 170 or 172, the letter H can still be recognized. Similarly, as will be seen hereinafter, the absence of the recognition of other of the sectors will not completely prevent recognition of the letter H.
The outputs of the feature masks corresponding to sectors 170 through 192 are fed via cable 166 to the combination of features to characters unit 82. The combination of features to characters unit 82 includes a plurality of gating circuits, tree circuits or logic circuits to convert the outputs of the various feature masks to characters. Thus, in the example of the letter H, appropriate logic circuitry may be used to mechanize the following equation to recognize the letter H:

H = (S170 + S172) · (S174 + S176) · (S178 + S180) · (S182 + S184) · (S186 + S188) · S190 · S192

This is a Boolean equation which is mechanized within the combination of features to characters unit 82. S170 through S192 indicate that an output signal indicative of the presence of a sector is provided on lines 162 of the feature masks associated with sectors 170 through 192, respectively. The symbol "+" indicates the OR function and the symbol "·" indicates the AND function. It can thus be seen that each of the following conditions is necessary in the unit 82 to determine that an H has been scanned:
1. The recognition of the presence of either or both of sectors 170 and 172.
2. The recognition of the presence of either or both of sectors 174 and 176.
3. The recognition of the presence of either or both of sectors 178 and 180.
4. The recognition of the presence of either or both of sectors 182 and 184.
5. The recognition of the presence of either or both of sectors 186 and 188.
6. The recognition of the presence of both sectors 190 and 192.
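The conditions listed above can be written out as a short Boolean function. This is a sketch only: the function name `is_letter_h` and the dictionary representation of the sector signals are assumptions, and the fourth condition is taken to pair sectors 182 and 184, following the pattern of the other pairs of adjacent sectors.

```python
def is_letter_h(sectors):
    """Evaluate the letter H conditions on the feature mask outputs.

    sectors: dict mapping a sector number (170, 172, ..., 192) to True
    when the feature mask for that sector produced an output signal.
    """
    return (
        (sectors[170] or sectors[172])        # upper part of left bar
        and (sectors[174] or sectors[176])    # lower part of left bar
        and (sectors[178] or sectors[180])    # central horizontal bar
        and (sectors[182] or sectors[184])    # upper part of right bar (assumed pairing)
        and (sectors[186] or sectors[188])    # lower part of right bar
        and sectors[190] and sectors[192]     # both white areas must be detected
    )
```

Because each black portion of the letter is covered by two sectors joined by OR, a gap in the print at one sector of a pair does not prevent recognition, whereas both white areas are required.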
The detection of the character by the combination of features to characters unit 82 provides an output signal on cable 194 which is connected to the input of code generator 84. Code generator 84 converts the input from cable 194 to a binary-coded representation of the character identified or recognized by the unit 82.
The output of code generator 84 is connected to the input-output buffer unit 88 via cable 196. Cable 196 includes a plurality of lines which feed the character to the input-output buffer unit in parallel. The input-output buffer unit 88 acts as a multiplexing unit for feeding information into and out of the master control unit 86 on a time sharing basis. Master control unit 86 is preferably a general purpose digital computer which is programmed in accordance with the requirements of the system. The input-output buffer unit 88 provides these coordinates to master control unit 86 via cable 202. Thus, not only the character which is read but the location thereof is stored together therewith in a temporary storage area of the master control unit.
It should be noted that the input-output buffer unit 88 also provides instructions to instruction control unit 76 via line 204 which is connected therebetween. The master control unit 86 provides instruction signals for distribution throughout the system via cable 202 which is connected between the input-output buffer unit and the master control unit. As hereinbefore mentioned, the input-output buffer unit 88 is a multiplexing unit which controls traffic between the remainder of the system and the master control unit 86.
Referring now to FIG. 12, a combination of feature masks is diagrammatically shown for the detection and the identification of editing symbols of the invention on a document. FIG. 12 diagrammatically illustrates, in the same manner as FIG. 9, the combination of feature extraction masks which comprise the means of detecting and recognizing the editing symbols on a document. The vertical bar 22 of the editing symbol 20 is comprised of 15 sectors M1 through M15. Each of the sectors is five blocks long by one block wide. Each of the sectors M1 through M15 is vertically elongated and is positioned substantially at the center of the raster. The top left horizontal bar 24 is comprised of sectors T1, T2 and T3. The top right horizontal bar 26 is comprised of sectors T4, T5 and T6. The left central horizontal bar 28 is comprised of sectors C1, C2 and C3. The right central horizontal bar 30 is comprised of sectors C4, C5 and C6. The bottom left horizontal bar 32 is comprised of sectors L1, L2 and L3. The bottom right horizontal bar 34 is comprised of sectors L4, L5 and L6. Each of the sectors T1 through L6 which comprise the horizontal bars 24 through 34 is horizontally elongated and is five blocks long by one block wide.
Each of sectors M1 through M15 represents a feature mask having five input resistors 146 which are connected to the output lines 134 of the stages of shift register 78 in accordance with the location of the boxes in FIG. 12. Each of the sectors T1 through T6, C1 through C6 and L1 through L6 that form the horizontal bars of the editing symbol 20 is illustrative of a pair of feature extraction masks each having five resistors 146 connected to the base of transistor 148. The resistors 146 of the first of each of the feature masks associated with these sectors of the symbol 20 are connected to the output lines 134 of the various stages of the shift register 78 with which they are associated. The resistors 146 of the second of the feature masks associated with these sectors are connected to the output lines 136 of the associated stages of the shift register 78. Therefore, there is a feature mask to detect both the presence and the absence of each of the sectors that comprise the horizontal and vertical bars which form the editing symbol 20.
The feature masks associated with the black sides or outputs 134 of the associated flip-flop stages produce an output to indicate the presence of a sector when the area of the sector on the document is predominantly black. When a black sector is present, the output on line 162 of the feature mask is labeled for use in the equations, infra, in accordance with the sector detected. Thus, if the feature mask for sector T1 detects a black sector, the presence of the sector in the Boolean equation is labeled "T1".
The recognition masks which are connected to output lines 136 indicate the absence of a particular symbol or a predominantly white area on the document. Thus, in the case of the area of sector T1 being predominantly white, the output signal produced by the feature mask is labeled $\overline{T1}$. Thus, the signals produced in the feature extraction masks unit 80 by the feature extraction masks which are used to determine the presence of a black sector are labeled by the sector which they represent, whereas the feature masks which detect the absence of a black area or the presence of a white area emit a signal indicative thereof which is labeled by the sector which they represent with a bar above it. This terminology is used throughout the equations set forth, infra.
Provided above the horizontal sectors T1 and T4 and the tops of vertical sectors M1, M2 and M3 is a horizontally extending sector TZ which is 13 blocks long and one block wide and is thus coextensive with the top of the editing symbol 20. The sector TZ is also spaced one block above the editing symbol 20. Provided below the horizontal sectors L3 and L6 and the bottoms of vertical sectors M13, M14 and M15 is a horizontally elongated sector BZ which is also 13 blocks long by one block wide. The sectors TZ and BZ are each associated with a feature extraction mask having 13 resistors 146 connected to the base of transistor 148. Each of the resistors is connected to the output line 136 of the associated stages of shift register 78. The sector TZ extends from column C-10 to C-22 on row R-6 and thus the resistors 146 are connected to stages FF-366, FF-406, FF-446, FF-486, ... and FF-846. The resistors 146 of the feature extraction mask associated with sector BZ are connected to the flip-flops FF-394, FF-434, ... and FF-874.
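The stage numbers quoted for sectors TZ and BZ are consistent with a shift register that holds 40 stages per raster column, so that stage FF-n for column C-c and row R-r satisfies n = 40(c - 1) + r. That layout, and the row R-34 used for BZ, are inferred from the quoted numbers and are not stated in the text. The following sketch, whose helper names `stage_number` and `sector_stages` are hypothetical, merely checks the arithmetic under that assumption.

```python
from typing import List

ROWS_PER_COLUMN = 40  # assumption inferred from the quoted stage numbers


def stage_number(column: int, row: int) -> int:
    """Return n for the stage FF-n holding the block at column C-<column>, row R-<row>."""
    return ROWS_PER_COLUMN * (column - 1) + row


def sector_stages(first_column: int, last_column: int, row: int) -> List[int]:
    """Stages covered by a one-block-high horizontal sector on the given row."""
    return [stage_number(c, row) for c in range(first_column, last_column + 1)]


tz = sector_stages(10, 22, 6)    # sector TZ: columns C-10 to C-22 on row R-6
bz = sector_stages(10, 22, 34)   # sector BZ: row R-34 is inferred, not quoted
print(tz[:4], tz[-1])
print(bz[:2], bz[-1])
```

Under this assumption both sectors cover 13 stages spaced 40 apart, which agrees with the 13 resistors of each mask.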
Provided between the horizontal bars 24 and 28 and vertical bar 22 is a sector LZ which is three blocks long and two blocks wide. Another sector RZ is provided between horizontal bars 30 and 34 and vertical bar 22 which is also three blocks long by two blocks wide. The sectors LZ and RZ are each associated with a feature mask having six resistors 146 which are connected to the lines 136 of the associated stages of shift register 78. The sectors TZ, BZ, LZ and RZ, as will be seen hereinafter, insure that the signals emitted by scanning an editing symbol which are shifted through the shift register 78 are in the proper position to enable an accurate character identification.
As previously mentioned, the feature extraction masks may have weighted resistors for the characters. In the feature extraction masks used for the sectors of the horizontal and vertical bars of the editing symbol 20, the resistors 146 are substantially equal in resistance. The threshold gate associated with each of the sectors is properly biased so that it may be operated by the receipt of three bits out of five from the shift register. That is, if the cathode ray tube scans a black area in any three of the five positions within a sector, the threshold gate of the feature mask is operated to produce a signal on line 162 of the threshold gate to indicate that a sector is present. Thus, normal variations in the line produced by a pencil or a pen do not prevent the threshold gate associated with the sector from being operated when a sector is present on the document.
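The three-out-of-five threshold gate can be modeled in software as a simple count, since the resistors are substantially equal in weight. The sketch below is illustrative only; the function names are hypothetical, and it assumes that the complementary "white" mask of a sector uses the same three-out-of-five threshold.

```python
def threshold_gate(bits, threshold):
    """Operate when at least `threshold` equally weighted inputs are 1."""
    return sum(bits) >= threshold


def sector_present(black_bits):
    """Black-side mask (lines 134): three or more black blocks out of five."""
    return threshold_gate(black_bits, 3)


def sector_absent(black_bits):
    """White-side mask (lines 136); the threshold of three is an assumption."""
    white_bits = [1 - b for b in black_bits]
    return threshold_gate(white_bits, 3)


broken_stroke = [1, 1, 0, 1, 0]   # a pencil line with two gaps
print(sector_present(broken_stroke), sector_absent(broken_stroke))
```

With five inputs and a threshold of three on each side, exactly one of the two masks of a sector operates for any input, which is why a barred label can stand for the absence of the sector.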
TZ, LZ, RZ and BZ comprise a registration mask. As hereinbefore mentioned, the feature masks associated with these sectors aid in the prevention of an inaccurate identification of a character in the editing symbol masks. The feature masks associated with sectors TZ and BZ of the mask are set so that the thirteen resistors 146 are similar in weight and the circuit is operated upon receipt of ten or more signals from lines 136 of the 13 stages of the shift register 78 to which they are connected. That is, if the cathode ray tube scans ten white areas out of the thirteen areas of the sector, the feature mask is energized. The feature masks associated with sectors LZ and RZ are set so that the presence of four or more white spots during the scan of the sector by the cathode ray tube energizes the threshold gate of the feature mask. The feature masks of sectors LZ and RZ when energized indicate that the color of these areas on the document is predominantly white. This condition for any of the registration feature masks is represented in the following equations by a bar (i.e. $\overline{TZ}$) over the top of the sector which has been scanned. Similarly, with respect to the sectors of the editing symbol 20, the use of a bar over the top of the sector (e.g. $\overline{T1}$) indicates that the feature mask associated therewith, which is connected to the lines 136 or the "white" sides of the flip-flop stages of shift register 78, has been energized due to the absence of the sector.
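The registration thresholds described above (ten of thirteen white blocks for the top and bottom sectors, four of six for the two side sectors) can be sketched in the same counting style. The function and argument names are hypothetical; each argument is assumed to be a list of "white" outputs, with 1 meaning that the block was scanned as white.

```python
def white_enough(white_bits, threshold):
    """Energized when at least `threshold` of the blocks are white."""
    return sum(white_bits) >= threshold


def registration_signal(tz_white, bz_white, lz_white, rz_white):
    """True when all four registration sectors are predominantly white."""
    return (
        white_enough(tz_white, 10)      # top sector: 13 inputs, 10 or more white
        and white_enough(bz_white, 10)  # bottom sector: 13 inputs, 10 or more white
        and white_enough(lz_white, 4)   # left sector: 6 inputs, 4 or more white
        and white_enough(rz_white, 4)   # right sector: 6 inputs, 4 or more white
    )


clean_top = [1] * 11 + [0] * 2     # two stray black blocks are tolerated
clean_side = [1, 1, 1, 1, 0, 0]
print(registration_signal(clean_top, clean_top, clean_side, clean_side))
```

The slack in each threshold lets a few stray marks around the symbol pass, while a symbol that is shifted into one of the registration sectors fails the test.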
The bars 24 through 34 of the editing symbol 20 are recognized as present by the logic circuitry upon the recognition of any one of the three sectors in each of the horizontal bars. That is, the top left segment 24 (hereinafter referred to as TL) is recognized if the feature mask associated with either T1, T2 or T3 is energized. Similarly, the horizontal bars 26, 28, 30, 32 and 34 (hereinafter referred to as TR, CL, CR, LL, and LR, respectively) are recognized as present by the recognition of one or more of the sectors by their associated feature mask.
The circuitry for the combination of features to characters unit 82 insofar as the detection and identification of the editing symbols required is thus mechanized in accordance with the following Boolean equations:
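For the Presence of the Horizontal Bars

1. $TL = T1 + T2 + T3$
2. $TR = T4 + T5 + T6$
3. $CL = C1 + C2 + C3$
4. $CR = C4 + C5 + C6$
5. $LL = L1 + L2 + L3$
6. $LR = L4 + L5 + L6$

For the Absence of the Horizontal Bars

1. $\overline{TL} = \overline{T1} \cdot \overline{T2} \cdot \overline{T3}$
2. $\overline{TR} = \overline{T4} \cdot \overline{T5} \cdot \overline{T6}$
3. $\overline{CL} = \overline{C1} \cdot \overline{C2} \cdot \overline{C3}$
4. $\overline{CR} = \overline{C4} \cdot \overline{C5} \cdot \overline{C6}$
5. $\overline{LL} = \overline{L1} \cdot \overline{L2} \cdot \overline{L3}$
6. $\overline{LR} = \overline{L4} \cdot \overline{L5} \cdot \overline{L6}$

The vertical bar 22 is formed of five groups each including three sectors. Thus, five vertical portions of the bar are sensed, and these portions are hereinafter referred to as V1, V2, V3, V4 and V5. V1 is considered to be present if the recognition mask associated with either M1, M2 or M3 is energized. Similarly, the remaining vertical portions are recognized upon recognition of one or more of the vertical sectors comprising the portion. These portions are thus detected by logic circuitry in unit 82 which is mechanized in accordance with the following Boolean equations:

1. $V1 = M1 + M2 + M3$
2. $V2 = M4 + M5 + M6$
3. $V3 = M7 + M8 + M9$
4. $V4 = M10 + M11 + M12$
5. $V5 = M13 + M14 + M15$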
If each of the vertical portions V1 through V5 is present, the vertical bar 22 (hereinafter referred to as V) is considered to be present. Thus, the detection of V is mechanized by the following equation:

$V = V1 \cdot V2 \cdot V3 \cdot V4 \cdot V5$
As hereinbefore mentioned, to insure that the presence of the vertical bar and the combination of the presence and absence of horizontal bars TL, TR, CL, CR, LL and LR are, in fact, in the proper location at the time that the vertical bar 22 is detected, the registration mask should produce the registration signal R in accordance with the following Boolean equation:
$R = \overline{TZ} \cdot \overline{LZ} \cdot \overline{BZ} \cdot \overline{RZ}$
It can therefore be seen that in order for the recognition masks to indicate that an editing symbol is present, not only must the vertical bar 22 be present as indicated by the signal V being generated by the logic circuitry, but also the signal R must be generated by the logic circuitry. If both the R and V signals are present, it is indicative that an editing symbol of the font of editing symbols shown in FIG. 2 is present. It can be seen by the following exemplary illustrations how the logic tree is mechanized in order to identify which of the following editing symbols are scanned:
It can therefore be seen that if both the R and V signals are generated, an editing symbol is identified. It can be seen in the above equations that where a particular horizontal bar of an editing symbol on the left side of the equation is required to be present, the label representative of the bar appears on the right side of the equation. Where the bar is not present in the left side of the equation, the label representative of the bar appears on the right side of the equation with a line thereover. That is, where the top left bar 24 is present in the symbol on the left side of the equation, TL appears in the right side of the equation, and where the top left bar is not present, the symbol $\overline{TL}$ appears.
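The detection logic described above can be summarized in a short software sketch: a horizontal bar is present if any of its three sectors is detected, the vertical bar requires all five vertical portions, and a symbol is reported only when both R and V are generated. The function names and the dictionary layout are hypothetical, and the sketch returns only the six-bar pattern; the assignment of patterns to particular editing symbols is given by the font of FIG. 2 and is not reproduced here.

```python
BAR_NAMES = ("TL", "TR", "CL", "CR", "LL", "LR")


def bar_present(sectors):
    """A horizontal bar is present if any one of its three sectors is detected."""
    return any(sectors)


def vertical_bar_present(m):
    """m holds the 15 black-mask outputs M1..M15 in order."""
    portions = [any(m[i:i + 3]) for i in range(0, 15, 3)]  # V1..V5
    return all(portions)                                    # V


def read_bar_pattern(m, horizontal, r):
    """Return the (TL, TR, CL, CR, LL, LR) pattern, or None if R and V are not both true."""
    if not (r and vertical_bar_present(m)):
        return None
    return tuple(bar_present(horizontal[name]) for name in BAR_NAMES)


m_outputs = [1, 0, 0] * 5                       # one sector detected per vertical portion
bars = {name: [0, 0, 0] for name in BAR_NAMES}
bars["TL"] = [0, 1, 0]                          # only the top left bar is written
print(read_bar_pattern(m_outputs, bars, r=True))
```

A pattern entry of False corresponds to the barred label in the equations, for example $\overline{TR}$ for a missing top right bar.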
When the combination of features to characters unit 82 detects an editing symbol, the output is fed on an appropriate line in cable 194 to code generator 84. Code generator 84 converts the signal from cable 194 to a binary-coded signal and feeds these signals via cable 196 to the input-output buffer unit 88.
The instruction control 76 provides via lines 198 and 200 the binary-coded signals representing the location at which the editing symbol is located on the document. The input-output buffer unit 88 transmits both the representation of the symbol and the location (x and y coordinates) thereof to the master control unit 86 for storage therein.
The master control unit 86 also supplies instruction signals to the instruction control 76 via line 204 and the input-output buffer unit 88 so that the document scanning equipment may be controlled for location of scan as well as the size of the scan. The size of the scan may also be varied where the editing symbol detected does not fall within specific size limits. Thus, if the symbol is written too small, the raster produced by the cathode ray tube is reduced. Similarly, if the editing symbol is too large, the raster is increased in size.
The overall flow of operations and data within the character recognition system shown in FIG. 5 is illustrated by the schematic flow diagram in FIG. 10. As seen therein, the operation of the character recognition system is as follows:
The document to be scanned is placed into the cylindrical rotating platen of the document handling unit 72. When the document is in place, the document handling unit emits a signal over line 206 to the scanner unit 74 to indicate that the document is in place. If the document is not in place, the feeding apparatus of the document handling unit 72 is operated until the document is properly disposed.
The cathode ray tube of the scanner unit begins to search for the first line of the document so that it can begin to optically scan the characters throughout the document. The cathode ray tube scans in pattern to locate the first line of typewritten or printed information on the document. Until the first line is found, the cathode ray tube continues to scan in pattern.
When the first line is detected, the horizontal and vertical position of the cathode ray tube beam is transmitted to the master control via the instruction control 76 and the input-output buffer unit 88. When the location of the first line of the document is received by the master control unit 86, the control unit 86 instructs the scanner unit to start scanning in a character pattern at the given horizontal and vertical location (hereinafter referred to as the x/y coordinate). The scanner unit then begins a character scan at the x/y coordinate. If the scanner does not recognize video, that is, when a character is not present at the first location, the character scan is moved further along the line by the instruction control 76.
When video is detected, the x/y coordinate is transmitted to the master control unit 86 via the instruction control unit 76 and the input-output buffer unit 88. The character at that position is scanned by the cathode ray tube and the output of the photomultiplier tube is fed via line 104 to the shift register 78. If a character is identified and recognized by the units 80 and 82, the character and the x/y coordinate of the character are stored in the master control unit.
The instruction control 76 controls the scanner unit so that the scanner unit continues the character scans along the line until the end of the line. At the end of the line, the scanner is instructed by the instruction control 76 to scan in an editing scan at the x/y coordinate below the previous line. The scanner continues in an editing mode until a video interrupt. That is, if there is recognition that a character does exist on the editing line, then the x/y coordinate is fed to the master control unit and the scanner unit begins a character scan to provide the shift register with the output signals from the photomultiplier in scanner unit 74 for determination of the editing symbol located on the line. The recognition equipment thus sends the binary-coded representation of the symbol to the master control unit for storage with the x/y coordinate thereof. Instruction control 76 instructs the scanner unit to continue scanning between the lines of textual material until the end thereof, whereupon the instruction control instructs the scanner to index to the next line of textual material. The process is then repeated by the scanner unit 74 at the next line of textual material and the portion of the document underneath the line for detection of instructions. This process is repeated until the end of the document, whereupon an end of document signal is generated in the master control unit 86 and the document handling unit 72 is instructed to put the next document in place for optical recognition.
As previously mentioned, the master control unit 86 is a general purpose digital computer. The information concerning the characters on the lines of textual material and the instruction symbols are stored in temporary storage areas of memory. Line merges are initiated by the program in the master control and the editing operation is accomplished. The editing operation is diagrammatically illustrated in FIG. 11, which is a flow chart of the information in the computer for performing the merge.
As seen therein, after the end of document signal is received, the x/y coordinate of the first character in the first line is fetched from the temporary memory. The x/y coordinate of the first editing character found is also fetched from the temporary storage associated therewith. The coordinates of the textual character and the editing character are compared. If the coordinates do not compare, that is, it is determined that the x/y coordinate of the editing symbol is not adjacent to the x/y coordinate of the textual character, then the textual character is not in error and is not changed. Then the x/y coordinate of the next character is fetched and the coordinates of the edited character and the textual character are compared in the same manner that the coordinates were compared in the previous comparison.
If a comparison had been made in which the x/y coordinates of both the textual character and the editing character are within a specified limit and therefore adjacent to each other, then the editing character is fetched and the editing operation indicated by the character is performed. The results of the editing operation are stored in the final storage area along with the storage of the previous textual characters. The final operations on the stored data are then performed in accordance with the instructions which are indicated by the editing symbols representative of the graphic arts instructions. Thus, the computer organizes the proper number of letters for a line and the width of the final columns that are used in the reproduction of the textual material before the textual material is read out of the computer.
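The merge described above can be sketched as a comparison of stored coordinates followed by the operation indicated by the adjacent editing symbol. Everything specific in the sketch is an assumption made for illustration: the record type `Item`, the limits `x_limit` and `y_limit`, and the single `"DELETE"` operation are hypothetical and do not represent the actual program of the master control unit or the actual meanings of the editing symbols.

```python
from typing import List, NamedTuple


class Item(NamedTuple):
    x: float
    y: float
    value: str   # a textual character, or the name of an editing operation


def adjacent(text: Item, edit: Item, x_limit: float, y_limit: float) -> bool:
    """True when both coordinates agree within the specified limits."""
    return abs(text.x - edit.x) <= x_limit and abs(text.y - edit.y) <= y_limit


# Hypothetical table of editing operations; each maps a character to its replacement.
OPERATIONS = {"DELETE": lambda ch: ""}


def merge(text_items: List[Item], edit_items: List[Item],
          x_limit: float = 0.5, y_limit: float = 1.0) -> str:
    final_storage = []
    for ch in text_items:
        edit = next((e for e in edit_items
                     if adjacent(ch, e, x_limit, y_limit)), None)
        if edit is None:
            final_storage.append(ch.value)   # no adjacent symbol: character unchanged
        else:
            final_storage.append(OPERATIONS[edit.value](ch.value))
    return "".join(final_storage)


text = [Item(0, 0, "c"), Item(1, 0, "a"), Item(2, 0, "t")]
edits = [Item(1, 1, "DELETE")]               # symbol written under the "a"
print(merge(text, edits))
```

In this example the editing symbol lies one unit below the second character, so that character alone is edited and the other two pass to final storage unchanged.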
It can, therefore, be seen that a new and improved method of editing as well as a new and improved character recognition system has been shown.
The invention enables the editing of printed or typed textual material for direct insertion into a character recognition system. The need for retyping or reprinting the entire sheet in perfect form is thus obviated.
Further, the method of editing is no more time consuming than other forms of editing and the symbols used are easy to write while being machine recognizable. The edited document is then ready to be placed directly in the character recognition system which can read the textual material as well as incorporate the alterations.
Obviously many modifications and variations in the present invention are possible in the light of the above teachings. It is, therefore, to be understood that within the scope of the appended claims, the invention may be practiced otherwise than as specifically described.
What is claimed as the invention is: 1. A font of editing symbols as shown in FIG. 2.

US00871550A 1969-11-10 1969-11-10 Character recognition Expired - Lifetime US3709525A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US87155069A 1969-11-10 1969-11-10

Publications (1)

Publication Number Publication Date
US3709525A true US3709525A (en) 1973-01-09

Family

ID=25357687

Family Applications (1)

Application Number Title Priority Date Filing Date
US00871550A Expired - Lifetime US3709525A (en) 1969-11-10 1969-11-10 Character recognition

Country Status (1)

Country Link
US (1) US3709525A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US1021189A (en) * 1909-11-18 1912-03-26 Irving Hill Alphabetical symbols.
US1267640A (en) * 1917-11-22 1918-05-28 Edwin W Beardsley Writing paper, card, tablet, or the like.
