US20040096102A1 - Methodology for scanned color document segmentation - Google Patents
Methodology for scanned color document segmentation
- Publication number
- US20040096102A1 (application US 10/299,534)
- Authority
- US
- United States
- Prior art keywords
- foreground
- pixel
- parametric model
- image
- mask
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N1/00—Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
- H04N1/46—Colour picture communication systems
- H04N1/64—Systems for the transmission or the storage of the colour picture signal; Details therefor, e.g. coding or decoding means therefor
- H04N1/642—Adapting to different types of images, e.g. characters, graphs, black and white image portions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/11—Region-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/16—Image preprocessing
- G06V30/162—Quantising the image signal
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/18—Extraction of features or characteristics of the image
- G06V30/18105—Extraction of features or characteristics of the image related to colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/40—Document-oriented image-based pattern recognition
- G06V30/41—Analysis of document content
- G06V30/413—Classification of content, e.g. text, photographs or tables
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10004—Still image; Photographic image
- G06T2207/10008—Still image; Photographic image from scanner, fax or copier
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20004—Adaptive image processing
- G06T2207/20008—Globally adaptive
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30176—Document
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Definitions
- FIG. 3A is a simplified depiction of the above description provided as an aid in the visualization of the methodology employed.
- FIG. 3A is an example of when the samples exhibit a well fitted mixture of 3D gaussians 30 and 31 .
- Gaussian 30 represents the background (lighter) pixel samples and gaussian 31 represents the foreground (darker) pixel samples.
- FIG. 3B is a 2D slice of FIG. 3A to aid in further visually clarifying the relationship of sample pixel gaussians 30 and 31 and resultant binary selector 32 .
- the selector is processed to find connected components by first doing a morphological opening and then a closing. Large connected components are extracted as objects and output as foreground/mask pairs.
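- The clean-up of the selector described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the 3x3 structuring element, 4-connectivity, the treatment of pixels outside the image as background, and the names `opening`, `closing`, and `large_components` are all assumptions made for the example.

```python
from collections import deque

def _morph(img, combine, pad=0):
    """Apply a 3x3 neighbourhood test; pixels outside the image count as `pad`."""
    h, w = len(img), len(img[0])
    def at(y, x):
        return img[y][x] if 0 <= y < h and 0 <= x < w else pad
    return [[int(combine(at(y + dy, x + dx)
                         for dy in (-1, 0, 1) for dx in (-1, 0, 1)))
             for x in range(w)] for y in range(h)]

def erode(img):
    return _morph(img, all)   # 1 only if the whole 3x3 neighbourhood is 1

def dilate(img):
    return _morph(img, any)   # 1 if any pixel in the 3x3 neighbourhood is 1

def opening(img):
    return dilate(erode(img))  # removes specks smaller than the element

def closing(img):
    return erode(dilate(img))  # fills gaps narrower than the element

def large_components(img, min_area):
    """Return 4-connected components (lists of (row, col)) with >= min_area pixels."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, pixels = deque([(y, x)]), []
            while queue:  # breadth-first flood fill
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_area:
                components.append(pixels)
    return components
```

- Calling `large_components(closing(opening(selector)), min_area)` follows the order given in the text: opening first, then closing, then extraction of the large components. The value of `min_area` is left to the caller.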
- the segmented document image is now ready for subsequent processing.
- the objects may be smoothed or enhanced according to image type, the selector plane subjected to further analysis as a binary document image, etc. Also, one may compress the image according to the TIFF-FX profile M standard or variant.
- Expectation-Maximization (EM) is a general technique for obtaining maximum-likelihood estimates (MLEs) when data are missing.
- the seminal paper is A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm (with discussion), Journal of the Royal Statistical Society B, 39, pp. 1-38 (1977), and a recent comprehensive treatment is G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions, Wiley, New York (1997), both of which are herein incorporated by reference for their teaching.
- the mixture-of-gaussians (MoG) estimation problem is a straightforward and intuitive application of EM.
- the EM algorithm provides an iterative and intuitive method to produce MLEs.
- the missing data in this case is membership information.
- the first step in the EM algorithm is to initialize the parameter estimates (the mixture parameter and the mean and covariance of each component), $\hat{\pi}^{(0)}, \hat{\mu}_1^{(0)}, \hat{\Sigma}_1^{(0)}, \hat{\mu}_2^{(0)}, \hat{\Sigma}_2^{(0)}$.
- the next step, the “E-step,” is to use equation (5) to get estimates of the $z_{ij}$.
- the next step, the “M-step,” is to use these estimates of the $z_{ij}$ and the original data in equations (3) and (4) to get updated MLEs of the parameters.
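- For orientation, the E-step and M-step of a two-component mixture of gaussians are commonly written as below. This is the standard textbook form, given as a sketch; the symbols $\pi$, $\mu_j$, $\Sigma_j$, and $\phi$ are assumptions chosen for illustration, and the notation may differ from that of equations (3) through (5) referenced above.

```latex
% Two-component mixture density; \phi is the multivariate gaussian density
f(x) = \pi\,\phi(x;\mu_1,\Sigma_1) + (1-\pi)\,\phi(x;\mu_2,\Sigma_2)

% E-step: expected membership of sample x_i in component j
\hat{z}_{i1} = \frac{\hat{\pi}\,\phi(x_i;\hat{\mu}_1,\hat{\Sigma}_1)}
                    {\hat{\pi}\,\phi(x_i;\hat{\mu}_1,\hat{\Sigma}_1)
                     + (1-\hat{\pi})\,\phi(x_i;\hat{\mu}_2,\hat{\Sigma}_2)},
\qquad
\hat{z}_{i2} = 1 - \hat{z}_{i1}

% M-step: re-estimate the parameters from the memberships
\hat{\pi} = \frac{1}{n}\sum_{i=1}^{n}\hat{z}_{i1},
\qquad
\hat{\mu}_j = \frac{\sum_{i}\hat{z}_{ij}\,x_i}{\sum_{i}\hat{z}_{ij}},
\qquad
\hat{\Sigma}_j = \frac{\sum_{i}\hat{z}_{ij}\,(x_i-\hat{\mu}_j)(x_i-\hat{\mu}_j)^{T}}
                      {\sum_{i}\hat{z}_{ij}}
```

- The two steps are repeated until the estimates stop changing appreciably; the membership values $\hat{z}_{ij}$ play the role of the missing data.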
- segmentation classes are compression classes, i.e., regions amenable to compression with appropriate algorithms: text with ITU-T Group 4 (MMR) and color images with JPEG.
- One advantage of this approach is that one avoids compressing text with JPEG where it is known to produce ringing and mosquito noise.
- Mixed raster content is an imaging model directed toward facilitating compression, yet it can be used as a “carrier” for documents segmented for rendering or layout analysis.
- $I(x, y) = (1 - M_0(x, y))\,BG_0(x, y) + M_0(x, y)\,FG_0(x, y)$
- a (vector) pixel value is selected from the background, if the mask is zero, and from the foreground if the mask is one.
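- The selection rule just stated can be illustrated with a short sketch. The function name `reconstruct` and the representation of planes as nested lists of channel tuples are assumptions made for the example, not part of the source.

```python
def reconstruct(mask, background, foreground):
    """Compose I = (1 - M) * BG + M * FG per pixel, for a binary mask M.

    mask:        rows of 0/1 values
    background:  rows of channel tuples (same shape as mask)
    foreground:  rows of channel tuples (same shape as mask)
    """
    image = []
    for m_row, bg_row, fg_row in zip(mask, background, foreground):
        image.append([
            # With m in {0, 1} this picks bg when m == 0 and fg when m == 1.
            tuple((1 - m) * b + m * f for b, f in zip(bg, fg))
            for m, bg, fg in zip(m_row, bg_row, fg_row)
        ])
    return image
```

- Because the mask is binary, the arithmetic form and a plain per-pixel selection give the same result; the arithmetic form is used here only to mirror the formula.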
- An object foreground is an image $FG_i$ and a mask $M_i$:
- the number of objects that can appear on a page is not a priori restricted except that objects cannot overlap (for we cannot segment them if they do), and they must have a certain minimum area (say, 2 square inches).
- an exemplary segmentation methodology comprises:
- step 5 If ⁇ circumflex over ( ⁇ ) ⁇ b (l*) ⁇ circumflex over ( ⁇ ) ⁇ f (l*) ⁇ t and s 1 ⁇ circumflex over ( ⁇ ) ⁇ s 2 then fit a 1D mixture of gaussians to the L* values and perform step 5 (which can be reduced to a simple threshold operation).
- In FIG. 4 there is depicted a flow chart for employing the segmentation methodology described above in a Mixed Raster Content embodiment.
- At start block 400, a document page is initially scanned.
- a raster image is read in and converted to yield an L*a*b* image.
- the adaptive image segmenter is employed as described above.
- A uniform sampling of pixels across the image is taken; the number of samples may vary, but in one preferred embodiment 2000 samples are employed. Expectation-Maximization is applied to the sample pixel data to yield an estimate of the parametric model parameters, comprising a mixture parameter, two 3D means, and the corresponding covariance matrices. A quadratic decision surface is computed from the parametric model parameters. This quadratic decision surface is employed as a binary selector plane, and each document image data pixel is then compared against the decision surface to designate the pixel as either background or foreground. If, as a result of that comparison, a foreground and a background are indeed found at decision block 420, the pixel-by-pixel designation from the comparison is used to create a binary mask plane at block 470; otherwise the methodology is complete, as indicated by end-block 460.
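- One way to picture the quadratic decision surface is as the zero level set of the difference between the weighted log-densities of the two gaussians. The sketch below assumes that form (a standard quadratic discriminant); the function names, the `weight_fg` parameter, and the numeric values in the usage note are hypothetical and not taken from the source.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate (assumes det3(m) != 0)."""
    d = det3(m)
    def cofactor(i, j):
        # Cyclic indexing yields the signed cofactor directly for 3x3 matrices.
        return (m[(i + 1) % 3][(j + 1) % 3] * m[(i + 2) % 3][(j + 2) % 3]
                - m[(i + 1) % 3][(j + 2) % 3] * m[(i + 2) % 3][(j + 1) % 3])
    return [[cofactor(j, i) / d for j in range(3)] for i in range(3)]

def log_gauss(x, mean, cov):
    """Log-density of a 3D gaussian at point x."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    inv = inv3(cov)
    maha = sum(diff[i] * inv[i][j] * diff[j] for i in range(3) for j in range(3))
    return -0.5 * (3 * math.log(2 * math.pi) + math.log(det3(cov)) + maha)

def is_foreground(x, weight_fg, mean_fg, cov_fg, mean_bg, cov_bg):
    """True if x lies on the foreground side of the quadratic decision surface."""
    g = (math.log(weight_fg) + log_gauss(x, mean_fg, cov_fg)
         - math.log(1.0 - weight_fg) - log_gauss(x, mean_bg, cov_bg))
    return g > 0

def selector_plane(image, weight_fg, mean_fg, cov_fg, mean_bg, cov_bg):
    """Binary selector: 1 where the L*a*b* pixel is foreground, 0 otherwise."""
    return [[int(is_foreground(p, weight_fg, mean_fg, cov_fg, mean_bg, cov_bg))
             for p in row] for row in image]
```

- For example, with a dark foreground gaussian centred at L* = 20 and a light background gaussian centred at L* = 90, a pixel near (25, 1, -1) falls on the foreground side and a pixel near (88, 0, 2) on the background side. The surface is quadratic because the two covariance matrices generally differ, so the squared terms do not cancel.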
- the binary mask plane is converted into run lengths, cleaned using morphological open and close operations, and regions larger than a given threshold are merged.
- Large connected components are reserved as windows and are used to mask out portions of the preliminary foreground 450 .
- the reserved large connected components are subtracted out from the preliminary foreground and the mask plane.
- the initial result is a background plane 430 , a mask plane 440 , and a preliminary foreground plane 450 .
- the reserved large connected components are reiteratively processed (as just described above) starting again at block 410 through to block 480 , to yield any “n” number of foreground/mask pairs 490 , 500 , until no further pairs are found, as determined at decision block 420 .
- the methodology is then complete as indicated with end-block 460 .
Abstract
Description
- The present invention relates generally to image processing, and more particularly, to techniques for compressing the digital representation of a document.
- Documents scanned at high resolutions require very large amounts of storage space. Instead of being stored as is, the data is typically subjected to some form of data compression in order to reduce its volume, and thereby avoid the high costs associated with storing and transmitting it. Although much content is online, there remains a substantial amount of information in paper documents. Workflows can require extracting information in printed forms, converting legacy documents, or committing content of paper documents to a storage and retrieval system. In document processing systems, scanning completes the cycle: electronic, print, electronic. Conversion of printed documents to electronic format has been the subject of thousands of research articles and numerous books. Most work has focused on binary black and white documents. Yet the majority of documents today are in color at increasingly higher resolutions.
- One approach to satisfy the compression needs of differing types of data has been to use a Mixed Raster Content (MRC) format to describe the image. The image, a composite image having text intermingled with color or gray scale information, is segmented into two or more planes, generally referred to as the upper and lower plane, and a selector plane is generated to indicate, for each pixel, which of the image planes contains the actual image data that should be used to reconstruct the final output image. Segmenting the planes in this manner can improve the compression of the image because the data can be arranged such that the planes are smoother and more compressible than the original image. Segmentation also allows different compression methods to be applied to the different planes, so that the compression technique most appropriate for the data residing on each plane can be applied to it.
- From a document interchange perspective, the Mixed Raster Content (MRC) imaging model enables exemplary representation of basic document structures. Its intent is to facilitate high compression by segmenting a document image into a number of regions according to compression type. For example, text pixels are extracted and encoded with ITU-T G4 or JBIG2. Background and pictures are extracted and compressed with JPEG (perhaps at differing quantization levels). Thus a document image is partitioned into a number of regions according to appropriate compression schemes. But MRC can also describe a basic “functional” decomposition of the image: text, background, photographs, and graphics, which can be used for subsequent processing. For example, text can be “OCRed” (Optical Character Recognition) or photographs color corrected for different display media.
- Central to the optimization of MRC is the segmentation of the document. The segmentation needs to be robust and adaptive to a multitude of scanners while minimizing “show through” from the backside of the scanned sheet. It also must be simple and fast, making it amenable to software execution. Finally, it should reduce much of the document analysis problem to processing binary images.
- U.S. Pat. No. 6,400,844, to Fan et al., discloses an improved technique for compressing a color or gray scale pixel map representing a document using an MRC format. The technique includes a method of segmenting an original pixel map into two planes, and then compressing the data of each plane in an efficient manner. The image is segmented by separating the image into two portions at the edges. One plane contains image data for the dark sides of the edges, while image data for the bright sides of the edges and the smooth portions of the image are placed on the other plane. This results in improved image compression ratios and enhanced image quality.
- The above is herein incorporated by reference in its entirety for its teaching.
- Therefore, as discussed above, there exists a need for a methodology to minimize the impact of segmentation on the operation of MRC or other scan systems, yet remain robust and adaptive to a multitude of scanners, while reducing much of the document analysis problem to that of processing binary images. Thus, it would be desirable to solve this and other deficiencies and disadvantages with an improved methodology for color document image segmentation.
- The present invention relates to a method for creating a decision surface in 3D color space by determining a parametric model of foreground and background pixel distributions; estimating parametric model parameters from the foreground and background pixel distributions; and computing a decision surface from the parametric model parameters.
- In particular, the present invention relates to a method for segmenting image data pixels in 3D color space comprising sampling a subset of the pixels in the image data, determining a parametric model of foreground and background pixel distributions from the subset of pixels, and estimating parametric model parameters from the foreground and background pixel distributions. This allows computing a decision surface from the parametric model parameters so as to compare all image data pixels against the decision surface, and determine as per the comparing step if a given data pixel is above or below the decision surface.
- The present invention also relates to a method for adaptive color document segmentation comprising reading a raster image into memory, converting the raster image into L*a*b* color space, and sampling a subset of pixels at uniformly distributed points in the image. This allows determining a parametric model of foreground and background pixel distributions from the subset of pixels, estimating parametric model parameters from the resultant foreground and background pixel distributions, and computing a decision surface from the parametric model parameters. That in turn allows comparing all image pixels against the decision surface, determining as per the comparing step if a given image pixel is above or below the decision surface, and sorting the given image pixel into a foreground mask or a background mask as dependent upon the determination of being below or above the decision surface. Then a single bit in a selector mask is set for each pixel location as per the determination made in the determination step.
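- The sampling step can be sketched as below. The text says only that the points are uniformly distributed; taking them on a regular grid is an assumption made here (random uniform sampling would be another reading), and the function name `sample_uniform_grid` is hypothetical.

```python
import math

def sample_uniform_grid(image, n_samples=2000):
    """Return roughly n_samples pixels taken on a regular grid over the image.

    image is a list of rows of pixel values (for example L*a*b* tuples).
    The grid step is chosen so that about n_samples points cover the page.
    """
    height, width = len(image), len(image[0])
    step = max(1, int(math.sqrt(height * width / n_samples)))
    # Start half a step in so the grid is centred rather than hugging the edge.
    return [image[y][x]
            for y in range(step // 2, height, step)
            for x in range(step // 2, width, step)]
```

- Only the sampled pixels are used to estimate the model parameters; every pixel of the image is then compared against the resulting decision surface, so the cost of the estimation step does not grow with the scan resolution.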
- FIG. 1 illustrates a composite image and includes an example of how such an image may be decomposed into three MRC image planes—an upper plane, a lower plane, and a selector plane.
- FIG. 2 contains a detailed view of a pixel map and the manner in which pixels are grouped to form blocks.
- FIG. 3A shows two 3D distributions and decision surface in L*a*b* color space.
- FIG. 3B shows a 2D slice through the distributions and decision surface of FIG. 3A.
- FIG. 4 provides a flow chart for recursive document image segmentation.
- The present invention is directed to a method for segmenting the various types of image data contained in a composite color document image. While the invention will be described in the context of a Mixed Raster Content (MRC) technique, it may be adapted for use with other methods and apparatus and is not, therefore, limited to an MRC format. The technique described herein is suitable for use in various devices required for storing or transmitting documents, such as facsimile devices, image storage devices and the like, and processing of both color and grayscale (black and white) images is possible.
- A pixel map is one in which each discrete location on the page contains a picture element or “pixel” that emits a light signal with a value that indicates the color or, in the case of gray scale documents, how light or dark the image is at that location. As those skilled in the art will appreciate, most pixel maps have values that are taken from a set of discrete, non-negative integers.
- For example, in a pixel map for a color document, individual separations are often represented as digital values, often in the range 0 to 255, where 0 represents no colorant and 255 represents maximum colorant. For example, in the RGB color space, (0, 0, 0) represents an additive mixture of no red, no green, and no blue, hence (0, 0, 0) represents black; (0, 255, 0) represents no red, maximum green, and no blue, hence (0, 255, 0) represents green; and (128, 128, 128) represents an additive mixture of equal, medium amounts of red, green, and blue, hence (128, 128, 128) represents a medium gray. Many other color spaces are used in the art to represent colors, including L*a*b*, L*u*v*, and YCbCr. Each has its particular advantage in a particular imaging system (e.g., copiers, printers, CRTs, television transmission). Transformation from one color space to another is routine in the art and is performed using mathematical operations embodied in computer hardware or software. The three separation values of each pixel represent the coordinates of a point in 3D space. The pixel maps of concern in a preferred embodiment of the present invention are representations of “scanned” images, that is, images which are created by digitizing light reflected off of physical media using a digital scanner. The term bitmap is used to mean a binary pixel map in which pixels can take one of two values, 1 or 0.
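- As an illustration of such a color space transformation, the sketch below converts an 8-bit RGB triple to L*a*b*. The source does not say which RGB encoding or white point a scanner would use; sRGB primaries and the D65 white point are assumed here, and the function name `srgb_to_lab` is hypothetical.

```python
def srgb_to_lab(r, g, b):
    """Convert 8-bit sRGB values to CIE L*a*b* (assumes a D65 white point)."""

    def linearize(c):
        # Undo the sRGB transfer curve to get linear light in [0, 1].
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear sRGB to CIE XYZ.
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # Reference white (D65).
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        # Cube root with the linear segment CIE defines near zero.
        delta = 6.0 / 29.0
        return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)
```

- Under these assumptions black maps to L* = 0, white to approximately L* = 100, and any neutral gray to a* and b* near zero, which is why lightness alone separates dark text from a light page in many documents.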
- Turning now to the drawings for a more detailed description of the MRC format, pixel map 10 representing a color or gray-scale document is preferably decomposed into a three plane page format as indicated in FIG. 1. Pixels on pixel map 10 are preferably grouped in blocks 18 (best viewed in FIG. 2) to allow for better image processing efficiency. The document format is typically comprised of an upper plane 12, a lower plane 14, and a selector plane 16. Upper plane 12 and lower plane 14 contain pixels that describe the original image data, wherein pixels in each block 18 have been separated based upon pre-defined criteria. For example, pixels that have values above a certain threshold are placed on one plane, while those with values that are equal to or below the threshold are placed on the other plane. Selector plane 16 keeps track of every pixel in original pixel map 10 and maps all pixels to an exact spot on either upper plane 12 or lower plane 14.
- The upper and lower planes are stored at the same bit depth and number of colors as the original pixel map 10, but possibly at reduced resolution. Selector plane 16 is created and stored as a bitmap. It is important to recognize that while the terms “upper” and “lower” are used to describe the planes on which data resides, it is not intended to limit the invention to any particular arrangement or configuration.
- After processing, all three planes are compressed using a method suitable for the type of data residing thereon. For example, upper plane 12 and lower plane 14 may be compressed and stored using a lossy compression technique such as JPEG, while selector plane 16 is compressed and stored using a lossless compression format such as gzip or CCITT-G4. It would be apparent to one of skill in the art to compress and store the planes using other formats that are suitable for the intended use of the output document. For example, in the Color Facsimile arena, group 4 (MMR) would preferably be used for selector plane 16, since the particular compression format used must be one of the approved formats (MMR, MR, MH, JPEG, JBIG, etc.) for facsimile data transmission.
- In the present invention digital image data is preferably processed using a MRC technique such as described above.
Pixel map 10 represents a scanned image composed of light intensity signals dispersed throughout the separation at discrete locations. Again, a light signal is emitted from each of these discrete locations, referred to as “picture elements,” “pixels” or “pels,” at an intensity level which indicates the magnitude of the light being reflected from the original image at the corresponding location in that separation.
- Central to the present invention is a segmentation system utilizing an expectation-maximization algorithm to fit a mixture of three-dimensional gaussians to L*a*b* pixel samples. From the estimated densities and proportionality parameter, a quadratic decision boundary is calculated and applied to every pixel in the image. A binary selector plane is maintained that assigns one to the selector pixel value if the pixel is foreground and zero otherwise (background). The component distribution with the greater luminance is assigned the role of a background prototype. This process is essentially 3D thresholding. If the estimated means are close together in Euclidean distance, or if the estimated proportionality parameter is near zero or one, the samples fail to exhibit a clear mixture: the sample is homogeneous or is not well fitted by a mixture of 3D gaussians. In that case, a segmentation attempt is made using only the L* channel, with a mixture of 1D gaussians. Again, if the estimated means are close or the estimated proportionality parameter is close to zero or one, the segmenter reports that the document image cannot be segmented.
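- The quadratic decision boundary referred to above can be written out explicitly. The following is the standard log-ratio derivation for a two-component gaussian mixture, offered as an aid and not taken from the specification. It assumes the foreground component carries the weight $\hat{\alpha}$ and uses subscripts $f$ and $b$ for the foreground and background components; the symbols $g$, $A$, $w$ and $c$ are introduced here for illustration only.

```latex
\begin{aligned}
g(x) &= \ln\bigl[\hat{\alpha}\, f(x;\hat{\mu}_f,\hat{\Sigma}_f)\bigr]
      - \ln\bigl[(1-\hat{\alpha})\, f(x;\hat{\mu}_b,\hat{\Sigma}_b)\bigr]
      = x^{\top} A\, x + w^{\top} x + c, \\
A &= \tfrac{1}{2}\bigl(\hat{\Sigma}_b^{-1} - \hat{\Sigma}_f^{-1}\bigr), \\
w &= \hat{\Sigma}_f^{-1}\hat{\mu}_f - \hat{\Sigma}_b^{-1}\hat{\mu}_b, \\
c &= \tfrac{1}{2}\bigl(\hat{\mu}_b^{\top}\hat{\Sigma}_b^{-1}\hat{\mu}_b
      - \hat{\mu}_f^{\top}\hat{\Sigma}_f^{-1}\hat{\mu}_f\bigr)
      + \tfrac{1}{2}\ln\frac{|\hat{\Sigma}_b|}{|\hat{\Sigma}_f|}
      + \ln\frac{\hat{\alpha}}{1-\hat{\alpha}}.
\end{aligned}
```

A pixel $x$ is labeled foreground when $g(x) \ge 0$ and background otherwise, so the surface $g(x) = 0$ is a quadric in L*a*b* space. The common factor $(2\pi)^{-3/2}$ cancels in the ratio, and if the two covariance estimates happen to be equal, $A = 0$ and the boundary reduces to a plane.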
- FIG. 3A is a simplified depiction of the above description provided as an aid in the visualization of the methodology employed. FIG. 3A is an example of when the samples exhibit a well-fitted mixture of 3D gaussians 30 and 31. A binary selector plane 32 is maintained, which allows expeditious thresholding of the remainder of the document page. FIG. 3B is a 2D slice of FIG. 3A to aid in further visually clarifying the relationship of sample pixel gaussians 30 and 31 and resultant binary selector 32.
- Next, the selector is processed to find connected components by first doing a morphological opening and then a closing. Large connected components are extracted as objects and output as foreground/mask pairs. The segmented document image is now ready for subsequent processing. The objects may be smoothed or enhanced according to image type, the selector plane subjected to further analysis as a binary document image, etc. Also, one may compress the image according to the TIFF-FX profile M standard or variant.
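- The extraction of large connected components from the selector can be sketched as follows. This is an illustrative sketch only: it assumes the mask is a list of rows of 0/1 values, uses 4-connectivity (the specification does not state the connectivity), and the names `connected_components`, `large_components` and `min_area` are hypothetical.

```python
from collections import deque


def connected_components(mask):
    """Return the 4-connected regions of 1-pixels as lists of (row, col)."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    components = []
    for y in range(height):
        for x in range(width):
            if mask[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited "1" pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def large_components(mask, min_area):
    """Keep only components whose pixel count exceeds min_area."""
    return [c for c in connected_components(mask) if len(c) > min_area]


selector = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
objects = large_components(selector, min_area=2)
```

Here the 2x2 block survives as a candidate object while the isolated pixel is left in the selector, mirroring the split between windows and residual mask content.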
- Expectation-Maximization (EM) is a general technique for obtaining maximum-likelihood estimates (mles) when data are missing. The seminal paper is A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm (with discussion), Journal of the Royal Statistical Society B, 39, pp. 1-38 (1977), and a recent comprehensive treatment is G. J. McLachlan and T. Krishnan, The EM Algorithm and Extensions, Wiley, New York (1997), both of which are herein incorporated by reference for their teaching. The mixture-of-gaussians (MoG) estimation problem is a straightforward and intuitive application of EM.
- There are other approaches to this problem. Estimating the MoG can be thought of as unsupervised pattern recognition.
- The EM algorithm provides an iterative and intuitive method to produce mles.
- and covariance mles omitted for brevity.
- The first step in the EM algorithm is to initialize parameter estimates, $\hat{\alpha}^{(0)}, \hat{\mu}_1^{(0)}, \hat{\Sigma}_1^{(0)}, \hat{\mu}_2^{(0)}, \hat{\Sigma}_2^{(0)}$. The next step, the “E-step,” is to use equation (5) to get estimates of the $z_{ij}$. The next step, the “M-step,” is to use these estimates of the $z_{ij}$ and the original data in equations (3) and (4) to get updated mles of the parameters. The algorithm iterates these two steps until some measure of convergence is achieved (typically, updated parameter estimates differ little from previous ones, or the likelihood value stabilizes). That's essentially all there is to it for mixture-of-gaussians (MoG). The fact that such a simple and intuitive method works under general conditions makes it an important tool in late 20th century statistics.
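- The E-step/M-step iteration described above can be sketched for the simplest case, a two-component 1D mixture such as the L*-only fallback. This is a minimal sketch using the standard textbook updates, not a reproduction of the specification's numbered equations; the function names, the variance floor and the convergence tolerance are assumptions made for illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a 1D gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(data, alpha, mu1, var1, mu2, var2,
                     max_iter=200, tol=1e-9, var_floor=1e-6):
    """Fit alpha*N(mu1, var1) + (1 - alpha)*N(mu2, var2) to 1D data by EM."""
    n = len(data)
    prev_loglik = None
    for _ in range(max_iter):
        # E-step: posterior probability that each sample came from component 1.
        resp = []
        loglik = 0.0
        for x in data:
            p1 = alpha * normal_pdf(x, mu1, var1)
            p2 = (1.0 - alpha) * normal_pdf(x, mu2, var2)
            resp.append(p1 / (p1 + p2))
            loglik += math.log(p1 + p2)
        w1 = sum(resp)
        w2 = n - w1
        if w1 < 1e-12 or w2 < 1e-12:
            break  # one component vanished; the caller should treat this as a failed fit
        # M-step: responsibility-weighted maximum-likelihood updates.
        alpha = w1 / n
        mu1 = sum(r * x for r, x in zip(resp, data)) / w1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / w2
        var1 = max(sum(r * (x - mu1) ** 2 for r, x in zip(resp, data)) / w1, var_floor)
        var2 = max(sum((1.0 - r) * (x - mu2) ** 2 for r, x in zip(resp, data)) / w2, var_floor)
        # Stop when the log-likelihood stabilizes.
        if prev_loglik is not None and abs(loglik - prev_loglik) < tol:
            break
        prev_loglik = loglik
    return alpha, mu1, var1, mu2, var2


# Toy L* samples: four dark (text-like) and six light (paper-like) values.
samples = [19, 20, 21, 20, 79, 80, 81, 80, 80, 80]
alpha_hat, mu_dark, var_dark, mu_light, var_light = em_two_gaussians(
    samples, alpha=0.5, mu1=30.0, var1=100.0, mu2=70.0, var2=100.0)
```

The sketch has no protection against numerical underflow for samples far from both means; a production version would typically work with log-densities. The 3D case replaces the scalar variance with a covariance matrix but keeps the same two-step structure.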
- Document image segmentation may be done for a number of reasons. Recently, there has been interest in segmenting a document image for compression. In this case, segmentation classes are compression classes, i.e., regions amenable to compression with appropriate algorithms: text with ITU-T Group 4 (MMR) and color images with JPEG. One advantage of this approach is that one avoids compressing text with JPEG where it is known to produce ringing and mosquito noise. One can also use segmentation to find rendering classes, e.g., halftone regions to be descreened, text to be sharpened, and photos to be enhanced.
- Mixed raster content is an imaging model directed toward facilitating compression, yet it can be used as a “carrier” for documents segmented for rendering or layout analysis.
- Formally, we represent a color image as a mapping from a raster to a triplet of 8-bit colors:
- $I : [m_x, n_x] \times [m_y, n_y] \to [0, 255]^3$
- where $0 \le m_x < n_x$ and $0 \le m_y < n_y$. A 3-plane mixed raster content representation uses a mask $M_0$ to separate background and foreground content. Let $m_x = m_y = 0$ and
- $M_0 : [0, n_x] \times [0, n_y] \to \{0, 1\}$
- be a binary mask where $n_x$ and $n_y$ represent the complete extent of the image raster. Let
- $FG_0, BG_0 : [0, n_x] \times [0, n_y] \to [0, 255]^3$
- be foreground and background images, respectively. A 3-plane MRC document image representation is
- $I(x, y) = (1 - M_0(x, y))\, BG_0(x, y) + M_0(x, y)\, FG_0(x, y)$
- for $(x, y) \in [0, n_x] \times [0, n_y]$.
- Essentially, a (vector) pixel value is selected from the background, if the mask is zero, and from the foreground if the mask is one. One can view the imaging operation as pouring the foreground through a mask onto the background.
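- The selection rule just described can be sketched in a few lines. This is an illustrative sketch, assuming all three planes are stored as equally sized lists of rows (the reduced-resolution case is not handled); the function name `compose_mrc` is hypothetical.

```python
def compose_mrc(mask, background, foreground):
    """Rebuild the image: take FG where the mask is 1, BG where it is 0."""
    return [
        [fg if m == 1 else bg for m, bg, fg in zip(mask_row, bg_row, fg_row)]
        for mask_row, bg_row, fg_row in zip(mask, background, foreground)
    ]


WHITE, BLACK = (255, 255, 255), (0, 0, 0)
mask = [[0, 1], [1, 0]]
background = [[WHITE, WHITE], [WHITE, WHITE]]
foreground = [[BLACK, BLACK], [BLACK, BLACK]]
image = compose_mrc(mask, background, foreground)
```

Because the mask is binary, the arithmetic form (1 - M) * BG + M * FG and this per-pixel selection give the same result.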
- We also need the concept of an object, which is a foreground/mask pair meant to represent a photograph or graphic. An object foreground is an image FGi and a mask Mi:
- $FG_i : [m_{ix}, n_{ix}] \times [m_{iy}, n_{iy}] \to [0, 255]^3$
- $M_i : [m_{ix}, n_{ix}] \times [m_{iy}, n_{iy}] \to \{0, 1\}$
- where $0 \le m_{ix} < n_{ix} \le n_x$ and $0 \le m_{iy} < n_{iy} \le n_y$.
- This decomposition is by no means unique and there are others more appropriate for compression.
- An exemplary segmentation methodology comprises:
- 1) Read a raster image into memory
- 2) Convert it to L*a*b*
- 3) Sample the image at a number of uniformly distributed points
- 4) Use the Expectation-Maximization (EM) algorithm to estimate a mixture parameter, two 3D means and the covariance matrices, $\hat{\alpha}, \hat{\mu}_f, \hat{\Sigma}_f, \hat{\mu}_b, \hat{\Sigma}_b$, presumably representing foreground and background gaussians; i.e., the data are fit with $\alpha f(x; \mu_f, \Sigma_f) + (1-\alpha) f(x; \mu_b, \Sigma_b)$, where $x = (l^*, a^*, b^*)$ at a point. This is done to yield a quadratic decision surface 32.
- 5) Compare each image pixel to the decision surface 32 and thereby separate each pixel into a foreground or background plane, while also capturing that steering decision into a selector mask plane. If $\|\hat{\mu}_b(l^*) - \hat{\mu}_f(l^*)\| > t$ and $s_1 \le \hat{\alpha} \le s_2$, then foreground and background are well separated in L*a*b*:
- a. For each pixel $x$ in the image, if $\hat{\alpha} f(x; \hat{\mu}_f, \hat{\Sigma}_f) < (1-\hat{\alpha}) f(x; \hat{\mu}_b, \hat{\Sigma}_b)$, put $x$ in the background and put a “0” in the mask $M_0$ at that point; else put $x$ in the foreground and put a “1” in the mask $M_0$ at that point.
- b. Make a copy S of the mask M0.
- c. Convert S to horizontal run-lengths and do a closing with a horizontal element (this closes small gaps)
- d. Convert S to vertical run-lengths and do a closing with a vertical element (this closes small gaps)
- e. Convert S to horizontal run-lengths and do an opening with a horizontal element (this smoothes window boundaries)
- f. Convert S to vertical run-lengths and do an opening with a vertical element (this smoothes window boundaries)
- g. Convert S to connected components.
- h. For each connected component Mi larger than a variable “thresh” in area
- i. Remove Mi from M0
- ii. Mask out Mi from FG0 making FG0 white where Mi is “1” and copying those pixels to a new object foreground FGi
- iii. Fill the holes in Mi by
- 1. Finding small connected components in Mi of “0”-valued pixels
- 2. Painting those connected components “1”.
- iv. Output the found object as a foreground/mask pair (FGi,Mi)
- i. Output the background BG0, the mask (selector) M0, and foreground FG0
- 6) If $\|\hat{\mu}_b(l^*) - \hat{\mu}_f(l^*)\| \le t$ and $s_1 \le \hat{\alpha} \le s_2$, then fit a 1D mixture of gaussians to the L* values and perform step 5 (which can be reduced to a simple threshold operation).
- 7) Else the data form one gaussian blob or the EM algorithm failed to return a reasonable estimate; return the original image as BG0.
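- The separation test and the per-pixel comparison of steps 5 to 7 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the specification's implementation: the foreground component is taken to carry the weight alpha, the separation test uses the Euclidean distance between the full 3D means, and all names and threshold values (`well_separated`, `make_scorer`, `classify`, `t`, `s1`, `s2`) are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested sequences."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate divided by the determinant."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = det3(m)
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def make_scorer(weight, mean, cov):
    """Return x -> log(weight * N(x; mean, cov)), dropping the shared 2*pi constant."""
    inv = inv3(cov)
    const = math.log(weight) - 0.5 * math.log(det3(cov))

    def score(x):
        diff = [xi - mi for xi, mi in zip(x, mean)]
        maha = sum(diff[r] * inv[r][c] * diff[c] for r in range(3) for c in range(3))
        return const - 0.5 * maha

    return score


def well_separated(mu_f, mu_b, alpha, t, s1, s2):
    """Means far enough apart and mixture weight away from 0 and 1."""
    return math.dist(mu_f, mu_b) > t and s1 <= alpha <= s2


def classify(pixels, alpha, mu_f, cov_f, mu_b, cov_b):
    """Return mask values per pixel: 1 for foreground, 0 for background."""
    fg_score = make_scorer(alpha, mu_f, cov_f)
    bg_score = make_scorer(1.0 - alpha, mu_b, cov_b)
    return [0 if fg_score(x) < bg_score(x) else 1 for x in pixels]


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
mu_f, mu_b = (20.0, 0.0, 0.0), (90.0, 0.0, 0.0)
pixels = [(25.0, 1.0, -1.0), (85.0, 0.0, 2.0)]
mask_values = classify(pixels, 0.3, mu_f, identity, mu_b, identity)
```

Comparing log-densities avoids underflow for pixels far from both means and is equivalent to the comparison in step 5a, since the logarithm is monotonic.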
- Turning now to FIG. 4 there is depicted a flow chart for employing the segmentation methodology described above into a Mixed Raster Content embodiment. As shown with start block 400, initially a document page is scanned. A raster image is read in and converted to yield a L*a*b* image. At block 410 the adaptive image segmenter is employed as previously described above. To recapitulate the segmenter methodology: a uniform sampling of pixels across the image is taken; the number of samples may vary but in one preferred embodiment 2000 samples are employed; Expectation-Maximization is applied to the sample pixel data to yield an estimate of parametric model parameters comprising a mixture parameter, two 3D means and corresponding covariance matrices; a quadratic decision surface is computed from the parametric model parameters; this quadratic decision surface is employed as a binary selector plane and each document image data pixel is then compared against the decision surface to determine each pixel as designated either background or foreground; if as a result of that comparison a foreground and background are indeed found at decision block 420, the pixel by pixel designation determination from the comparison is used to create a binary mask plane block 470, else the methodology is complete as indicated with end-block 460.
- In block 480 the binary mask plane is converted into run lengths, cleaned using morphological open and close operations, and regions larger than a given threshold are merged. Large connected components are reserved as windows and are used to mask out portions of the preliminary foreground 450. The reserved large connected components are subtracted out from the preliminary foreground and the mask plane. The initial result is a background plane 430, a mask plane 440, and a preliminary foreground plane 450. The reserved large connected components are reiteratively processed (as just described above) starting again at block 410 through to block 480, to yield any “n” number of foreground/mask pairs 490, 500, until no further pairs are found, as determined at decision block 420. The methodology is then complete as indicated with end-block 460.
- It may be desirable or otherwise advantageous to replace all the pixel values in a background mask with an average value. This will help suppress show through artifacts, such as are typical when scanning duplex originals where backside images are visible from the front side.
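- The background averaging suggested above can be sketched as follows. This is an illustrative sketch only: it assumes the image and mask are equally sized lists of rows, that mask value 0 marks background, and that a single global mean color is acceptable; the function name `flatten_background` is hypothetical.

```python
def flatten_background(image, mask):
    """Replace every background pixel (mask == 0) with the mean background color."""
    bg_pixels = [px
                 for img_row, m_row in zip(image, mask)
                 for px, m in zip(img_row, m_row) if m == 0]
    if not bg_pixels:
        # Nothing is marked as background; return an unchanged copy.
        return [list(row) for row in image]
    n = len(bg_pixels)
    mean = tuple(round(sum(px[ch] for px in bg_pixels) / n) for ch in range(3))
    return [[mean if m == 0 else px for px, m in zip(img_row, m_row)]
            for img_row, m_row in zip(image, mask)]


page = [
    [(250, 250, 250), (10, 10, 10)],
    [(240, 240, 240), (245, 245, 245)],
]
page_mask = [[0, 1], [0, 0]]
cleaned = flatten_background(page, page_mask)
```

Faint show-through appears as small deviations in the background values, so collapsing them to one color removes the artifact and also makes the background plane highly compressible.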
- In closing, by providing a methodology to minimize the impact of segmentation on the operation of MRC or other scan systems, there is provided an approach robust and adaptive to a multitude of scanners, which also reduces the document analysis problem to that of processing binary images. The above methodology may also be combined with other processing steps such as compression, hints generation, and object classification.
- While the embodiments disclosed herein are preferred, it will be appreciated from this teaching that various alternative modifications, variations or improvements therein may be made by those skilled in the art. All such variants are intended to be encompassed by the following claims:
Claims (19)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/299,534 US20040096102A1 (en) | 2002-11-18 | 2002-11-18 | Methodology for scanned color document segmentation |
JP2003386426A JP2004173276A (en) | 2002-11-18 | 2003-11-17 | Decision surface preparation method, image data pixel classifying method, and collar document classifying method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/299,534 US20040096102A1 (en) | 2002-11-18 | 2002-11-18 | Methodology for scanned color document segmentation |
Publications (1)
Publication Number | Publication Date |
---|---|
US20040096102A1 true US20040096102A1 (en) | 2004-05-20 |
Family
ID=32297719
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/299,534 Abandoned US20040096102A1 (en) | 2002-11-18 | 2002-11-18 | Methodology for scanned color document segmentation |
Country Status (2)
Country | Link |
---|---|
US (1) | US20040096102A1 (en) |
JP (1) | JP2004173276A (en) |
Cited By (40)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060115169A1 (en) * | 2004-12-01 | 2006-06-01 | Ohk Hyung-Soo | Apparatus for compressing document and method thereof |
US20060245003A1 (en) * | 2005-04-28 | 2006-11-02 | Xerox Corporation | Method and system for sending material |
US20070018995A1 (en) * | 2005-07-20 | 2007-01-25 | Katsuya Koyanagi | Image processing apparatus |
US20070092140A1 (en) * | 2005-10-20 | 2007-04-26 | Xerox Corporation | Document analysis systems and methods |
US20070146830A1 (en) * | 2005-12-22 | 2007-06-28 | Xerox Corporation | Matching the perception of a digital image data file to a legacy hardcopy |
US20070206857A1 (en) * | 2006-03-02 | 2007-09-06 | Richard John Campbell | Methods and Systems for Detecting Pictorial Regions in Digital Images |
US20070206855A1 (en) * | 2006-03-02 | 2007-09-06 | Sharp Laboratories Of America, Inc. | Methods and systems for detecting regions in digital images |
EP1831823A1 (en) * | 2004-12-21 | 2007-09-12 | Canon Kabushiki Kaisha | Segmenting digital image and producing compact representation |
US20070253040A1 (en) * | 2006-04-28 | 2007-11-01 | Eastman Kodak Company | Color scanning to enhance bitonal image |
US20070291120A1 (en) * | 2006-06-15 | 2007-12-20 | Richard John Campbell | Methods and Systems for Identifying Regions of Substantially Uniform Color in a Digital Image |
US20080056573A1 (en) * | 2006-09-06 | 2008-03-06 | Toyohisa Matsuda | Methods and Systems for Identifying Text in Digital Images |
US20080231752A1 (en) * | 2007-03-22 | 2008-09-25 | Imatte, Inc. | Method for generating a clear frame from an image frame containing a subject disposed before a backing of nonuniform illumination |
US20080273807A1 (en) * | 2007-05-04 | 2008-11-06 | I.R.I.S. S.A. | Compression of digital images of scanned documents |
US20090041344A1 (en) * | 2007-08-08 | 2009-02-12 | Richard John Campbell | Methods and Systems for Determining a Background Color in a Digital Image |
US20090046931A1 (en) * | 2007-08-13 | 2009-02-19 | Jing Xiao | Segmentation-based image labeling |
US20090110320A1 (en) * | 2007-10-30 | 2009-04-30 | Campbell Richard J | Methods and Systems for Glyph-Pixel Selection |
US20090269300A1 (en) * | 2005-08-24 | 2009-10-29 | Bruce Lawrence Finkelstein | Anthranilamides for Controlling Invertebrate Pests |
US20090304303A1 (en) * | 2008-06-04 | 2009-12-10 | Microsoft Corporation | Hybrid Image Format |
US7675646B2 (en) | 2005-05-31 | 2010-03-09 | Xerox Corporation | Flexible print data compression |
US20100142820A1 (en) * | 2008-12-05 | 2010-06-10 | Xerox Corporation | 3 + 1 layer mixed raster content (mrc) images having a black text layer |
US20100142806A1 (en) * | 2008-12-05 | 2010-06-10 | Xerox Corporation | 3 + 1 layer mixed raster content (mrc) images having a text layer and processing thereof |
US7792359B2 (en) | 2006-03-02 | 2010-09-07 | Sharp Laboratories Of America, Inc. | Methods and systems for detecting regions in digital images |
US7864365B2 (en) | 2006-06-15 | 2011-01-04 | Sharp Laboratories Of America, Inc. | Methods and systems for segmenting a digital image into regions |
US20110069885A1 (en) * | 2009-09-22 | 2011-03-24 | Xerox Corporation | 3+n layer mixed rater content (mrc) images and processing thereof |
US20110304861A1 (en) * | 2010-06-14 | 2011-12-15 | Xerox Corporation | Colorimetric matching the perception of a digital data file to hardcopy legacy |
US8218875B2 (en) * | 2010-06-12 | 2012-07-10 | Hussein Khalid Al-Omari | Method and system for preprocessing an image for optical character recognition |
US8300890B1 (en) * | 2007-01-29 | 2012-10-30 | Intellivision Technologies Corporation | Person/object image and screening |
US8325394B2 (en) | 2010-05-28 | 2012-12-04 | Xerox Corporation | Hierarchical scanner characterization |
US20130191082A1 (en) * | 2011-07-22 | 2013-07-25 | Thales | Method of Modelling Buildings on the Basis of a Georeferenced Image |
US8694332B2 (en) | 2010-08-31 | 2014-04-08 | Xerox Corporation | System and method for processing a prescription |
US8855414B1 (en) * | 2004-06-30 | 2014-10-07 | Teradici Corporation | Apparatus and method for encoding an image generated in part by graphical commands |
CN105389769A (en) * | 2015-11-05 | 2016-03-09 | 欧阳春娟 | Improved steganography method for optimizing decision surface |
US20160163059A1 (en) * | 2014-12-04 | 2016-06-09 | Fujitsu Limited | Image processing device and method |
CN106611425A (en) * | 2016-12-19 | 2017-05-03 | 辽宁工程技术大学 | Panchromatic remote sensing image segmentation method |
US20190266433A1 (en) * | 2018-02-27 | 2019-08-29 | Intuit Inc. | Method and system for background removal from documents |
CN110335280A (en) * | 2019-07-05 | 2019-10-15 | 湖南联信科技有限公司 | A kind of financial documents image segmentation and antidote based on mobile terminal |
US10990423B2 (en) | 2015-10-01 | 2021-04-27 | Microsoft Technology Licensing, Llc | Performance optimizations for emulators |
US11042422B1 (en) | 2020-08-31 | 2021-06-22 | Microsoft Technology Licensing, Llc | Hybrid binaries supporting code stream folding |
US11231918B1 (en) | 2020-08-31 | 2022-01-25 | Microsoft Technologly Licensing, LLC | Native emulation compatible application binary interface for supporting emulation of foreign code |
US11403100B2 (en) | 2020-08-31 | 2022-08-02 | Microsoft Technology Licensing, Llc | Dual architecture function pointers having consistent reference addresses |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7542164B2 (en) * | 2004-07-14 | 2009-06-02 | Xerox Corporation | Common exchange format architecture for color printing in a multi-function system |
JP4726040B2 (en) * | 2005-01-31 | 2011-07-20 | 株式会社リコー | Encoding processing device, decoding processing device, encoding processing method, decoding processing method, program, and information recording medium |
JP2007189275A (en) * | 2006-01-11 | 2007-07-26 | Ricoh Co Ltd | Image processor |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5327262A (en) * | 1993-05-24 | 1994-07-05 | Xerox Corporation | Automatic image segmentation with smoothing |
US5341226A (en) * | 1993-04-22 | 1994-08-23 | Xerox Corporation | Automatic image segmentation for color documents |
US5555556A (en) * | 1994-09-30 | 1996-09-10 | Xerox Corporation | Method and apparatus for document segmentation by background analysis |
US5745596A (en) * | 1995-05-01 | 1998-04-28 | Xerox Corporation | Method and apparatus for performing text/image segmentation |
US5802203A (en) * | 1995-06-07 | 1998-09-01 | Xerox Corporation | Image segmentation using robust mixture models |
US5850474A (en) * | 1996-07-26 | 1998-12-15 | Xerox Corporation | Apparatus and method for segmenting and classifying image data |
US6181829B1 (en) * | 1998-01-21 | 2001-01-30 | Xerox Corporation | Method and system for classifying and processing of pixels of image data |
US6229923B1 (en) * | 1998-01-21 | 2001-05-08 | Xerox Corporation | Method and system for classifying and processing of pixels of image data |
US6298151B1 (en) * | 1994-11-18 | 2001-10-02 | Xerox Corporation | Method and apparatus for automatic image segmentation using template matching filters |
US6400844B1 (en) * | 1998-12-02 | 2002-06-04 | Xerox Corporation | Method and apparatus for segmenting data to create mixed raster content planes |
US20030137506A1 (en) * | 2001-11-30 | 2003-07-24 | Daniel Efran | Image-based rendering for 3D viewing |
US20030198386A1 (en) * | 2002-04-19 | 2003-10-23 | Huitao Luo | System and method for identifying and extracting character strings from captured image data |
US20040001612A1 (en) * | 2002-06-28 | 2004-01-01 | Koninklijke Philips Electronics N.V. | Enhanced background model employing object classification for improved background-foreground segmentation |
US6798977B2 (en) * | 1998-02-04 | 2004-09-28 | Canon Kabushiki Kaisha | Image data encoding and decoding using plural different encoding circuits |
Cited By (74)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8855414B1 (en) * | 2004-06-30 | 2014-10-07 | Teradici Corporation | Apparatus and method for encoding an image generated in part by graphical commands |
US20060115169A1 (en) * | 2004-12-01 | 2006-06-01 | Ohk Hyung-Soo | Apparatus for compressing document and method thereof |
EP1831823A1 (en) * | 2004-12-21 | 2007-09-12 | Canon Kabushiki Kaisha | Segmenting digital image and producing compact representation |
US7991224B2 (en) | 2004-12-21 | 2011-08-02 | Canon Kabushiki Kaisha | Segmenting digital image and producing compact representation |
EP1831823A4 (en) * | 2004-12-21 | 2009-06-24 | Canon Kk | Segmenting digital image and producing compact representation |
US20060245003A1 (en) * | 2005-04-28 | 2006-11-02 | Xerox Corporation | Method and system for sending material |
US7483179B2 (en) | 2005-04-28 | 2009-01-27 | Xerox Corporation | Method and system for sending material |
US7675646B2 (en) | 2005-05-31 | 2010-03-09 | Xerox Corporation | Flexible print data compression |
US20070018995A1 (en) * | 2005-07-20 | 2007-01-25 | Katsuya Koyanagi | Image processing apparatus |
US7840063B2 (en) * | 2005-07-20 | 2010-11-23 | Fuji Xerox Co., Ltd. | Image processing apparatus |
US20090269300A1 (en) * | 2005-08-24 | 2009-10-29 | Bruce Lawrence Finkelstein | Anthranilamides for Controlling Invertebrate Pests |
US8849031B2 (en) * | 2005-10-20 | 2014-09-30 | Xerox Corporation | Document analysis systems and methods |
US20070092140A1 (en) * | 2005-10-20 | 2007-04-26 | Xerox Corporation | Document analysis systems and methods |
US7649650B2 (en) * | 2005-12-22 | 2010-01-19 | Xerox Corporation | Matching the perception of a digital image data file to a legacy hardcopy |
US20070146830A1 (en) * | 2005-12-22 | 2007-06-28 | Xerox Corporation | Matching the perception of a digital image data file to a legacy hardcopy |
US20070206857A1 (en) * | 2006-03-02 | 2007-09-06 | Richard John Campbell | Methods and Systems for Detecting Pictorial Regions in Digital Images |
US7792359B2 (en) | 2006-03-02 | 2010-09-07 | Sharp Laboratories Of America, Inc. | Methods and systems for detecting regions in digital images |
US8630498B2 (en) | 2006-03-02 | 2014-01-14 | Sharp Laboratories Of America, Inc. | Methods and systems for detecting pictorial regions in digital images |
US20070206855A1 (en) * | 2006-03-02 | 2007-09-06 | Sharp Laboratories Of America, Inc. | Methods and systems for detecting regions in digital images |
US7889932B2 (en) | 2006-03-02 | 2011-02-15 | Sharp Laboratories Of America, Inc. | Methods and systems for detecting regions in digital images |
US20070253040A1 (en) * | 2006-04-28 | 2007-11-01 | Eastman Kodak Company | Color scanning to enhance bitonal image |
US8437054B2 (en) | 2006-06-15 | 2013-05-07 | Sharp Laboratories Of America, Inc. | Methods and systems for identifying regions of substantially uniform color in a digital image |
US20070291120A1 (en) * | 2006-06-15 | 2007-12-20 | Richard John Campbell | Methods and Systems for Identifying Regions of Substantially Uniform Color in a Digital Image |
US8368956B2 (en) | 2006-06-15 | 2013-02-05 | Sharp Laboratories Of America, Inc. | Methods and systems for segmenting a digital image into regions |
US7864365B2 (en) | 2006-06-15 | 2011-01-04 | Sharp Laboratories Of America, Inc. | Methods and systems for segmenting a digital image into regions |
US20080056573A1 (en) * | 2006-09-06 | 2008-03-06 | Toyohisa Matsuda | Methods and Systems for Identifying Text in Digital Images |
US8150166B2 (en) | 2006-09-06 | 2012-04-03 | Sharp Laboratories Of America, Inc. | Methods and systems for identifying text in digital images |
US20110110596A1 (en) * | 2006-09-06 | 2011-05-12 | Toyohisa Matsuda | Methods and Systems for Identifying Text in Digital Images |
US7876959B2 (en) | 2006-09-06 | 2011-01-25 | Sharp Laboratories Of America, Inc. | Methods and systems for identifying text in digital images |
US8300890B1 (en) * | 2007-01-29 | 2012-10-30 | Intellivision Technologies Corporation | Person/object image and screening |
US20080231752A1 (en) * | 2007-03-22 | 2008-09-25 | Imatte, Inc. | Method for generating a clear frame from an image frame containing a subject disposed before a backing of nonuniform illumination |
GB2461450A (en) * | 2007-03-22 | 2010-01-06 | Imatte Inc | A method for generating a clear frame from an image frame containing a subject disposed before a backing of nonuniform illumination |
WO2008115533A1 (en) * | 2007-03-22 | 2008-09-25 | Imatte, Inc. | A method for generating a clear frame from an image frame containing a subject disposed before a backing of nonuniform illumination |
US8666185B2 (en) | 2007-05-04 | 2014-03-04 | I.R.I.S. | Compression of digital images of scanned documents |
US8995780B2 (en) * | 2007-05-04 | 2015-03-31 | I.R.I.S. | Compression of digital images of scanned documents |
US8331706B2 (en) | 2007-05-04 | 2012-12-11 | I.R.I.S. | Compression of digital images of scanned documents |
US20140177954A1 (en) * | 2007-05-04 | 2014-06-26 | I.R.I.S. | Compression of digital images of scanned documents |
US8068684B2 (en) * | 2007-05-04 | 2011-11-29 | I.R.I.S. | Compression of digital images of scanned documents |
US20080273807A1 (en) * | 2007-05-04 | 2008-11-06 | I.R.I.S. S.A. | Compression of digital images of scanned documents |
US20090041344A1 (en) * | 2007-08-08 | 2009-02-12 | Richard John Campbell | Methods and Systems for Determining a Background Color in a Digital Image |
US20090046931A1 (en) * | 2007-08-13 | 2009-02-19 | Jing Xiao | Segmentation-based image labeling |
US7907778B2 (en) | 2007-08-13 | 2011-03-15 | Seiko Epson Corporation | Segmentation-based image labeling |
US8121403B2 (en) | 2007-10-30 | 2012-02-21 | Sharp Laboratories Of America, Inc. | Methods and systems for glyph-pixel selection |
US8014596B2 (en) | 2007-10-30 | 2011-09-06 | Sharp Laboratories Of America, Inc. | Methods and systems for background color extrapolation |
US20090110320A1 (en) * | 2007-10-30 | 2009-04-30 | Campbell Richard J | Methods and Systems for Glyph-Pixel Selection |
US20090110319A1 (en) * | 2007-10-30 | 2009-04-30 | Campbell Richard J | Methods and Systems for Background Color Extrapolation |
US8391638B2 (en) * | 2008-06-04 | 2013-03-05 | Microsoft Corporation | Hybrid image format |
US9020299B2 (en) | 2008-06-04 | 2015-04-28 | Microsoft Corporation | Hybrid image format |
US20090304303A1 (en) * | 2008-06-04 | 2009-12-10 | Microsoft Corporation | Hybrid Image Format |
US20100142820A1 (en) * | 2008-12-05 | 2010-06-10 | Xerox Corporation | 3 + 1 layer mixed raster content (mrc) images having a black text layer |
US20100142806A1 (en) * | 2008-12-05 | 2010-06-10 | Xerox Corporation | 3 + 1 layer mixed raster content (mrc) images having a text layer and processing thereof |
US8180153B2 (en) | 2008-12-05 | 2012-05-15 | Xerox Corporation | 3+1 layer mixed raster content (MRC) images having a black text layer |
US8285035B2 (en) | 2008-12-05 | 2012-10-09 | Xerox Corporation | 3+1 layer mixed raster content (MRC) images having a text layer and processing thereof |
US20110069885A1 (en) * | 2009-09-22 | 2011-03-24 | Xerox Corporation | 3+n layer mixed rater content (mrc) images and processing thereof |
US8306345B2 (en) * | 2009-09-22 | 2012-11-06 | Xerox Corporation | 3+N layer mixed raster content (MRC) images and processing thereof |
US8325394B2 (en) | 2010-05-28 | 2012-12-04 | Xerox Corporation | Hierarchical scanner characterization |
US8548246B2 (en) | 2010-06-12 | 2013-10-01 | King Abdulaziz City For Science & Technology (Kacst) | Method and system for preprocessing an image for optical character recognition |
US8218875B2 (en) * | 2010-06-12 | 2012-07-10 | Hussein Khalid Al-Omari | Method and system for preprocessing an image for optical character recognition |
US8456704B2 (en) * | 2010-06-14 | 2013-06-04 | Xerox Corporation | Colorimetric matching the perception of a digital data file to hardcopy legacy |
US20110304861A1 (en) * | 2010-06-14 | 2011-12-15 | Xerox Corporation | Colorimetric matching the perception of a digital data file to hardcopy legacy |
US8694332B2 (en) | 2010-08-31 | 2014-04-08 | Xerox Corporation | System and method for processing a prescription |
US20130191082A1 (en) * | 2011-07-22 | 2013-07-25 | Thales | Method of Modelling Buildings on the Basis of a Georeferenced Image |
US9396583B2 (en) * | 2011-07-22 | 2016-07-19 | Thales | Method of modelling buildings on the basis of a georeferenced image |
US20160163059A1 (en) * | 2014-12-04 | 2016-06-09 | Fujitsu Limited | Image processing device and method |
US9524559B2 (en) * | 2014-12-04 | 2016-12-20 | Fujitsu Limited | Image processing device and method |
US10990423B2 (en) | 2015-10-01 | 2021-04-27 | Microsoft Technology Licensing, Llc | Performance optimizations for emulators |
CN105389769A (en) * | 2015-11-05 | 2016-03-09 | 欧阳春娟 | Improved steganography method for optimizing decision surface |
CN106611425A (en) * | 2016-12-19 | 2017-05-03 | 辽宁工程技术大学 | Panchromatic remote sensing image segmentation method |
US20190266433A1 (en) * | 2018-02-27 | 2019-08-29 | Intuit Inc. | Method and system for background removal from documents |
US10740644B2 (en) * | 2018-02-27 | 2020-08-11 | Intuit Inc. | Method and system for background removal from documents |
CN110335280A (en) * | 2019-07-05 | 2019-10-15 | 湖南联信科技有限公司 | A kind of financial documents image segmentation and antidote based on mobile terminal |
US11042422B1 (en) | 2020-08-31 | 2021-06-22 | Microsoft Technology Licensing, Llc | Hybrid binaries supporting code stream folding |
US11231918B1 (en) | 2020-08-31 | 2022-01-25 | Microsoft Technology Licensing, LLC | Native emulation compatible application binary interface for supporting emulation of foreign code |
US11403100B2 (en) | 2020-08-31 | 2022-08-02 | Microsoft Technology Licensing, Llc | Dual architecture function pointers having consistent reference addresses |
Also Published As
Publication number | Publication date |
---|---|
JP2004173276A (en) | 2004-06-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20040096102A1 (en) | Methodology for scanned color document segmentation | |
US7433535B2 (en) | Enhancing text-like edges in digital images | |
US7672022B1 (en) | Methods and apparatus for analyzing an image | |
US6757081B1 (en) | Methods and apparatus for analyzing an image and for controlling a scanner | |
US7221790B2 (en) | Processing for accurate reproduction of symbols and other high-frequency areas in a color image | |
US7783117B2 (en) | Systems and methods for generating background and foreground images for document compression | |
US7634150B2 (en) | Removing ringing and blocking artifacts from JPEG compressed document images | |
US9135722B2 (en) | Perceptually lossless color compression | |
US7266250B2 (en) | Methods for generating anti-aliased text and line graphics in compressed document images | |
US6633670B1 (en) | Mask generation for multi-layer image decomposition | |
US8417029B2 (en) | Image processing apparatus and method, including fill-up processing | |
EP1327955A2 (en) | Text extraction from a compound document | |
JP2000175051A (en) | Method for dividing digital image data, method for dividing data block and classification method | |
JP2004320701A (en) | Image processing device, image processing program and storage medium | |
JP2000196895A (en) | Digital image data classifying method | |
US20090284801A1 (en) | Image processing apparatus and image processing method | |
JP4093413B2 (en) | Image processing apparatus, image processing program, and recording medium recording the program | |
JP4035456B2 (en) | Image compression method and image compression apparatus | |
US7362474B2 (en) | Printing quality enhancement via graphic/text detection method in compression (JPEG) image | |
KR100537827B1 (en) | Method for the Separation of text and Image in Scanned Documents using the Distribution of Edges | |
Handley | Scanned Color Document Image Segmentation Using the EM Algorithm | |
Hu | Three Problems in Image Analysis and Processing: Determining Optimal Resolution for Scanned Document Raster Content, Page Orientation, and Color Table Compression | |
Nishida | Networked document imaging with normalization and optimization | |
Kurosu | A method of scale conversion for binary image including dithered image | |
Prabhakar et al. | Detection and segmentation of sweeps in color graphics images |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| AS | Assignment | Owner name: XEROX CORPORATION, CONNECTICUT Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:HANDLEY, JOHN C.;REEL/FRAME:013518/0746 Effective date: 20021115 |
| AS | Assignment | Owner name: JPMORGAN CHASE BANK, AS COLLATERAL AGENT, TEXAS Free format text: SECURITY AGREEMENT;ASSIGNOR:XEROX CORPORATION;REEL/FRAME:015134/0476 Effective date: 20030625 |
| STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
| AS | Assignment | Owner name: XEROX CORPORATION, CONNECTICUT Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:JPMORGAN CHASE BANK, N.A. AS SUCCESSOR-IN-INTEREST ADMINISTRATIVE AGENT AND COLLATERAL AGENT TO JPMORGAN CHASE BANK;REEL/FRAME:066728/0193 Effective date: 20220822 |