CN100543766C - Image segmentation methods, compact representation production method, image analysis method and device - Google Patents

Image segmentation methods, compact representation production method, image analysis method and device

Info

Publication number
CN100543766C
Authority
CN
China
Prior art keywords
color
pixel
paster
component
communicated
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB200580043979XA
Other languages
Chinese (zh)
Other versions
CN101091186A (en
Inventor
James P. Andrew
James A. Besley
Steven Irrgang
Yu-Ling Chen
Eric W-S. Chong
Michael J. Lawther
Timothy J. Wark
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc

Abstract

A method (100), an apparatus, and a computer program product for automatically producing a compact representation of a color document are disclosed. In the method, a digital image of a color document page is segmented (110) into connected components in a single pass in block raster order. Based on compact connected-component statistics of the whole page, layout analysis (120) is used to divide the digital image of the page into foreground and background images. Where at least one portion of the foreground image obscures the background image, at least one portion of the background image is inpainted (520) in a single pass in block raster order. The foreground image and the background image are combined (130) to form a compact document. A method, an apparatus, and a computer program product for segmenting a digital image comprising a plurality of pixels are also disclosed.

Description

Image segmentation methods, compact representation production method, image analysis method and device
Technical field
The present invention relates generally to the field of digital image processing and, more specifically, to producing high-level descriptions of digital images.
Background technology
Image segmentation is the process of dividing or separating an image into semantically or visually coherent regions. Each region is a group of connected pixels having a similar attribute or attributes. For monochrome images, a basic attribute used for segmentation is luminance amplitude; for color images, it is the color components.
The proliferation of scanning technology combined with ever-increasing computing power has led to many advances in the field of document analysis systems. These systems can be used to extract semantic information from a scanned document, often by means of OCR technology. Such systems can also be used to improve the compression of document images by selectively applying an appropriate compression method according to the content of each part of the page. Improved document compression lends itself to applications such as archiving and electronic distribution.
Segmentation is a processing stage of document image analysis in which low-level pixels must first be segmented into primitive objects before high-level processing, such as region classification and layout analysis, can be performed. Layout analysis classifies primitive objects into known object types according to predefined rules about document layout. Typically, layout analysis does not analyze the raw scanned image data but works on an alternative data set, such as blobs or connected components from a segmentation of the page. In addition to the attributes of individual objects, layout analysis can also use groupings of objects to determine their classification.
Several known methods for image segmentation are described below.
Thresholding is the simplest segmentation method, and it can be fast and effective if the image being processed is bilevel (for example, a black-and-white document image). However, if the image is complex, with regions of multiple luminance or color levels, some of those regions may be lost in the binarization process. More sophisticated thresholding techniques use adaptive or multi-level thresholding, in which threshold estimation and binarization are performed at a local level. However, these methods may still fail to segment objects correctly.
Clustering-based methods, such as k-means and vector quantization, often produce good segmentation results, but they are iterative algorithms that require many passes. As a result, these methods can be slow and difficult to implement.
Split-and-merge image segmentation techniques are based on a quadtree data representation, in which a square image segment is split into four quadrants if the attributes of the original segment are not uniform. If four neighboring squares are found to be uniform, they are merged into a single square composed of those four squares. The split-and-merge process usually starts at the full-image level. Processing can therefore begin only after the whole page has been buffered, which requires high memory bandwidth. In addition, this approach tends to be computationally intensive.
Region growing is a well-known method for image segmentation and is conceptually one of the simplest. Neighboring pixels having a similar attribute or attributes are grouped together to form a segment region. In practice, however, fairly complex constraints must be placed on the growth pattern to achieve acceptable results. Existing region-growing methods have some undesirable effects because they tend to be biased toward the location of the initial seeds. Different seed choices can give different segmentation results, and problems can arise if a seed point lies on an edge.
The proliferation of scanning technology combined with ever-increasing computing power has led to many advances in the field of document analysis systems. These systems can be used to extract semantic information from a scanned document, often by means of OCR technology. This technology is used in a growing number of applications, such as automated form reading, and can also be used to improve document compression by selectively applying an appropriate compression method according to the content of each part of the page. Improved document compression lends itself to applications such as archiving and electronic distribution.
Some document analysis systems perform layout analysis to divide a document into regions classified according to their content. Typically, layout analysis does not analyze the raw scanned image data but works on an alternative data set, such as blobs or connected components from a segmentation of the page. In addition to the attributes of individual objects, layout analysis can also use groupings of objects to determine their classification.
Usually, a binary segmentation of the page is performed to produce the data for layout analysis, and this can be obtained simply by thresholding the original image. One advantage of binary segmentation is that the segmented objects lie in a simple containment hierarchy that aids layout analysis. Unfortunately, the layout of many complex color documents cannot be fully represented by a binary image. The loss of information content inherent in converting from color to a binary image can degrade important features and can even destroy the detailed structure of the document.
Color segmentation of the page therefore has advantages for document analysis in preserving the content of the page, but it also brings additional complexity. First, the segmentation analysis itself becomes more difficult, and processing requirements increase. Second, analysis of the segmented page objects is complicated because they do not form a containment hierarchy. This limits the accuracy and effectiveness of layout analysis.
Document layout analysis systems can also employ techniques for verifying the text classification of regions of a document. Some of these methods use histogram analysis of pixel sums, shadowing, and projected profiles. These methods are often unreliable because it is difficult to apply robust statistics to them, and difficult to tune them for text that may be a single line or multiple lines and whose character set and alignment in the document are unknown.
Summary of the invention
According to a first aspect of the invention, there is provided a method of segmenting a digital image comprising a plurality of pixels. The method comprises the steps of: producing a plurality of blocks of pixels from the digital image; and generating at least one connected component for each block, using the blocks of pixels in a single pass. The generating step in turn comprises: segmenting a block of pixels into at least one connected component, each connected component comprising a group of pixels that are spatially connected and semantically related; merging the at least one connected component of the block with at least one connected component segmented from at least one other previously processed block; and storing, in a compact form, the location in the image of the connected components of the block.
Semantically related pixels may comprise pixels of similar color.
The producing step may comprise the sub-steps of: arranging the digital image into a plurality of bands, each band comprising a predetermined number of consecutive rows of pixels; and buffering and processing the bands one by one. The processing step in turn comprises the following sub-steps performed on each currently buffered band: arranging the current band into a plurality of blocks of pixels; and buffering and processing the blocks of the current band one by one for the generating step.
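As an illustration only, the band-by-band, block-by-block traversal described above might be sketched as follows. The function name `iter_blocks`, the list-of-rows image type, and the default sizes of 16 are assumptions made for this sketch; the text here only requires a predetermined number of rows per band.

```python
from typing import Iterator, List, Tuple

Image = List[List[int]]  # rows of pixel values; a stand-in for real image data


def iter_blocks(image: Image, band_height: int = 16, block_width: int = 16
                ) -> Iterator[Tuple[int, int, Image]]:
    """Yield (top, left, block) in block raster order, one band at a time."""
    height = len(image)
    width = len(image[0]) if height else 0
    for top in range(0, height, band_height):
        # Only this band of consecutive rows needs to be buffered at a time.
        band = image[top:top + band_height]
        for left in range(0, width, block_width):
            # Blocks at the right or bottom edge may be smaller than nominal.
            yield top, left, [row[left:left + block_width] for row in band]
```

Because each band is released before the next one is read, a traversal of this kind never needs the whole page in memory, which is the point of the single-pass arrangement.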
The storing sub-step may comprise storing M-1 bilevel bitmaps, where M is the number of connected components in a block and M is an integer.
The storing sub-step may comprise storing an index map.
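The two storage options above can be related with a small sketch. It assumes, purely for illustration, that a block's index map holds labels 0 to M-1 and that label 0 is the component left implicit where every bitmap is zero; the function names are hypothetical.

```python
from typing import List

Grid = List[List[int]]


def encode_bitmaps(index_map: Grid, m: int) -> List[Grid]:
    """Turn an index map with labels 0..m-1 into m-1 bilevel bitmaps."""
    # Label 0 needs no bitmap: it is wherever all other bitmaps are 0.
    return [[[1 if v == k else 0 for v in row] for row in index_map]
            for k in range(1, m)]


def decode_bitmaps(bitmaps: List[Grid], height: int, width: int) -> Grid:
    """Rebuild the index map from the m-1 bitmaps."""
    index_map = [[0] * width for _ in range(height)]
    for k, bitmap in enumerate(bitmaps, start=1):
        for y in range(height):
            for x in range(width):
                if bitmap[y][x]:
                    index_map[y][x] = k
    return index_map
```

With few components per block, M-1 one-bit planes can be smaller than a per-pixel index; with many components an index map is likely the more compact choice.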
The segmenting sub-step may comprise: estimating several representative colors for each block; quantizing each block to the representative colors; and forming connected components from each quantized block. The segmenting sub-step may further comprise merging a subset of the formed connected components. The merging sub-step may comprise gathering statistics of the connected components. The statistics may comprise one or more of a bounding box, a pixel count, a boundary length, and an average color. The method may further comprise the step of removing formed connected components that are considered to be noise. Noise may comprise connected components having a pixel count below a predetermined threshold and a ratio of boundary length to pixel count above another predetermined threshold. The merging step may comprise: merging the connected components of a block with the connected components of the blocks to the left and above; and updating the statistics of the merged connected components. The statistics may comprise one or more of a bounding box, a pixel count, a fill ratio, and an average color.
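A minimal sketch of the statistics and the noise test described above is given here. The class and function names, the threshold values, and the reading of the noise criterion as requiring both conditions are assumptions for illustration; the text does not specify them.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ComponentStats:
    bbox: Tuple[int, int, int, int]   # (min_x, min_y, max_x, max_y), inclusive
    pixel_count: int
    color_sum: Tuple[int, int, int]   # running sums of Y, U, V

    @property
    def average_color(self) -> Tuple[float, ...]:
        return tuple(c / self.pixel_count for c in self.color_sum)

    @property
    def fill_ratio(self) -> float:
        x0, y0, x1, y1 = self.bbox
        return self.pixel_count / ((x1 - x0 + 1) * (y1 - y0 + 1))


def merge_stats(a: ComponentStats, b: ComponentStats) -> ComponentStats:
    """Update statistics when two connected components are merged."""
    return ComponentStats(
        bbox=(min(a.bbox[0], b.bbox[0]), min(a.bbox[1], b.bbox[1]),
              max(a.bbox[2], b.bbox[2]), max(a.bbox[3], b.bbox[3])),
        pixel_count=a.pixel_count + b.pixel_count,
        color_sum=tuple(p + q for p, q in zip(a.color_sum, b.color_sum)),
    )


def is_noise(pixel_count: int, boundary_length: int,
             max_pixels: int = 4, min_ratio: float = 2.0) -> bool:
    """Hypothetical thresholds: tiny component with a long boundary."""
    return (pixel_count < max_pixels
            and boundary_length / pixel_count > min_ratio)
```

Keeping color as running sums rather than averages means a merge is a simple addition, so the statistics stay exact however many blocks a component spans.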
The estimating sub-step may comprise: forming a histogram relating to a plurality of color bins, based on the YUV data of the pixels in each block; classifying each block based on histogram statistics; and merging bin colors based on the block classification to form the representative colors. The method may further comprise the step of forming an index map for each pixel in a single pass. The quantizing step may comprise: quantizing non-empty bins to the representative colors; creating a mapping from bins to representative colors; and remapping the index map to the representative colors using the bin mapping. The forming sub-step may comprise: determining a luminance band based on the Y value; determining a color column based on the U and V values; accumulating the pixel color into the mapped bin; and incrementing the pixel count of the mapped bin. The step of determining the luminance band may further comprise luminance band anti-aliasing. The step of determining the color column may further comprise color column anti-aliasing.
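The histogram-forming sub-step might look roughly like the sketch below. The number of luminance bands and the rule that maps U and V to a color column are assumptions chosen for brevity, and the anti-aliasing refinements are omitted; none of these details are given in the text above.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int, int]   # (Y, U, V), each in 0..255
BinKey = Tuple[int, int]       # (luminance band, color column)


def build_histogram(pixels: List[Pixel], n_bands: int = 8):
    """Bin pixels by luminance band and color column in a single pass."""
    bins: Dict[BinKey, List[int]] = {}   # key -> [sum Y, sum U, sum V, count]
    index_map: List[BinKey] = []         # bin chosen for each pixel, in order
    for y, u, v in pixels:
        band = y * n_bands // 256
        # Hypothetical column rule: one bit each for U and V above mid-range.
        column = 2 * int(u >= 128) + int(v >= 128)
        acc = bins.setdefault((band, column), [0, 0, 0, 0])
        acc[0] += y
        acc[1] += u
        acc[2] += v
        acc[3] += 1                      # increment the bin's pixel count
        index_map.append((band, column))
    return bins, index_map


def bin_color(acc: List[int]) -> Tuple[float, float, float]:
    """Average color of a non-empty bin."""
    return (acc[0] / acc[3], acc[1] / acc[3], acc[2] / acc[3])
```

Since the index map records the bin of every pixel, a later bin-to-representative-color mapping can requantize the block without revisiting the pixel data.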
The merging step may comprise the following sub-steps performed for each connected component in the current block that touches the left and top borders: finding a list of connected components that touch the current connected component along the common border; and determining the best candidate for merging.
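The text does not say how the best candidate is chosen. One plausible sketch, assuming that the candidate with the closest average color wins and that a hypothetical distance limit rejects poor matches, is:

```python
from typing import List, Optional, Tuple

Color = Tuple[float, float, float]


def best_merge_candidate(current_color: Color,
                         candidates: List[Tuple[int, Color]],
                         max_distance: float = 30.0) -> Optional[int]:
    """Pick the touching component whose average color is closest.

    `candidates` holds (component_id, average_color) for components that
    touch the current one along the common border. The Euclidean color
    distance and `max_distance` are assumptions made for this sketch.
    """
    best_id, best_dist = None, max_distance
    for comp_id, color in candidates:
        dist = sum((a - b) ** 2 for a, b in zip(current_color, color)) ** 0.5
        if dist < best_dist:
            best_id, best_dist = comp_id, dist
    return best_id
```

Returning `None` when nothing is close enough leaves the current component unmerged, so it starts a new component of its own.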
According to another aspect of the invention, there is provided an apparatus comprising a processor and a memory for segmenting a digital image comprising a plurality of pixels in accordance with the method of any one of the foregoing aspects.
According to another aspect of the invention, there is provided a computer program product comprising a computer-readable medium having recorded therein a computer program for segmenting a digital image comprising a plurality of pixels in accordance with the method of any one of the foregoing aspects.
According to another aspect of the invention, there is provided a method of automatically producing a compact representation of a color document. The method comprises the steps of: segmenting a digital image of a color document page into connected components in a single pass in block raster order; dividing the digital image of the page into foreground and background images using layout analysis, based on compact connected-component statistics of the whole page; inpainting at least one portion of the background image in a single pass in block raster order where at least one portion of the foreground image obscures the background image; and combining the foreground image and the background image to form a compact document.
The method may further comprise the step of downsampling the background image. In addition, the method may comprise the step of compressing the background image. The compressing step may involve lossy compression. Further, the method may comprise applying a different compression to the lossy-compressed background image.
According to another aspect of the invention, there is provided an apparatus comprising a processor and a memory for automatically producing a compact representation of a color document in accordance with the method of any one of the foregoing aspects.
According to another aspect of the invention, there is provided a computer program product comprising a computer-readable medium having recorded therein a computer program for automatically producing a compact representation of a color document in accordance with the method of any one of the foregoing aspects.
According to another aspect of the invention, there is provided a method of analyzing a digital image comprising a plurality of pixels. The method comprises the steps of: segmenting the digital image into objects, where the segmentation is represented by more than two labels; providing a set of attributes for each object; determining, for a subset of the objects, whether a parent-child relationship exists between adjacent objects sharing a border, using a containment measure; forming groups of objects that share a common parent, based on the object attributes; and classifying the objects according to their attributes and grouping.
Containment may be determined using bounding boxes around each object and information describing the touching relationships between objects. An object contains another object if the two objects touch at a border and the bounding box of the one object completely contains the bounding box of the other.
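The containment rule above translates almost directly into code. In this sketch the data layout (a dictionary of bounding boxes and a set of touching pairs) and the use of inclusive comparisons are assumptions; the names are hypothetical.

```python
from typing import Dict, FrozenSet, Set, Tuple

BBox = Tuple[int, int, int, int]   # (min_x, min_y, max_x, max_y)


def bbox_contains(outer: BBox, inner: BBox) -> bool:
    """True if `inner` lies completely inside `outer` (edges may coincide)."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def is_parent(parent: int, child: int,
              bboxes: Dict[int, BBox],
              touching: Set[FrozenSet[int]]) -> bool:
    """Parent-child test: objects touch and the bounding box is enclosed."""
    return (frozenset((parent, child)) in touching
            and bbox_contains(bboxes[parent], bboxes[child]))
```

The touching requirement matters for color segmentations: without it, an object whose bounding box merely overlaps an unrelated region could be assigned the wrong parent.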
The group-forming step may comprise: considering pairs of child objects from a list of children of a common parent; and determining, using the object attributes, whether each pair should be grouped together. Only neighboring objects in the list of child objects having the same parent may be considered for grouping. Objects may be grouped based on bounding box and color information.
A group of objects may be classified as text according to a test of the text-like qualities of the objects in the group. The test for text-like qualities may comprise: identifying, for each object, a single value representing the position of the object; forming a histogram of these values; and identifying text according to an attribute of the histogram.
The method may further comprise the step of adding other objects to a text-classified group of objects according to their attributes, regardless of their parent-child attributes.
According to another aspect of the invention, there is provided a method of analyzing a digital image comprising a plurality of pixels of a document page. The method comprises the steps of: segmenting the digital image to form objects based on the image; forming groups of the objects; and determining whether each group of objects represents text. The determining step comprises: identifying a single value for each object according to the position of the object on the page; forming a histogram of these values; and identifying text according to an attribute of the histogram.
The attribute of the histogram may be the total number of objects in bins of the histogram having more than a specified number of objects. Alternatively, the attribute may be the sum of the squares of the counts in the histogram.
The single value representing the position of each object may be an edge of the bounding box of the object.
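The histogram test above can be sketched as follows. The choice of the bottom edge as the single value, the bin size, and the function names are assumptions for illustration; the intuition is that characters on a text line share a baseline, so their bottom edges pile up in a few bins.

```python
from collections import Counter
from typing import Iterable, Tuple

BBox = Tuple[int, int, int, int]   # (min_x, min_y, max_x, max_y)


def edge_histogram(bboxes: Iterable[BBox], bin_size: int = 4) -> Counter:
    """Histogram of bottom bounding-box edges, one value per object."""
    return Counter(bbox[3] // bin_size for bbox in bboxes)


def objects_in_full_bins(hist: Counter, min_objects: int = 3) -> int:
    """Total objects in bins holding more than `min_objects` objects."""
    return sum(count for count in hist.values() if count > min_objects)


def sum_of_squares(hist: Counter) -> int:
    """Alternative attribute: sum of squared bin counts."""
    return sum(count * count for count in hist.values())
```

Both attributes reward concentration: a group whose objects are scattered vertically scores low, while a group aligned on one or a few lines scores high.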
According to another aspect of the invention, there is provided an apparatus comprising a processor and a memory for analyzing a digital image comprising a plurality of pixels in accordance with the method of any one of the foregoing aspects.
According to another aspect of the invention, there is provided a computer program product comprising a computer-readable medium having recorded therein a computer program for analyzing a digital image comprising a plurality of pixels in accordance with the method of any one of the foregoing aspects.
According to another aspect of the invention, there is provided a method of inpainting a digital image comprising a plurality of pixels. The method comprises the steps of: producing a plurality of blocks of pixels from the digital image; and changing the pixel values of at least one run of pixels in raster order for at least one block. The changing step comprises the following sub-steps performed on each block: determining the start and end pixels of a run of pixels in the block that relate to an object, the run comprising adjacent pixels grouped together; modifying at least one pixel value of the object in the run according to the pixel values of pixels outside the run; determining an activity measure for the pixels in the block that do not correspond to the object; and, if the activity measure of the block is less than a predetermined threshold, changing all pixel values in each block having at least one object pixel to a set value.
The method further comprises the step of modifying at least one pixel value of dilated object pixels outside the object according to the pixel values of pixels outside the dilated object.
The producing step may comprise the sub-steps of: arranging the digital image into a plurality of bands, each band comprising a predetermined number of consecutive rows of pixels; and buffering and processing the bands one by one. The processing step may comprise the following sub-steps performed on each currently buffered band: arranging the current band into a plurality of blocks of pixels; and buffering and processing the blocks of the current band one by one for the changing step.
The run comprises adjacent pixels in a raster row of pixels of the block.
The method further comprises the step of compressing each block using a block-based compression method. The block-based compression method may be JPEG. The method may further comprise the step of further compressing the block-based compressed blocks using another compression technique.
The at least one pixel value of the object may be modified according to the pixel values of pixels outside the object, using either a value interpolated from the pixel values to the left and right of the run or the value of the pixel to the left of the run.
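The interpolation option above can be illustrated on a single raster row. This sketch assumes scalar pixel values, a boolean mask marking object pixels, and a fallback that copies the one available neighbor when a run touches the row edge; these details and the function name are not taken from the text.

```python
from typing import List, Sequence


def inpaint_row(row: Sequence[float], mask: Sequence[bool]) -> List[float]:
    """Fill each run of object pixels by linear interpolation.

    `mask[x]` is True where pixel x belongs to the object to be removed.
    """
    out = list(row)
    n = len(row)
    x = 0
    while x < n:
        if not mask[x]:
            x += 1
            continue
        start = x                       # first pixel of the run
        while x < n and mask[x]:
            x += 1
        end = x - 1                     # last pixel of the run (inclusive)
        left = row[start - 1] if start > 0 else None
        right = row[end + 1] if end + 1 < n else None
        if left is None and right is None:
            continue                    # whole row is object; nothing to copy
        if left is None:
            left = right
        if right is None:
            right = left
        span = end - start + 2          # steps from left neighbor to right
        for i in range(start, end + 1):
            t = (i - start + 1) / span
            out[i] = left + (right - left) * t
    return out
```

Interpolating across the run avoids a sharp step where foreground used to be, which tends to help a block-based codec applied to the background afterwards.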
All pixel values of each block having at least one object pixel may be changed to the mean value of a previously processed block, or to the mean value of the visible pixels in the block.
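A sketch of the block-flattening rule follows. The text does not define the activity measure, so this example assumes the range (maximum minus minimum) of the visible pixels and uses the mean of the visible pixels as the set value; the threshold and the function name are likewise hypothetical.

```python
from typing import List

Grid = List[List[float]]


def flatten_if_quiet(block: Grid, mask: List[List[bool]],
                     threshold: float = 8.0) -> Grid:
    """Replace a low-activity block containing object pixels by a flat value.

    `mask` is True for object pixels. Activity is taken here to be the
    range of the visible (non-object) pixels, an assumption of this sketch.
    """
    visible = [v for row, mrow in zip(block, mask)
               for v, m in zip(row, mrow) if not m]
    has_object = any(m for mrow in mask for m in mrow)
    if not has_object or not visible:
        return block                    # nothing to do, or nothing to copy
    activity = max(visible) - min(visible)
    if activity >= threshold:
        return block                    # busy block: leave for run inpainting
    mean = sum(visible) / len(visible)
    return [[mean] * len(row) for row in block]
```

A completely flat block is about the cheapest possible input for a block-based codec, so this shortcut trades a little background detail for a smaller file.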
The method may further comprise the step of: if the end of a run of dilated object pixels is found, setting the color value of the pixels to the color value of a pixel that does not correspond to the object and lies to the left of the run of dilated object pixels.
The pixel values may be color values.
According to another aspect of the invention, there is provided an apparatus comprising a processor and a memory for inpainting a digital image comprising a plurality of pixels in accordance with the method of any one of the foregoing aspects.
According to another aspect of the invention, there is provided a computer program product comprising a computer-readable medium having recorded therein a computer program for inpainting a digital image comprising a plurality of pixels in accordance with the method of any one of the foregoing aspects.
According to another aspect of the invention, there is provided a method of changing pixel values of a digital image comprising a plurality of pixels, at least a portion of the pixels corresponding to an object in the image. The method comprises the steps of: arranging the digital image into a plurality of bands, each band comprising a predetermined number of consecutive rows of pixels; and buffering and processing the bands sequentially, one by one. The processing step may comprise the following sub-steps performed on each currently buffered band: arranging the current band into a plurality of blocks of pixels; and processing the blocks sequentially, one by one. The block processing step comprises the following sub-steps for each block: determining an activity measure for the pixels in the block that do not correspond to the object in the image; if the activity measure is less than a predetermined threshold, changing the pixel values of all pixels in the block to one pixel value; compressing the block using JPEG; and further compressing the JPEG-compressed block using another compression method.
Each band may comprise 16 rows of pixels of the digital image, each block may comprise 16 × 16 pixels, and the compressing steps may be performed in a pipelined manner.
The step of changing the color values of all pixels in the block may comprise setting the color values of the pixels to color values obtained by linear interpolation between pixels that do not correspond to the object and lie immediately to the left and right of a run of dilated object pixels, the dilated object pixels being pixels outside the run and adjacent to the run.
The method may further comprise the step of setting the color values of the pixels in the block to the average color value of the pixels in the block that do not correspond to the object.
The method may further comprise the step of setting the color values of the pixels in the block to the average color value of the previous block.
The dilated object pixels may be determined by dilating a mask that defines the location of the object.
The other compression method may comprise ZLIB.
The pixel values may be color values.
According to another aspect of the invention, there is provided an apparatus comprising a processor and a memory for changing, in accordance with the method of any one of the foregoing aspects, pixel values of a digital image comprising a plurality of pixels, at least a portion of the pixels corresponding to an object in the image.
According to another aspect of the invention, there is provided a computer program product comprising a computer-readable medium having recorded therein a computer program for changing, in accordance with the method of any one of the foregoing aspects, pixel values of a digital image comprising a plurality of pixels, at least a portion of the pixels corresponding to an object in the image.
According to another aspect of the invention, there is provided a method of segmenting a digital image comprising a plurality of pixels. The method comprises the steps of: producing a plurality of blocks of pixels from the digital image; and generating at least one connected component for each block, using the blocks of pixels in a single pass. The generating step in turn comprises: segmenting a block of pixels into at least one connected component, each connected component comprising a group of pixels that are spatially connected and semantically related; merging the at least one connected component of the block with at least one connected component segmented from at least one other previously processed block; and storing, in a compact form, the location in the image of the connected components of the block.
Semantically related pixels may comprise pixels of similar color.
The storing sub-step may comprise storing M-1 bilevel bitmaps, where M is the number of connected components in a block and M is an integer.
The storing sub-step may comprise storing an index map.
The segmenting sub-step may comprise: estimating several representative colors for each block; quantizing each block to the representative colors; and forming connected components from each quantized block. The segmenting sub-step may further comprise merging a subset of the formed connected components. The merging sub-step may comprise gathering statistics of the connected components. The statistics may comprise one or more of a bounding box, a pixel count, a boundary length, and an average color. The method may further comprise the step of removing formed connected components that are considered to be noise. Noise may comprise connected components having a pixel count below a predetermined threshold and a ratio of boundary length to pixel count above another predetermined threshold. The merging step may comprise: merging the connected components of a block with the connected components of the blocks to the left and above; and updating the statistics of the merged connected components. The statistics may comprise one or more of a bounding box, a pixel count, a fill ratio, and an average color.
The estimating sub-step may comprise: forming a histogram relating to a plurality of color bins, based on the YUV data of the pixels in each block; classifying each block based on histogram statistics; and merging bin colors based on the block classification to form the representative colors. The method may further comprise the step of forming an index map for each pixel in a single pass. The quantizing step may comprise: quantizing non-empty bins to the representative colors; creating a mapping from bins to representative colors; and remapping the index map to the representative colors using the bin mapping. The forming sub-step may comprise: determining a luminance band based on the Y value; determining a color column based on the U and V values; accumulating the pixel color into the mapped bin; and incrementing the pixel count of the mapped bin. The step of determining the luminance band may further comprise luminance band anti-aliasing. The step of determining the color column may further comprise color column anti-aliasing.
The merging step may comprise the following sub-steps performed for each connected component in the current block that touches the left and top borders: finding a list of connected components that touch the current connected component along the common border; and determining the best candidate for merging.
According to another aspect of the invention, there is provided an apparatus comprising a processor and a memory for segmenting a digital image comprising a plurality of pixels in accordance with the method of any one of the foregoing aspects.
According to another aspect of the invention, there is provided a computer program product comprising a computer-readable medium having recorded therein a computer program for segmenting a digital image comprising a plurality of pixels in accordance with the method of any one of the foregoing aspects.
According to another aspect of the invention, there is provided a method of automatically producing a compact representation of a color document. The method comprises the steps of: segmenting a digital image of a color document page into connected components in a single pass in block raster order; dividing the digital image of the page into foreground and background images using layout analysis, based on compact connected-component statistics of the whole page; inpainting at least one portion of the background image in a single pass in block raster order where at least one portion of the foreground image obscures the background image; and combining the foreground image and the background image to form a compact document.
The method may further comprise the step of downsampling the background image. In addition, the method may comprise the step of compressing the background image. The compressing step may involve lossy compression. Further, the method may comprise applying a different compression to the lossy-compressed background image.
According to another aspect of the invention, there is provided an apparatus comprising a processor and a memory for automatically producing a compact representation of a color document in accordance with the method of any one of the foregoing aspects.
According to another aspect of the invention, there is provided a computer program product comprising a computer-readable medium having recorded therein a computer program for automatically producing a compact representation of a color document in accordance with the method of any one of the foregoing aspects.
According to another aspect of the present invention, provide a kind of analysis to comprise the method for the digital picture of a plurality of pixels.The method comprising the steps of: with digital image segmentation is object, wherein to represent segmentation more than two labels; For each object provides one group of attribute; Be the subclass of object, use to comprise to measure to determine between the adjacent object of sharing the border, whether there is father-child's relation; Object-based attribute forms many group objects of sharing common father; With attribute and grouping they are classified according to object.
Containment may be determined using bounding boxes around each object and information describing the touching relationships between objects. An object contains another object if the two objects touch at a boundary and the bounding box of the one object completely contains the bounding box of the other.
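The containment test described above can be sketched as follows. This is a minimal illustration only: the names BBox, bbox_contains and is_parent are hypothetical, boxes are assumed to use inclusive integer corners, and "completely contains" is assumed to allow shared edges.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    # Inclusive corners of an axis-aligned bounding box (assumed convention).
    x0: int
    y0: int
    x1: int
    y1: int


def bbox_contains(outer: BBox, inner: BBox) -> bool:
    """True if `outer` completely contains `inner` (shared edges allowed)."""
    return (outer.x0 <= inner.x0 and outer.y0 <= inner.y0
            and outer.x1 >= inner.x1 and outer.y1 >= inner.y1)


def is_parent(parent_box: BBox, child_box: BBox, touching: bool) -> bool:
    """Containment: the objects touch and the parent box encloses the child box."""
    return touching and bbox_contains(parent_box, child_box)
```

The touching flag would come from the touching information gathered during segmentation; it is passed in here as a plain boolean.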
The group-forming step may comprise: considering pairs of child objects from a list of children of a common parent; and determining, using object properties, whether each pair should be grouped together. Only adjacent objects in the list of child objects having the same parent may be considered for grouping. Objects may be grouped based on bounding box and color information.
A group of objects may be classified as text according to a test of the text-like qualities of the objects in the group. The test for text-like qualities may comprise: identifying, for each object, a single value representing the position of the object; forming a histogram of these values; and identifying text according to a property of the histogram.
The method may further comprise the step of adding other objects to a text-classified group of objects according to their properties, regardless of their parent-child properties.
According to another aspect of the present invention, there is provided a method of analyzing a digital image comprising a plurality of pixels of a document page. The method comprises the steps of: segmenting the digital image to form objects based on the image; forming groups of the objects; and determining whether each group of objects represents text. The determining step comprises: identifying a single value for each object according to the position of the object on the page; forming a histogram of these values; and identifying text according to a property of the histogram.
The property of the histogram may be the total number of objects in histogram bins that hold more than a specified number of objects. Alternatively, the property may be the sum of the squares of the counts in the histogram.
The single value representing the position of each object may be an edge of the bounding box of the object.
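The two histogram properties mentioned above can be sketched as follows. This is an illustration under stated assumptions rather than the method of the specification: the function name alignment_scores, the bin width and the count threshold are hypothetical, and the single value per object is taken to be one edge coordinate of its bounding box.

```python
from collections import Counter


def alignment_scores(edges, bin_size=4, min_count=2):
    """Histogram one bounding-box edge per object and score the alignment.

    Returns (objects in bins holding more than `min_count` objects,
             sum of the squares of the bin counts).
    """
    hist = Counter(edge // bin_size for edge in edges)
    in_full_bins = sum(c for c in hist.values() if c > min_count)
    sum_of_squares = sum(c * c for c in hist.values())
    return in_full_bins, sum_of_squares
```

Objects that sit on a common baseline fall into the same few bins, so both scores rise for text-like groups and stay low for irregularly placed objects.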
According to another aspect of the present invention, there is provided an apparatus comprising a processor and a memory for analyzing a digital image comprising a plurality of pixels in accordance with the method of any one of the foregoing aspects.
According to another aspect of the present invention, there is provided a computer program product comprising a computer-readable medium having recorded therein a computer program for analyzing a digital image comprising a plurality of pixels in accordance with the method of any one of the foregoing aspects.
According to another aspect of the present invention, there is provided a method of inpainting a digital image comprising a plurality of pixels. The method comprises the steps of: generating a plurality of blocks of pixels from the digital image; and changing, for at least one block, the pixel values of at least one run of pixels in raster order. The changing step comprises performing the following sub-steps for each block: determining start and end pixels of a run of pixels in the block relating to an object, the run comprising adjacent pixels grouped together; modifying at least one pixel value of the object in the run according to pixel values of pixels outside the run; determining an activity measure for the pixels in the block that do not correspond to an object; and, if the activity measure for the block is less than a predetermined threshold, changing all pixel values in each block having at least one object pixel to a set value.
The method may further comprise the step of modifying at least one pixel value of a dilated object pixel outside the object, according to pixel values of pixels outside the dilated object.
The generating step may comprise the sub-steps of: arranging the digital image into a plurality of bands, each band comprising a predetermined number of consecutive lines of pixels; and buffering and processing the bands one at a time. The processing step may comprise the following sub-steps performed on each currently buffered band: arranging the current band into a plurality of blocks of pixels; and buffering and processing the blocks of the current band one at a time for the changing step.
The run comprises adjacent pixels in a raster line of pixels of the block.
The method may further comprise the step of compressing each block using a block-based compression method. The block-based compression method may be JPEG. The method may further comprise the step of further compressing the block-compressed blocks using another compression technique.
At least one pixel value of the object may be modified according to pixel values of pixels outside the object, using either a value interpolated from the pixel values to the left and right of the run, or the value of a pixel to the left of the run.
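A one-dimensional sketch of filling a run from the pixels outside it is given below. The function name inpaint_run and the handling of runs that reach the edge of the line are assumptions made for illustration; a single scalar stands in for a color value.

```python
def inpaint_run(row, start, end):
    """Replace row[start..end] (inclusive) using the pixels just outside the run.

    Interpolates linearly between the left and right neighbours; if only one
    neighbour exists, its value is copied across the run.
    """
    out = list(row)
    left = out[start - 1] if start > 0 else None
    right = out[end + 1] if end + 1 < len(out) else None
    if left is None and right is None:
        return out                      # nothing outside the run to draw from
    if left is None:
        left = right
    if right is None:
        right = left
    steps = end - start + 2             # intervals between the two neighbours
    for i in range(start, end + 1):
        t = (i - start + 1) / steps
        out[i] = left + (right - left) * t
    return out
```

For example, inpaint_run([10, 0, 0, 0, 50], 1, 3) fills the three object pixels with 20, 30 and 40.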
All pixel values of each block having at least one object pixel may be changed to the average value of a previously processed block, or to the average value of the visible pixels in that block.
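The flattening of a low-activity block can be sketched as follows. The specification does not define the activity measure in this passage, so the range (maximum minus minimum) of the visible pixels is used here purely as an assumed stand-in; the function name flatten_block and the flat-list representation are also hypothetical.

```python
def flatten_block(pixels, mask, threshold, prev_mean):
    """Flatten a block whose visible (non-object) pixels show little activity.

    pixels, mask: equal-length flat lists; mask[i] is True for an object pixel.
    prev_mean: average value of the previously processed block, used when the
    block has no visible pixels at all.
    """
    if not any(mask):
        return list(pixels)             # no object pixel: block left unchanged
    visible = [p for p, m in zip(pixels, mask) if not m]
    if visible:
        activity = max(visible) - min(visible)   # assumed activity measure
        if activity >= threshold:
            return list(pixels)         # too much detail to flatten
        value = sum(visible) / len(visible)
    else:
        value = prev_mean
    return [value] * len(pixels)
```

A block of a single value compresses very well with a block-based coder, which is the motivation for flattening blocks whose visible background carries little detail.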
The method may further comprise the step of: if the end of a run of dilated object pixels is found, setting the color value of the pixel to the color value of a pixel, not corresponding to an object, to the left of the run of dilated object pixels.
The pixel values may be color values.
According to another aspect of the present invention, there is provided an apparatus comprising a processor and a memory for inpainting a digital image comprising a plurality of pixels in accordance with the method of any one of the foregoing aspects.
According to another aspect of the present invention, there is provided a computer program product comprising a computer-readable medium having recorded therein a computer program for inpainting a digital image comprising a plurality of pixels in accordance with the method of any one of the foregoing aspects.
According to another aspect of the present invention, there is provided a method of changing pixel values of a digital image comprising a plurality of pixels, at least a portion of the pixels corresponding to an object in the image. The method comprises the steps of: arranging the digital image into a plurality of bands, each band comprising a predetermined number of consecutive lines of pixels; and buffering and processing the bands one at a time in turn. The processing step may comprise the following sub-steps for each currently buffered band: arranging the current band into a plurality of blocks of pixels; and processing the blocks one at a time in turn. The block processing step comprises the following sub-steps for each block: determining an activity measure for the pixels in the block that do not correspond to an object in the image; changing the pixel values of all pixels in the block to one pixel value if the activity measure is less than a predetermined threshold; compressing the block using JPEG; and compressing the JPEG-compressed block using another compression method.
Each band may comprise 16 lines of pixels of the digital image, each block may comprise 16 × 16 pixels, and the compressing steps may be performed in a pipelined manner.
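The band-then-block processing order can be sketched as a simple generator. The name iter_blocks is hypothetical, and the sketch only yields the top-left coordinates of each block; buffering, flattening and compression of each block are left out.

```python
def iter_blocks(width, height, band_rows=16, block_size=16):
    """Yield (band_y, block_x) for each block, one band at a time.

    Bands are visited top to bottom; within a band, blocks are visited
    left to right, so only one band needs to be buffered at any time.
    """
    for band_y in range(0, height, band_rows):
        for block_x in range(0, width, block_size):
            yield band_y, block_x
```

For a 32 × 32 image this visits the blocks at (0, 0), (0, 16), (16, 0) and (16, 16), in that order.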
The step of changing the color values of all pixels in the block may comprise setting the color value of a pixel to a color value obtained by linear interpolation between the pixels, not corresponding to an object, immediately to the left and right of a run of dilated object pixels, the dilated object pixels being pixels outside the run and adjacent to it.
The method may further comprise the step of setting the color values of the pixels in the block to the average color value of the pixels in the block that do not correspond to an object.
The method may further comprise the step of setting the color values of the pixels in the block to the average color of the previous block.
The dilated object pixels may be determined by dilating a mask that defines the position of the object.
The other compression method may comprise ZLIB.
The pixel values may be color values.
According to another aspect of the present invention, there is provided an apparatus comprising a processor and a memory for changing, in accordance with the method of any one of the foregoing aspects, pixel values of a digital image comprising a plurality of pixels, at least a portion of the pixels corresponding to an object in the image.
According to another aspect of the present invention, there is provided a computer program product comprising a computer-readable medium having recorded therein a computer program for changing, in accordance with the method of any one of the foregoing aspects, pixel values of a digital image comprising a plurality of pixels, at least a portion of the pixels corresponding to an object in the image.
Brief description of the drawings
Some embodiments are described below with reference to the accompanying drawings, in which:
Fig. 1 is a high-level flow diagram providing an overview of segmenting, analyzing and compressing a digital image in accordance with an embodiment of the invention;
Fig. 2 is a flow diagram of the color segmentation step of Fig. 1;
Fig. 3 is a detailed flow diagram of the step of obtaining the next tile of Fig. 2;
Fig. 4 is a flow diagram of the layout analysis step of Fig. 1;
Fig. 5 is a flow diagram of the step of generating a compressed output image of Fig. 1;
Fig. 6 is a block diagram of an image comprising the character "i" and the associated connected components;
Fig. 7 is a block diagram of a general-purpose computer system with which embodiments of the invention may be practiced;
Fig. 8 is a block diagram of a system for producing a compressed output from a scanned input page in accordance with an embodiment of the invention;
Fig. 9 is a detailed flow diagram of the color connected component analysis and blob statistics step of Fig. 10;
Fig. 10 is a flow diagram of the color segmentation step of Fig. 2;
Fig. 11 is a detailed flow diagram of the color quantization step of Fig. 10;
Fig. 12 is a detailed flow diagram of the step of forming a new blob or growing an existing blob of Fig. 9;
Fig. 13 is a flow diagram of the inter-tile merging step of Fig. 10;
Fig. 14 is a flow diagram of the step of processing the best blob and CC pair of Fig. 13;
Fig. 15 is a block diagram showing an example of blob merging;
Fig. 16 is a block diagram showing merging between a current tile and the states of two adjacent tiles;
Fig. 17 is a table showing the conditions for inter-tile merging;
Fig. 18 is an example of the result of the CC mapping process of Fig. 14;
Fig. 19 is a flow diagram of the intra-tile merging step of Fig. 10;
Fig. 20 is a flow diagram of the step of grouping CCs of Fig. 4;
Fig. 21 is a flow diagram of the step of checking CC groups of Fig. 4;
Fig. 22 is a flow diagram of the step of finding child CCs of Fig. 20;
Fig. 23 is a flow diagram of the initial grouping step of Fig. 20;
Fig. 24 is a flow diagram of the alignment checking step of Fig. 21;
Fig. 25(a) shows an image of a simple example of segmentation based on more than two quantization levels;
Figs. 25(b) and 25(c) show images of the corresponding results of binary segmentation of the image of Fig. 25(a);
Fig. 26 is an illustration of a region of a document containing two objects, namely the letters "g" and "h";
Fig. 27(a) shows a histogram of the values of the bottom edges of the bounding boxes of the image of Fig. 27(b);
Fig. 27(b) shows a selection of irregularly arranged bounding boxes for parts segmented from an image;
Fig. 27(c) shows the corresponding histogram for the page of Fig. 27(d);
Fig. 27(d) shows the arrangement of bounding boxes on a page for a group of text;
Fig. 28 is a block diagram of a color blobbing system;
Figs. 29(a) to 29(c) are images of color histogram output, comprising an original tile, an index map and a palette, where the gray portions at the top and bottom of the palette represent empty bins;
Fig. 30 is a flow diagram of the step of inpainting a tile of Fig. 5;
Fig. 31 is a flow diagram of the step of forming a tile foreground bitmask of Fig. 30;
Fig. 32 is a flow diagram of the step of inpainting pixels and measuring tile activity of Fig. 30;
Fig. 33 is a flow diagram of the tile flattening step of Fig. 30;
Fig. 34 is a block diagram of example pixels in a background image tile and the corresponding pixels checked in the full-resolution foreground mask;
Fig. 35 comprises plots of an example of one-dimensional inpainting;
Fig. 36 is a block diagram showing color blobbing for an example using a 6 × 6 tile;
Fig. 37 comprises images showing an example of blob merging;
Fig. 38 comprises images showing an original tile and the merged blob output for comparison purposes;
Fig. 39 is a block diagram showing blobs being merged across tiles in raster order;
Fig. 40 is a flow diagram of the step of forming the 2D histogram and first palette of Fig. 11;
Fig. 41 is a flow diagram of the step of forming the second palette of Fig. 11;
Fig. 42 is a flow diagram of the step of generating a two-color palette of Fig. 41;
Fig. 43 is a flow diagram of the step of generating a multi-color palette of Fig. 41;
Fig. 44 is a flow diagram of the step of associating pixels with the second palette of Fig. 11;
Fig. 45 is a detailed flow diagram of the step of mapping a bi-level tile of Fig. 44;
Fig. 46 is a detailed flow diagram of the step of mapping a multi-color tile of Fig. 44;
Fig. 47 is a flow diagram of the step of analyzing the histogram and classifying the tile of Fig. 11;
Fig. 48 is a table showing the CC and blob candidate counts before and after updating for all operating conditions;
Fig. 49 is a block diagram showing an example of merging blob and CC candidates;
Fig. 50 is a flow diagram of the post-merging processing step of Fig. 10;
Fig. 51 is a block diagram of a system in accordance with another embodiment of the invention;
Fig. 52 is a block diagram of the color segmentation module of Fig. 51; and
Fig. 53 is a diagram showing the Delaunay triangulation and Voronoi diagram of a set of points.
Detailed description
Methods, apparatuses and computer program products for processing and compressing digital images are disclosed. In the following description, numerous specific details are set forth, including particular lossless compression techniques, color spaces, spatial resolutions, tile sizes and the like. However, it will be apparent to those skilled in the art from this disclosure that modifications and/or substitutions may be made without departing from the scope and spirit of the invention. In other instances, specific details may be omitted so as not to obscure the invention.
Where reference is made in any one or more of the accompanying drawings to steps and/or features having the same reference numerals, those steps and/or features have, for the purposes of this description, the same function(s) or operation(s), unless the contrary intention appears.
In the context of this specification, the word "comprising" has an open-ended, non-exclusive meaning: "including principally, but not necessarily solely", but neither "consisting essentially of" nor "consisting only of". Variations of the word "comprising", such as "comprise" and "comprises", have corresponding meanings.
The detailed description is organized into the following sections.
1 Overview
2 Color segmentation
2.1 Obtain the next tile of the input image
2.2 Form the color segmentation of a tile
2.2.1 Color quantization
2.2.1.1 Form the 2D histogram and first palette
2.2.1.2 Analyze the histogram and classify the image
2.2.1.3 Form the second palette
2.2.1.3.1 Generate a 2-color palette
2.2.1.3.2 Generate a multi-color palette
2.2.1.4 Associate pixels with the second palette
2.2.1.4.1 Map a bi-level tile
2.2.1.4.2 Map a multi-level tile
2.2.2 Color CC analysis and blob statistics
2.2.2.1 Form blobs
2.2.2.2 Example of blob merging
2.2.3 Intra-tile merging
2.2.4 Inter-tile merging
2.2.4.1 Inter-tile merging conditions
2.2.4.2 Inter-tile merging example
2.2.4.3 Process the best blob and CC pair
2.2.4.4 CC mapping result
2.2.5 Post-merging processing
2.3 Segmentation example
3 Layout analysis
3.1 Group CCs
3.1.1 Find the children of a parent CC
3.1.2 Initial grouping
3.1.2.1 Grouping test for two CCs
3.2 Check groups
3.2.1 Check alignment
3.2.2 Example of alignment
4 Generate the compressed output image
4.1 Inpaint a tile
4.1.1 Form the tile foreground bitmask
4.1.2 Inpaint pixels and measure tile activity
4.1.3 Inpainting example
4.1.4 Tile flattening
5 Hardware embodiment
5.1 Color segmentation module
6 Computer implementation
7 Industrial applicability
These sections are described in detail below, in the above order.
1 Overview
A first embodiment of the invention is a process that runs on a general-purpose computer. Fig. 1 provides a high-level overview of a process 100 for segmenting, analyzing and compressing a digital image. The input to the process 100 is preferably an RGB image at a resolution of 300 dpi. However, with suitable modifications to the process 100, images in other color spaces may also be input. Likewise, images of different resolutions may be input. In step 110, color segmentation of the image into connected components (CCs) is performed using a single pass over the input image. With single-pass processing, the digital image can be processed quickly, so that large numbers of high-resolution images can be handled. A connected component is a group of similarly colored pixels that are all connected (touching). For example, the pixels forming the body or stem of the letter "i" printed in black ink on white paper form one connected component, the dot on the "i" forms another CC, and any white pixels surrounding the letter "i" as background form a further CC. Fig. 6 shows an image 600 comprising the character "i", in which black dots represent black pixels and white dots represent white pixels. The stem CC 610 of the "i" and the dot CC 612 of the "i" are shown. A further resulting CC 614 of white pixels is also shown.
As part of the segmentation, compact information and statistics describing the CCs are calculated. The digital image is downsampled and provided directly to step 130. In step 120, layout analysis is performed on the CCs using this compact CC information and statistics. The layout analysis determines the layout of the features on the page, which may include, for example, text characters, paragraphs, tables and images. In step 130, this layout information is used to create one or more foreground images. A foreground image is typically made up of the text characters identified in step 120, and is preferably a binary image. The foreground image may be stored at the input resolution (e.g., 300 dpi), and the background image may be stored at a lower resolution (e.g., 150 dpi). Foreground elements are removed from the background image. The foreground and background are then compressed using different techniques and stored in a compound image format. The compound format may be, for example, a PDF document. Color segmentation, layout analysis and compression of the digital image are each described in a separate section below.
One application of embodiments of the invention is to analyze raster pixel images from a scanner and extract as much high-level information as possible. From this information, a high-level description of the page can be generated. To this end, the system may be designed to run faster by performing the pixel analysis in hardware. However, it will be appreciated from the description below that the system may also be implemented entirely in software. In addition, although a specific output format, PDF, is described below, the system may be modified to use other page description formats as the output format.
Fig. 8 is a high-level block diagram of a system 800 for generating a compressed or compact representation 850 of a scanned input document 810 in accordance with an embodiment of the invention. The system 800 comprises a front-end module 820, an intermediate module 830 and a back-end module 840. The front-end module 820 is preferably a tile-based front end implemented in hardware, but may be a combination of an ASIC and software executed by an embedded processor. Tile raster order is a tile-based processing method in which tiles are processed one at a time, from top to bottom and from left to right. The front-end module 820 performs color segmentation on the input image and provides the background image of each tile to the back-end module 840 (indicated by an arrow). The front-end module 820 performs color connected component analysis at the full page resolution (e.g., 300 dpi). The front-end module 820 also generates connected components (CCs) from the digital image 810 and provides the CCs to the intermediate module 830, which performs layout analysis. The intermediate module 830 may be implemented entirely in software. The front-end module 820 provides a downsampled image to the back-end module 840. The output of the intermediate module 830 is provided to the back-end module 840, which performs tile-based inpainting and generates the compact representation of the digital image 810. Like the front-end module 820, the back-end module 840 may be implemented at least in part in hardware, with an ASIC and software executing on an embedded processor. The compact output may comprise a digital image of lower spatial resolution and a foreground bitmap of higher resolution.
The front-end module 820 performs all the analysis work that involves examining each pixel, and forms color CCs, which are regions of semantically related pixels. The output from the front-end module 820 is information about all the color CCs on the page. The information for each CC comprises a bounding box, an average color, a touching list and a pixel count. When the algorithm is implemented in hardware for best performance, it should be efficient in terms of bandwidth. For image processing tasks, the algorithm does not randomly access the whole scanned page. Instead, it works on one small tile at a time.
2 Color segmentation
Fig. 2 shows step 110 of Fig. 1 in more detail. The process shown in Fig. 2 works on tiles of the input image, which may be, for example, 32 × 32 pixels in size. Tiles may be processed in raster order, that is, from left to right and from top to bottom. The first tile processed is the tile in the top-left corner of the input image, and the last tile processed is the tile in the bottom-right corner. This tiling is done for efficiency.
In step 210, the next tile of the input image to be processed is obtained. Pointer information may be used to access each tile efficiently. Step 210 is described in more detail with reference to Fig. 3. The processing of steps 220, 230, 240 and 250 is confined to the pixels of the current tile. In step 220, dehalftoning is optionally performed on the current tile. For example, a scanned input document processed by this method may contain halftones from the printing process. Halftones can make subsequent analysis difficult and may not compress well. Step 220 may therefore be used to detect and remove halftones. By way of example only, dehalftoning may be carried out as follows. The dehalftoning process may work on 16 × 16 tiles. Each input tile of 32 × 32 RGB pixels may be divided into four 16 × 16 tiles, each of which is processed separately. Halftone detection may work on one color channel (i.e., R, G or B) at a time. If a halftone is detected in any channel, halftone removal may be performed on all channels. To detect halftones, each pixel in the tile is quantized to 4 levels. The input color channel values range from 0 to 255. The 4 levels are the ranges 0-63, 64-127, 128-191 and 192-255. The number of level changes between adjacent pixels may be measured. This measurement may be made both horizontally and vertically.
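The level-change measurement used for halftone detection can be sketched as follows. The function name count_level_changes is hypothetical, and the sketch covers only the counting for one color channel; the decision thresholds are not given in the text and are left to the caller.

```python
def count_level_changes(channel):
    """Count quantization-level changes between adjacent pixels.

    channel: 2D list of values in 0..255 for one color channel of a tile.
    Each value is quantized to 4 levels (0-63, 64-127, 128-191, 192-255).
    Returns (horizontal_changes, vertical_changes).
    """
    levels = [[value // 64 for value in row] for row in channel]
    horizontal = sum(1 for row in levels
                     for a, b in zip(row, row[1:]) if a != b)
    vertical = sum(1 for upper, lower in zip(levels, levels[1:])
                   for a, b in zip(upper, lower) if a != b)
    return horizontal, vertical
```

A detector built on this would run the count on each 16 × 16 tile and each of the R, G and B channels, then compare the two counts against its thresholds.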
Because halftones are generally small dots, detection requires the number of changes to be large and the number of horizontal changes to be less than the number of vertical changes. This prevents level changes caused by the edges of text characters from being detected as halftones. A threshold may be specified for the detection. If a halftone is detected in a 16 × 16 tile, it may be removed using, for example, spatial blurring. The halftone detector may also use information from previously analyzed tiles. For example, if halftones were detected in tiles that touch the current tile, the current tile is likely to contain halftones as well. When this information is used, the threshold may be adjusted to relax or tighten the halftone detection requirement.
In step 230, if the current tile is not in the YUV color space, a conversion is performed to convert the pixels of the tile to the YUV color space. This step is therefore optional, depending on the color space of the input image. Although the YUV color space is used in this embodiment, other color spaces may be employed without departing from the scope and spirit of the invention. The formulas used to convert from RGB to YUV may be, for example, the same as those used in the Independent JPEG Group (IJG) JPEG library.
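A sketch of such a conversion is given below. It assumes the widely used JFIF-style YCbCr coefficients, which is what the IJG library is understood to implement (there with fixed-point tables rather than floating point); the function name rgb_to_yuv and the rounding and clamping are choices made for this illustration.

```python
def rgb_to_yuv(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YUV (YCbCr, JFIF-style coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.16874 * r - 0.33126 * g + 0.5 * b + 128
    v = 0.5 * r - 0.41869 * g - 0.08131 * b + 128
    # Round and clamp each component to the 0..255 range.
    return tuple(max(0, min(255, round(c))) for c in (y, u, v))
```

Neutral grays map to U = V = 128, so the chroma components carry the color information separately from the luminance Y.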
In step 240, color segmentation is performed on the current tile to form connected components (CCs), and compact information and statistics about the CCs in the tile are calculated. A color CC comprises one or more semantically related blobs spanning one or more tiles. For example, semantically related blobs may have similar colors. A blob is a connected group of pixels within a single tile that have similar color characteristics. Blobbing is a process of color segmentation and formation of a connected-component representation. Each CC has the following statistics: size in pixels, average color, binary mask, blob boundary length and bounding box. Fig. 10 shows this step in more detail.
In step 250, the current tile is downsampled to form the corresponding portion of the background image. For example, a box filter may be used to downsample by 2:1 in both dimensions, but other methods may be employed without departing from the scope and spirit of the invention. In decision step 260, a check is made to determine whether any more tiles remain to be processed. If the result of step 260 is No (i.e., all tiles in the image have been processed), processing terminates. If the result of step 260 is Yes, processing continues at step 210.
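A 2:1 box-filter downsample of one channel of a tile can be sketched as follows. The function name downsample_2to1 and the use of integer averaging are assumptions; the tile is assumed to have even width and height.

```python
def downsample_2to1(tile):
    """Halve a tile in both dimensions by averaging each 2 x 2 group of pixels.

    tile: 2D list (rows of equal, even length) holding one color channel.
    """
    height, width = len(tile), len(tile[0])
    return [[(tile[y][x] + tile[y][x + 1]
              + tile[y + 1][x] + tile[y + 1][x + 1]) // 4
             for x in range(0, width - 1, 2)]
            for y in range(0, height - 1, 2)]
```

Applied to a 32 × 32 tile at 300 dpi, this yields the 16 × 16 portion of a 150 dpi background image.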
2.1 Obtain the next tile of the input image
Fig. 3 shows step 210 of Fig. 2 in detail. The process 210 works on bands of the input digital image. A band of an image is a number of consecutive image lines. The height of each band may be the same as the height of a tile. Thus, for example, the first 32 lines of the image form a band, in this case the first band of the image; the next 32 image lines form the next band, and so on. The width of each band may be the width of the input image. In decision step 310, a check is made to determine whether another band needs to be read in. Another band needs to be read in when all the tiles of the current band have been processed. A band also needs to be read in the first time step 310 is executed, because no band has been read at that point. If the result of step 310 is Yes, processing continues at step 320. Otherwise, processing continues at step 340.
In step 320, the data of the next band of the input image is read, for example from disk, into a buffer in memory. The memory buffer may be arranged so that each line of pixels of the band occupies contiguous memory locations. In addition, a record is kept of the memory location of the start of each line, that is, a pointer to each band line. In step 330, the variable tx, which identifies the tile to be accessed (the current tile), is initialized (i.e., set to 0). In step 340, the row pointer information is updated to point to the current tile. Other processes that call the process 210 of Fig. 3 obtain the new tile by referring to the row pointer information updated in step 340. The row pointer information may comprise a pointer for each tile row. Each tile may therefore have 32 row pointers. A given row pointer points to the memory location of the first pixel of the tile in the given row. The row pointer information may be updated by stepping each row pointer (tile width × tx) memory locations beyond the pointer to the start of the corresponding band line. In step 350, the variable tx is incremented by 1, so that a different tile is supplied the next time the process 210 of Fig. 3 is called.
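The row pointer update of step 340 can be sketched with integer offsets standing in for memory pointers. The function name tile_row_offsets is hypothetical, and one memory location per pixel is assumed.

```python
def tile_row_offsets(band_row_starts, tile_width, tx):
    """Return the offset of the first pixel of tile `tx` in every band line.

    band_row_starts: offset of the start of each line of the buffered band.
    Each row pointer is stepped (tile_width * tx) locations past the start
    of its band line.
    """
    return [start + tile_width * tx for start in band_row_starts]
```

For a band that is 96 pixels wide, with lines starting at offsets 0, 96 and 192, the third tile (tx = 2) of width 32 begins at offsets 64, 160 and 256.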
2.2 Form the color segmentation of a tile
Color blobbing is an image segmentation algorithm that works in tile raster order. A blob is a connected group of pixels having the same quantization label within a single tile. Each blob has the following statistics: size in pixels, average color, binary mask, blob boundary length and bounding box. The aim is to segment the document image into a set of non-overlapping connected components, where each connected component comprises a set of connected, semantically related pixels. For example, the set of pixels in a particular text letter forms one connected component, the pixels in a portion of the image surrounding the text form another connected component, and so on.
Fig. 28 is a block diagram of a color blobbing system 2800 having four modules 2810, 2820, 2830 and 2840. The color quantization module 2810 receives an input color tile (e.g., 24-bit pixel values for the RGB color space), determines the number of dominant colors in the tile, and quantizes the tile according to the dominant colors. The dominant colors and the quantized tile are provided as input to the connected component and blob statistics module 2820, which performs 8-way connected component analysis on the quantized tile in a single raster pass. Blob statistics, such as pixel count, average color, blob boundary length and bounding box information, are collected in the same raster pass. The blobs and statistics are provided as input to the intra-tile merging module 2830, which reduces the number of spurious and small blobs in the tile by merging blobs based on color, size and boundary statistics. The resulting blobs and statistics from module 2830 are provided as input to the inter-tile merging module 2840, which, according to the blob statistics, groups blobs in the current tile with touching blobs in the adjacent tiles (left and above) to form connected components as the output of the system 2800. This is described further with reference to the process of Fig. 10.
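An 8-way connected labeling of a quantized tile, with a few per-blob statistics, can be sketched as follows. Note that this sketch uses a breadth-first flood fill for brevity, not the single-raster-pass method described above, and the function name label_blobs and the statistics dictionary are hypothetical.

```python
from collections import deque


def label_blobs(tile):
    """Label 8-connected groups of equal quantization labels in a tile.

    tile: 2D list of quantization labels.
    Returns (labels, stats): a 2D list of blob ids and, per blob, its
    quantization label, pixel count and bounding box (x0, y0, x1, y1).
    """
    height, width = len(tile), len(tile[0])
    labels = [[-1] * width for _ in range(height)]
    stats = []
    for y in range(height):
        for x in range(width):
            if labels[y][x] != -1:
                continue                      # already part of a blob
            blob_id = len(stats)
            labels[y][x] = blob_id
            queue = deque([(y, x)])
            count, x0, y0, x1, y1 = 0, x, y, x, y
            while queue:
                cy, cx = queue.popleft()
                count += 1
                x0, y0 = min(x0, cx), min(y0, cy)
                x1, y1 = max(x1, cx), max(y1, cy)
                for dy in (-1, 0, 1):         # visit all 8 neighbours
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and labels[ny][nx] == -1
                                and tile[ny][nx] == tile[y][x]):
                            labels[ny][nx] = blob_id
                            queue.append((ny, nx))
            stats.append({"label": tile[y][x], "pixels": count,
                          "bbox": (x0, y0, x1, y1)})
    return labels, stats
```

On a small tile holding a letter "i", this produces three blobs: the background, the dot and the stem, matching the three connected components of the Fig. 6 example.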
Figure 10 is that to be used for image segmentation be the detail flowchart of the step 240 of color CC.As the part of staging treating, calculate compact information and the statistic of describing CC.Begin staging treating by the input paster that receives from the pixel of step 230.At decision piece 1005, check so that determine whether this paster is smooth.If paster is smooth, handles and proceed to merging phase between paster at step 1040 place.Otherwise, handle and continue at step 1010 place.In step 1010, paster is carried out color quantizing.Three key steps of using color quantizing find the main color in the paster, and considered pixel geometric configuration not: 1) color reduces, 2) paster classification and 3) seek main color and quantification.By the main color in the definite input of the color histogram method paster.By according to main color quantizing input paster pixel, create the quantification paster of color label.It is obvious color visually in the paster that main color is perceived as by human beholder.This algorithm is suitable for hardware (HW) to be realized, and realizes also quite fast with software (SW).
Step 1020 is carried out 8 the tunnel to the paster that quantizes and is communicated with component analysis so that form spot in single grating wheel.In step 1030, carry out to merge in the paster and handle, so that by merging spot, reduce false and number speckle in the paster based on color, size and boundary information.In step 1040, carry out the merging between paster.The spot that is identified in the spot that identifies in the paster with quantification and two the previously treated pasters of the current paster left side and top compares, so that merge into color CC.Thereby a color CC comprises the one or more similar painted spot of crossing over one or more pasters.Like this, except boundary information, color CC has the above-mentioned statistic at the same type of spot.
In step 1050, the color CC that the spot in the current paster and these spots are formed is stored in the compact paster status data structure.This paster state does not comprise pixel data.The paster state only comprises the spot that will newly create and merges to information required in the existing color CC.Can carry out to merge between paster with high memory efficiency and handle 1040, because,, only need two or paster state still less for merging with current paster in any stage of staging treating.In addition, step 1050 is upgraded the contact tabulation of each color CC.Which connection component the contact tabulation describes is adjacent to each other.A part of analyzing as color CC in front end produces this contact tabulation.Step 240 among Fig. 2 produces the contact tabulation.Handle then and stop.
2.2.1 Color quantization
The purpose of color quantization is to reduce the full-color input to a reduced set of colors in preparation for connected component generation. To find the dominant colors, each input pixel is examined once and a histogram is produced. Embodiments of the invention employ a histogram that uses luminance as the first dimension and combines the two chrominance components as the second dimension. This is unlike a conventional color histogram, which divides the bins in three dimensions along the axes of the three color components. Embodiments of the invention produce a compact histogram in which the dominant colors are easier to find. According to the characteristics of the histogram, the tile is classified into one of three classes: flat, bi-level and multi-color. A palette for the tile is generated according to the tile classification. After the palette is generated, each pixel is assigned a quantized label according to the palette color to which the pixel maps. The method is designed for high-speed processing and low memory requirements. A flat tile has only one quantized label. A bi-level tile has two quantized labels. A multi-color tile has up to four quantized labels.
The color quantization step 1010 of Figure 10 is expanded in Figure 11. A brief description of each step of Figure 11 is given here, followed by a detailed description of each step. In step 1110, the 2D histogram and the index map are formed simultaneously, together with the first palette. Figure 40 provides further details of step 1110. In step 1120, tile classification is performed based on the statistics of the 2D histogram. Figure 47 provides further details of step 1120. In step 1130, a second palette is formed based on the tile classification. This involves condensing the first palette. Figure 41 provides further details of step 1130. In step 1140, pixels are associated with the second palette. The index map is remapped to the second palette colors to produce a quantized tile of quantized labels. Processing then terminates.
2.2.1.1 Forming the 2D histogram and the first palette
Figure 40 shows the processing of step 1110 of Figure 11 in more detail. The full-color input tile is quantized into an index map in a single pass, in raster order, by a predetermined mapping method. Figure 29(a) shows an example of an original input tile and the resulting index map. The mapping may be configured for 32 color bins organized as 8 luminance bands and 4 color columns. Each color bin may have a color accumulator, a pixel counter and a registration ID, which is set to the YUV value of the first pixel placed in the bin. The predetermined mapping may be altered depending on the state of the 2D histogram. As a result, there is no predetermined color for each bin, and the order of the pixel colors can affect the composition of the first palette. The final mean color of each non-empty bin makes up the first palette. Figure 29(c) shows the resulting palette, in which the grey portions at the top and bottom represent empty bins.
In step 4010, a pixel with color value (YUV) is obtained from the tile. In step 4015, the predetermined mapping is performed to map the pixel to a luminance band and a color bin (that is, bin_mapped). The predetermined mapping may be as follows:
band = Y >> 5, and
column = (|U - REF_U| + |V - REF_V|) * NORMALISING_FACTOR[band].
The chrominance values of grey may be used for REF_U and REF_V (that is, REF_U = 128 and REF_V = 128 for 8-bit RGB input data). The NORMALISING_FACTOR for each band is pre-calculated using the chosen REF_U and REF_V so that each band is normalized to 4 bins in the RGB color space. NORMALISING_FACTOR may be generated using the pseudo-code of Table 1.
Table 1
[Table 1: pseudo-code for generating NORMALISING_FACTOR; image C200580043979D00341 in the source, not reproduced]
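By way of illustration only, the band and column mapping of step 4015 may be sketched in Python as follows. Because the pseudo-code of Table 1 is not reproduced here, the per-band normalizing factors are passed in by the caller; the function name, the example factors and the clamping of the column index to the range 0 to 3 are assumptions made for this sketch.

```python
REF_U = 128  # chrominance of grey for 8-bit data, as suggested in the text
REF_V = 128
NUM_COLUMNS = 4


def map_pixel_to_bin(y, u, v, normalising_factor):
    """Return (band, column) for one YUV pixel.

    normalising_factor holds one scale factor per luminance band. Its real
    values come from Table 1, which is not reproduced here, so the caller
    must supply them (assumption of this sketch).
    """
    band = y >> 5  # 8 luminance bands of 32 levels each
    chroma_dist = abs(u - REF_U) + abs(v - REF_V)
    column = int(chroma_dist * normalising_factor[band])
    # Assumption: clamp so the column index always lies in 0..3.
    column = min(column, NUM_COLUMNS - 1)
    return band, column


# Hypothetical factors, identical for every band, chosen only for illustration.
example_factors = [4 / 256] * 8
print(map_pixel_to_bin(200, 128, 128, example_factors))
```

A neutral grey pixel always lands in column 0 of its band, since its chrominance distance from the reference is zero.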
Steps 4020 to 4025 perform optional "band anti-aliasing", which is used for bi-level tile outline enhancement. In step 4020, a check is made to determine whether band anti-aliasing is enabled and whether the luminance difference between the mapped band and the band above or below satisfies a specified threshold (for example, 16). If so, band anti-aliasing is performed in step 4025. Otherwise processing continues at step 4035.
In step 4025, band anti-aliasing is performed. An attempt is made to find a close non-empty bin in the band above or below. The candidate bin is the bin mapped with band-1 or band+1. The mapped bin (bin_mapped) is replaced with the candidate bin under either of the following two conditions:
1 The candidate bin is non-empty, its registration ID (Y) is less than 16 away from Y, and bin_mapped is empty.
2 The candidate bin and bin_mapped are both non-empty, and Y is closer to the registration ID (Y) of the candidate bin than to the registration ID (Y) of bin_mapped.
Steps 4035 to 4055 perform "bin anti-aliasing". Step 4035 checks whether the mapped bin (bin_mapped) is empty. If the mapped bin is not empty, step 4040 checks the mapping error as follows:
max(|U - registration ID(U)|, |V - registration ID(V)|) < MAX_BIN_ERROR[band],
where MAX_BIN_ERROR[band] is one eighth of the max_dist of each band, as defined in the pseudo-code above for generating the normalizing factors.
If step 4035 returns false (No), processing continues at step 4040. Otherwise processing continues at step 4045. In decision step 4040, a check is made to determine whether the mapping error exceeds the specified threshold, which is the maximum bin error for the band.
If the mapping error is within the threshold, processing continues at step 4060. Otherwise step 4055 is performed to find a closer bin. In step 4055, the search starts from column 0 and moves forward to column 3 in the mapped band. The search stops when either of the following conditions is met:
1 an empty bin is found; or
2 a bin is found whose mapping error is within the allowed threshold.
If the search of step 4055 stops on condition 1, the (YUV) value is registered in the empty bin, and the empty bin replaces bin_mapped. If both conditions fail, bin_mapped is replaced with the bin having the smallest mapping error. Processing then continues at step 4060.
After the test of step 4035, if the mapped bin is empty, processing continues at step 4045. In decision step 4045, a check is made to determine whether a close non-empty bin has been found in the same band. Step 4045 searches from column 0 to column 3 for a non-empty bin that satisfies the mapping error threshold defined above. If such a bin is found, the empty bin_mapped is replaced with that bin in step 4052, and processing then continues at step 4060. Otherwise, if step 4045 returns false (No), the color (YUV) value is registered in bin_mapped in step 4050, and processing continues at step 4060.
In step 4060, the pixel color (YUV) is accumulated in the mapped bin (bin_mapped), and the pixel count of bin_mapped is incremented. In step 4065, the position of bin_mapped is recorded for the current pixel. In step 4070, a check is made to determine whether more pixels remain in the tile. If so, processing continues at step 4010. Otherwise processing continues at step 4075, in which the accumulated color of each non-empty bin is divided by its pixel count. The mean colors of the non-empty bins form the first palette. Processing then terminates.
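The accumulation of steps 4060 to 4075 may be sketched as follows. This is a simplified Python illustration that omits band and bin anti-aliasing; the function name, the `bin_of` callback and the use of integer division for the mean are assumptions of the sketch rather than features of the described method.

```python
def build_first_palette(pixels, bin_of):
    """Accumulate pixels into bins and average each non-empty bin.

    pixels : iterable of (y, u, v) tuples in raster order
    bin_of : hypothetical callback mapping (y, u, v) to a bin identifier
    Returns (palette, index_map), where palette maps each non-empty bin to
    its mean colour and index_map lists the bin of every pixel.
    """
    sums = {}    # bin -> [sum_y, sum_u, sum_v]  (colour accumulator)
    counts = {}  # bin -> pixel counter
    index_map = []
    for y, u, v in pixels:
        b = bin_of(y, u, v)
        acc = sums.setdefault(b, [0, 0, 0])
        acc[0] += y
        acc[1] += u
        acc[2] += v
        counts[b] = counts.get(b, 0) + 1
        index_map.append(b)  # record the bin position for this pixel
    # Mean colour = accumulated colour divided by pixel count.
    palette = {b: tuple(s // counts[b] for s in sums[b]) for b in sums}
    return palette, index_map


# Usage with a trivial luminance-only binning, for illustration only.
example_pixels = [(10, 128, 128), (20, 128, 128), (200, 100, 100)]
palette, index_map = build_first_palette(example_pixels, lambda y, u, v: y >> 5)
print(palette, index_map)
```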
2.2.1.2 Analyzing the histogram and classifying the image
Tile classification is a method of finding the dominant colors in a tile. Based on the distribution and color variance in the palette, a tile is classified into one of three groups: flat, bi-level and multi-color. A flat tile has colors that are visually constant to the human eye, and usually forms one cluster in the 2D histogram. A flat palette has up to 3 colors, and the color variance is small. A bi-level tile has two distinctive colors, which usually line up vertically in the 2D histogram. A bi-level palette has colors spanning a few luminance bands, but the color variance within each luminance band is small. A multi-color tile usually spreads over a large number of bins in the 2D histogram. The multi-color group comprises the tiles that fail the first two tests.
Step 1120 of Figure 11 analyzes the bin distribution and color characteristics in the 2D histogram and classifies the tile accordingly. Tiles are divided into three groups: flat, bi-level and multi-color. Figure 47 shows the processing of step 1120 in more detail. In step 4710, the flat tile test is performed. If the result is Yes, the tile is classified as flat in step 4712. Otherwise a second test is performed in step 4720 to determine whether the tile is bi-level. If the result of the bi-level test is Yes, the tile is classified as bi-level in step 4722. Otherwise the tile is classified as multi-color in step 4724. Steps 4710 and 4720 are described in more detail below.
For step 4710, LumRange is defined as the range between the highest and lowest luminance bands whose bins are not all empty. To pass the flat test, a tile must satisfy all three of the following conditions:
1 number of non-empty bins <= 3;
2 LumRange <= 2; and
3 FlatColourVariance < FLAT_COLOUR_VARIANCE,
where FlatColourVariance is defined as the sum of the pixel-count-weighted Manhattan distances between the largest bin and the remaining bins. The threshold parameter may be FLAT_COLOUR_VARIANCE = 15.
For step 4720, to pass the bi-level test a tile must satisfy all three of the following conditions:
1 number of non-empty bins <= BILEVEL_MAX_BIN_CNT;
2 LumRange > 2; and
3 MaxColourVariance < BILEVEL_COLOUR_VARIANCE,
where MaxColourVariance is defined as max(ColourVariance[band]), and ColourVariance[band] is the sum of the pixel-count-weighted Manhattan distances between the largest bin in the band and the remaining bins in that band. The parameter values may be BILEVEL_MAX_BIN_CNT = 16 and BILEVEL_COLOUR_VARIANCE = 40.
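The flat and bi-level tests may be sketched in Python as follows. Several details are not fixed by the description and are assumptions of this sketch: the histogram is represented as a dictionary of non-empty bins, LumRange is taken as the difference between the highest and lowest occupied band indices, the Manhattan distance is taken over all three YUV components, and the weighted sum is normalized by the total pixel count so that it is comparable with the stated thresholds.

```python
FLAT_COLOUR_VARIANCE = 15
BILEVEL_MAX_BIN_CNT = 16
BILEVEL_COLOUR_VARIANCE = 40


def manhattan(a, b):
    return sum(abs(p - q) for p, q in zip(a, b))


def weighted_variance(bins):
    """bins: list of (pixel_count, colour) for the bins being compared."""
    total = sum(count for count, _ in bins)
    _, ref = max(bins, key=lambda b: b[0])  # colour of the largest bin
    # Assumption: normalise the weighted sum by the total pixel count.
    return sum(count * manhattan(col, ref) for count, col in bins) / total


def classify_tile(histogram):
    """histogram: dict (band, column) -> (pixel_count, mean_yuv), non-empty bins only."""
    bands = [band for band, _ in histogram]
    lum_range = max(bands) - min(bands)  # assumption: difference of band indices
    n_bins = len(histogram)
    if (n_bins <= 3 and lum_range <= 2
            and weighted_variance(list(histogram.values())) < FLAT_COLOUR_VARIANCE):
        return "flat"
    per_band = {}
    for (band, _), entry in histogram.items():
        per_band.setdefault(band, []).append(entry)
    max_var = max(weighted_variance(entries) for entries in per_band.values())
    if (n_bins <= BILEVEL_MAX_BIN_CNT and lum_range > 2
            and max_var < BILEVEL_COLOUR_VARIANCE):
        return "bilevel"
    return "multicolour"
```

For example, a histogram with one dark bin in band 0 and one bright bin in band 7 fails the flat test on LumRange and passes the bi-level test, since each band contains a single bin and therefore has zero variance.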
2.2.1.3 Forming the second palette
Figure 41 shows the processing of step 1130 of Figure 11 in more detail. Step 4110 tests whether the tile is flat. If the tile is flat, step 4120 forms the flat color. This may be done by computing the weighted mean of the non-empty bins. If the result of the test in step 4110 is No, processing continues at step 4130, which tests whether the tile is classified as bi-level. If the result is Yes, processing moves to step 4140. In step 4140, a two-color palette is generated. Otherwise, if step 4130 returns false, processing continues at step 4150. In step 4150, a multi-color palette is generated.
2.2.1.3.1 Generating the two-color palette
Figure 42 provides further details of generating, in step 4140, the two-color palette for a bi-level tile. The aim is to find two contrasting colors to represent the image. Because of halftoning and registration errors in printing, the colors that represent the two original contrasting colors are generally polluted. As a result, the mean colors of the foreground and background regions are not good representations of the original image. Excluding the colors in the transition region makes the image look sharper.
In step 4210, the darkest and brightest colors are selected to form the initial palette of dominant colors. In step 4220, a bin list is generated from the six most populated bins; that is, the top six bins by pixel count are found from the palette. Steps 4230 to 4270 process the colors in the list sequentially. In step 4230, the next bin color C is obtained from the list. Decision step 4240 tests whether color C is already included in the initial palette, or whether the color is too far from both extremes. If the result is Yes, the color is ignored and processing returns to step 4230 to obtain the next bin color. If the result of step 4240 is No, processing continues at step 4250. Decision step 4250 tests whether the color is suitable for merging into the initial palette. A color is suitable for merging if it is close to either of the colors in the initial palette. If the result of step 4250 is Yes, the color is merged, with pixel-count weighting, into whichever of the initial palette colors has the nearer Manhattan color distance. The pixel count of C is added to the pixel count of the palette color into which it is merged. Processing continues at step 4270. If the result of step 4250 is No, the color is ignored and processing proceeds to step 4270, which checks whether any unprocessed colors remain. If the test of step 4270 returns Yes, processing returns to step 4230. Otherwise processing terminates.
2.2.1.3.2 Generating the multi-color palette
Figure 43 expands step 4150 of Figure 41, which generates the multi-color palette for a multi-color tile. In step 4310, the darkest and brightest colors are selected to form the initial palette of dominant colors. In step 4320, a third color is added to the palette if the following two conditions are true:
1 LumRange > THIRD_COLOUR_MIN_LD; and
2 LargestVar > THIRD_COLOUR_MIN_VAR
LargestVar is defined as the largest Manhattan color distance from any bin lying between the darkest and brightest luminance bands to the mean of the darkest and brightest colors. If the above test is true, the color that produces LargestVar is added as the third initial palette color. The thresholds may be THIRD_COLOUR_MIN_LD = 4 and THIRD_COLOUR_MIN_VAR = 40.
In step 4330, the top six (that is, most populated) bins are added to a bin list. Steps 4340 to 4395 process the colors in the list sequentially. In step 4340, the next bin color C is obtained from the list. Step 4350 tests whether the color is already included in the initial palette. If so, the color is ignored and processing returns to step 4340. Otherwise step 4360 attempts to merge the color into one of the palette colors. The color is merged into the palette color with the nearest Manhattan color distance, provided that the distance is within BIN_MERGE_THRESHOLD1 (where the threshold may be BIN_MERGE_THRESHOLD1 = 10). If the attempt in step 4360 succeeds, processing continues at step 4395, which checks whether more colors remain to be processed. Otherwise processing proceeds to step 4370.
Step 4370 tests whether another color can be added to the palette. If step 4370 returns true, processing continues at step 4380, in which the additional color is added. The test is true if the following pseudo-code is true.
number_palette_colours < MAX_NUM_PALETTE_COLOURS
&&
(
minDist > BIN_MERGE_THRESHOLD2
||
(
minDist > BIN_MERGE_THRESHOLD3
&&
(minDist * pCnt) > BIN_NEW_MIN
&&
pixel_count_closest_palette_colour > BIN_DONT_TOUCH_CNT
)
)
minDist is the nearest Manhattan color distance from C to a palette color. pCnt is the pixel count of C. pixel_count_closest_palette_colour is the pixel count of the palette color that produces minDist. The thresholds may be MAX_NUM_PALETTE_COLOURS = 4, BIN_MERGE_THRESHOLD2 = 70, BIN_MERGE_THRESHOLD3 = 40, BIN_NEW_MIN = 4000 and BIN_DONT_TOUCH_CNT = 150.
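The test for adding a color may be sketched as a Python predicate using the example threshold values. The function and parameter names are hypothetical, and the sketch assumes that the two parenthesized groups of the pseudo-code are alternatives joined by a logical OR, which is how the nesting of the parentheses reads.

```python
MAX_NUM_PALETTE_COLOURS = 4
BIN_MERGE_THRESHOLD2 = 70
BIN_MERGE_THRESHOLD3 = 40
BIN_NEW_MIN = 4000
BIN_DONT_TOUCH_CNT = 150


def should_add_colour(num_palette_colours, min_dist, p_cnt, closest_cnt):
    """Decide whether bin colour C becomes a new palette colour.

    min_dist    : nearest Manhattan distance from C to a palette colour
    p_cnt       : pixel count of C
    closest_cnt : pixel count of the palette colour that produced min_dist
    """
    if num_palette_colours >= MAX_NUM_PALETTE_COLOURS:
        return False  # the palette is already full
    # Either C is very far from every palette colour ...
    very_far = min_dist > BIN_MERGE_THRESHOLD2
    # ... or it is moderately far, carries enough weight, and the nearest
    # palette colour is already well populated (assumed OR of the two groups).
    significant = (min_dist > BIN_MERGE_THRESHOLD3
                   and min_dist * p_cnt > BIN_NEW_MIN
                   and closest_cnt > BIN_DONT_TOUCH_CNT)
    return very_far or significant
```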
From step 4380, processing continues at step 4395.
If the test in step 4370 is false, processing continues at step 4390. In step 4390, bin color C is merged with the palette color having the nearest Manhattan color distance. The colors are merged with pixel-count weighting, and the pixel count of C is added to the pixel count of the palette color into which it is merged. Processing continues at step 4395. If no more colors remain to be processed at step 4395, processing terminates.
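The merge of step 4390 may be sketched as follows. The sketch assumes that "pixel-count weighting" means a pixel-count-weighted mean of the two colors; the palette representation and the function name are hypothetical.

```python
def merge_into_palette(palette, bin_colour, bin_count):
    """Merge a bin colour into the nearest palette colour, in place.

    palette : list of [colour, pixel_count] entries
    Returns the index of the palette entry that absorbed the bin.
    """
    dists = [sum(abs(a - b) for a, b in zip(colour, bin_colour))
             for colour, _ in palette]
    i = dists.index(min(dists))  # nearest Manhattan colour distance
    colour, count = palette[i]
    total = count + bin_count
    # Assumption: the merged colour is the pixel-count-weighted mean.
    merged = tuple((c * count + b * bin_count) / total
                   for c, b in zip(colour, bin_colour))
    palette[i] = [merged, total]
    return i


example_palette = [[(0, 128, 128), 100], [(200, 128, 128), 300]]
merged_index = merge_into_palette(example_palette, (240, 128, 128), 100)
print(merged_index, example_palette[merged_index])
```

Weighting by pixel count keeps a well-populated palette color from being pulled far towards a small outlying bin.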
2.2.1.4 Associating pixels with the second palette
Once the dominant colors have been found, the pixels in the tile are quantized to one of the dominant colors. A quantized map is produced, together with the list of dominant colors, for connected component analysis. The quantization process for each group is as follows:
1) flat: no quantization;
2) bi-level: either remap the palette to one of the two dominant colors, or find a threshold with which to binarize the original pixels; and
3) multi-color: remap the palette to one of the dominant colors.
Binarization produces sharper outlines but takes longer, because a suitable threshold must be found. The steps for finding the threshold are: 1) compute the first derivative on the luminance channel, 2) identify edge pixels, and 3) use the mean luminance of the edge pixels as the threshold. An edge pixel is a pixel for which the first-derivative outputs in the surrounding 3 × 3 neighborhood are all above a predetermined threshold.
Figure 44 provides further details of step 1140 of Figure 11. Step 4410 tests whether the tile is classified as bi-level. If so, processing continues at step 4420, in which the bi-level tile is mapped. Otherwise processing continues at step 4430, in which the multi-color tile is mapped.
2.2.1.4.1 Mapping a bi-level tile
Figure 45 provides further details of step 4420. The overall process 4420 maps all non-empty bins to the two colors of the second palette, with a quantization error check. Steps 4510 to 4570 perform quantization and error checking for each non-empty bin. If no large quantization error is found, all pixels are remapped to the second palette after bin quantization. If a large quantization error is found, the tile is reclassified as multi-color and undergoes multi-color tile mapping. The details of mapping a bi-level tile are explained below.
Step 4510 determines whether outline enhancement is needed and, for bins subject to outline enhancement, selects the preferred extreme color for quantization. Outline enhancement is needed if the pixel count of one of the two palette colors exceeds the other by a factor of 4. Let the two colors in the second palette be a first color C1 with pixel count P1 and a second color C2 with pixel count P2. If (P1/P2) or (P2/P1) is greater than 5, outline enhancement is needed and OUTLINE_ENHANCE is set to true. The preferred extreme color may be the color with the smaller pixel count.
Step 4515 obtains the next non-empty bin, which has pixel count pCnt. Step 4520 computes the quantization error with respect to the two colors: the Manhattan distances (D1 and D2) to the two palette colors are computed, and the smaller is defined as minDist. minDist is the quantization error. Processing then continues at decision step 4525, which checks whether the quantization error is too large. The following pseudo-code defines the condition under which the quantization error is too large:
(minDist * pCnt) > BIN_NEW_BILEVEL_THRESHOLD
||
(
minDist > BIN_NEW_BILEVEL_COLOUR_DIFF
&&
pCnt > BIN_MERGE_BILEVEL_CNT_MIN
)
where the thresholds may be BIN_NEW_BILEVEL_THRESHOLD = 6000, BIN_NEW_BILEVEL_COLOUR_DIFF = 50 and BIN_MERGE_BILEVEL_CNT_MIN = 100.
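The quantization error check of steps 4520 and 4525 may be sketched in Python as follows, using the example threshold values. The function name is hypothetical, and the sketch assumes that the first comparison and the parenthesized group are joined by a logical OR.

```python
BIN_NEW_BILEVEL_THRESHOLD = 6000
BIN_NEW_BILEVEL_COLOUR_DIFF = 50
BIN_MERGE_BILEVEL_CNT_MIN = 100


def manhattan(a, b):
    return sum(abs(p - q) for p, q in zip(a, b))


def bilevel_error_too_big(bin_colour, p_cnt, c1, c2):
    """Return True if quantising this bin to C1 or C2 loses too much."""
    d1 = manhattan(bin_colour, c1)
    d2 = manhattan(bin_colour, c2)
    min_dist = min(d1, d2)  # the quantisation error
    return (min_dist * p_cnt > BIN_NEW_BILEVEL_THRESHOLD
            or (min_dist > BIN_NEW_BILEVEL_COLOUR_DIFF
                and p_cnt > BIN_MERGE_BILEVEL_CNT_MIN))
```

A bin for which this predicate is true would be added to the extra dominant color list rather than quantized to one of the two palette colors.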
If the test at step 4525 is true, processing continues at step 4540. In step 4540, the current bin is added to the extra dominant color list. Processing then continues at step 4570. If the test in step 4525 is false, processing continues at decision step 4530, which checks whether the extra dominant color list is empty. If the list is not empty (No), processing moves to step 4570, which checks whether more non-empty bins remain to be processed. Otherwise, if the result of step 4530 is Yes, processing continues at step 4545, which determines whether outline enhancement is needed and whether the two distances are close. The test condition is given by the following pseudo-code:
OUTLINE_ENHANCE
&&
abs(D1-D2)<BILEVEL_THRESHOLD_MARGIN
where the threshold may be BILEVEL_THRESHOLD_MARGIN = 16.
If the test in step 4545 is true, processing continues at step 4550. In step 4550, the bin is quantized to the preferred color. Processing continues at step 4570. Otherwise processing continues at step 4555, in which the bin is quantized to the closer color based on D1 and D2. Processing then continues at step 4570, which checks whether more non-empty bins remain to be processed. If so, processing returns to step 4515. If not, processing continues at step 4560, which checks whether the extra dominant color list is empty; this determines whether a large quantization error occurred during bin quantization. If the list is empty, processing continues at step 4575, and all pixels are remapped to one of the two palette colors according to the bin mapping of step 4550 or 4555 (as appropriate). Processing then terminates. If the list is not empty in step 4560, processing continues at step 4565, in which an extra color is added to the palette. The bin with the largest pixel count in the extra dominant color list is chosen as the third palette color. In step 4430, the tile is remapped as a multi-color tile. Processing then terminates.
2.2.1.4.2 Mapping a multi-color tile
Figure 46 expands step 4430 of Figure 44. Step 4610 obtains the next non-empty bin. Step 4620 quantizes the bin to one of the palette colors. This may be done based on the nearest Manhattan color distance. Step 4630 checks whether more non-empty bins remain to be processed. If the test in step 4630 is true, processing continues at step 4610. If the test in step 4630 returns No, processing continues at step 4640. In step 4640, all pixels are remapped to one of the palette colors according to the bin mapping of step 4620. Processing then terminates.
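The two stages of this mapping, quantizing each bin and then remapping the pixels through the bin mapping, may be sketched in Python as follows. The data representations (a dictionary for the first palette, a list for the second palette and a flat list for the index map) and the function name are assumptions of the sketch.

```python
def manhattan(a, b):
    return sum(abs(p - q) for p, q in zip(a, b))


def quantise_tile(index_map, first_palette, second_palette):
    """Remap every pixel to the label of its nearest second-palette colour.

    index_map      : list giving the bin of each pixel
    first_palette  : dict mapping each non-empty bin to its mean colour
    second_palette : list of dominant colours; the list index is the label
    """
    # Quantise each non-empty bin once (steps 4610 to 4630).
    bin_to_label = {}
    for bin_id, colour in first_palette.items():
        dists = [manhattan(colour, p) for p in second_palette]
        bin_to_label[bin_id] = dists.index(min(dists))
    # Remap all pixels through the bin mapping (step 4640).
    return [bin_to_label[b] for b in index_map]


example_first = {0: (15, 128, 128), 5: (170, 120, 130), 6: (200, 100, 100)}
example_second = [(10, 128, 128), (210, 100, 100)]
print(quantise_tile([0, 5, 6, 0], example_first, example_second))
```

Because the distance is evaluated once per bin rather than once per pixel, the per-pixel work reduces to a table lookup.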
2.2.2 Color CC analysis and blob statistics
Process 1020 of Figure 10 takes the quantized tile from the previous step and forms blobs. Each blob has the following statistics: bounding box, size, mean color, bit mask and blob boundary length. Blobs are formed in a single raster pass in a fast and efficient manner. In raster order, adjacent pixels of the quantized tile that belong to the same color class are grouped to form a "run". At the end of each run (segment), the run is compared with the touching segments on the row above (in terms of 8-way connectivity) in order to grow or merge. Growing occurs if a run that has yet to be given a blob label touches a blob of the same class. Merging, by contrast, occurs when two blobs of the same class touch. If growing cannot occur, a new blob is formed. The blob statistics are updated at the end of each run.
Figure 36 is an illustration of color blobbing processing 3600 using an example 6 × 6 input tile 3610. Color quantization is applied to the input tile 3610, which has several colors, to produce the quantized tile 3620. The quantized tile 3620 contains class labels corresponding to the dominant color classes. In this case there are two dominant colors in the tile, so class labels 0 and 1 are given. Connected component analysis begins by grouping adjacent labels of the same class within a row to form runs. For example, in Figure 36 the first four "0"s of the first row form one run, and the following two "1"s form another run. Blobs are formed by joining runs of the same class from consecutive rows of the quantized tile 3620. Tile 3630 shows the runs in the tile. When the current segment is compared with the touching segments on the row above, there are three possible actions:
1 grow an existing blob by adding the current segment to it;
2 merge two blobs by unifying the statistics of the two blobs into one;
3 form a new blob by initializing it with the current segment.
Tile 3640 shows the resulting blobs, blob 0 and blob 1, where blob 0 has the outer bounding box and blob 1 has the inner bounding box. Blob statistics are accumulated as the runs and blobs are formed, so at the end of this processing stage each blob has all the statistics shown in Figure 36 for blob 0 and blob 1. The bit masks 3650 and 3660 are illustrative; they are not actually formed at this stage. The quantized color is used as the mean color of the blob. Alternatively, the mean color of a blob may be determined by accumulating the actual pixel values in the blob rather than the quantized color, which gives a more accurate mean color. Again, the blob statistics may include size, mean color, boundary length, bounding box and bit mask.
Fig. 9 shows step 1020 of Figure 10 in more detail. Step 1020 uses a loop structure to form blobs from the quantized tile, processing one tile row at a time from top to bottom. In step 910, the current tile row is obtained for processing. In step 920, a contiguous segment of pixels with the same quantized label is formed. This may be done by recording its start and end positions. This segment is the current segment S_c. The start position is the pixel to the right of the end position of the previous segment. For a new tile row, the start position is the first pixel of the row. The end position is the last pixel before the quantized label changes, as the quantized labels of the current tile row are examined pixel by pixel from left to right. If the current tile row ends before a change is detected, the end position is the last pixel of the row. So that the color of the blob can be re-estimated later, the YUV value of each pixel of the segment in the original full-color tile is accumulated during the examination from the start position to the end position. Alternatively, the quantized color of the segment may be used, in which case no color re-estimation is performed.
In step 930, the segment is used to form a new blob or to grow an existing blob. Figure 12 provides further details of this step. In decision step 940, a check is made to determine whether unprocessed pixels remain in the row. If decision step 940 returns true (Yes), processing returns to step 920. This occurs until all pixels in the current tile row have been processed. If step 940 returns false (No), processing continues at step 950. In decision step 950, a check is made to determine whether unprocessed rows remain. If step 950 returns true, processing continues at step 910. This occurs until no unprocessed tile rows remain. If step 950 returns false (No), processing terminates; every pixel in the quantized tile has been assigned to a blob.
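The formation of segments in step 920 may be sketched in Python as follows. The function name and the representation of a segment as a (start, end, label) tuple with an inclusive end position are assumptions of the sketch; accumulation of the YUV values is omitted.

```python
def find_runs(row):
    """Split one row of quantised labels into (start, end, label) segments.

    The end position is inclusive: it is the last pixel before the label
    changes, or the last pixel of the row.
    """
    runs = []
    start = 0
    for x in range(1, len(row) + 1):
        # Close the current segment at the end of the row or on a label change.
        if x == len(row) or row[x] != row[start]:
            runs.append((start, x - 1, row[start]))
            start = x
    return runs


# A row of four "0" labels followed by two "1" labels.
print(find_runs([0, 0, 0, 0, 1, 1]))
```

For the row shown, the function yields one segment covering positions 0 to 3 with label 0 and a second covering positions 4 to 5 with label 1.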
2.2.2.1 Forming blobs
Figure 12 shows step 930 in more detail. The process of Figure 12 takes as input the current segment S_c identified in step 920 of Fig. 9. In step 1205, a variable k is initialized to 1. This value refers to the k-th segment S_k, on the tile row above the current tile row, that is connected to the current segment S_c. Preferably, the connection is 8-way connectivity. In step 1210, a comparison is made between S_k and S_c. In decision step 1215, a check is made to determine whether S_c and S_k are of the same class. If the two segments S_k and S_c have the same quantized label, processing moves from step 1215 to step 1220. Otherwise processing continues at decision block 1235. In decision step 1220, a check is made to determine whether S_k is the first connected segment of the same class. Thus, if S_k is the first connected segment with the same quantized label as S_c, processing continues at step 1225. Otherwise processing continues at step 1230. In step 1225, the current blob is grown; that is, the blob to which the k-th segment belongs is grown with the current segment. Growing a blob involves updating the size, boundary information, bounding box and accumulated YUV value of the blob. Processing then continues at step 1235. Conversely, if decision step 1220 returns false, this indicates that the current segment has already been assigned a blob label, namely blob[i], and touches another blob, namely blob[j], and processing continues at step 1230. If the two blobs have different blob labels, that is, i ≠ j, the two blobs are merged in step 1230. The blob merging process of step 1230 is illustrated in Figure 15. Processing then continues at step 1235.
From steps 1215, 1225 and 1230, processing proceeds to decision block 1235. In decision block 1235, a check is made to determine whether the last connected segment has been processed. If the k-th segment is not the last segment connected to the current segment, processing continues at step 1240, and k is incremented by 1. Processing then returns to step 1210 to process the next connected segment. Otherwise, if decision step 1235 returns true, processing moves to decision block 1245. In decision block 1245, a check is made to determine whether none of the connected segments is of the same class (that is, has the same quantized label) as the current segment. If step 1245 returns true, processing continues at step 1250. In step 1250, a new blob is formed using the current segment. Forming a new blob involves assigning a new blob label to the current segment, incrementing the blob count by 1, and initializing the blob statistics using the information of the current segment. Processing terminates after step 1250. Likewise, if decision block 1245 returns false (No), processing terminates.
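The grow, merge and new-blob decisions may be sketched in Python as follows. This is an illustrative single-pass labelling only: blob statistics are not collected, merged labels are tracked with a simple parent table (an implementation choice of this sketch, not something stated in the description), and the function names and example tile are hypothetical.

```python
def find_runs(row):
    """Split one tile row into (start, end, label) runs; end is inclusive."""
    runs, start = [], 0
    for x in range(1, len(row) + 1):
        if x == len(row) or row[x] != row[start]:
            runs.append((start, x - 1, row[start]))
            start = x
    return runs


def label_blobs(tile):
    """Return (label_map, blob_count) for a tile of quantised class labels."""
    parent = []  # parent[i] is the label that blob i was merged into

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    labelled_rows, prev = [], []
    for row in tile:
        cur = []
        for start, end, cls in find_runs(row):
            label = None
            for p_start, p_end, p_cls, p_label in prev:
                # 8-way connectivity: runs touch if they overlap or meet diagonally.
                touches = p_start <= end + 1 and p_end >= start - 1
                if not touches or p_cls != cls:
                    continue
                root = find(p_label)
                if label is None:
                    label = root          # grow an existing blob
                elif root != label:
                    parent[root] = label  # merge two blobs of the same class
            if label is None:
                label = len(parent)       # form a new blob
                parent.append(label)
            cur.append((start, end, cls, label))
        labelled_rows.append(cur)
        prev = cur

    # Resolve merged labels and renumber the surviving blobs compactly.
    compact, label_map = {}, []
    for row, runs in zip(tile, labelled_rows):
        out = [0] * len(row)
        for start, end, _, label in runs:
            blob_id = compact.setdefault(find(label), len(compact))
            out[start:end + 1] = [blob_id] * (end - start + 1)
        label_map.append(out)
    return label_map, len(compact)


example_tile = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
print(label_blobs(example_tile))
```

In the example tile, the two class-1 blobs started on the first row are joined by the third row, which exercises the merge case; the class-0 pixels form a second blob.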
2.2.2.2 Example of spot merging
Figure 15 shows an example of spot merging 1500. Segments 1510 and 1520 both belong to spot[i], which is connected to segment 1530 belonging to spot[j]. The current start and current end of segment 1520 are shown, as are the above start and above end of segment 1530. Figure 15 also shows the overlap of segments 1520 and 1530. The spot merging processing 1500 involves combining the statistics of the two spots, mapping one spot label to the other, and reducing the number of spots by 1. For example, in Figure 15, label j is mapped to label i. The spot statistics are combined using the following pseudo-code instructions:
blob[i].boundingBox = combine(blob[i].boundingBox, blob[j].boundingBox)
blob[i].size += blob[j].size
blob[i].tileBorderPixelCount += blob[j].tileBorderPixelCount
blob[i].horizontalEdges += blob[j].horizontalEdges - [inline figure omitted] overlap
blob[i].verticalEdges += blob[j].verticalEdges
blob[i].YUV += blob[j].YUV
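As a hedged illustration, the statistics update above can be written as a small Python function. The `Spot` class and `merge_spots` function are hypothetical names (a "spot" corresponds to `blob` in the pseudo-code), the bounding box layout (left, top, right, bottom) is an assumption, and the amount subtracted from the horizontal edge count is passed in by the caller, because the scaling of the overlap term is not fully legible in the pseudo-code.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Box = Tuple[int, int, int, int]  # assumed layout: (left, top, right, bottom)


@dataclass
class Spot:
    bounding_box: Box
    size: int
    tile_border_pixel_count: int
    horizontal_edges: int
    vertical_edges: int
    yuv: Tuple[int, int, int]    # accumulated Y, U and V sums


def combine(a: Box, b: Box) -> Box:
    """Smallest box that encloses both input boxes."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_spots(spots: Dict[int, Spot], label_map: Dict[int, int],
                i: int, j: int, edge_correction: int) -> None:
    """Fold spot j into spot i and map label j to label i."""
    si = spots[i]
    sj = spots.pop(j)            # removing j reduces the spot count by 1
    si.bounding_box = combine(si.bounding_box, sj.bounding_box)
    si.size += sj.size
    si.tile_border_pixel_count += sj.tile_border_pixel_count
    si.horizontal_edges += sj.horizontal_edges - edge_correction
    si.vertical_edges += sj.vertical_edges
    si.yuv = tuple(a + b for a, b in zip(si.yuv, sj.yuv))
    label_map[j] = i
```

Because every statistic is a sum (or a bounding box union), the merged spot carries everything needed later without revisiting the pixels.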
2.2.3 Intra-paster merging
Once the spots of a paster have been formed by connected component analysis, the next stage is to merge semantically related spots using color, spot size and spot boundary-length statistics. From this stage on, spots can only be merged, not split. The paster therefore moves from being over-segmented toward the correct level of segmentation. An example of intra-paster merging is shown in Figures 37 and 38, in which the spot statistics are used to assess whether given spots should be merged. In the example there are 4 quantized colors, and the connected component analysis returns 10 spots. Figure 37 shows, on the left, the spots before merging 3710. Many of these spots result from residual halftone patterns and from color "bleeding" between two regions of different color. After intra-paster merging, small spots with relatively long boundary lengths are merged into larger spots of similar color. The spots after merging 3720 are shown in example 3700. Figure 38 contains a comparison 3800 of the original paster 3810 and the merged spots 3820.
The quantization and spot formation processing typically create many small, unwanted or erroneous spots. These take the form of small spots caused by noise speckles and residual halftoning in the input image, or of thin, high-aspect-ratio spots caused by bleeding effects at the edges of larger connected components. The erroneous spots can be removed by merging each of them with the touching spot that is closest in color.
Speed and memory usage can be improved by limiting the number of spots in a paster produced by the segmentation and connected component processing. If a paster contains too many spots, their number can be reduced by merging some similarly colored spots, even if those spots do not touch. This yields spots that have separate, unconnected parts but are treated as a single element. Quality is not affected, because this only occurs in pasters containing a large number of small noise elements, which are discarded in a later step.
Figure 19 shows step 1030 of Figure 10 in detail. In step 1905, a spot in the current paster is obtained. In step 1910, the ratio of the spot's perimeter to its area is checked to determine whether it is high. If this parameter is found to be above a threshold level, processing continues at step 1915. Otherwise, if step 1910 returns false, processing continues at step 1930. In step 1915, the proportion of pixels in the spot that touch the paster edge is checked. If this proportion is above a threshold, processing continues at step 1920. In step 1920, the spot is marked "force inter-paster merge". A force-merge flag is set for the spot, which makes the spot more likely to be merged with a connected component in an adjacent paster in step 1420 of Figure 14. After step 1920, processing continues at step 1930. If the spot's paster-edge proportion is below the threshold in step 1915, processing continues at step 1925. In step 1925, the spot is merged with the adjacent spot of closest color: all spots touching the current spot are found, together with the distances between their colors and the color of the current spot, and the current spot is merged with the touching spot that is closest in color. Processing then continues at step 1930. In decision step 1930, a check is made to determine whether there are more spots in the paster to process. If step 1930 returns true, processing returns to step 1905. Otherwise, processing continues at step 1935.
In decision step 1935, the current number of spots in the paster is checked to determine whether there are too many. If the number of spots is found to be above a predetermined limit, processing continues at step 1940. Otherwise, processing ends. In step 1940, spots of the same quantized color class that do not touch the paster edge are merged together. This is done for each quantized color class. Spots that touch the paster edge are not merged, because they may form part of a much larger CC, and merging them could have a detrimental effect on quality. In step 1945, the current number of spots in the paster is checked again to determine whether the paster still contains too many spots. If the number is now found to be below the predetermined limit (NO), processing terminates. Otherwise, processing continues at step 1950. In step 1950, the spots of each color that touch the paster edge are merged. Step 1950 performs processing similar to step 1940, but takes the spots touching the paster edge into account, in order to reduce the number of spots below the limit. Processing then terminates.
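The per-spot decision of steps 1910 to 1925 can be sketched as below. The function name, the returned strings and both threshold parameters are hypothetical; the text gives no threshold values, and the handling of values exactly equal to a threshold is an assumption.

```python
def intra_paster_action(perimeter: int, area: int, border_pixels: int,
                        ratio_threshold: float, border_threshold: float) -> str:
    """Decide what to do with one spot, following steps 1910 to 1925."""
    if perimeter / area <= ratio_threshold:        # step 1910: ratio not high
        return "keep"
    if border_pixels / area > border_threshold:    # step 1915: on paster edge
        return "force_inter_paster_merge"          # step 1920
    return "merge_with_closest_neighbor"           # step 1925
```

Compact spots are kept, thin spots that lie mostly on the paster edge are deferred to inter-paster merging, and the remaining thin spots are absorbed by the touching spot of closest color.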
2.2.4 Inter-paster merging
Figure 13 shows the inter-paster merging processing of step 1040 in detail. This processing is repeated for each of the left and top borders of the current paster, in no particular order. The following description applies to merging along either the left or the top border of the current paster. For example, a 32 × 32 paster has a border of 32 pixels adjoining its neighboring paster state, each pixel having a spot label. The neighboring paster state likewise has a border of 32 pixels, each pixel having a CC label. Thus, for each pixel step along the border, there is a spot label in the current paster and a corresponding CC label in the neighboring paster state. Processing begins in step 1310 by obtaining, for the next paster border pixel, the spot label in the current paster and the CC label in its neighboring paster state along the common border.
In decision block 1320, a test is performed to detect, as processing moves along the border, a change in the CC label, a change in the spot label, or the last pixel. Decision block 1320 returns YES if the current pixel is the last border pixel. If step 1320 returns false (NO), processing continues at step 1380. Otherwise, if step 1320 returns true (YES), processing continues at step 1330. Step 1330 examines the number of spots and the number of CCs that are available as merge candidates. In decision step 1340, a check is made to determine whether the candidate count condition is satisfied. Decision block 1340 returns YES if the merge candidate counts of spots and CCs satisfy the predetermined conditions 1700 shown in Figure 17 and described below.
If decision step 1340 of Figure 13 returns false (NO), processing continues at step 1370. Otherwise, processing continues at step 1350. Step 1350 identifies the best spot and CC pair for merging among the merge candidates, based on a color distance metric. If (Y_cc, U_cc, V_cc) and (Y_blob, U_blob, V_blob) are the YUV color values of a CC and spot candidate pair, the squared color distance sd is given by:
sd = W_y (Y_{cc} - Y_{blob})^2 + W_u (U_{cc} - U_{blob})^2 + W_v (V_{cc} - V_{blob})^2
where W_y, W_u and W_v are the weights of the Y, U and V channels, respectively. The weights W_y, W_u and W_v may be set to 0.6, 0.2 and 0.2, respectively. The best spot and CC pair is the one with the smallest squared distance value.
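A minimal Python sketch of this metric and of the selection made in step 1350 is given below. The function names and the representation of a candidate pair as two (Y, U, V) tuples are assumptions; the weights are the example values stated in the text.

```python
from typing import Iterable, Sequence, Tuple

WEIGHTS = (0.6, 0.2, 0.2)   # example weights for the Y, U and V channels

YUV = Tuple[float, float, float]


def squared_color_distance(cc_yuv: Sequence[float], spot_yuv: Sequence[float],
                           weights: Sequence[float] = WEIGHTS) -> float:
    """Weighted squared distance between two YUV colors."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, cc_yuv, spot_yuv))


def best_pair(candidates: Iterable[Tuple[YUV, YUV]]) -> Tuple[YUV, YUV]:
    """Return the (cc_yuv, spot_yuv) pair with the smallest squared distance."""
    return min(candidates, key=lambda pair: squared_color_distance(*pair))
```

Weighting luminance more heavily than the two chrominance channels makes the comparison more sensitive to brightness differences than to hue differences.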
The best spot and CC pair is processed by step 1360, which performs various merging operations and is described in more detail with reference to Figure 14. After step 1360, or following a NO from decision block 1340, processing continues at step 1370, in which the spot and CC candidate counts are updated. Figure 48 shows the CC and spot candidate counts before and after the update for all operating conditions. If both the CC and spot labels are found to have changed, the CC and spot candidate counts are set to 1 for every possible combination of counts before the update ("x" denotes "don't care" before the update). If only one label is found to have changed, and there are two candidates on either side of the paster border, the counts are set to 0 or 1 depending on which of the two candidates was merged. If the second spot is selected for merging (see 4920 in Figure 49), both candidate counts are set to 0 after the merge has taken place. However, if the first spot, or no spot, is selected for merging, both counts are set to 1. In the remaining cases requiring a candidate count update, the count for the label that changed is incremented by 1, and the count for the unchanged label is set to 1.
After step 1320 or 1370, processing continues at step 1380. In decision block 1380, a check is made to determine whether the current pixel is the last border pixel. If step 1380 returns false (NO), processing moves to the next pixel position in step 1310. Otherwise, processing terminates.
2.2.4.1 Inter-paster merging conditions
According to Figure 17, merging generally takes place when there are two candidates on one side and one on the other, except when the CC and spot labels change simultaneously (in which case one candidate on each side is sufficient for merging). This avoids merging two adjacent candidates on one side into the same candidate on the other side. Figure 17 shows the conditions under which an inter-paster merging operation can be performed. If the current CC and spot counts are (1, 1), both labels must change simultaneously for a merge to take place. However, if the current CC and spot counts are (1, 2) or (2, 1), a change in either label is sufficient for a merge. The case (2, 2) can never occur.
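The count conditions described for Figure 17 can be expressed as a small predicate. This sketch covers only the cases stated in the text; the function name and argument names are hypothetical, and returning False for any other count combination is an assumption.

```python
def may_merge(cc_count: int, spot_count: int,
              cc_changed: bool, spot_changed: bool) -> bool:
    """Return True if the candidate counts allow an inter-paster merge."""
    counts = (cc_count, spot_count)
    if counts == (1, 1):
        # One candidate on each side: both labels must change together.
        return cc_changed and spot_changed
    if counts in ((1, 2), (2, 1)):
        # Two candidates on one side: a change of either label is enough.
        return cc_changed or spot_changed
    # (2, 2) is stated never to occur; other combinations are not described.
    return False
```

Waiting for a second candidate on one side before merging is what allows the better of two neighbouring candidates to be chosen.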
2.2.4.2 Examples of inter-paster merging
Figure 49 provides examples of merging between spot and CC candidates. There are three cases: i) in 4910, there is only one candidate on each side; ii) 4920 has one CC and two spots; and iii) 4930 has two CCs and one spot. In case 4920, the CC candidate is connected to two spot candidates. Suppose both spots are close in color to the CC, but only one of them may be merged with the CC. If merging were performed in top-to-bottom order with one candidate per side, the top spot would be merged with the CC and the remaining bottom spot would not. This may not produce the best result when the bottom spot would have been the better merge candidate. Therefore, cases such as 4920 and 4930 require two candidates on one side. Case 4910 requires only one candidate on each side because, for 4-way connectivity, there is no alternative merge combination.
By merging spots between pasters in paster raster order, connected components of color spanning more than one paster are formed. As shown in example 3900 of Figure 39, this is performed in raster order, with the spots in paster 3910 being merged with spots in either of the two adjacent pasters 3930 and 3920, located to the left of and above paster 3910, respectively.
Each CC is stored in a data structure that holds information about its bounding box, its average color, its size measured in pixels, and the CCs it touches. During inter-paster merging, a CC label is assigned to each spot in the current paster, and the corresponding CC data structure is updated with the spot statistics. As shown in Figure 16, merging is performed between the current paster 1630 and the states 1612, 1622 of the two adjacent pasters 1610, 1620 at the left and top borders of the current paster 1630. The current paster 1630 is a block of pixel data that has been processed to form spots, and has spot labels 1634 for the pixels along the common borders. In contrast, the previous paster states 1612, 1622 contain no pixel data, only spot statistics and the connected component information linked to those spots. The paster states 1612, 1622 are compact data structures that contain information about the spots on the borders of the pasters 1610, 1620, together with pointers to the CCs 1614, 1624 to which those spots belong. Specifically, there are two paster states: a left paster state 1612 and an above paster state 1622; each holds CC label information 1614, 1624 for each pixel along the border it shares with the current paster 1630, used for merging with the pasters to its right and below it, respectively. The spots in the previous pasters are by now part of connected components.
2.2.4.3 Processing the best spot and CC pair
Figure 14 illustrates step 1360 of Figure 13 in more detail, taking as input the best spot and CC pair identified in step 1350. Processing begins at step 1410. In decision step 1410, a check is made to determine whether the spot has already been merged with another CC. If the spot identified in decision block 1410 has already been assigned a CC label, processing continues at step 1440. Otherwise, processing moves to decision block 1420. In decision block 1420, the color distance of the identified spot and CC pair, calculated in step 1330, is compared with a color merging threshold. This threshold may be 450. If the force-merge flag is set, the threshold may be 900. If the color distance is less than the threshold, processing continues at step 1430. Otherwise, processing continues at step 1460. In step 1430, the spot is merged with the identified CC. This may be done by updating the statistics of the CC with the statistics of the spot and assigning the label of the CC to the spot. Processing then terminates. In step 1460, a new CC is formed for the current spot. Processing then terminates.
In decision block 1440, the color distance between the identified CC and the CC to which the identified spot belongs is compared with the color threshold for merging. If the color distance between the two CCs is below the threshold, processing continues at step 1450. In step 1450, the CCs are mapped together. This may be done by combining their statistics and setting a "mapped-to" pointer that links the CCs together. Processing then terminates. Likewise, if step 1440 returns false (NO), processing terminates.
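The branching of Figure 14 can be summarized by the following sketch. The function name and the returned action strings are hypothetical. The thresholds 450 and 900 are the example values given in the text; applying the unforced threshold in step 1440 is an assumption, since the text does not say whether the force-merge flag affects that comparison.

```python
MERGE_THRESHOLD = 450          # example color merging threshold
FORCED_MERGE_THRESHOLD = 900   # example threshold when the force-merge flag is set


def best_pair_action(spot_has_cc: bool, distance: float,
                     forced_merge: bool = False) -> str:
    """Choose the action for the best spot and CC pair.

    `distance` is the squared color distance between the spot and the CC,
    or between the two CCs when the spot already belongs to a CC.
    """
    if spot_has_cc:                                   # step 1410 true, go to 1440
        if distance < MERGE_THRESHOLD:
            return "map_ccs_together"                 # step 1450
        return "no_action"
    threshold = FORCED_MERGE_THRESHOLD if forced_merge else MERGE_THRESHOLD
    if distance < threshold:                          # step 1420
        return "merge_spot_into_cc"                   # step 1430
    return "form_new_cc"                              # step 1460
```

Doubling the threshold for flagged spots makes thin spots lying on a paster edge more likely to join a component in the adjacent paster.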
2.2.4.4 CC mapping result
Figure 18 illustrates the result of the CC mapping processing of step 1450 of Figure 14. The figure shows that a linked list 1850 of CCs can be formed as a result of merging CCs 1805, 1810, 1820, 1830. In this illustration, CC(k) 1830 has a mapped-to pointer 1832 that points to NULL 1840, indicating that it has never been merged into another CC; it is therefore called the root CC. In addition, the statistics 1834 of the root CC are determined by accumulating the statistics of the individual CCs over the successive merges. For example, the final statistics of CC 1830 are the combined statistics that CC(h) 1805, CC(i) 1810, CC(j) 1820 and CC(k) 1830 had before they were merged together. The order in which these statistics are merged is immaterial. In Figure 18, CC(i) 1810 points to CC(j) 1820, and CC(h) 1805 and CC(j) 1820 point to CC(k) 1830.
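Finding the root CC amounts to following the mapped-to pointers until NULL is reached. The sketch below reproduces the linkage described for Figure 18; the class name `CC`, its fields and the function `find_root` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CC:
    name: str
    mapped_to: Optional["CC"] = None   # None plays the role of NULL


def find_root(cc: CC) -> CC:
    """Follow mapped-to pointers until a CC that maps to nothing is reached."""
    while cc.mapped_to is not None:
        cc = cc.mapped_to
    return cc


# Linkage of Figure 18: i -> j, h -> k, j -> k, and k is the root.
cc_k = CC("k")
cc_j = CC("j", mapped_to=cc_k)
cc_i = CC("i", mapped_to=cc_j)
cc_h = CC("h", mapped_to=cc_k)
```

Every CC in the chain resolves to the same root, which is where the accumulated statistics are held.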
2.2.5 Post-merge processing
Figure 50 is a flow diagram of the post-merge processing step 1050 of Figure 10. In step 5010, a new CC is formed for each unmerged spot in the current paster, using the same processing as step 1460 of Figure 14. In step 5020, the spot labels are used to output binary images that store the shape and appearance of the spots. For a paster with n spots, only n-1 binary images need to be output, because the n-th binary image is stored implicitly as the area remaining after the other n-1 areas have been removed. Consequently, a flat paster containing a single spot does not need to store any binary image. In an alternative embodiment, the binary images may be stored in a single compact data structure such as an index map, in which each pixel position has one spot index represented using log2(n) bits. For example, if the maximum number of spots per paster is 16, the spot index at each pixel position is encoded using 4 bits. In another alternative embodiment, a 1-bit bitmap may be used to store the binary image for bi-level pasters. In step 5030, the touching list of each CC in the current paster is updated. This may be done by identifying all the CCs that are adjacent within the paster. In step 5040, the paster states are output. The color segmentation processing stores the spot and CC information in compact paster state data structures for use in merging with the next input pasters. As mentioned above, there are two paster states: the left paster state holds the CC label information along the right paster border, and the above paster state holds the CC label information along the bottom paster border.
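The storage cost of the index map alternative can be worked out with a short calculation. The helper names below are hypothetical, rounding the bit count up for spot limits that are not powers of two is an assumption, and the 32 × 32 paster size is the example size used earlier in the text.

```python
def bits_per_index(max_spots: int) -> int:
    """Bits needed to tell max_spots spot indices apart (log2 rounded up)."""
    return (max_spots - 1).bit_length()


def index_map_bytes(width: int, height: int, max_spots: int) -> int:
    """Bytes needed for an index map holding one spot index per pixel."""
    total_bits = width * height * bits_per_index(max_spots)
    return (total_bits + 7) // 8
```

With at most 16 spots each index takes 4 bits, so a 32 × 32 paster needs 512 bytes, while a bi-level paster needs 1 bit per pixel and a flat paster needs none.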
2.3 Segmentation example
Figure 25(a) shows a simple example that illustrates the advantage of segmentation based on more than two quantization levels. The background 2510 is black, and on it are placed a white triangle 2520 and the gray letters 2530 of the word "text". As shown in Figures 25(b) and (c), a binary segmentation of this image typically causes the text 2530 to be merged with either the background 2510 or the triangle 2520. Neither of these segmentations can be used for document layout analysis to select the text region, because the text features have been lost.
At the same time, a binary segmentation has certain properties that simplify the layout analysis of connected components. Consider the case of a page whose CCs have connected outer boundaries and are formed using 4-way connectivity. In this case, except at the edge of the page, each CC that touches another CC at its boundary either contains that CC or is contained by it, and a CC can be contained by only one other CC. An unambiguous containment hierarchy can be produced and represented as a tree structure. Each successive layer of the tree contains CCs of the opposite polarity to the previous layer, and each branch consists of a set of CCs that share a unique parent. This hierarchy can be used to group CCs, because it can be used to select a subset of CCs (those sharing a unique parent) as candidates to be grouped together. This is useful for processing speed, because only a small number of CCs need to be considered at a time, and it is also useful for accuracy: CCs from different regions of the page may lie in different branches of the tree and are then not grouped together. Furthermore, the processing of CCs can begin at the top of the tree and can stop at branches of the tree below CCs of a certain classification (for example, text), further improving processing time.
If the segmentation is not binary, it is generally not possible to produce an unambiguous hierarchy. Consider the letter "e" in Figure 25(a). The outer boundary of this letter touches both the black background and the white triangle, so either the background or the triangle could be regarded as the parent of this CC. The case of the triangle is even more complicated, because its outer boundary touches the background and all of the letters. Despite this ambiguity, the benefits of using a hierarchy when grouping objects can still be realized. This is done by defining a parent-child relation between two CCs that touch at a boundary, without requiring each child to have a unique parent. For example, a parent-child relation can be defined between two CCs if they touch at a boundary and the bounding box of one CC (the parent) completely encloses the bounding box of the second (the child). Applying this definition to the example shown in Figure 25(a), the triangle and all the letters of the word "text" are defined as children of the background CC.
3 Topological analysis
Topological (layout) analysis is the part of the system in which the foreground content of the page is identified. The intermediate (topological analysis) module takes as input the list of connected components and the "touching lists" from the front-end module. The output of topological analysis is essentially a decision as to which connected components represent foreground content (that is, text, tables, bullet points) in the scanned image. Topological analysis is based on the color segmentation rather than on a binary image. This has benefits in terms of the classes of foreground object that can be detected, but there is no clearly defined containment hierarchy like the one that exists for binary images. For efficiency, topological analysis uses only the bounding boxes and a few other general connected component statistics as the basis for its grouping. Topological analysis does not access the raw pixel data, or even the bit-level segmentation.
The main steps of topological analysis are: forming a containment hierarchy based on the touching lists; grouping CCs based on their bounding boxes and colors; and testing these groups to determine whether their CCs are well aligned, as rows of text would be. The touching lists are used to provide a hierarchy for the CCs that is a multi-color equivalent of the bi-level containment hierarchy. A given CC can be regarded as the parent of a subset of the elements of its touching list. Specifically, a given CC can be the parent of those CCs that it touches and whose bounding boxes are completely contained within its own bounding box.
Fig. 4 shows step 120 of Fig. 1 in detail. It takes as input the compact CC information, statistics and "touching list" information produced by step 110. The touching lists describe which CCs adjoin one another, that is, which CCs share a boundary. The processing of Fig. 4 does not access the input image. Part of this information is a list of all the CCs in the input image.
In step 410, the CCs are classified based on their statistics, and a color containment hierarchy is formed from the CC list. The color containment hierarchy is a structure in which each node is a CC. A parent node has as its children the CCs that it touches and whose bounding boxes are completely contained within the bounding box of the parent CC. A child node may have more than one parent node. The classification may be based solely on bounding box size and shape. CCs whose width and height are both less than 1/100 inch (for example, 3 pixels at a resolution of 300 dpi) are considered to be noise and are removed. Connected components whose width or height is more than 1 inch, or whose width and height are both more than 8/15 inch, are classified as images. Everything else is classified as potential text. Alternative embodiments may include classifications relating to other document layout features (such as tables, or the number of pixels in the connected component), and may use other values.
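The size-based classification of step 410 can be sketched as follows, using the thresholds stated above. The function name, the class strings and the default resolution of 300 dpi are illustrative assumptions.

```python
def classify_cc(width_px: int, height_px: int, dpi: int = 300) -> str:
    """Classify a CC from its bounding box size alone."""
    width_in = width_px / dpi
    height_in = height_px / dpi
    if width_in < 1 / 100 and height_in < 1 / 100:
        return "noise"                                # both sides under 1/100 inch
    if width_in > 1 or height_in > 1:
        return "image"                                # either side over 1 inch
    if width_in > 8 / 15 and height_in > 8 / 15:
        return "image"                                # both sides over 8/15 inch
    return "potential_text"
```

A long thin rule is classified as an image because one side exceeds 1 inch, while a moderately wide but short CC remains potential text.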
In step 420, potential text CCs are grouped together to represent text regions. CCs are generally grouped with nearby CCs, and an efficient grouping algorithm exploits this fact by finding neighboring CCs before determining the groupings. The high-resolution color segmentation method used in the front end can find thousands of siblings to be considered for grouping on a typical scanned document. In such cases, finding neighboring CCs by simple pairwise comparison, an O(N^2) method, may be slow, and a more sophisticated method of determining neighbors must be used. A triangulation may be performed over the nodes of the color containment hierarchy. If the centers of the bounding boxes of the CCs on the page define nodes in the plane, an efficient triangulation method, such as Delaunay triangulation, can be used for this purpose. These methods are typically O(N log N) processes.
Figure 53 shows an example 5300 of the Delaunay triangulation (indicated by dashed lines) and the Voronoi diagram (indicated by solid lines) of a set of nodes in the plane. The Voronoi diagram is a partition of the page into regions, each consisting of the locations that are closer to a given point than to any other point. The Delaunay triangulation is the dual of the Voronoi diagram, and can be generated by joining together the points that share a boundary in the Voronoi diagram. In this triangulation, a typical point among points placed randomly in the plane has about 5 points connected to it. These points can be regarded as good neighbor candidates for the grouping stage.
The output of the triangulation makes it suitable as a basis for forming CC groupings. CCs that are adjacent in the Delaunay triangulation are grouped together based on pairwise comparisons of their bounding boxes. After this initial grouping, subsequent passes are made over the adjacent CC pairs in order to join groups together or to place ungrouped CCs into existing groups. The processing can also look for different types of grouping (text, tables and so on) in a single pass over the data. Groups of text CCs generally have the following features: similar color; similar bounding box size; rough alignment along a horizontal or vertical axis (depending on the text alignment); and close spacing along the alignment axis relative to the size of the CCs.
In step 430, the CC groups are checked, or verified, to determine which CC groups are text characters. Information about the contents of the groups, and about the merges produced during the grouping stage, is stored by the processor. The information about each group may be stored in its own data structure, which contains the color, the bounding box and the contents of the group. These structures are updated whenever the contents of a group change during the grouping stage. In an alternative embodiment, a group label is included in the CC data structure, and data such as the group color and bounding box can be reconstructed from the CC data. In step 430, the text character CCs are subjected to an alignment test as an additional check, to ensure that the CCs are text.
The groups formed generally contain all of the text, but may also contain parts of images that should not be classified as text. To alleviate this problem, the groups are checked to see whether the connected components in a group are arranged in neat rows (or columns), as text is, or are arranged randomly, as similarly colored regions of noise or of an image usually are.
This can be done mainly by forming four histograms of the bounding box edges, one for each side (that is, left, top, right and bottom). One of them should have well-filled bins where the baseline of the text lies and be empty elsewhere. To check for this, the sum of the squares of the histogram bins can be found and compared with an expected value. If any of the four histograms is found to score much higher than would be expected for randomly arranged bounding boxes, the group can be considered to be text. All four bounding box sides are used so as to allow for pages scanned sideways or upside down, and for text arranged in columns rather than rows.
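The histogram score can be sketched as below. The function name, the bounding box layout (left, top, right, bottom) and the `bin_size` parameter are assumptions, and the comparison with an expected value is left out because the text does not specify how that value is obtained.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple

Box = Tuple[int, int, int, int]   # assumed layout: (left, top, right, bottom)


def edge_histogram_scores(boxes: Sequence[Box], bin_size: int = 1) -> Dict[str, int]:
    """Sum of squared histogram bin counts for each bounding box side."""
    scores = {}
    for index, side in enumerate(("left", "top", "right", "bottom")):
        histogram = Counter(box[index] // bin_size for box in boxes)
        scores[side] = sum(count * count for count in histogram.values())
    return scores
```

For three characters sharing a baseline, the bottom histogram has a single bin of 3 and scores 9, whereas sides with scattered coordinates score only 3, so the aligned side stands out.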
3.1 Grouping CCs
Figure 20 illustrates in detail step 420 of Fig. 4, which groups the set of CCs segmented by step 110. Processing begins at step 2010 by obtaining a root CC. A root CC is one that was not merged into another CC during the color segmentation stage. In step 2020, the child CCs of the root CC are found, and a list of the child CCs of this root CC is formed. A child can be defined as a CC whose bounding box is completely contained within the bounding box of the current root CC and which touches that root CC at a boundary. Step 2020 is described in more detail with reference to Figure 22.
From step 2020, processing moves to step 2030. In step 2030, a neighbor analysis is performed on the children of the current CC. For each child, a set of neighboring CCs that are close according to some defined measure is found. This may be achieved, for example, by finding the Delaunay triangulation of the centers of the bounding boxes of the child CCs. The edges of the triangulation represent the connections between neighboring CCs. Alternative methods may use the bounding box data and the color information of the CC list to define closeness between elements differently. In step 2040, the neighbor data is used to perform an initial grouping. This processing step 2040 forms groups of objects in the same child list that have similar attributes (for example, geometry and color), in order to determine features of the document layout. Step 2040 is described in more detail with reference to Figure 23.
In decision step 2050, a check is made to determine whether any more root CCs remain to be processed. If there are more root CCs, processing returns to step 2010, where the next root CC is obtained and subsequently processed. Otherwise, the grouping stage (420) terminates.
3.1.1 Finding the children of a parent CC
Figure 22 shows step 2020, which finds the child CCs and forms the child list of a given parent CC. In step 2210, a touching CC is obtained from the touching CC list of the parent CC. In step 2220, its root CC is found. This may be done using any merging information from the color segmentation stage that is associated with the touching CC. In decision step 2230, the classification of the CC from step 410 of Fig. 4 is checked to determine whether the CC satisfies the class test (that is, whether the CC is suitable for storing in the child list). All CCs other than noise-sized CCs may be stored, but alternative embodiments may store other combinations of classes, for example only potential text. If the class of the touching CC is suitable, processing continues at step 2240. Otherwise, processing continues at step 2260. In decision step 2240, the containment of the CC by the parent CC is tested. The containment test involves checking whether the bounding box of the CC is completely covered by the bounding box of the parent CC, although alternative methods are also possible. If the containment test is satisfied, processing continues at step 2250. Otherwise, processing continues at step 2260. In step 2250, the CC is included in the child list, and processing then proceeds to step 2260. Any CC should appear only once in the child list, so step 2250 includes a check that the CC is not already in the list. A hash table may be used for these checks. Processing then continues at decision step 2260. Step 2260 tests whether there are any more elements in the parent's touching list. If so, processing continues at step 2210. Otherwise, the child list of the current CC is complete, and processing terminates.
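The loop of Figure 22 can be sketched as follows. All names are hypothetical: CCs are identified by string keys, `touching` maps a parent to its touching list, `root_of` resolves a CC to its root, and bounding boxes are assumed to be (left, top, right, bottom) tuples. A Python set stands in for the hash table, and skipping a root that equals the parent itself is an added assumption.

```python
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]   # assumed layout: (left, top, right, bottom)


def contains(outer: Box, inner: Box) -> bool:
    """True if `inner` lies completely within `outer`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def find_children(parent: str, touching: Dict[str, List[str]],
                  root_of: Callable[[str], str], bbox: Dict[str, Box],
                  cc_class: Dict[str, str]) -> List[str]:
    """Build the child list of `parent`, following steps 2210 to 2260."""
    children: List[str] = []
    seen = set()                                       # hash-based duplicate check
    for cc in touching[parent]:                        # step 2210
        root = root_of(cc)                             # step 2220
        if cc_class[root] == "noise":                  # step 2230: class test
            continue
        if not contains(bbox[parent], bbox[root]):     # step 2240: containment
            continue
        if root != parent and root not in seen:        # step 2250: add once
            seen.add(root)
            children.append(root)
    return children
```

Resolving each touching CC to its root before testing it means that fragments merged during color segmentation contribute a single child entry.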
3.1.2 Initial grouping
Figure 23 shows step 2040 of Figure 20, which performs the initial grouping using the neighbor data from step 2030 of Figure 20. This method may be designed to group text only, but in alternative embodiments it may also classify or group other document objects (such as tables). The initial grouping may be a two-pass process, in which the first pass joins CCs into groups and the second pass joins groups together into larger groups. In step 2305, a counter PASS is set to 1. In step 2310, a child is obtained. In step 2320, a neighbor of the child is obtained. In step 2330, a check is made to determine whether the child and the neighbor satisfy the grouping test. A series of tests is used to decide whether these objects should be grouped. The tests may be based on the initial classification, geometry and color, and may differ between the first and second passes. Figure 26, described below, provides an extract of a region of a document that is used to explain the grouping tests of step 2320.
Referring to Figure 23, if the grouping test is satisfied, the child and the neighbour CC are grouped together in step 2340, and processing then continues at step 2350. Otherwise, if step 2330 returns false, processing continues directly at step 2350. When two CCs are grouped together, each is marked as belonging to the same group. The group with which the CCs are marked depends on any earlier grouping of the two CCs. If neither has yet been grouped, a new group containing both CCs is formed. If only one of the two has been grouped, the other CC is included in that group. If both have been grouped and the groups are the same, no action is taken. Finally, if both have been grouped and the groups are different, the two groups are merged into a single group and the other group is marked as empty.
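The four grouping cases of step 2340 can be sketched as below. The two dictionaries (`group_of`, mapping a CC to its group id, and `members`, mapping a group id to its CC list) are an assumed bookkeeping scheme chosen for illustration, not a structure taken from the description.

```python
def join(cc_a, cc_b, group_of, members):
    """Put cc_a and cc_b in the same group, following the four cases."""
    ga, gb = group_of.get(cc_a), group_of.get(cc_b)
    if ga is None and gb is None:
        # Neither CC is grouped yet: form a new group holding both.
        gid = len(members)          # ids are never deleted, so this is unique
        members[gid] = [cc_a, cc_b]
        group_of[cc_a] = group_of[cc_b] = gid
    elif ga is None:
        # Only cc_b is grouped: add cc_a to that group.
        members[gb].append(cc_a)
        group_of[cc_a] = gb
    elif gb is None:
        # Only cc_a is grouped: add cc_b to that group.
        members[ga].append(cc_b)
        group_of[cc_b] = ga
    elif ga != gb:
        # Both grouped differently: merge gb into ga, mark gb as empty.
        for cc in members[gb]:
            group_of[cc] = ga
        members[ga].extend(members[gb])
        members[gb] = []
    # Both already in the same group: nothing to do.
```

Keeping the emptied group in `members` mirrors the text, which marks the other group as empty rather than deleting it.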
In step 2350, a check determines whether the current CC has more neighbours. If there are more neighbours, processing continues at step 2320. Otherwise, processing continues at step 2360. In step 2360, the process checks for more children of the parent CC. If there are more children, processing continues at step 2310. Otherwise, processing continues at step 2370. In step 2370, a test determines whether both passes have been completed (PASS > 1?). If so, processing terminates. If only the first pass has been completed, processing continues at step 2380, where the counter PASS is incremented. Processing continues at step 2390, where the process returns to the start of the child list. Processing then returns to step 2310 and the second pass begins.
In an alternative embodiment, process 2040 loops over the edges of the triangulation data rather than over child CCs and their neighbours. This is slightly more efficient because each neighbour pair is considered only once.
3.1.2.1 Grouping test for two CCs
To illustrate the preferred grouping test for two neighbouring CCs, Figure 26 shows a simple extract from a document region containing just two objects, the letters "g" and "h". The dashed lines indicate the coordinates of the bounding box of each letter. The left and right x coordinates and the top and bottom y coordinates of the i-th CC are located at $x_i^l$, $x_i^r$, $y_i^t$ and $y_i^b$ respectively. The subscripts 1 and 2 indicate the first and second CC (that is, the letters "g" and "h") respectively. The process of this embodiment also uses the color $[y_i, u_i, v_i]$ of each CC in YUV space, and the width $w_i$ and height $h_i$ of its bounding box.
The horizontal overlap distance $dx_{ov}$ of two CCs is defined as the length of the horizontal extent covered by both CCs, or 0 if the CCs do not overlap. The vertical overlap distance $dy_{ov}$ is defined similarly and is shown in Figure 26. The horizontal overlap in Figure 26 is 0 and is therefore not marked. The overlap distances can be expressed as follows:

$$dx_{ov} = \max\bigl(0,\ \min(x_1^r, x_2^r) - \max(x_1^l, x_2^l)\bigr),$$
$$dy_{ov} = \max\bigl(0,\ \min(y_1^b, y_2^b) - \max(y_1^t, y_2^t)\bigr). \tag{1}$$

The horizontal inner distance of two CCs is defined as the shortest distance between the left edge of one CC and the right edge of the other, or 0 if the horizontal overlap distance is not 0. The vertical inner distance is defined in the same way using the top and bottom edges of the CCs. The horizontal inner distance $dx_{in}$ is shown in Figure 26; for this example the vertical inner distance is 0 and is not marked. The inner distances can be expressed as follows:

$$dx_{in} = \max\bigl(x_2^l - x_1^r,\ x_1^l - x_2^r,\ 0\bigr),$$
$$dy_{in} = \max\bigl(y_2^t - y_1^b,\ y_1^t - y_2^b,\ 0\bigr). \tag{2}$$
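As a worked illustration of equations (1) and (2), the sketch below computes the four distances for two bounding boxes laid out like the "g" and "h" of Figure 26. The `Box` type and the coordinate values are made up for the example; y is assumed to increase downwards, so that top is less than bottom.

```python
from typing import NamedTuple

class Box(NamedTuple):
    l: float   # left x
    t: float   # top y
    r: float   # right x
    b: float   # bottom y

def overlap_distances(a, b):
    """Equation (1): horizontal and vertical overlap distances."""
    dx_ov = max(0, min(a.r, b.r) - max(a.l, b.l))
    dy_ov = max(0, min(a.b, b.b) - max(a.t, b.t))
    return dx_ov, dy_ov

def inner_distances(a, b):
    """Equation (2): horizontal and vertical inner distances."""
    dx_in = max(b.l - a.r, a.l - b.r, 0)
    dy_in = max(b.t - a.b, a.t - b.b, 0)
    return dx_in, dy_in

g = Box(l=0, t=10, r=10, b=30)    # descender letter, sits lower
h = Box(l=14, t=0, r=26, b=20)    # ascender letter, to the right
```

For these boxes the horizontal overlap is 0 and the vertical inner distance is 0, matching the situation described for Figure 26, while the vertical overlap is 10 and the horizontal inner distance is 4.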
In the first pass, two neighbouring CCs are grouped together as text if they satisfy three conditions based on color, size and alignment. The color condition is satisfied if:

$$(y_i - y_j)^2 + (u_i - u_j)^2 + (v_i - v_j)^2 < T_C, \tag{3}$$

where the threshold parameter may be $T_C = 500$.
The size test is satisfied if:

$$\max\!\left(\frac{w_{min}}{w_{max}},\ \frac{h_{min}}{h_{max}}\right) > T_R, \tag{4}$$

where $w_{min}$ is the minimum width of the two CCs, $w_{max}$ is the maximum width, $h_{min}$ is the minimum height and $h_{max}$ is the maximum height. The threshold parameter may be $T_R = 0.55$.
The alignment condition is satisfied if either of the following conditions is satisfied:

$$\bigl[(dx_{ov} > 0)\ \text{and}\ (dy_{in} / \max(w_{min}, h_{min}) < T_S)\bigr]$$
or
$$\bigl[(dy_{ov} > 0)\ \text{and}\ (dx_{in} / \max(w_{min}, h_{min}) < T_S)\bigr]. \tag{5}$$

The threshold parameter may be $T_S = 0.65$.
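The three first-pass conditions can be combined as in the following sketch, which uses the example threshold values quoted above. The `CC` record is hypothetical, widths and heights are assumed to be simple coordinate differences, and both are assumed to be positive.

```python
from dataclasses import dataclass

T_C, T_R, T_S = 500, 0.55, 0.65   # example thresholds from the text

@dataclass
class CC:
    l: float
    t: float
    r: float
    b: float
    yuv: tuple          # (y, u, v) color of the CC

    @property
    def w(self):
        return self.r - self.l   # assumed width definition

    @property
    def h(self):
        return self.b - self.t   # assumed height definition

def first_pass_test(a, b):
    """True if the color (3), size (4) and alignment (5) tests all pass."""
    color_ok = sum((p - q) ** 2 for p, q in zip(a.yuv, b.yuv)) < T_C
    w_min, w_max = sorted((a.w, b.w))
    h_min, h_max = sorted((a.h, b.h))
    size_ok = max(w_min / w_max, h_min / h_max) > T_R
    dx_ov = max(0, min(a.r, b.r) - max(a.l, b.l))
    dy_ov = max(0, min(a.b, b.b) - max(a.t, b.t))
    dx_in = max(b.l - a.r, a.l - b.r, 0)
    dy_in = max(b.t - a.b, a.t - b.b, 0)
    scale = max(w_min, h_min)
    align_ok = ((dx_ov > 0 and dy_in / scale < T_S)
                or (dy_ov > 0 and dx_in / scale < T_S))
    return color_ok and size_ok and align_ok
```

Two similarly colored letters side by side on a line pass through the second branch of condition (5): they overlap vertically and are separated horizontally by a small fraction of their size.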
The second pass uses parameters based on groups rather than on single CCs. The average color $[Y_i, U_i, V_i]$, width $W_i$ and height $H_i$ of the elements of each group may be used. For an ungrouped CC, these values are set to the color, width and height parameters of the individual CC. The test also uses the distance $D$ between the centers of the CCs under consideration, which is defined as follows:

$$D = \sqrt{\left(\frac{x_1^l + x_1^r}{2} - \frac{x_2^l + x_2^r}{2}\right)^2 + \left(\frac{y_1^t + y_1^b}{2} - \frac{y_2^t + y_2^b}{2}\right)^2}. \tag{6}$$

As in the first pass, the groups are joined if a series of conditions is satisfied. These conditions relate to color similarity $T_{Cg}$, size $T_R$ and $T_{R2}$, and separation $T_D$, and are described by the following:

$$(Y_1 - Y_2)^2 + (U_1 - U_2)^2 + (V_1 - V_2)^2 < T_{Cg},$$
$$\max(W_{min} / W_{max},\ H_{min} / H_{max}) > T_R,$$
$$\min(W_{min} / W_{max},\ H_{min} / H_{max}) > T_{R2}, \tag{7}$$
$$D / \max(W_{min}, H_{min}) < T_D,$$

where the parameter values may be $T_{Cg} = 500$, $T_R = 0.55$, $T_{R2} = 0.3$ and $T_D = 1$ if either group contains three or fewer elements, and otherwise $T_{Cg} = 100$, $T_R = 0.55$, $T_{R2} = 0.3$ and $T_D = 2$. The alignment test is not used in the second grouping stage.
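A sketch of the second-pass test follows, using the example parameter values above. The `Group` record is hypothetical, the center distance `D` is assumed to be computed by the caller, and "three or fewer elements" is read here as applying when either of the two groups is that small.

```python
from typing import NamedTuple

class Group(NamedTuple):
    yuv: tuple   # average (Y, U, V) of the group elements
    W: float     # average element width
    H: float     # average element height
    n: int       # number of elements (1 for an ungrouped CC)

def thresholds(n1, n2):
    """Example parameter sets; the small-group set applies if either is small."""
    if min(n1, n2) <= 3:
        return {"T_Cg": 500, "T_R": 0.55, "T_R2": 0.3, "T_D": 1}
    return {"T_Cg": 100, "T_R": 0.55, "T_R2": 0.3, "T_D": 2}

def second_pass_test(g1, g2, D):
    """True if the two groups satisfy all four conditions of (7)."""
    p = thresholds(g1.n, g2.n)
    color = sum((a - b) ** 2 for a, b in zip(g1.yuv, g2.yuv))
    w_min, w_max = sorted((g1.W, g2.W))
    h_min, h_max = sorted((g1.H, g2.H))
    ratios = (w_min / w_max, h_min / h_max)
    return (color < p["T_Cg"]
            and max(ratios) > p["T_R"]
            and min(ratios) > p["T_R2"]
            and D / max(w_min, h_min) < p["T_D"])
```

With these values, larger groups must match more closely in color (100 rather than 500) but may be joined across a larger relative separation (2 rather than 1).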
The thresholds may depend on features of the CCs being tested for grouping, for example the pixel count of each CC.
3.2 Checking groups
Figure 21 shows in detail step 430 of Fig. 4 for checking groups. This process determines whether each group consists of text. The decision is based mainly on whether the objects in the group are found to be aligned in rows or columns. Each group is assumed to be text and is tested for text-like properties; if the group fails these tests, it is rejected.
In step 2110, the next group formed in step 420 is obtained. In step 2120, the size of the text characters in the group is estimated. The estimated size is based on statistics of the lengths of the individual characters. These lengths may be defined as the maximum of the width and the height of the bounding box of the object. This measure is fairly insensitive to skew and to the alignment of text on the page, and is sufficiently uniform over the character set of a typical font of a given size. In alternative embodiments, the bounding box area, the pixel count and/or the stroke width may be used as the measure of length. A histogram of the character lengths may be formed, and the estimated size may be based on the largest length associated with a histogram bin that holds more than a threshold number of elements. The threshold used is a minimum of 3 objects and at least 15% of the number of objects in the group. If no such bin exists, no estimate is returned.
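One possible reading of the size estimate of step 2120 is sketched below. The bin width, the use of "at least" for the count threshold, and the choice to return the largest observed length in the highest qualifying bin are all assumptions made for illustration; the description does not fix these details.

```python
import math
from collections import defaultdict

def estimate_char_size(boxes, bin_width=4):
    """Estimate character size from (width, height) pairs, or return None.

    bin_width is a hypothetical parameter, not a value from the text.
    """
    lengths = [max(w, h) for w, h in boxes]
    # A bin qualifies with at least 3 objects and at least 15% of the group.
    need = max(3, math.ceil(0.15 * len(lengths)))
    bins = defaultdict(list)
    for length in lengths:
        bins[length // bin_width].append(length)
    qualifying = [k for k, v in bins.items() if len(v) >= need]
    if not qualifying:
        return None              # step 2125 then rejects the group
    return max(bins[max(qualifying)])
```

A lone large object, such as a ruled line swept into the group, falls in a sparsely populated bin and so does not inflate the estimate.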
In decision step 2125, a check determines whether a character size was found. If no suitable character size could be found, the group is rejected and processing continues at step 2160. Otherwise, processing continues at step 2130.
Step 2130 processes the CCs in the group together with other suitable CCs that are contained within the bounding box of the group but have not yet been assigned to any group. This process 2130 is useful for adding text that may have been missed by the initial grouping, and small objects, such as punctuation marks, that may have been left out of the initial grouping on the basis of their classification. Only objects that share a parent with the objects in the group and that are sufficiently similar in color may be added to the group. The color similarity condition for a group and a contained CC is satisfied if:

$$(Y - y)^2 + (U - u)^2 + (V - v)^2 < T_{Cg2}, \tag{8}$$

where $[Y, U, V]$ is the color of the group and $[y, u, v]$ is the color of the CC. The parameter value may be $T_{Cg2} = 500$.
Alternatively, geometric tests may be applied in step 2130, and the requirement that the bounding box of the CC be completely contained within the bounding box of the group may be relaxed, so that objects near the group are joined to it. Other alternatives for step 2130 may merge some objects to form characters. This is intended for scripts with complex characters, such as Chinese, where a character may be segmented into more than one separate object, and it helps to improve the accuracy of the alignment tests later in the process. Two objects may be merged only if their bounding boxes overlap. The merge is not performed if it would create an object with an aspect ratio greater than 1.6, or a merged object larger than the character size estimated in step 2120.
In step 2150, the alignment of the objects in the group is checked. This test distinguishes text groups from other groups and is described in more detail below. After this step, a test in step 2160 determines whether there are more groups to process. If there are more groups, processing returns to step 2110. Otherwise, process 430 ends.
3.2.1 Checking alignment
Figure 24 shows step 2150 of Figure 21 in more detail. During this step, a subset of the CCs in the group is defined as characters. Characters may be those objects whose size is greater than half the character size estimated in step 2120 and less than twice that size.
Steps 2430 to 2450 perform acceptance tests on the group based on histogram analysis of a series of parameters of the group elements. These parameters are the left, top, bottom and right edges of the bounding box of each character. Using several parameters allows differently aligned text to be identified, because the alignment of text on a page depends on many factors, such as the language and the skew of the text on the page. Alternatively, various combinations of horizontal and vertical bounding box parameters may be used to identify a wider range of text alignments.
In step 2430, a histogram of the group element values is formed for the next parameter. The size of the bins in the histogram may be scaled according to the size of the characters in the group. A value of 1/5 of the average height of the characters in the group (rounded) may be used for the top and bottom bounding box edges, and 1/5 of the average width of the characters in the group (also rounded) may be used for the left and right bounding box edges. The range of bins in the histogram is set so that all the data is included, with non-empty bins at each end of the range. The lowest value covered by the histogram may be set to the lowest value of the parameter in the group.
Decision step 2440 tests whether the values in the histogram are well aligned, forming discrete clusters (ideally representing the baselines of different rows of text) rather than being randomly scattered. Step 2440 first tests whether the number of characters N in the group is greater than a threshold T, with a preferred value of T = 7.
For small groups (N < T), step 2440 checks three parameters AL1, AL2 and OV. AL1 is the count in the largest bin of the histogram. AL2 is the count in the second largest bin of the histogram. OV is the size of the largest subset of overlapping characters in the group. The pseudocode in Table 2 describes the test used for such a group. If the pseudocode returns Y, the group passes the alignment test, and if the pseudocode returns N, the test fails.
Table 2
[Pseudocode for the small-group alignment test; supplied as images C200580043979D00621 and C200580043979D00631 in the original publication.]
For large groups, a test compares the sum of the squares of the histogram bin counts with the value expected for randomly arranged CCs in the group. The equation for this test is given below:

$$\sum_{i=1}^{m} h_i^2 \ \ge\ 2 \times \left(n + \frac{n(n-1)}{m}\right),$$

where $m$ is the total number of histogram bins, $n$ is the total number of characters, and $h_i$ is the number of characters in the i-th histogram bin. The term on the right-hand side of this equation is twice the expected value (mean) and, for sufficiently large $m$ and $n$, is approximately the value for which randomly arranged characters have a 0.1% probability of being accepted. An example of this process is shown in Figure 27, described below.
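The large-group test can be sketched as below. The histogram is built so that the lowest value falls in the first bin and the highest value in the last, as described for step 2430; integer edge coordinates and the bin size of 4 in the examples are assumptions, as are the sample edge values.

```python
def histogram(values, bin_size):
    """Bin integer edge coordinates so both end bins are non-empty."""
    lo = min(values)
    counts = [0] * ((max(values) - lo) // bin_size + 1)
    for v in values:
        counts[(v - lo) // bin_size] += 1
    return counts

def aligned(counts):
    """Accept if the sum of squared counts is at least twice its random mean."""
    m, n = len(counts), sum(counts)
    return sum(h * h for h in counts) >= 2 * (n + n * (n - 1) / m)

# Bottom edges of 14 characters lying on three text baselines (made-up data).
text_bottoms = [100, 101, 100, 102, 101,
                128, 129, 130, 128, 129,
                160, 161, 160, 162]
# Bottom edges of 14 objects spread evenly down the page (made-up data).
scattered_bottoms = list(range(100, 170, 5))
```

For the text-like data the histogram has three heavily filled bins out of 16, so the sum of squares is 66 against a right-hand side of 50.75 and the group is accepted; the evenly spread data puts one object in each occupied bin and is rejected.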
Referring to Figure 24, if the group is accepted according to this test, processing continues at step 2470. In step 2470, the group is kept and the alignment check is complete. Processing then terminates. Otherwise, if step 2440 returns false (No), processing continues at step 2450. Decision step 2450 checks whether there are more parameters to test. If so, processing continues at step 2430 for the next parameter. Otherwise, if all parameters have been tested, processing continues at step 2480, where the group is rejected. Processing then terminates.
The preceding description discloses testing one parameter at a time and rejecting those groups that fail all of the tests. However, in light of this disclosure, those skilled in the art will appreciate that alternative methods of combining the tests on the different parameters may be practised without departing from the scope and spirit of the invention, such as accepting groups that are almost, but not quite, well enough aligned in two different parameters (such as the top and bottom of the bounding box), or creating an overall score for the group based on many parameters.
3.2.2 Alignment example
Figure 27(b) shows a selection of irregularly arranged bounding boxes 2710, 2712, 2714, ... that might be segmented from part of an image 2700. Figure 27(b) thus shows randomly arranged connected components. Figure 27(a) shows the histogram 2720 of the values of the bottoms of the bounding boxes in that figure. Figure 27(d) shows the arrangement on the page of the bounding boxes 2740, 2742 of a text group with well-aligned connected components, and Figure 27(c) shows the corresponding histogram 2730. As shown, the histogram 2730 of the text has a few large clusters of values, while the other histogram 2720 has more uniformly scattered values. Using the sum of the squares of the histogram bin values as the measure, Figure 27(a) gives a value of 19 and Figure 27(c) gives a value of 47. The right-hand side of the acceptance test above for m=19 and n=13 is 45. Therefore, according to this test, the image data of Figure 27(b) is rejected and the text data of Figure 27(d) is accepted. Alternatively, the acceptance test may be based on other statistics, such as the sum of the values in bins whose counts are significantly greater than the average count.
4 Generating the compressed output image
The back-end module uses an inpainting process to make the background image more compressible by painting over the foreground areas with an estimated background color. Inpainting is preferably performed on the background image at a lower resolution (for example, 150 dpi). The inpainting algorithm is a single-pass, tile-based algorithm that aims to enhance compressibility. The color of each pixel is chosen by interpolating from the surrounding pixels to the left and right, rather than by using a single average color over a large area.
The algorithm performs the following steps:
1) combine the masks of all foreground components to form one foreground mask for the tile;
2) dilate this mask so that a small additional area around the foreground components is inpainted;
3) for each pixel of the tile in raster order:
a. if the pixel is not masked, update the activity of the tile; and
b. if the pixel is masked, paint the pixel with a color interpolated from the colors of the nearest unmasked pixels to the left and right;
4) if the activity of the unmasked region is below a certain threshold, paint the whole tile with the average color of the unmasked pixels (this gives improved compression when the JPEG image is further compressed with ZLib); and
5) if the whole tile is masked, paint the whole tile with the average color of the previous tile.
Step 2) above removes bleeding effects, improves compression and sharpens the output quality.
Fig. 5 shows step 130 of Fig. 1 in detail. Steps 510 to 540 use a tile-based processing system similar to that described in Figs. 2 and 3. In step 510, the next tile to be processed is obtained. In step 520, inpainting is performed on the current tile to remove any foreground CCs identified in step 120 of Fig. 1, and to flatten any tile that looks visually close to flat. In step 530, the current tile is compressed. This may be done using JPEG in the YCrCb color space with the two chrominance channels sub-sampled 2:1 horizontally and vertically. Each tile of the reduced-resolution background image contains 16 × 16 pixels, which means that these pixels can be encoded directly as four 8 × 8-pixel JPEG blocks for the Y channel and one 8 × 8 JPEG block for each of the Cr and Cb channels, without any buffering being needed between tiles.
In step 540, a check determines whether there are more tiles to process. If there are any more tiles to inpaint and compress, processing returns to step 510. Otherwise, processing continues at step 550. In step 550, the foreground is compressed, which involves compressing the foreground elements identified in step 120. The foreground elements are grouped by color, and one binary image at full input resolution is formed for each set of similarly colored foreground elements. Each image created may then be encoded with CCITT G4 Fax, provided the image is large enough for the encoding to yield a compression advantage in the output document.
In step 560, the output document is generated. The compressed background and the compressed foreground images are stored in a compound compressed format. This format may be, for example, a PDF document. The JPEG-encoded background image may also be compressed further using Flate (Zlib) compression. For JPEG images containing the large number of repeated flat blocks produced by steps 520 and 530, this gives a significant space saving. A composite document may be written that contains the Flate- and JPEG-compressed background image and a page description containing the size, position and order of each image and, in the case of the binary foreground images, the color in which each image is painted on the page.
4.1 Inpainting a tile
Figure 30 shows step 520 of Fig. 5 in detail. This process modifies the down-sampled background image in order to increase compressibility and to enhance the sharpness of the foreground CCs, by removing the foreground CCs from the background and flattening image tiles with low visual activity. A small area around the foreground CCs is also inpainted to remove bleeding effects, which enhances the image and increases compressibility.
Process 520 shown in Figure 30 removes the selected foreground CCs from the low-resolution background by painting over each foreground CC with a color estimated by interpolating between the colors of the pixels to the left and right of the selected foreground CC. This increases compressibility. A small area around the outside of the CC is also inpainted to increase compressibility further and to enhance the appearance of the image. Compressibility is increased further by identifying tiles of the background image that have low visual activity and setting all of their pixels to the same color.
The inputs to the process of Figure 30 are the down-sampled background image created in step 250 of Fig. 2, the CC list from step 240 of Fig. 2, and the foreground or background selection information from step 120 of Fig. 1. In step 3010, the tile is checked to see whether it is marked as "flat". If so, processing continues at tile flattening step 3040. Otherwise, processing continues at step 3020, in which a full-resolution foreground bit mask is formed for the tile. The mask has a bit set at each position corresponding to a foreground CC. In step 3030, the regions of the background image corresponding to foreground CCs, and small areas around them, are removed by inpainting them with colors interpolated from the colors of the pixels to the left and right of the CC. The activity of the uninpainted pixels in the tile is also measured. In step 3040, if the tile activity is found to be low enough for the tile to be visually flat, the whole tile is flattened to a constant color. Processing then terminates.
4.1.1 Forming the tile foreground bit mask
Figure 31 shows in more detail step 3020 of Figure 30 for forming the tile foreground bit mask. In step 3110, an initial bit mask is created for the tile. The mask is the same width as the tile and one row taller than the tile. The first row of the mask is set to be the same as the last row of the mask saved for the tile above, unless the tile is in the first band of the document, in which case the first row of the bit mask is set to empty. This allows the inpainted area to extend below foreground CCs whose bottom edge is aligned with the tile boundary.
In step 3120, the next CC in the tile is obtained. In step 3130, the CC is checked to determine whether it is a foreground CC (as selected in step 120). If the CC is a foreground CC, processing continues at step 3140. Otherwise, processing continues at step 3150. In step 3140, the bit mask corresponding to this CC and the current tile is combined, using a bitwise OR operation, with the bit mask created in step 3110. Processing then continues at step 3150. In step 3150, a check determines whether there are more CCs in the current tile. If so, processing returns to step 3120. Otherwise, if all CCs in the tile have been processed, the result of step 3150 is No and processing continues at step 3160. In step 3160, the last row of the mask that has been formed is saved so that it can be used in step 3110 when the tile directly below this tile on the page is processed. The bit mask created in process 3020 of Figure 31 is at the full resolution of the input image.
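The mask construction of Figure 31 might be sketched as follows, with each mask row held as a Python integer used as a bit vector. The function and argument names are hypothetical, and the per-CC masks are assumed to be already clipped to the tile.

```python
def form_tile_mask(cc_masks, is_foreground, carry_row, height):
    """Build the tile foreground bit mask with one extra leading row.

    cc_masks:      {cc_id: [row_bits, ...]}, one int per tile row
    is_foreground: predicate on cc_id (the selection of step 120)
    carry_row:     last mask row saved for the tile above, or 0 in the
                   first band of the document
    Returns the mask rows and the last row to save for the tile below.
    """
    mask = [carry_row] + [0] * height           # step 3110
    for cc_id, rows in cc_masks.items():        # steps 3120 / 3150
        if is_foreground(cc_id):                # step 3130
            for y, bits in enumerate(rows):
                mask[y + 1] |= bits             # step 3140: bitwise OR
    return mask, mask[-1]                       # step 3160
```

Returning the last row alongside the mask makes the hand-over to the tile below explicit: the caller stores it per tile column and passes it back as `carry_row` on the next band.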
4.1.2 Inpainting pixels and measuring tile activity
Figure 32 shows step 3030 of Figure 30 in more detail. Each pixel is checked in raster order until one that is to be inpainted is found, which marks the start of a run of pixels to be inpainted. Subsequent pixels are checked until the end of the run of pixels to be inpainted is found. This horizontal run of pixels is then painted with colors linearly interpolated between the colors of the pixels to the left and right of the inpainted run.
Referring to Figure 32, step 3210 obtains a row of the tile. In step 3220, the start of the next run of pixels to be inpainted is found while the pixel activity of the row is accumulated. Each pixel in the row is checked in raster order, and the pixel activity in the tile is accumulated, until a pixel that should be inpainted is found. The activity of the uninpainted pixels is recorded by accumulating the pixel values and the squares of the pixel values, and by keeping a count of the number of pixels measured. To determine whether each pixel should be inpainted, the pixel position is compared with the corresponding position in the tile foreground bit mask created in step 3020 of Figure 30. Because the resolution of the tile bit mask is twice that of the background image, four pixels in the mask correspond to the area covered by one pixel in the background image. To improve compression and to remove the effects of bleeding edges from the foreground CCs, a small additional area around the edges of the foreground CCs is inpainted. To do this, eight pixels in the full-resolution foreground mask are checked to decide whether to inpaint the current background pixel. Figure 34 shows an example pixel 3410 in a background image tile 3430, and the corresponding full-resolution mask pixels 3420 that are checked. If any one of the eight pixels indicated by 3420 is set in the mask 3440, that is, corresponds to the position of a foreground CC, pixel 3410 is inpainted. Because the bit mask 3440 is stored as an array of bit vectors, these eight pixels 3420 can be checked quickly using bitwise AND operations. Alternatively, this may be achieved by dilating the pixels set in the bit mask 3420 and then down-sampling. The color of the last uninpainted pixel checked on each row is recorded so that, if the first pixel of the same row in the next tile is to be inpainted, this color can be used as the interpolation value when the next tile to the right is processed.
Referring to Figure 32, in decision step 3230, a check determines whether any pixel to be inpainted was found before the end of the row. If a pixel to be inpainted was found, processing continues at step 3250. Otherwise, processing continues at step 3240. In step 3250, the remaining pixels in the row are checked in raster order to find the end of the run of pixels that should be inpainted in this row. The same test as that used in step 3220 is used here to determine whether a pixel should be inpainted. In step 3260, a check determines whether the end of the run of pixels to be inpainted was found before the end of the row. If the end of the run was found before the end of the row was reached, processing continues at step 3270. In step 3270, each pixel in the run to be inpainted is set to a color linearly interpolated between the colors of the nearest uninpainted pixels to the left and right. If the inpainted run extends to the left-hand edge of the tile, the last value saved for the same row in the previous tile is used as the left interpolation value. Processing then continues at step 3220, and the next run of pixels to be inpainted is searched for. If the end of the row was reached in step 3250 without finding any pixel that should not be inpainted, decision step 3260 directs processing to step 3280. In step 3280, each pixel in the run to be inpainted is set to the value of the nearest uninpainted pixel to the left, which may be the value saved for the same row in the previous tile. After step 3280, processing continues at step 3240.
In step 3240, a check determines whether there are more rows to process in the tile. If there are more rows in the tile, processing continues at step 3210. Otherwise, if there are no more rows, processing ends.
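The row loop of Figure 32 can be sketched for a single color channel as follows. The per-pixel mask is assumed to have been reduced already to one flag per background pixel, and the interpolation weights are an illustrative choice; the handling of a run with no uninpainted pixel on either side is also an assumption.

```python
def inpaint_row(row, mask, left_carry=None):
    """Inpaint masked runs in one tile row and accumulate activity.

    row:        pixel values of one channel
    mask:       truthy where the pixel must be inpainted
    left_carry: last uninpainted value saved for this row in the tile
                to the left, or None if there is none
    Returns (new_row, last_uninpainted_value, (count, sum, sum_of_squares)).
    """
    out = list(row)
    count, total, total_sq = 0, 0.0, 0.0
    last = left_carry
    x, n = 0, len(row)
    while x < n:
        if not mask[x]:                       # step 3220: measure activity
            v = row[x]
            count += 1
            total += v
            total_sq += v * v
            last = v
            x += 1
            continue
        start = x
        while x < n and mask[x]:              # step 3250: find end of run
            x += 1
        run = x - start
        if x < n:                             # step 3270: interpolate
            right = row[x]
            left = right if last is None else last
            for i in range(run):
                out[start + i] = left + (right - left) * (i + 1) / (run + 1)
        elif last is not None:                # step 3280: copy left value
            for i in range(run):
                out[start + i] = last
    return out, last, (count, total, total_sq)
```

Passing the returned last value as `left_carry` for the same row of the next tile reproduces the cross-tile behaviour described above, where a run starting at the left edge of a tile interpolates from the value saved in the previous tile.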
4.1.3 Inpainting example
Figure 35 shows some examples of one-dimensional inpainting. Before and after graphs are given for two examples 3510 and 3520, in which pixel intensity is plotted as a function of position X. In the first example 3510, the run of pixels to be inpainted lies entirely within a tile, and the values of the inpainted pixels (indicated with diagonal hatching) are replaced with values linearly interpolated between the values of the uninpainted pixels immediately to the left and right. The inpainted area is slightly larger than the foreground component in order to remove any bleeding effects; this is shown by the dilated mask area. The color of each inpainted pixel is found by interpolating between the colors of the unmasked pixels to the left and right. In the second example 3520 of Figure 35, the run of pixels to be inpainted crosses a tile edge 3530. When the left-hand tile is processed, the values of the pixels to be inpainted are replaced with the value of the uninpainted pixel immediately to the left. When the right-hand tile of example 3520 is processed, the values of the inpainted pixels are set by linear interpolation between the last recorded uninpainted pixel value for this row and the value of the first uninpainted pixel to the right.
4.1.4 Tile flattening
Figure 33 shows step 3040 of Figure 30 in more detail. In decision step 3310, a check determines whether all pixels have been inpainted. In particular, the number of inpainted pixels recorded in step 3220 is checked. If all pixels in the tile have been inpainted, the visible area of the tile is considered negligible, and processing continues at step 3320. In step 3320, the current tile is painted with the average color of all the pixels in the previous tile in raster order. When a block-based compression technique such as JPEG is used, this increases the compressibility of the background image significantly. Processing then terminates. If step 3310 determines that not all of the pixels in the tile have been inpainted, processing continues at decision step 3330. In step 3330, the activity of the uninpainted pixels measured in step 3220 is checked against a predetermined threshold. If the activity is found to be less than this threshold, the visible area of the tile in the reconstructed image is considered to be close to flat. In this case, processing continues at step 3340, and the tile is made completely flat by painting all the pixels in the tile with the average color of the visible, uninpainted pixels in the tile. This significantly improves the compressibility of the background image. After step 3340, or if the activity in the tile is found in step 3330 to be above the threshold, the process of Figure 33 ends.
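The decisions of Figure 33 can be sketched for one channel as below. Using the variance of the uninpainted pixels as the activity measure is an assumption suggested by the accumulated sums and sums of squares; the threshold is a free parameter here because no value is given in the text.

```python
def flatten_tile(tile, count, total, total_sq, prev_tile_mean, threshold):
    """Return the tile after the flattening decisions of steps 3310 to 3340.

    tile:                  pixel values after inpainting (flat list)
    count, total, total_sq: statistics of the uninpainted pixels
    prev_tile_mean:        average color of the previous tile
    threshold:             hypothetical activity threshold
    """
    if count == 0:
        # Step 3320: every pixel was inpainted, reuse the previous tile's mean.
        return [prev_tile_mean] * len(tile)
    mean = total / count
    variance = total_sq / count - mean * mean   # assumed activity measure
    if variance < threshold:
        # Step 3340: visually flat, paint the whole tile with the mean.
        return [mean] * len(tile)
    return tile                                  # active tile: leave as is
```

A tile flattened in either branch compresses to identical JPEG blocks, which is what makes the later Flate pass over the JPEG data effective.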
5 Hardware embodiment
Figure 51 shows a system 5100 according to a second embodiment of the invention. System 5100 features efficient data flow between the different stages of the processing pipeline. This design greatly reduces memory bandwidth and enables system 5100 to run fast.
Scanners typically obtain scan data in pixel raster order. The pixel data is then stored, and usually compressed, for further image processing. In conventional scan-to-document applications, the scan data usually needs to be retrieved from storage, decompressed, and then held in memory for segmentation and layout analysis in order to process the image data. This is usually the case for high-speed scanners, simply because the segmentation process cannot keep up with the rate at which the scanner streams raster data page after page.
This not only requires large memory buffers but also requires high memory bandwidth, because each image pixel must be written and read at least twice. The pixels must first be written to memory by the scanner, and a compressor then reads the data from memory and compresses it. Later, a decompressor must read the compressed data and decompress it into memory. Finally, an image processor can segment the decompressed data. For each original image pixel there is at least one redundant memory read and write, not counting the compressed data. For a high-resolution scanner (for example, 600 dpi), this means more than 200 MB of extra data.
This embodiment of the invention uses a high-speed, automatic segmentation process that operates online on the raster data streamed page after page in real time from scanner 5105. As a result, the redundant memory reads and writes are eliminated completely and the size of the memory buffers is greatly reduced.
Bus 5110 is transported the raster data of scanning from scanner 5105, and it is written in the module 5115 (line buffer).In this example, use 64 line buffers, but can use other size according to the height of the band of explained later.The data tape that line buffer 5115 storages are handled by code segmentation module 5125 is collected the new scan-data band that enters simultaneously.Module 5125 from line buffer 5115 reading of data pasters, and is that the basis is to be communicated with component with this data color segments with the paster by bus 5120.When module 5125 finishes a data tape, read new data tape so that handle from line buffer 5115.Old then band impact damper is used to collect the new raster data that enters.By the definite height of being with of the height of paster, and line buffer 5115 needs the height of twice band.In this embodiment, preferred patch size is 32 * 32.
Module 5125 is implemented in hardware in this embodiment, so that the band processing speed can keep up with the rate at which the scanner 5105 produces new bands of data. The outputs of module 5125 are compressed connected components (CCs) on bus 5135 to the layout analysis module 5140, and an uncompressed or compressed downsampled image on bus 5130 to the inpainting module 5150. The data on bus 5135 can be written to memory until a significant region of the page, or the whole page, of CCs has been produced. Module 5140 performs layout analysis using only the compact connected-component data. Because the data is compact, the processing power needed for layout analysis is small. The layout analysis module can therefore be implemented as software (SW) executed in real time by an embedded processor. The output of the layout analysis module is the foreground information provided on bus 5145, which closely follows the data from bus 5135. The data on bus 5130 can likewise be written to memory until a significant region of the page, or the whole page, has been produced.
The inpainting module 5150 performs foreground removal tile by tile on the downsampled image (provided over bus 5130). Module 5150 can be implemented on the same embedded processor that runs the layout analysis software 5140 or, alternatively, in hardware (HW). Module 5150 then inpaints the foreground regions of the downsampled image with an estimated background color, so as to produce the background image. The foreground-removed background image is output on bus 5155, and the foreground mask is produced on bus 5175.
The output generation module 5160 creates a layout-analysed document, such as a PDF file, from the foreground image produced on bus 5175 and the background image produced on bus 5155. Module 5160 can be implemented in software running on the same embedded processor.
In a second embodiment, modules 5115 and 5125 operate in real time on the page-after-page scan data and produce the compact connected components and the downsampled image for page N. Modules 5140, 5150 and 5160 work sequentially using the data produced by module 5125 for page N-1. The system can therefore deliver layout-analysed documents in real time from the live data of a high-speed scanner.
Modules 5125, 5140, 5150 and 5160 can be implemented in the manner described for the corresponding steps of the first embodiment.
5.1 Color segmentation module
Figure 52 shows module 5125 in detail. The color-segmentation-to-CC module 5125 includes a de-halftoning module 5220, which receives the pixel stream in tile order and removes any artefacts caused by scanning material printed with a halftone printing system (for example, an inkjet printer). The de-halftoning module 5220 is a hardware embodiment of the software embodiment described earlier. Internally, the module can use static RAM (SRAM) and pipelined processing to reach the required speed.
The pixels output from the de-halftoning module 5220 are passed to the color conversion module 5230, which converts the pixels from the input color space (usually RGB) to the YCbCr luminance/chrominance space. Module 5230 performs the necessary multiplications and additions on each pixel according to the following formulas:
[Three equation images, one each for the Y, Cb and Cr components, are not reproduced in this text.]
These arithmetic operations are carried out in scaled fixed-point arithmetic, to reduce complexity and increase the speed of module 5125. The output of the color conversion module 5230 is passed to two modules: the downsampling module 5240 and the connected component analysis module 5260. The downsampling module 5240 takes a simple average of a group of 4 (in a 2 × 2 square) or 16 (in a 4 × 4 square) colors to form each output pixel. The pixels output from the downsampling module 5240 are then compressed by a hardware JPEG compressor 5250.
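A scaled fixed-point conversion followed by box downsampling can be sketched as below. The coefficients used are the widely published JFIF (ITU-R BT.601 full-range) values, chosen here only as an assumption for illustration; the coefficients in the patent's own formulas are not reproduced in this text and may differ. The 16-bit scale factor, the rounding scheme and all function names are likewise illustrative choices.

```python
SHIFT = 16                      # assumed fixed-point scale: coefficients * 2**16
HALF = 1 << (SHIFT - 1)         # added before shifting to round to nearest

def _fx(coefficient):
    """Convert a real coefficient to a scaled integer."""
    return int(round(coefficient * (1 << SHIFT)))

# Assumption: JFIF / BT.601 full-range coefficients, not taken from the patent.
Y_R, Y_G, Y_B = _fx(0.299), _fx(0.587), _fx(0.114)
CB_R, CB_G, CB_B = _fx(-0.168736), _fx(-0.331264), _fx(0.5)
CR_R, CR_G, CR_B = _fx(0.5), _fx(-0.418688), _fx(-0.081312)

def _clamp(value):
    return max(0, min(255, value))

def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YCbCr using integer arithmetic only."""
    y = (Y_R * r + Y_G * g + Y_B * b + HALF) >> SHIFT
    cb = ((CB_R * r + CB_G * g + CB_B * b + HALF) >> SHIFT) + 128
    cr = ((CR_R * r + CR_G * g + CR_B * b + HALF) >> SHIFT) + 128
    return _clamp(y), _clamp(cb), _clamp(cr)

def box_downsample(plane, factor):
    """Average each factor x factor square of a 2-D list into one output value.

    factor is 2 (4 samples per output) or 4 (16 samples per output).
    Rows and columns that do not fill a complete square are dropped.
    """
    out_h = len(plane) // factor
    out_w = len(plane[0]) // factor
    area = factor * factor
    result = []
    for by in range(out_h):
        row = []
        for bx in range(out_w):
            total = sum(
                plane[by * factor + dy][bx * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            )
            row.append((total + area // 2) // area)  # rounded integer mean
        result.append(row)
    return result
```

Using only integer multiplies, adds and shifts mirrors the reason the text gives for fixed-point arithmetic: it keeps the per-pixel work simple enough for a hardware pipeline.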
6 Computer implementation
Methods according to embodiments of the invention can be practised using one or more general-purpose computer systems, printing devices and other suitable computing devices. The processing described with reference to one or more of Figures 1-52 may be implemented as software, such as an application program executing within a computer system or embedded in a printing device. The software may include one or more computer programs, including application programs, an operating system, procedures, rules, data structures and data. The instructions may be formed as one or more code modules, each for performing one or more particular tasks. The software may be stored in a computer-readable medium, including for example one or more of the storage devices described below. The computer system loads the software from the computer-readable medium and then executes it.
Fig. 7 shows an example of a computer system 700 with which embodiments of the invention can be practised. A computer-readable medium having such software recorded on it is a computer program product. Use of the computer program product in the computer system can effect an advantageous apparatus for implementing one or more of the methods described above.
In Fig. 7, the computer system 700 is coupled to a network. An operator can provide input to the computer 750 using the keyboard 730 and/or a pointing device such as the mouse 732 (or, for example, a touchpad). The computer system 700 can have any number of output devices, including line printers, laser printers, plotters and other reproduction devices connected to the computer. The computer system 700 can be connected to one or more other computers via the communication interface 764 using a suitable communication channel 740, such as a modem communication path, a router or the like. The computer network 720 may include, for example, a local area network (LAN), a wide area network (WAN), an intranet and/or the Internet. The computer 750 may include a processing unit 766 (for example, one or more central processing units), memory 770 (which may include random access memory (RAM), read-only memory (ROM) or a combination of both), input/output (IO) interfaces 772, a graphics interface 760 and one or more storage devices 762. The storage devices 762 may include one or more of the following: a floppy disk, a hard disk drive, a magneto-optical disk drive, a CD-ROM, a DVD, a data card or memory stick, a flash RAM device, magnetic tape, or any other non-volatile storage device known to those skilled in the art. Although the storage devices are shown in Fig. 7 as connected directly to the bus, they can be connected through any suitable interface, such as a parallel port, serial port, USB interface, FireWire interface, wireless interface, PCMCIA slot or the like. For the purposes of this description, a storage unit may include one or more of the memory 770 and the storage devices 762 (as indicated by the dashed box surrounding these elements in Fig. 7).
Each component of the computer 750 is typically connected to one or more of the other devices by one or more buses 780, shown schematically in Fig. 7, which in turn comprise data, address and control buses. Although a single bus 780 is shown in Fig. 7, those skilled in the art will appreciate that a computer, printing device or other electronic computing device can have several buses, including one or more of a processor bus, a memory bus, a graphics card bus and a peripheral bus. Suitable bridges can be used to interface communication between these buses. Although a system using a CPU has been described, those skilled in the art will appreciate that other processing units capable of processing data and carrying out operations can be used without departing from the scope and spirit of the invention.
The computer system 700 is provided for illustrative purposes only, and other configurations can be employed without departing from the scope and spirit of the invention. Computers with which these embodiments can be practised include the IBM PC/AT or compatibles, laptop/notebook computers, one of the Macintosh (TM) family of PCs, Sun Sparcstation (TM), PDAs, workstations and the like. The foregoing are merely examples of the types of device with which embodiments of the invention can be practised. Typically, the processes of the embodiments are resident as software or a program recorded on a hard disk drive serving as the computer-readable medium, and are read and controlled using the processor. Intermediate storage of the program, of intermediate data and of any data fetched from the network can be accomplished using semiconductor memory.
In some cases, the program may be supplied encoded on a CD-ROM or a floppy disk, or alternatively could be read from a network via, for example, a modem device connected to the computer. Still further, the software can be loaded into the computer system from other computer-readable media, including magnetic tape, a ROM or integrated circuit, a magneto-optical disk, a radio or infrared transmission channel between the computer and another device, a computer-readable card such as a PCMCIA card, and the Internet and intranets, including e-mail transmissions and information recorded on websites and the like. The foregoing are merely examples of relevant computer-readable media. Other computer-readable media may be used without departing from the scope and spirit of the invention.
7 Industrial applicability
Embodiments of the invention are applicable to the computer and data processing industries. The foregoing describes only a small number of methods, apparatuses and computer program products for processing and compressing digital images in accordance with embodiments of the invention, and modifications and/or changes can be made to them without departing from the scope and spirit of the invention, the embodiments being illustrative and not restrictive.

Claims (41)

1. A method of segmenting a digital image comprising a plurality of pixels, the method comprising the steps of:
Forming a number of consecutive lines of the image as a band;
Processing the band as a plurality of blocks of pixels; and
Generating, in a single pass, at least one connected component for each block using the blocks of pixels, the generating step comprising:
Segmenting a block of pixels into at least one connected component, each connected component comprising a group of pixels that are spatially connected and semantically related;
Merging the at least one connected component of the block with at least one connected component segmented from at least one other, previously processed block; and
Storing, in a compact form, the positions in the image of the connected components of the block.
2. The method of claim 1, wherein the semantically related pixels comprise similarly colored pixels.
3. The method of claim 1, wherein:
The digital image is formed into a plurality of bands, each band comprising a predetermined number of consecutive lines of pixels; and
The bands are buffered and processed one by one, wherein the processing step comprises the following sub-steps performed on each currently buffered band:
Processing the current band as a plurality of blocks of pixels; and
Buffering and processing the blocks of the current band one by one for the generating step.
4. The method of claim 1, wherein the storing sub-step comprises storing M-1 binary bitmaps, where there are M connected components in a block and M is an integer.
5. The method of claim 1, wherein the storing sub-step comprises storing an index map.
6. The method of claim 1, wherein the segmenting sub-step comprises:
Estimating a number of representative colors for each block;
Quantizing each block to the representative colors; and
Forming connected components from each quantized block.
7. The method of claim 6, wherein the segmenting sub-step further comprises merging a subset of the formed connected components.
8. The method of claim 7, wherein the sub-step of merging a subset of the formed connected components comprises gathering statistics of the connected components.
9. The method of claim 8, wherein the segmenting sub-step further comprises the step of removing formed connected components that are deemed to be noise.
10. The method of claim 9, wherein the noise comprises connected components having a pixel count below a predetermined threshold and a ratio of border length to pixel count above another predetermined threshold.
11. The method of claim 8, wherein the statistics comprise any one or more of bounding box, pixel count, border length and average color.
12. The method of claim 8, wherein the step of merging the at least one connected component of the block with at least one connected component segmented from at least one other, previously processed block comprises:
Merging connected components of a block with connected components of the blocks to the left and above; and
Updating the statistics of the merged connected components.
13. The method of claim 12, wherein the statistics comprise any one or more of bounding box, pixel count, fill ratio and average color.
14. The method of claim 6, wherein the estimating sub-step comprises:
Forming a histogram associated with a plurality of color bins based on YUV data of the pixels in each block;
Classifying each block based on histogram statistics; and
Merging bin colors based on the block classification to form the representative colors.
15. The method of claim 14, wherein the estimating sub-step comprises the step of forming an index map for each pixel in a single pass.
16. The method of claim 15, wherein the quantizing step comprises:
Quantizing the non-empty bins to the representative colors;
Creating a bin mapping to the representative colors; and
Remapping the index map to the representative colors using the bin mapping.
17. The method of claim 14, wherein the forming sub-step comprises:
Determining a luminance band based on the Y value;
Determining a color column based on the U and V values;
Accumulating the pixel color into the mapped bin; and
Incrementing the pixel count of the mapped bin.
18. The method of claim 17, wherein the step of determining the luminance band further comprises luminance band anti-aliasing.
19. The method of claim 17, wherein the step of determining the color column further comprises color column anti-aliasing.
20. The method of claim 12, wherein the step of merging the at least one connected component of the block with at least one connected component segmented from at least one other, previously processed block comprises performing the following sub-steps for each connected component in the current block that touches the left and top borders:
Finding a list of connected components that touch the current connected component along the common border; and
Determining the best candidate for merging.
21. An apparatus for segmenting a digital image comprising a plurality of pixels, the apparatus comprising:
Means for forming a number of consecutive lines of the image as a band;
Means for processing the band as a plurality of blocks of pixels; and
Means for generating, in a single pass, at least one connected component for each block using the blocks of pixels, the means for generating comprising:
Means for segmenting a block of pixels into at least one connected component, each connected component comprising a group of pixels that are spatially connected and semantically related;
Means for merging the at least one connected component of the block with at least one connected component segmented from at least one other, previously processed block; and
Means for storing, in a compact form, the positions in the image of the connected components of the block.
22. A method of automatically producing a compact representation of a color document, the method comprising the steps of:
Segmenting a digital image of a color document page into connected components in a single pass in block raster order;
Dividing the digital image of the page into foreground and background images using layout analysis based on compact, connected-component statistics of the whole page;
Inpainting, in a single pass in block raster order, at least one portion of the background image where at least one portion of the foreground image obscures the background image; and
Storing the foreground image and the background image to form a compact document.
23. The method of claim 22, further comprising the step of downsampling the background image after the digital image is divided and before the inpainting step.
24. The method of claim 22 or 23, further comprising the step of compressing the inpainted background image.
25. The method of claim 24, wherein the compressing step involves lossy compression.
26. The method of claim 25, further comprising difference compression of the lossy-compressed background image.
27. An apparatus for automatically producing a compact representation of a color document, the apparatus comprising:
Means for segmenting a digital image of a color document page into connected components in a single pass in block raster order;
Means for dividing the digital image of the page into foreground and background images using layout analysis based on compact, connected-component statistics of the whole page;
Means for inpainting, in a single pass in block raster order, at least one portion of the background image where at least one portion of the foreground image obscures the background image; and
Means for storing the foreground image and the background image to form a compact document.
28. A method of analysing a digital image comprising a plurality of pixels, the method comprising the steps of:
Segmenting the digital image into objects, wherein the segmentation is represented by more than two labels;
Providing a set of attributes for each object;
For a subset of the objects, using a containment measure to determine whether a parent-child relationship exists between adjacent objects that share a border;
Forming groups of objects that share a common parent, based on object attributes; and
Classifying objects according to their attributes and grouping.
29. The method of claim 28, wherein the containment is determined using bounding boxes around each object and information describing touching relationships between objects.
30. The method of claim 29, wherein one object contains another object if the two objects touch at a border and the bounding box of the one object completely contains the bounding box of the other object.
31. The method of claim 28, wherein the group forming step comprises:
Considering pairs of child objects from a list of children of a common parent; and
Determining, using the object attributes, whether each pair should be grouped together.
32. The method of claim 31, wherein only neighboring objects in the list of child objects having the same parent are considered for grouping.
33. The method of claim 31, wherein objects are grouped based on bounding boxes and color information.
34. The method of claim 28, wherein a group of objects is classified as text according to a test of text-like qualities of the objects within the group.
35. The method of claim 34, wherein the test for text-like qualities comprises:
Identifying for each object a single value representing the position of the object;
Forming a histogram of the values; and
Identifying text according to attributes of the histogram.
36. The method of claim 28, further comprising the step, after the classifying step, of adding other objects to a text-classified group of objects according to their attributes, regardless of their parent-child attributes.
37. A method of analysing a digital image comprising a plurality of pixels of a document page, the method comprising the steps of:
Segmenting the digital image to form objects based on the image;
Forming groups of the objects; and
Determining whether each of the groups of objects represents text, the determining step comprising:
Identifying a single value for each object according to the position of the object on the page;
Forming a histogram of the values; and
Identifying text according to attributes of the histogram.
38. The method of claim 37, wherein the attribute of the histogram is the total number of objects in bins of the histogram that hold more than a specified number of objects.
39. The method of claim 37, wherein the attribute is the sum of the squares of the counts in the histogram.
40. The method of claim 37, wherein the single value representing the position of the object for each object is an edge of the bounding box of the object.
41. An apparatus for analysing a digital image comprising a plurality of pixels, the apparatus comprising:
Means for segmenting the digital image to form objects based on the image;
Means for forming groups of the objects; and
Means for determining whether each of the groups of objects represents text, the means for determining comprising:
Means for identifying a single value for each object according to the position of the object on the page;
Means for forming a histogram of the values; and
Means for identifying text according to attributes of the histogram.
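
For illustration only, the bounding-box containment test of claim 30 and the histogram attributes of claims 38 to 40 can be sketched as follows. This is not part of the claims. The object representation (a dict with "id" and "bbox" keys), the choice of the bottom edge as the single position value, the bin width and all function names are assumptions made for the example.

```python
from collections import Counter

def bbox_contains(outer, inner):
    """True if box outer = (x0, y0, x1, y1) completely encloses box inner."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])

def object_contains(a, b, touching_pairs):
    """Containment as in claim 30: a and b touch and a's box encloses b's box.

    touching_pairs is a set of frozensets of object ids that share a border.
    """
    touches = frozenset((a["id"], b["id"])) in touching_pairs
    return touches and bbox_contains(a["bbox"], b["bbox"])

def edge_histogram(objects, bin_size=4):
    """Histogram of one bounding-box edge per object (assumed: the bottom edge)."""
    return Counter(obj["bbox"][3] // bin_size for obj in objects)

def objects_in_full_bins(histogram, min_count):
    """Total objects in bins holding more than min_count objects (claim 38)."""
    return sum(count for count in histogram.values() if count > min_count)

def sum_of_squares(histogram):
    """Sum of the squared bin counts (claim 39)."""
    return sum(count * count for count in histogram.values())

if __name__ == "__main__":
    # Five glyph-like objects sharing a baseline, and five scattered objects.
    line = [{"id": i, "bbox": (10 * i, 80, 10 * i + 8, y)}
            for i, y in enumerate([100, 101, 100, 102, 101])]
    scattered = [{"id": i, "bbox": (0, y - 8, 8, y)}
                 for i, y in enumerate([10, 50, 90, 130, 170])]
    print(sum_of_squares(edge_histogram(line)))       # aligned edges share one bin
    print(sum_of_squares(edge_histogram(scattered)))  # each edge in its own bin
```

Objects on a common text line have nearly equal bottom edges, so their values pile into few bins and both attributes become large; scattered objects spread across many bins and score low.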
CNB200580043979XA 2004-12-21 2005-12-20 Image segmentation methods, compact representation production method, image analysis method and device Expired - Fee Related CN100543766C (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
AU2004242419 2004-12-21
AU2004242421 2004-12-21
AU2004242419A AU2004242419A1 (en) 2004-12-21 2004-12-21 Analysing digital image of a document page

Publications (2)

Publication Number Publication Date
CN101091186A CN101091186A (en) 2007-12-19
CN100543766C true CN100543766C (en) 2009-09-23

Family

ID=36660018

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB200580043979XA Expired - Fee Related CN100543766C (en) 2004-12-21 2005-12-20 Image segmentation methods, compact representation production method, image analysis method and device

Country Status (2)

Country Link
CN (1) CN100543766C (en)
AU (1) AU2004242419A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5847062B2 (en) * 2012-11-27 2016-01-20 京セラドキュメントソリューションズ株式会社 Image processing device
US20180123612A1 (en) * 2015-04-08 2018-05-03 Siemens Aktiengesellschaft Data reduction method and apparatus

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5761344A (en) * 1994-04-15 1998-06-02 Canon Kabushiki Kaisha Image pre-processor for character recognition system
US6285458B1 (en) * 1996-07-31 2001-09-04 Fuji Xerox Co., Ltd. Image processing apparatus and method
US6577763B2 (en) * 1997-11-28 2003-06-10 Fujitsu Limited Document image recognition apparatus and computer-readable storage medium storing document image recognition program
CN1453747A (en) * 2002-04-25 2003-11-05 微软公司 Cluster
CN1458791A (en) * 2002-04-25 2003-11-26 微软公司 Sectioned layered image system
CN1458628A (en) * 2002-04-25 2003-11-26 微软公司 System and method for simplifying file and image compression using mask code

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
基于视觉颜色聚类的彩色图像分割. 王圆圆,丁志杰,万华林.北京理工大学学报,第23卷第6期. 2003
复杂彩色文本图像中字符的提取. 陈又新,刘长松,丁晓青.中文信息学报,第17卷第5期. 2003

Also Published As

Publication number Publication date
AU2004242419A1 (en) 2006-07-06
CN101091186A (en) 2007-12-19

Similar Documents

Publication Publication Date Title
JP5008572B2 (en) Image processing method, image processing apparatus, and computer-readable medium
US7343046B2 (en) Systems and methods for organizing image data into regions
CN101453575B (en) Video subtitle information extracting method
CN1080059C (en) Dropped-from document image compression
US6807298B1 (en) Method for generating a block-based image histogram
US8526723B2 (en) System and method for the detection of anomalies in an image
Fauqueur et al. Region-based image retrieval: Fast coarse segmentation and fine color description
CN100563296C (en) The picture system of staged and layered
Cetinic et al. Learning the principles of art history with convolutional neural networks
GB2431793A (en) Image comparison
CN114005123A (en) System and method for digitally reconstructing layout of print form text
US20030012438A1 (en) Multiple size reductions for image segmentation
GB2431797A (en) Image comparison
US6360006B1 (en) Color block selection
CN100543766C (en) Image segmentation methods, compact representation production method, image analysis method and device
CN109800758A (en) A kind of natural scene character detecting method of maximum region detection
JP4217969B2 (en) Image processing apparatus and program
JP4182891B2 (en) Image processing device
JP2005260452A (en) Image processing apparatus
JP4507656B2 (en) Image processing device
CN112419208A (en) Construction drawing review-based vector drawing compiling method and system
JP4193687B2 (en) Image processing apparatus and program
Nguyen et al. Automatically improving image quality using tensor voting
JP4311183B2 (en) Image processing apparatus and program
AU2005211628A1 (en) Processing digital image

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090923

Termination date: 20191220