EP2352121A1 - Image processing apparatus and method - Google Patents

Image processing apparatus and method

Info

Publication number
EP2352121A1
EP2352121A1 (Application EP10187171A)
Authority
EP
European Patent Office
Prior art keywords
image
image processing
noise
characteristic
low resolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP10187171A
Other languages
German (de)
French (fr)
Inventor
Kyung-Sun Min
Bo-Gun Park
Byung-Cheol Song
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Samsung Electronics Co Ltd filed Critical Samsung Electronics Co Ltd
Publication of EP2352121A1 publication Critical patent/EP2352121A1/en
Withdrawn legal-status Critical Current


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/01Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformation in the plane of the image
    • G06T3/40Scaling the whole image or part thereof
    • G06T3/4046Scaling the whole image or part thereof using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformation in the plane of the image
    • G06T3/40Scaling the whole image or part thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration

Definitions

  • Apparatuses and methods consistent with the exemplary embodiments relate to an image processing apparatus and an image processing method which are capable of effectively generating a high resolution image from a low resolution image through learning, even under environments having various noises.
  • Scaling for enlarging or reducing the size of an image is an important technique in the field of image display apparatuses. Recently, with rapid increase of a screen size and a resolution, the scaling technique has been developed to generate a high quality image, beyond simple enlargement or reduction of an image.
  • Super-resolution (SR) technique is one of a variety of techniques for generating a high quality image.
  • The SR technique is classified into multiple-frame SR, which extracts a single high resolution image from a plurality of low resolution images, and single-frame SR, which extracts a single high resolution image from a single low resolution image.
  • FIGs. 1A and 1B are diagrams illustrating the multiple-frame SR.
  • Referring to FIG. 1A, a single high resolution image is generated through registration, etc., from a plurality of image frames of the same scene which are slightly different in phase from each other.
  • When a low resolution (LR) image is inputted, a plurality of pixels is extracted from each of a plurality of image frames of the same scene which are slightly different in phase from each other.
  • Four pluralities of pixels, indicated by four distinct markers in FIG. 1A, are sampled, each plurality being extracted from a different image frame.
  • Pixels for forming high resolution image frames are generated on the basis of the sampled pixels. For example, as shown in FIG. 1B, a plurality of new pixels may be generated using the four sampled pluralities together, and high resolution (HR) image frames may be generated from the new pixels.
  • This multiple-frame SR requires suitable movement estimation with respect to a plurality of image frames.
  • the amount of operations is generally large, thereby causing difficulty in real-time processing.
  • a frame memory having a considerable size is required for storing the operations, thereby causing much difficulty in practical realization.
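The interleaving idea behind FIG. 1B can be sketched as follows. This is a toy sketch assuming four ideally registered frames, each sampled from the high resolution grid at a distinct half-pixel phase; the function and variable names are illustrative only, and real multi-frame SR must first estimate these phases via motion estimation:

```python
import numpy as np

def reconstruct_from_shifted_frames(frames):
    """Interleave four 2x-downsampled frames, whose sampling offsets are
    (0,0), (0,1), (1,0) and (1,1), back onto one high resolution grid.
    Registration is assumed to be ideal (exact half-pixel phases)."""
    h, w = frames[(0, 0)].shape
    hr = np.zeros((2 * h, 2 * w), dtype=frames[(0, 0)].dtype)
    for (dy, dx), frame in frames.items():
        hr[dy::2, dx::2] = frame  # each frame fills one phase of the HR grid
    return hr

# Simulate four phase-shifted LR frames sampled from one HR scene.
hr_true = np.arange(64, dtype=float).reshape(8, 8)
frames = {(dy, dx): hr_true[dy::2, dx::2] for dy in (0, 1) for dx in (0, 1)}
hr_rec = reconstruct_from_shifted_frames(frames)
```

With ideal registration the HR scene is recovered exactly; in practice the motion estimation step this relies on is what makes multiple-frame SR expensive.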
  • FIG. 2 is a diagram illustrating the single-frame SR.
  • the single-frame SR is a learning-based technique which is used to overcome the problems of the multiple-frame SR.
  • In a learning process 210, pairs of blocks or patches having a predetermined size are generated, in consideration of image characteristics, using a variety of high resolution images and low resolution images corresponding to the high resolution images, and the generated pairs of blocks or patches are stored.
  • each pair of blocks or patches includes high resolution information and low resolution information.
  • In the learning process 210, the following operations are performed.
  • First, low resolution images corresponding to a variety of high resolution images are extracted through low-pass filtering (LPF) and sub-sampling (212).
  • Second, the extracted low resolution images are scaled back up to their original size, yielding scaled images.
  • Third, low frequency components are removed from the original high resolution images and the scaled images using a Band-Pass Filter (BPF) or a High-Pass Filter (HPF).
  • Fourth, examples of high frequency patches (HFP) having a predetermined size, from which the low frequency components are removed, and the corresponding scaled low frequency patches (LFP) are stored in a lookup table (LUT) (216).
  • In the synthesizing process, a low resolution block in the stored pairs matching each block of the inputted image is searched for, and high resolution information is obtained.
  • the following operations are performed. First, a low resolution image is inputted (222). Second, the inputted low resolution image is scaled, and every LFP is compared with the LFPs in the LUT. Then, an HFP corresponding to an optimal LFP which is selected in the LUT is used as a high frequency component of the inputted patch (224). Third, an extracted high resolution image is outputted (226).
  • the matching may be performed so that a high frequency component in a causal region which has been previously obtained in the optimal matching (searching) process is slightly overlapped, to thereby provide smoothness with respect to surrounding regions.
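The LUT matching step (224) described above can be sketched as a nearest-neighbor search. Here the LUT contents are random stand-ins rather than learned data, and the 5x5 patch size and L2 distance are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical LUT learned offline: each entry pairs a low frequency patch
# (LFP) with the high frequency patch (HFP) observed alongside it.
lut_lfp = rng.normal(size=(100, 25))   # 100 entries, 5x5 patches flattened
lut_hfp = rng.normal(size=(100, 25))

def synthesize_high_frequency(lfp):
    """Find the best-matching LFP in the LUT (minimum L2 distance) and
    return its paired HFP as the synthesized high frequency component."""
    d = np.sum((lut_lfp - lfp.ravel()) ** 2, axis=1)
    return lut_hfp[np.argmin(d)].reshape(5, 5)

query = lut_lfp[42].reshape(5, 5)      # a patch known to be in the LUT
hfp = synthesize_high_frequency(query)
```

The synthesized HFP is then added back onto the scaled low frequency patch; the overlap-and-smooth refinement mentioned above is omitted here.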
  • The single-frame SR has a relatively small operational amount compared with the multiple-frame SR.
  • Nevertheless, the operational amount is still significantly large in real applications.
  • Further, scaling efficiency deteriorates in the presence of noise.
  • FIG. 3 illustrates a process of scaling an image mixed with noises.
  • a cascade technique is typically used which firstly removes noises and then interpolates the image, as shown in FIG. 3 . That is, referring to FIG. 3 , if a low resolution image mixed with noises is inputted, the noises are firstly removed through a noise removing process (310), and new pixels are subsequently generated through an image interpolating process (320). In this way, a high resolution image may be output with noises being removed.
  • However, noises may remain after passing through the noise removing process, which may cause deterioration in scaling.
  • Further, the noise removing process may blur the image, and the SR process may be affected according to the degree of the blur.
  • Another aspect of the exemplary embodiments is to provide an image processing apparatus and an image processing method which can enhance scaling ability by performing matching after initial noise removal in both the learning process and the high resolution image synthesizing process, and which select a lookup table according to a local noise characteristic, thereby performing effective noise removal and scaling at the same time, in the case where a low resolution image with noises is up-scaled to a high resolution image without noises using the lookup table which includes high frequency synthesizing information previously obtained through learning.
  • an image processing apparatus including: an image input unit which receives an image; and an image processing unit which generates reference data on the basis of a plurality of learning images which are classified into a plurality of first classes according to a noise characteristic and a plurality of second classes according to an image characteristic, and which performs scaling for the received image on the basis of the generated reference data.
  • the image processing unit may classify each of the plurality of first classes into the plurality of second classes.
  • the learning images may include pairs of low resolution images and high resolution images corresponding to the low resolution images.
  • the reference data may include pairs of the low resolution images and corresponding weights which are set according to the image characteristic.
  • the image processing unit may convert a low resolution image having noises into a high resolution image.
  • the image characteristic may include at least one of a high frequency component and an edge characteristic of the image.
  • the noise characteristic may include at least one of a kind and an intensity of noises.
  • the image processing unit may predict, in a case where the received image is distorted by noise, an intensity of the noise in a region unit of the received image.
  • the image processing unit may perform the scaling for the received image on the basis of the reference data corresponding to a noise characteristic and an image characteristic of the received image.
  • an image processing method including: receiving an image; generating reference data on the basis of a plurality of learning images classified into a plurality of first classes according to a noise characteristic and a plurality of second classes according to an image characteristic; and performing scaling for the received image on the basis of the generated reference data.
  • Each of the plurality of first classes may be classified into the plurality of second classes.
  • the learning images may include pairs of low resolution images and high resolution images corresponding to the low resolution images.
  • the reference data may include pairs of the low resolution images and corresponding weights which are set according to the image characteristic.
  • a low resolution image having noise may be converted into a high resolution image.
  • the image characteristic may include at least one of a high frequency component and an edge characteristic of the image.
  • the noise characteristic may include at least one of a kind and an intensity of noises.
  • the intensity of noises may be predicted in a region unit of the received image, in a case where the received image is distorted by the noises.
  • the scaling may be performed for the received image on the basis of the reference data corresponding to a noise characteristic and an image characteristic of the received image.
  • a method of generating reference data used to scale an image including: inserting noise in a low resolution image; predicting an intensity of the inserted noise in the low resolution image; removing the noise from the low resolution image; classifying the low resolution image and a corresponding high resolution image into a first class, of a plurality of first classes, according to the predicted intensity of the inserted noise; classifying the low resolution image and the corresponding high resolution image into a second class, of a plurality of second classes, according to an image characteristic; and generating the reference data according to the classified low resolution image.
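The reference-data generation steps above can be sketched as a toy pipeline. All thresholds, the AWGN noise model, the crude high frequency measure, and the decision to skip the denoising and prediction steps are assumptions made purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def noise_class(sigma):
    """First class: bucket by noise intensity (thresholds illustrative)."""
    return 0 if sigma < 3 else 1 if sigma < 6 else 2

def image_class(patch):
    """Second class: bucket by a crude high frequency measure (illustrative)."""
    hf = np.abs(np.diff(patch, axis=0)).mean()
    return 0 if hf < 0.5 else 1

reference_data = {}
for _ in range(20):
    hr = rng.normal(size=(10, 10))               # high resolution learning image
    lr = hr[::2, ::2]                            # its low resolution counterpart
    sigma = rng.uniform(0, 9)
    noisy = lr + rng.normal(0, sigma, lr.shape)  # step 1: insert noise (AWGN)
    denoised = noisy                             # steps 2-3 (predict, denoise) omitted;
                                                 # the true sigma is used directly below
    key = (noise_class(sigma), image_class(denoised))   # steps 4-5: classify
    reference_data.setdefault(key, []).append((denoised, hr))
```

Each bucket of `reference_data` would then feed the per-class LUT generation described later.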
  • a high resolution image can be effectively generated from a low resolution image even under environments having noises.
  • a user can enjoy the high resolution image from an image signal of low quality or low resolution, through a high resolution image processing apparatus.
  • FIGs. 1A and 1B are diagrams illustrating multiple-frame SR
  • FIG. 2 is a diagram illustrating single-frame SR
  • FIG. 3 is a diagram for illustrating a process of scaling an image mixed with noises
  • FIG. 4 is a block diagram illustrating a configuration of an image processing apparatus according to an exemplary embodiment
  • FIG. 5 is a flowchart illustrating an image processing method according to an exemplary embodiment
  • FIG. 6A is a diagram illustrating a learning process for forming a lookup table (LUT) according to an exemplary embodiment
  • FIG. 6B is a diagram illustrating a process of synthesizing a low resolution image into a high resolution image according to an exemplary embodiment
  • FIG. 7 is a diagram illustrating image classes which are classified according to a noise characteristic and an image characteristic according to an exemplary embodiment
  • FIG. 8 is a block diagram illustrating a process of generating an LUT for every class according to an exemplary embodiment
  • FIG. 9 is a diagram illustrating a process of extracting pairs of low resolution (LR) blocks and high resolution (HR) blocks according to an exemplary embodiment
  • FIG. 10 is a diagram illustrating a final LUT with respect to an N-th class according to an exemplary embodiment
  • FIG. 11 is a diagram illustrating an image frame which is divided into a plurality of regions according to an exemplary embodiment.
  • FIG. 12 is a diagram illustrating a process of generating an HR block corresponding to an LR block according to an exemplary embodiment.
  • FIG. 4 is a block diagram illustrating a configuration of an image processing apparatus 400 according to an exemplary embodiment.
  • the image processing apparatus 400 may be provided as a digital TV, a desktop computer, a laptop computer, a large format display (LFD), a digital camera, a mobile device, a set-top box, a display device, or the like. Further, the image processing apparatus 400 may be provided as any electronic device which can perform image scaling.
  • the image processing apparatus 400 may include an image input unit 410, an image processing unit 420, and a display unit 430.
  • the image input unit 410 may receive an image.
  • the received image may be a low resolution image or a high resolution image.
  • the low resolution image may be an image mixed with noises.
  • the image processing unit 420 may generate reference data on the basis of a plurality of learning images which belong to a first class with reference to a noise characteristic and a second class with reference to an image characteristic.
  • the image processing unit 420 may classify each of the plurality of first classes into the plurality of second classes. More specifically, images may be classified into the plurality of first classes according to the intensity of noises, and the images which are included in each of the first classes may be classified into the plurality of second classes by clustering to have a similar high frequency characteristic.
  • the learning images may include pairs of low resolution images and high resolution images corresponding to the low resolution images.
  • the reference data may include pairs of the low resolution images and corresponding weights which are set according to the image characteristic.
  • the image characteristic may include at least one of a high frequency component and an edge characteristic of the image.
  • the noise characteristic may include at least one of a kind and an intensity of the noises.
  • the image processing unit 420 may perform scaling for an inputted image on the basis of the generated reference data.
  • the image processing unit 420 may convert a low resolution image having noises into a high resolution image. More specifically, the image processing unit 420 may predict, if an image distorted by noises is inputted, the intensity of the noises by every region of the input image. Further, the image processing unit 420 may perform scaling for the inputted image on the basis of the reference data corresponding to a noise characteristic and an image characteristic of the inputted image.
  • the display unit 430 may display an image processed by the image processing unit 420. More specifically, in a case where the image processing unit 420 scales a low resolution image into a high resolution image, the display unit 430 may display the scaled high resolution image.
  • the display unit 430 may include a panel driver (not shown) and a display panel (not shown) such as a Liquid Crystal Display (LCD), Organic Light Emitting Diode (OLED), Plasma Display Panel (PDP), a tube-based display, or the like.
  • FIG. 5 is a flowchart illustrating an image processing method according to an exemplary embodiment.
  • the image processing method may include a learning process which generates a lookup table (LUT), and a synthesizing process which synthesizes a low resolution image into a high resolution image.
  • the image processing apparatus 400 performs learning for generating the LUT (S501). This learning process for generating the LUT will be described in detail below with reference to FIG. 6A .
  • the image processing apparatus 400 determines whether a low resolution image is to be converted into a high resolution image (S502). If it is determined that the low resolution image is not to be converted into the high resolution image, the procedure is terminated.
  • the image processing apparatus 400 synthesizes the low resolution image into the high resolution image with reference to the LUT (S503). The processing of synthesizing the low resolution image into the high resolution image will be described in detail below with reference to FIG. 6B .
  • FIG. 6A is a diagram illustrating a learning process for generating an LUT according to an exemplary embodiment.
  • the LUT is generated using low and high resolution images for learning.
  • operation S611 for noise insertion, operation S612 for noise prediction, operation S613 for initial noise removal, operation S614 for image classification, and operation S615 for LUT generation are sequentially performed.
  • arbitrary noises are inserted in a low resolution image.
  • virtual noises such as Additive White Gaussian Noise (AWGN) may be inserted in the low resolution image.
  • the intensity of the noises is predicted.
  • a variety of methods may be used to predict the intensity of the noises.
  • a dispersion value may be calculated with respect to flat regions in the image, and the calculated dispersion value may then be considered as the intensity of the noises.
  • movement estimation/compensation may be performed in an image time axis, blocks having good movement estimation only may be extracted, and a dispersion value of movement estimation errors of the extracted blocks may then be considered as the intensity of the noises.
  • time and space characteristics are all used to predict the noises. It is understood that all exemplary embodiments are not limited to the above-described prediction methods, and any existing prediction method may be used in another exemplary embodiment.
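The flat-region dispersion method described above can be sketched as follows. The block size, the quantile used to decide which blocks count as "flat", and the variance-to-sigma conversion are all assumptions for illustration:

```python
import numpy as np

def estimate_noise_sigma(img, block=8, flat_quantile=0.1):
    """Estimate noise intensity as the dispersion of the flattest blocks:
    blocks whose variance is in the lowest quantile are assumed to be flat,
    so their variance is attributed to noise alone."""
    h, w = img.shape
    variances = [
        img[y:y + block, x:x + block].var()
        for y in range(0, h - block + 1, block)
        for x in range(0, w - block + 1, block)
    ]
    variances = np.sort(variances)
    k = max(1, int(len(variances) * flat_quantile))
    return float(np.sqrt(variances[:k].mean()))

rng = np.random.default_rng(2)
clean = np.zeros((64, 64))           # a perfectly flat scene
noisy = clean + rng.normal(0, 5.0, clean.shape)
sigma_hat = estimate_noise_sigma(noisy)
```

Selecting only the flattest blocks biases the estimate slightly downward, but keeps textured regions from inflating the predicted intensity.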
  • the intensity of noises does not significantly vary for every image. Thus, it may be considered that images present in the same shot or in a certain time frame have the same intensity of noises. Furthermore, the prediction of the intensity of the noises may be applied to both the learning process for generating the LUT and the synthesizing processing for synthesizing the low resolution image into the high resolution image (operation S503 in FIG. 5 ).
  • In operation S613 for initial noise removal, noises are removed.
  • Since the noises have random patterns, it may be difficult to classify images in a noisy state. Further, since noises badly affect exact generation of high frequency synthesized information, the noises are firstly removed.
  • a variety of noise removal (NR) techniques may be used.
  • a bilateral filter is a possible technique which may be used.
  • wavelet noise reduction using Gaussian scale mixture (GSM) may be used.
  • The K-singular value decomposition (K-SVD) algorithm forms an LUT or dictionary through learning with respect to a variety of noise patches.
  • Noises are removed in such a manner that a plurality of optimal patches corresponding to respective noise-distorted patches is searched from an LUT obtained by SVD, and a weighted average thereof is then calculated. It is understood that all exemplary embodiments are not limited to the above-described noise removal techniques, and a variety of noise removal techniques such as a low-pass filter, median filter, Wiener filter, or the like may be used in other exemplary embodiments.
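Of the techniques listed, the bilateral filter is the simplest to sketch directly. This is a straightforward (unoptimized) implementation; the kernel radius and the two sigma parameters are illustrative choices, not values from the patent:

```python
import numpy as np

def bilateral_filter(img, radius=2, sigma_s=2.0, sigma_r=10.0):
    """Bilateral filter: each output pixel is a spatially- and range-weighted
    average of its neighborhood, smoothing noise while preserving edges."""
    h, w = img.shape
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(ys**2 + xs**2) / (2 * sigma_s**2))   # fixed spatial kernel
    pad = np.pad(img, radius, mode='reflect')
    out = np.empty_like(img, dtype=float)
    for y in range(h):
        for x in range(w):
            win = pad[y:y + 2 * radius + 1, x:x + 2 * radius + 1]
            # Range kernel: down-weight neighbors whose intensity differs
            # strongly from the center, so averaging does not cross edges.
            rng_w = np.exp(-(win - img[y, x])**2 / (2 * sigma_r**2))
            wgt = spatial * rng_w
            out[y, x] = (wgt * win).sum() / wgt.sum()
    return out

# A noisy step edge: filtering reduces noise without washing out the step.
rng = np.random.default_rng(3)
step = np.hstack([np.zeros((16, 8)), np.full((16, 8), 100.0)])
noisy = step + rng.normal(0, 4.0, step.shape)
smooth = bilateral_filter(noisy)
```

Because the range kernel suppresses averaging across the 100-level step, the edge survives while the Gaussian noise on either side is attenuated.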
  • image classes are classified according to a noise characteristic and an image characteristic.
  • each of a plurality of first classes which is classified according to the noise characteristic may be classified into a plurality of second classes according to the image characteristic.
  • the image classification is performed as follows. First, pairs of low resolution (LR) images and high resolution (HR) images are generated from the LR images and the HR images corresponding to the LR images. In this respect, the pairs of the LR and HR images are classified into the plurality of first classes according to the intensity of noises. Then, the pairs having a similar high frequency characteristic among the pairs of the LR and HR images included in each first class are clustered to then be classified into the plurality of second classes.
  • FIG. 7 illustrates an example of image classes classified according to the noise characteristic and the image characteristic, according to an exemplary embodiment.
  • the pairs of the LR and HR images included in each class have similar noise and image characteristics.
  • For example, a first one of the second classes 710 may include pairs of the LR and HR images which have a noise intensity of 1-3 and a high frequency component of 70% or more,
  • a second one of the second classes 720 may include pairs which have a noise intensity of 1-3 and a high frequency component of 40-70%, and
  • an N-th one of the second classes 730 may include pairs which have a noise intensity of 1-3 and a high frequency component of less than 40%.
  • each of the second classes 710, 720, and 730 includes three pairs of the LR and HR images, respectively, though it is understood that another exemplary embodiment is not limited thereto.
  • the pairs of the LR and HR images that belong to each class are employed to generate the LUT for each class.
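The class assignment of FIG. 7's example can be written down directly; the 70%/40% boundaries and the 1-3 noise range come from the example above, while using the reference numerals 710/720/730 as return values is just a labeling convenience:

```python
def second_class(noise_intensity, hf_ratio):
    """Assign an LR/HR pair to one of the second classes of FIG. 7's example:
    noise intensity 1-3 (the first class), split by the share of high
    frequency components in the image."""
    assert 1 <= noise_intensity <= 3     # first class of the example
    if hf_ratio >= 0.70:
        return 710                       # high frequency component of 70% or more
    if hf_ratio >= 0.40:
        return 720                       # high frequency component of 40-70%
    return 730                           # high frequency component below 40%
```

A full implementation would use many such second classes per first class, one LUT being trained per (first, second) class pair.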
  • an LUT or dictionary is generated for every class.
  • The Additive White Gaussian Noise is exemplified for convenience of description, but Rayleigh, gamma, exponential, uniform, or salt & pepper noise or the like may also be present.
  • the LUT may be generated in consideration of only the intensity of noises, or may be generated in consideration of both the kind and the intensity of noises.
  • FIG. 6B illustrates a process of synthesizing a low resolution image into a high resolution image according to an exemplary embodiment.
  • a high resolution image is synthesized using a low resolution image.
  • operation S621 for region division, operation S622 for noise prediction by every region, operation S623 for initial noise removal, operation S624 for patch generation, and operation S625 of high resolution image synthesizing are sequentially performed.
  • In addition to operations S622, S623 and S625, operation S626 for LUT searching may be further performed.
  • an inputted low resolution image is divided in a predetermined region unit.
  • the intensity of noises is predicted by every region divided in operation S621.
  • the prediction of the intensity of the noises may be performed in a similar manner as in operation S612 for noise prediction in the learning process for generating the LUT as described with reference to FIG. 6A .
  • In operation S623 for initial noise removal, the noises are removed.
  • the noise removal may be performed in a similar manner as in operation S613 for initial noise removal in the learning process for generating the LUT as described with reference to FIG. 6A .
  • In operation S626, the LUT is searched to obtain, for each patch, high frequency synthesizing information corresponding to the intensity of noises and an image characteristic.
  • the high frequency synthesizing information may be a high frequency component itself, or may be weight information for directly obtaining high resolution information from low resolution information through linear combination. Then, a high resolution image patch may be generated from the high frequency synthesizing information and the low resolution image patch, to thereby synthesize a high resolution image.
  • FIG. 8 is a block diagram illustrating a process of generating an LUT for every class according to an exemplary embodiment.
  • an LUT is generated for storing therein high frequency information for generating a high resolution image for every class.
  • Referring to FIG. 8, operation S810 for LR and HR block extraction, operation S820 for LR Laplacian (LRL) extraction and normalization, and operation S830 for LRL reference clustering are sequentially performed.
  • a plurality of low resolution (LR) blocks and a plurality of high resolution (HR) blocks are extracted from a low resolution (LR) image and a high resolution (HR) image corresponding to the low resolution image.
  • the LR blocks and the corresponding HR blocks are obtained from the pairs of the LR and HR images classified in the first classes.
  • the extracted LR blocks and corresponding HR blocks form pairs of the LR blocks and HR blocks.
  • FIG. 9 is a diagram illustrating a process of extracting example pairs of LR blocks 910 and HR blocks 920 according to an exemplary embodiment.
  • the sizes of the LR block 910 and the HR block 920 are 5x5, respectively.
  • The LR block 910, indicated by a solid line, includes 5x5 LR pixels.
  • the HR block 920 corresponding to the LR block 910 is indicated by a dotted line and includes HR pixels of 5x5.
  • the HR block 920 is arranged in a center area of the LR block 910. In this respect, a variety of pairs of the LR and HR images may be selected for effective learning.
  • the LR blocks 910 and the HR blocks 920 are extracted on the basis of a sampled LR image, with a suitable interval of one or more pixels in horizontal and vertical directions.
  • the extraction of the pairs of the LR and HR blocks is performed for each first class. Considering that the extracted LR and HR blocks are used for learning, the LR and HR blocks are hereinafter represented as LRt and HRt, respectively.
  • a Laplacian operator is applied to each LRt block to sufficiently consider a high frequency characteristic and an edge characteristic, to extract LRt Laplacian (LRLt) blocks.
  • A 3x3 Laplacian operator as shown in the following Table 1 may be applied to each pixel of the LRt block, though it is understood that another exemplary embodiment is not limited thereto.
    [Table 1]
    -1 -1 -1
    -1  8 -1
    -1 -1 -1
  • The normalized LRLt may be represented as LRL_t^n.
  • The pairs of the LRt and HRt are changed into sets of an LRL_t^n - LRt - HRt shape.
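The Laplacian extraction and normalization of operation S820 can be sketched as follows. Reflect padding at the borders and normalization to unit L2 norm are assumptions; the patent excerpt does not specify either detail:

```python
import numpy as np

# The 3x3 Laplacian operator of Table 1.
LAPLACIAN = np.array([[-1, -1, -1],
                      [-1,  8, -1],
                      [-1, -1, -1]], dtype=float)

def lrl_normalized(lr_block):
    """Apply the Table 1 Laplacian to every pixel of an LR block
    (reflect-padded at the borders) and normalize the response to unit
    L2 norm, giving the LRL_t^n block used for clustering."""
    pad = np.pad(lr_block.astype(float), 1, mode='reflect')
    h, w = lr_block.shape
    lrl = np.array([[(LAPLACIAN * pad[y:y + 3, x:x + 3]).sum()
                     for x in range(w)] for y in range(h)])
    norm = np.linalg.norm(lrl)
    return lrl / norm if norm > 0 else lrl

block = np.arange(25, dtype=float).reshape(5, 5)   # a toy 5x5 LRt block
lrl_n = lrl_normalized(block)
```

Normalizing removes the overall contrast scale, so clustering groups blocks by the shape of their high frequency structure rather than its magnitude.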
  • The initial learning information may be defined as the following formula 2: BP_m, where 0 ≤ m ≤ M-1.
  • The clustering is performed on the basis of the LRL_t^n blocks, to thereby group the pairs having a similar Laplacian characteristic.
  • K-means clustering or the like may be employed for the clustering.
  • The cluster center blocks, L in number, may be defined as the following formula 3: LRLC_{t,l}^n, where 0 ≤ l ≤ L-1.
  • Each cluster may include at least one pair of the LRt and HRt blocks having a similar characteristic.
  • IH(i,j) represents pixels in a position (i,j) in the HR blocks
  • IL(x,y) represents pixels in a position (x,y) in the LR blocks.
  • i, j, x and y satisfy 0 ≤ i, j, x, y ≤ 4. That is, one 5x5 weight is present for every HR position.
  • Each of the L clusters includes at least one pair of the LRt and HRt blocks.
  • the total number of the pairs of the LRt and HRt blocks included in the respective clusters is M.
  • A least mean square (LMS) algorithm may be employed. Since the pairs of the LRt and HRt blocks included in the l-th cluster have a similar Laplacian characteristic, the weights according to formula 4 may be similar to each other. In this case, the weight corresponding to the position (i, j) in the HRt block with respect to the LRt block included in the l-th cluster is defined as the following formula 6: w_{ij}^l(x, y), where 0 ≤ x, y ≤ N-1.
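The per-cluster weight learning can be sketched with a batch least-squares solve, which is a stand-in for the iterative LMS algorithm named above (both minimize the same squared error). The linear LR-to-HR relationship and all sizes here are synthetic test data, not values from the patent:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 5

# Toy cluster: M pairs of LR and HR blocks related by a fixed (hypothetical)
# linear map, which the learned weights should recover exactly.
M = 200
true_w = rng.normal(size=(N * N, N * N))   # one 5x5 weight per HR pixel position
lr_blocks = rng.normal(size=(M, N * N))    # flattened LRt blocks of the cluster
hr_blocks = lr_blocks @ true_w             # IH(i,j) = sum_xy w_ij(x,y) * IL(x,y)

# Solve min ||LR @ W - HR||^2 over all HR positions at once
# (batch least-squares stand-in for the iterative LMS update).
w_hat, *_ = np.linalg.lstsq(lr_blocks, hr_blocks, rcond=None)
```

Because the pairs within a cluster share a similar Laplacian characteristic, one shared weight set per cluster approximates all of its pairs well, which is what makes the LUT compact.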
  • the LUT including high frequency information on the plurality of classes may be generated.
  • the LUT generated in this way is shown in FIG. 10 , which illustrates a final LUT for the N-th class.
  • The final LUT (or dictionary) includes L pairs of LRLC_{t,l}^n for indexing and weights corresponding thereto.
  • Each of the LRLC_{t,l}^n has a size of NxN, and NxN weight coefficients are present for each of the NxN HR positions in the weights.
  • Accordingly, the total number of the weight coefficients for each cluster is N^4.
  • the size of the total weight information in the LUT is 320 KB (where it is assumed that every coefficient occupies 1 byte).
  • the total size of the LUT may be about 333 KB. In this way, the LUT including high frequency information may be generated with respect to all the classes of the number N. Since the learning is performed using the HR image without noises, the final result becomes an image which is scaled with noises being removed.
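The 320 KB and 333 KB figures can be reproduced under one possible parameterization. L = 512 clusters and N = 5 are assumptions chosen here because they match the quoted sizes (taking 1 KB = 1000 bytes and one byte per coefficient); the excerpt itself does not state L or N:

```python
L, N = 512, 5                      # assumed values consistent with the quoted sizes

weight_bytes = L * N**4            # N^4 weight coefficients per cluster, 1 byte each
index_bytes = L * N**2             # one NxN LRLC index block per cluster, 1 byte each
total_bytes = weight_bytes + index_bytes

# weight_bytes = 512 * 625 = 320,000 B  (~320 KB, matching the text)
# total_bytes  = 320,000 + 12,800 = 332,800 B  (~333 KB, matching the text)
```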
  • the LUT generated as described above is used to synthesize each LR block into the HR block in the high resolution image synthesizing process to be described later.
  • FIG. 11 illustrates an image frame which is divided into a plurality of regions according to an exemplary embodiment.
  • the intensity of the noises is predicted in the same manner as in the learning process for generating the LUT. However, in this synthesizing process, the intensity of the noises is predicted in a predetermined region unit, differently from the learning process.
  • The noises are hardly uniformly distributed over the images, and the intensity of the noises may vary locally. Thus, the noises may be predicted in the region unit.
  • the regions of the image may be divided without overlapping between the regions.
  • the regions may be divided to overlap each other.
  • Alternatively, the regions may be divided on the basis of objects, foregrounds and backgrounds, rather than as rectangles of a predetermined size.
  • initial noise removal is performed for an inputted LR image in a similar manner as in the learning process, in consideration of the intensity of the noises for every region.
  • Wiener filtering may be performed for the concerned region in consideration of the intensity of the noises, though it is understood that another exemplary embodiment is not limited thereto.
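The per-region noise prediction and Wiener filtering described above can be sketched as follows. This is a toy NumPy sketch: the 3x3 window, the region size of 8, and the crude per-region noise-variance stand-in are illustrative assumptions, not values taken from the patent.

```python
import numpy as np

def local_wiener(region, noise_var):
    """Simple local (3x3) Wiener filter: shrink each pixel toward the
    local mean by an amount depending on the estimated local variance."""
    p = np.pad(region, 1, mode='edge')
    # stack the 9 shifted copies to get each pixel's 3x3 neighbourhood
    stack = np.stack([p[i:i + region.shape[0], j:j + region.shape[1]]
                      for i in range(3) for j in range(3)])
    mean = stack.mean(axis=0)
    var = stack.var(axis=0)
    gain = np.maximum(var - noise_var, 0) / np.maximum(var, noise_var)
    return mean + gain * (region - mean)

def denoise_by_region(image, region_size=8):
    """Predict the noise variance per region and Wiener-filter each
    region with that estimate (the variance estimate here is a crude
    stand-in for a real flat-region measurement)."""
    out = np.empty_like(image, dtype=float)
    h, w = image.shape
    for y in range(0, h, region_size):
        for x in range(0, w, region_size):
            r = image[y:y + region_size, x:x + region_size].astype(float)
            noise_var = r.var() * 0.5  # crude stand-in for a flat-region estimate
            out[y:y + region_size, x:x + region_size] = local_wiener(r, noise_var)
    return out

rng = np.random.default_rng(1)
clean = np.tile(np.linspace(0, 255, 16), (16, 1))
noisy = clean + rng.normal(0, 10, clean.shape)
denoised = denoise_by_region(noisy)
```

Estimating the noise variance separately per region is what lets a locally stronger noise receive a locally stronger filter.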
  • FIG. 12 is a diagram illustrating a process of generating HR blocks corresponding to LR blocks, according to an exemplary embodiment.
  • an LR image from which initial noises are removed for every region is divided into LR blocks of a predetermined number in an overlapping manner.
  • the LRL is extracted and normalized for each LR_in block as shown in FIG. 8, and the entry LRLC_t^n which is most similar to LRL_in^n is searched for in the LUT. That is, distances between the inputted LRL_in^n and each of the L entries LRLC_t^n included in the LUT are calculated to search for a cluster having a minimum distance. This process may be referred to as matching.
  • a variety of distance measures such as the well-known L1-norm or L2-norm may be employed for distance measurement for the matching. If it is assumed that an optimal cluster obtained through the matching process is an l_best-th cluster, pixels in the position (i,j) in the HR blocks corresponding to the inputted LR blocks are obtained according to the formula 4, above, using the weights w_ij^(l_best).
  • two or more pixels may be generated in a specific position in the HR blocks, according to how the overlapping is performed in FIG. 12 .
  • an average thereof is obtained and determined as final HR pixels.
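The matching and overlap-averaging steps above can be sketched as follows. The LUT contents are random stand-ins, and since formula 4 is not reproduced in this excerpt, the HR block is synthesized by a simplified element-wise weighting; only the minimum-L2-distance matching and the averaging of overlapped HR pixels are the point of the sketch.

```python
import numpy as np

# Hypothetical LUT: L cluster centroids (normalized LR Laplacian blocks)
# and, per cluster, an NxN weight map used to synthesize HR pixels.
rng = np.random.default_rng(0)
N, L = 5, 8
lut_centroids = rng.standard_normal((L, N * N))   # stand-ins for LRLC_t^n
lut_weights = rng.standard_normal((L, N, N))

def match_cluster(lrl_in_n):
    """Find the cluster index with minimum L2 distance to the input."""
    d = np.linalg.norm(lut_centroids - lrl_in_n.ravel(), axis=1)
    return int(np.argmin(d))

# Overlapping HR pixels: accumulate a sum and a count, then average.
hr_sum = np.zeros((16, 16))
hr_cnt = np.zeros((16, 16))

def accumulate(hr_block, y, x):
    hr_sum[y:y + N, x:x + N] += hr_block
    hr_cnt[y:y + N, x:x + N] += 1

lrl_in = rng.standard_normal((N, N))
l_best = match_cluster(lrl_in)
accumulate(lut_weights[l_best] * lrl_in, 0, 0)
accumulate(lut_weights[l_best] * lrl_in, 2, 2)  # overlaps the first block

hr_final = np.where(hr_cnt > 0, hr_sum / np.maximum(hr_cnt, 1), 0.0)
```

Positions covered by two or more HR blocks end up with a count greater than one and are averaged, as the text describes.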
  • noise removal and up-scaling are simultaneously performed for every region as described above.
  • noise pixels may distort the LUT itself, or may affect the matching in the inference process.
  • in a case where a specific pixel deviates greatly from its surrounding pixels, the specific pixel may be considered as a noise pixel and excluded from the LUT generation and the matching in the inference process.
  • all (i.e., 100%) of the noise pixels may be excluded for learning, or may be replaced with an average value of surrounding pixel values for learning.
  • the second threshold value T2 is generally smaller than the first threshold value T1.
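A minimal sketch of the noise-pixel test and the replacement with a surrounding average might look as follows; the deviation-from-neighbour-mean criterion and the threshold values T1 and T2 used here are illustrative assumptions, not values from the patent.

```python
import numpy as np

def noise_pixel_mask(block, threshold):
    """Flag pixels whose deviation from the mean of their 8 neighbours
    exceeds `threshold` (a simple impulse-noise test)."""
    p = np.pad(block.astype(float), 1, mode='edge')
    h, w = block.shape
    neigh = sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) - block
    neigh /= 8.0
    return np.abs(block - neigh) > threshold

block = np.full((5, 5), 100.0)
block[2, 2] = 250.0                      # a salt-noise pixel

T1, T2 = 80.0, 40.0                      # illustrative thresholds, T2 < T1
mask_learn = noise_pixel_mask(block, T1)
mask_infer = noise_pixel_mask(block, T2)

# Replace flagged pixels with the surrounding average before learning.
p = np.pad(block, 1, mode='edge')
neigh = (sum(p[i:i + 5, j:j + 5] for i in range(3) for j in range(3)) - block) / 8.0
repaired = np.where(mask_learn, neigh, block)
```

A smaller threshold in the inference stage flags milder deviations, matching the statement that T2 is generally smaller than T1.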
  • the exemplary embodiments can also be embodied as computer-readable code on a computer-readable recording medium.
  • the computer-readable recording medium is any data storage device that can store data that can be thereafter read by a computer system. Examples of the computer-readable recording medium include read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tapes, floppy disks, and optical data storage devices.
  • the computer-readable recording medium can also be distributed over network-coupled computer systems so that the computer-readable code is stored and executed in a distributed fashion.
  • the exemplary embodiments may be written as computer programs transmitted over a computer-readable transmission medium, such as a carrier wave, and received and implemented in general-use digital computers that execute the programs.
  • one or more units of the image processing apparatus 400 can include a processor or microprocessor executing a computer program stored in a computer-readable medium, such as a local storage.

Abstract

An image processing apparatus and an image processing method, the image processing apparatus including: an image input unit which receives an image; and an image processing unit which generates reference data on the basis of a plurality of learning images classified into a plurality of first classes according to a noise characteristic and a plurality of second classes according to an image characteristic, and which performs scaling for the received image on the basis of the generated reference data.

Description

    BACKGROUND
  • 1. Field
  • Apparatuses and methods consistent with the exemplary embodiments relate to an image processing apparatus and an image processing method which are capable of effectively generating a high resolution image from a low resolution image through learning, even under environments having various noises.
  • 2. Description of the Related Art
  • Scaling for enlarging or reducing the size of an image is an important technique in the field of image display apparatuses. Recently, with rapid increase of a screen size and a resolution, the scaling technique has been developed to generate a high quality image, beyond simple enlargement or reduction of an image.
  • Super-resolution (SR) technique is one of a variety of techniques for generating a high quality image. The SR technique is classified into multiple-frame SR for extracting a single high resolution image from a plurality of low resolution images and single-frame SR for extracting a single high resolution image from a single low resolution image.
  • FIGs. 1A and 1B are diagrams illustrating the multiple-frame SR. In the case of the multiple-frame SR, a single high resolution image is generated through registration, etc., from a plurality of image frames of the same scene which are slightly different in phase from each other. More specifically, if a low resolution (LR) image is inputted, a plurality of pixels is extracted from each of a plurality of image frames of the same scene which are slightly different in phase from each other. For example, as shown in FIG. 1A, a plurality of pixels ○, a plurality of pixels ◇, a plurality of pixels Δ and a plurality of pixels • are sampled, respectively. In this respect, the pixels ○, the pixels ◇, the pixels Δ and the pixels • are extracted from different image frames, respectively.
  • Then, pixels for forming high resolution image frames are generated on the basis of the sampled pixels. For example, as shown in FIG. 1B, a plurality of pixels □ may be generated using the pixels ○, ◇, Δ and •. High resolution (HR) image frames may be generated from the pixels □.
  • This multiple-frame SR requires suitable movement estimation with respect to a plurality of image frames. Thus, the amount of operations is generally large, thereby causing difficulty in real-time processing. Further, a frame memory having a considerable size is required for storing the image frames, thereby causing much difficulty in practical realization.
  • FIG. 2 is a diagram illustrating the single-frame SR. The single-frame SR is a learning-based technique which is used to overcome the problems of the multiple-frame SR. Referring to FIG. 2, in a learning process 210, pairs of blocks or patches having a predetermined size in consideration of image characteristics are generated using a variety of high resolution images and low resolution images corresponding to the high resolution images, and the generated pairs of blocks or patches are stored. In this respect, each pair of blocks or patches includes high resolution information and low resolution information.
  • For example, as shown in FIG. 2, in the learning process 210, the following operations are performed. First, low resolution images corresponding to a variety of high resolution images are extracted through low-pass filtering (LPF) and sub-sampling (212). Second, the low resolution images are scaled using predetermined interpolation such as cubic convolution (214). Third, low frequency components are removed from the original high resolution images and the scaled images using a Band-Pass Filter (BPF) or a High-Pass Filter (HPF). Then, examples of high frequency patches (HFP) having a predetermined size from which the low frequency components are removed and the corresponding scaled low frequency patches (LFP) are stored in a lookup table (LUT) (216).
  • In a synthesis or inference process (220), if an arbitrary low resolution image is input, a low resolution block in the pairs matching each block of the inputted image is searched, and high resolution information is obtained. For example, as shown in FIG. 2, in the synthesis or inference process (220), the following operations are performed. First, a low resolution image is inputted (222). Second, the inputted low resolution image is scaled, and every LFP is compared with the LFPs in the LUT. Then, an HFP corresponding to an optimal LFP which is selected in the LUT is used as a high frequency component of the inputted patch (224). Third, an extracted high resolution image is outputted (226). In this respect, the matching may be performed so that a high frequency component in a causal region which has been previously obtained in the optimal matching (searching) process is slightly overlapped, to thereby provide smoothness with respect to surrounding regions.
  • The single-frame SR requires a relatively small amount of operations compared with the multiple-frame SR. However, even in the case of the single-frame SR, since every LFP should be compared with all the LFPs in the LUT, the amount of operations is significantly large in real applications. Further, there exists a problem in which scaling quality deteriorates according to noise. Thus, it is necessary to provide a technique which can effectively reduce the amount of operations while removing influences due to noises.
  • FIG. 3 illustrates a process of scaling an image mixed with noises. In order to scale the noise-mixed image, a cascade technique is typically used which firstly removes noises and then interpolates the image, as shown in FIG. 3. That is, referring to FIG. 3, if a low resolution image mixed with noises is inputted, the noises are firstly removed through a noise removing process (310), and new pixels are subsequently generated through an image interpolating process (320). In this way, a high resolution image may be output with noises being removed.
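The cascade of FIG. 3 can be sketched as a denoising stage followed by an interpolating stage; a 3x3 median filter and nearest-neighbour 2x up-scaling are used here purely as simple stand-ins for the two stages, not as the specific techniques of the patent.

```python
import numpy as np

def median3(img):
    """3x3 median filter as a simple noise-removing stage."""
    p = np.pad(img, 1, mode='edge')
    h, w = img.shape
    stack = np.stack([p[i:i + h, j:j + w] for i in range(3) for j in range(3)])
    return np.median(stack, axis=0)

def upscale2x(img):
    """Nearest-neighbour interpolation, the simplest stand-in for the
    image interpolating process of FIG. 3."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

rng = np.random.default_rng(7)
clean = np.tile(np.linspace(0, 255, 16), (16, 1))
noisy = clean.copy()
salt = rng.random(clean.shape) < 0.05
noisy[salt] = 255.0                      # impulse noise

hr = upscale2x(median3(noisy))           # denoise first, then interpolate
```

Any noise that survives the first stage is simply enlarged by the second stage, which is exactly the weakness of the cascade described next.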
  • However, according to the cascade technique, noises may remain after passing through the noise removing process, which may cause deterioration in scaling quality. Further, in the case of the learning-based single-frame SR, if a blurred image is inputted after noise removal, the SR process may be affected according to the degree of the blur.
  • SUMMARY
  • According to the present invention there is provided an apparatus and method as set forth in the appended claims. Other features of the invention will be apparent from the dependent claims, and the description which follows.
  • Accordingly, it is an aspect of the exemplary embodiments to provide an image processing apparatus and an image processing method which efficiently generate a high resolution image from a low resolution image through learning, even under environments having various noises.
  • Another aspect of the exemplary embodiments is to provide an image processing apparatus and an image processing method which enhance a scaling ability by performing matching after initial noise removal in a learning process and a high resolution image synthesizing process, and which select a lookup table according to a local noise characteristic, to thereby perform effective noise removal and scaling at the same time, in a case where a low resolution image with noises is up-scaled to a high resolution image without noises using the lookup table which includes high frequency synthesizing information previously obtained through learning.
  • According to an aspect of an exemplary embodiment, there is provided an image processing apparatus including: an image input unit which receives an image; and an image processing unit which generates reference data on the basis of a plurality of learning images which are classified into a plurality of first classes according to a noise characteristic and a plurality of second classes according to an image characteristic, and which performs scaling for the received image on the basis of the generated reference data.
  • The image processing unit may classify each of the plurality of first classes into the plurality of second classes.
  • The learning images may include pairs of low resolution images and high resolution images corresponding to the low resolution images.
  • The reference data may include pairs of the low resolution images and corresponding weights which are set according to the image characteristic.
  • The image processing unit may convert a low resolution image having noises into a high resolution image.
  • The image characteristic may include at least one of a high frequency component and an edge characteristic of the image.
  • The noise characteristic may include at least one of a kind and an intensity of noises.
  • The image processing unit may predict, in a case where the received image is distorted by noise, an intensity of the noise in a region unit of the received image.
  • The image processing unit may perform the scaling for the received image on the basis of the reference data corresponding to a noise characteristic and an image characteristic of the received image.
  • According to an aspect of another exemplary embodiment, there is provided an image processing method including: receiving an image; generating reference data on the basis of a plurality of learning images classified into a plurality of first classes according to a noise characteristic and a plurality of second classes according to an image characteristic; and performing scaling for the received image on the basis of the generated reference data.
  • Each of the plurality of first classes may be classified into the plurality of second classes.
  • The learning images may include pairs of low resolution images and high resolution images corresponding to the low resolution images.
  • The reference data may include pairs of the low resolution images and corresponding weights which are set according to the image characteristic.
  • A low resolution image having noise may be converted into a high resolution image.
  • The image characteristic may include at least one of a high frequency component and an edge characteristic of the image.
  • The noise characteristic may include at least one of a kind and an intensity of noises.
  • The intensity of noises may be predicted in a region unit of the received image, in a case where the received image is distorted by the noises.
  • The scaling may be performed for the received image on the basis of the reference data corresponding to a noise characteristic and an image characteristic of the received image.
  • According to an aspect of another exemplary embodiment, there is provided a method of generating reference data used to scale an image, the method including: inserting noise in a low resolution image; predicting an intensity of the inserted noise in the low resolution image; removing the noise from the low resolution image; classifying the low resolution image and a corresponding high resolution image into a first class, of a plurality of first classes, according to the predicted intensity of the inserted noise; classifying the low resolution image and the corresponding high resolution image into a second class, of a plurality of second classes, according to an image characteristic; and generating the reference data according to the classified low resolution image.
  • According to the exemplary embodiments, a high resolution image can be effectively generated from a low resolution image even under environments having noises. Thus, a user can enjoy the high resolution image from an image signal of low quality or low resolution, through a high resolution image processing apparatus.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The above and/or other aspects will become apparent and more readily appreciated from the following description of the exemplary embodiments, taken in conjunction with the accompanying drawings, in which:
  • FIGs. 1A and 1B are diagrams illustrating multiple-frame SR;
  • FIG. 2 is a diagram illustrating single-frame SR;
  • FIG. 3 is a diagram for illustrating a process of scaling an image mixed with noises;
  • FIG. 4 is a block diagram illustrating a configuration of an image processing apparatus according to an exemplary embodiment;
  • FIG. 5 is a flowchart illustrating an image processing method according to an exemplary embodiment;
  • FIG. 6A is a diagram illustrating a learning process for forming a lookup table (LUT) according to an exemplary embodiment;
  • FIG. 6B is a diagram illustrating a process of synthesizing a low resolution image into a high resolution image according to an exemplary embodiment;
  • FIG. 7 is a diagram illustrating image classes which are classified according to a noise characteristic and an image characteristic according to an exemplary embodiment;
  • FIG. 8 is a block diagram illustrating a process of generating an LUT for every class according to an exemplary embodiment;
  • FIG. 9 is a diagram illustrating a process of extracting pairs of low resolution (LR) blocks and high resolution (HR) blocks according to an exemplary embodiment;
  • FIG. 10 is a diagram illustrating a final LUT with respect to an N-th class according to an exemplary embodiment;
  • FIG. 11 is a diagram illustrating an image frame which is divided into a plurality of regions according to an exemplary embodiment; and
  • FIG. 12 is a diagram illustrating a process of generating an HR block corresponding to an LR block according to an exemplary embodiment.
  • DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS
  • Below, exemplary embodiments will be described in detail with reference to accompanying drawings so as to be easily realized by a person having ordinary knowledge in the art. The exemplary embodiments may be embodied in various forms without being limited to the exemplary embodiments set forth herein. Descriptions of well-known parts are omitted for clarity, and like reference numerals refer to like elements throughout. Expressions such as "at least one of," when preceding a list of elements, modify the entire list of elements and do not modify the individual elements of the list.
  • FIG. 4 is a block diagram illustrating a configuration of an image processing apparatus 400 according to an exemplary embodiment. The image processing apparatus 400 may be provided as a digital TV, a desktop computer, a laptop computer, a large format display (LFD), a digital camera, a mobile device, a set-top box, a display device, or the like. Further, the image processing apparatus 400 may be provided as any electronic device which can perform image scaling.
  • Referring to FIG. 4, the image processing apparatus 400 may include an image input unit 410, an image processing unit 420, and a display unit 430.
  • The image input unit 410 may receive an image. The received image may be a low resolution image or a high resolution image. The low resolution image may be an image mixed with noises.
  • The image processing unit 420 may generate reference data on the basis of a plurality of learning images which are classified into a plurality of first classes according to a noise characteristic and a plurality of second classes according to an image characteristic.
  • According to an exemplary embodiment, the image processing unit 420 may classify each of the plurality of first classes into the plurality of second classes. More specifically, images may be classified into the plurality of first classes according to the intensity of noises, and the images which are included in each of the first classes may be classified into the plurality of second classes by clustering to have a similar high frequency characteristic.
  • The learning images may include pairs of low resolution images and high resolution images corresponding to the low resolution images. The reference data may include pairs of the low resolution images and corresponding weights which are set according to the image characteristic. The image characteristic may include at least one of a high frequency component and an edge characteristic of the image. The noise characteristic may include at least one of a kind and an intensity of the noises. The image processing unit 420 may perform scaling for an inputted image on the basis of the generated reference data.
  • According to an exemplary embodiment, the image processing unit 420 may convert a low resolution image having noises into a high resolution image. More specifically, the image processing unit 420 may predict, if an image distorted by noises is inputted, the intensity of the noises by every region of the input image. Further, the image processing unit 420 may perform scaling for the inputted image on the basis of the reference data corresponding to a noise characteristic and an image characteristic of the inputted image.
  • The display unit 430 may display an image processed by the image processing unit 420. More specifically, in a case where the image processing unit 420 scales a low resolution image into a high resolution image, the display unit 430 may display the scaled high resolution image. The display unit 430 may include a panel driver (not shown) and a display panel (not shown) such as a Liquid Crystal Display (LCD), Organic Light Emitting Diode (OLED), Plasma Display Panel (PDP), a tube-based display, or the like.
  • FIG. 5 is a flowchart illustrating an image processing method according to an exemplary embodiment. The image processing method may include a learning process which generates a lookup table (LUT), and a synthesizing process which synthesizes a low resolution image into a high resolution image.
  • Referring to FIG. 5, the image processing apparatus 400 performs learning for generating the LUT (S501). This learning process for generating the LUT will be described in detail below with reference to FIG. 6A.
  • The image processing apparatus 400 determines whether a low resolution image is to be converted into a high resolution image (S502). If it is determined that the low resolution image is not to be converted into the high resolution image, the procedure is terminated.
  • In contrast, if it is determined that the low resolution image is to be converted into the high resolution image, the image processing apparatus 400 synthesizes the low resolution image into the high resolution image with reference to the LUT (S503). The processing of synthesizing the low resolution image into the high resolution image will be described in detail below with reference to FIG. 6B.
  • FIG. 6A is a diagram illustrating a learning process for generating an LUT according to an exemplary embodiment. In the learning process for generating the LUT, the LUT is generated using low and high resolution images for learning. To this end, operation S611 for noise insertion, operation S612 for noise prediction, operation S613 for initial noise removal, operation S614 for image classification, and operation S615 for LUT generation are sequentially performed.
  • In operation S611 for noise insertion, arbitrary noises are inserted in a low resolution image. For example, virtual noises such as Additive White Gaussian Noise (AWGN) may be inserted in the low resolution image.
  • In operation S612 for noise prediction, the intensity of the noises is predicted. In this respect, a variety of methods may be used to predict the intensity of the noises. According to an exemplary embodiment, a dispersion value may be calculated with respect to flat regions in the image, and the calculated dispersion value may then be considered as the intensity of the noises. According to another exemplary embodiment, movement estimation/compensation may be performed in an image time axis, blocks having good movement estimation only may be extracted, and a dispersion value of movement estimation errors of the extracted blocks may then be considered as the intensity of the noises. According to still another exemplary embodiment, time and space characteristics are all used to predict the noises. It is understood that all exemplary embodiments are not limited to the above-described prediction methods, and any existing prediction method may be used in another exemplary embodiment.
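The flat-region dispersion method mentioned above can be sketched as follows: block variances are collected, and the lowest-variance blocks are treated as flat regions whose dispersion is attributed to noise alone. The block size and the quantile are illustrative assumptions.

```python
import numpy as np

def estimate_noise_sigma(image, block=8, flat_quantile=0.1):
    """Estimate the noise intensity as the dispersion of the flattest
    blocks: blocks whose variance falls in the lowest quantile are
    assumed to be flat image regions, so their variance is attributed
    to noise alone."""
    h, w = image.shape
    variances = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            variances.append(image[y:y + block, x:x + block].var())
    variances = np.sort(np.asarray(variances))
    k = max(1, int(len(variances) * flat_quantile))
    return float(np.sqrt(variances[:k].mean()))

rng = np.random.default_rng(2)
clean = np.tile(np.linspace(0, 255, 64), (64, 1))
clean[:, :32] = 128.0                    # left half flat, right half a ramp
sigma_true = 5.0
noisy = clean + rng.normal(0, sigma_true, clean.shape)
sigma_hat = estimate_noise_sigma(noisy)
```

Textured blocks have variance from both image content and noise, so restricting the estimate to the flattest quantile avoids overestimating the noise.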
  • The intensity of noises does not significantly vary for every image. Thus, it may be considered that images present in the same shot or in a certain time frame have the same intensity of noises. Furthermore, the prediction of the intensity of the noises may be applied to both the learning process for generating the LUT and the synthesizing process for synthesizing the low resolution image into the high resolution image (operation S503 in FIG. 5).
  • In operation S613 for initial noise removal, noises are removed. As the noises have random patterns, it may be difficult to classify images in a noisy state. Further, since noises adversely affect the generation of exact high frequency synthesizing information, the noises are firstly removed. To this end, a variety of noise removal (NR) techniques may be used. For example, a bilateral filter is a possible technique which may be used. Furthermore, wavelet noise reduction using Gaussian scale mixture (GSM) may be used. The K-singular value decomposition (K-SVD) algorithm forms an LUT or dictionary through learning with respect to a variety of noisy patches. In this case, noises are removed in such a manner that a plurality of optimal patches corresponding to respective noise-distorted patches is searched from an LUT obtained by SVD, and then a weighted average thereof is calculated. It is understood that all exemplary embodiments are not limited to the above-described noise removal techniques, and a variety of noise removal techniques such as a low-pass filter, median filter, Wiener filter, or the like may be used in other exemplary embodiments.
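A minimal bilateral filter, one of the NR techniques mentioned above, might be sketched as follows; the window radius and the spatial/range sigmas are illustrative choices, not values from the patent.

```python
import numpy as np

def bilateral(img, radius=2, sigma_s=1.5, sigma_r=20.0):
    """Minimal bilateral filter: each pixel becomes a weighted average of
    its neighbours, where weights fall off with both spatial distance and
    intensity difference, so strong edges are not blurred away."""
    h, w = img.shape
    p = np.pad(img.astype(float), radius, mode='edge')
    out = np.zeros((h, w))
    norm = np.zeros((h, w))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = p[radius + dy:radius + dy + h, radius + dx:radius + dx + w]
            ws = np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2))
            wr = np.exp(-((shifted - img) ** 2) / (2 * sigma_r ** 2))
            out += ws * wr * shifted
            norm += ws * wr
    return out / norm

rng = np.random.default_rng(8)
clean = np.full((16, 16), 50.0)
clean[:, 8:] = 200.0                     # a strong vertical edge
noisy = clean + rng.normal(0, 5, clean.shape)
smoothed = bilateral(noisy)
```

The range weight is what distinguishes this from a plain Gaussian blur: across the 50/200 edge the intensity difference drives the weight to nearly zero, so the edge survives while flat-region noise is averaged out.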
  • In operation S614 for image classification, image classes are classified according to a noise characteristic and an image characteristic. According to an exemplary embodiment, each of a plurality of first classes which is classified according to the noise characteristic may be classified into a plurality of second classes according to the image characteristic.
  • The image classification is performed as follows. First, pairs of low resolution (LR) images and high resolution (HR) images are generated from the LR images and the HR images corresponding to the LR images. In this respect, the pairs of the LR and HR images are classified into the plurality of first classes according to the intensity of noises. Then, the pairs having a similar high frequency characteristic among the pairs of the LR and HR images included in each first class are clustered to then be classified into the plurality of second classes.
  • The pairs of the LR and HR images classified into the second classes are shown in FIG. 7. FIG. 7 illustrates an example of image classes classified according to the noise characteristic and the image characteristic, according to an exemplary embodiment. The pairs of the LR and HR images included in each class have similar noise and image characteristics. For example, a first second class 710 may include pairs of the LR and HR images which have a noise intensity of 1-3 and a high frequency component of 70% or more, a second second class 720 may include pairs of the LR and HR images which have a noise intensity of 1-3 and a high frequency component of 40-70%, and an Nth second class 730 may include pairs of the LR and HR images which have a noise intensity of 1-3 and a high frequency component less than 40%. In the case of the example in FIG. 7, each of the second classes 710, 720, and 730 includes three pairs of the LR and HR images, respectively, though it is understood that another exemplary embodiment is not limited thereto. The pairs of the LR and HR images that belong to each class are employed to generate the LUT for each class.
  • In operation S615 for LUT generation, an LUT or dictionary is generated for every class.
  • Noises are various in kind. In the above description, the Additive White Gaussian Noise is exemplified for the convenience of description, but there may be present a Rayleigh, gamma, exponential, uniform, or salt & pepper noise or the like. Thus, the LUT may be generated in consideration of only the intensity of noises, or may be generated in consideration of both the kind and the intensity of noises.
  • FIG. 6B illustrates a process of synthesizing a low resolution image into a high resolution image according to an exemplary embodiment. In the synthesizing process, a high resolution image is synthesized using a low resolution image. To this end, operation S621 for region division, operation S622 for noise prediction by every region, operation S623 for initial noise removal, operation S624 for patch generation, and operation S625 for high resolution image synthesizing are sequentially performed. In operations S622, S623 and S625, operation S626 for LUT searching may be further performed.
  • In operation S621 for region division, an inputted low resolution image is divided in a predetermined region unit.
  • In operation S622 for noise prediction by every region, the intensity of noises is predicted by every region divided in operation S621. The prediction of the intensity of the noises may be performed in a similar manner as in operation S612 for noise prediction in the learning process for generating the LUT as described with reference to FIG. 6A.
  • In operation S623 for initial noise removal, the noises are removed. The noise removal may be performed in a similar manner as in operation S613 for initial noise removal in the learning process for generating the LUT as described with reference to FIG. 6A.
  • In operation S624 for patch generation, a low resolution image with noises being removed is divided in a patch unit.
  • In operation S625 for high resolution image synthesizing, the LUT is searched to obtain, for each patch, high frequency synthesizing information corresponding to the intensity of noises and an image characteristic. The high frequency synthesizing information may be a high frequency component itself, or may be weight information for directly obtaining high resolution information from low resolution information through linear combination. Then, a high resolution image patch may be generated from the high frequency synthesizing information and the low resolution image patch, to thereby synthesize a high resolution image.
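The linear-combination reading of the high frequency synthesizing information can be sketched as follows: one weight vector over all NxN LR pixels per HR pixel position, i.e., N^4 coefficients per cluster. The near-identity weights below are a purely illustrative stand-in for learned values.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 5
lr_patch = rng.uniform(0, 255, (N, N))

# Hypothetical weight information for one matched cluster: a full NxN
# weight vector per HR pixel position, giving N^4 coefficients in total.
weights = rng.standard_normal((N, N, N, N)) * 0.001
idx = np.arange(N)
weights[idx[:, None], idx[None, :], idx[:, None], idx[None, :]] += 1.0

# Each HR pixel (i, j) is a linear combination of all N*N LR pixels.
hr_patch = np.einsum('ijkl,kl->ij', weights, lr_patch)
```

Because the synthesis is a pure linear combination, applying a stored weight table at inference time is cheap: no high frequency component needs to be stored explicitly.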
  • Hereinafter, a process of generating an LUT for every class will be described with reference to FIGs. 8 and 9. For the convenience of description, a case in which an image is enlarged by a factor of two is exemplified, though it is understood that another exemplary embodiment is not limited thereto.
  • FIG. 8 is a block diagram illustrating a process of generating an LUT for every class according to an exemplary embodiment. Referring to FIG. 8, in the process of generating the LUT for every class, an LUT is generated for storing therein high frequency information for generating a high resolution image for every class. To this end, operation S810 for LR and HR blocks extraction, operation S820 for LR Laplacian (LRL) extraction and normalization, operation S830 for LRL reference clustering, operation S840 for weight generation for every cluster, and operation S850 for LRL-weight LUT generation are sequentially performed.
  • In operation S810 for LR and HR blocks extraction, a plurality of low resolution (LR) blocks and a plurality of high resolution (HR) blocks are extracted from a low resolution (LR) image and a high resolution (HR) image corresponding to the low resolution image. In this respect, the LR blocks and the corresponding HR blocks are obtained from the pairs of the LR and HR images classified in the first classes. The extracted LR blocks and corresponding HR blocks form pairs of the LR blocks and HR blocks.
  • The process of extracting the pairs of the LR and HR blocks will be described in more detail with reference to FIG. 9. FIG. 9 is a diagram illustrating a process of extracting example pairs of LR blocks 910 and HR blocks 920 according to an exemplary embodiment. In the example shown in FIG. 9, the sizes of the LR block 910 and the HR block 920 are 5x5, respectively. The LR block 910 is indicated by a solid line and includes LR pixels of 5x5. The HR block 920 corresponding to the LR block 910 is indicated by a dotted line and includes HR pixels of 5x5. The HR block 920 is arranged in a center area of the LR block 910. In this respect, a variety of pairs of the LR and HR images may be selected for effective learning. Then, the LR blocks 910 and the HR blocks 920 are extracted on the basis of a sampled LR image, with a suitable interval of one or more pixels in horizontal and vertical directions.
  • Referring back to FIG. 8, in operation S810 for LR and HR block extraction, the extraction of the pairs of the LR and HR blocks is performed for each first class. Considering that the extracted LR and HR blocks are used for learning, the LR and HR blocks are hereinafter represented as LRt and HRt, respectively.
  • In operation S820 for LRL extraction and normalization, a Laplacian operator is applied to each LRt block to sufficiently consider a high frequency characteristic and an edge characteristic, to extract LRt Laplacian (LRLt) blocks. According to an exemplary embodiment, a 3x3 Laplacian operator as shown in the following table 1 may be applied to each pixel of the LRt block, though it is understood that another exemplary embodiment is not limited thereto. [Table 1]
    -1 -1 -1
    -1 8 -1
    -1 -1 -1
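Applying the Table 1 operator to every pixel of an LRt block can be sketched as follows. Handling the borders by replicate padding is an added assumption; the patent does not specify border treatment.

```python
# Applying the 3x3 Laplacian of Table 1 to each pixel of a block
# (a sketch; replicate padding at the borders is an added assumption).

LAPLACIAN = [[-1, -1, -1],
             [-1,  8, -1],
             [-1, -1, -1]]

def laplacian_block(block):
    """Convolve a 2-D list of pixel values with the 3x3 Laplacian kernel."""
    h, w = len(block), len(block[0])

    def px(y, x):                      # replicate-pad at the borders
        y = min(max(y, 0), h - 1)
        x = min(max(x, 0), w - 1)
        return block[y][x]

    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = sum(LAPLACIAN[dy + 1][dx + 1] * px(y + dy, x + dx)
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
    return out
```

A flat block produces all zeros, while an isolated bright pixel produces a strong positive response at its position, which is exactly the high frequency and edge behavior the operation relies on.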
  • After the LRLt blocks are extracted, normalization is performed for each LRLt block. If the average of an LRLt block is represented as μ and its standard deviation as σ, the normalization is performed for each pixel value X in the LRLt block, to thereby obtain a result according to the following formula 1:

    (X - μ) / σ

  • Here, the normalized LRLt may be represented as LRL_t^n. Through this process, the pairs of LRt and HRt are changed into triplets of the shape LRL_t^n - LR_t - HR_t.
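Formula 1 amounts to a per-block zero-mean, unit-variance normalization. A minimal sketch follows; the guard for a flat block with zero standard deviation is an added assumption.

```python
# Per-block normalization of formula 1: each value X becomes (X - mu) / sigma,
# where mu and sigma are the block's own mean and standard deviation
# (a sketch; the zero-sigma guard is an added assumption).

import math

def normalize_block(block):
    values = [v for row in block for v in row]
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    sigma = math.sqrt(var)
    if sigma == 0:                      # flat block: nothing to normalize
        return [[0.0] * len(block[0]) for _ in block]
    return [[(v - mu) / sigma for v in row] for row in block]
```

The result always has zero mean and unit variance, so blocks with different contrast levels become directly comparable in the later clustering and matching steps.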
  • In operation S830 for LRL reference clustering, the block pairs having a similar image characteristic are clustered. When block pairs of the shape LRL_t^n - LR_t - HR_t of a total number M are present for learning, and the m-th block pair BP_m is (LRL_t,m^n, LR_t,m, HR_t,m), the initial learning information may be defined as the following formula 2:

    {BP_m | 0 ≤ m ≤ M - 1}

  • In this case, the clustering is performed on the basis of the LRL_t^n blocks, to thereby group the pairs having a similar Laplacian characteristic. K-means clustering or the like may be employed for the clustering.
  • If it is assumed that there are L clusters, where L is less than M, the L LRL_t^n cluster center blocks may be defined as the following formula 3:

    {LRLC_t,l^n | 0 ≤ l ≤ L - 1}
  • Each cluster may include at least one pair of the LRt and HRt blocks having a similar characteristic.
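The clustering step above can be illustrated with a bare-bones K-means over flattened LRL_t^n blocks. The take-first-k initialization and fixed iteration count are simplifications; a production system would use a tuned library implementation.

```python
# Minimal K-means sketch for grouping normalized Laplacian blocks (flattened
# to vectors) into L clusters (initialization and stopping rule simplified).

def kmeans(vectors, k, iters=20):
    centers = [list(v) for v in vectors[:k]]          # naive initialization
    assign = [0] * len(vectors)
    for _ in range(iters):
        for i, v in enumerate(vectors):               # assignment step
            assign[i] = min(range(k),
                            key=lambda c: sum((a - b) ** 2
                                              for a, b in zip(v, centers[c])))
        for c in range(k):                            # center update step
            members = [vectors[i] for i in range(len(vectors)) if assign[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assign
```

Each returned center plays the role of one LRLC_t,l^n index block, and all block pairs assigned to it form that cluster's training set.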
  • In operation S840 for weight generation for every cluster, a weight suitable for each cluster is obtained through training.
  • The following formula 4 may be made between the pixels in each pair of the LRt and HRt blocks:

    I_H(i,j) = Σ_{x=0}^{N-1} Σ_{y=0}^{N-1} w_ij(x,y) I_L(x,y)

  • In this respect, I_H(i,j) represents the pixel in the position (i,j) in the HR block, and I_L(x,y) represents the pixel in the position (x,y) in the LR block. For example, in the case of the LR block and the HR block in which N is five as shown in FIG. 9 (that is, the sizes of the blocks are 5x5), i, j, x and y satisfy 0 ≤ i, j, x, y ≤ 4. That is, one 5x5 weight kernel is present for every HR position.
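Formula 4 can be written out directly as code for a single HR position (i, j): the HR pixel is a weighted sum of all NxN LR pixels in the block, using that position's own NxN weight kernel.

```python
# Formula 4 for one HR position: I_H(i,j) is the weighted sum of all N*N
# LR pixels, with weights_ij being the N*N kernel w_ij for that position.

def synthesize_hr_pixel(lr_block, weights_ij):
    n = len(lr_block)
    return sum(weights_ij[x][y] * lr_block[x][y]
               for x in range(n) for y in range(n))
```

For example, a uniform kernel of 1/25 over a flat 5x5 block simply reproduces the block's value.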
  • As described above, each of the L clusters includes at least one pair of LRt and HRt blocks. For example, if it is assumed that CS_l pairs of LRt and HRt blocks are present in the l-th cluster, the following formula 5 is made:

    Σ_{l=0}^{L-1} CS_l = M

  • That is, since M pairs of LRt and HRt blocks are present in total, the total number of the pairs of LRt and HRt blocks included in the respective clusters is M.
  • In the learning process for the pairs of LRt and HRt blocks included in the l-th cluster, a least mean square (LMS) algorithm may be employed. Since the pairs of LRt and HRt blocks included in the l-th cluster have a similar Laplacian characteristic, the weights according to formula 4 may be similar to each other. In this case, the weight corresponding to the position (i,j) in the HRt block with respect to the LRt block included in the l-th cluster is defined as the following formula 6:

    w_ij^l(x,y), 0 ≤ x, y ≤ N - 1
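A minimal LMS sketch for training the weight kernel of one HR position (i, j) from one cluster's block pairs is given below. The step size mu and the epoch count are illustrative assumptions, not values from the patent.

```python
# LMS training sketch for one HR position: the kernel w is nudged toward
# reducing the prediction error on each (LRt block, HRt target pixel) pair.
# mu and epochs are illustrative assumptions.

def lms_train_weight(lr_blocks, hr_targets, n=5, mu=0.01, epochs=50):
    """hr_targets[k] is the HR pixel at position (i, j) for lr_blocks[k]."""
    w = [[0.0] * n for _ in range(n)]
    for _ in range(epochs):
        for lr, target in zip(lr_blocks, hr_targets):
            pred = sum(w[x][y] * lr[x][y] for x in range(n) for y in range(n))
            err = target - pred
            for x in range(n):
                for y in range(n):
                    w[x][y] += mu * err * lr[x][y]   # LMS update
    return w
```

Because the pairs in one cluster share a similar Laplacian characteristic, a single kernel trained this way can serve all of them.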
  • In operation S850 for LRL-weight LUT generation, the LUT including high frequency information on the plurality of classes may be generated. The LUT generated in this way is shown in FIG. 10, which illustrates a final LUT for the N-th class.
  • Referring to FIG. 10, the final LUT (or dictionary) includes L LRLC_t,l^n entries for indexing and the weights corresponding thereto. Each LRLC_t,l^n has a size of NxN, and NxN weight coefficients are present for each of the HR positions in the weights. Thus, since the HR positions are NxN, the total number of weight coefficients for each cluster is N^4. For example, if N is five and L is 512, the size of the total weight information in the LUT is 320 KB (where it is assumed that every coefficient occupies 1 byte). In consideration of the size of the LRLC_t,l^n entries, the total size of the LUT may be about 333 KB. In this way, an LUT including high frequency information may be generated with respect to all of the N classes. Since the learning is performed using HR images without noise, the final result becomes an image which is scaled with the noise removed.
  • The LUT generated as described above is used to synthesize each LR block into the HR block in the high resolution image synthesizing process to be described later.
  • Hereinafter, the process of synthesizing a low resolution image into a high resolution image will be described in detail with reference to FIGs. 11 and 12. FIG. 11 illustrates an image frame which is divided into a plurality of regions according to an exemplary embodiment.
  • If an LR image is distorted by noise, the intensity of the noise is predicted in the same manner as in the learning process for generating the LUT. However, in this synthesizing process, the intensity of the noise is predicted in a predetermined region unit, differently from the learning process.
  • In the case of real images distorted by noise, the noise is rarely distributed uniformly over the image, and its intensity may vary locally. Thus, the noise may be predicted in the region unit.
  • According to an exemplary embodiment, as shown in FIG. 11, the regions of the image may be divided without overlapping between the regions. According to another exemplary embodiment, the regions may be divided to overlap each other. According to still another exemplary embodiment, the regions may be divided on the basis of objects, foregrounds and backgrounds, other than rectangles of a predetermined size.
  • If the noise is predicted in the region unit, initial noise removal is performed on the inputted LR image in a manner similar to that of the learning process, in consideration of the intensity of the noise in every region. For example, Wiener filtering may be performed on the concerned region in consideration of the intensity of the noise, though it is understood that another exemplary embodiment is not limited thereto.
  • FIG. 12 is a diagram illustrating a process of generating HR blocks corresponding to LR blocks, according to an exemplary embodiment. As shown in FIG. 12, an LR image from which initial noise has been removed for every region is divided into a predetermined number of LR blocks in an overlapping manner. If it is assumed that the inputted LR blocks are represented as LRin, the LRL is extracted and normalized for each LRin block as shown in FIG. 8, and the LRLC_t,l^n which is most similar to LRL_in^n is searched for in the LUT. That is, the distances between the inputted LRL_in^n and the L LRLC_t,l^n entries included in the LUT are calculated to search for the cluster having the minimum distance. This process may be referred to as matching. A variety of distance measures such as the well-known L1-norm or L2-norm may be employed for the distance measurement in the matching. If it is assumed that the optimal cluster obtained through the matching process is the l_best-th cluster, the pixels in the positions (i,j) in the HR block corresponding to the inputted LR block are obtained according to formula 4, above, using the weights w_ij^l_best(x,y).
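The matching step reduces to a nearest-neighbour search over the L cluster centers. A sketch using the squared L2-norm over flattened blocks follows; names are illustrative.

```python
# Matching sketch: compare the input block's normalized Laplacian against
# every cluster center with a squared L2 distance and return the index of
# the nearest center (l_best).

def match_cluster(lrl_in, centers):
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(range(len(centers)), key=lambda l: dist2(lrl_in, centers[l]))
```

The L1-norm variant simply replaces the squared differences with absolute differences.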
  • On the other hand, two or more pixel values may be generated at a specific position in the HR blocks, depending on how the overlapping is performed in FIG. 12. In this case, an average thereof is obtained and determined as the final HR pixel value. Moreover, in this case, noise removal and up-scaling are simultaneously performed for every region as described above.
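The averaging of overlapping HR contributions can be sketched with separate sum and count planes; names and the placement interface are illustrative simplifications.

```python
# Overlap averaging sketch: every synthesized HR block adds its values into a
# sum plane and increments a count plane; the final pixel is sum / count.

def average_overlaps(height, width, placed_blocks):
    """placed_blocks: list of (top, left, block) with block a 2-D value list."""
    acc = [[0.0] * width for _ in range(height)]
    cnt = [[0] * width for _ in range(height)]
    for top, left, block in placed_blocks:
        for dy, row in enumerate(block):
            for dx, v in enumerate(row):
                acc[top + dy][left + dx] += v
                cnt[top + dy][left + dx] += 1
    return [[acc[y][x] / cnt[y][x] if cnt[y][x] else 0.0 for x in range(width)]
            for y in range(height)]
```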
  • In the case of strong noise, noise pixels may distort the LUT itself, or may affect the matching in the inference process. To solve this problem, if a specific pixel in the Laplacian result is larger than a predetermined first threshold value T1 and the 8 pixels adjacent to the specific pixel are all smaller than a predetermined second threshold value T2, the specific pixel may be considered to be a noise pixel and excluded from the LUT generation and from the matching in the inference process. For example, in the learning process, all (i.e., 100%) of the noise pixels may be excluded from learning, or may be replaced with an average value of the surrounding pixel values for learning. In this respect, the second threshold value T2 is generally smaller than the first threshold value T1.
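The T1/T2 noise-pixel test can be sketched as follows. Skipping border pixels (where fewer than 8 neighbours exist) is an added simplification.

```python
# Noise-pixel test sketch: in the Laplacian result, a pixel above T1 whose
# eight neighbours are all below T2 is treated as an isolated noise pixel.
# Interior positions only (border handling is an added simplification).

def is_noise_pixel(lap, y, x, t1, t2):
    if lap[y][x] <= t1:
        return False
    neighbours = [lap[y + dy][x + dx]
                  for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                  if (dy, dx) != (0, 0)]
    return all(v < t2 for v in neighbours)
```

A strong isolated spike passes the test, while a spike adjacent to another strong response (e.g. a real edge) does not.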
  • The above process is applied to each patch in the inputted image. That is, pixels considered to be noise pixels are excluded from the matching.
  • While not restricted thereto, the exemplary embodiments can also be embodied as computer-readable code on a computer-readable recording medium. The computer-readable recording medium is any data storage device that can store data that can be thereafter read by a computer system. Examples of the computer-readable recording medium include read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tapes, floppy disks, and optical data storage devices. The computer-readable recording medium can also be distributed over network-coupled computer systems so that the computer-readable code is stored and executed in a distributed fashion. Also, the exemplary embodiments may be written as computer programs transmitted over a computer-readable transmission medium, such as a carrier wave, and received and implemented in general-use digital computers that execute the programs. Moreover, while not required in all aspects, one or more units of the image processing apparatus 400 can include a processor or microprocessor executing a computer program stored in a computer-readable medium, such as a local storage.
  • Although a few preferred embodiments have been shown and described, it will be appreciated by those skilled in the art that various changes and modifications might be made without departing from the scope of the invention, as defined in the appended claims.

Claims (15)

  1. An image processing apparatus comprising:
    an image input unit which receives an image; and
    an image processing unit which generates reference data according to a plurality of learning images classified into a plurality of first classes according to a noise characteristic and a plurality of second classes according to an image characteristic, and which performs scaling for the received image according to the generated reference data.
  2. The image processing apparatus according to claim 1, wherein the image processing unit classifies each of the plurality of first classes into the plurality of second classes.
  3. The image processing apparatus according to claim 1, wherein the plurality of learning images comprises pairs of low resolution images and high resolution images corresponding to the low resolution images.
  4. The image processing apparatus according to claim 3, wherein the reference data comprises pairs of the low resolution images and corresponding weights which are set according to the image characteristic.
  5. The image processing apparatus according to claim 1, wherein the image processing unit converts a low resolution image having noise into a high resolution image without the noise.
  6. The image processing apparatus according to claim 1, wherein the image characteristic comprises at least one of a high frequency component and an edge characteristic of the image.
  7. The image processing apparatus according to claim 1, wherein the noise characteristic comprises at least one of a kind and an intensity of noises.
  8. The image processing apparatus according to claim 1, wherein the image processing unit predicts, in a case where the received image is distorted by noise, an intensity of the noise in a region unit of the received image.
  9. The image processing apparatus according to claim 1, wherein the image processing unit performs the scaling for the received image according to the reference data corresponding to a noise characteristic and an image characteristic of the received image.
  10. The image processing apparatus according to claim 1, wherein the image processing unit inserts noise in a low resolution image of the plurality of learning images, predicts an intensity of the inserted noise in the low resolution image, removes the noise from the low resolution image, classifies the low resolution image and a corresponding high resolution image into a first class, of the plurality of first classes, according to the predicted intensity of the inserted noise, and classifies the low resolution image and the corresponding high resolution image into a second class, of the plurality of second classes, according to the image characteristic.
  11. An image processing method comprising:
    receiving an image;
    generating reference data according to a plurality of learning images classified into a plurality of first classes according to a noise characteristic and a plurality of second classes according to an image characteristic; and
    performing scaling for the received image according to the generated reference data.
  12. The method according to claim 11, wherein each of the plurality of first classes is classified into the plurality of second classes.
  13. The method according to claim 11, wherein the plurality of learning images comprises pairs of low resolution images and high resolution images corresponding to the low resolution images.
  14. The method according to claim 13, wherein the reference data comprises pairs of the low resolution images and corresponding weights which are set according to the image characteristic.
  15. The method according to claim 11, wherein the image characteristic comprises at least one of a high frequency component and an edge characteristic of the image.
EP10187171A 2009-12-10 2010-10-11 Image processing apparatus and method Withdrawn EP2352121A1 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020090122717A KR20110065997A (en) 2009-12-10 2009-12-10 Image processing apparatus and method of processing image

Publications (1)

Publication Number Publication Date
EP2352121A1 true EP2352121A1 (en) 2011-08-03

Family

ID=43569175

Family Applications (1)

Application Number Title Priority Date Filing Date
EP10187171A Withdrawn EP2352121A1 (en) 2009-12-10 2010-10-11 Image processing apparatus and method

Country Status (3)

Country Link
US (1) US8805120B2 (en)
EP (1) EP2352121A1 (en)
KR (1) KR20110065997A (en)

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101791919B1 (en) 2010-01-22 2017-11-02 톰슨 라이센싱 Data pruning for video compression using example-based super-resolution
JP5911809B2 (en) 2010-01-22 2016-04-27 トムソン ライセンシングThomson Licensing Sampling-based super-resolution video encoding and decoding method and apparatus
US9544598B2 (en) 2010-09-10 2017-01-10 Thomson Licensing Methods and apparatus for pruning decision optimization in example-based data pruning compression
US20130163661A1 (en) * 2010-09-10 2013-06-27 Thomson Licensing Video encoding using example - based data pruning
WO2012033971A1 (en) 2010-09-10 2012-03-15 Thomson Licensing Recovering a pruned version of a picture in a video sequence for example - based data pruning using intra- frame patch similarity
EP2614646A1 (en) * 2010-09-10 2013-07-17 Thomson Licensing Video encoding using block- based mixed - resolution data pruning
EP2671168A4 (en) 2011-02-03 2017-03-08 Voxeleron LLC Method and system for image analysis and interpretation
CN103235949B (en) * 2013-04-12 2016-02-10 北京大学 Image point of interest detection method and device
DE102014020074B3 (en) * 2013-11-25 2022-09-01 Canon Kabushiki Kaisha Image pickup device and image signal control method
KR101653038B1 (en) * 2014-05-12 2016-09-12 주식회사 칩스앤미디어 An apparatus for scaling a resolution using an image patch and method for using it
US9607359B2 (en) * 2014-12-29 2017-03-28 Kabushiki Kaisha Toshiba Electronic device, method, and computer program product
US10296605B2 (en) 2015-12-14 2019-05-21 Intel Corporation Dictionary generation for example based image processing
CN105847968B (en) * 2016-03-21 2018-12-21 京东方科技集团股份有限公司 Based on the solution of deep learning as method and system
CN106056562B (en) * 2016-05-19 2019-05-28 京东方科技集团股份有限公司 A kind of face image processing process, device and electronic equipment
KR102580519B1 (en) * 2016-09-07 2023-09-21 삼성전자주식회사 Image processing apparatus and recording media
US10552944B2 (en) * 2017-10-13 2020-02-04 Adobe Inc. Image upscaling with controllable noise reduction using a neural network
KR102452653B1 (en) 2018-02-20 2022-10-11 삼성전자주식회사 Electronic apparatus, method for processing image and computer-readable recording medium
KR102095518B1 (en) * 2018-10-11 2020-03-31 옥임식 Re-mastering system for high resolution conversion of image and a method thereof
CN110414456B (en) * 2019-07-30 2021-08-03 合肥师范学院 Signal distinguishing and extracting method based on expanded Rayleigh criterion
KR20210062477A (en) * 2019-11-21 2021-05-31 삼성전자주식회사 Electronic apparatus and control method thereof
KR20210078218A (en) * 2019-12-18 2021-06-28 삼성전자주식회사 Electronic apparatus and method of controlling the same
KR20210108027A (en) * 2020-02-25 2021-09-02 삼성전자주식회사 Electronic apparatus and control method thereof
KR102273377B1 (en) * 2020-12-14 2021-07-06 국방기술품질원 Method for synthesizing image
KR102467092B1 (en) * 2021-11-26 2022-11-16 블루닷 주식회사 Super-resolution image processing method and system robust coding noise using multiple neural networks

Citations (1)

Publication number Priority date Publication date Assignee Title
EP1321896A2 (en) * 2001-12-21 2003-06-25 International Business Machines Corporation Method and circuits for scaling images using neural networks

Family Cites Families (16)

Publication number Priority date Publication date Assignee Title
JP4670169B2 (en) * 2000-11-15 2011-04-13 ソニー株式会社 Information signal processing device, information signal processing method, image signal processing device and image display device using the same, coefficient seed data generation device used therefor, and information recording medium
US7171042B2 (en) * 2000-12-04 2007-01-30 Intel Corporation System and method for classification of images and videos
KR100904340B1 (en) * 2001-06-15 2009-06-23 소니 가부시끼 가이샤 Image processing apparatus and method and image pickup apparatus
JP4295612B2 (en) * 2001-07-12 2009-07-15 ディーエックスオー ラブズ Method and system for supplying formatted information to image processing means
US7130776B2 (en) * 2002-03-25 2006-10-31 Lockheed Martin Corporation Method and computer program product for producing a pattern recognition training set
JP4026491B2 (en) * 2002-12-13 2007-12-26 ソニー株式会社 Image signal processing apparatus, image signal processing method, program, and medium
JP4175124B2 (en) * 2003-01-24 2008-11-05 ソニー株式会社 Image signal processing device
JP4055655B2 (en) * 2003-05-29 2008-03-05 ソニー株式会社 Coefficient generation apparatus and generation method, class configuration generation apparatus and generation method, information signal processing apparatus, and program for executing each method
US7627198B2 (en) * 2003-05-29 2009-12-01 Sony Corporation Information signal processing device and processing method, codebook generating device and generating method, and program for executing the methods
US7679676B2 (en) * 2004-06-03 2010-03-16 Koninklijke Philips Electronics N.V. Spatial signal conversion
US7447382B2 (en) * 2004-06-30 2008-11-04 Intel Corporation Computing a higher resolution image from multiple lower resolution images using model-based, robust Bayesian estimation
US7809155B2 (en) * 2004-06-30 2010-10-05 Intel Corporation Computing a higher resolution image from multiple lower resolution images using model-base, robust Bayesian estimation
US7974476B2 (en) * 2007-05-30 2011-07-05 Microsoft Corporation Flexible MQDF classifier model compression
JP4882999B2 (en) * 2007-12-21 2012-02-22 ソニー株式会社 Image processing apparatus, image processing method, program, and learning apparatus
JP5061882B2 (en) * 2007-12-21 2012-10-31 ソニー株式会社 Image processing apparatus, image processing method, program, and learning apparatus
JP4803279B2 (en) 2009-05-01 2011-10-26 ソニー株式会社 Image signal processing apparatus and method, recording medium, and program

Patent Citations (1)

Publication number Priority date Publication date Assignee Title
EP1321896A2 (en) * 2001-12-21 2003-06-25 International Business Machines Corporation Method and circuits for scaling images using neural networks

Non-Patent Citations (3)

Title
DAVID R ET AL: "Noise reduction and image enhancement using a hardware implementation of artificial neural networks", PROCEEDINGS OF SPIE, SPIE, USA, vol. 3728, 1 January 1999 (1999-01-01), pages 212 - 221, XP002345531 *
MADANI K ET AL: "IMAGE PROCESSING USING RBF LIKE NEURAL NETWORKS: A ZISC-036 BASED FULLY PARALLEL IMPLEMENTATION SOLVING REAL WORLD AND REAL COMPLEXITY INDUSTRIAL PROBLEMS", APPLIED INTELLIGENCE, KLUWER ACADEMIC PUBLISHERS, DORDRECHT, NL, vol. 18, no. 2, 1 March 2003 (2003-03-01), pages 195 - 213, XP009053983 *
MADANI K ET AL: "ZISC-036 neuro-processor based image processing", 20010101, vol. 2085, 1 January 2001 (2001-01-01), pages 200 - 207, XP002345532 *

Also Published As

Publication number Publication date
US20110142330A1 (en) 2011-06-16
US8805120B2 (en) 2014-08-12
KR20110065997A (en) 2011-06-16

Similar Documents

Publication Publication Date Title
EP2352121A1 (en) Image processing apparatus and method
EP2164040B1 (en) System and method for high quality image and video upscaling
JP6362333B2 (en) Image processing apparatus, image processing method, and program
US9305362B1 (en) Image stabilization
US8731337B2 (en) Denoising and artifact removal in image upscaling
US8369653B1 (en) System and method for image upsampling using natural image statistics of first and second derivatives
US10198801B2 (en) Image enhancement using self-examples and external examples
US20140354886A1 (en) Device, system, and method of blind deblurring and blind super-resolution utilizing internal patch recurrence
US11620480B2 (en) Learning method, computer program, classifier, and generator
US20060088209A1 (en) Video image quality
JP2009037597A (en) Method of filtering input image to create output image
US8730268B2 (en) Image processing systems and methods
US20120321214A1 (en) Image processing apparatus and method, program, and recording medium
US8571315B2 (en) Information processing apparatus, information processing method, and program
EP3846118A1 (en) Image display apparatus and image displaying method
Cao et al. Single image motion deblurring with reduced ringing effects using variational Bayesian estimation
KR102635355B1 (en) Directional scaling systems and methods
JP2001084368A (en) Data processor, data processing method and medium
US9154671B2 (en) Image processing apparatus, image processing method, and program
US8606031B2 (en) Fast, accurate and efficient gaussian filter
JP2002015327A (en) Image type discrimination device, image processor using the same, and image type discrimination method
JP2000348019A (en) Data processor, data processing method and medium
CN110580880A (en) RGB (red, green and blue) triangular sub-pixel layout-based sub-pixel rendering method and system and display device
Chen et al. An efficient reconfigurable architecture design and implementation of image contrast enhancement algorithm
US20180007239A1 (en) Signal correction method and apparatus, and terminal

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

17P Request for examination filed

Effective date: 20111130

17Q First examination report despatched

Effective date: 20120330

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: SAMSUNG ELECTRONICS CO., LTD.

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20141210