WO1998035500A1 - Method and apparatus for optimizing quantizer values in an image encoder - Google Patents

Method and apparatus for optimizing quantizer values in an image encoder

Publication number
WO1998035500A1
Authority
WO
WIPO (PCT)
Prior art keywords
blocks
values
frame
quantization values
bits
Prior art date
Application number
PCT/US1998/001827
Other languages
French (fr)
Inventor
Jordi Ribas-Corbera
Shaw-Min Lei
Original Assignee
Sharp Kabushiki Kaisha
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US09/008,137 (patent US6111991A)
Application filed by Sharp Kabushiki Kaisha
Priority to JP53480498A (patent JP2001526850A)
Publication of WO1998035500A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/189 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/19 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding using optimisation based on Lagrange multipliers
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124 Quantisation
    • H04N19/126 Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/147 Data rate or code amount at the encoder output according to rate distortion criteria
    • H04N19/169 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/176 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
    • H04N19/186 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
    • H04N19/60 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N19/61 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H04N19/149 Data rate or code amount at the encoder output by estimating the code amount by means of a model, e.g. mathematical model or statistical model
    • H04N19/15 Data rate or code amount at the encoder output by monitoring actual compressed data size at the memory before deciding storage at the transmission buffer
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability

Definitions

  • the invention relates to computing quantization values used for encoding coefficients of a digital image or video frame and more particularly to optimizing the computed quantization values to reduce distortion in the digital image or video frame when encoding is performed with a limited number of bits.
  • the quality of encoded images is controlled by selecting one or more quality parameters.
  • Block-based image and video coders use a parameter known as a quantization scale or step for each block of pixels in the image.
  • the quantization steps are used for scaling pixel values within the same step ranges to the same values.
  • Image blocks encoded with the same quantization scale have approximately the same quality.
  • the number of bits needed for encoding an image depends on desired image quality (quantization scales) and on the inherent statistics of the image. As a result, different images encoded with the same scales (same image quality) will occupy a different number of bits.
  • the number of bits available for encoding one or several frames is fixed in advance, and some technique is necessary to select the quantization scales that will produce that target number of bits and encode the video frames with the highest possible quality. For example, in a digital video recording, a group of frames (GOP) must occupy the same number of bits for an efficient fast-forward/fast-rewind capability.
  • the channel rate, communication delay, and size of encoder buffers determine the available number of bits for one or more frames.
  • a first type of quantizer control method encodes each image block several times with a set of quantization scales. The number of bits produced for each case is measured and a scale for each block is smartly selected so the sum of the bits for all combined blocks hits the desired target bit number.
  • the first type of quantizer control techniques cannot be used for real-time encoding because of the high computational complexity required to encode each image block multiple times.
  • a second type of quantizer control technique measures the number of bits spent in previously encoded image blocks and measures other parameters such as, buffer fullness, block activity, etc. These measurements are used to select the quantization scale for the current block.
  • the second type of quantizer control is popular for real-time encoding because of its low computational complexity. However, the second type of quantizer control is inaccurate in achieving the target number of bits and must be combined with additional encoding techniques to avoid bit or buffer overflow and underflow.
  • a third type of quantizer control technique uses a model to predict the number of bits needed for encoding the image blocks.
  • the quantizer model includes the blocks' quantization scales and other parameters, such as, block variances.
  • the quantization scales are determined by some mathematical optimization of the encoder model.
  • the third type of quantizer control is computationally simple and can be used in real-time, but is highly sensitive to model errors and often produces inaccurate results.
  • a quantizer controller generates quantization values using a new block-adaptive, Lagrangian optimization.
  • the quantizer controller is updated and improved using information from earlier quantized blocks.
  • the quantizer controller is robust to model errors and produces results as accurate as type-1 quantizer control techniques, while having the simpler computational complexity of the type-2 quantizer control techniques.
  • the quantizer controller identifies a target bit value equal to a total number of bits available for encoding the frame.
  • a total amount of distortion in the frame is modeled according to the predicted quantization values assigned to each one of the blocks.
  • the predicted quantization values are characterized according to an amount of energy in each block and a number of bits available for encoding each block.
  • Optimum quantization values are adapted to each block by minimizing the modeled distortion in the frame subject to the constraint that the total number of bits for encoding the frame is equal to the target bit value.
  • Each block is then encoded with the optimized quantization value.
  • the quantizer controller is adaptive to each block by reducing quantization values for the blocks having less energy and increasing the quantization values for the blocks having more energy.
  • the quantization values assigned to the blocks are also optimized according to a number of image blocks remaining to be encoded and a number of bits still available for encoding the remaining image blocks.
  • Different weighting factors are optionally applied to the quantization values that vary the accuracy of the encoded blocks.
  • One weighting factor is applied to the quantization values according to the location of the block in the frame.
  • Optimized quantization values are applied to blocks in each frame, frames in a group of multiple frames or applied generally for any region in an array of image data.
  • the quantizer controller only encodes the image once to accurately generate the quantization values for each block.
  • the quantization values produce a target number of bits for the encoded image or video frame.
  • the quantizer controller is less computationally exhaustive than a quantizer control technique of similar accuracy.
  • the general framework of the quantizer controller can be used in a variety of quantizer/rate control strategies.
  • the quantizer controller can be used to select in real-time the value of the quantization scales for the Discrete
  • FIG. 1 is a schematic diagram of multiple image frames each including multiple blocks assigned optimized quantization values according to the invention.
  • FIG. 2 is a block diagram of an image coder according to one embodiment of the invention.
  • FIG. 3 is a step diagram for generating the optimized quantization values.
  • FIGS. 4 and 5 show results from applying the optimized quantization values to image data.
  • FIG. 6 is a block diagram of the quantizer controller according to one embodiment of the invention.
  • a block-based image coder 12 is used to describe the invention. However, the invention can be used for controlling the quantizer of any image or video coder.
  • images 15 are transmitted in multiple frames 26.
  • Each frame 26 is decomposed into multiple image blocks 14 of the same size, typically of 16x16 pixels per block.
  • the number of bits B_i produced after encoding an ith image block 14 is a function of the value of a quantization parameter Q_i and the statistics of the block.
  • the pixel values for each image block 14 are transformed into a set of coefficients, for example using a Discrete Cosine Transform (DCT) in block transform 16. These coefficients are quantized in block quantization 18 and encoded in coder 20.
  • Bits B_i of the encoded and quantized image blocks 14 are then transmitted over a communication channel 21 over a telephone line, microwave channel, etc. to a receiver (not shown).
  • the receiver includes a decoder that decodes the quantized bits and an inverse transform block that performs an Inverse Discrete Cosine Transform (IDCT).
  • Quantization of the transformed coefficients in quantization block 18 is a key procedure since it determines the quality with which the image block 14 will be encoded.
  • the quantization of the ith block 14 is controlled by the parameter Q_i.
  • Q_i is known as the quantization step for the ith block and its value corresponds to half the step size used for quantizing the transformed coefficients.
  • Q_i is called the quantization scale and the jth coefficient of a block is quantized using a quantizer of step size Q_i w_j, where w_j is the jth value of a quantization matrix chosen by the designer of the MPEG codec.
  • N is the number of 16x16 image blocks in one image frame 26.
  • the total number of bits B available for encoding one image frame 26 is: B = B_1 + B_2 + ... + B_N (1)
  • the invention comprises a quantizer controller 22 (FIG. 1) that chooses optimum values for the Q_i's for a limited total number of available bits B for encoding the frames 26.
  • the quantizer controller 22 is implemented in a variety of different ways, including in software in a programmable processing unit or with dedicated hardware.
  • the image blocks 14 are said to be intracoded or of class intra.
  • many of the blocks 14 in a frame 26 are very similar to blocks in previous frames.
  • the values of the pixels in a block 14 are often predicted from previously encoded blocks and only the difference or prediction error is encoded.
  • These blocks are said to be intercoded or of class inter.
  • the invention can be used in frames with both intra and inter blocks.
  • the value Q is the quantizer step size or quantization scale
  • K and C are constants
  • σ_i is the empirical standard deviation of the pixels in the block
  • the value P_i(j) is the jth pixel in the ith block and P̄_i is the average of the pixel values in the block where,
  • the P_i(j)'s are the values of the luminance and chrominance components of the respective pixels.
  • the model in equation 2 is derived using a rate-distortion analysis of the block's encoder.
  • the constant C in equation 2 models the average number of bits per pixel used for encoding the coder's overhead.
  • C accounts for header and syntax information, pixel color or chrominance components, transmitted Q values, motion vectors, etc. sent to the receiver for decoding the image blocks. If the values of K and C are not known, they are estimated with an inventive technique described below in the section entitled, "Updating the Parameters of the Encoder Model".
  • Equation 5 models distortion D for the N encoded blocks
  • the quantizer controller 22 selects the optimal quantization values, Q*_1, Q*_2, ..., Q*_N, that minimize the distortion model in equation 5, subject to the constraint that the total number of bits must be equal to B as defined in equation 1, which can be expressed mathematically as follows:
  • the next objective is to find a formula for each of the Q*_i's. To do this, the method
  • the optimal quantization parameter for the ith block is,
  • N_i = N-i+1 is the number of image blocks that remain to be encoded and B_i is the number of bits available to encode them,
  • Equation 6 generates optimized quantization values that minimize distortion for a limited number of available bits.
  • the image in frame 26 in FIG. 1 will have less distortion than other quantization schemes when displayed on a display unit at the receiver end of the channel 21.
  • FIG. 3 describes the steps performed by quantizer controller 22 (FIG. 2) for selecting quantizer values used for encoding N image blocks 14 with B bits. Note that N could be the number of blocks in an image, part of an image, several images, or generally any region of an image.
  • Step 1. Receive energy values and initialization.
  • Pixel values for the N image blocks are supplied to the quantizer controller 22 from the digital image (FIG. 2) in step 1A.
  • the amount of energy σ_i is derived from the DCT coefficients of the pixel values generated by transform block 16.
  • the values of the parameters K and C in the encoder model in equation 7 are known or estimated in advance.
  • Step 2. Compute the optimal quantization parameter for the ith block.
  • Step 4. Update quantizer values.
  • In step 4, the parameters K_{i+1} and C_{i+1} are updated in the quantizer controller 22.
  • K_{i+1} = K
  • C_{i+1} = C
  • the updates K_{i+1} and C_{i+1} are found using a model fitting technique.
  • a model fitting technique is described below in the section entitled "Updating
  • Step 5. Generate quantizer value for next block.
  • In FIGS. 4 and 5, the frames of video sequences encoded by quantizer controller 22 were compared to those of a Telenor H.263 offline method, which is the quantizer control technique adopted for MPEG-4 anchors.
  • In FIG. 4, the total number of bits per video frame obtained by the quantization technique described in FIG. 3 is shown as a solid line.
  • the H.263 offline encoding technique is shown as a dashed line. Encoding was performed on 133 frames of the well-known video sequence "Foreman".
  • the target number of bits B is 11200 bits per frame.
  • the quantizer controller 22 produces a significantly more accurate and steady number of bits per frame. Similar results were obtained for a wide range of bit rates. In the experiments, there were few if any visible differences in the quality of the two encoded video sequences. The signal-to-noise ratio performance of the images processed by quantizer controller 22 was only 0.1-0.3 dB lower on average. Thus, even though the image is only encoded once, quantizer controller 22 achieves the target bit rate accurately with high image quality at every frame.
  • Alternative Implementations: Several quantization variations are based on the base quantization optimization framework discussed above. If the computation of all σ_k's in Step 1B of FIG. 3 cannot be performed in advance, a good estimate for S_1 is used, such as the value of S_1 from the previous video frame 26.
  • A low-complexity estimate of S_1 can be used in order to further reduce computational complexity.
  • For the low-complexity estimate, equation 3 is replaced by equation 9,
  • A is the number of pixels in the jth region.
  • the region of quantization does not need to be a block. Additional model parameters ⁇ and ⁇ can either be set prior to quantization or obtained using parameter estimation techniques described below. If the quantization model in equation 10 is used in step 2, the optimized quantization values Q * 's are derived using equation 11,
  • In step 3, S_{i+1} is replaced
  • performance of the quantizer controller 22 can be improved by dividing the standard deviation of the intra blocks by a factor f .
  • the factor -J$ is applied as follows:
  • the values K_I and K_P are the averages of the K's measured for the intra and inter blocks, respectively.
  • the same quantizer controller 22 shown in FIG.1 can be used for encoding one or several frames.
  • the parameters α_i, σ_i^2, B_i are the weight, variance, and bits for the ith frame, respectively.
  • the parameters K_i and C_i are updates of the coder model for that frame.
  • each image block 14 can be encoded several times and, using a classical model fitting procedure (e.g., least-squares fit, linear regression, etc.), a good estimate of the K_i's and C_i's for the blocks can be found in advance. Then, in step 2, the quantization values Q*_i are determined according to,
  • model parameters in one embodiment of the invention are updated or estimated on a block-by-block basis using the following weighted averages,
  • B_DCT,i is the number of bits spent for the DCT coefficients of the i-th image
  • the updates are a linearly weighted average of K_j, C_j and their respective initial estimates K_1, C_1
  • the quantizer values used, Q*_1, ..., Q*_N, need to be encoded and sent to the decoder.
  • quantizer values are encoded in a raster-scan order and there is a five-bit penalty for changing the quantizer value.
  • bit overhead for changing the quantizer values is negligible and the optimization techniques described above are effective.
  • this overhead is significant and some technique is needed for constraining the number of times that the quantizer changes.
  • existing optimization methods that take quantization overhead into account are mathematically inaccurate or computationally expensive.
  • a heuristic method joins blocks of similar standard deviation together into a set so that the quantizer value remains constant within the set. This technique is referred to as block joining, and reduces the changes of the quantizer at lower bit rates. Block joining is accomplished by choosing the values of the α_i weights as follows,
  • B/(AN) is the bit rate in bits per pixel for the current frame.
  • FIG. 6 is a detailed block diagram of the quantizer controller 22 shown in FIG. 2.
  • the quantizer controller 22 in one embodiment is implemented in a general purpose programmable processor.
  • the functional blocks in FIG. 6 represent the primary operations performed by the processor.
  • Initialization parameters in block 31 are either derived from pre-processing the current image or from parameters previously derived from previous frames or from prestored values in processor memory (not shown). Initialization parameters include N_1, S_1, B_1, K_1 and C_1 (or K_{N+1} and C_{N+1} from a previous frame).
  • the image is decomposed into N image blocks 14 (FIG. 2) of A pixels in block 30.
  • the energy of the pixels in each block is computed in block 32.
  • the weight factors assigned to each block are computed in block 34.
  • the amount of energy left in the image is updated in block 40 and the bits left for encoding the image are updated in block 38.
  • Parameters for the encoder model are updated in block 42 and the number of blocks remaining to be encoded is tracked in block 44.
  • the processor in block 36 then computes the optimized quantizer step size according to the values derived in blocks 32, 34, 40, 38, 42 and 44.
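The single-pass loop described by FIG. 3 and FIG. 6 can be sketched roughly as follows. This is an illustrative sketch only, not the patent's implementation: the equations are not reproduced in this part of the text, so the rate model B_i = A(K σ_i²/Q_i² + C) and the resulting closed form for Q_i are assumptions, and the names `model_bits`, `choose_quantizers` and the `encode` callback are hypothetical. K and C are held fixed here; the per-block model updates are omitted.

```python
import math


def model_bits(sigma, q, pixels, K, C):
    """Assumed rate model: bits for one block = A * (K * sigma^2 / Q^2 + C)."""
    return pixels * (K * sigma * sigma / (q * q) + C)


def choose_quantizers(sigmas, total_bits, pixels, K, C, weights=None, encode=None):
    """Pick one quantizer step per block in a single pass over the frame.

    sigmas     -- per-block standard deviations (the block "energy")
    total_bits -- target bit budget B for the whole frame
    pixels     -- pixels per block (A)
    K, C       -- encoder-model parameters, held fixed in this sketch
    weights    -- optional per-block weights (defaults to 1.0 each)
    encode     -- optional callback (block_index, q) -> bits actually spent;
                  if omitted, the assumed model stands in for the encoder
    """
    n = len(sigmas)
    weights = list(weights) if weights is not None else [1.0] * n
    energy_left = sum(a * s for a, s in zip(weights, sigmas))  # weighted energy still to encode
    bits_left = float(total_bits)                              # bits still available
    blocks_left = n                                            # blocks still to encode
    steps = []
    for i, (sigma, alpha) in enumerate(zip(sigmas, weights)):
        # Bits left for texture once the per-pixel overhead C is set aside.
        budget = bits_left - pixels * blocks_left * C
        if budget <= 0 or sigma <= 0:
            raise ValueError("budget exhausted or flat block; a fallback rule is needed")
        # Guard against rounding: remaining energy is at least this block's share.
        energy = max(energy_left, alpha * sigma)
        q = math.sqrt(pixels * K * sigma * energy / (alpha * budget))
        steps.append(q)
        spent = encode(i, q) if encode else model_bits(sigma, q, pixels, K, C)
        bits_left -= spent
        energy_left -= alpha * sigma
        blocks_left -= 1
    return steps, bits_left
```

Calling `choose_quantizers([2.0, 4.0, 8.0, 1.0], total_bits=2000, pixels=256, K=0.5, C=0.0)` returns larger steps for the higher-energy blocks, and when the encoder behaves exactly like the assumed model the leftover bit count is essentially zero. A real encoder would be passed in through `encode`, so the remaining budget tracks the bits actually spent.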

Abstract

A quantizer controller (22) identifies a target bit value equal to a total number of bits available for encoding a frame. A total amount of distortion in the frame is modeled according to predicted quantization values for each one of the blocks. The predicted quantization values are characterized according to an amount of energy in each block and a number of bits available for encoding each block. Quantization values for each block are optimized by minimizing the modeled distortion in the frame subject to the constraint that the total number of bits for encoding the frame is equal to the target bit value. Each block is then encoded with the optimized quantization value.

Description

METHOD AND APPARATUS FOR OPTIMIZING QUANTIZER VALUES IN AN IMAGE ENCODER
BACKGROUND OF THE INVENTION
The invention relates to computing quantization values used for encoding coefficients of a digital image or video frame and more particularly to optimizing the computed quantization values to reduce distortion in the digital image or video frame when encoding is performed with a limited number of bits.
In many of today's image and video coders, the quality of encoded images is controlled by selecting one or more quality parameters. Block-based image and video coders use a parameter known as a quantization scale or step for each block of pixels in the image. The quantization steps are used for scaling pixel values within the same step ranges to the same values. Image blocks encoded with the same quantization scale have approximately the same quality. The number of bits needed for encoding an image depends on desired image quality (quantization scales) and on the inherent statistics of the image. As a result, different images encoded with the same scales (same image quality) will occupy a different number of bits.
In many applications, the number of bits available for encoding one or several frames is fixed in advance, and some technique is necessary to select the quantization scales that will produce that target number of bits and encode the video frames with the highest possible quality. For example, in a digital video recording, a group of frames (GOP) must occupy the same number of bits for an efficient fast-forward/fast-rewind capability. In video telephony, the channel rate, communication delay, and size of encoder buffers determine the available number of bits for one or more frames.
Existing quantizer or buffer control methods are classified into three major types. A first type of quantizer control method encodes each image block several times with a set of quantization scales. The number of bits produced for each case is measured and a scale for each block is smartly selected so the sum of the bits for all combined blocks hits the desired target bit number. The first type of quantizer control techniques cannot be used for real-time encoding because of the high computational complexity required to encode each image block multiple times.
The first type of quantizer control is described in the following publications: K. Ramchandran, A. Ortega, and M. Vetterli, "Bit Allocation for Dependent Quantization with Applications to Multi-Resolution and MPEG Video Coders," IEEE Trans. on Image Processing, Vol. 3, N. 5, pp. 533-545, September 1994; W. Ding and B. Liu, "Rate Control of MPEG Video Coding and Recording by Rate-Quantization Modeling," IEEE Trans. on Circuits and Systems for Video Technology, Vol. 6, N. 1, pp. 12-19, February 1996; and L.J. Lin, A. Ortega, and C.C. J. Kuo, "Rate Control Using Spline-Interpolated R-D Characteristics," Proc. of SPIE Visual Communications and Image Processing, pp. 111-122, Orlando, FL, March 1996.
A second type of quantizer control technique measures the number of bits spent in previously encoded image blocks and measures other parameters such as, buffer fullness, block activity, etc. These measurements are used to select the quantization scale for the current block. The second type of quantizer control is popular for real-time encoding because of its low computational complexity. However, the second type of quantizer control is inaccurate in achieving the target number of bits and must be combined with additional encoding techniques to avoid bit or buffer overflow and underflow.
The second method is described in the following publications: U.S. Patent No. 5,038,209 entitled "Adaptive Buffer/Quantizer Control for Transform Video Coders", issued August 6, 1991 to H.M. Ming; U.S. Patent No. 5,159,447 entitled "Buffer Control for Variable Bit-Rate Channel", issued October 27, 1992 to B.G. Haskell and A.R. Reibman; and U.S. Patent No. 5,141,383 entitled "Pseudo-Constant Bit Rate Video Coding with Quantization Parameter Adjustment", issued August 31, 1993 to C.T. Cheng and A.H. Wong.
A third type of quantizer control technique uses a model to predict the number of bits needed for encoding the image blocks. The quantizer model includes the blocks' quantization scales and other parameters, such as block variances. The quantization scales are determined by a mathematical optimization of the encoder model. The third type of quantizer control is computationally simple and can be used in real time, but is highly sensitive to model errors and often produces inaccurate results.
The third type of quantizer control technique is described in the following publications: E.D. Frimout, J. Biemond, and R.L. Lagendik, "Forward Rate Control for MPEG Recording," Proc. of SPIE Visual Communications and Image Processing, Cambridge, MA, pp. 184-194, November 1993; U.S. Patent No. 5,323,187 entitled "Image Compression System by Setting Fixed Bit Rates", issued June 21, 1994 to K. Park; and A. Nicoulin, "Composite Source Modeling for Image Compression," Ph.D. Thesis No. 1444 (1995), Ecole Polytechnique Federale de Lausanne, Chapter 6, 1995.
Thus, a need remains for improving the image quality of quantized image or video frames while reducing the time and computational complexity required to generate optimized quantization values.
SUMMARY OF THE INVENTION

A quantizer controller generates quantization values using a new block-adaptive Lagrangian optimization. The quantizer controller is updated and improved using information from earlier quantized blocks. The quantizer controller is robust to model errors and produces results as accurate as type-1 quantizer control techniques, while having the low computational complexity of type-2 quantizer control techniques.
The quantizer controller identifies a target bit value equal to a total number of bits available for encoding the frame. A total amount of distortion in the frame is modeled according to the predicted quantization values assigned to each one of the blocks. The predicted quantization values are characterized according to an amount of energy in each block and a number of bits available for encoding each block. Optimum quantization values are adapted to each block by minimizing the modeled distortion in the frame subject to the constraint that the total number of bits for encoding the frame is equal to the target bit value. Each block is then encoded with the optimized quantization value.
The quantizer controller is adaptive to each block by reducing quantization values for the blocks having less energy and increasing the quantization values for the blocks having more energy. The quantization values assigned to the blocks are also optimized according to a number of image blocks remaining to be encoded and a number of bits still available for encoding the remaining image blocks.
Different weighting factors are optionally applied to the quantization values to vary the accuracy of the encoded blocks. One weighting factor is applied to the quantization values according to the location of the block in the frame. Optimized quantization values are applied to blocks in each frame, to frames in a group of multiple frames, or generally to any region in an array of image data. The quantizer controller only encodes the image once to accurately generate the quantization values for each block. The quantization values produce a target number of bits for the encoded image or video frame. Thus, the quantizer controller is less computationally intensive than other quantizer control techniques of similar accuracy.
The general framework of the quantizer controller can be used in a variety of quantizer/rate control strategies. For example, the quantizer controller can be used to select in real time the value of the quantization scales for the Discrete Cosine Transform (DCT) based encoding of the frame macroblocks in the current video coding standards MPEG-1, MPEG-2, MPEG-4, H.261, H.263, and H.263+. A frame, several frames, or several macroblocks within a frame are encoded with a fixed number of bits.

The foregoing and other objects, features and advantages of the invention will become more readily apparent from the following detailed description of a preferred embodiment of the invention, which proceeds with reference to the accompanying drawings.

BRIEF DESCRIPTION OF DRAWINGS
FIG. 1 is a schematic diagram of multiple image frames each including multiple blocks assigned optimized quantization values according to the invention.
FIG. 2 is a block diagram of an image coder according to one embodiment of the invention.
FIG. 3 is a step diagram for generating the optimized quantization values.
FIGS. 4 and 5 show results from applying the optimized quantization values to image data.
FIG. 6 is a block diagram of the quantizer controller according to one embodiment of the invention.
DETAILED DESCRIPTION

A block-based image coder 12 is used to describe the invention. However, the invention can be used for controlling the quantizer of any image or video coder. Referring to FIG. 1, in block-based image coding, images 15 are transmitted in multiple frames 26. Each frame 26 is decomposed into multiple image blocks 14 of the same size, typically 16x16 pixels per block. The number of bits $B_i$ produced after encoding the ith image block 14 is a function of the value of a quantization parameter $Q_i$ and the statistics of the block. For example, image block i = 9 contains more image information (energy) $\sigma_i$ than image block i = 17. This is because the image in block i = 9 contains portions of a facial image along with background information. Conversely, image block i = 17 has less image information (energy) $\sigma_i$ because it contains substantially the same background imagery in substantially each pixel location.

Referring to FIG. 2, the pixel values for each image block 14 are transformed into a set of coefficients, for example using a Discrete Cosine Transform (DCT) in block transform 16. These coefficients are quantized in block quantization 18 and encoded in coder 20. Bits $B_i$ of the encoded and quantized image blocks 14 are then transmitted over a communication channel 21, such as a telephone line, microwave channel, etc., to a receiver (not shown). The receiver includes a decoder that decodes the quantized bits and an inverse transform block that performs an Inverse Discrete Cosine Transform (IDCT). The decoded bits $B_i$ are then displayed on a visual display screen to a user.
Quantization of the transformed coefficients in quantization block 18 is a key procedure since it determines the quality with which the image block 14 will be encoded. The quantization of the ith block 14 is controlled by the parameter $Q_i$. In the H.261 and H.263 video coding standards, $Q_i$ is known as the quantization step for the ith block and its value corresponds to half the step size used for quantizing the transformed coefficients. In the MPEG-1 and MPEG-2 standards, $Q_i$ is called the quantization scale and the jth coefficient of a block is quantized using a quantizer of step size $Q_i w_j$, where $w_j$ is the jth value of a quantization matrix chosen by the designer of the MPEG codec.
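To make the role of the step size concrete, the following Python sketch applies a plain uniform quantizer under the two conventions just described. The function names and sample coefficients are illustrative assumptions, and real H.263 and MPEG quantizers add details (dead zones, special handling of intra DC coefficients) that are deliberately left out.

```python
def quantize(coeffs, step_sizes):
    """Map each coefficient to the index of its nearest reconstruction level."""
    return [round(c / s) for c, s in zip(coeffs, step_sizes)]


def dequantize(levels, step_sizes):
    """Rebuild approximate coefficient values from the quantization indices."""
    return [level * s for level, s in zip(levels, step_sizes)]


def h263_steps(q, n):
    """H.261/H.263 convention: the step size is twice the parameter Q."""
    return [2 * q] * n


def mpeg_steps(q, weights):
    """MPEG-1/2 convention: coefficient j uses step size Q * w_j."""
    return [q * w for w in weights]


coeffs = [100.0, -37.0, 13.0, 3.0]           # made-up transform coefficients
steps_h263 = h263_steps(5, len(coeffs))      # step size 10 for every coefficient
steps_mpeg = mpeg_steps(4, [1, 2, 2, 3])     # step sizes 4, 8, 8, 12
levels_h263 = quantize(coeffs, steps_h263)
levels_mpeg = quantize(coeffs, steps_mpeg)
```

A larger parameter gives coarser levels and fewer bits to code, at the cost of a larger reconstruction error, which stays within half a step size for each coefficient.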
Let N be the number of 16x16 image blocks in one image frame 26. The total number of bits B available for encoding one image frame 26 is:
$$B = B_1 + B_2 + B_3 + \dots + B_N, \qquad (1)$$

where the value of B depends on the quantization parameters selected, $Q_1, Q_2, \dots, Q_N$, and the statistics of the blocks. The invention comprises a quantizer controller 22 (FIG. 1) that chooses optimum values for the $Q_i$'s for a limited total number of available bits B for encoding the frames 26. The quantizer controller 22 is implemented in a variety of different ways, including in software in a programmable processing unit or with dedicated hardware.
In image coding, the image blocks 14 are said to be intracoded or of class intra. In video coding, many of the blocks 14 in a frame 26 are very similar to blocks in previous frames. The values of the pixels in a block 14 are often predicted from previously encoded blocks and only the difference or prediction error is encoded. These blocks are said to be intercoded or of class inter. The invention can be used in frames with both intra and inter blocks.

Encoder Model
The following model in equation 2 identifies the number of bits invested in the ith image block:

$$B_i = A\left(K\frac{\sigma_i^2}{Q_i^2} + C\right). \qquad (2)$$

The value $Q_i$ is the quantizer step size or quantization scale, A is the number of pixels in a block (e.g., in MPEG and H.263, $A = 16^2$ pixels), K and C are constants, and $\sigma_i$ is the empirical standard deviation of the pixels in the block,

$$\sigma_i = \sqrt{\frac{1}{A}\sum_{j=1}^{A}\left(P_i(j) - \bar{P}_i\right)^2}. \qquad (3)$$

The value $P_i(j)$ is the jth pixel in the ith block and $\bar{P}_i$ is the average of the pixel values in the block, where

$$\bar{P}_i = \frac{1}{A}\sum_{j=1}^{A} P_i(j). \qquad (4)$$
For color images, the $P_i(j)$'s are the values of the luminance and chrominance components of the respective pixels. The model in equation 2 is derived using a rate-distortion analysis of the block's encoder. The value of K in equation 2 depends on the statistics of the image blocks 14 and the quantization matrix used in the encoder. For example, it can be shown that if the pixel values are approximately uncorrelated and Gaussian distributed, and the quantization matrix is flat with unitary weights (i.e., $w_j = 1$ for all j), then $K = \pi/\ln 2$. The constant C in equation 2 models the average number of bits per pixel used for encoding the coder's overhead. For example, C accounts for header and syntax information, pixel color or chrominance components, transmitted Q values, motion vectors, etc. sent to the receiver for decoding the image blocks. If the values of K and C are not known, they are estimated with an inventive technique described below in the section entitled "Updating Parameters of the Encoder Model".
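As a minimal illustration of equations 2 through 4, the Python sketch below computes the empirical standard deviation of a block and the number of bits the model predicts for it. The function names and the sample values of K and C are assumptions chosen for the example.

```python
import math


def block_std(pixels):
    """Empirical standard deviation of a block's pixels (equations 3 and 4)."""
    a = len(pixels)
    mean = sum(pixels) / a                      # equation 4
    return math.sqrt(sum((p - mean) ** 2 for p in pixels) / a)


def model_bits(sigma, q, k, c, a=256):
    """Bits predicted for one block by the encoder model (equation 2)."""
    return a * (k * sigma ** 2 / q ** 2 + c)


sigma = block_std([0, 0, 4, 4])                 # tiny 4-pixel "block" for clarity
bits_fine = model_bits(8.0, 4.0, k=0.5, c=0.0)  # smaller step size, more bits
bits_coarse = model_bits(8.0, 8.0, k=0.5, c=0.0)
```

Doubling the step size divides the texture part of the predicted bits by four, while the overhead term A·C is unaffected by the choice of Q.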
Equation 5 models distortion D for the N encoded blocks,

$$D = \frac{1}{N}\sum_{i=1}^{N}\alpha_i^2\frac{Q_i^2}{12}, \qquad (5)$$

where the $\alpha_i$'s are weights chosen to incorporate the importance or cost of the block distortion. For example, larger $\alpha_i$'s are chosen for image blocks having artifacts more visible to the human eye or for image blocks that belong to more important objects in the scene. If $\alpha_1 = \alpha_2 = \dots = \alpha_N = 1$, the distortion represented by equation 5 is approximately the mean squared error (MSE) between the original and encoded blocks.

Optimization
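The quantizer controller 22 (FIG. 1) selects the optimal quantization values, $Q_1^*, Q_2^*, \dots, Q_N^*$, that minimize the distortion model in equation 5, subject to the constraint that the total number of bits must be equal to B as defined in equation 1, which can be expressed mathematically as follows:

$$Q_1^*, \dots, Q_N^* = \arg\min_{Q_1, \dots, Q_N} \frac{1}{N}\sum_{i=1}^{N}\alpha_i^2\frac{Q_i^2}{12}, \quad \text{subject to } B = \sum_{i=1}^{N} B_i. \qquad (6a)$$

The next objective is to find a formula for each of the $Q_i^*$'s. To do this, the method of Lagrange is used to convert the constrained minimization in equation (6a) to the following:

$$Q_1^*, \dots, Q_N^*, \lambda^* = \arg\min_{Q_1, \dots, Q_N, \lambda} \frac{1}{N}\sum_{i=1}^{N}\alpha_i^2\frac{Q_i^2}{12} + \lambda\left[\sum_{i=1}^{N} B_i - B\right], \qquad (6b)$$

where λ is called the Lagrange multiplier. Next, equation (2) is used for $B_i$ in (6b) to obtain:

$$Q_1^*, \dots, Q_N^*, \lambda^* = \arg\min_{Q_1, \dots, Q_N, \lambda} \frac{1}{N}\sum_{i=1}^{N}\alpha_i^2\frac{Q_i^2}{12} + \lambda\left[\sum_{i=1}^{N} A\left(K\frac{\sigma_i^2}{Q_i^2} + C\right) - B\right]. \qquad (6c)$$

Finally, by setting partial derivatives in (6c) to zero, the following formula is derived for the optimal quantizer step size for the i-th image block:

$$Q_i^* = \sqrt{\frac{AK}{B - ANC}\,\frac{\sigma_i}{\alpha_i}\sum_{k=1}^{N}\alpha_k\sigma_k}. \qquad (6)$$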
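As a numerical check of equation 6, the following Python sketch computes the step sizes for a few made-up block energies and compares them with a single shared step size that spends the same modeled bits. The function names, the sample values, and the `uniform_step` baseline are illustrative assumptions, not part of the method described here.

```python
import math


def model_bits(sigma, q, k, c, a):
    """Bits predicted for one block by equation 2."""
    return a * (k * sigma ** 2 / q ** 2 + c)


def model_distortion(alphas, qs):
    """Weighted distortion of equation 5."""
    return sum(al ** 2 * q ** 2 / 12 for al, q in zip(alphas, qs)) / len(qs)


def optimal_steps(sigmas, alphas, total_bits, k, c, a):
    """Equation 6 evaluated for every block (all sigmas must be positive)."""
    n = len(sigmas)
    s = sum(al * sg for al, sg in zip(alphas, sigmas))
    budget = total_bits - a * n * c          # bits left after modeled overhead
    if budget <= 0:
        raise ValueError("target bits do not cover the modeled overhead")
    return [math.sqrt(a * k / budget * sg / al * s)
            for sg, al in zip(sigmas, alphas)]


def uniform_step(sigmas, total_bits, k, c, a):
    """One shared step size that spends the same modeled bits (baseline)."""
    n = len(sigmas)
    energy = sum(sg ** 2 for sg in sigmas)
    return math.sqrt(a * k * energy / (total_bits - a * n * c))
```

For example, with `sigmas = [2.0, 8.0, 20.0]`, unit weights, and `total_bits = 3000`, the step sizes grow with block energy, the modeled bits add up to the target, and the modeled distortion is lower than with the shared step size.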
Moreover, if i-1 blocks 14 have already been quantized and encoded, the optimal quantization parameter for the ith block is,

$$Q_i^* = \sqrt{\frac{AK}{B_i - A N_i C}\,\frac{\sigma_i}{\alpha_i}\sum_{k=i}^{N}\alpha_k\sigma_k}, \qquad (7)$$

where $N_i = N - i + 1$ is the number of image blocks that remain to be encoded and $B_i$ is the number of bits available to encode them,
Figure imgf000011_0004
where $B_{i-1}$ is obtained using equation 2 with the optimized quantization value $Q_{i-1}^*$. Thus, equations 6 and 7 generate optimized quantization values that minimize distortion for a limited number of available bits. As a result, using the same number of bits, the image in frame 26 in FIG. 1 will have less distortion than it would with other quantization schemes when displayed on a display unit at the receiver end of the channel 21.

QUANTIZER CONTROL METHOD
FIG. 3 describes the steps performed by quantizer controller 22 (FIG. 2) for selecting quantizer values used for encoding N image blocks 14 with B bits. Note that N could be the number of blocks in an image, part of an image, several images, or generally any region of an image.

Step 1. Receive energy values and initialization.
Pixel values for the N image blocks are supplied to the quantizer controller 22 from the digital image (FIG. 2) in step 1A. Initialization is performed in step 1B by setting i = 1 (first block), $B_1 = B$ (available bits), and $N_1 = N$ (number of blocks). Let $S_1 = \sum_{k=1}^{N}\alpha_k\sigma_k$, where the $\sigma_k$'s are found using equation 3 and the $\alpha_k$'s are preset (e.g., set $\alpha_1 = \alpha_2 = \dots = \alpha_N = 1$ to minimize mean squared error).
In one example, the amount of energy $\sigma_i$ is derived from the DCT coefficients of the pixel values generated by transform block 16. For a fixed mode, the values of the parameters K and C in the encoder model in equation 7 are known or estimated in advance, for example using linear regression, and $K_1 = K$ and $C_1 = C$. For an adaptive mode, the model parameters are not known, and $K_1$ and $C_1$ are set to some small non-negative values. For example, experiments have shown $K_1 = 0.5$ and $C_1 = 0$ to be good initial estimates. In video coding, $K_1$ and $C_1$ can be set to the values $K_{N+1}$ and $C_{N+1}$, respectively, from the previous encoded frame.
Step 2. Compute the optimal quantization parameter for the ith block.
The optimal quantization parameter is

$$Q_i^* = \sqrt{\frac{A K_i}{B_i - A N_i C_i}\,\frac{\sigma_i}{\alpha_i}\,S_i}.$$

If the values of the Q-parameters are restricted to a fixed set (e.g., in H.263, QP = Q/2 and takes values 1, 2, 3, ..., 31), $Q_i^*$ is rounded to the nearest value in the set. The square root operation can then be implemented using look-up tables.

Step 3. Encode the ith block with a block-based coder.
$B'_i$ is the number of bits used to encode the ith block. Compute $B_{i+1} = B_i - B'_i$, $S_{i+1} = S_i - \alpha_i\sigma_i$, and $N_{i+1} = N_i - 1$.
Step 4. Update quantizer values.
In step 4, the parameters $K_{i+1}$ and $C_{i+1}$ are updated in the quantizer controller 22. For the fixed mode, $K_{i+1} = K$ and $C_{i+1} = C$. For the adaptive mode, the updates $K_{i+1}$ and $C_{i+1}$ are found using a model fitting technique. One example of a model fitting technique is described below in the section entitled "Updating Parameters of the Encoder Model".
Step 5. Generate quantizer value for next block.
If i = N in decision step 5, quantization values have been derived for all image blocks 14 and the quantizer controller 22 stops. If not all of the image blocks 14 have been quantized, the quantizer controller 22 receives the coefficients for the next image block i = i+1 in step 6, and jumps back to step 2. The quantization value for block i = i+1 is then derived as described above.
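Steps 1 through 5 can be sketched in Python as a single loop for the fixed mode. This is a minimal sketch under stated assumptions: `encode_block` is a hypothetical callback standing in for the real block coder, `model_encoder` is a stand-in that spends exactly the bits equation 2 predicts, and the fallback to the coarsest step when the formula is undefined is a guard added for the example, not something the text specifies.

```python
import math


def quantizer_control(sigmas, alphas, total_bits, encode_block,
                      k=0.5, c=0.0, a=256, h263_rounding=False):
    """Run Steps 1-5 in fixed mode; return (step sizes, bits spent)."""
    # Step 1: initialization.
    bits_left = float(total_bits)                       # B_1 = B
    blocks_left = len(sigmas)                           # N_1 = N
    s = sum(al * sg for al, sg in zip(alphas, sigmas))  # S_1
    steps = []
    for sg, al in zip(sigmas, alphas):
        # Step 2: step size for the current block.
        budget = bits_left - a * blocks_left * c
        s_i = max(s, al * sg)        # protects against floating-point drift in S_i
        if budget > 0 and sg > 0:
            q = math.sqrt(a * k / budget * sg / al * s_i)
        else:
            q = 62.0                 # assumed fallback: coarsest H.263 step size
        if h263_rounding:
            qp = min(31, max(1, round(q / 2)))   # QP = Q/2 restricted to 1..31
            q = 2.0 * qp
        # Step 3: encode the block and update the running totals.
        used = encode_block(sg, q)
        steps.append(q)
        bits_left -= used
        s -= al * sg
        blocks_left -= 1
        # Step 4 (fixed mode): K and C stay unchanged.
    # Step 5: all blocks processed.
    return steps, total_bits - bits_left


def model_encoder(sigma, q, k=0.5, c=0.0, a=256):
    """Hypothetical encoder that spends exactly the bits equation 2 predicts."""
    return a * (k * sigma ** 2 / q ** 2 + c)
```

When the encoder behaves exactly like the model and no rounding is applied, the bits spent add up to the target. With a real coder the per-block deviations are absorbed by the updates of the remaining bits in step 3.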
Referring to FIGS. 4 and 5, the frames of video sequences encoded by quantizer controller 22 were compared to those of a Telenor H.263 offline method, which is the quantizer control technique adopted for MPEG-4 anchors. In FIG. 4, the total number of bits per video frame obtained by the quantization technique described in FIG. 3 is shown as a solid line. The H.263 offline encoding technique is shown as a dashed line. Encoding was performed on 133 frames of the well-known video sequence "Foreman". The target number of bits B is 11200 bits per frame. FIG. 5 is like FIG. 4, but for 140 frames of the video sequence "Mother and Daughter" with B = 6400.
The quantizer controller 22 produces a significantly more accurate and steady number of bits per frame. Similar results were obtained for a wide range of bit rates. In the experiments, there were few if any visible differences in the quality of the two encoded video sequences. The signal to noise ratio performance of the images processed by quantizer controller 22 was only 0.1-0.3 dB lower on average. Thus, even though the image is only encoded once, quantizer controller 22 achieves the target bit rate accurately with high image quality at every frame.

Alternative Implementations

Several quantization variations are based on the base quantization optimization framework discussed above. If the computation of all $\sigma_k$'s in Step 1B of FIG. 3 cannot be performed in advance, a good estimate for $S_1$ is used, such as the value of $S_1$ from the previous video frame 26.
A low-complexity estimate of $S_1$ can be used in order to further reduce computational complexity. For the low-complexity estimate, equation 3 is replaced by equation 9,

$$\sigma_i = \frac{1}{A}\sum_{j=1}^{A}\operatorname{abs}\left(P_i(j) - \bar{P}_i\right), \qquad (9)$$

where abs(x) is the absolute value of x. In video coding, the mean value of pixels in inter blocks is usually zero and hence equation 9 may be simplified by setting $\bar{P}_i = 0$.
A fixed optimization selects the quantization parameters using equation 6 instead of equation 7. To do this, in step 3 in FIG. 3, the values for $B_{i+1}$, $S_{i+1}$, and $N_{i+1}$ are replaced by $B_{i+1} = B$, $S_{i+1} = S_1$, and $N_{i+1} = N$, respectively.
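The low-complexity energy estimate of equation 9 can be sketched in Python as below. The function name and the `assume_zero_mean` flag are illustrative, and the 1/A normalization is an assumption made by analogy with equations 3 and 4.

```python
def block_mad(pixels, assume_zero_mean=False):
    """Mean absolute deviation used in place of the standard deviation."""
    a = len(pixels)
    # For inter blocks the prediction error is usually zero-mean,
    # so the mean computation can be skipped.
    mean = 0.0 if assume_zero_mean else sum(pixels) / a
    return sum(abs(p - mean) for p in pixels) / a


intra_like = block_mad([0, 0, 4, 4])                           # mean is computed
inter_like = block_mad([-3, 3, -1, 1], assume_zero_mean=True)  # mean assumed zero
```

Compared with equation 3, this avoids the squaring and the square root, at the cost of a somewhat different energy measure.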
For a variable-rate channel, if the number of bits available after encoding i blocks changes to B because of a change of channel rate or other factors, set $B_{i+1} = B$ in step 3.
The quantization model defined in equation 2 can be generalized to equation 10,
Figure imgf000014_0001
where $A_i$ is the number of pixels in the ith region. The region of quantization does not need to be a block. Additional model parameters φ and γ can either be set prior to quantization or obtained using parameter estimation techniques described below. If the quantization model in equation 10 is used in step 2, the optimized quantization values $Q_i^*$'s are derived using equation 11,
Figure imgf000015_0001
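In step 1, $S_1$ is replaced by $S_1 = \sum_{k=1}^{N}\left(A_k\sigma_k^{\phi}\right)^{\frac{2}{\gamma+2}}\alpha_k^{\frac{2\gamma}{\gamma+2}}$. In step 3, $S_{i+1}$ is replaced by $S_{i+1} = S_i - \left(A_i\sigma_i^{\phi}\right)^{\frac{2}{\gamma+2}}\alpha_i^{\frac{2\gamma}{\gamma+2}}$.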
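The running sum for the generalized model can be sketched in Python as below. The exponents 2/(γ+2) and 2γ/(γ+2) are read from the expressions for $S_1$ and $S_{i+1}$ above and should be treated as an assumption; the function names are illustrative.

```python
def energy_term(a_i, sigma_i, alpha_i, phi=2.0, gamma=2.0):
    """Contribution of one region to the running sum S."""
    e = 2.0 / (gamma + 2.0)
    return (a_i * sigma_i ** phi) ** e * alpha_i ** (gamma * e)


def initial_s(areas, sigmas, alphas, phi=2.0, gamma=2.0):
    """S_1: sum of the energy terms of all regions (step 1)."""
    return sum(energy_term(a, sg, al, phi, gamma)
               for a, sg, al in zip(areas, sigmas, alphas))


def next_s(s_i, a_i, sigma_i, alpha_i, phi=2.0, gamma=2.0):
    """S_{i+1}: remove the term of the region just encoded (step 3)."""
    return s_i - energy_term(a_i, sigma_i, alpha_i, phi, gamma)
```

With φ = γ = 2 and regions of equal size A, each term reduces to √A·α_i·σ_i, so the sum is the one used with equation 2 up to a constant factor.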
Encoding Intra and Inter Blocks.
If some of the blocks to be encoded are of class intra (in the same frame) and some inter (between different frames), performance of the quantizer controller 22 can be improved by dividing the standard deviation of the intra blocks by a factor $\sqrt{\beta}$. Specifically, after computing the value for the $\sigma_k$'s in step 1, the factor $\sqrt{\beta}$ is applied as follows:

$$\sigma_k = \begin{cases} \sigma_k/\sqrt{\beta} & \text{if the kth block is intra} \\ \sigma_k & \text{otherwise.} \end{cases}$$
The factor β is,
2 K,
The values $K_I$ and $K_P$ are the averages of the K's measured for the intra and inter blocks, respectively. The value of β is estimated and updated during encoding. During experimentation it was found that using a constant β = 3 works well.

Frame-Based Quantizer Control.
If the quantization step is fixed for all the blocks 14 within a frame 26, the same quantizer controller 22 shown in FIG. 1 can be used for encoding one or several frames. The parameters are reinterpreted so that: N = Number of frames, B = Bits available for encoding the N frames, i = Frame number in the video sequence, $Q_i$ = Quantization step for all the blocks in the ith frame, and A = Number of pixels in a frame.

The parameters $\alpha_i$, $\sigma_i$, $B_i$ are the weight, variance, and bits for the ith frame, respectively. The parameters $K_i$ and $C_i$ are updates of the coder model for that frame.
If computational complexity is not an issue, each image block 14 can be encoded several times and, using a classical model fitting procedure (e.g., least-squares fit, linear regression, etc.), a good estimate of the $K_i$'s and $C_i$'s for the blocks can be found in advance. Then, in step 2, the quantization values $Q_i^*$ are determined according to,
Figure imgf000016_0001
In step 1, $S_1$ is replaced by $S_1 = \sum_{k=1}^{N}\sqrt{K_k}\,\sigma_k\alpha_k$, and in step 3, $S_{i+1}$ is replaced by $S_{i+1} = S_i - \sqrt{K_i}\,\sigma_i\alpha_i$. To reduce the complexity, one can set $C = C_i = 0$ and avoid the updating and computation of this model parameter. In that case, observe that $Q_i^*$ in step 2 is simply,
' B. α. '
or equivalently, Q; = Jy- , where .
Figure imgf000017_0001
Any subset of the different techniques described in "Alternative Implementations" can be combined and used together.

Updating Parameters of the Encoder Model.

The following is one technique for updating the parameters $K_{i+1}$ and $C_{i+1}$ in the quantizer controller 22. This update technique is used with the adaptive mode described above in step 4. Other classical parameter estimation or model fitting techniques such as least squares, recursive least squares, Kalman prediction, etc. can alternatively be used. The model parameters can be updated in every block, frame, group of blocks, or group of frames.
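The model parameters in one embodiment of the invention are updated or estimated on a block-by-block basis using the following weighted averages,

$$\hat{K}_i = \frac{(B'_i - A C_i)\,Q_i^{*2}}{A\sigma_i^2} \quad \text{and} \quad \hat{C}_i = \frac{B'_i}{A} - K_i\frac{\sigma_i^2}{Q_i^{*2}}.$$

The values of $\hat{K}_i$ and $\hat{C}_i$ predict $B'_i$ using equation 2. Alternatively, in some codecs these formulas are used for measuring $\hat{K}_i$ and $\hat{C}_i$:

$$\hat{K}_i = \frac{B'_{DCT,i}\,Q_i^{*2}}{A\sigma_i^2} \quad \text{and} \quad \hat{C}_i = \frac{B'_i - B'_{DCT,i}}{A},$$

where $B'_{DCT,i}$ is the number of bits spent for the DCT coefficients of the i-th image block.

The averages of the $\hat{K}_i$'s and $\hat{C}_i$'s are computed as follows,

$$\tilde{K}_i = \frac{i-1}{i}\tilde{K}_{i-1} + \frac{1}{i}\hat{K}_i \quad \text{and} \quad \tilde{C}_i = \frac{i-1}{i}\tilde{C}_{i-1} + \frac{1}{i}\hat{C}_i.$$

The updates are a linearly weighted average of $\tilde{K}_i$, $\tilde{C}_i$ and their respective initial estimates $K_1$, $C_1$,

$$K_{i+1} = \frac{i}{N}\tilde{K}_i + \frac{N-i}{N}K_1 \quad \text{and} \quad C_{i+1} = \frac{i}{N}\tilde{C}_i + \frac{N-i}{N}C_1.$$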
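One adaptive-mode update can be sketched in Python as follows. The sketch uses the alternative measurement based on the bits spent on DCT coefficients; the function name, argument order, and the split of the bit count into `bits_total` and `bits_dct` are assumptions made for the example.

```python
def update_model(i, n, bits_total, bits_dct, sigma, q,
                 k_avg_prev, c_avg_prev, k1, c1, a=256):
    """Update K and C after encoding block i (1-based) out of n blocks.

    Returns (k_next, c_next, k_avg, c_avg); the two averages are fed back
    in as k_avg_prev and c_avg_prev on the next call.
    """
    # Measurements for the block just encoded.
    k_hat = bits_dct * q ** 2 / (a * sigma ** 2)
    c_hat = (bits_total - bits_dct) / a
    # Running averages over the blocks encoded so far.
    k_avg = (i - 1) / i * k_avg_prev + k_hat / i
    c_avg = (i - 1) / i * c_avg_prev + c_hat / i
    # Blend with the initial estimates; their weight shrinks as i grows.
    k_next = i / n * k_avg + (n - i) / n * k1
    c_next = i / n * c_avg + (n - i) / n * c1
    return k_next, c_next, k_avg, c_avg
```

Early in the frame the initial estimates dominate, which keeps a single atypical block from swinging the model; by the last block the update relies entirely on the measured averages.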
If the general model in equation 10 is used, a variety of estimators can be used to estimate the additional parameters φ and γ. These parameters can also be updated on a block-by-block basis, and the ith updates $\phi_i$ and $\gamma_i$ can be found using averaging techniques similar to those for $K_i$, $C_i$.
Selection of the $\alpha_i$ Weights.
The $\alpha_i$ values can be chosen to incorporate the importance or weight of block distortions. If default values are used, $\alpha_1 = \alpha_2 = \dots = \alpha_N = 1$, the MSE distortion is minimized between the original and the encoded blocks. Otherwise, the MSE distortion decreases in blocks with larger $\alpha_i$'s and increases where the $\alpha_i$'s are smaller. Two examples are described below for choosing the $\alpha_i$ weights.

A region, such as a rectangular window in a video telephone image, is assigned a larger value of $\alpha_i$ and, in turn, smaller quantization values. The weighted region will be coded with better quality since a smaller quantization scale is used to quantize pixel values.

People usually pay more attention to the central region of a picture. Thus, larger values of $\alpha_i$ are assigned to the regions near the center of the picture. A pyramid formula is used to assign larger values of $\alpha_i$ to blocks closer to the center of the frame. Specifically, let $B_X$ and $B_Y$ be the number of blocks along the horizontal and vertical coordinates, respectively. The weight for the i-th block is computed as follows,
Figure imgf000018_0001
Figure imgf000018_0002
where $a_1$ and $a_2$ are the height and offset of the pyramid, respectively, and $b_x$ and $b_y$ are the horizontal and vertical position of the block in the frame. For example, choosing $a_1 = 15$ and $a_2 = 1$ causes the $\alpha_i$ value of the center block to be 16 times that of boundary blocks.

Block Joining
In a codec, the quantizer values used, $Q_1^*, \dots, Q_N^*$ (see step 2 above), need to be encoded and sent to the decoder. For example, in H.263, quantizer values are encoded in a raster-scan order and there is a five-bit penalty for changing the quantizer value. At high bit rates, the bit overhead for changing the quantizer values is negligible and the optimization techniques described above are effective. However, at very low bit rates, this overhead is significant and some technique is needed for constraining the number of times that the quantizer changes. Unfortunately, existing optimization methods that take quantization overhead into account are mathematically inaccurate or computationally expensive.
In another aspect of the invention, a heuristic method joins blocks of similar standard deviation together into a set so that the quantizer value remains constant within the set. This technique is referred to as block joining, and reduces the changes of the quantizer at lower bit rates. Block joining is accomplished by choosing the values of the $\alpha_i$ weights as follows,

$$\alpha_i = \begin{cases} \dfrac{2B}{AN}\left(1 - \sigma_i\right) + \sigma_i & \text{if } \dfrac{B}{AN} < 0.5 \\ 1 & \text{otherwise,} \end{cases}$$

where B/(AN) is the bit rate in bits per pixel for the current frame. The values of B, A, and N were defined earlier as the number of bits available, number of pixels in a block, and number of blocks, respectively. If the bit rate is above 0.5, the $\alpha_i$'s are all equal to 1 and hence have no effect. At lower bit rates, the $\alpha_i$'s linearly approach the respective $\sigma_i$'s and progressively reduce the range of the $Q_i^*$'s. In fact, if the bit rate is 0, then $\alpha_i = \sigma_i$ and all the quantizer values are equal, and hence all blocks are joined into one set,
Figure imgf000020_0001
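The block-joining weights can be sketched in Python as below. The sketch assumes the weights move linearly from 1 at 0.5 bits per pixel to $\sigma_i$ at 0 bits per pixel, as the description above indicates; the function name is illustrative.

```python
def block_joining_weights(sigmas, total_bits, a=256):
    """Weights that pull the step sizes together as the bit rate drops."""
    n = len(sigmas)
    rate = total_bits / (a * n)          # bits per pixel for the frame
    if rate >= 0.5:
        return [1.0] * n                 # high rate: weights have no effect
    # Linear interpolation between sigma_i (rate 0) and 1 (rate 0.5).
    return [2 * rate * (1 - sg) + sg for sg in sigmas]


sigmas = [2.0, 8.0]
high_rate = block_joining_weights(sigmas, 256)   # 0.5 bits per pixel
mid_rate = block_joining_weights(sigmas, 128)    # 0.25 bits per pixel
zero_rate = block_joining_weights(sigmas, 0)     # limiting case
```

Because the optimal step size depends on the ratio $\sigma_i/\alpha_i$, weights that approach $\sigma_i$ make that ratio, and therefore the step size, the same for every block.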
FIG. 6 is a detailed block diagram of the quantizer controller 22 shown in FIG. 2. The quantizer controller 22 in one embodiment is implemented in a general purpose programmable processor. The functional blocks in FIG. 6 represent the primary operations performed by the processor. Initialization parameters in block 31 are either derived from pre-processing the current image or from parameters previously derived from previous frames or from prestored values in processor memory (not shown). Initialization parameters include $N_1$, $S_1$, $B_1$, $K_1$ and $C_1$ (or $K_{N+1}$ and $C_{N+1}$ from a previous frame).
The image is decomposed into N image blocks 14 (FIG. 2) of A pixels in block 30. The energy of the pixels in each block is computed in block 32. The weight factors assigned to each block are computed in block 34. The amount of energy left in the image is updated in block 40 and the bits left for encoding the image are updated in block 38. Parameters for the encoder model are updated in block 42 and the number of blocks remaining to be encoded are tracked in block 44. The processor in block 36 then computes the optimized quantizer step size according to the values derived in blocks 32, 34, 40, 38, 42 and 44.
Having described and illustrated the principles of the invention in a preferred embodiment thereof, it should be apparent that the invention can be modified in arrangement and detail without departing from such principles. I claim all modifications and variations coming within the spirit and scope of the following claims.

Claims

1. A method for encoding multiple blocks in a frame of image data, comprising: identifying a target bit value equal to a total number of bits available for encoding the frame; predicting a total distortion in the frame according to quantization values assigned to each one of the blocks, the quantization values characterized according to an amount of energy in each block and a number of bits assigned to each block; adapting optimum quantization values for each of the multiple blocks by minimizing the total predicted distortion in the frame subject to a constraint that the total number of bits available for encoding the frame is equal to the target bit value; and encoding the blocks with the predicted optimum quantization values.
2. A method according to claim 1 wherein the optimum quantization values are generated using a Lagrange optimization on the predicted total distortion.
3. A method according to claim 1 wherein the optimum quantization values are derived according to the following,
Figure imgf000021_0001
where, $Q_i^*$ is the optimum quantization value for each block i, N is the number of blocks in the frame, B is the total number of bits available for encoding the frame, A is a number of pixels in each of the multiple blocks, K and C are constants associated with the image blocks, $\sigma_i$ is an empirical standard deviation of pixel values in the block, and $\alpha_i$ is a weighting incorporating the importance of distortion in the block.
4. A method according to claim 1 including adjusting the optimum quantization values according to a number of image blocks remaining to be encoded and a number of bits still available for encoding the remaining image blocks.
5. A method according to claim 3 including using a K parameter and a C parameter on a block-by-block basis to adjust the optimum quantization values for each of the multiple blocks, the K parameter modeling correlation statistics of the pixels in the image blocks and the C parameter modeling bits required to code overhead data.
6. A method according to claim 5 including deriving the optimum quantization values in either a fixed mode where the K and C parameters are known in advance or an adaptive mode where the K and C parameters are derived according to the K and C parameters of previously encoded blocks.
7. A method according to claim 6 wherein the adaptive mode includes the following steps: deriving values for the K and C parameters that exactly predict the number of bits B used for encoding previous blocks; deriving averages for the derived K and C parameters for the previously encoded video blocks; and predicting the K and C parameters for a next video block by linearly weighting the average K and C parameters according to the initial estimates for the K and C parameters.
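8. A method according to claim 7 wherein the values of K and C that predict B are,

$$\hat{K}_i = \frac{(B'_i - A C_i)\,Q_i^{*2}}{A\sigma_i^2} \quad \text{and} \quad \hat{C}_i = \frac{B'_i}{A} - K_i\frac{\sigma_i^2}{Q_i^{*2}},$$

or

$$\hat{K}_i = \frac{B'_{DCT,i}\,Q_i^{*2}}{A\sigma_i^2} \quad \text{and} \quad \hat{C}_i = \frac{B'_i - B'_{DCT,i}}{A},$$

where $B'_{DCT,i}$ is a number of bits spent for the DCT coefficients of the current image block;

the averages of K and C are;

$$\tilde{K}_i = \frac{i-1}{i}\tilde{K}_{i-1} + \frac{1}{i}\hat{K}_i \quad \text{and} \quad \tilde{C}_i = \frac{i-1}{i}\tilde{C}_{i-1} + \frac{1}{i}\hat{C}_i; \text{ and}$$

the linearly weighted averages of K and C are,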
Figure imgf000023_0001
9. A method according to claim 3 wherein the amount of energy in the frame is not determined in advance and is estimated according to the following where,
Figure imgf000023_0002
where,
$$S_1 = \sum_{k=1}^{N}\alpha_k\sigma_k,$$

and the $\alpha_k$'s and $\sigma_k$'s are those obtained for the blocks in a previously encoded video frame.
10. A method according to claim 9 including encoding the image blocks several times to estimate parameters $K_1, K_2, \dots, K_N$ and $C_1, C_2, \dots, C_N$ for each of the image blocks and then deriving a super optimum quantization value by setting:
Figure imgf000024_0001
where, $S_1 = \sum_{k=1}^{N}\sqrt{K_k}\,\sigma_k\alpha_k$ and $S_{i+1} = S_i - \sqrt{K_i}\,\sigma_i\alpha_i$.
11. A method according to claim 1 including predicting optimum quantization values for each frame according to the following steps: identifying a multiframe bit value equal to a total number of bits available for encoding multiple frames; modeling a total amount of distortion in the multiple frames according to quantization values assigned to each one of the frames, the quantization values characterized according to an amount of energy in each frame and a number of bits assigned to each frame; predicting optimum quantization values for each frame that minimize the total modeled distortion in the multiple frames; and encoding each frame with the predicted optimum quantization value.
12. A method according to claim 1 including applying weighting factors to each of the optimum quantization values according to location of the blocks in the frame.
13. A method according to claim 1 including controlling a number of different optimum quantization values assignable to the blocks by assigning the same quantization values to blocks having similar standard deviation values.
14. A method according to claim 13 wherein the optimum quantization values are controlled by assigning the following weighting values,
Figure imgf000025_0001
where B/(AN) is the bit rate in bits per pixel for a current frame, B is the number of bits available, A is a number of pixels in the block, N is the total number of blocks in the frame, and $\sigma_i$ is the standard deviation for the pixels in the blocks.
15. A method for quantizing regions in a video image, comprising: receiving image information for different regions; predicting an amount of distortion created in the video image according to quantization values assigned to the regions, the predicted distortion characterized according to the amount of information in the region and the number of bits available for encoding the information in the regions into the quantization values; optimizing the quantization values assigned to the regions so that the amount of predicted distortion in the regions is minimized for the number of available bits; and encoding the regions with the optimized quantization values.
16. A method according to claim 15 wherein the optimized quantization values are derived as follows,
Figure imgf000026_0001
where,
Figure imgf000026_0002
$S_{i+1} = S_i - (A_i \sigma_i^{\phi})^{2/(\gamma+2)} \, \alpha_i^{2\gamma/(\gamma+2)}$,
γ, φ, K and C are constants, A_i is a number of pixels in an ith region, σ_i represents energy of the pixel values for the ith region, B_i represents the number of bits available and α_i is a weighting factor incorporating importance of the region distortion.
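A Lagrangian argument shows how a sum over region terms of this kind can arise. The derivation below is a sketch under assumed models: a weighted distortion proportional to the sum of α_i² Q_i² and a rate model B_i = A_i (K σ_i^φ / Q_i^γ + C). Neither model is quoted from the claim, whose own formulas are in the referenced figures.

```latex
% Assumed problem: minimize weighted distortion under a bit budget B.
\min_{Q_1,\dots,Q_N} \sum_{i=1}^{N} \alpha_i^2 Q_i^2
\quad \text{subject to} \quad
\sum_{i=1}^{N} A_i \left( K \frac{\sigma_i^{\phi}}{Q_i^{\gamma}} + C \right) = B .

% Setting the derivative of the Lagrangian with respect to Q_i to zero:
2 \alpha_i^2 Q_i - \lambda \gamma K A_i \sigma_i^{\phi} Q_i^{-\gamma-1} = 0
\;\Longrightarrow\;
Q_i^{\gamma+2} = \frac{\lambda \gamma K}{2} \cdot \frac{A_i \sigma_i^{\phi}}{\alpha_i^2} .

% Substituting into the constraint eliminates \lambda:
Q_i^{\gamma} = \frac{K S}{B - C \sum_{k} A_k}
\left( \frac{A_i \sigma_i^{\phi}}{\alpha_i^2} \right)^{\gamma/(\gamma+2)},
\qquad
S = \sum_{k=1}^{N} (A_k \sigma_k^{\phi})^{2/(\gamma+2)} \, \alpha_k^{2\gamma/(\gamma+2)} .

% Special case \gamma = 2, \phi = 2, A_k = A for all k:
Q_i^2 = \frac{A K}{B - A N C} \cdot \frac{\sigma_i}{\alpha_i} \sum_{k=1}^{N} \alpha_k \sigma_k .
```

When regions are quantized one after another, removing the term of each finished region from S gives a running sum over the regions still to be encoded.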
17. A method according to claim 15 wherein the optimized quantization value for a selected region is derived as follows: summing the energy in each of the regions to determine a total energy in the video image; multiplying the total energy with an amount of energy in the selected region; scaling the multiplied energies according to a scaling factor; and squaring the scaled energies thereby generating the optimized quantization value for the selected region.
18. A method according to claim 17 wherein scaling the multiplied energies includes the following: applying a first scaling factor proportional to the number of regions in the frame remaining to be quantized; and applying a second scaling factor that varies for each region according to a total number of bits available for encoding the frame and a total number of bits already used to encode previous regions in the frame.
19. A method according to claim 18 including applying a third and fourth scaling factor, the third scaling factor modeling correlation statistics in the region and the fourth scaling factor representing overhead data in the encoded frame.
20. A method according to claim 19 including applying a fifth scaling factor proportional to a number of pixels in each region.
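The steps of claims 17 to 20 can be sketched as a sequential per-region computation. Everything specific here is an assumption for illustration: the rate model bits_i = A * (K * sigma_i^2 / Q_i^2 + C), the default values of `A`, `K` and `C`, the function name, and the use of modeled rather than actual bits to update the remaining budget. Under this assumed model the last step is a square root of the scaled product, so the returned value is the quantizer step size itself.

```python
import math

def region_quantizers(sigmas, alphas, bits, A=256, K=0.5, C=0.02):
    """Quantizer step per region, computed region by region.
    Assumes every sigma and alpha is positive and that bits exceeds
    the modeled overhead A * len(sigmas) * C."""
    n = len(sigmas)
    # Total (weighted) energy of all regions still to be quantized.
    S = sum(a * s for a, s in zip(alphas, sigmas))
    bits_left = float(bits)
    steps = []
    for i, (sigma, alpha) in enumerate(zip(sigmas, alphas)):
        regions_left = n - i
        # Bits left for texture once the modeled overhead C is set aside.
        texture_bits = bits_left - A * regions_left * C
        # Region energy times total energy, scaled by A, K and the budget.
        q = math.sqrt(A * K * sigma * S / (alpha * texture_bits))
        steps.append(q)
        # A real encoder would subtract the bits actually produced;
        # this sketch subtracts the modeled bits instead.
        bits_left -= A * (K * sigma ** 2 / q ** 2 + C)
        S -= alpha * sigma
    return steps

# Hypothetical standard deviations for four regions and a 4000-bit frame budget.
steps = region_quantizers([4.0, 8.0, 16.0, 2.0], [1.0, 1.0, 1.0, 1.0], bits=4000)
```

With equal weights, regions with less energy receive smaller step sizes and regions with more energy receive larger ones, and the modeled bits of all regions add up to the frame budget.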
21. A method according to claim 15 wherein the energy in each region is proportional to a standard deviation for pixel values or a sum of the absolute values of the pixels in relation to an average of all pixel values in the same region.
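Both energy measures named in claim 21 are straightforward to compute for one region. The sketch below uses hypothetical function names and normalizes the absolute-deviation sum by the pixel count, which keeps it proportional to the sum the claim describes.

```python
import math

def region_std(pixels):
    """Standard deviation of the pixel values in one region."""
    mean = sum(pixels) / len(pixels)
    return math.sqrt(sum((p - mean) ** 2 for p in pixels) / len(pixels))

def region_mean_abs_dev(pixels):
    """Sum of absolute differences from the region average,
    divided by the number of pixels."""
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels) / len(pixels)

# Hypothetical pixel values for a small region.
region = [2, 4, 4, 4, 5, 5, 7, 9]
energy_std = region_std(region)
energy_abs = region_mean_abs_dev(region)
```

The absolute-deviation form avoids the multiplications and the square root, which can matter when the measure is computed for every region of every frame.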
22. A method according to claim 15 including predicting the total energy in the video image by taking the total energy for a previous video image.
23. A method according to claim 15 wherein the predicted optimum quantization values are reduced for the blocks having less energy and the predicted optimum quantization values are increased for the blocks having more energy.
24. A method according to claim 21 wherein the energy is adjusted by a scaling value for the pixels in intracoded regions, the scaling value characterized according to different K values representing correlation statistics for different types of coded regions.
25. An encoder for quantizing regions in video frames, comprising: a circuit for detecting an amount of video information in one of the regions; a quantizer controller assigning quantization values to each region that minimize an amount of predicted distortion in the video frames for a target bit value, the quantizer controller predicting an amount of distortion created in the video frames before the information in the region is actually quantized and adjusting the quantization values assigned to each region to minimize the predicted distortion according to a constraint that a total number of available bits for encoding the frames is equal to the target bit value; and a quantizer quantizing the video information in the regions according to the adapted quantization values associated with the regions generated from the quantizer controller.
26. An encoder according to claim 25 including a transform circuit receiving the video image at an input and generating transform coefficients at an output, the quantizer quantizing the transform coefficients for each region according to the associated quantization values.
PCT/US1998/001827 1997-02-11 1998-01-30 Method and apparatus for optimizing quantizer values in an image encoder WO1998035500A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP53480498A JP2001526850A (en) 1997-02-11 1998-01-30 Method and apparatus for optimizing a quantization value in an image encoder

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US4029597P 1997-02-11 1997-02-11
US60/040,295 1997-02-11
US09/008,137 1998-01-16
US09/008,137 US6111991A (en) 1998-01-16 1998-01-16 Method and apparatus for optimizing quantizer values in an image encoder

Publications (1)

Publication Number Publication Date
WO1998035500A1 (en) 1998-08-13

Family

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1998/001827 WO1998035500A1 (en) 1997-02-11 1998-01-30 Method and apparatus for optimizing quantizer values in an image encoder

Country Status (2)

Country Link
JP (1) JP2001526850A (en)
WO (1) WO1998035500A1 (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1102492A2 (en) * 1999-11-18 2001-05-23 Sony United Kingdom Limited Data compression
WO2002080567A1 (en) * 2001-03-30 2002-10-10 Sony Corporation Image signal quantizing device and its method
WO2004010702A1 (en) 2002-07-22 2004-01-29 Institute Of Computing Technology Chinese Academy Of Sciences A bit-rate control method and device combined with rate-distortion optimization
WO2004091221A2 (en) * 2003-04-04 2004-10-21 Avid Technology, Inc. Fixed bit rate, intraframe compression and decompression of video
US7403561B2 (en) 2003-04-04 2008-07-22 Avid Technology, Inc. Fixed bit rate, intraframe compression and decompression of video
WO2009157827A1 (en) * 2008-06-25 2009-12-30 Telefonaktiebolaget L M Ericsson (Publ) Row evaluation rate control
US7916363B2 (en) 2003-04-04 2011-03-29 Avid Technology, Inc. Bitstream format for compressed image data
EP1892965A3 (en) * 2003-04-04 2011-04-06 Avid Technology, Inc. Fixed bit rate, intraframe compression and decompression of video
US9445102B2 (en) 2011-07-07 2016-09-13 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V Model parameter estimation for a rate- or distortion-quantization model function

Citations (9)

Publication number Priority date Publication date Assignee Title
US5134475A (en) * 1990-12-11 1992-07-28 At&T Bell Laboratories Adaptive leak hdtv encoder
US5144423A (en) * 1990-12-11 1992-09-01 At&T Bell Laboratories Hdtv encoder with forward estimation and constant rate motion vectors
US5374958A (en) * 1992-06-30 1994-12-20 Sony Corporation Image compression based on pattern fineness and edge presence
US5426463A (en) * 1993-02-22 1995-06-20 Rca Thomson Licensing Corporation Apparatus for controlling quantizing in a video signal compressor
US5475501A (en) * 1991-09-30 1995-12-12 Sony Corporation Picture encoding and/or decoding method and apparatus
US5550590A (en) * 1994-03-04 1996-08-27 Kokusai Denshin Denwa Kabushiki Kaisha Bit rate controller for multiplexer of encoded video
US5606371A (en) * 1993-11-30 1997-02-25 U.S. Philips Corporation Video signal coding with proportionally integrating quantization control
US5627581A (en) * 1993-06-08 1997-05-06 Sony Corporation Encoding apparatus and encoding method
US5734755A (en) * 1994-03-11 1998-03-31 The Trustees Of Columbia University In The City Of New York JPEG/MPEG decoder-compatible optimized thresholding for image and video signal compression

Patent Citations (9)

Publication number Priority date Publication date Assignee Title
US5134475A (en) * 1990-12-11 1992-07-28 At&T Bell Laboratories Adaptive leak hdtv encoder
US5144423A (en) * 1990-12-11 1992-09-01 At&T Bell Laboratories Hdtv encoder with forward estimation and constant rate motion vectors
US5475501A (en) * 1991-09-30 1995-12-12 Sony Corporation Picture encoding and/or decoding method and apparatus
US5374958A (en) * 1992-06-30 1994-12-20 Sony Corporation Image compression based on pattern fineness and edge presence
US5426463A (en) * 1993-02-22 1995-06-20 Rca Thomson Licensing Corporation Apparatus for controlling quantizing in a video signal compressor
US5627581A (en) * 1993-06-08 1997-05-06 Sony Corporation Encoding apparatus and encoding method
US5606371A (en) * 1993-11-30 1997-02-25 U.S. Philips Corporation Video signal coding with proportionally integrating quantization control
US5550590A (en) * 1994-03-04 1996-08-27 Kokusai Denshin Denwa Kabushiki Kaisha Bit rate controller for multiplexer of encoded video
US5734755A (en) * 1994-03-11 1998-03-31 The Trustees Of Columbia University In The City Of New York JPEG/MPEG decoder-compatible optimized thresholding for image and video signal compression

Cited By (18)

Publication number Priority date Publication date Assignee Title
EP1102492A2 (en) * 1999-11-18 2001-05-23 Sony United Kingdom Limited Data compression
EP1102492A3 (en) * 1999-11-18 2008-05-07 Sony United Kingdom Limited Data compression
US7145949B2 (en) 2001-03-30 2006-12-05 Sony Corporation Video signal quantizing apparatus and method thereof
US7065138B2 (en) 2001-03-30 2006-06-20 Sony Corporation Video signal quantizing apparatus and method thereof
US6865225B2 (en) 2001-03-30 2005-03-08 Sony Corporation Image signal quantizing device and its method
WO2002080567A1 (en) * 2001-03-30 2002-10-10 Sony Corporation Image signal quantizing device and its method
EP1549074A1 (en) * 2002-07-22 2005-06-29 Institute of Computing Technology Chinese Academy of Sciences A bit-rate control method and device combined with rate-distortion optimization
WO2004010702A1 (en) 2002-07-22 2004-01-29 Institute Of Computing Technology Chinese Academy Of Sciences A bit-rate control method and device combined with rate-distortion optimization
EP1549074A4 (en) * 2002-07-22 2009-04-15 Inst Computing Tech Cn Academy A bit-rate control method and device combined with rate-distortion optimization
WO2004091221A2 (en) * 2003-04-04 2004-10-21 Avid Technology, Inc. Fixed bit rate, intraframe compression and decompression of video
WO2004091221A3 (en) * 2003-04-04 2005-04-21 Avid Technology Inc Fixed bit rate, intraframe compression and decompression of video
US7403561B2 (en) 2003-04-04 2008-07-22 Avid Technology, Inc. Fixed bit rate, intraframe compression and decompression of video
US7729423B2 (en) 2003-04-04 2010-06-01 Avid Technology, Inc. Fixed bit rate, intraframe compression and decompression of video
US7916363B2 (en) 2003-04-04 2011-03-29 Avid Technology, Inc. Bitstream format for compressed image data
EP1892965A3 (en) * 2003-04-04 2011-04-06 Avid Technology, Inc. Fixed bit rate, intraframe compression and decompression of video
US8154776B2 (en) 2003-04-04 2012-04-10 Avid Technology, Inc. Bitstream format for compressed image data
WO2009157827A1 (en) * 2008-06-25 2009-12-30 Telefonaktiebolaget L M Ericsson (Publ) Row evaluation rate control
US9445102B2 (en) 2011-07-07 2016-09-13 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V Model parameter estimation for a rate- or distortion-quantization model function

Also Published As

Publication number Publication date
JP2001526850A (en) 2001-12-18

Similar Documents

Publication Publication Date Title
US6111991A (en) Method and apparatus for optimizing quantizer values in an image encoder
Ribas-Corbera et al. Rate control in DCT video coding for low-delay communications
US6529631B1 (en) Apparatus and method for optimizing encoding and performing automated steerable image compression in an image coding system using a perceptual metric
US6037987A (en) Apparatus and method for selecting a rate and distortion based coding mode for a coding system
Gerken Object-based analysis-synthesis coding of image sequences at very low bit rates
US6243497B1 (en) Apparatus and method for optimizing the rate control in a coding system
US6690833B1 (en) Apparatus and method for macroblock based rate control in a coding system
CA2295689C (en) Apparatus and method for object based rate control in a coding system
US6389072B1 (en) Motion analysis based buffer regulation scheme
JP4122130B2 (en) Multi-component compression encoder motion search method and apparatus
EP1034513B1 (en) Method and device for determining bit allocation in a video compression system
US7095784B2 (en) Method and apparatus for moving picture compression rate control using bit allocation with initial quantization step size estimation at picture level
EP1256238A1 (en) Method for encoding and decoding video information, a motion compensated video encoder and a corresponding decoder
WO2008121281A2 (en) Controlling the amount of compressed data
US20030152151A1 (en) Rate control method for real-time video communication by using a dynamic rate table
Ribas-Corbera et al. Optimizing motion-vector accuracy in block-based video coding
Ribas-Corbera et al. Optimizing block size in motion-compensated video coding
US8326068B1 (en) Method and apparatus for modeling quantization matrices for image/video encoding
WO1998035500A1 (en) Method and apparatus for optimizing quantizer values in an image encoder
US20050123038A1 (en) Moving image encoding apparatus and moving image encoding method, program, and storage medium
Ribas-Corbera et al. Optimal block size for block-based motion-compensated video coders
US7133448B2 (en) Method and apparatus for rate control in moving picture video compression
WO1997016031A1 (en) Apparatus and method for selecting a coding mode in a block-based coding system
KR100595144B1 (en) An adaptive quantization algorithm for video coding
Ribas-Corbera et al. Reducing rate/complexity in video coding by motion estimation with block adaptive accuracy

Legal Events

Date Code Title Description
AK Designated states. Kind code of ref document: A1. Designated state(s): CN JP KR SG
AL Designated countries for regional patents. Kind code of ref document: A1. Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
121 EP: the EPO has been informed by WIPO that EP was designated in this application
122 EP: PCT application non-entry in European phase