US9628821B2 - Motion compensation using decoder-defined vector quantized interpolation filters - Google Patents


Info

Publication number
US9628821B2
Authority
US
United States
Prior art keywords
codebook
pixel block
data
filter
video
Prior art date
Legal status
Active, expires
Application number
US12/896,552
Other versions
US20120082217A1 (en)
Inventor
Barin Geoffry Haskell
Current Assignee
Apple Inc
Original Assignee
Apple Inc
Priority date
Filing date
Publication date
Application filed by Apple Inc
Priority to US12/896,552
Assigned to APPLE INC. (Assignors: HASKELL, BARIN G.)
Priority to PCT/US2011/053975
Priority to AU2011308759A
Publication of US20120082217A1
Application granted
Publication of US9628821B2
Status: Active
Expiration: adjusted

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/80: Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
    • H04N19/82: Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation, involving filtering within a prediction loop
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102: ... characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/117: Filters, e.g. for pre-processing or post-processing
    • H04N19/134: ... characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/136: Incoming video signal characteristics or properties
    • H04N19/169: ... characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17: ... the unit being an image region, e.g. an object
    • H04N19/176: ... the region being a block, e.g. a macroblock
    • H04N19/189: ... characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N19/192: ... the adaptation method, adaptation tool or adaptation type being iterative or recursive

Definitions

  • the present invention relates to video coding and, more particularly, to video coding system using interpolation filters as part of motion-compensated coding.
  • Video codecs typically code video frames using a discrete cosine transform (“DCT”) on blocks of pixels, called “pixel blocks” herein, much the same as used for the original JPEG coder for still images.
  • An initial frame (called an “intra” frame) is coded and transmitted as an independent frame.
  • Subsequent frames, which are modeled as changing slowly due to small motions of objects in the scene, are coded efficiently in the inter mode using a technique called motion compensation (“MC”) in which the displacements of pixel blocks from their positions in previously-coded frames are transmitted as motion vectors together with a coded representation of a difference between a predicted pixel block and a pixel block from the source image.
  • FIGS. 1 and 2 show a block diagram of a motion-compensated image coder/decoder system.
  • the system combines transform coding (in the form of the DCT of pixel blocks) with predictive coding (in the form of differential pulse coded modulation (“DPCM”)) in order to reduce storage and computation of the compressed image, and at the same time to give a high degree of compression and adaptability.
  • the first step in the interframe coder is to create a motion compensated prediction error. This computation requires one or more frame stores in both the encoder and decoder.
  • the resulting error signal is transformed using a DCT, quantized by an adaptive quantizer, entropy encoded using a variable length coder (“VLC”) and buffered for transmission over a channel.
  • the way that the motion estimator works is illustrated in FIG. 3.
  • in its simplest form, the current frame is partitioned into motion compensation blocks, called “mcblocks” herein, of constant size, e.g., 16×16 or 8×8.
  • variable size mcblocks are often used, especially in newer codecs such as H.264.
  • (see ITU-T Recommendation H.264, Advanced Video Coding). Indeed, nonrectangular mcblocks have also been studied and proposed.
  • Mcblocks are generally larger than or equal to pixel blocks in size.
  • the previous decoded frame is used as the reference frame, as shown in FIG. 3 .
  • one of many possible reference frames may also be used, especially in newer codecs such as H.264.
  • a different reference frame may be used for each mcblock.
  • Each mcblock in the current frame is compared with a set of displaced mcblocks in the reference frame to determine which one best predicts the current mcblock.
  • a motion vector is determined that specifies the displacement of the reference mcblock.
  • Intraframe coding exploits the spatial redundancy that exists between adjacent pixels of a frame. Frames coded using only intraframe coding are called “I-frames”.
  • a target mcblock in the frame to be encoded is matched with a set of mcblocks of the same size in a past frame called the “reference frame”.
  • the mcblock in the reference frame that “best matches” the target mcblock is used as the reference mcblock.
  • the prediction error is then computed as the difference between the target mcblock and the reference mcblock.
  • Prediction mcblocks do not, in general, align with coded mcblock boundaries in the reference frame.
  • the position of this best-matching reference mcblock is indicated by a motion vector that describes the displacement between it and the target mcblock.
  • the motion vector information is also encoded and transmitted along with the prediction error. Frames coded using forward prediction are called “P-frames”.
  • the prediction error itself is transmitted using the DCT-based intraframe encoding technique summarized above.
  • Bidirectional temporal prediction, also called “Motion-Compensated Interpolation”, is a key feature of modern video codecs.
  • Frames coded with bidirectional prediction use two reference frames, typically one in the past and one in the future. However, two of many possible reference frames may also be used, especially in newer codecs such as H.264. In fact, with appropriate signaling, different reference frames may be used for each mcblock.
  • a target mcblock in bidirectionally-coded frames can be predicted by a mcblock from the past reference frame (forward prediction), or one from the future reference frame (backward prediction), or by an average of two mcblocks, one from each reference frame (interpolation).
  • a prediction mcblock from a reference frame is associated with a motion vector, so that up to two motion vectors per mcblock may be used with bidirectional prediction.
  • Motion-Compensated Interpolation for a mcblock in a bidirectionally-predicted frame is illustrated in FIG. 4 . Frames coded using bidirectional prediction are called “B-frames”.
  • Bidirectional prediction provides a number of advantages.
  • the primary one is that the compression obtained is typically higher than can be obtained from forward (unidirectional) prediction alone.
  • bidirectionally-predicted frames can be encoded with fewer bits than frames using only forward prediction.
  • bidirectional prediction does introduce extra delay in the encoding process, because frames must be encoded out of sequence. Further, it entails extra encoding complexity because mcblock matching (the most computationally intensive encoding procedure) has to be performed twice for each target mcblock, once with the past reference frame and once with the future reference frame.
  • FIG. 5 shows a typical bidirectional video encoder. It is assumed that frame reordering takes place before coding, i.e., I- or P-frames used for B-frame prediction must be coded and transmitted before any of the corresponding B-frames. In this codec, B-frames are not used as reference frames. With a change of architecture, they could be as in H.264.
  • Input video is fed to a Motion Compensation Estimator/Predictor that feeds a prediction to the minus input of the subtractor.
  • for each mcblock, the Inter/Intra Classifier then compares the input pixels with the prediction error output of the subtractor. Typically, if the mean square prediction error exceeds the mean square pixel value, an intra mcblock is decided. More complicated comparisons involving DCT of both the pixels and the prediction error yield somewhat better performance, but are not usually deemed worth the cost.
  • for intra mcblocks, the prediction is set to zero. Otherwise, it comes from the Predictor, as described above.
  • the prediction error is then passed through the DCT and quantizer before being coded, multiplexed and sent to the Buffer.
  • Quantized levels are converted to reconstructed DCT coefficients by the Inverse Quantizer and then inverse transformed by the inverse DCT unit (“IDCT”) to produce a coded prediction error.
  • the Adder adds the prediction to the prediction error and clips the result, e.g., to the range 0 to 255, to produce coded pixel values.
  • the Motion Compensation Estimator/Predictor uses both the previous frame and the future frame kept in picture stores.
  • the coded pixels output by the Adder are written to the Next Picture Store, while at the same time the old pixels are copied from the Next Picture store to the Previous Picture store. In practice, this is usually accomplished by a simple change of memory addresses.
  • the coded pixels may be filtered by an adaptive deblocking filter prior to entering the picture stores. This improves the motion compensation prediction, especially for low bit rates where coding artifacts may become visible.
  • the Coding Statistics Processor in conjunction with the Quantizer Adapter controls the output bit-rate and optimizes the picture quality as much as possible.
  • FIG. 6 shows a typical bidirectional video decoder. It has a structure corresponding to the pixel reconstruction portion of the encoder using inverting processes. It is assumed that frame reordering takes place after decoding and video output.
  • the interpolation filter might be placed at the output of the motion compensated predictor as in the encoder.
  • FIG. 3 and FIG. 4 show reference mcblocks in reference frames as being displaced vertically and horizontally with respect to the position of the current mcblock being decoded in the current frame.
  • the amount of the displacement is represented by a two-dimensional vector [dx, dy], called the motion vector.
  • Motion vectors may be coded and transmitted, or they may be estimated from information already in the decoder, in which case they are not transmitted. For bidirectional prediction, each transmitted mcblock requires two motion vectors.
  • dx and dy are signed integers representing the number of pixels horizontally and the number of lines vertically to displace the reference mcblock.
  • reference mcblocks are obtained merely by reading the appropriate pixels from the reference stores.
  • Fractional motion vectors require more than simply reading pixels from reference stores. In order to obtain reference mcblock values for locations between the reference store pixels, it is necessary to interpolate between them.
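  • as an illustration, a half-pel fetch might be sketched as below (a minimal numpy sketch, not from the patent; the function name and the bilinear kernel are our assumptions, standing in for the longer separable filters real codecs use):

```python
import numpy as np

def fetch_block_halfpel(ref, x, y, frac_x, frac_y, n=16):
    """Read an n x n reference mcblock at integer position (x, y) plus a
    fractional offset (frac_x, frac_y) in [0, 1), bilinearly interpolating
    between the four integer-pel neighbors of each sample."""
    # Slice an (n+1) x (n+1) window so every fractional position has its
    # four integer-pel neighbors available.
    win = ref[y:y + n + 1, x:x + n + 1].astype(np.float64)
    a, b = win[:n, :n], win[:n, 1:]   # top-left / top-right neighbors
    c, d = win[1:, :n], win[1:, 1:]   # bottom-left / bottom-right neighbors
    top = (1 - frac_x) * a + frac_x * b
    bot = (1 - frac_x) * c + frac_x * d
    return (1 - frac_y) * top + frac_y * bot
```

With integer motion vectors (frac_x = frac_y = 0) this degenerates to the plain pixel read described above.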
  • the optimum motion compensation interpolation filter depends on a number of factors. For example, objects in a scene may not be moving in pure translation. There may be object rotation, both in two dimensions and three dimensions. Other factors include zooming, camera motion and lighting variations caused by shadows, or varying illumination.
  • Camera characteristics may vary due to special properties of their sensors. For example, many consumer cameras are intrinsically interlaced, and their output may be de-interlaced and filtered to provide pleasing-looking pictures free of interlacing artifacts. Low light conditions may cause an increased exposure time per frame, leading to motion dependent blur of moving objects. Pixels may be non-square.
  • interpolation filters may be designed by minimizing the mean square error between the current mcblocks and their corresponding reference mcblocks over each frame. These are the so-called Wiener filters. The filter coefficients would then be quantized and transmitted at the beginning of each frame to be used in the actual motion compensated coding.
  • FIG. 1 is a block diagram of a conventional video coder.
  • FIG. 2 is a block diagram of a conventional video decoder.
  • FIG. 3 illustrates principles of motion compensated prediction.
  • FIG. 4 illustrates principles of bidirectional temporal prediction.
  • FIG. 5 is a block diagram of a conventional bidirectional video coder.
  • FIG. 6 is a block diagram of a conventional bidirectional video decoder.
  • FIG. 7 illustrates an encoder/decoder system suitable for use with embodiments of the present invention.
  • FIG. 8 is a simplified block diagram of a video encoder according to an embodiment of the present invention.
  • FIG. 9 illustrates a method according to an embodiment of the present invention.
  • FIG. 10 illustrates a method according to another embodiment of the present invention.
  • FIG. 11 is a simplified block diagram of a video decoder according to an embodiment of the present invention.
  • FIG. 12 illustrates a method according to a further embodiment of the present invention.
  • FIG. 13 illustrates a codebook architecture according to an embodiment of the present invention.
  • FIG. 14 illustrates a codebook architecture according to another embodiment of the present invention.
  • FIG. 15 illustrates a codebook architecture according to a further embodiment of the present invention.
  • FIG. 16 illustrates a decoding method according to an embodiment of the present invention.
  • FIG. 17 illustrates a method according to an embodiment of the present invention.
  • FIG. 18 illustrates another method according to an embodiment of the present invention.
  • Embodiments of the present invention provide a video coder/decoder system that uses dynamically assignable interpolation filters as part of motion compensated prediction.
  • An encoder and a decoder each may store common codebooks that define a variety of interpolation filters that may be applied to predicted video data.
  • an encoder calculates characteristics of an ideal interpolation filter to be applied to a reference block that would minimize prediction error when the reference block is used to predict an input block of video data.
  • the encoder may search its local codebook to find a filter that best matches the ideal filter.
  • the encoder may filter the reference block by the best matching filter stored in the codebook as it codes the input block.
  • the encoder also may transmit an identifier of the best matching filter to a decoder, which will use the interpolation filter on a predicted block as it decodes coded data for the block.
  • embodiments of the present invention propose to use a codebook of filters and send an index into the codebook for each mcblock.
  • Embodiments of the present invention provide a method of building and applying filter codebooks between an encoder and a decoder ( FIG. 7 ).
  • FIG. 8 illustrates a simplified block diagram of an encoder system showing operation of the interpolation filter.
  • FIG. 9 illustrates a method of building a codebook according to an embodiment of the present invention.
  • FIG. 10 illustrates a method of using a codebook during runtime coding and decoding according to an embodiment of the present invention.
  • FIG. 11 illustrates a simplified block diagram of a decoder showing operation of the interpolation filter and consumption of the codebook indices.
  • FIG. 8 is a simplified block diagram of an encoder suitable for use with the present invention.
  • the encoder 100 may include a block-based coding chain 110 and a prediction unit 120 .
  • the block coding chain 110 may include a subtractor 112 , a transform unit 114 , a quantizer 116 and a variable length coder 118 .
  • the subtractor 112 may receive an input mcblock from a source image and a predicted mcblock from the prediction unit 120 . It may subtract the predicted mcblock from the input mcblock, generating a block of pixel residuals.
  • the transform unit 114 may convert the mcblock's residual data to an array of transform coefficients according to a spatial transform, typically a discrete cosine transform (“DCT”) or a wavelet transform.
  • the quantizer 116 may truncate transform coefficients of each block according to a quantization parameter (“QP”).
  • the QP values used for truncation may be transmitted to a decoder in a channel.
  • the variable length coder 118 may code the quantized coefficients according to an entropy coding algorithm, for example, a variable length coding algorithm. Following variable length coding, the coded data of each mcblock may be stored in a buffer 140 to await transmission to a decoder via a channel.
  • the prediction unit 120 may include: an inverse quantization unit 122 , an inverse transform unit 124 , an adder 126 , a reference picture cache 128 , a motion estimator 130 , a motion compensated predictor 132 , an interpolation filter 134 and a codebook 136 .
  • the inverse quantization unit 122 may dequantize coded video data according to the QP used by the quantizer 116.
  • the inverse transform unit 124 may transform re-quantized coefficients to the pixel domain.
  • the adder 126 may add pixel residuals output from the inverse transform unit 124 with predicted motion data from the motion compensated predictor 132 .
  • the reference picture cache 128 may store recovered frames for use as reference frames during coding of later-received mcblocks.
  • the motion estimator 130 may estimate image motion between a source image being coded and reference frame(s) stored in the reference picture cache. For example, it may select a prediction mode to be used (for example, unidirectional P-coding or bidirectional B-coding), and generate motion vectors for use in such predictive coding.
  • the motion compensated predictor 132 may generate a predicted mcblock for use by the block coder. In this regard, the motion compensated predictor may retrieve stored mcblock data of the selected reference frames.
  • the interpolation filter 134 may filter a predicted mcblock from the motion compensated predictor 132 according to configuration parameters output by codebook 136 .
  • the codebook 136 may store configuration data that defines operation of the interpolation filter 134 . Different instances of configuration data are identified by an index into the codebook.
  • motion vectors, quantization parameters and codebook indices may be output to a channel along with coded mcblock data for decoding by a decoder (not shown).
  • FIG. 9 illustrates a method according to an embodiment of the present invention.
  • a codebook may be constructed by using a large set of training sequences having a variety of detail and motion characteristics.
  • an integer motion vector and reference frame may be computed according to traditional techniques (box 210 ).
  • an N×N Wiener interpolation filter may be constructed (box 220) by computing cross-correlation matrices (box 222) and auto-correlation matrices (box 224) between uncoded pixels and coded pixels from the reference picture cache, each averaged over the mcblock.
  • the cross-correlation matrices and auto-correlation matrices may be averaged over a larger surrounding area having similar motion and detail as the mcblock.
  • the interpolation filter may be a rectangular interpolation filter or a circularly-shaped Wiener interpolation filter.
  • This procedure may produce auto-correlation matrices that are singular, which means that some of the filter coefficients may be chosen arbitrarily. In these cases, the affected coefficients farthest from the center may be chosen to be zero.
  • the resulting filter may be added to the codebook (box 230 ).
  • Filters may be added pursuant to vector quantization (“VQ”) clustering techniques, which are designed to either produce a codebook with a desired number of entries or a codebook with a desired accuracy of representation of the filters.
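  • a minimal sketch of this construction in numpy follows (the helper names, the (targets, supports) data layout, and the use of plain k-means as the VQ clustering step are our assumptions):

```python
import numpy as np

def wiener_filter(targets, supports):
    """Least-squares (Wiener) interpolation filter for one mcblock.
    targets:  length-M vector of uncoded pixels p in the mcblock.
    supports: M x N matrix whose rows are the Q_p vectors of reference
              pixels used to predict each p.
    Solves S F = R with S, R averaged over the mcblock."""
    S = supports.T @ supports / len(targets)   # N x N auto-correlation
    R = supports.T @ targets / len(targets)    # N x 1 cross-correlation
    # S may be singular; lstsq then returns a minimum-norm solution, a
    # stand-in for the patent's zeroing of arbitrarily chosen coefficients.
    return np.linalg.lstsq(S, R, rcond=None)[0]

def build_codebook(filters, k, iters=20, seed=0):
    """Cluster training filters (an M x N array, one filter per row) into a
    k-entry codebook with naive k-means, one member of the family of VQ
    clustering techniques referred to above."""
    rng = np.random.default_rng(seed)
    book = filters[rng.choice(len(filters), k, replace=False)]
    for _ in range(iters):
        # Assign each training filter to its nearest codebook entry ...
        idx = ((filters[:, None] - book[None]) ** 2).sum(-1).argmin(axis=1)
        # ... and move each entry to the centroid of its assigned filters.
        for j in range(k):
            if np.any(idx == j):
                book[j] = filters[idx == j].mean(axis=0)
    return book
```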
  • once the codebook is established, it may be transmitted to the decoder (box 240). After transmission, both the encoder and decoder may store a common codebook, which may be referenced during runtime coding operations.
  • Transmission to a decoder may occur in a variety of ways.
  • for example, the codebook may be transmitted periodically to the decoder during encoding operations.
  • the codebook may be coded into the decoder a priori, either from coding operations performed on generic training data or by representation in a coding standard.
  • Other embodiments permit a default codebook to be established in an encoder and decoder but to allow the codebook to be updated adaptively by transmissions from the encoder to the decoder.
  • Indices into the codebook may be variable length coded based on their probability of occurrence, or they may be arithmetically coded.
  • FIG. 10 illustrates a method for runtime encoding of video, according to an embodiment of the present invention.
  • an integer motion vector and reference frame(s) may be computed (box 310 ), coded and transmitted.
  • an N×N Wiener interpolation filter may be constructed for the mcblock (box 320) by computing cross-correlation matrices (box 322) and auto-correlation matrices (box 324) averaged over the mcblock.
  • the cross-correlation matrices and auto-correlation matrices may be averaged over a larger surrounding area that has similar motion and detail as the mcblock.
  • the interpolation filter may be a rectangular interpolation filter or a circularly-shaped Wiener interpolation filter.
  • the codebook may be searched for a previously-stored filter that best matches the newly-constructed interpolation filter (box 330 ).
  • the matching algorithm may proceed according to vector quantization search methods.
  • the encoder may code the resulting index and transmit it to a decoder (box 340 ).
  • when an encoder identifies a best matching filter from the codebook, it may compare the newly generated interpolation filter with the codebook's filter (box 350). If the differences between the two filters exceed a predetermined error threshold, the encoder may transmit filter characteristics to the decoder, which may cause the decoder to store the characteristics as a new codebook entry (boxes 360-370). If the differences do not exceed the error threshold, the encoder may simply transmit the index of the matching codebook entry (box 340).
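  • in code, the search-and-update decision of boxes 330-370 might look like the following sketch (the function name and the squared-error metric are our assumptions; any VQ search structure could replace the linear scan):

```python
import numpy as np

def match_filter(codebook, f, err_threshold):
    """Return (index, new_entry): the index of the stored filter nearest to
    the per-mcblock Wiener filter f, and f itself when the match is poor
    enough that its characteristics should also be sent and stored."""
    dists = ((codebook - f) ** 2).sum(axis=1)   # squared error per entry
    idx = int(dists.argmin())
    if dists[idx] > err_threshold:
        return idx, f      # transmit index plus new filter characteristics
    return idx, None       # transmit only the index (box 340)
```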
  • the decoder that receives the integer motion vector, reference frame index and VQ interpolation filter index may use this data to perform motion compensation.
  • FIG. 11 is a simplified block diagram of a decoder 400 according to an embodiment of the present invention.
  • the decoder 400 may include a block-based decoder 402 that may include a variable length decoder 410 , an inverse quantizer 420 , an inverse transform unit 430 and an adder 440 .
  • the decoder 400 further may include a prediction unit 404 that may include a reference picture cache 450 , a motion compensated predictor 460 , a codebook 470 and an interpolation filter 480 .
  • the prediction unit 404 may generate a predicted pixel block in response to motion compensation data, such as motion vectors and codebook indices received from a channel.
  • the block-based decoder 402 may decode coded pixel block data with reference to the predicted pixel block data to recover pixel data of the pixel blocks.
  • the coded video data may include motion vectors and codebook indices that govern operation of the prediction unit 404 .
  • the reference picture cache 450 may store recovered image data of previously decoded frames that were identified as candidates for prediction of later-received coded video data (e.g., decoded I- or P-frames).
  • the motion compensated predictor 460, responsive to mcblock motion vector data, may retrieve a reference mcblock from identified frame(s) stored in the reference picture cache. Typically, a single reference mcblock is retrieved when decoding a P-coded block, and a pair of reference mcblocks are retrieved and averaged together when decoding a B-coded block.
  • the motion compensated predictor 460 may output the resultant mcblock and, optionally, pixels located near to the reference mcblocks, to the interpolation filter 480 .
  • the codebook 470 may supply filter parameter data to the interpolation filter 480 in response to a codebook index received from the channel data associated with the mcblock being decoded.
  • the codebook 470 may be provisioned as storage and control logic that stores filter parameter data and outputs selected parameter data in response to a codebook index.
  • the interpolation filter 480 may filter the predicted mcblock based on parameter data applied by the codebook 470 .
  • the output of the interpolation filter 480 may be input to the block-based decoder 402 .
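  • the filtering step itself is an ordinary 2-D FIR pass over the predicted mcblock; a numpy sketch follows (the naming is ours; the margin convention follows the note above about retrieving pixels adjacent to the mcblock's boundary):

```python
import numpy as np

def apply_interp_filter(pred, taps):
    """Filter a predicted mcblock with an n x n tap array from the codebook.
    `pred` must include a margin of neighboring reference pixels so that
    every output sample has full filter support."""
    n = taps.shape[0]
    h, w = pred.shape[0] - n + 1, pred.shape[1] - n + 1
    out = np.zeros((h, w))
    for i in range(n):             # direct-form 2-D FIR: shift, scale, add
        for j in range(n):
            out += taps[i, j] * pred[i:i + h, j:j + w]
    return np.clip(out, 0.0, 255.0)   # clip to the 8-bit pixel range
```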
  • the coded video data may include coded residual coefficients that have been entropy coded.
  • a variable length decoder 410 may decode data received from a channel buffer according to an entropy coding process to recover quantized coefficients therefrom.
  • the inverse quantizer 420 may multiply coefficient data received from the variable length decoder 410 by a quantization parameter received in the channel data.
  • the inverse quantizer 420 may output recovered coefficient data to the inverse transform unit 430 .
  • the inverse transform unit 430 may transform dequantized coefficient data received from the inverse quantizer 420 to pixel data.
  • the inverse transform unit 430 performs the converse of transform operations performed by the transform unit of an encoder (e.g., DCT or wavelet transforms).
  • An adder 440 may add, on a pixel-by-pixel basis, pixel residual data obtained by the inverse transform unit 430 with predicted pixel data obtained from the prediction unit 404 .
  • the adder 440 may output recovered mcblock data, from which a recovered frame may be constructed and rendered at a display device (not shown).
  • FIG. 12 illustrates a method according to another embodiment of the present invention.
  • an integer motion vector and reference frame may be computed according to traditional techniques (box 510 ).
  • an N×N Wiener interpolation filter may be selected by serially determining prediction results that would be obtained by each filter stored in the codebook (box 520).
  • the method may perform filtering operations on a predicted block using either all or a subset of the filters in succession (box 522 ) and estimate a prediction residual obtained from each codebook filter (box 524 ).
  • the method may determine which filter configuration gives the best prediction (box 530 ).
  • the index of that filter may be coded and transmitted to a decoder (box 540). This embodiment conserves processing resources that otherwise might be spent computing Wiener filters for each source mcblock.
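  • a sketch of this search (reusing apply_interp_filter from the decoder sketch above; the names are ours):

```python
import numpy as np

def pick_filter_by_residual(codebook_taps, pred_with_margin, source_block):
    """FIG. 12 variant: try each stored filter on the predicted mcblock and
    keep the index with the lowest sum of squared residuals (boxes 522-530),
    avoiding any per-mcblock Wiener derivation."""
    best_idx, best_sse = 0, np.inf
    for idx, taps in enumerate(codebook_taps):
        residual = source_block - apply_interp_filter(pred_with_margin, taps)
        sse = float((residual ** 2).sum())
        if sse < best_sse:
            best_idx, best_sse = idx, sse
    return best_idx
```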
  • select filter coefficients may be forced to be equal to other filter coefficients. This embodiment can simplify the calculation of Wiener filters.
  • the vector Q_p may take the form Q_p = [q_1, q_2, …, q_N]^T, an N×1 column vector, where q_1 to q_N represent pixels in or near the translated reference mcblock to be used in the prediction of p.
  • R is an N×1 cross-correlation matrix derived from uncoded pixels (p) to be coded and their corresponding Q_p vectors; its element r_i at each location i may be derived as p·q_i averaged over the pixels p in the mcblock.
  • S is an N×N auto-correlation matrix derived from the N×1 vectors Q_p; its element s_{i,j} at each location (i, j) may be derived as q_i·q_j averaged over the pixels p in the mcblock.
  • the cross-correlation matrices and auto-correlation matrices may be averaged over a larger surrounding area having similar motion and detail as the mcblock.
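  • combining the definitions above, the filter taps follow the standard Wiener (least-squares) solution; the equation is not restated in the text at this point, so the following sketch uses its F, S, and R notation with a predicted-pixel symbol added for illustration:

```latex
\hat{p} = F^{T} Q_p, \qquad
F = \arg\min_{F}\, \mathrm{E}\!\left[ (p - F^{T} Q_p)^2 \right] = S^{-1} R
```

where F is the N×1 vector of interpolation filter coefficients. When S is singular, S^{-1}R may be replaced by any solution of S F = R, consistent with the arbitrary-coefficient choice noted above.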
  • Derivation of the S and R matrices occurs for each mcblock being coded. Accordingly, derivation of the Wiener filters involves substantial computational resources at an encoder. According to this embodiment, select filter coefficients in the F matrix may be forced to be equal to each other, which reduces the size of F and, as a consequence, reduces the computational burden at the encoder.
  • suppose, for example, that filter coefficients f_1 and f_2 are set to be equal to each other.
  • the F and Q_p matrices may then be modified as F′ = [f_1, f_3, …, f_N]^T and Q′_p = [q_1 + q_2, q_3, …, q_N]^T, since f_1·q_1 + f_2·q_2 = f_1·(q_1 + q_2) when f_1 = f_2.
  • Deletion of the single coefficient reduces the size of F and Q_p both to (N-1)×1. Deletion of other filter coefficients in F and consolidation of values in Q_p can result in further reductions to the sizes of the F and Q_p vectors. For example, it often is advantageous to delete filter coefficients at all positions (save one) that are equidistant from the pixel p. In this manner, derivation of the F matrix is simplified.
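  • the consolidation can be expressed compactly; in this numpy sketch the grouping of tied taps is our assumption:

```python
import numpy as np

def tie_coefficients(supports, groups):
    """Reduce an M x N matrix of Q_p vectors to M x G by summing the columns
    of taps forced equal: f1*q1 + f2*q2 = f*(q1 + q2) when f1 = f2 = f.
    groups: list of tap-index lists, e.g. positions equidistant from p."""
    return np.stack([supports[:, g].sum(axis=1) for g in groups], axis=1)

# Example: tie taps 0 and 1, leave taps 2 and 3 free -> 3 unknowns to solve.
# reduced = tie_coefficients(Qp, [[0, 1], [2], [3]])
```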
  • encoders and decoders may store separate codebooks that are indexed not only by the filter index but also by supplemental identifiers ( FIG. 13 ).
  • the supplemental identifiers may select one of the codebooks as being active and the index may select an entry from within the codebook to be output to the interpolation filter.
  • the supplemental identifier may be derived from many sources.
  • a block's motion vector may serve as the supplemental identifier.
  • separate codebooks may be provided for each motion vector value or for different ranges of integer motion vectors ( FIG. 14 ). Then in operation, given the value of integer motion vector and reference frame index, the encoder and decoder both may use the corresponding codebook to recover the filter to be used in motion compensation.
  • separate codebooks may be provided for different values or ranges of values of deblocking filters present in the current or reference frame. Then in operation, given the values of the deblocking filters, the encoder and decoder use the corresponding codebook to recover the filter to be used in motion compensation.
  • separate codebooks may be provided for different values or ranges of values of other codec parameters such as pixel aspect ratio and bit rate. Then in operation, given the values of these other codec parameters, the encoder and decoder use the corresponding codebook to recover the filter to be used in motion compensation.
  • separate codebooks may be provided for P-frames and B-frames or, alternatively, for coding types (P- or B-coding) applied to each mcblock.
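  • one way to realize the lookup (the keying scheme here, a motion-vector range paired with the coding type, is a hypothetical example built from the identifiers listed above):

```python
def select_codebook(codebooks, mv, coding_type):
    """Pick the active codebook from a supplemental identifier (FIGS. 13-14).
    codebooks:   dict mapping (mv_range, coding_type) -> codebook.
    mv:          integer motion vector (dx, dy).
    coding_type: "P" or "B"."""
    mv_range = "small" if max(abs(mv[0]), abs(mv[1])) <= 8 else "large"
    return codebooks[(mv_range, coding_type)]
```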
  • different codebooks may be generated from discrete sets of training sequences.
  • the training sequences may be selected to have consistent video characteristics within the feature set, such as speeds of motion, complexity of detail and/or other parameters.
  • separate codebooks may be constructed for each value or range of values of the feature set.
  • Features in the feature set, or an approximation thereto, may be either coded and transmitted or, alternatively, derived from coded video data as it is received at the decoder.
  • the encoder and decoder will store common sets of codebooks, each tailored to characteristics of the training sequences from which they were derived.
  • the characteristics of input video data may be measured and compared to the characteristics that were stored from the training sequences.
  • the encoder and decoder may select a codebook that corresponds to the measured characteristics of the input video data to recover the filter to be used in motion compensation.
  • an encoder may construct separate codebooks arbitrarily and switch among the codebooks by including an express codebook specifier in the channel data.
  • an encoder may toggle between two modes of operation: a first mode in which motion vectors may be coded as fractional values and a default interpolation filter is used for predicted mcblocks, and a second mode in which motion vectors are coded as integer distances and the vector coded interpolation filters of the foregoing embodiments are used.
  • both units may build a new interpolation filter from the fractional motion vectors and characteristics of the default interpolation filter and store it in the codebook. In this manner, if an encoder determines that more accurate interpolation is achieved via the increased bit rate of fractional motion vectors, the resultant interpolation filter may be stored in the codebook for future use if the interpolation were needed again.
  • FIG. 16 illustrates a decoding method 600 according to an embodiment of the present invention.
  • the method 600 may be repeated for each coded mcblock received by a decoder from a channel for which integer motion vectors are provided.
  • a decoder may retrieve parameters of an interpolation filter from a local codebook based on an index received in the channel data for the coded mcblock (box 610).
  • the decoder further may retrieve data of a reference mcblock based on a motion vector received from the channel for the coded mcblock (box 620 ).
  • the decoder may retrieve data in excess of a mcblock; for example, the decoder may retrieve pixels adjacent to the mcblock's boundary based on the size of the filter.
  • the method may apply the interpolation filter to the retrieved reference mcblock data (box 630 ) and decode the coded mcblock by motion compensation using the filtered reference mcblock as a prediction reference (box 640 ).
  • interpolation filters are designed by minimizing the mean square error between the current mcblocks and their corresponding reference mcblocks over each frame or part of a frame.
  • the interpolation filters may be designed to minimize the mean square error between filtered current mcblocks and their corresponding reference mcblocks over each frame or part of a frame.
  • the filters used to filter the current mcblocks need not be standardized or known to the decoder. They may adapt to parameters such as those mentioned above, or to others unknown to the decoder such as level of noise in the incoming video.
  • encoders and decoder may generate interpolation codebooks independently but in synchronized fashion.
  • Codebook managers (not shown) within the encoder and decoder each may generate interpolation filters using pixel values of decoded frames and may revise their codebooks based on the interpolation filters obtained thereby. If a newly-generated filter is stored to a codebook, the filter may be referenced during coding of later-received video data. Such an embodiment can avoid transmission of filter parameter data over the channel and thereby conserve bandwidth for other coding processes.
  • FIG. 17 illustrates a method operable at a coder, according to an embodiment of the present invention.
  • the encoder may initialize a codebook (box 710 ), for example, by opening a default codebook on start up or by building the codebook from set(s) of training video. Further, it is permissible to initialize the codebook to a null state (no codebook); in this latter case, the codebook may begin as empty but will build quickly during runtime coding.
  • the encoder may code and decode a new mcblock (box 720 ) as described in the foregoing embodiments and, as part of this process, the encoder may search the codebook for a best-matching interpolation filter (boxes 720 , 730 ). The encoder may transmit its codebook identifier to the decoder, along with the coded video data (box 740 ).
  • the method 700 further may compute parameters of an ideal interpolation filter for the coded mcblock, using the decoded mcblock data as a reference (box 750 ).
  • the method may search the codebook for an already-stored codebook entry that best matches the ideal filter (box 760). If the difference between the two filters exceeds a predetermined error threshold, the encoder may store characteristics of the new filter to the codebook (box 770). If the differences do not exceed the error threshold, the encoder may conclude operation for the current mcblock. The encoder need not transmit parameter data of the new filter to the decoder.
  • FIG. 18 illustrates a method 800 operable at a decoder according to an embodiment.
  • the decoder may initialize a locally-stored codebook to a same condition as the encoder's codebook (box 810 ).
  • the codebook may be opened from a default codebook, may be built using common training sequences as used at the encoder or may be opened as an “empty” codebook.
  • the method 800 may decode coded mcblocks received from the encoder, using integer motion vectors, reference frame indices and other coding parameters contained in the channel. As part of this process, the method 800 may retrieve and apply interpolation filter parameter data from the codebook as identified by the encoder (boxes 820 , 830 ). Recovered video data obtained from the decoding and filtering may be output from the decoder for rendering and may be stored in the reference picture cache for use in decoding subsequently-received frames.
  • the method 800 further may compute parameters of an ideal interpolation filter for the coded mcblock, using the decoded mcblock as a basis for comparison with the predicted mcblock (box 840). Since the decoded mcblock data obtained at the decoder should be identical to the decoded mcblock data at the encoder, the encoder and decoder should obtain identical filters.
  • the method may search the codebook for an already-stored codebook entry that best matches the ideal filter (box 850). If the differences between the two filters exceed a predetermined error threshold, the method 800 may store characteristics of the new filter to the codebook (box 860). If the differences do not exceed the error threshold, the method 800 may conclude operation for the current mcblock.
  • the decoder need not communicate with the encoder to revise its copy of the codebook.
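  • the essential property is that both sides run the same deterministic update on the same decoded data; a sketch follows (reusing wiener_filter from the codebook-construction sketch earlier; the array layout is our assumption):

```python
import numpy as np

def maybe_update_codebook(codebook, decoded_targets, decoded_supports,
                          err_threshold):
    """Shared update rule of FIGS. 17/18: derive an ideal filter from decoded
    pixels (identical at encoder and decoder) and append it only when no
    stored entry is close enough. No filter data crosses the channel, so the
    two codebooks stay in lockstep as long as decoding is error-free."""
    f = wiener_filter(decoded_targets, decoded_supports)
    book = codebook.reshape(-1, f.size)
    if book.shape[0] == 0 or ((book - f) ** 2).sum(axis=1).min() > err_threshold:
        book = np.vstack([book, f])
    return book
```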
  • decoding operations often include deblocking filters included within a synchronized coding loop.
  • calculations of new filters may be performed using decoded, deblocked video.
  • calculation of new filters may be performed using decoded but non-deblocked video.
  • although the encoder and decoder need not communicate with each other, improved performance may be obtained by other embodiments of the present invention, which involve low-bandwidth communication between the encoder and decoder.
  • the encoder may include a flag with coded mcblock data that indicates to the decoder whether to use deblocked decoded data or non-deblocked decoded data when computing filter parameters.
  • the encoder may calculate characteristics of an ideal interpolation filter using both types of source data—first, using deblocked, decoded data and second, using non-deblocked, decoded data.
  • the encoder further may identify, with reference to the source video data, which of the two filters generates the least error.
  • the encoder may set a short flag, possibly a single bit, in the channel data to identify to the decoder which process is to be used during decode.
  • The embodiments of FIGS. 17 and 18 provide encoders and decoders that concurrently maintain codebooks based on coded video data without requiring express communication to synchronize these efforts. Although these embodiments are bandwidth-efficient, improper operation may arise in the event of channel errors that corrupt data received at the decoder. In the event of such an error, decoded mcblock data may be corrupt or unavailable. Thus, the decoder's codebook may lose synchronization with the encoder's codebook.
  • Embodiments of the present invention provide for exchange of an error mitigation protocol to reduce errors that might arise from communication errors.
  • an encoder and decoder may operate under a codebook management policy in which the encoder and decoder each reset the codebook to a predetermined state at predetermined intervals. For example, they may reset their codebooks to the initialized state at such intervals (boxes 710 , 810 ).
  • the encoder and decoder may purge codebook entries associated with predetermined types of coded data, retaining other types.
  • encoders and decoders that use Long Term Reference frames (LTRs) as part of their coding protocols may purge codebook entries associated with all frames except LTR frames.
  • encoders receive acknowledgment messages from decoders indicating that LTR frames were received and successfully decoded.
  • an encoder may command a decoder to reset its codebook by placing an appropriate signal in the channel data.
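  • a purge keyed to LTR frames might be sketched as follows (tagging each learned entry with its source frame is our assumption):

```python
def purge_codebook(entries, acked_ltr_frames):
    """Error-mitigation sketch: keep only codebook entries derived from
    acknowledged Long Term Reference frames, so an unacknowledged (possibly
    corrupted) frame can never leave encoder and decoder codebooks divergent.
    entries: list of (filter, source_frame_id) pairs."""
    return [(f, fid) for (f, fid) in entries if fid in acked_ltr_frames]
```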
  • Codebook management messages are likely to be more bandwidth-efficient than other systems in which the encoder and decoder exchange data expressly defining filter parameters.
  • the interpolation filters may be calculated as an N×N Wiener interpolation filter constructed for the decoded mcblock by computing cross-correlation matrices and auto-correlation matrices over the mcblock.
  • the cross-correlation matrices and auto-correlation matrices may be averaged over a larger surrounding area that has similar motion and detail as the mcblock.
  • the interpolation filter may be a rectangular interpolation filter or a circularly-shaped Wiener interpolation filter.
  • select filter coefficients may be forced to be equal to other filter coefficients. This embodiment can simplify the calculation of Wiener filters.
  • the vector Q_dec may take the form Q_dec = [q_1, q_2, …, q_N]^T, an N×1 column vector, where q_1 to q_N represent pixels in or near the translated reference mcblock to be used in the prediction of p_dec.
  • R is an N×1 cross-correlation matrix derived from decoded pixels (p_dec) and their corresponding Q_dec vectors; its element r_i at each location i may be derived as p_dec·q_i averaged over the pixels p_dec in the mcblock.
  • S is an N×N auto-correlation matrix derived from the N×1 vectors Q_dec; its element s_{i,j} at each location (i, j) may be derived as q_i·q_j averaged over the pixels p_dec in the mcblock.
  • the cross-correlation matrices and auto-correlation matrices may be averaged over a larger surrounding area having similar motion and detail as the mcblock.
  • Derivation of the S and R matrices occurs for each mcblock being coded. Accordingly, derivation of the Wiener filters involves substantial computational resources at an encoder. According to this embodiment, select filter coefficients in the F matrix may be forced to be equal to each other, which reduces the size of F and, as a consequence, reduces the computational burden at the encoder.
  • suppose, for example, that filter coefficients f_1 and f_2 are set to be equal to each other.
  • the F and Q_dec matrices may then be modified as F′ = [f_1, f_3, …, f_N]^T and Q′_dec = [q_1 + q_2, q_3, …, q_N]^T, since f_1·q_1 + f_2·q_2 = f_1·(q_1 + q_2) when f_1 = f_2.
  • Deletion of the single coefficient reduces the size of F and Q_dec both to (N-1)×1. Deletion of other filter coefficients in F and consolidation of values in Q_dec can result in further reductions to the sizes of the F and Q_dec vectors. For example, it often is advantageous to delete filter coefficients at all positions (save one) that are equidistant from the decoded pixel p_dec. In this manner, derivation of the F matrix is simplified.
  • encoders and decoders may store separate codebooks that are indexed not only by the filter index but also by supplemental identifiers ( FIG. 13 ).
  • the supplemental identifiers may select one of the codebooks as being active and the index may select an entry from within the codebook to be output to the interpolation filter.
  • the decoder may use the supplemental identifiers both to read codebook entries during filtering of decoded data and to build new entries in the codebooks.
  • a block's motion vector may serve as the supplemental identifier.
  • separate codebooks may be provided for each motion vector value or for different ranges of integer motion vectors. Then in operation, given the value of integer motion vector and reference frame index, the encoder and decoder both may use the corresponding codebook to recover the filter to be used in motion compensation.
  • separate codebooks may be provided for different values or ranges of values of deblocking filters present in the current or reference frame. Then in operation, given the values of the deblocking filters, the encoder and decoder use the corresponding codebook to recover the filter to be used in motion compensation.
  • separate codebooks may be provided for different values or ranges of values of other codec parameters such as pixel aspect ratio and bit rate. Then in operation, given the values of these other codec parameters, the encoder and decoder use the corresponding codebook to recover the filter to be used in motion compensation.
  • separate codebooks may be provided for P-frames and B-frames or, alternatively, for coding types (P- or B-coding) applied to each mcblock.
  • an encoder and decoder may switch among codebooks in response to scene changes detected within the video data.
  • the encoder and decoder may execute a common scene change algorithm independently of each other, each system detecting scene changes from image content of the decoded video data.
  • the encoder may execute the scene change detection algorithm and include signals within the channel data identifying scene changes to a decoder.
  • the encoder and decoder may reset their codebooks when a scene change occurs.
  • Although FIG. 8 illustrates the components of the block-based coding chain 110 and prediction unit 120 as separate units, in one or more embodiments some or all of them may be integrated; they need not be separate units. Such implementation details are immaterial to the operation of the present invention unless otherwise noted above.

Abstract

The present disclosure describes use of dynamically assignable interpolation filters as part of motion compensated prediction. An encoder and a decoder each may store common codebooks that define a variety of interpolation filters that may be applied to predicted video data. During runtime coding, an encoder calculates characteristics of an ideal interpolation filter to be applied to a reference block that would minimize prediction error when the reference block is used to predict an input block of video data. Once the characteristics of the ideal filter are identified, the encoder may search its local codebook to find a filter that best matches the ideal filter. The encoder may filter the reference block by the best matching filter stored in the codebook as it codes the input block. The encoder also may transmit an identifier of the best matching filter to a decoder, which will use the interpolation filter on a predicted block as it decodes coded data for the block. The encoder and decoder may build and maintain their codebooks independently of each other but in synchronism. The encoder and decoder may use decoded pixel block data as source data for calculation of interpolation filters.

Description

BACKGROUND
The present invention relates to video coding and, more particularly, to video coding system using interpolation filters as part of motion-compensated coding.
Video codecs typically code video frames using a discrete cosine transform (“DCT”) on blocks of pixels, called “pixel blocks” herein, much the same as used for the original JPEG coder for still images. An initial frame (called an “intra” frame) is coded and transmitted as an independent frame. Subsequent frames, which are modeled as changing slowly due to small motions of objects in the scene, are coded efficiently in the inter mode using a technique called motion compensation (“MC”) in which the displacements of pixel blocks from their positions in previously-coded frames are transmitted as motion vectors together with a coded representation of a difference between a predicted pixel block and a pixel block from the source image.
A brief review of motion compensation is provided below. FIGS. 1 and 2 show a block diagram of a motion-compensated image coder/decoder system. The system combines transform coding (in the form of the DCT of pixel blocks) with predictive coding (in the form of differential pulse coded modulation (“DPCM”)) in order to reduce storage and computation of the compressed image, and at the same time to give a high degree of compression and adaptability. Since motion compensation is difficult to perform in the transform domain, the first step in the interframe coder is to create a motion compensated prediction error. This computation requires one or more frame stores in both the encoder and decoder. The resulting error signal is transformed using a DCT, quantized by an adaptive quantizer, entropy encoded using a variable length coder (“VLC”) and buffered for transmission over a channel.
The way that the motion estimator works is illustrated in FIG. 3. In its simplest form the current frame is partitioned into motion compensation blocks, called “mcblocks” herein, of constant size, e.g., 16×16 or 8×8. However, variable size mcblocks are often used, especially in newer codecs such as H.264 (ITU-T Recommendation H.264, Advanced Video Coding). Indeed, nonrectangular mcblocks have also been studied and proposed. Mcblocks are generally larger than or equal to pixel blocks in size.
Again, in the simplest form of motion compensation, the previous decoded frame is used as the reference frame, as shown in FIG. 3. However, one of many possible reference frames may also be used, especially in newer codecs such as H.264. In fact, with appropriate signaling, a different reference frame may be used for each mcblock.
Each mcblock in the current frame is compared with a set of displaced mcblocks in the reference frame to determine which one best predicts the current mcblock. When the best matching mcblock is found, a motion vector is determined that specifies the displacement of the reference mcblock.
Exploiting Spatial Redundancy
Because video is a sequence of still images, it is possible to achieve some compression using techniques similar to JPEG. Such methods of compression are called intraframe coding techniques, where each frame of video is individually and independently compressed or encoded. Intraframe coding exploits the spatial redundancy that exists between adjacent pixels of a frame. Frames coded using only intraframe coding are called “I-frames”.
Exploiting Temporal Redundancy
In the unidirectional motion estimation described above, called “forward prediction”, a target mcblock in the frame to be encoded is matched with a set of mcblocks of the same size in a past frame called the “reference frame”. The mcblock in the reference frame that “best matches” the target mcblock is used as the reference mcblock. The prediction error is then computed as the difference between the target mcblock and the reference mcblock. Prediction mcblocks do not, in general, align with coded mcblock boundaries in the reference frame. The position of this best-matching reference mcblock is indicated by a motion vector that describes the displacement between it and the target mcblock. The motion vector information is also encoded and transmitted along with the prediction error. Frames coded using forward prediction are called “P-frames”.
The prediction error itself is transmitted using the DCT-based intraframe encoding technique summarized above.
Bidirectional Temporal Prediction
Bidirectional temporal prediction, also called “Motion-Compensated Interpolation”, is a key feature of modern video codecs. Frames coded with bidirectional prediction use two reference frames, typically one in the past and one in the future. However, two of many possible reference frames may also be used, especially in newer codecs such as H.264. In fact, with appropriate signaling, different reference frames may be used for each mcblock.
A target mcblock in bidirectionally-coded frames can be predicted by a mcblock from the past reference frame (forward prediction), or one from the future reference frame (backward prediction), or by an average of two mcblocks, one from each reference frame (interpolation). In every case, a prediction mcblock from a reference frame is associated with a motion vector, so that up to two motion vectors per mcblock may be used with bidirectional prediction. Motion-Compensated Interpolation for a mcblock in a bidirectionally-predicted frame is illustrated in FIG. 4. Frames coded using bidirectional prediction are called “B-frames”.
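As a small illustration of the three prediction modes, the following sketch assumes `fwd` and `bwd` are reference mcblocks already fetched at their motion-vector displacements; the names and rounding rule are assumptions of the example.

```python
import numpy as np

def predict_b_mcblock(fwd=None, bwd=None):
    """Forward-only, backward-only, or interpolated (averaged) prediction."""
    if fwd is not None and bwd is not None:
        # motion-compensated interpolation: rounded average of the two references
        return ((fwd.astype(np.int32) + bwd.astype(np.int32) + 1) // 2).astype(np.uint8)
    return fwd if fwd is not None else bwd
```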
Bidirectional prediction provides a number of advantages. The primary one is that the compression obtained is typically higher than can be obtained from forward (unidirectional) prediction alone. To obtain the same picture quality, bidirectionally-predicted frames can be encoded with fewer bits than frames using only forward prediction.
However, bidirectional prediction does introduce extra delay in the encoding process, because frames must be encoded out of sequence. Further, it entails extra encoding complexity because mcblock matching (the most computationally intensive encoding procedure) has to be performed twice for each target mcblock, once with the past reference frame and once with the future reference frame.
Typical Encoder Architecture for Bidirectional Prediction
FIG. 5 shows a typical bidirectional video encoder. It is assumed that frame reordering takes place before coding, i.e., I- or P-frames used for B-frame prediction must be coded and transmitted before any of the corresponding B-frames. In this codec, B-frames are not used as reference frames. With a change of architecture, they could be used as reference frames, as in H.264.
Input video is fed to a Motion Compensation Estimator/Predictor that feeds a prediction to the minus input of the subtractor. For each mcblock, the Inter/Intra Classifier then compares the input pixels with the prediction error output of the subtractor. Typically, if the mean square prediction error exceeds the mean square pixel value, the mcblock is coded as intra. More complicated comparisons involving the DCT of both the pixels and the prediction error yield somewhat better performance, but are not usually deemed worth the cost.
For intra mcblocks the prediction is set to zero. Otherwise, it comes from the Predictor, as described above. The prediction error is then passed through the DCT and quantizer before being coded, multiplexed and sent to the Buffer.
Quantized levels are converted to reconstructed DCT coefficients by the Inverse Quantizer and then inverse transformed by the inverse DCT unit ("IDCT") to produce a coded prediction error. The Adder adds the prediction to the prediction error and clips the result, e.g., to the range 0 to 255, to produce coded pixel values.
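The Adder stage amounts to a saturating add, sketched below; the 8-bit range is the example from the text.

```python
import numpy as np

def reconstruct(prediction, coded_error):
    """Add the coded prediction error to the prediction and clip to 0..255."""
    return np.clip(prediction.astype(np.int32) + coded_error, 0, 255).astype(np.uint8)
```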
For B-frames, the Motion Compensation Estimator/Predictor uses both the previous frame and the future frame kept in picture stores.
For I- and P-frames, the coded pixels output by the Adder are written to the Next Picture Store, while at the same time the old pixels are copied from the Next Picture Store to the Previous Picture Store. In practice, this is usually accomplished by a simple change of memory addresses.
Also, in practice the coded pixels may be filtered by an adaptive deblocking filter prior to entering the picture stores. This improves the motion compensation prediction, especially for low bit rates where coding artifacts may become visible.
The Coding Statistics Processor in conjunction with the Quantizer Adapter controls the output bit-rate and optimizes the picture quality as much as possible.
Typical Decoder Architecture for Bidirectional Prediction
FIG. 6 shows a typical bidirectional video decoder. It has a structure corresponding to the pixel reconstruction portion of the encoder, using inverting processes. It is assumed that frame reordering takes place after decoding and before video output. The interpolation filter might be placed at the output of the motion compensated predictor, as in the encoder.
Fractional Motion Vector Displacements
FIG. 3 and FIG. 4 show reference mcblocks in reference frames as being displaced vertically and horizontally with respect to the position of the current mcblock being decoded in the current frame. The amount of the displacement is represented by a two-dimensional vector [dx, dy], called the motion vector. Motion vectors may be coded and transmitted, or they may be estimated from information already in the decoder, in which case they are not transmitted. For bidirectional prediction, each transmitted mcblock requires two motion vectors.
In its simplest form, dx and dy are signed integers representing the number of pixels horizontally and the number of lines vertically to displace the reference mcblock. In this case, reference mcblocks are obtained merely by reading the appropriate pixels from the reference stores.
However, in newer video codecs it has been found beneficial to allow fractional values for dx and dy. Typically, they allow displacement accuracy down to a quarter pixel, i.e., an integer plus 0.25, 0.5 or 0.75.
Fractional motion vectors require more than simply reading pixels from reference stores. In order to obtain reference mcblock values for locations between the reference store pixels, it is necessary to interpolate between them.
Simple bilinear interpolation can work fairly well. However, in practice it has been found beneficial to use two-dimensional interpolation filters especially designed for this purpose. In fact, for reasons of performance and practicality, the filters are often not shift-invariant. Instead, different values of fractional motion vectors may utilize different interpolation filters.
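For reference, a minimal bilinear interpolator for fractional displacements is sketched below; the integer part of the motion vector is assumed already applied, and boundary handling is omitted. Names and the block size are illustrative.

```python
import numpy as np

def bilinear_fetch(ref, y0, x0, fy, fx, size=16):
    """Fetch a size x size reference block at fractional offsets fy, fx
    (each in {0.0, 0.25, 0.5, 0.75}) by bilinear interpolation."""
    a = ref[y0:y0+size,     x0:x0+size    ].astype(np.float64)  # top-left neighbors
    b = ref[y0:y0+size,     x0+1:x0+size+1].astype(np.float64)  # top-right
    c = ref[y0+1:y0+size+1, x0:x0+size    ].astype(np.float64)  # bottom-left
    d = ref[y0+1:y0+size+1, x0+1:x0+size+1].astype(np.float64)  # bottom-right
    return (1-fy)*(1-fx)*a + (1-fy)*fx*b + fy*(1-fx)*c + fy*fx*d
```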
Motion Compensation Using Adaptive Interpolation Filters
The optimum motion compensation interpolation filter depends on a number of factors. For example, objects in a scene may not be moving in pure translation. There may be object rotation, both in two dimensions and three dimensions. Other factors include zooming, camera motion and lighting variations caused by shadows, or varying illumination.
Camera characteristics may vary due to special properties of their sensors. For example, many consumer cameras are intrinsically interlaced, and their output may be de-interlaced and filtered to provide pleasing-looking pictures free of interlacing artifacts. Low light conditions may cause an increased exposure time per frame, leading to motion dependent blur of moving objects. Pixels may be non-square.
Thus, in many cases improved performance can be had if the motion compensation interpolation filter can adapt to these and other outside factors. In such systems interpolation filters may be designed by minimizing the mean square error between the current mcblocks and their corresponding reference mcblocks over each frame. These are the so-called Wiener filters. The filter coefficients would then be quantized and transmitted at the beginning of each frame to be used in the actual motion compensated coding.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of a conventional video coder.
FIG. 2 is a block diagram of a conventional video decoder.
FIG. 3 illustrates principles of motion compensated prediction.
FIG. 4 illustrates principles of bidirectional temporal prediction.
FIG. 5 is a block diagram of a conventional bidirectional video coder.
FIG. 6 is a block diagram of a conventional bidirectional video decoder.
FIG. 7 illustrates an encoder/decoder system suitable for use with embodiments of the present invention.
FIG. 8 is a simplified block diagram of a video encoder according to an embodiment of the present invention.
FIG. 9 illustrates a method according to an embodiment of the present invention.
FIG. 10 illustrates a method according to another embodiment of the present invention.
FIG. 11 is a simplified block diagram of a video decoder according to an embodiment of the present invention.
FIG. 12 illustrates a method according to a further embodiment of the present invention.
FIG. 13 illustrates a codebook architecture according to an embodiment of the present invention.
FIG. 14 illustrates a codebook architecture according to another embodiment of the present invention.
FIG. 15 illustrates a codebook architecture according to a further embodiment of the present invention.
FIG. 16 illustrates a decoding method according to an embodiment of the present invention.
FIG. 17 illustrates a method according to an embodiment of the present invention.
FIG. 18 illustrates another method according to an embodiment of the present invention.
DETAILED DESCRIPTION
Embodiments of the present invention provide a video coder/decoder system that uses dynamically assignable interpolation filters as part of motion compensated prediction. An encoder and a decoder each may store common codebooks that define a variety of interpolation filters that may be applied to predicted video data. During runtime coding, an encoder calculates characteristics of an ideal interpolation filter to be applied to a reference block that would minimize prediction error when the reference block is used to predict an input block of video data. Once the characteristics of the ideal filter are identified, the encoder may search its local codebook to find a filter that best matches the ideal filter. The encoder may filter the reference block by the best-matching filter stored in the codebook as it codes the input block. The encoder also may transmit an identifier of the best-matching filter to a decoder, which will use the interpolation filter on a predicted block as it decodes coded data for the block.
Motion Compensation Using Vector Quantized Interpolation Filters—VQIF
Improved codec performance can be achieved if an interpolation filter can be adapted to each mcblock. However, transmitting a filter per mcblock is usually too expensive. Accordingly, embodiments of the present invention propose to use a codebook of filters and send an index into the codebook for each mcblock.
Embodiments of the present invention provide a method of building and applying filter codebooks between an encoder and a decoder (FIG. 7). FIG. 8 illustrates a simplified block diagram of an encoder system showing operation of the interpolation filter. FIG. 9 illustrates a method of building a codebook according to an embodiment of the present invention. FIG. 10 illustrates a method of using a codebook during runtime coding and decoding according to an embodiment of the present invention. FIG. 11 illustrates a simplified block diagram of a decoder showing operation of the interpolation filter and consumption of the codebook indices.
FIG. 8 is a simplified block diagram of an encoder suitable for use with the present invention. The encoder 100 may include a block-based coding chain 110 and a prediction unit 120.
The block coding chain 110 may include a subtractor 112, a transform unit 114, a quantizer 116 and a variable length coder 118. The subtractor 112 may receive an input mcblock from a source image and a predicted mcblock from the prediction unit 120. It may subtract the predicted mcblock from the input mcblock, generating a block of pixel residuals. The transform unit 114 may convert the mcblock's residual data to an array of transform coefficients according to a spatial transform, typically a discrete cosine transform ("DCT") or a wavelet transform. The quantizer 116 may truncate transform coefficients of each block according to a quantization parameter ("QP"). The QP values used for truncation may be transmitted to a decoder in a channel. The variable length coder 118 may code the quantized coefficients according to an entropy coding algorithm, for example, a variable length coding algorithm. Following variable length coding, the coded data of each mcblock may be stored in a buffer 140 to await transmission to a decoder via a channel.
The prediction unit 120 may include: an inverse quantization unit 122, an inverse transform unit 124, an adder 126, a reference picture cache 128, a motion estimator 130, a motion compensated predictor 132, an interpolation filter 134 and a codebook 136. The inverse quantization unit 122 may dequantize coded video data according to the QP used by the quantizer 116. The inverse transform unit 124 may transform the dequantized coefficients to the pixel domain. The adder 126 may add pixel residuals output from the inverse transform unit 124 with predicted motion data from the motion compensated predictor 132. The reference picture cache 128 may store recovered frames for use as reference frames during coding of later-received mcblocks. The motion estimator 130 may estimate image motion between a source image being coded and reference frame(s) stored in the reference picture cache. For example, it may select a prediction mode to be used (for example, unidirectional P-coding or bidirectional B-coding), and generate motion vectors for use in such predictive coding. The motion compensated predictor 132 may generate a predicted mcblock for use by the block coder. In this regard, the motion compensated predictor may retrieve stored mcblock data of the selected reference frames. The interpolation filter 134 may filter a predicted mcblock from the motion compensated predictor 132 according to configuration parameters output by the codebook 136. The codebook 136 may store configuration data that defines operation of the interpolation filter 134. Different instances of configuration data are identified by an index into the codebook.
During coding operations, motion vectors, quantization parameters and codebook indices may be output to a channel along with coded mcblock data for decoding by a decoder (not shown).
FIG. 9 illustrates a method according to an embodiment of the present invention. According to the embodiment, a codebook may be constructed by using a large set of training sequences having a variety of detail and motion characteristics. For each mcblock, an integer motion vector and reference frame may be computed according to traditional techniques (box 210). Then, an N×N Wiener interpolation filter may be constructed (box 220) by computing cross-correlation matrices (box 222) and auto-correlation matrices (box 224) between uncoded pixels and coded pixels from the reference picture cache, each averaged over the mcblock. Alternatively, the cross-correlation matrices and auto-correlation matrices may be averaged over a larger surrounding area having similar motion and detail as the mcblock. The interpolation filter may be a rectangular interpolation filter or a circularly-shaped Wiener interpolation filter.
This procedure may produce auto-correlation matrices that are singular, which means that some of the filter coefficients may be chosen arbitrarily. In these cases, the affected coefficients farthest from the center may be chosen to be zero.
The resulting filter may be added to the codebook (box 230). Filters may be added pursuant to vector quantization (“VQ”) clustering techniques, which are designed to either produce a codebook with a desired number of entries or a codebook with a desired accuracy of representation of the filters. Once the codebook is established, it may be transmitted to the decoder (box 240). After transmission, both the encoder and decoder may store a common codebook, which may be referenced during runtime coding operations.
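One common VQ construction for box 230 is Lloyd/k-means clustering over the training filters' coefficient vectors, sketched below with a fixed codebook size; the patent does not mandate this particular algorithm, and the names are illustrative.

```python
import numpy as np

def build_codebook(filters, n_entries, iters=20, seed=0):
    """Cluster training filters (shape: num_filters x N) into n_entries centroids."""
    filters = np.asarray(filters, dtype=np.float64)
    rng = np.random.default_rng(seed)
    book = filters[rng.choice(len(filters), n_entries, replace=False)].copy()
    for _ in range(iters):
        # assign each training filter to its nearest codebook entry
        d = ((filters[:, None, :] - book[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for k in range(n_entries):
            members = filters[labels == k]
            if len(members):
                book[k] = members.mean(axis=0)  # re-center the entry
    return book
```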
Transmission to a decoder may occur in a variety of ways. The codebook may be transmitted periodically to the decoder during encoding operations. Alternatively, the codebook may be coded into the decoder a priori, either from coding operations performed on generic training data or by representation in a coding standard. Other embodiments permit a default codebook to be established in an encoder and decoder but allow the codebook to be updated adaptively by transmissions from the encoder to the decoder.
Indices into the codebook may be variable length coded based on their probability of occurrence, or they may be arithmetically coded.
FIG. 10 illustrates a method for runtime encoding of video, according to an embodiment of the present invention. For each mcblock to be coded, an integer motion vector and reference frame(s) may be computed (box 310), coded and transmitted. Then an N×N Wiener interpolation filter may be constructed for the mcblock (box 320) by computing cross-correlation matrices (box 322) and auto-correlation matrices (box 324) averaged over the mcblock. Alternatively, the cross-correlation matrices and auto-correlation matrices may be averaged over a larger surrounding area that has similar motion and detail as the mcblock. The interpolation filter may be a rectangular interpolation filter or a circularly-shaped Wiener interpolation filter.
Once the interpolation filter is established, the codebook may be searched for a previously-stored filter that best matches the newly-constructed interpolation filter (box 330). The matching algorithm may proceed according to vector quantization search methods. When a matching codebook entry is identified, the encoder may code the resulting index and transmit it to a decoder (box 340).
Optionally, in an adaptive process shown in FIG. 10 in phantom, when an encoder identifies a best-matching filter from the codebook, it may compare the newly-generated interpolation filter with the codebook's filter (box 350). If the differences between the two filters exceed a predetermined error threshold, the encoder may transmit filter characteristics to the decoder, which may cause the decoder to store the characteristics as a new codebook entry (boxes 360-370). If the differences do not exceed the error threshold, the encoder may simply transmit the index of the matching codebook entry (box 340).
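The search of box 330 and the optional update of boxes 350-370 can be summarized as a nearest-neighbor lookup plus a threshold test; the squared-error distance and the return convention below are assumptions of the sketch.

```python
import numpy as np

def code_filter(codebook, filt, threshold):
    """Return the updated codebook and either an index or a new-filter message."""
    dists = ((codebook - filt) ** 2).sum(axis=1)
    idx = int(dists.argmin())
    if dists[idx] > threshold:
        codebook = np.vstack([codebook, filt])   # decoder appends the same entry
        return codebook, ("NEW_FILTER", filt)
    return codebook, ("INDEX", idx)
```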
The decoder that receives the integer motion vector, reference frame index and VQ interpolation filter index may use this data to perform motion compensation.
FIG. 11 is a simplified block diagram of a decoder 400 according to an embodiment of the present invention. The decoder 400 may include a block-based decoder 402 that may include a variable length decoder 410, an inverse quantizer 420, an inverse transform unit 430 and an adder 440. The decoder 400 further may include a prediction unit 404 that may include a reference picture cache 450, a motion compensated predictor 460, a codebook 470 and an interpolation filter 480. The prediction unit 404 may generate a predicted pixel block in response to motion compensation data, such as motion vectors and codebook indices received from a channel. The block-based decoder 402 may decode coded pixel block data with reference to the predicted pixel block data to recover pixel data of the pixel blocks.
Specifically, the coded video data may include motion vectors and codebook indices that govern operation of the prediction unit 404. The reference picture cache 450 may store recovered image data of previously decoded frames that were identified as candidates for prediction of later-received coded video data (e.g., decoded I- or P-frames). The motion compensated predictor 460, responsive to mcblock motion vector data, may retrieve a reference mcblock from identified frame(s) stored in the reference picture cache. Typically, a single reference mcblock is retrieved when decoding a P-coded block and a pair of reference mcblocks are retrieved and averaged together when decoding a B-coded block. The motion compensated predictor 460 may output the resultant mcblock and, optionally, pixels located near to the reference mcblocks, to the interpolation filter 480.
The codebook 470 may supply filter parameter data to the interpolation filter 480 in response to a codebook index received from the channel data associated with the mcblock being decoded. The codebook 470 may be provisioned as storage and control logic to store filter parameter data and output the selected parameter data in response to a codebook index. The interpolation filter 480 may filter the predicted mcblock based on parameter data applied by the codebook 470. The output of the interpolation filter 480 may be input to the block-based decoder 402.
With respect to the block-based decoder 402, the coded video data may include coded residual coefficients that have been entropy coded. A variable length decoder 410 may decode data received from a channel buffer according to an entropy coding process to recover quantized coefficients therefrom. The inverse quantizer 420 may multiply coefficient data received from the variable length decoder 410 by a quantization parameter received in the channel data. The inverse quantizer 420 may output recovered coefficient data to the inverse transform unit 430. The inverse transform unit 430 may transform dequantized coefficient data received from the inverse quantizer 420 to pixel data. The inverse transform unit 430, as its name implies, performs the converse of transform operations performed by the transform unit of an encoder (e.g., DCT or wavelet transforms). An adder 440 may add, on a pixel-by-pixel basis, pixel residual data obtained by the inverse transform unit 430 with predicted pixel data obtained from the prediction unit 404. The adder 440 may output recovered mcblock data, from which a recovered frame may be constructed and rendered at a display device (not shown).
FIG. 12 illustrates a method according to another embodiment of the present invention. For each mcblock, an integer motion vector and reference frame may be computed according to traditional techniques (box 510). Then, an N×N Wiener interpolation filter may be selected by serially determining prediction results that would be obtained by each filter stored in the codebook (box 520). Specifically, for each mcblock, the method may perform filtering operations on a predicted block using either all or a subset of the filters in succession (box 522) and estimate a prediction residual obtained from each codebook filter (box 524). The method may determine which filter configuration gives the best prediction (box 530). The index of that filter may be coded and transmitted to a decoder (box 540). This embodiment conserves processing resources that otherwise might be spent computing Wiener filters for each source mcblock.
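The selection loop of boxes 522-530 might look as follows, with 2-D convolution standing in for application of a stored filter kernel; the scipy call and the residual-energy criterion are assumptions of the sketch.

```python
import numpy as np
from scipy.signal import convolve2d

def apply_filter(pred, kernel):
    """Filter a predicted mcblock with a stored interpolation kernel."""
    return convolve2d(pred, kernel, mode="same", boundary="symm")

def select_filter(target, pred, codebook_kernels):
    """Return the codebook index giving the smallest residual energy."""
    errs = [((target - apply_filter(pred, k)) ** 2).sum() for k in codebook_kernels]
    return int(np.argmin(errs))
```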
Simplifying Calculation of Wiener Filters
In another embodiment, select filter coefficients may be forced to be equal to other filter coefficients. This embodiment can simplify the calculation of Wiener filters.
Derivation of a Wiener filter for a mcblock involves derivation of an ideal N×1 filter F according to:
F = S⁻¹R

that minimizes the mean squared prediction error. For each pixel p in the mcblock, the filter F yields a predicted pixel p̂ according to p̂ = Fᵀ·Qp, with a prediction error err = p − p̂.
More specifically, for each pixel p, the vector Qp may take the form:

Qp = [q1, q2, …, qN]ᵀ,

where q1 to qN represent pixels in or near the translated reference mcblock to be used in the prediction of p.
In the foregoing, R is an N×1 cross-correlation matrix derived from uncoded pixels (p) to be coded and their corresponding Qp vectors. In the R matrix, ri at each location i may be derived as p·qi averaged over the pixels p in the mcblock. S is an N×N auto-correlation matrix derived from the N×1 vectors Qp. In the S matrix, si,j at each location i,j may be derived as qi·qj averaged over the pixels p in the mcblock. Alternatively, the cross-correlation matrices and auto-correlation matrices may be averaged over a larger surrounding area having similar motion and detail as the mcblock.
Derivation of the S and R matrices occurs for each mcblock being coded. Accordingly, derivation of the Wiener filters involves substantial computational resources at an encoder. According to this embodiment, select filter coefficients in the F matrix may be forced to be equal to each other, which reduces the size of F and, as a consequence, reduces the computational burden at the encoder. Consider an example where filter coefficients f1 and f2 are set to be equal to each other. In this embodiment, the F and Qp matrices may be modified as:

F = [f1, f3, …, fN]ᵀ and Qp = [q1+q2, q3, …, qN]ᵀ.

Deletion of the single coefficient reduces the sizes of F and Qp both to (N−1)×1. Deletion of other filter coefficients in F and consolidation of values in Qp can result in further reductions to the sizes of the F and Qp vectors. For example, it often is advantageous to tie together filter coefficients at all but one of the positions that are equidistant from the pixel p. In this manner, derivation of the F matrix is simplified.
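A numpy sketch of the resulting computation follows. It is illustrative only: tying f1 = f2 is expressed by summing q1 and q2 into a single tap of Qp before accumulating S and R, and a least-squares solve stands in for the matrix inverse so that the singular S matrices noted earlier do not cause a failure.

```python
import numpy as np

def wiener_filter(samples):
    """samples: list of (p, Qp) pairs, where Qp is the N-vector of reference
    pixels for pixel p, with any tied taps already consolidated (e.g., q1+q2)."""
    N = len(samples[0][1])
    S = np.zeros((N, N))   # auto-correlation, averaged over the mcblock
    R = np.zeros(N)        # cross-correlation, averaged over the mcblock
    for p, Qp in samples:
        Qp = np.asarray(Qp, dtype=np.float64)
        S += np.outer(Qp, Qp)
        R += p * Qp
    S /= len(samples)
    R /= len(samples)
    F, *_ = np.linalg.lstsq(S, R, rcond=None)  # solves F = S^-1 R, tolerating singular S
    return F
```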
In another embodiment, encoders and decoders may store separate codebooks that are indexed not only by the filter index but also by supplemental identifiers (FIG. 13). In such embodiments, the supplemental identifiers may select one of the codebooks as being active and the index may select an entry from within the codebook to be output to the interpolation filter.
The supplemental identifier may be derived from many sources. In one embodiment, a block's motion vector may serve as the supplemental identifier. Thus, separate codebooks may be provided for each motion vector value or for different ranges of integer motion vectors (FIG. 14). Then in operation, given the value of integer motion vector and reference frame index, the encoder and decoder both may use the corresponding codebook to recover the filter to be used in motion compensation.
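A minimal sketch of such selection keyed by motion vector follows; the bucket width and the dictionary keying are assumptions of the example, and the dictionary is presumed provisioned for every bucket at both ends.

```python
def select_codebook(codebooks, mv, bucket=4):
    """Pick the active codebook for an integer motion vector (dx, dy),
    bucketing by ranges of motion vector values (cf. FIG. 14)."""
    dx, dy = mv
    return codebooks[(dx // bucket, dy // bucket)]
```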
In another embodiment, separate codebooks may be provided for different values or ranges of values of deblocking filters present in the current or reference frame. Then in operation, given the values of the deblocking filters, the encoder and decoder use the corresponding codebook to recover the filter to be used in motion compensation.
In a further embodiment, shown in FIG. 15, separate codebooks may be provided for different values or ranges of values of other codec parameters such as pixel aspect ratio and bit rate. Then in operation, given the values of these other codec parameters, the encoder and decoder use the corresponding codebook to recover the filter to be used in motion compensation.
In another embodiment, separate codebooks may be provided for P-frames and B-frames or, alternatively, for coding types (P- or B-coding) applied to each mcblock.
In a further embodiment, different codebooks may be generated from discrete sets of training sequences. The training sequences may be selected to have consistent video characteristics within the feature set, such as speeds of motion, complexity of detail and/or other parameters. Then separate codebooks may be constructed for each value or range of values of the feature set. Features in the feature set, or an approximation thereto, may be either coded and transmitted or, alternatively, derived from coded video data as it is received at the decoder. Thus, the encoder and decoder will store common sets of codebooks, each tailored to characteristics of the training sequences from which they were derived. In operation, for each mcblock, the characteristics of input video data may be measured and compared to the characteristics that were stored from the training sequences. The encoder and decoder may select a codebook that corresponds to the measured characteristics of the input video data to recover the filter to be used in motion compensation.
In yet another embodiment, an encoder may construct separate codebooks arbitrarily and switch among the codebooks by including an express codebook specifier in the channel data.
Toggling Between Fractional Motion Vectors and Integer Motion Vectors
Use of vector quantized codebooks to select interpolation filters advantageously allows a video coder to select motion vectors that are integers and to avoid the additional data that would be required to code motion vectors as fractions (e.g., half- or quarter-pixel distances). In an embodiment, an encoder may toggle between two modes of operation: a first mode in which motion vectors may be coded as fractional values and a default interpolation filter is used for predicted mcblocks, and a second mode in which motion vectors are coded as integer distances and the vector quantized interpolation filters of the foregoing embodiments are used. Such a system allows an encoder to balance the computational resources needed to perform video coding against the accuracy of prediction.
In such an embodiment, when fractional motion vectors are communicated from an encoder to a decoder, both units may build a new interpolation filter from the fractional motion vectors and characteristics of the default interpolation filter and store it in the codebook. In this manner, if an encoder determines that more accurate interpolation is achieved via the increased bit rate of fractional motion vectors, the resultant interpolation filter may be stored in the codebook for future use if the interpolation were needed again.
FIG. 16 illustrates a decoding method 600 according to an embodiment of the present invention. The method 600 may be repeated for each coded mcblock received by a decoder from a channel for which integer motion vectors are provided. According to the method, a decoder may retrieve parameters of an interpolation filter from a local codebook based on an index received in the channel data for the coded mcblock (box 610). The decoder further may retrieve data of a reference mcblock based on a motion vector received from the channel for the coded mcblock (box 620). As noted, depending on the interpolation filter specified by the codebook index, the decoder may retrieve data in excess of a mcblock; for example, the decoder may retrieve pixels adjacent to the mcblock's boundary based on the size of the filter. The method may apply the interpolation filter to the retrieved reference mcblock data (box 630) and decode the coded mcblock by motion compensation using the filtered reference mcblock as a prediction reference (box 640).
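Boxes 610-630 reduce to a lookup, a padded fetch and a filtering step, sketched below under illustrative assumptions (odd square kernels, in-bounds fetches, scipy convolution; scipy flips the kernel, which is immaterial for the symmetric filters typical here).

```python
import numpy as np
from scipy.signal import convolve2d

def predict_filtered_mcblock(codebook, idx, ref, mv, x, y, size=16):
    """Fetch the reference mcblock plus the extra border pixels the filter
    needs, then apply the interpolation filter selected by the index."""
    kernel = codebook[idx]                 # 2-D filter kernel, odd size assumed
    pad = kernel.shape[0] // 2             # pixels needed past the block edge
    yy, xx = y + mv[1], x + mv[0]
    region = ref[yy-pad:yy+size+pad, xx-pad:xx+size+pad].astype(np.float64)
    return convolve2d(region, kernel, mode="valid")   # exactly size x size output
```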
Minimizing Mean Square Error Between Filtered Current Mcblocks and their Corresponding Reference Mcblocks
Normally, interpolation filters are designed by minimizing the mean square error between the current mcblocks and their corresponding reference mcblocks over each frame or part of a frame. In an embodiment, the interpolation filters may be designed to minimize the mean square error between filtered current mcblocks and their corresponding reference mcblocks over each frame or part of a frame. The filters used to filter the current mcblocks need not be standardized or known to the decoder. They may adapt to parameters such as those mentioned above, or to others unknown to the decoder such as level of noise in the incoming video.
Synchronized, Local Codebook Generation at Encoder and Decoder
In another embodiment, encoders and decoders may generate interpolation codebooks independently but in synchronized fashion. Codebook managers (not shown) within the encoder and decoder each may generate interpolation filters using pixel values of decoded frames and may revise their codebooks based on the interpolation filters obtained thereby. If a newly-generated filter is stored to a codebook, the filter may be referenced during coding of later-received video data. Such an embodiment can avoid transmission of filter parameter data over the channel and thereby conserve bandwidth for other coding processes.
FIG. 17 illustrates a method operable at a coder, according to an embodiment of the present invention. In the method, the encoder may initialize a codebook (box 710), for example, by opening a default codebook on start up or by building the codebook from set(s) of training video. Further, it is permissible to initialize the codebook to a null state (no codebook); in this latter case, the codebook may begin as empty but will build quickly during runtime coding. During runtime, the encoder may code and decode a new mcblock (box 720) as described in the foregoing embodiments and, as part of this process, the encoder may search the codebook for a best-matching interpolation filter (boxes 720, 730). The encoder may transmit its codebook identifier to the decoder, along with the coded video data (box 740).
The method 700 further may compute parameters of an ideal interpolation filter for the coded mcblock, using the decoded mcblock data as a reference (box 750). The method may search the codebook for the already-stored entry that best matches the ideal filter (box 760). If the difference between the two filters exceeds a predetermined error threshold, the encoder may store characteristics of the new filter to the codebook (box 770). If the differences do not exceed the error threshold, the encoder may conclude operation for the current mcblock. The encoder need not transmit parameter data of the new filter to the decoder.
FIG. 18 illustrates a method 800 operable at a decoder according to an embodiment. In the method 800, the decoder may initialize a locally-stored codebook to a same condition as the encoder's codebook (box 810). Again, the codebook may be opened from a default codebook, may be built using common training sequences as used at the encoder or may be opened as an “empty” codebook.
During runtime coding, the method 800 may decode coded mcblocks received from the encoder, using integer motion vectors, reference frame indices and other coding parameters contained in the channel. As part of this process, the method 800 may retrieve and apply interpolation filter parameter data from the codebook as identified by the encoder (boxes 820, 830). Recovered video data obtained from the decoding and filtering may be output from the decoder for rendering and may be stored in the reference picture cache for use in decoding subsequently-received frames.
The method 800 further may compute parameters of an ideal interpolation filter for the coded mcblock, using the decoded mcblock as a basis for comparison with the predicted mcblock (box 840). Since the decoded mcblock data obtained at the decoder should be identical to the decoded mcblock data at the encoder, the encoder and decoder should obtain identical filters. The method may search the codebook for the already-stored entry that best matches the ideal filter (box 850). If the differences between the two filters exceed a predetermined error threshold, the method 800 may store characteristics of the new filter to the codebook (box 860). If the differences do not exceed the error threshold, the method 800 may conclude operation for the current mcblock. The decoder need not communicate with the encoder to revise its copy of the codebook.
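Because both ends run the same computation on identical decoded data, the update rule can be shared verbatim, as in this sketch; the threshold test and squared-error distance are assumptions, and `wiener_filter` stands for the kind of routine sketched earlier.

```python
import numpy as np

def maybe_update_codebook(codebook, decoded_samples, threshold, wiener_filter):
    """Derive the ideal filter from decoded pixels and append it if no stored
    entry is close enough; run identically at encoder and decoder."""
    new_f = wiener_filter(decoded_samples)
    dists = ((codebook - new_f) ** 2).sum(axis=1)
    if dists.min() > threshold:
        codebook = np.vstack([codebook, new_f])  # identical growth at both ends
    return codebook
```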
The foregoing discussion has presented the methods 700 and 800 as operating on decoded mcblock data, which is obtained by a prediction unit in the encoder and the decoder. Such decoding operations often include deblocking filters included within a synchronized coding loop. In an embodiment, calculations of new filters may be performed using decoded, deblocked video. Alternatively, calculation of new filters may be performed using decoded but non-deblocked video.
Although the encoder and decoder need not communicate with each other, improved performance may be obtained by other embodiments of the present invention, which involve low-bandwidth communication between the encoder and decoder. In one embodiment, for example, the encoder may include a flag with coded mcblock data that indicates to the decoder whether to use deblocked decoded data or non-deblocked decoded data when computing filter parameters. In such an embodiment, the encoder may calculate characteristics of an ideal interpolation filter using both types of source data: first, using deblocked, decoded data and second, using non-deblocked, decoded data. The encoder further may identify, with reference to the source video data, which of the two filters generates the least error. The encoder may set a short flag, possibly a single bit, in the channel data to identify to the decoder which process is to be used during decode.
The embodiments of FIGS. 17 and 18 provide encoders and decoders that concurrently maintain codebooks based on coded video data without requiring express communication to synchronize these efforts. Although these embodiments are bandwidth-efficient, improper operation may arise in the event of channel errors that corrupt data received at the decoder. In the event of such an error, decoded mcblock data may be corrupt or unavailable. Thus, the decoder's codebook may lose synchronization with the encoder's codebook.
Embodiments of the present invention provide an error mitigation protocol to reduce errors that might arise from communication errors. For example, an encoder and decoder may operate under a codebook management policy in which the encoder and decoder each reset the codebook to a predetermined state at predetermined intervals. For example, they may reset their codebooks to the initialized state at such intervals (boxes 710, 810). Alternatively, the encoder and decoder may purge codebook entries associated with predetermined types of coded data, retaining other types. For example, encoders and decoders that use Long Term Reference frames ("LTRs") as part of their coding protocols may purge codebook entries associated with all frames except LTR frames. In such protocols, encoders receive acknowledgment messages from decoders indicating that LTR frames were received and successfully decoded. In another embodiment, an encoder may command a decoder to reset its codebook by placing an appropriate signal in the channel data. Codebook management messages are likely to be more bandwidth-efficient than other systems in which the encoder and decoder exchange data expressly defining filter parameters.
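As one concrete instance of such a policy, both ends could reset on a shared mcblock counter; the counter and interval are assumptions of this sketch, and an LTR-keyed purge would follow the same pattern.

```python
def maybe_reset(codebook, initial_codebook, mcblock_count, interval):
    """Reset the codebook to its initialized state at fixed intervals."""
    if mcblock_count % interval == 0:
        return initial_codebook.copy()
    return codebook
```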
As in the prior embodiments, the interpolation filters may be calculated as an N×N Wiener interpolation filter constructed for the decoded mcblock by computing cross-correlation matrices and auto-correlation matrices over the mcblock. Alternatively, the cross-correlation matrices and auto-correlation matrices may be averaged over a larger surrounding area that has similar motion and detail as the mcblock. The interpolation filter may be a rectangular interpolation filter or a circularly-shaped Wiener interpolation filter.
Also as in the prior embodiments, select filter coefficients may be forced to be equal to other filter coefficients. This embodiment can simplify the calculation of Wiener filters.
Derivation of a Wiener filter for a mcblock involves derivation of an ideal N×1 filter F according to:
F = S⁻¹R

that minimizes the mean squared prediction error. For each pixel pdec in the decoded mcblock, the filter F yields a predicted pixel p̂ according to p̂ = Fᵀ·Qdec, with a prediction error err = pdec − p̂.
More specifically, for each pdec, the vector Qdec may take the form:

Qdec = [q1, q2, …, qN]ᵀ,

where q1 to qN represent pixels in or near the translated reference mcblock to be used in the prediction of pdec.
In the foregoing, R is an N×1 cross-correlation matrix derived from decoded pixels (pdec) and their corresponding Qdec vectors. In the R matrix, ri at each location i may be derived as pdec·qi averaged over the pixels pdec in the mcblock. S is an N×N auto-correlation matrix derived from the N×1 vectors Qdec. In the S matrix, si,j at each location i,j may be derived as qi·qj averaged over the pixels pdec in the mcblock. Alternatively, the cross-correlation matrices and auto-correlation matrices may be averaged over a larger surrounding area having similar motion and detail as the mcblock.
Derivation of the S and R matrices occurs for each mcblock being coded. Accordingly, derivation of the Wiener filters involves substantial computational resources at an encoder. According to this embodiment, select filter coefficients in the F matrix may be forced to be equal to each other, which reduces the size of F and, as a consequence, reduces the computational burden at the encoder. Consider an example where filter coefficients f1 and f2 are set to be equal to each other. In this embodiment, the F and Qdec matrices may be modified as:

F = [f1, f3, …, fN]ᵀ and Qdec = [q1+q2, q3, …, qN]ᵀ.

Deletion of the single coefficient reduces the sizes of F and Qdec both to (N−1)×1. Deletion of other filter coefficients in F and consolidation of values in Qdec can result in further reductions to the sizes of the F and Qdec vectors. For example, it often is advantageous to tie together filter coefficients at all but one of the positions that are equidistant from the decoded pixel pdec. In this manner, derivation of the F matrix is simplified.
In another embodiment, encoders and decoders may store separate codebooks that are indexed not only by the filter index but also by supplemental identifiers (FIG. 13). In such embodiments, the supplemental identifiers may select one of the codebooks as being active and the index may select an entry from within the codebook to be output to the interpolation filter. The decoder may use the supplemental identifiers both to read codebook entries during filtering of decoded data and to build new entries in the codebooks.
In one embodiment, a block's motion vector may serve as the supplemental identifier. Thus, separate codebooks may be provided for each motion vector value or for different ranges of integer motion vectors. Then in operation, given the value of integer motion vector and reference frame index, the encoder and decoder both may use the corresponding codebook to recover the filter to be used in motion compensation.
In another embodiment, separate codebooks may be provided for different values or ranges of values of deblocking filters present in the current or reference frame. Then in operation, given the values of the deblocking filters, the encoder and decoder use the corresponding codebook to recover the filter to be used in motion compensation.
In a further embodiment, separate codebooks may be provided for different values or ranges of values of other codec parameters such as pixel aspect ratio and bit rate. Then in operation, given the values of these other codec parameters, the encoder and decoder use the corresponding codebook to recover the filter to be used in motion compensation.
In another embodiment, separate codebooks may be provided for P-frames and B-frames or, alternatively, for coding types (P- or B-coding) applied to each mcblock.
In another embodiment, an encoder and decoder may switch among codebooks in response to scene changes detected within the video data. In one embodiment, the encoder and decoder may execute a common scene change algorithm independently of each other, each system detecting scene changes from image content of the decoded video data. In another embodiment, the encoder may execute the scene change detection algorithm and include signals within the channel data identifying scene changes to a decoder. Optionally, in such embodiments, the encoder and decoder may reset their codebooks when a scene change occurs.
The foregoing discussion identifies functional blocks that may be used in video coding systems constructed according to various embodiments of the present invention. In practice, these systems may be applied in a variety of devices, such as mobile devices provided with integrated video cameras (e.g., camera-enabled phones, entertainment systems and computers) and/or wired communication systems such as videoconferencing equipment and camera-enabled desktop computers. In some applications, the functional blocks described hereinabove may be provided as elements of an integrated software system, in which the blocks may be provided as separate elements of a computer program. In other applications, the functional blocks may be provided as discrete circuit components of a processing system, such as functional units within a digital signal processor or application-specific integrated circuit. Still other applications of the present invention may be embodied as a hybrid system of dedicated hardware and software components. Moreover, the functional blocks described herein need not be provided as separate units. For example, although FIG. 8 illustrates the components of the block-based coding chain 110 and prediction unit 120 as separate units, in one or more embodiments, some or all of them may be integrated and they need not be separate units. Such implementation details are immaterial to the operation of the present invention unless otherwise noted above.
Several embodiments of the invention are specifically illustrated and/or described herein. However, it will be appreciated that modifications and variations of the invention are covered by the above teachings and within the purview of the appended claims without departing from the spirit and intended scope of the invention.

Claims (50)

I claim:
1. A codebook management method comprising, at a video processing device:
decoding coded pixel block data according to motion compensated prediction techniques to generate decoded video data,
calculating characteristics of an ideal interpolation filter based on the decoded video data, the ideal interpolation filter calculated with a cross-correlation matrix and an auto-correlation matrix between uncoded pixel block data and the decoded pixel block data,
adding the calculated characteristics to a codebook stored at the video processing device for use with later received pixel blocks, the characteristics including filter configuration data that defines the operation of the interpolation filter.
2. The method of claim 1, wherein the video processing device is a video encoder and the method further comprises coding an input pixel block data according to motion compensated prediction, the coding including:
identifying a pixel block from a reference frame as a prediction reference,
calculating characteristics of an ideal filter to be applied to the identified pixel block to match the input pixel block,
searching the codebook to identify a matching codebook filter,
if a match is found, coding the input pixel block with respect to the reference pixel block having been filtered by the matching codebook filter, and
transmitting coded data of the input pixel block and an identifier of the matching codebook filter to a decoder.
3. The method of claim 2, wherein a matching codebook filter is identified if the difference between the calculated filter and an identified best matching codebook filter does not exceed a predetermined threshold.
4. The method of claim 3, wherein if the difference exceeds the predetermined threshold, adding the calculated filter to the codebook and transmitting the calculated filter characteristics to the decoder.
5. The method of claim 1, wherein the video processing device is a video decoder and the decoding further comprises:
identifying, based on the coded pixel block data, a pixel block from a reference frame as a prediction reference,
filtering the identified pixel block based on interpolation filter parameters then-stored in the codebook, the parameters identified by a codebook index received by the decoder,
decoding the pixel block with reference to the filtered reference pixel block.
6. The method of claim 1, further comprising resetting the codebook at predetermined intervals.
7. The method of claim 1, further comprising resetting the codebook in response to a predetermined command exchanged between a video encoder and a video decoder.
8. The method of claim 1, further comprising resetting the codebook on detection of a scene change.
9. The method of claim 1, further comprising resetting the codebook to an empty state.
10. The method of claim 1, further comprising resetting the codebook to a predetermined state.
11. The method of claim 1, further comprising purging from the codebook entries associated with a predetermined coding assignment type.
12. The method of claim 1, further comprising purging from the codebook all entries except entries expressly acknowledged between a video encoder and a video decoder.
13. The method of claim 1, wherein the decoded video data is decoded, deblocked video data.
14. The method of claim 1, wherein the decoded video data is decoded, non-deblocked video data.
15. The method of claim 1, wherein the decoded video data is selected from:
1) decoded, deblocked video data, and
2) decoded, non-deblocked video data
in response to an identifier exchanged between a video coder and a video decoder.
16. The method of claim 1, wherein the cross correlation matrix and the auto correlation matrix are averaged over the pixel block.
17. The method of claim 1, wherein the cross correlation matrix and the auto correlation matrix are averaged over an area larger than and surrounding the pixel block.
18. A video coding method, comprising:
coding an input pixel block data according to motion compensated prediction, the coding including:
identifying a pixel block from a reference frame as a prediction reference,
calculating characteristics of an ideal filter to be applied to the identified pixel block to match the input pixel block, the ideal filter calculated with a cross-correlation matrix and an auto-correlation matrix between uncoded pixel block data and the decoded pixel block data,
searching a codebook of previously-stored filter characteristics to identify a matching codebook filter,
if a match is found, coding the input pixel block with respect to the reference pixel block having been filtered by the matching codebook filter, and
transmitting the coded pixel block data and an identifier of the matching codebook filter to a decoder;
decoding coded pixel block data according to motion compensated prediction techniques to generate decoded video data,
calculating characteristics of an ideal interpolation filter based on the decoded video data,
adding the calculated characteristics to the codebook for use in decoding later-received input pixel block data, the characteristics including filter configuration data that defines the operation of the interpolation filter.
19. The video coding method of claim 18, wherein the codebook is chosen from a plurality of stored codebooks, the codebook selected by a codebook identifier.
20. The video coding method of claim 18, wherein the codebook is chosen from a plurality of stored codebooks, the codebook selected by a motion vector calculated for the input block.
21. The video coding method of claim 18, wherein the codebook is chosen from a plurality of stored codebooks, the codebook selected by an aspect ratio calculated for the input block.
22. The video coding method of claim 18, wherein the codebook is chosen from a plurality of stored codebooks, the codebook selected by coding type assigned to the input block.
23. The video coding method of claim 18, wherein the codebook is chosen from a plurality of stored codebooks, the codebook selected by an indicator of the input block's complexity.
24. The video coding method of claim 18, wherein the codebook is chosen from a plurality of stored codebooks, the codebook selected by an encoder bit rate.
25. The video coding method of claim 18, wherein the codebook is chosen from a plurality of stored codebooks, each codebook of the plurality generated from a respective set of training sequences.
26. The video coding method of claim 18, wherein the coded data of the input pixel block includes motion vectors having integer values.
27. The video coding method of claim 18, further comprising resetting the codebook at predetermined intervals.
28. The video coding method of claim 18, further comprising resetting the codebook in response to a predetermined command exchanged between a video encoder and a video decoder.
29. The video coding method of claim 18, further comprising resetting the codebook on detection of a scene change.
30. The video coding method of claim 18, further comprising resetting the codebook to an empty state.
31. The video coding method of claim 18, further comprising resetting the codebook to a predetermined state.
32. A video decoding method, comprising:
decoding coded pixel block data according to motion compensated prediction to generate decoded video data, the decoding including:
retrieving predicted pixel block data from a reference store according to a motion vector,
retrieving filter parameter data from a codebook store according to a codebook index,
filtering the predicted pixel block data according to the parameter data, wherein the coded pixel block decoding is performed using the filtered pixel block data as a prediction reference,
calculating characteristics of an ideal interpolation filter based on the decoded video data, the ideal interpolation filter calculated with a cross-correlation matrix and an auto-correlation matrix between uncoded pixel block data and the decoded pixel block data,
adding the calculated characteristics to the codebook for use with later received coded pixel blocks, the characteristics including filter configuration data that defines the operation of the interpolation filter.
33. The method of claim 32, wherein the codebook is chosen from a plurality of stored codebooks, the codebook selected by a codebook identifier.
34. The method of claim 32, wherein the codebook is chosen from a plurality of stored codebooks, the codebook selected by a motion vector of the coded pixel block.
35. The method of claim 32, wherein the codebook is chosen from a plurality of stored codebooks, the codebook selected by a pixel aspect ratio.
36. The method of claim 32, wherein the codebook is chosen from a plurality of stored codebooks, the codebook selected by coding type of the coded pixel block.
37. The method of claim 32, wherein the codebook is chosen from a plurality of stored codebooks, the codebook selected by an indicator of the coded pixel block's complexity.
38. The method of claim 32, wherein the codebook is chosen from a plurality of stored codebooks, the codebook selected by a bit rate of coded video data.
39. The method of claim 32, further comprising resetting the codebook at predetermined intervals.
40. The method of claim 32, further comprising resetting the codebook in response to a predetermined command exchanged between a video encoder and a video decoder.
41. The method of claim 32, further comprising resetting the codebook on detection of a scene change.
42. The method of claim 32, further comprising resetting the codebook to an empty state.
43. The method of claim 32, further comprising resetting the codebook to a predetermined state.
44. The method of claim 32, further comprising purging from the codebook entries associated with a predetermined coding assignment type.
45. The method of claim 32, further comprising purging from the codebook all entries except entries expressly acknowledged between a video encoder and a video decoder.
46. The method of claim 32, wherein the decoded video data is decoded, deblocked video data.
47. The method of claim 32, wherein the decoded video data is decoded, non-deblocked video data.
48. The method of claim 32, wherein the decoded video data is selected from:
1) decoded, deblocked video data, and
2) decoded, non-deblocked video data
in response to an identifier exchanged between a video coder and a video decoder.
49. A video encoder, comprising:
a block-based coder to encode coded pixel blocks by motion compensated prediction,
a prediction unit to supply predicted pixel block data to the block-based encoder, the prediction unit, comprising:
a motion compensated predictor having an output for pixel block data,
an interpolation filter coupled to an output of the motion compensated predictor and having an output for filtered pixel block data, to search a codebook of previously-stored filter characteristics to identify a matching codebook filter, and
codebook storage, storing plural sets of configuration parameters for the interpolation filter,
a codebook manager to:
calculate characteristics of an ideal interpolation filter based on decoding the coded pixel blocks, the ideal interpolation filter calculated with a cross-correlation matrix and an auto-correlation matrix between uncoded pixel block data and the decoded pixel block data,
add the calculated characteristics to the codebook for use with later received pixel blocks, the characteristics including filter configuration data that defines the operation of the interpolation filter.
50. A video decoder, comprising:
a block-based decoder to decode coded pixel blocks by motion compensated prediction,
a prediction unit to supply predicted pixel block data to the block-based decoder, the prediction unit, comprising:
a motion compensated predictor having an output for pixel block data,
an interpolation filter coupled to an output of the motion compensated predictor and having an output for filtered pixel block data, and
codebook storage, storing plural sets of configuration parameters for the interpolation filter, responsive to a codebook index received with coded pixel block data to supply a set of configuration parameters to the interpolation filter,
a codebook manager to:
calculate characteristics of an ideal interpolation filter based on the decoded pixel block data, the ideal interpolation filter calculated from a cross-correlation matrix between uncoded pixel block data and the decoded pixel block data and an auto-correlation matrix of the decoded pixel block data, and
add the calculated characteristics to the codebook for use with later received pixel blocks, the characteristics including filter configuration data that defines the operation of the interpolation filter.
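
The codebook handling recited in claims 33-45 — choosing among plural stored codebooks by side information, matching a filter against stored entries, adding new entries for later pixel blocks, and resetting or purging — can be made concrete with a short sketch. The Python below is a toy model only; the class and function names (FilterCodebook, best_match, select_codebook) are assumptions for exposition, not the patented implementation.

```python
# Toy codebook bookkeeping, illustrating claims 33-45. Names are illustrative.
import numpy as np

class FilterCodebook:
    """Holds interpolation-filter coefficient sets as codebook entries."""

    def __init__(self, seed_entries=None):
        self.entries = [np.asarray(e, dtype=float) for e in (seed_entries or [])]

    def best_match(self, ideal):
        """Vector quantization step: return the index of the stored filter
        nearest to the ideal filter (minimum Euclidean distance)."""
        if not self.entries:
            raise ValueError("codebook is empty")
        ideal = np.asarray(ideal, dtype=float)
        return int(np.argmin([np.linalg.norm(ideal - e) for e in self.entries]))

    def add(self, coeffs):
        """Store a newly derived filter for use with later pixel blocks."""
        self.entries.append(np.asarray(coeffs, dtype=float))

    def reset(self, predetermined_state=None):
        """Reset to an empty or predetermined state (claims 42-43), e.g. at
        fixed intervals, on command, or on a detected scene change."""
        self.entries = [np.asarray(e, dtype=float) for e in (predetermined_state or [])]

def select_codebook(codebooks, key):
    """Choose among plural stored codebooks (claims 33-38); the key might be
    a codebook identifier, a motion vector, a pixel aspect ratio, a coding
    type, a complexity indicator, or a bit rate."""
    return codebooks[key]
```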
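The ideal-filter computation recited in the codebook managers of claims 49 and 50 reduces, in its simplest form, to the least-squares (Wiener) normal equations: an auto-correlation matrix R built from decoded samples and a cross-correlation vector p against the uncoded samples yield coefficients h solving R h = p. The sketch below assumes 1-D data and a single filter phase purely for clarity; it is one standard way to realize such a computation, not the patent's exact procedure, and the names are illustrative.

```python
# Least-squares ("ideal") filter derivation from decoded vs. uncoded data.
# A hedged 1-D sketch; real interpolation filters are derived per sub-pel phase.
import numpy as np

def derive_ideal_filter(decoded, uncoded, taps=6):
    """Solve the normal equations R h = p, where R is the auto-correlation
    matrix of decoded sample windows and p is their cross-correlation with
    the uncoded (original) samples."""
    decoded = np.asarray(decoded, dtype=float)
    uncoded = np.asarray(uncoded, dtype=float)
    n = len(decoded) - taps + 1
    # Each row of X is one sliding window of decoded samples.
    X = np.stack([decoded[i:i + taps] for i in range(n)])
    target = uncoded[taps // 2 : taps // 2 + n]  # align with window centers

    R = X.T @ X        # auto-correlation matrix
    p = X.T @ target   # cross-correlation vector
    # lstsq tolerates a rank-deficient R (e.g., flat image regions).
    h, *_ = np.linalg.lstsq(R, p, rcond=None)
    return h
```

An encoder built along these lines could then call best_match on the active codebook to find the closest stored filter and transmit only its index, adding h as a new entry when no stored filter is close enough.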
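On the decoder side, claim 50 recites codebook storage responsive to a codebook index received with the coded pixel block data. A minimal sketch of that lookup-and-filter path, assuming a separable filter applied along rows and then columns, follows; filter_prediction and its parameters are hypothetical names.

```python
# Decoder-side sketch for claim 50: a received index selects stored filter
# coefficients, which then filter the motion-compensated prediction block.
import numpy as np

def filter_prediction(pred_block, codebook_entries, index):
    """Apply the indexed filter separably to a 2-D prediction block."""
    h = np.asarray(codebook_entries[index], dtype=float)
    # Filter rows, then columns, keeping the block size ("same" convolution).
    rows = np.apply_along_axis(lambda r: np.convolve(r, h, mode="same"), 1, pred_block)
    return np.apply_along_axis(lambda c: np.convolve(c, h, mode="same"), 0, rows)
```

Because encoder and decoder derive and store the same filters from the same decoded data, only a short codebook index need be signaled per pixel block, which is the premise of the decoder-defined vector quantized filters described here.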
US12/896,552 2010-10-01 2010-10-01 Motion compensation using decoder-defined vector quantized interpolation filters Active 2033-05-05 US9628821B2 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US12/896,552 US9628821B2 (en) 2010-10-01 2010-10-01 Motion compensation using decoder-defined vector quantized interpolation filters
PCT/US2011/053975 WO2012044814A1 (en) 2010-10-01 2011-09-29 Motion compensation using decoder-defined vector quantized interpolation filters
AU2011308759A AU2011308759A1 (en) 2010-10-01 2011-09-29 Motion compensation using decoder-defined vector quantized interpolation filters

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US12/896,552 US9628821B2 (en) 2010-10-01 2010-10-01 Motion compensation using decoder-defined vector quantized interpolation filters

Publications (2)

Publication Number Publication Date
US20120082217A1 (en) 2012-04-05
US9628821B2 (en) 2017-04-18

Family

ID=44800263

Family Applications (1)

Application Number Title Priority Date Filing Date
US12/896,552 Active 2033-05-05 US9628821B2 (en) 2010-10-01 2010-10-01 Motion compensation using decoder-defined vector quantized interpolation filters

Country Status (3)

Country Link
US (1) US9628821B2 (en)
AU (1) AU2011308759A1 (en)
WO (1) WO2012044814A1 (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101444691B1 (en) * 2010-05-17 2014-09-30 에스케이텔레콤 주식회사 Reference Frame Composing and Indexing Apparatus and Method
US8761245B2 (en) * 2010-12-21 2014-06-24 Intel Corporation Content adaptive motion compensation filtering for high efficiency video coding
US9462280B2 (en) * 2010-12-21 2016-10-04 Intel Corporation Content adaptive quality restoration filtering for high efficiency video coding
US10484693B2 (en) * 2011-06-22 2019-11-19 Texas Instruments Incorporated Method and apparatus for sample adaptive offset parameter estimation for image and video coding
US9819965B2 (en) 2012-11-13 2017-11-14 Intel Corporation Content adaptive transform coding for next generation video
WO2014120369A1 (en) * 2013-01-30 2014-08-07 Intel Corporation Content adaptive partitioning for prediction and coding for next generation video
KR102184884B1 (en) * 2014-06-26 2020-12-01 엘지디스플레이 주식회사 Data processing apparatus for organic light emitting diode display
WO2017065357A1 (en) * 2015-10-16 2017-04-20 엘지전자 주식회사 Filtering method and apparatus for improving prediction in image coding system
US10063892B2 (en) 2015-12-10 2018-08-28 Adobe Systems Incorporated Residual entropy compression for cloud-based video applications
US10264262B2 (en) 2016-02-29 2019-04-16 Adobe Inc. Codebook generation for cloud-based video applications
US10602176B2 (en) * 2016-04-15 2020-03-24 Google Llc Coding interpolation filter type

Patent Citations (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5541660A (en) 1993-12-24 1996-07-30 Daewoo Electronics Co., Ltd. Systolic realization of motion compensated interpolation filter
US6069670A (en) 1995-05-02 2000-05-30 Innovision Limited Motion compensated filtering
US6442202B1 (en) 1996-03-13 2002-08-27 Leitch Europe Limited Motion vector field error estimation
US5796875A (en) 1996-08-13 1998-08-18 Sony Electronics, Inc. Selective de-blocking filter for DCT compressed images
US7146311B1 (en) 1998-09-16 2006-12-05 Telefonaktiebolaget Lm Ericsson (Publ) CELP encoding/decoding method and apparatus
US20020106026A1 (en) 2000-11-17 2002-08-08 Demmer Walter Heinrich Image scaling and sample rate conversion by interpolation with non-linear positioning vector
US20050031211A1 (en) * 2001-09-06 2005-02-10 Meur Olivier Le Device and process for coding video images by motion compensation
US20040022318A1 (en) 2002-05-29 2004-02-05 Diego Garrido Video interpolation coding
US7397858B2 (en) 2002-05-29 2008-07-08 Innovation Management Sciences, Llc Maintaining a plurality of codebooks related to a video signal
WO2004006558A2 (en) 2002-07-09 2004-01-15 Nokia Corporation Method and system for selecting interpolation filter type in video coding
US7664184B2 (en) 2004-07-21 2010-02-16 Amimon Ltd. Interpolation image compression
US20060294171A1 (en) 2005-06-24 2006-12-28 Frank Bossen Method and apparatus for video encoding and decoding using adaptive interpolation
US7778472B2 (en) 2006-03-27 2010-08-17 Qualcomm Incorporated Methods and systems for significance coefficient coding in video compression
US20080075165A1 (en) 2006-09-26 2008-03-27 Nokia Corporation Adaptive interpolation filters for video coding
US20080089417A1 (en) 2006-10-13 2008-04-17 Qualcomm Incorporated Video coding with adaptive filtering for motion compensated prediction
US20090274216A1 (en) 2006-11-30 2009-11-05 Sadaatsu Kato Dynamic image encoding device, dynamic image encoding method, dynamic image encoding program, dynamic image decoding device, dynamic image decoding method, and dynamic image decoding program
US20080247467A1 (en) 2007-01-09 2008-10-09 Nokia Corporation Adaptive interpolation filters for video coding
US20100021071A1 (en) * 2007-01-09 2010-01-28 Steffen Wittmann Image coding apparatus and image decoding apparatus
US20100098345A1 (en) 2007-01-09 2010-04-22 Kenneth Andersson Adaptive filter representation
US20080175322A1 (en) 2007-01-22 2008-07-24 Samsung Electronics Co., Ltd. Method and apparatus for encoding and decoding image using adaptive interpolation filter
US20080262312A1 (en) 2007-04-17 2008-10-23 University Of Washington Shadowing pipe mosaicing algorithms with application to esophageal endoscopy
US20090003717A1 (en) * 2007-06-28 2009-01-01 Mitsubishi Electric Corporation Image encoding device, image decoding device, image encoding method and image decoding method
US20090052555A1 (en) 2007-08-21 2009-02-26 David Mak-Fan System and method for providing dynamic deblocking filtering on a mobile device
US20090257500A1 (en) 2008-04-10 2009-10-15 Qualcomm Incorporated Offsets at sub-pixel resolution
US20100002770A1 (en) 2008-07-07 2010-01-07 Qualcomm Incorporated Video encoding by filter selection
US20120002722A1 (en) 2009-03-12 2012-01-05 Yunfei Zheng Method and apparatus for region-based filter parameter selection for de-artifact filtering
WO2010123862A1 (en) 2009-04-20 2010-10-28 Dolby Laboratories Licensing Corporation Adaptive interpolation filters for multi-layered video delivery
EP2262267A1 (en) 2009-06-10 2010-12-15 Panasonic Corporation Filter coefficient coding scheme for video coding

Non-Patent Citations (83)

* Cited by examiner, † Cited by third party
Title
Amonou et al., "Description of video coding technology proposal by France Telecom, NTT, NTT DOCOMO, Panasonic and Technicolor," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 1st Meeting: Dresden, DE, Apr. 15-23, 2010 (JCTVC-A114).
Bjontegaard et al., "Adaptive Deblocking Filter," IEEE Transactions on Circuits and Systems for Video Technology, 13(7): 614-619, Jul. 2003.
Chiu et al., "Description of video coding technology proposal: self derivation of motion estimation and adaptive (Wiener) loop filtering," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 1st Meeting: Dresden, DE, Apr. 15-23, 2010.
Chono et al., "Adaptive motion interpolation on MB basis," Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG (ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6), 3rd Meeting: Fairfax, Virginia, USA, May 6-10, 2002 (JVT-C040).
Chono et al., "Description of video coding technology proposal by NEC Corporation," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 1st Meeting: Dresden, DE, Apr. 15-23, 2010.
Chujoh et al., "Block-based Adaptive Loop Filter," ITU—Telecommunications Standardization Sector, Video Coding Experts Group (VCEG), 35th Meeting: Berlin, Germany, Jul. 16-18, 2008 (VCEG-AI18).
Chujoh et al., "Specification and experimental results of Quadtree-based Adaptive Loop Filter," ITU—Telecommunications Standardization Sector, Video Coding Experts Group (VCEG), 37th Meeting: Yokohama, Japan Apr. 15-18, 2009 (VCEG-AK22 r1).
Fowler et al., "A Survey of Adaptive Vector Quantization—Part I: A Unifying Structure," Proceedings of the Data Compression Conference, Mar. 25, 1997.
Fowler et al., "A Survey of Adaptive Vector Quantization—Part II: Classification and Comparison of Algorithms," IPS Laboratory Technical Report, Mar. 1, 1997.
Haskell et al., Digital Video: An Introduction to MPEG-2—Chapter 1: Introduction to Digital Multimedia, Compression and MPEG-2, pp. 1-13, Chapman & Hall, New York, 1997.
Haskell et al., Digital Video: An Introduction to MPEG-2—Chapter 10: Video Stream Syntax and Semantics, pp. 230-257, Chapman & Hall, New York, 1997.
Haskell et al., Digital Video: An Introduction to MPEG-2—Chapter 11: Requirements and Profiles, pp. 258-279, Chapman & Hall, New York, 1997.
Haskell et al., Digital Video: An Introduction to MPEG-2—Chapter 12: Digital Networks, pp. 280-291, Chapman & Hall, New York, 1997.
Haskell et al., Digital Video: An Introduction to MPEG-2—Chapter 13: Interactive Television, pp. 292-306, Chapman & Hall, New York, 1997.
Haskell et al., Digital Video: An Introduction to MPEG-2—Chapter 14: High Definition Television (HDTV), pp. 307-321, Chapman & Hall, New York, 1997.
Haskell et al., Digital Video: An Introduction to MPEG-2—Chapter 15: Three-Dimensional TV, pp. 322-359, Chapman & Hall, New York, 1997.
Haskell et al., Digital Video: An Introduction to MPEG-2—Chapter 16: Processing Architecture and Implementation Dr. Horng-Dar Lin, pp. 361-368, Chapman & Hall, New York, 1997.
Haskell et al., Digital Video: An Introduction to MPEG-2—Chapter 17: MPEG-4 and the Future, pp. 369-411, Chapman & Hall, New York, 1997.
Haskell et al., Digital Video: An Introduction to MPEG-2—Chapter 2: Anatomy of MPEG-2, pp. 14-31, Chapman & Hall, New York, 1997.
Haskell et al., Digital Video: An Introduction to MPEG-2—Chapter 3: MPEG-2 Systems, pp. 32-54, Chapman & Hall, New York, 1997.
Haskell et al., Digital Video: An Introduction to MPEG-2—Chapter 4: Audio, pp. 55-79, Chapman & Hall, New York, 1997.
Haskell et al., Digital Video: An Introduction to MPEG-2—Chapter 5: Video Basics, pp. 80-109, Chapman & Hall, New York, 1997.
Haskell et al., Digital Video: An Introduction to MPEG-2—Chapter 6: Digital Compression: Fundamentals, pp. 110-145, Chapman & Hall, New York, 1997.
Haskell et al., Digital Video: An Introduction to MPEG-2—Chapter 7: Motion Compensation Modes in MPEG, pp. 146-155, Chapman & Hall, New York, 1997.
Haskell et al., Digital Video: An Introduction to MPEG-2—Chapter 8: MPEG-2 Video Coding and Compression, pp. 156-182, Chapman & Hall, New York, 1997.
Haskell et al., Digital Video: An Introduction to MPEG-2—Chapter 9: MPEG-2 Scalability Techniques, pp. 183-229, Chapman & Hall, New York, 1997.
Haskell et al., Digital Video: An Introduction to MPEG-2—Table of Contents, Chapman & Hall, New York, 1997.
Huang et al., "A Technical Description of MediaTek's Proposal to the JCT-VC CfP," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 1st Meeting: Dresden, DE, Apr. 15-23, 2010 (JCTVC-A109 r2).
Huang et al., "Adaptive Quadtree-based Multi-reference loop Filter," ITU—Telecommunications Standardization Sector, Study Group 16 Question 6, Video Coding Experts Group (VCEG), 37th Meeting: Yokohama, Japan, Apr. 15-18, 2009 (VCEG-AK24).
International Search Report and Written Opinion, dated Dec. 8, 2011, from corresponding International Patent Application No. PCT/US2011/053975, filed Sep. 29, 2011.
ITU-T Recommendation H.263, Video Coding for Low Bit Rate Communication (Jan. 2005).
ITU-T Recommendation H.264, Advanced Video Coding for Generic Audiovisual Services (Mar. 2010).
Keman et al., "Half-Pixel Motion Estimation Bypass Based on a Linear Model," Picture Coding Symposium, Dec. 15, 2004, San Francisco, CA.
Kimata et al., "3D Motion Vector Coding with Block Base Adaptive Interpolation Filter on H.264," Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, Apr. 6-10, 2003, Hong Kong.
Mitchell et al., MPEG Video Compression Standard—Chapter 1: Introduction, pp. 1-16, Chapman & Hall, New York 1997.
Mitchell et al., MPEG Video Compression Standard—Chapter 10: MPEG-2 Main Profile Syntax, pp. 187-236, Chapman & Hall, New York 1997.
Mitchell et al., MPEG Video Compression Standard—Chapter 11: Motion Compensation, pp. 237-262, Chapman & Hall, New York 1997.
Mitchell et al., MPEG Video Compression Standard—Chapter 12: Pel Reconstruction, pp. 263-282, Chapman & Hall, New York 1997.
Mitchell et al., MPEG Video Compression Standard—Chapter 13: Motion Estimation, pp. 283-312, Chapman & Hall, New York 1997.
Mitchell et al., MPEG Video Compression Standard—Chapter 14: Variable Quantization, pp. 313-332, Chapman & Hall, New York 1997.
Mitchell et al., MPEG Video Compression Standard—Chapter 15: Rate Control in MPEG, pp. 333-356, Chapman & Hall, New York 1997.
Mitchell et al., MPEG Video Compression Standard—Chapter 16: MPEG Patents, pp. 357-362, Chapman & Hall, New York 1997.
Mitchell et al., MPEG Video Compression Standard—Chapter 17: MPEG Vendors and Products, pp. 363-382, Chapman & Hall, New York 1997.
Mitchell et al., MPEG Video Compression Standard—Chapter 18: MPEG History, pp. 383-394, Chapman & Hall, New York 1997.
Mitchell et al., MPEG Video Compression Standard—Chapter 19: Other Video Standards, pp. 395-440, Chapman & Hall, New York 1997.
Mitchell et al., MPEG Video Compression Standard—Chapter 2: Overview of MPEG, pp. 17-32, Chapman & Hall, New York 1997.
Mitchell et al., MPEG Video Compression Standard—Chapter 3: The Discrete Cosine Transform, pp. 33-50, Chapman & Hall, New York 1997.
Mitchell et al., MPEG Video Compression Standard—Chapter 4: Aspects of Visual Perception, pp. 51-80, Chapman & Hall, New York 1997.
Mitchell et al., MPEG Video Compression Standard—Chapter 5: MPEG Coding Principles, pp. 81-104, Chapman & Hall, New York 1997.
Mitchell et al., MPEG Video Compression Standard—Chapter 6: Pseudocode and Flowcharts, pp. 105-116, Chapman & Hall, New York 1997.
Mitchell et al., MPEG Video Compression Standard—Chapter 7: MPEG System Syntax, pp. 117-134, Chapman & Hall, New York 1997.
Mitchell et al., MPEG Video Compression Standard—Chapter 8: MPEG-1 Video Syntax, pp. 135-170, Chapman & Hall, New York 1997.
Mitchell et al., MPEG Video Compression Standard—Chapter 9: MPEG-2 Overview, pp. 171-186, Chapman & Hall, New York 1997.
Motta et al., "Improved Filter Selection for B-Slices in E-AIF," ITU—Telecommunications Standardization Sector, Video Coding Experts Group (VCEG), 35th Meeting: Berlin, Germany, Jul. 2008 (VCEG-AI38).
Netravali et al., Digital Pictures: Representation, Compression, and Standards—Chapter 1: Numerical Representation of Visual Information, pp. 1-92, Second Edition, Plenum Press, New York 1988.
Netravali et al., Digital Pictures: Representation, Compression, and Standards—Chapter 2: Common Picture Communications Systems, pp. 93-156, Second Edition, Plenum Press, New York 1988.
Netravali et al., Digital Pictures: Representation, Compression, and Standards—Chapter 3: Redundancy-Statistics-Models, pp. 157-252, Second Edition, Plenum Press, New York 1988.
Netravali et al., Digital Pictures: Representation, Compression, and Standards—Chapter 4: Visual Psychophysics, pp. 253-308, Second Edition, Plenum Press, New York 1988.
Netravali et al., Digital Pictures: Representation, Compression, and Standards—Chapter 5: Basic Compression Techniques, pp. 309-520, Second Edition, Plenum Press, New York 1988 (in 2 parts).
Netravali et al., Digital Pictures: Representation, Compression, and Standards—Chapter 6: Examples of Codec Designs, pp. 521-570, Second Edition, Plenum Press, New York 1988.
Netravali et al., Digital Pictures: Representation, Compression, and Standards—Chapter 7: Still Image Coding Standards—ISO JBIG and JPEG, pp. 571-594, Second Edition, Plenum Press, New York 1988.
Netravali et al., Digital Pictures: Representation, Compression, and Standards—Chapter 8: CCITT H.261 (P*64) Videoconferencing Coding Standards, pp. 595-612, Second Edition, Plenum Press, New York 1988.
Netravali et al., Digital Pictures: Representation, Compression, and Standards—Chapter 9: Coding for Entertainment Video—ISO MPEG, pp. 613-640, Second Edition, Plenum Press, New York 1988.
Ramamurthi et al., "Classified Vector Quantization of Images," IEEE Transactions on Communications, vol. COM-34 No. 11, Nov. 1, 1986.
Ren et al., "Comparison of Power Consumption for Motion Compensation and Deblocking Filters in High Definition Video Coding," IEEE International Symposium on Consumer Electronics, Jun. 1, 2007.
Ribas-Corbera et al., "On the Optimal Motion Vector Accuracy for Block-Based Motion-Compensation Video Coders," International Society for Optical Engineering, SPIE Proceedings, Bellingham, WA, vol. 2668, Jan. 1, 1996.
Rusanovskyy et al., "Adaptive Interpolation with Directional Filters," ITU—Telecommunications Standardization Sector, Video Coding Experts Group (VCEG), 33rd Meeting: Shenzhen, China, Oct. 2007 (VCEG-AG21).
Segall et al., "A Highly Efficient and Highly Parallel System for Video Coding," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 1st Meeting: Dresden, DE, Apr. 15-23, 2010 (JCTVC-A105).
Shen et al., "An Adaptive and Fast Fractional Pixel Search Algorithm in H.264," Signal Processing, Elsevier Science Publishers B.V. Amsterdam, NL, vol. 87(11): 2629-2639, Aug. 1, 2007.
Suzuki et al., "Description of video coding technology proposal by Sony," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 1st Meeting: Dresden, DE, Apr. 15-23, 2010 (JCTVC-A103).
Ugur et al., "Coding Adaptive Interpolation Filter Coefficients," ITU—Telecommunications Standardization Sector, Study Group 16 Question 6, Video Coding Experts Group (VCEG), 31st Meeting: Marrakech, MA, Jan. 15-16, 2007 (VCEG-AE20).
Vatis et al., "Comparison of Complexity between Two-dimensional non-separable Adaptive Interpolation Filter and Standard Weiner Filter," ITU—Telecommunications Standardization Sector, Video Coding Experts Group (VCEG), 28th Meeting: Nice, FR, Apr. 16-22, 2005 (VCEG-AA11).
Vatis et al., "Comparison of Complexity between Two-dimensional non-separable Adaptive Interpolation Filter and Standard Wiener Filter ," ITU—Telecommunications Standardization Sector, Video Coding Experts Group (VCEG), 28th Meeting: Nice, France, Apr. 2005 (VCEG-AA11).
Vatis et al., "Locally Adaptive Non-Separable Interpolation Filter for H.264/AVC," Proceedings of the 2006 IEEE International Conference on Image Processing, Oct. 8, 2006, New York, NY.
Vatis et al., "Rate-distortion optimised coder control for adaptive interpolation filter in the KTA reference model," 31st VCEG Meeting, Jan. 15-16, 2007, Marrakech, MA.
Vatis et al., "Two-dimensional non-separable Adaptive Wiener Interpolation Filter for H.264/AVC," ITU —Telecommunications Standardization Sector, Video Coding Experts Group (VCEG), 26th Meeting: Busan, Korea, Apr. 2005 (VCEG-Z17).
Vatis et al., "Two-dimensional non-separable Adaptive Wiener Interpolation Filter for H.264/AVC," ITU—Telecommunications Standardization Sector, Video Coding Experts Group (VCEG), 26th Meeting: Busan, KR, Apr. 16-22, 2005 (VCEG-Z17).
Wedi et al., "Interpolation Filters for Motion Compensated Prediction with 1/4 and 1/8-pel Accuracy," Tenth VCEG Meeting, May 16-18, 2000, Osaka, JP.
Wedi, "Adaptive Interpolation Filter for Motion Compensated Hybrid Video Coding," Proc. Picture Coding Symposium (PCS 2001), Seoul, Korea, Apr. 2001.
Wittmann et al., "Separable adaptive interpolation filter," ITU—Telecommunications Standardization Sector, Study Group 16—Contribution 219, Matsushita, Jun. 2007 (COM16-C219-E).
Wu et al., "Description of video coding technology proposal by Microsoft," Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 1st Meeting: Dresden, DE, Apr. 15-23, 2010 (JCTVC-A118).
Ye et al., "Buffered Adaptive Interpolation Filters," IEEE International Conference on Multimedia and Expo (ICME), Piscataway, NJ, USA, Jul. 19, 2010, pp. 376-381 XP031761633.
Ye et al., "Enhanced Adaptive Interpolation Filter," ITU—Telecommunications Standardization Sector, Study Group 16—Contribution 464, Apr. 2008 (COM16-C464-E).
Ye et al., "Enhanced Adaptive Interpolation Filter," ITU—Telecommunications Standardization Sector, Study Group 16—Contribution 464, Qualcomm Inc., Apr. 2008 (COM16-C219-E).
Zhang et al., "A Single-Pass Based Adaptive Interpolation Filtering Algorithm for Video Coding," Proceedings of 2010 IEEE 17th International Conference on Image Processing (ICIP), Sep. 26-29, 2010, Hong Kong, pp. 3401-3404 (XP031814182).
Zhang et al., "Single-Pass Encoding Using Multiple Adaptive Interpolation Filters," 37th VCEG Meeting, Apr. 15-18, 2009, Yokohama, JP.

Also Published As

Publication number Publication date
WO2012044814A1 (en) 2012-04-05
US20120082217A1 (en) 2012-04-05
AU2011308759A1 (en) 2013-04-18

Similar Documents

Publication Publication Date Title
US9628821B2 (en) Motion compensation using decoder-defined vector quantized interpolation filters
US11438610B2 (en) Block-level super-resolution based video coding
US20120008686A1 (en) Motion compensation using vector quantized interpolation filters
US8976856B2 (en) Optimized deblocking filters
KR100703283B1 (en) Image encoding apparatus and method for estimating motion using rotation matching
CA2295689C (en) Apparatus and method for object based rate control in a coding system
US9215466B2 (en) Joint frame rate and resolution adaptation
US9602819B2 (en) Display quality in a variable resolution video coder/decoder system
US9414086B2 (en) Partial frame utilization in video codecs
US20120008687A1 (en) Video coding using vector quantized deblocking filters
US9584832B2 (en) High quality seamless playback for video decoder clients
US20120087411A1 (en) Internal bit depth increase in deblocking filters and ordered dither
US9565404B2 (en) Encoding techniques for banding reduction
WO2022022299A1 (en) Method, apparatus, and device for constructing motion information list in video coding and decoding
JP2006517369A (en) Apparatus for encoding a video data stream
Ratnottar et al. Comparative study of motion estimation & motion compensation for video compression
KR100814715B1 (en) Moving-picture coding apparatus and method
Andrews et al. Test model 12/Appendix II of H.263 Version 3 Purpose: Information
Andrews et al. Test model 11 Purpose: Information

Legal Events

Date Code Title Description
AS Assignment

Owner name: APPLE INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:HASKELL, BARIN G.;REEL/FRAME:025081/0906

Effective date: 20100929

STCF Information on status: patent grant

Free format text: PATENTED CASE

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4