WO2008149327A2

WO2008149327A2 - Method and apparatus for motion-compensated video signal prediction

Info

Publication number: WO2008149327A2
Application number: PCT/IB2008/053473
Authority: WO
Inventors: Ronggang Wang; Zhen Ren
Original assignee: France Telecom
Priority date: 2007-06-04
Filing date: 2008-06-03
Publication date: 2008-12-11
Also published as: WO2008149327A3; WO2008148272A1

Abstract

The invention relates to a method for sub-pixel motion-compensated video encoding comprising, for an incoming video séquence frame, the steps of: dividing the frame into macroblocks and macroblock partitions, - determining a motion vector and an interpolation filter, from a set of interpolation filters, for a macroblock partition, for estimating fractional pixel displacements of said macroblock partition, determining a partition mode for a macroblock, performing motion compensation of a macroblock partition according to the determined partition mode, motion vector and interpolation filter, and wherein the step of determining an interpolation filter and a motion vector for a macroblock partition comprises determining an interpolation filter and a motion vector which minimize an coding motion cost indication for that partition of the frame macroblock.

Description

METHOD AND APPARATUS FOR MOTION-COMPENSATED VIDEO

SIGNAL PREDICTION

TECHNICAL FIELD OF THE INVENTION

The invention relates generally to motion-compensated video signal prediction, and more particularly to a method and apparatus for motion- compensated video signal prediction with sub-pixel motion estimation and compensation.

BACKGROUND OF THE INVENTION Video coding basically comprises the process of compressing (encoding) and decompressing (decoding) a digital video signal. In the following, and according to general use in the field of the invention, a device that compresses data is referred to as an encoder, a device that decompresses data is referred to as a decoder, and a device that acts as both encoder and decoder will be referred to as a codec. Further, it is also common in the field of video coding to use a syntax according to the hierarchical structure illustrated in Figure 1, in which a video sequence VS consists of a plurality of successive pictures P, which hereinafter will be indistinctively referred to as frames. Each frame P can be split into one or several slices SL, each slice defined as a sequence of macroblocks MB. A macroblock MB is defined as the basic unit for encoding and as being fixed size frame partitions that cover a rectangular area of 16x16 pixels. Each macroblock MB can be further segmented into one or more blocks B with variable block size. Further, hereinafter we will use the notion of macroblock partition to refer to a block of a macroblock for which motion-compensated prediction is applied. An allowable set of macroblock partition modes, i.e. a number of specific ways a macroblock can be partitioned in one or more macroblock partitions MPl to MP9, typically vary from one coding scheme to another and, for example, a 16x16 macroblock MB may have a mix of 8x8, 4x4, 4x8 and 8x4 macroblock partitions within a single macroblock.

Motion estimation and motion compensation are well known compression techniques exploiting temporal redundancies between blocks of subsequent frames in order to only transmit changes between subsequent frames (inter-frame prediction). Most state-of-the-art video codecs are based on motion-compensated prediction with motion vectors of fractional pixel resolution. This motion- compensated coding technique is commonly known in the art as sub-pixel motion compensation, and, in order to estimate and/or compensate such sub-pixel displacements, the image signal on that position is obtained by way of interpolation filtering. The interpolation filters applied to estimate and compensate fractional pixel displacements can be time invariant, that is, the same filters may be used for all the frames of the video sequence, but recently, adaptive interpolation filtering (AIF) techniques have been proposed in order to reduce the prediction error associated with the interpolation filter. In AIF, the filter coefficient values are adapted once at frame, macroblock or block level, and those coefficient values shall be encoded in the compressed bitstream and transmitted to the video decoder. A known proposed encoding process for a video sequence frame that uses AIF for motion-compensated prediction is disclosed in document JVT- D078, "Modified Adaptive Interpolation Filter", by Kei-ichi Chono and Yoshihiro Miyamoto, presented to the Joint Video Team of ISO/IEC MPEG & ITU-T VCEG Meeting in Klagenfurt, Austria 22-26 July 2002. The document proposes, for each frame macroblock, the determination of a motion vector and an interpolation filter that minimize an encoding motion cost indication for a frame macroblock. The encoding motion cost indication takes in consideration the distortion associated with an interpolation filter and the bits needed to code both a motion vector and an interpolation filter for a certain frame macroblock. The optimal interpolation filter is one of the filters of a predefined filter set, the filter set containing three interpolation filters, one having symmetric coefficient fixed values and two having non- symmetric coefficient fixed values, and all having a length of six coefficients. The filter number (1, 2 or 3) is transmitted per frame macroblock. Interpolation filter coefficient values are adapted only once per frame and for minimizing a prediction error for one part of a frame division. Transmission of the filter coefficients is done for each frame. In general, while sub-pixel motion compensation and AIF increase accuracy and coding efficiency, interpolation filtering still involves a significant computational and complexity coding cost. Therefore, there is a need for improving such computational and complexity coding cost associated with the sub-pixel motion compensation interpolation process both in encoder and decoder. Further, in spite of the compression efficiency gained by the prior art methods there is still capacity for improvement in this respect.

DISCLOSURE OF THE INVENTION In view of the drawbacks of the prior art, the present invention aims to provide a method and apparatus for the improvement of motion-compensated video signal prediction. This is achieved by the features of the independent claims. According to a first aspect of the invention a method for sub-pixel motion- compensated video encoding comprising, for an incoming video sequence frame, the steps of: dividing the frame into macroblocks and macroblock partitions, determining a motion vector and an interpolation filter, from a set of interpolation filters, for a macroblock partition, for estimating fractional pixel displacements of said macroblock partition, - determining a partition mode for a macroblock, performing motion compensation of a macroblock partition according to the determined partition mode, motion vector and interpolation filter, and wherein the step of determining an interpolation filter and a motion vector for a macroblock partition comprises determining an interpolation filter and a motion vector which minimize an coding motion cost indication for that partition of the frame macroblock.

According to an embodiment of the invention the step of determining a partition mode for a frame macroblock comprises minimizing an encoding motion cost indication for a frame macroblock. Since the motion vector, interpolation filter and partition mode can be related to each other they jointly contribute to minimize the motion cost of a frame macroblock.

According to a first aspect of the present invention, a certain video frame is divided into a plurality of small macroblocks and said macroblocks are further divided into a plurality of partitions for which sub-pixel motion compensation is applied, and for such macroblock partitions, a motion vector and an interpolation filter which minimizes a coding motion cost indication or criteria is determined. Since motion vector an interpolation filter are optimal for partitions of the frame macroblocks compression efficiency is improved.

In other words the method for sub-pixel motion-compensated video encoding of an incoming video sequence frame, can comprise the steps of: - dividing the frame into macroblocks; determining a set of partition modes for one of said macroblocks, dividing said macroblock into macroblock partitions, according to said set of partition modes; estimating fractional pixel displacements for at least one of said macroblock partitions, by:

- determining a motion vector for said macroblock; and

- determining an interpolation filter, from a set of interpolation filters, for said macroblock partition, said motion vector and said interpolation filter minimizing an encoding motion cost criteria for said partition of the macroblock, determining a partition mode from the set of partition modes for said macroblock, minimizing an encoding motion cost criteria for said macroblock, and performing motion compensation of said macroblock partition according to the determined partition mode, motion vector and interpolation filter. The set of interpolation filters is determined as comprising:

- a first set of interpolation filters comprising at least a first filter and a second filter, the second filter being the result of adapting at least one parameter of the first filter for minimizing a prediction error for a sub-pixel position, and/or

- a second set of interpolation filters comprising at least two different predetermined interpolation filters.

The first set of the interpolation filters and the second set of interpolation filters may be also determined and joined in just one common set of interpolation filters.

In particular, at least one of the sets of interpolation filters is determined and/or selected according to an indication at a frame slice level. Said indication or piece of information may be dynamically generated by other hierarchy video coding layers, such as frame or macroblock level layers, indicating that said determination and/or selection of the set of interpolation filters shall apply to macroblock partitions of some specific frames, slices or macroblocks, or such indication may be predetermined and apply, for example, to a certain number or all macroblock partitions of the video sequence frames.

According to an embodiment the second filter of the first set of interpolation filters is obtained by adapting filter coefficients of the first filter, in order to minimize a prediction error for a sub-pixel position, using least square estimation.

According to a specific embodiment, the filter coefficients of the second filter are of float value quantized to integer.

For example the second set of interpolation filters comprises interpolation filters with different tap sizes.

According to a specific embodiment, the second set of interpolation filters comprises at least one interpolation filter with short tap size, a short tap size being four or less coefficients, and at least one interpolation filter with long tap size, a long tap size being more than four coefficients. In another embodiment the step of determining an interpolation filter comprises a step of determining interpolation filter coefficient values and/or length of a customized interpolation filter, which is adapted to the video sequence frame, and - a step of inserting said interpolation filter coefficient values and/or length into an encoded bitstream corresponding to said video sequence frame.

The set of interpolation filters may comprise at least one of the following filters: 8-tap: (-2,6,-12,40, 40,-12,6,-2)/64

8-tap: (-1,3,-6,52, 20,-6,3,-l)/64 8-tap: (-1,3,-6,20, 52,-6,3,-l)/64 6-tap: (2, -8, 29,12,-4, 1)732 6-tap: (l,-4, 12, 29,-8, 2)/32 6-tap: (l, -5, 20, 20, -5, l)/32 4-tap: (-2, 10, 10, -2)/16 4-tap: (-2, 14, 5, -1)/16 4-tap: (-1, 5, 14, -2)/16

2-tap: (2,2)/4 2-tap: (3,l)/4 2-tap: (l,3)/4

The interpolation filter coefficient values, the interpolation filter tap size or both parameters (coefficient values and tap size), are locally adapted to the partitions of the macroblocks, the interpolation filter parameters are better adapted to local area video textures, which not only improves compression efficiency but reduces interpolation process complexity both in memory accessing and filtering calculation. According to still another embodiment of the invention, the bit rate of the compressed bitstream to be transmitted to a decoder can be also reduced.

According to still another aspect of the invention, the encoding method comprises the steps of checking if a motion vector is encoded for one of said macroblocks; - if yes, encoding at least one piece of information representative of the corresponding determined interpolation for said macroblock. For example, the method for sub-pixel motion-compensated video encoding according to an embodiment of the invention is not applied for macroblock partitions that are of SKIP or DIRECT mode. According to still another aspect of the invention, the encoding method comprises the steps of checking if the determined motion vector points to a sub-pixel position, and if yes, encoding the determined interpolation filter into a bitstream. When motion vector points to integer pixel position, the interpolation filter is not coded into the bitstream, which can further reduce bit rate for interpolation filter. According to another embodiment the determined interpolation filter is defined by a reference index or a set of reference coefficients values and a predictive residue.

The determined interpolation filter is defined by a reference index or a set of reference coefficients value, which may be predicted by a function of neighbor partitions filter indexes. The predictive residue may be coded by a variable length code.

The VLC can be, for example, a Huffman or Exponential Golomb code or a Context-Adaptive Binary Arithmetic Code. According to another specific feature of the invention the determined interpolation filter is represented by a filter index or a set of reference coefficients values and is combined with the motion vector, to be coded into a bitstream jointly with that motion vector.

According another specific feature of the invention the determined interpolation filter is represented by a filter index or a set of reference coefficients values and is combined with a reference frame index, to be coded into a bitstream jointly with that reference frame index.

According to another aspect of the invention, the computational complexity needed for sub-pixel motion estimation can be controlled and scaled so that it can be adapted to specific available codec resources. According to a specific embodiment of the invention, computation-coding complexity is reduced to make the method feasible for real time encoding scenarios.

According to a specific embodiment, for sub-pixel positions of partitions of frame macroblocks, the encoding method comprises the steps of: - calculating, for each of that sub-pixel positions, a coding motion cost indication associated with each interpolation filter of the set of interpolation filters, and from these calculated coding motion cost indications, determining the interpolation filter associated with the minimum coding motion cost indication for each of that sub-pixel positions, and from the previously obtained sub-pixel positions and associated coding motion cost indications, determining the motion vector corresponding to the sub-pixel position associated with the minimum coding motion cost indication for the partition of the frame macroblock.

Additionally, according to a specific embodiment, from the previously obtained partitions of the frame macroblock and associated coding motion cost, determining, a partition mode associated with the minimum coding motion cost indication for the frame macroblock.

According to a specific embodiment, for sub-pixel positions of partitions of frame macroblocks, the encoding method comprises the steps of: calculating, for each of that sub-pixel positions, the encoding motion cost indication associated with one selected interpolation filter of the set of interpolation filters, and from these calculated motion cost indications, determining the motion vector corresponding to the sub-pixel position associated with the minimum coding motion cost indication for the partition of the frame macroblock, and calculating, with the previously determined motion vector, the encoding motion cost indication associated with each interpolation filter of the set of interpolation filters, and from these calculated motion cost indications, determining the interpolation filter associated with the minimum coding motion cost indication for the partition of the frame macroblock.

Additionally, according to a specific embodiment, from the previously obtained partitions of the frame macroblock and associated coding motion cost, determining, a partition mode associated with the minimum motion cost indication for the frame macroblock.

According to a specific embodiment, the encoding method comprises the steps of: for sub-pixel positions of partitions of frame macroblocks,

- calculating, for each of that sub-pixel positions, the encoding motion cost indication associated with one selected interpolation filter of the set of interpolation filters, and from these calculated motion cost indications, determining the motion vector corresponding to the sub-pixel position associated with the minimum coding motion cost indication for the partition of the frame macroblock, and

- determining one partition mode for the frame macroblock, and for partitions of that macroblock partition mode, - calculating, with the previously determined motion vector for that partition of the frame macroblock, the coding motion cost indication associated with each interpolation filter of the set of interpolation filters, and from these calculated coding motion cost indications, determining the interpolation filter associated with the minimum coding motion cost indication for the partition of the frame macroblock.

Additionally, according to a specific embodiment, the step of determining one partition mode for the frame macroblock comprises determining, from the obtained partitions of the frame macroblock and associated coding motion cost, a partition mode associated with the minimum motion cost indication for the frame macroblock.

According to still another embodiment of the invention, the coding motion cost indication takes in consideration the coding bits needed to encode or transmit the determined interpolation filter. For example, the coding motion cost indication is determined according to the formula:

M= D + λm (MVr + Fr) wherein,

M represents the coding motion cost indication,

D represents a difference between a macroblock partition of the incoming frame and a predicted partition generated from a motion vector and an interpolation filter for a sub-pixel position, λm represents a tradeoff coefficient between D and encoding bit rate, MVr represents bits needed for coding a motion vector, and Fr represents bits needed for coding an interpolation filter, The invention also relates to a method for sub-pixel motion- compensated video decoding of a received encoded bitstream representative of a video sequence frame, said frame being organized into macroblocks. According to an embodiment of the invention the method comprises the steps of: obtaining from said bitstream:

- for at least one of said macroblocks: - a piece of information representative of a partition mode applied to said macroblock during coding, defining corresponding macroblock partitions of said macroblock; and

- for at least one of said macroblock partitions: - a motion vector, and

- an interpolation filter, and performing motion-compensated prediction for the partition according to the information representative of a partition mode, the motion vector and the interpolation filter. According to a specific embodiment, the step of decoding an interpolation filter associated with the macroblock partition is carried only if the motion vector of the macroblock partition is decoded from the bit stream. According to a specific embodiment, the step of obtaining from the bit stream the motion vector comprises obtaining a motion vector difference information from the bitstream, obtaining a motion vector prediction value and reconstructing the motion vector for the macroblock partition, andthe step of obtaining from the bit stream the interpolation filter comprises obtaining interpolation filter coefficients or a filter index.

According to an specific embodiment, the step of obtaining interpolation filter coefficients or a filter index from the bit stream is carried only if the motion vector of the partition is decoded from the bitstream and points to a sub-pixel position, otherwise the decoder will set a default interpolation filter for the macroblock partition.

According to another embodiment the step of obtaining from the bitstream the motion vector comprises obtaining a motion vector difference information, determining a motion vector prediction value and reconstructing the motion vector for the macroblock partition, and the step of obtaining the interpolation filter comprises determining, from the previously reconstructed motion vector, interpolation filter coefficients or a filter index for the partition of the macroblock.

According to an specific embodiment, the step of obtaining from the bit stream interpolation filter coefficients or a filter index from the motion vector is carried only if the motion vector of the partition is decoded from the bitstream, otherwise the decoder will set a default interpolation filter for the macroblock partition.

According to another specific embodiment, the step of performing motion compensation prediction of the macroblock partition comprises, - if the motion vector of current partition points to an integer-pixel position, obtaining a motion compensated prediction value of the macroblock partition by shifting a corresponding partition in a reference frame using the motion vector, or if the motion vector of the partition is not decoded from the bitstream, calculating a motion-compensated prediction value of the macroblock partition by interpolating a reference partition with the aid of a default interpolation filter, or in any other case, calculating a motion-compensated prediction value of the macroblock partition by interpolating a reference partition with the aid of an interpolation filter decoded from the bit stream.

Other embodiment of the invention relates to an encoding apparatus for video signals comprising means for sub-pixel motion-compensated prediction adapted to, for an incoming video sequence frame, - divide the frame into macroblocks and macroblock partitions, determine a motion vector and an interpolation filter, from a set of interpolation filters, for a macroblock partition, for estimating fractional pixel displacements of a macroblock partition, determine a partition mode for a macroblock, - perform motion compensation of a macroblock partition according to the determined partition mode, motion vector and interpolation filter, and wherein the means adapted to determine an interpolation filter and a motion vector for a macroblock partition determine an interpolation filter and a motion vector which minimize an encoding motion cost indication for that partition of the frame macroblock. Still other embodiment of the invention relates to a decoding apparatus for video signals comprising means for sub-pixel motion- compensated prediction adapted to receive an encoded bitstream, obtain from said bitstream:

- for at least one of said macroblocks:

- a piece of information representative of a partition mode applied to said macroblock during coding, defining corresponding macroblock partitions of said macroblock; and

- for at least one of said macroblock partitions:

- a motion vector, and - an interpolation filter, and perform motion-compensated prediction for the partition according to the information representative of a partition mode, the motion vector and the interpolation filter.

Another embodiment of the invention relates to a signal representative of an incoming video sequence comprising a series of frames, each frame being divided into macroblocks and macroblock partitions, characterized in that it comprises: a field defining a macroblock partition mode used for a frame macroblock; - a field defining a motion vector for a partition of said frame macroblock; a field defining an interpolation filter for said macroblock partition, from a set of interpolation filters, for estimating fractional pixel displacements of the macroblock partition. In particular the field defining an interpolation filter contains interpolation filter coefficients or filter indications which represent interpolation filters which are different in filter coefficients and/or tap size for partitions of the same frame macroblock. In particular the field defining an interpolation filter contains interpolation filter coefficients or reconstruction data comprising reference data to a predetermined filter and corresponding residue, permitting to reconstruct said interpolation filters. An embodiment of the invention also concerns a computer program product which can be downloaded from a communication network and/or previously stored on a computer readable medium and/or executable by a processor, comprising program instructions for implementing the method for sub- pixel motion-compensated video encoding as previously described. In another embodiment the invention concerns a computer program product which can be downloaded from a communication network and/or previously stored on a computer readable medium and/or executable by a processor, comprising program instructions for implementing the method for sub-pixel motion-compensated video decoding as previously described. Further embodiments and advantages of the present invention will become apparent from the depending claims and the following description of illustrative embodiments of the invention.

Although motion-compensated video signal prediction may be applied in the following examples, for video encoding and decoding, it shall be understood as just for the sake of simplification, and will be apparent that some of the steps of the processes described in the following illustrative embodiments are not limited to such application, since the same problems of the prior art may be found and solved according to the invention in applications such as video spatial up conversion, frame rate up conversion or de-interlacing. BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 represents a generally used video coding hierarchical syntax.

Figure 2 shows a flow chart of a video encoding process with sub-pixel motion compensation prediction according to an embodiment of the invention.

Figure 3 illustrates a first example of a sub-pixel motion estimation process according to an embodiment of the invention.

Figure 4 is a flow chart showing an illustrative process for adapting an interpolation filter for minimizing a prediction error according to a sub-pixel motion estimation embodiment of the invention. Figure 5 illustrates a second example of a sub-pixel motion estimation process according to an embodiment of the invention.

Figure 6 illustrates a third example of a sub-pixel motion estimation process according to an embodiment of the invention. Figure 7 illustrates a block diagram of a video codec architecture according to an embodiment of the invention.

Figure 8 shows a compressed bitstream syntax comprising optimal interpolation filter information determined according to an embodiment of the invention. Figure 9 shows an illustrative flow diagram for video decoding method according to an embodiment of the invention.

DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

Figure 2 shows a simplified flow chart illustrating an encoding process for a video sequence frame, e.g. an inter-frame, which uses AIF for sub-pixel motion-compensated video coding according to an embodiment of the invention, comprising the steps of integer-pixel motion estimation 200, sub-pixel motion estimation 205, partition mode selection 210, motion compensation 215 and encoding 220.

In general, the frame is divided into macroblocks and said macroblocks are further divided into partitions (macroblock partitions) of varying size. Motion compensated prediction can be performed independently for each macroblock partition. A separate motion vector may be required for each partition and each motion vector may be encoded into a compressed bitstream together with the choice of partition (partition mode). Also according to an embodiment of the invention, when the motion vector of a macroblock partition is coded into the bitstream and points to a frame sub- pixel position, an optimal interpolation filter for that macroblock partition will also be determined.

The step of integer pixel motion estimation 200 is carried out to determine the best integer motion vector for a certain partition of a frame macroblock. Then in step 205, sub-pixel motion estimation is performed to determine the optimal sub-pixel motion vector and optimal interpolation filter for that macroblock partition, and in step 210, the optimal partition mode of the frame macroblock is selected.

For estimating the optimal motion vector and interpolation filter for a partition of a macroblock in step 205 and selecting the optimal macroblock partition mode in step 210 a rate-distortion function may be used. Rate- distortion optimization may use a Lagrangian formulation wherein signal distortion is weighed against the bit rate needed to transmit the motion information. The optimal motion vector for a macroblock partition or the optimal partition-coding mode for a macroblock is the one that minimizes the Lagrange cost. The coding motion cost can also be calculated, for example, according to the following formula

M= D + λm (MVr + Fr) wherein M represents the coding motion cost indication, D represents distortion or a difference between the macroblock partition to be coded and the motion-compensated predicted partition associated to a motion vector and an interpolation filter for a sub-pixel position, λm represents a tradeoff coefficient between D and the encoding bit rate or coding bits, MVr represents the bits needed for coding the motion vector, and Fr represents the bits needed for coding the interpolation filter. With such formula, interpolation filter coding is also taken in consideration for calculating the coding bits or bit rate needed to code or transmit the motion information. Nevertheless, other specific formulas may be applied for indicating the coding motion cost.

In step 215, a particular macroblock partition is motion compensated using the coding partition mode information, motion vector and interpolation filter, so that the motion- compensated prediction of that macroblock partition is constructed, usually by referring to partitions of other of reconstructed reference frame or frames.

Finally, in step 220, the residue of the macroblock partition may be calculated and encoded into the compressed bitstream by typical operations such as transform, quantization and entropy coding and transmitted, for example, to a decoder, together with the information about the optimal macroblock partition mode, optimal motion vector and optimal interpolation filter. Steps 200-220 may be repeated to encode subsequent frames of the video sequence.

According to an embodiment of the invention, both the bitrate of the encoded bitstream and the coding complexity can be reduced by, for example, not applying AIF motion compensation when the macroblock is of "skipped" type, i.e. a macroblock for which no data is coded other than an indication that the macroblock is to be decoded as "skipped", or when the macroblock partition is of "direct prediction" type, i.e. a partition for which no motion vector is decoded. In such cases, neither the optimal motion vector nor the optimal interpolation filter shall be determined and coded into the bitstream. As indicated above, and according to a further embodiment of the invention, an interpolation filter is calculated and transmitted only for macroblock partitions with motion vector pointing to sub-pixel positions and/or only when the motion vector for a certain macroblock partition is coded into the bitstream.

Although the flow diagram of Figure 2 illustrates process steps 200 to 220, applied to video encoding, it is understood that not all steps described are essential for other video coding or video treatment applications of the method for motion-compensated video signal prediction according to the invention.

Referring to Figure 3, a flow chart of a sub-pixel motion estimation method according to an embodiment of the invention is disclosed, comprising the steps of initializing an interpolation filter 305, calculating an encoding motion cost indication 310, comparing motion cost indications 340, registering motion vector and interpolation filter 330, checking iterative interpolation filter set condition 345, checking iterative sub-pixel position condition 350, and providing new interpolation filter 355.

An interpolation filter is initialized both with default coefficient values and tap size (i.e. the number of filter coefficients or interpolation filter length) in step 305. Then in step 310, for a certain sub-pixel position of the macroblock partition and a certain interpolation filter, the coding motion cost is calculated. The coding motion cost is then compared in step 340 to a threshold value, which can be, for example, the minimum cost registered in previous iterations of the process for that partition or to a certain fixed value for that partition, and, if the calculated cost is less than the minimum value, the motion vector and interpolation filter associated with that cost is registered in step 330. The process follows with step 345, in which a first iterative process condition shall be met, namely, if a new interpolation filter shall be provided for that sub-pixel position. In case the condition is met and a new interpolation filter shall be provided for that sub-pixel position, the process follows with step 355, which provides a new interpolation filter for which a new coding motion cost indication shall be calculated in step 310. In case the condition in step 345 is not met, namely, no new interpolation filter shall be provided for that sub-pixel position, then the process follows with step 350, in which a second iterative process condition shall be met, namely, if the process shall be repeated for more sub-pixel positions of the macroblock partition. It shall be noted that the process for sub-pixel motion estimation according to the embodiment of the invention illustrated in Figure 3 can be repeated for all sub-pixel positions of the frame macroblock partition or for a reduced number of such sub-pixel positions. The determined optimal motion vector and interpolation filter will be the motion vector and interpolation filter that minimize a coding motion cost indication for a partition of the frame macroblock. Said sub-pixel motion estimation process may be also carried for all partitions of a frame macroblock and all possible partition modes of a macroblock, or for a reduced number of such partitions and partition modes.

The step of initializing the interpolation filter 305 may comprise the selection of one interpolation filter from a certain set of interpolation filters and step 355 provides a new different interpolation filter for every iteration, which according to the invention is an interpolation filter from a certain set of interpolation filters. Said set of interpolation filters may be a first set of interpolation filters or a second set of interpolation filters. The first set of interpolation filters is determined as comprising at least a first interpolation filter, which may be the initial filter selected in step 305, and a second interpolation filter, which is the result of adapting at least one parameter of the first filter for minimizing a prediction error for a sub-pixel position. The second set of interpolation filters is determined as comprising at least two predetermined filters.

According to a specific embodiment, the first set of interpolation filters or the second set of interpolation filters contains at least one of the interpolation filters of the following list:

8-tap: (-2,6,-12,40, 40,-12,6,-2)/64 8-tap: (-1,3,-6,52, 20,-6,3,-l)/64 8-tap: (-1,3,-6,20, 52,-6,3,-l)/64 6-tap: (2, -8, 29,12,-4, l)/32 6-tap: (1,-4, 12, 29,-8, 2)/32

6-tap: (1, -5, 20, 20, -5, l)/32 4-tap: (-2, 10, 10, -2)/16 4-tap: (-2, 14, 5, -1)/16 4-tap: (-1, 5, 14, -2)/16 2-tap: (2,2)/4

2-tap: (3,l)/4 2-tap: (l,3)/4

For example, the first set of interpolation filters may contain at least

6-tap filter (1, -5, 20, 20, -5, l)/32, which may be selected as the initial filter in step 305, and at least a second filter which is the result of adapting the coefficient values of the initially selected 6-tap filter for minimizing a prediction error for a sub-pixel position. An exemplary flow diagram illustrating how said coefficient adaptation can be achieved according to an embodiment of the invention is shown in Figure 4. The second set of interpolation filters may contain at least two filters from the above list of interpolation filters.

According to another embodiment, the second set of interpolation filters comprises interpolation filters with different tap sizes, for example at least one interpolation filter having a tap size being four or less, and at least one interpolation filter having a tap size greater than four. This may also be referred as comprising interpolation filters with "short" and "long" tap sizes, a short tap size being an interpolation filter having four or less coefficients and a long tap size being an interpolation filter having more than four coefficients.

The determination and use of the first set or the second set of interpolation filters, for partitions of frame macroblocks, may be decided by a flag indication having at least two different values, one indicating the determination and use of the first set of interpolation filters and the other indicating the determination and use of the second set of interpolation filters. Said flag indication may be dynamically generated by higher hierarchy video coding layers, such as frame, slice or macroblock level layers, indicating that said determination and selection of the set of interpolation filters, the first set or second set of interpolation filters, shall apply to macroblock partitions of some specific frames, slices or macroblocks, or such flag indication may be predetermined and apply, for example, to a certain number or all macroblock partitions of the video sequence frames. Said flag indication may also have further values or meanings for indicating, for example, the number of interpolation filters inside the interpolation filter set, or even the filters to be selected. As an example, the flag indication may have a binary value "0" indicating that the first set of interpolation filters shall be determined and used and the said set shall contain 2 interpolation filters, and accordingly, step 305 will initialize the interpolation filter to 6-tap filter (1, -5, 20, 20, -5, l)/32 and step 355 will determine a second interpolation filter, by adapting, for example, the filter coefficient values of the previously initialized 6-tap filter, and provide said second filter to step 310. If flag indication has a binary value "1" this could indicate the determination and use of the second filter set and said set containing three interpolation filters, and accordingly, step 305 will initialize the interpolation filter to 6-tap filter (1, -5, 20, 20, -5, l)/32 and step 355 will provide one of 4-tap filter (-2, 14, 5, -1)/16 and 2-tap filter (2,2)/4 for each process iteration according to condition in step 345.

Condition in step 345 could be then checking if all the interpolation filters of the filter set, the first or the second filter set, have been provided in step 355 for a certain sub-pixel position. Alternatively said condition could be checking if a certain number of the interpolation filters of the filter set, the first or the second filter set, have been provided in step 355. According to the invention, for determining the optimal interpolation filter, the interpolation filter coefficient values, the interpolation filter tap size or both parameters (coefficient values and tap size), are locally adapted to the partitions of the frame macroblocks. This provides the advantage to further improve compression efficiency by better adapting the interpolation filter parameters to local area video textures, and reduces interpolation process complexity both in memory accessing and filtering calculation. As an example, long and short tap optimal interpolation filters may be determined and used for certain macroblock partitions according to the invention, for example, long-tap interpolation filters may be determined and used for stationary image areas, which advantageously provides a better frequency response, and short-tap filters may be selected and used for edge and smooth video textures.

For applications, as for example video encoding and decoding, in which the optimal interpolation filter shall be also encoded into a compressed bitstream and transmitted to a decoder apparatus, it is advantageous that the bitstream bit rate can be reduced. For this purpose, and according to the invention, the determined interpolation filter may be represented just by certain filter coefficient values or a filter index. In addition to the measures indicated for reducing the bitstream bit rate associated to the embodiment of the invention illustrated in Figure 2, the determined optimal interpolation filter can be coded in the following ways a) the filter coefficient values or filter index can be firstly predicted by a function of neighbor partitions filter coefficients or filter index. The function can be linear such as an average function, or non-linear such as a median function, or just as zero. Then the predictive residue is coded by variable length code, such as a Huffman or Exponential Golomb code or a Context-Adaptive Binary Arithmetic Code, b) the filter coefficient values or filter index can be padded after the motion vector, and be coded jointly with motion vector, or c) the filter coefficient values or filter index can be combined with a reference index, and be coded jointly with the reference index. Figure 4 is a flow diagram illustrating an example of how to adapt the filter coefficient values of a certain interpolation filter or filter set in order to reduce a prediction error, comprising the steps of parameter decision 400, training model decision 405, interpolation filter training 410, filter coefficients quantization 415 and filter coefficients coding 420. The filter will be optimized to derive an optimum filter for a certain partition so as to improve the motion compensation and thereby enhance the coding efficiency. The objective of the optimization is to minimize the prediction error between the partition to be coded and the predicted partition by, for example, a Least Square Estimation. The prediction error is represented by the sum of (e)² using the following formula

x y wherein, S represents the partition to be coded; S_pre represents the predicted partition; x and y represent x and y coordinates, respectively, of a pixel of the partition to be coded.

Before optimizing a filter or a filter set, it is necessary to determine parameter values of the filter or filter set in step 400 and the filtering pattern in step 405 according to the practical requirements.

The parameter values comprises such as sub-pixel resolution for determining the number of filters needed for the filter set, filter taps representing the size of each filter of the filter set. The filtering pattern includes filtering patterns in respect of each sub-pixel position as well as the relationship among the filters.

In step 410, coefficients of the filter set (i.e. coefficients of each filter with a specific sub-pixel resolution) are adaptively trained for minimizing the square error (e)². The prediction frame S_pre can be calculated using the following formula

^SP^reχ,y

wherein N^χM is the size of a filter, P represents the reference frame; (mvx, mvy) represents the motion vectors of a certain sub-pixel at the position (x, y); h represents the filter coefficients for that sub-pixel position. The filter size is decided by filter taps which were determined in step 400. As stated above, the square error (e)² can be obtained by using the following formula

wherein e represents the difference between the current raw partition and a prediction of the current raw partition; S represents the current raw picture; P represents the reference picture; x and y represent the x and y coordinates, respectively; (mvx, mvy) represents the motion vectors; h represents the float filter coefficients, i, j represent the coordinates of filter coefficients; and (M, N) represents filter tap size in horizontal direction and vertical direction. The training of a filter set in step 410 is to calculate optimum filter coefficients h for minimizing the square error (e)². The coefficients h are float coefficients. Therefore, the filter set obtained is a global optimum interpolation filter set.

Then, step 415 is carried out for mapping the float filter coefficients to quantization filter coefficients according to a required precision. It is understood that this mapping step is employed for facilitating the training of the interpolation filter.

Referring now back to Figure 3, the filter with quantized coefficients is the trained or adapted interpolation filter, which has been determined in step 355 of Figure 3 and will be provided to step 310 for the calculation of the coding motion cost for a certain sub-pixel position. The motion cost will be compared in step 340 to a certain value and, if said cost is less than a threshold value, said adapted filter will be registered as the optimal interpolation filter for that sub-pixel position. Figure 5 illustrates a second illustrative flow diagram of a sub-pixel motion estimation process according to an embodiment of the invention, comprising a first process for determining an optimal motion vector 390 followed by a second process for determining an optimal interpolation filter 395. The first process for determining an optimal motion vector 390 comprises the steps of initializing an interpolation filter 305, calculating an encoding motion cost indication 310, comparing motion cost indications 340, registering motion vector and interpolation filter 330, checking iterative sub-pixel position condition 350. The second process for determining an optimal interpolation filter 395 comprises the steps of providing new interpolation filter 355, calculating an encoding motion cost indication 310', comparing motion cost indications 340', registering motion vector and interpolation filter 330', and checking iterative interpolation filter set condition 345.

Steps 305, 310, 340, 330, 350, 355 and 345 provide the same functions as indicated for the sub-pixel motion estimation embodiment of Figure 3. Steps 310', 340' and 330' are also the same steps as steps 310, 340 and 330 of Figure 3. Basically, the difference between the embodiments for sub-pixel motion estimation of Figures 3 and 5 is just the arrangement of the different process steps. While in Figure 3, for each sub-pixel position, steps 345 and 355 were iteratively applied, in Figure 5 said steps are only applied to the optimal sub-pixel position of the macroblock partition, that is, the sub-pixel position providing the minimum coding motion cost according to a default interpolation filter. It will be noted that step 330 of Figure 5 will register the optimal motion vector providing the minimum motion cost according to an initial interpolation filter selected in step 305. Since said initial interpolation filter may not be the optimal interpolation filter for the macroblock partition, step 355 in combination with step 345 will provide new interpolation filters of a filter set for which the coding motion cost will be calculated, in step 310', and compared to the minimum motion cost registered for the macroblock partition, in step 340'. In case the motion cost of any of the new interpolation filters of the filter set is less than the minimum motion cost registered for the macroblock partition, it will be registered in step 330' as the optimal interpolation filter associated to that optimal motion vector.

This particular embodiment presents the advantage that it greatly reduces the computational complexity of the filtering process, compared with the computational complexity needed in Figure 3.

Figure 6 illustrates a third illustrative flow diagram of a sub-pixel motion estimation process according to an embodiment of the invention, comprising a process for determining an optimal motion vector 390 followed by a process for determining an optimal macroblock partition mode 600 and followed by a process for determining an optimal interpolation filter 395. Process step 390 and 395 provide the same functions as indicated for the sub-pixel motion estimation embodiment of Figure 5, and process step 600 provides the same function as indicated for the step of partition mode selection 210 of Figure 2. The difference between the embodiments for sub-pixel motion estimation of Figures 5 and 6 is just the introduction of the process for determining an optimal macroblock partition mode 600 between the process for determining an optimal motion vector 390 and the process for determining an optimal interpolation filter 395.

While in Figure 5 the process for determining an optimal interpolation filter 395 is applied to the optimal sub-pixel position of the macroblock partition, for all or a reduced number of macroblock partition modes, now the process for determining an optimal interpolation filter 395 in Figure 6 is applied only to the optimal sub-pixel position of those partitions associated to the optimal partition mode. The computational complexity is thus further reduced, which makes this particular embodiment feasible for use in real time encoding scenarios.

According to the invention, for applications such as, but not limited to, video encoding and decoding, any of the sub-pixel motion estimation embodiments shown in Figures 3, 5 and 6 can be implemented individually or in combination and selectively applied to determine the optimal motion vector and interpolation filter for a particular macroblock partition, thus making the computational complexity of the encoder scalable and adaptable to encoder available resources.

Figure 7 shows a simplified block diagram of a video codec architecture 170 comprising means for sub-pixel motion-compensated prediction according to an embodiment of the invention. The video codec comprises an encoder 171 and a decoder 172. The encoder 171 comprises a motion estimation module 100, an interpolation filter determination module 105, a partition mode selection module 110, a motion compensation module 115, an adder- subtracter 120, an encoding module 125, and a feedback- decoding module 130. The decoder 172 comprises a decoding module 150, a motion compensation module 155, and a reconstruction module 160. For a video sequence, a certain frame input Pc, namely, a raw image signal to be coded by the encoder 171 is divided into macroblocks and directed to the motion estimation module 100 where integer pixel motion and sub-pixel motion estimation is performed. Sub-pixel motion estimation is carried, for each candidate sub-pixel position of the macroblock partitions, with the aid of an interpolation filter Fo determined according to the invention by the interpolation filter determination module 105. The motion estimation module outputs motion information Mi, such as a reference picture index, motion vector, and interpolation filter Fo associated with that motion information.

The partition mode selection module 1 10 may take motion information Mi and interpolation filter Fo as input to make a selection on the encoding mode of the macroblock. The determination of the best partition mode for a macroblock can be based on a rate-distortion optimization function taking in consideration motion vector and interpolation filter coding information.

The partition mode selection module 110 provides encoding mode information Mo, such as partition mode, reference picture index and motion vector, and determined interpolation filter Fo to the motion compensation module 115, where a motion-compensated predictive partition Pre of current macroblock is constructed and fed to the adder-subtracter module 120. A macroblock of the frame input Pc may be predicted by a motion compensated prediction technique based on a reference frame Pc-I which was obtained by reconstructing a previous encoded frame in the feedback decoding module 130. The predicted macroblock partition Pre is subtracted from the corresponding macroblock partition of the input frame Pc, to get the residue partition information R at the output of adder- subtracter module 120.

The residue partition information R from adder- subtracter module 120, the encoding mode information Mo and the determined interpolation filter Fo from the motion compensation module 115, are given to the encoding module 125, in which the operations for transform, quantization and entropy coding are performed to produce a coded bitstream, which is transmitted to the decoding module 172. To get identical motion compensation prediction signal for encoder 171 and decoder 172, the encoded frame input Pc is reconstructed as reference frame Pc-I in the encoder by feedback module 130.

The decoding module 150 decodes the encoded residue partition information R, the encoding mode information Mo, such as partition mode, reference picture index and motion vector, and the associated determined interpolation filter Fo. Then, it transmits these decoded signals to the decoder motion compensation module 155.

The decoder motion compensation module 155 determines then the samples to be interpolated according to the decoded motion vector and for interpolating the reference frame so as to recover the motion compensated prediction frame based on the decoded difference and motion vectors by using the determined interpolation filter Fo.

The reconstruction module 160 receives the decoded difference R' from the decoding module 150 and the motion compensated prediction block Pre' from the decoder motion compensation module 155 so as to obtain a reconstructed frame Pd of the coded frame input Pc.

Figure 8 illustrates a compressed bitstream syntax comprising macroblock type MT, reference index RI, motion vector MV and interpolation filter IF information. The macroblock type, motion vector and interpolation filter may be calculated according to an embodiment for sub-pixel motion-compensated prediction of the invention. A new syntax comprising information about the determined "interpolation filter" is introduced into the bitstream.

Referring to Figure 9, we will explain an illustrative method for decoding a compressed bitstream of the type shown in figure 8, comprising information about a determined motion vector and interpolation filter according to an embodiment of the invention. The method comprises the steps of decoding a macroblock partition mode 800, decoding reference frame and motion vector for the macroblock partition 805, identifying interpolation filter need 810, decoding interpolation filter 820, setting default interpolation filter 815, performing motion compensation 825, decoding residue information 830, reconstructing macroblock partition 835, and checking iterative macroblock condition 840. A macroblock is the basic decoding unit for the decoder. Before initiating the decoding of the macroblock, the related sequence, frame and slice information are previously decoded from video encoded bitstream. Then the macroblock type information is decoded from the encoded video bitstream, from which information about the number and size of the macroblock partition is obtained in step 800.

Then, for each macroblock partition, the associated information such as reference frame and motion vector are decoded in step 805. In step 810, the decoded motion vector is checked to see if motion vector information is encoded in bitstream, and in that case, the corresponding interpolation filter is decoded from bitstream in step 820. Otherwise, a default interpolation filter is set in step 815.

In step 825, with the reference frame, motion vector and interpolation filter information, motion compensation is performed to obtain the motion compensation prediction information and, at the same time, the residue information is decoded from the bitstream in step 830. After obtaining the motion compensation prediction and residue information, the current macroblock can be reconstructed in step 835.

In step 840 it is checked if there is another partition contained in the decoded macroblock, and in that case repeating steps 805 to 835 are performed to reconstruct the next macroblock partition and until all partitions in current macroblock are decoded.

The above decoding process may be repeated until all the macroblocks of the frame are decoded, and until all the frames of the video sequence are decoded.

Claims

1. A method for sub-pixel motion-compensated video encoding comprising, for an incoming video sequence frame, the steps of:

- dividing the frame into macroblocks and macroblock partitions, - determining a motion vector and an interpolation filter, from a set of interpolation filters, for a macroblock partition, for estimating fractional pixel displacements of said macroblock partition, and characterized in that it comprises the steps of:

- determining a partition mode for a macroblock, - performing motion compensation of a macroblock partition according to the determined partition mode, motion vector and interpolation filter, and in that the step of determining an interpolation filter and a motion vector for a macroblock partition comprises determining an interpolation filter and a motion vector which minimize an coding motion cost indication for that partition of the frame macroblock.

2. The method for sub-pixel motion-compensated video encoding of claim

1, characterized in that the set of interpolation filters is determined as comprising: - a first set of interpolation filters comprising at least a first filter and a second filter, the second filter being the result of adapting at least one parameter of the first filter for minimizing a prediction error for a sub-pixel position, and/or

3. The method for sub-pixel motion-compensated video encoding of claim

2, characterized in that at least one of the sets of interpolation filters is determined and/or selected according to an indication at a frame slice level.

4. The method for sub-pixel motion-compensated video encoding according to claims 2 or 3, characterized in that a second filter of the first set of interpolation filters is obtained by adapting filter coefficients of the first filter, in order to minimize a prediction error for a sub-pixel position, using least square estimation.

5. The method for sub-pixel motion-compensated video encoding according to any one of claims 2 or 4, characterized in that the second set of interpolation filters comprises interpolation filters with different tap sizes.

6. The method for sub-pixel motion-compensated video encoding according to any one of claims 1 to 5, characterized in that said step of determining an interpolation filter comprises

- a step of determining interpolation filter coefficient values and/or length of a customized interpolation filter, which is adapted to the video sequence frame, and - a step of inserting said interpolation filter coefficient values and/or length into an encoded bitstream corresponding to said video sequence frame.

7. The method for sub-pixel motion-compensated video encoding according to any one of claims 1 to 6, characterized in that the set of interpolation filters comprises at least one of the following filters:

8-tap: (-2,6,-12,40, 40,-12,6,-2)/64

8-tap: (-1,3,-6,52, 20,-6,3,-l)/64

8-tap: (-1,3,-6,20, 52,-6,3,-l)/64

6-tap: (2, -8, 29,12,-4, l)/32 6-tap: (1,-4, 12, 29,-8, 2)/32

6-tap: (1, -5, 20, 20, -5, l)/32

4-tap: (-2, 10, 10, -2)/16

4-tap: (-2, 14, 5, -1)/16

4-tap: (-1, 5, 14, -2)/16 2-tap: (2,2)/4

2-tap: (3,l)/4

2-tap: (l,3)/4

8. The method for sub-pixel motion-compensated video encoding according to any one of claims 1 to 7, further comprising: - checking if a motion vector is encoded for one of said macroblocks;

- if yes, encoding at least one piece of information representative of the corresponding determined interpolation for said macroblock.

9. The method for sub-pixel motion-compensated video encoding according to any one of claims 1 to 7, further comprising:

- checking if the determined motion vector points to a sub-pixel position, and - if yes, encoding the determined interpolation filter into a bitstream.

10. The method for sub-pixel motion-compensated video encoding according to any one of claims 1 to 9, wherein the determined interpolation filter is defined by a reference index or a set of reference coefficients values and a predictive residue.

11. The method for sub-pixel motion-compensated video encoding according to any one of claims 1 to 10, wherein the determined interpolation filter is represented by a filter index or a set of reference coefficients values and is combined with the motion vector, to be coded into a bitstream jointly with that motion vector.

12. The method for sub-pixel motion-compensated video encoding according to any of the claims 1 to 10, wherein the determined interpolation filter is represented by a filter index or a set of reference coefficients values and is combined with a reference frame index, to be coded into a bitstream jointly with that reference frame index.

13. The method for sub-pixel motion-compensated video encoding according to any one of claims 1 to 12, characterized by, for sub-pixel positions of partitions of frame macroblocks, the steps of:

- calculating, for each of that sub-pixel positions, a coding motion cost indication associated with each interpolation filter of the set of interpolation filters, and

- from these calculated coding motion cost indications, determining the interpolation filter associated with the minimum coding motion cost indication for each of that sub-pixel positions, and

- from the previously obtained sub-pixel positions and associated coding motion cost indications, determining the motion vector corresponding to the sub-pixel position associated with the minimum coding motion cost indication for the partition of the frame macroblock.

14. The method for sub-pixel motion-compensated video encoding according to any one of claims 1 to 12, characterized by, for sub-pixel positions of partitions of frame macroblocks, the steps of:

- calculating, for each of that sub-pixel positions, the encoding motion cost indication associated with one selected interpolation filter of the set of interpolation filters, and from these calculated motion cost indications, determining the motion vector corresponding to the sub- pixel position associated with the minimum coding motion cost indication for the partition of the frame macroblock, and - calculating, with the previously determined motion vector, the encoding motion cost indication associated with each interpolation filter of the set of interpolation filters, and from these calculated motion cost indications, determining the interpolation filter associated with the minimum coding motion cost indication for the partition of the frame macroblock.

15. The method for sub-pixel motion-compensated video encoding according to any one of claims 1 to 12, characterized by the steps of:

- for sub-pixel positions of partitions of frame macroblocks,

- determining one partition mode for the frame macroblock, and

- for partitions of that macroblock partition mode,

- calculating, with the previously determined motion vector for that partition of the frame macroblock, the coding motion cost indication associated with each interpolation filter of the set of interpolation filters, and from these calculated coding motion cost indications, determining the interpolation filter associated with the minimum coding motion cost indication for the partition of the frame macroblock.

16. The method for sub-pixel motion-compensated video encoding according to any of the claims 1 to 15, characterized in that the coding motion cost indication is determined according to the formula:

M= D + λm (MVr + Fr) wherein,

M represents the coding motion cost indication,

D represents a difference between a macroblock partition of the incoming frame and a predicted partition generated from a motion vector and an interpolation filter for a sub-pixel position, λm represents a tradeoff coefficient between D and encoding bit rate, MVr represents bits needed for coding a motion vector, and Fr represents bits needed for coding an interpolation filter,

17. A method for sub-pixel motion-compensated video decoding of a received encoded bitstream representative of a video sequence frame, said frame being organized into macroblocks, characterized by the steps of:

- obtaining from said bitstream:

- for at least one of said macroblock partitions: - a motion vector, and - an interpolation filter, and

- performing motion-compensated prediction for the partition according to the information representative of a partition mode, the motion vector and the interpolation filter.

18. The method for sub-pixel motion-compensated video decoding of claim 17, characterized in that,

- the step of obtaining from the bit stream the motion vector comprises obtaining a motion vector difference information, obtaining a motion vector prediction value and reconstructing the motion vector for the macroblock partition, and

- the step of obtaining from the bit stream the interpolation filter comprises obtaining interpolation filter coefficients or a filter index.

19. The method for sub-pixel motion-compensated video decoding of claim 17, characterized in that

- the step of obtaining from the bitstream the motion vector comprises obtaining a motion vector difference information, determining a motion vector prediction value and reconstructing the motion vector for the macroblock partition, and

- the step of obtaining the interpolation filter comprises determining, from the previously reconstructed motion vector, interpolation filter coefficients or a filter index for the partition of the macroblock.

20. An encoding apparatus for video signals comprising means for sub-pixel motion-compensated prediction adapted to, for an incoming video sequence frame,

- divide the frame into macroblocks and macroblock partitions,

- determine a motion vector and an interpolation filter, from a set of interpolation filters, for a macroblock partition, for estimating fractional pixel displacements of a macroblock partition, and characterized by further comprising means adapted to

- determine a partition mode for a macroblock,

- perform motion compensation of a macroblock partition according to the determined partition mode, motion vector and interpolation filter, and wherein the means adapted to determine an interpolation filter and a motion vector for a macroblock partition determine an interpolation filter and a motion vector which minimize an encoding motion cost indication for that partition of the frame macroblock.

21. A decoding apparatus for video signals comprising means for sub-pixel motion- compensated prediction adapted to - receive an encoded bitstream, and characterized by further comprising means adapted to

- obtain from said bitstream:

- for at least one of said macroblock partitions: - a motion vector, and

- an interpolation filter, and

- perform motion-compensated prediction for the partition according to the information representative of a partition mode, the motion vector and the interpolation filter.

22. A signal representative of an incoming video sequence comprising a series of frames, each frame being divided into macroblocks and macroblock partitions, characterized in that it comprises:

- a field defining a macroblock partition mode used for a frame macroblock; - a field defining a motion vector for a partition of said frame macroblock;

- a field defining an interpolation filter for said macroblock partition, from a set of interpolation filters, for estimating fractional pixel displacements of the macroblock partition.

23. The signal of claim 22 characterized in that the field defining an interpolation filter contains interpolation filter coefficients or filter indications which represent interpolation filters which are different in filter coefficients and/or tap size for partitions of the same frame macroblock.

24. A computer program product which can be downloaded from a communication network and/or previously stored on a computer readable medium and/or executable by a processor, comprising program instructions for implementing the method for sub-pixel motion-compensated video encoding according to any one of claims 1 to 16.

25. A computer program product which can be downloaded from a communication network and/or previously stored on a computer readable medium and/or executable by a processor, comprising program instructions for implementing the method for sub-pixel motion-compensated video decoding according to any one of claims 17 to 19.