US20120307074A1 - Method and apparatus for reduced reference video quality measurement - Google Patents

Method and apparatus for reduced reference video quality measurement

Info

Publication number: US20120307074A1
Application number: US13/151,761
Other versions: US8520075B2
Authority: United States (US)
Legal status: Granted; currently active
Inventors: Sitaram Bhagavathy, Jeffrey A. Bloom, Dekun Zou, Ran Ding, Beibei Wang, Tao Liu, Niranjan Narvekar
Current assignee: Dialogic Inc. / Dialogic Corp. USA
Original assignee: Individual

Classifications

    • H04N 17/004 — Diagnosis, testing or measuring for digital television systems
    • H04N 19/154 — Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
    • H04N 19/86 — Pre-/post-processing specially adapted for video compression involving reduction of coding artifacts, e.g. of blockiness
    • H04N 19/132 — Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N 19/172 — Adaptive coding characterised by the coding unit, the unit being a picture, frame or field
    • H04N 19/40 — Video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream

Definitions

  • the present application relates generally to systems and methods of objective video quality measurement, and more specifically to systems and methods of objective video quality measurement that employ a reduced-reference approach.
  • systems and methods are known that employ a full-reference approach, a no-reference approach, and a reduced-reference approach to video quality measurement.
  • systems that employ a full-reference approach to video quality measurement typically receive target video content (also referred to herein as a/the “target video”) whose perceptual quality is to be measured, and compare information from the target video to corresponding information from a reference version (also referred to herein as a/the “reference video”) of the target video to provide a measurement of the perceptual quality of the target video.
  • Systems that employ a reduced-reference approach to video quality measurement typically have access to a reduced amount of information from the reference video for comparison to the target video information.
  • information from the reference video can include a limited number of characteristics of the reference video, such as its spectral components, its variation of energy level, and/or its energy distribution in the frequency domain, each of which may be sensitive to degradation during processing and/or transmission of the target video.
  • systems and methods of objective video quality measurement are disclosed that employ a reduced-reference approach.
  • Such systems and methods of objective video quality measurement can extract information pertaining to one or more features (also referred to herein as “target features”) of a target video whose perceptual quality is to be measured, extract corresponding information pertaining to one or more features (also referred to herein as “reference features”) of a reference video, and employ one or more prediction functions involving the target features and the reference features to provide a measurement of the perceptual quality of the target video.
  • the target feature extractor is operative to extract one or more target features from the target video by performing one or more objective measurements with regard to the target video.
  • the reference feature extractor is operative to extract one or more reference features from the reference video by performing one or more objective measurements with regard to the reference video.
  • Such objective measurements performed on the target video and the reference video can include objective measurements of blocking artifacts in the respective target and reference videos (also referred to herein as “blockiness measurements”), objective measurements of blur in the respective target and reference videos (also referred to herein as “blurriness measurements”), objective measurements of an average quantization index for the respective target and reference videos, as examples, and/or any other suitable types of objective measurements.
  • objective measurements can result in target features and reference features that can be represented by compact data sets, which may be transmitted over a network without consuming an undesirably excessive amount of network bandwidth.
  • the term “quantization index” corresponds to any suitable parameter that can be adjusted to control the quantization step-size used by a video encoder.
  • a QI can correspond to a quantization parameter (also referred to herein as a/the “QP”) for video bitstreams compressed according to the H.264 coding format, a quantization scale for video bitstreams compressed according to the MPEG-2 coding format, or any other suitable parameter for video bitstreams compressed according to any other suitable coding format.
  • the quality assessor is operative to provide an assessment of the perceptual quality of the target video following its transmission over the network to an endpoint device, using one or more prediction functions involving the target features and the reference features.
  • one or more of the prediction functions can be linear prediction functions or non-linear prediction functions.
  • an endpoint device can be a mobile phone, a mobile or non-mobile computer, a tablet computer, or any other suitable type of mobile or non-mobile endpoint device capable of displaying video.
  • the perceptual quality of each of the target video and the reference video can be represented by a quality assessment score, such as a predicted mean opinion score (MOS).
  • the quality assessor is operative to estimate the perceptual quality of the target video by obtaining a difference between an estimate of the perceptual quality of the reference video, and an estimate of the predicted differential MOS (also referred to herein as a/the “DMOS”) between at least a portion of the reference video and at least a portion of the target video.
  • the estimate of the perceptual quality of the target video (also referred to herein as the “Q̂_tar”) can be expressed as Q̂_tar = Q̂_ref − ΔQ̄, in which “Q̂_ref” corresponds to the estimate of the perceptual quality of the reference video, and “ΔQ̄” corresponds to the estimate of the DMOS between the reference video and the target video.
  • the DMOS between the reference video and the target video can be expressed as ΔQ = (Q_ref − Q_tar).
  • the quality assessor is further operative to calculate or otherwise determine the Q̂_ref using a first prediction function for a predetermined segment from a corresponding time frame within the reference video and the target video.
  • the Q̂_ref can be expressed as Q̂_ref = ƒ1(QI_ref), in which “ƒ1(QI_ref)” corresponds to the first prediction function, and “QI_ref” corresponds to the QI for the reference video.
  • the first prediction function, ƒ1(QI_ref), can be a linear function of the QI for the reference video, and/or any other suitable reference feature(s).
  • the quality assessor is also operative to calculate or otherwise determine the ΔQ̄ using a second prediction function for the predetermined segment.
  • the ΔQ̄ can be expressed as ΔQ̄ = ƒ2(Δblr, Δblk), in which “ƒ2(Δblr, Δblk)” corresponds to the second prediction function, “Δblr” corresponds to the average change in frame-wise blurriness measurements between the reference video and the target video for the predetermined segment, and “Δblk” corresponds to the average change in frame-wise blockiness measurements between the reference video and the target video for the predetermined segment.
  • the second prediction function, ƒ2(Δblr, Δblk), can be a linear function of the average change in the frame-wise blurriness measurements for the respective target and reference videos, the average change in the frame-wise blockiness measurements for the respective target and reference videos, and/or any other suitable reference feature(s) and target feature(s).
  • the quality assessor is operative to estimate the perceptual quality of the target video, Q̂_tar, using a third prediction function based on the first prediction function, ƒ1(QI_ref), and the second prediction function, ƒ2(Δblr, Δblk).
  • the Q̂_tar can be expressed as Q̂_tar = ƒ3(QI_ref, Δblr, Δblk) = a3·QI_ref + b3·Δblr + c3·Δblk + d3, in which “ƒ3(QI_ref, Δblr, Δblk)” corresponds to the third prediction function, and “a3,” “b3,” “c3,” and “d3” each correspond to a parameter coefficient of the third prediction function.
  • the value of each of the parameter coefficients a 3 , b 3 , c 3 , and d 3 can be determined using a multi-variate linear regression approach, based on a plurality of predetermined target video bitstreams and their corresponding reference video bitstreams, and ground truth quality values, or any other suitable technique.
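  • as a concrete illustration, the following minimal Python sketch evaluates such a linear third prediction function; it assumes the linear form and the exemplary coefficient values (about 0.025, 0.19, 1.85, and 4.28) given later in this disclosure, and the segment-level feature values passed in are hypothetical:

```python
# Hedged sketch of the third prediction function, f3, assuming the linear form
# Q_tar = a3*QI_ref + b3*delta_blr + c3*delta_blk + d3 described above.
# Coefficients are the exemplary values given later in this disclosure.
A3, B3, C3, D3 = 0.025, 0.19, 1.85, 4.28

def predict_target_mos(qi_ref: float, delta_blr: float, delta_blk: float) -> float:
    """Estimate the perceptual quality (predicted MOS) of the target video."""
    return A3 * qi_ref + B3 * delta_blr + C3 * delta_blk + D3

# Hypothetical segment-level features: average reference QI, and average changes
# in blurriness/blockiness between the reference and target segments.
print(predict_target_mos(qi_ref=28.0, delta_blr=-0.4, delta_blk=-1.2))
```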
  • the target feature extractor, the reference feature extractor, and the quality assessor can be implemented in a distributed fashion within a video communications environment.
  • the target feature extractor can be located proximate to or co-located with the quality assessor, such as within the endpoint device, and the reference feature extractor can be disposed at a distal or geographically remote location from the target feature extractor and the quality assessor.
  • the disclosed system can transmit the reference features from the reference feature extractor at the distal or geographically remote location to the quality assessor, which, in turn, can access the target features from the target feature extractor proximate thereto or co-located therewith for estimating the perceptual quality of the target video.
  • the reference feature extractor can be located proximate to or co-located with the quality assessor, and the target feature extractor can be disposed at a distal or geographically remote location from the reference feature extractor and the quality assessor.
  • the disclosed system can transmit the target features from the target feature extractor at the distal or geographically remote location to the quality assessor, which, in turn, can access the reference features from the reference feature extractor proximate thereto or co-located therewith for estimating the perceptual quality of the target video.
  • the quality assessor may be disposed at a centralized location that is geographically remote from the target feature extractor and the reference feature extractor.
  • the disclosed system can transmit the target features from the target feature extractor to the quality assessor, and transmit the reference features from the reference feature extractor to the quality assessor, for estimating the perceptual quality of the target video within the quality assessor at the geographically remote, centralized location.
  • the disclosed systems and methods can operate to transmit the reference features and/or the target features over a network to a quality assessor for assessing the perceptual quality of the target video, without consuming an undesirably excessive amount of network bandwidth. Further, by providing for such a perceptual quality assessment of the target video using one or more prediction functions, the perceptual quality assessment of the target video can be performed within an endpoint device having limited processing power. Moreover, by using the average values of frame-wise objective measurements for a predetermined segment from a corresponding time frame within the reference video and the target video, fluctuations in the perceptual quality assessment of the target video can be reduced.
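  • as a rough, hypothetical illustration of such a compact data set, the sketch below serializes a handful of segment-level reference features into a small JSON payload that could be carried over a side channel to the quality assessor; the field names and values are assumptions for illustration only:

```python
import json

# Hypothetical per-segment reference-feature record; the field names are
# illustrative assumptions, not defined by this disclosure. A few numbers per
# segment is far smaller than transmitting the reference video itself.
reference_features = {
    "segment_index": 12,      # which predetermined segment (e.g., 5 seconds)
    "avg_qi": 28.0,           # segment-average quantization index (QI_ref)
    "avg_blurriness": 2.1,    # segment-average frame-wise blurriness
    "avg_blockiness": 0.9,    # segment-average frame-wise blockiness
    "frame_rate": 30.0,       # reference frame rate (fps_ref)
    "frame_width": 1280,      # used to normalize blurriness across resolutions
}

payload = json.dumps(reference_features).encode("utf-8")
print(len(payload), "bytes per segment")  # tens of bytes, not megabytes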
  • FIG. 1 is a block diagram of an exemplary video communications environment, in which an exemplary system for measuring the perceptual quality of a target video employing a reduced-reference approach to video quality measurement can be implemented, in accordance with an exemplary embodiment of the present application;
  • FIG. 2 a is a block diagram of an exemplary target feature extractor, an exemplary reference feature extractor, and an exemplary quality assessor included within the system of FIG. 1 , illustrating an exemplary method of providing target features and reference features from the target feature extractor and the reference feature extractor, respectively, to the quality assessor;
  • FIG. 2 b is a block diagram of the exemplary target feature extractor, the exemplary reference feature extractor, and the exemplary quality assessor included within the system of FIG. 1 , illustrating another exemplary method of providing the target features and the reference features from the target feature extractor and the reference feature extractor, respectively, to the quality assessor;
  • FIG. 2 c is a block diagram of the exemplary target feature extractor, the exemplary reference feature extractor, and the exemplary quality assessor included within the system of FIG. 1 , illustrating a further exemplary method of providing the target features and the reference features from the target feature extractor and the reference feature extractor, respectively, to the quality assessor;
  • FIG. 3 is a flow diagram of an exemplary method of operating the system of FIG. 1 .
  • Systems and methods of objective video quality measurement are disclosed that employ a reduced-reference approach.
  • Such systems and methods of objective video quality measurement can extract information pertaining to one or more features (also referred to herein as “target features”) of a target video whose perceptual quality is to be measured, extract corresponding information pertaining to one or more features (also referred to herein as “reference features”) of a reference video, and employ one or more prediction functions involving the target features and the reference features to provide a measurement of the perceptual quality of the target video.
  • FIG. 1 depicts an exemplary video communications environment 100 , in which an exemplary system 101 (also referred to herein as a/the “video quality measurement system”) for measuring the perceptual quality of a target video employing a reduced-reference approach to video quality measurement can be implemented, in accordance with the present application.
  • the exemplary video communications environment 100 includes a video encoder 102 , a video transcoder 104 , at least one communications channel 106 , and a video decoder 108 .
  • the video encoder 102 is operative to generate a reference version (also referred to herein as a/the “reference video”) of target video content (also referred to herein as a/the “target video”) from a source video sequence (also referred to herein as a/the “source video”), and to provide the reference video, compressed according to a first predetermined coding format, to the video transcoder 104 .
  • the source video can include a plurality of video frames such as YUV video frames, or any other suitable type of video frames.
  • the source video may include, by way of non-limiting example, one or more of television video, music video, performance video, webcam video, surveillance video, security video, unmanned aerial vehicle (UAV) video, teleconferencing video, or any other suitable type of video.
  • the video transcoder 104 is operative to transcode the reference video into a transcoded version of the reference video (also referred to herein as a/the “transcoded reference video”), which is compressed according to a second predetermined coding format that is supported by the communications channel 106 .
  • the first and second predetermined coding formats of the reference video and the transcoded reference video may be selected from the H.263 coding format, the H.264 coding format, the MPEG-2 coding format, the MPEG-4 coding format, and/or any other suitable coding format(s).
  • the video transcoder 104 is further operative to provide the transcoded reference video for transmission over the communications channel 106 , which, for example, can be wire-based, optical fiber-based, wireless, or any suitable combination thereof. Following its transmission over the communications channel 106 , the transcoded reference video is referred to herein as the target video.
  • the video decoder 108 is operative to receive the target video, and to decode the target video, thereby generating a decoded version of the target video (also referred to herein as a/the “decoded target video”).
  • one or more types of degradation may be introduced into the source video during its processing within the video encoder 102 to generate the reference video.
  • One or more types of degradation may also be introduced into the reference video during its processing within the video transcoder 104 , and/or its transmission over the communication channel 106 .
  • such degradation of the source video and/or the reference video may be due to image rotation, additive noise, low-pass filtering, compression losses, transmission losses, and/or any other possible type of degradation.
  • the perceptual quality of each of the source video, the reference video, and the target video can be represented by a predicted mean opinion score (MOS), or any other suitable type of quality assessment score.
  • the perceptual quality of the reference video can be represented by a predetermined constant value.
  • the source video is assumed to have the highest perceptual quality in comparison to the reference video and the target video.
  • FIG. 1 further depicts an illustrative embodiment of the exemplary video quality measurement system 101 within the video communications environment 100 .
  • the video quality measurement system 101 includes a plurality of functional components that can be implemented in a distributed fashion within the video communications environment 100 .
  • the plurality of functional components include a target feature extractor 112 , a reference feature extractor 114 , and a quality assessor 116 .
  • the target feature extractor 112 is operative to extract one or more target features from the target video by performing one or more objective measurements with regard to the target video.
  • the reference feature extractor 114 is operative to extract one or more reference features from the reference video by performing one or more objective measurements with regard to the reference video.
  • such objective measurements performed with regard to the target video and the reference video can involve one or more spatial quality factors and/or temporal quality factors, and can include objective measurements of blocking artifacts in the respective target and reference videos (also referred to herein as “blockiness measurements”), objective measurements of blur in the respective target and reference videos (also referred to herein as “blurriness measurements”), objective measurements of an average quantization index for the respective target and reference videos, and/or any other suitable types of objective measurements performed with regard to the respective target and reference videos.
  • quantization index (also referred to herein as a/the “QI”), as employed herein, corresponds to any suitable parameter that can be adjusted to control the quantization step-size used by a video encoder, such as the video encoder 102 , or a video encoder (not shown) within the video transcoder 104 .
  • such a QI can correspond to a quantization parameter (also referred to herein as a/the “QP”) for a video bitstream compressed according to the H.264 coding format, a quantization scale for a video bitstream compressed according to the MPEG-2 coding format, or any other suitable parameter for a video bitstream compressed according to any other suitable coding format.
  • the quality assessor 116 is operative to provide an assessment of the perceptual quality of the target video, after its having been processed and transmitted within the video communications environment 100 , using one or more prediction functions involving the target features and the reference features.
  • the prediction functions can be linear prediction functions or non-linear prediction functions.
  • the quality assessor 116 is operative to estimate the perceptual quality of the target video by obtaining a difference between an estimate of the perceptual quality of the reference video, and an estimate of the predicted differential MOS (also referred to herein as a/the “DMOS”) between at least a portion of the reference video and at least a portion of the target video.
  • the estimate of the perceptual quality of the target video (also referred to herein as the “Q̂_tar”) can be expressed as Q̂_tar = Q̂_ref − ΔQ̄, in which “Q̂_ref” corresponds to the estimate of the perceptual quality of the reference video, and “ΔQ̄” corresponds to the estimate of the DMOS between the reference video and the target video.
  • the quality assessor 116 is further operative to calculate or otherwise determine the Q̂_ref using a first prediction function for a predetermined segment from a corresponding time frame within the reference video and the target video.
  • a predetermined segment can have a duration of about 5 seconds, or any other suitable duration.
  • the Q̂_ref can be expressed as Q̂_ref = ƒ1(QI_ref) (equation (2)), in which “ƒ1(QI_ref)” corresponds to the first prediction function, and “QI_ref” corresponds to the QI for the reference video.
  • the first prediction function, ƒ1(QI_ref), can be a linear function of the QI for the reference video, and/or any other suitable reference feature(s).
  • the quality assessor 116 is further operative to calculate or otherwise determine the ΔQ̄ using a second prediction function for the predetermined segment.
  • the ΔQ̄ can be expressed as ΔQ̄ = ƒ2(Δblr, Δblk) (equation (3)), in which “ƒ2(Δblr, Δblk)” corresponds to the second prediction function, “Δblr” corresponds to the average change in the blurriness measurements between the reference video and the target video for the predetermined segment, and “Δblk” corresponds to the average change in the blockiness measurements between the reference video and the target video for the predetermined segment.
  • the second prediction function, ƒ2(Δblr, Δblk), can be a linear function of the average change in the blurriness measurements for the respective target and reference videos, the average change in the blockiness measurements for the respective target and reference videos, and/or any other suitable reference feature(s) and target feature(s).
  • Such blurriness measurements for the respective target and reference videos can be performed using any suitable technique, such as the techniques described in U.S. patent application Ser. No. 12/706,165, filed Feb. 16, 2010, entitled UNIVERSAL BLURRINESS MEASUREMENT APPROACH FOR DIGITAL IMAGERY, which is assigned to the same assignee of the present application, and which is hereby incorporated herein by reference in its entirety.
  • blockiness measurements for the respective target and reference videos can be performed using any suitable technique, such as the techniques described in U.S. patent application Ser. No. 12/757,389, filed Apr. 9, 2010, entitled BLIND BLOCKING ARTIFACT MEASUREMENT APPROACHES FOR DIGITAL IMAGERY, which is assigned to the same assignee of the present application, and which is hereby incorporated herein by reference in its entirety.
  • the quality assessor 116 is operative, for each predetermined segment from a corresponding time frame within the reference video and the target video, to perform blurriness measurements in a frame-wise fashion, and to take the average of the frame-wise blurriness measurements to obtain the average blurriness measurements, blr_ref and blr_tar, for the reference video and the target video, respectively.
  • the quality assessor 116 is further operative to take the difference between the average blurriness measurements, blr_ref and blr_tar, as follows: Δblr = blr_ref − blr_tar.
  • the quality assessor 116 is operative, for each predetermined segment from a corresponding time frame within the reference video and the target video, to perform blockiness measurements in a frame-wise fashion, and to take the average of the frame-wise blockiness measurements to obtain the average blockiness measurements, blk_ref and blk_tar, for the reference video and the target video, respectively.
  • the quality assessor 116 is further operative to take the difference between the average blockiness measurements, blk_ref and blk_tar, as follows: Δblk = blk_ref − blk_tar.
  • the quality assessor 116 is operative, at least for some types of objective measurements, to normalize the averaged objective measurements for the predetermined segment from the corresponding time frame within the reference video and the target video, before taking the difference between the averaged objective measurements.
  • Such normalization can be performed to reflect any differences in resolution between the reference video and the target video.
  • the blurriness measurements for the reference video and the target video can each provide a measure, in pixels, of the spread of an edge within a video frame, or a measure of the change in gradient between adjacent pixels.
  • the quality assessor 116 can normalize each of the average blurriness measurements, blr_ref and blr_tar, before taking the difference between them, as follows: Δblr = blr_ref/w_ref − blr_tar/w_tar, in which “w_ref” is the width of each video frame in the predetermined segment corresponding to the reference video, and “w_tar” is the width of each video frame in the predetermined segment corresponding to the target video.
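  • a minimal sketch of the segment-level averaging, normalization, and differencing described above is shown below; it assumes the frame-wise blurriness and blockiness values have already been measured (for example, with the techniques referenced earlier), normalizes blurriness by the frame width, and leaves blockiness unnormalized:

```python
import numpy as np

def segment_feature_deltas(blur_ref, blur_tar, blk_ref, blk_tar,
                           w_ref: int, w_tar: int):
    """Compute delta-blurriness and delta-blockiness for one segment.

    blur_ref/blur_tar and blk_ref/blk_tar are iterables of frame-wise objective
    measurements for the corresponding reference and target segments. Blurriness
    (an edge-spread measure in pixels) is normalized by frame width so that
    reference and target videos of different resolutions remain comparable.
    """
    blr_ref, blr_tar = np.mean(blur_ref), np.mean(blur_tar)
    blk_ref_avg, blk_tar_avg = np.mean(blk_ref), np.mean(blk_tar)

    delta_blr = blr_ref / w_ref - blr_tar / w_tar  # width-normalized difference
    delta_blk = blk_ref_avg - blk_tar_avg          # plain difference
    return delta_blr, delta_blk

# Hypothetical frame-wise measurements for a reference (1280-wide) and target
# (640-wide) segment.
print(segment_feature_deltas([2.0, 2.2, 2.1], [3.1, 3.3, 3.2],
                             [0.8, 0.9, 1.0], [2.0, 2.1, 2.2], 1280, 640))
```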
  • the quality assessor 116 is further operative, for each predetermined segment from a corresponding time frame within the reference video and the target video, to obtain the QI in a frame-wise fashion, and to take the average of the frame-wise QIs to obtain the QI_ref for the reference video.
  • each frame-wise QI can be determined by taking the average of the QIs for all of the coded macroblocks in a corresponding video frame.
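  • a small sketch of this QI averaging is shown below: macroblock QIs are averaged per frame, and the frame-wise averages are then averaged over the segment; the nested-list input format is an assumption for illustration:

```python
import numpy as np

def segment_average_qi(per_frame_macroblock_qis):
    """Average QI for one segment.

    per_frame_macroblock_qis: list with one inner list per video frame holding
    the QIs of its coded macroblocks (e.g., H.264 QPs or MPEG-2 quant scales).
    """
    frame_qis = [np.mean(mb_qis) for mb_qis in per_frame_macroblock_qis]
    return float(np.mean(frame_qis))  # QI_ref for the segment

# Three hypothetical frames with a few coded-macroblock QIs each.
print(segment_average_qi([[26, 28, 30], [27, 29], [28, 28, 30]]))
```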
  • the quality assessor 116 is operative to estimate the perceptual quality of the target video, Q̂_tar, using a third prediction function based on the first prediction function, ƒ1(QI_ref) (see equation (2) above), and the second prediction function, ƒ2(Δblr, Δblk) (see equation (3) above).
  • the Q̂_tar can be expressed as Q̂_tar = ƒ3(QI_ref, Δblr, Δblk) = a3·QI_ref + b3·Δblr + c3·Δblk + d3, in which “ƒ3(QI_ref, Δblr, Δblk)” corresponds to the third prediction function, and “a3,” “b3,” “c3,” and “d3” each correspond to a parameter coefficient of the third prediction function. It is noted that the value of each of the parameter coefficients a3, b3, c3, and d3 can be determined using a multi-variate linear regression approach, based on a plurality of predetermined target video bitstreams and their corresponding reference video bitstreams, and ground truth quality values, or any other suitable technique.
  • one such technique for determining the parameter coefficients a3, b3, c3, and d3 includes, for a large number (e.g., greater than about 500) of predetermined segments, collecting the corresponding ground truth quality values and objective feature values for QI_ref, Δblr, and Δblk.
  • a matrix, X, can then be formed in which, for example, each row contains the objective feature values [QI_ref, Δblr, Δblk, 1] for one of the predetermined segments, and a vector, q, can be formed from the corresponding ground truth quality values.
  • the parameter vector, p = [a3, b3, c3, d3]ᵀ, can then be determined using a least-squares linear regression approach, for example as p = (XᵀX)⁻¹Xᵀq.
  • exemplary values for the parameter coefficients a3, b3, c3, and d3 can be determined to be equal to about 0.025, 0.19, 1.85, and 4.28, respectively, or any other suitable values.
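  • a hedged sketch of this regression step is shown below; it assumes each row of X holds [QI_ref, Δblr, Δblk, 1] for one training segment and q holds the corresponding ground truth quality values, and it uses numpy's least-squares solver in place of the explicit normal-equations form:

```python
import numpy as np

def fit_f3_coefficients(qi_ref, delta_blr, delta_blk, ground_truth_mos):
    """Fit the parameter coefficients a3, b3, c3, d3 by least-squares regression.

    Each argument is a 1-D array with one entry per training segment (ideally
    several hundred segments, as suggested above).
    """
    X = np.column_stack([qi_ref, delta_blr, delta_blk,
                         np.ones(len(qi_ref))])    # last column -> intercept d3
    q = np.asarray(ground_truth_mos, dtype=float)
    p, *_ = np.linalg.lstsq(X, q, rcond=None)      # p = [a3, b3, c3, d3]
    return p

# Illustrative call with random stand-in data; real training would use ground
# truth MOS values collected for predetermined reference/target segment pairs.
rng = np.random.default_rng(0)
n = 600
print(fit_f3_coefficients(rng.uniform(20, 40, n), rng.normal(0, 1, n),
                          rng.normal(0, 1, n), rng.uniform(1, 5, n)))
```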
  • FIGS. 2 a - 2 c each depict an exemplary method of providing the target features and the reference features from the target feature extractor 112 and the reference feature extractor 114 , respectively, to the quality assessor 116 (see also FIG. 1 ).
  • the target feature extractor 112 , the reference feature extractor 114 , and the quality assessor 116 can be implemented in a distributed fashion within the video communications environment 100 such that the target feature extractor 112 is located proximate to or co-located with the quality assessor 116 , such as within an endpoint device, and the reference feature extractor 114 is disposed at a distal or geographically remote location from the target feature extractor 112 and the quality assessor 116 .
  • such an endpoint device can be a mobile phone, a mobile or non-mobile computer, a tablet computer, or any other suitable type of mobile or non-mobile endpoint device capable of displaying video.
  • the video quality measurement system 101 comprising the target feature extractor 112 , the reference feature extractor 114 , and the quality assessor 116 can operate to transmit the reference features from the reference feature extractor 114 at the distal or geographically remote location over at least one side communications channel 202 a to the quality assessor 116 , which, in turn, can access the target features from the target feature extractor 112 proximate thereto or co-located therewith for estimating the perceptual quality of the target video.
  • the target feature extractor 112 , the reference feature extractor 114 , and the quality assessor 116 can also be implemented in a distributed fashion within the video communications environment 100 such that the reference feature extractor 114 is located proximate to or co-located with the quality assessor 116 , and the target feature extractor 112 is disposed at a distal or geographically remote location from the reference feature extractor 114 and the quality assessor 116 .
  • the video quality measurement system 101 can operate to transmit the target features from the target feature extractor 112 at the distal or geographically remote location over at least one side communications channel 202 b to the quality assessor 116 , which, in turn, can access the reference features from the reference feature extractor 114 proximate thereto or co-located therewith for estimating the perceptual quality of the target video.
  • the target feature extractor 112 , the reference feature extractor 114 , and the quality assessor 116 can be implemented in a distributed fashion within the video communications environment 100 such that the quality assessor 116 is disposed at a centralized location that is geographically remote from the target feature extractor 112 and the reference feature extractor 114 .
  • the video quality measurement system 101 can operate to transmit the target features from the target feature extractor 112 over at least one side communications channel 202 c to the quality assessor 116 , and to transmit the reference features from the reference feature extractor 114 over the side communications channel 202 c to the quality assessor 116 , for estimating the perceptual quality of the target video within the quality assessor 116 at the geographically remote, centralized location. It is noted that the video quality measurement system 101 can operate to transmit the target features and the reference features to the quality assessor 116 over the same communications channel, or over different communications channels.
  • the target feature extractor 112 and the reference feature extractor 114 can extract target features from the target video, and reference features from the reference video, respectively, by performing objective measurements with regard to the respective target and reference videos involving one or more additional temporal quality factors including, but not limited to, temporal quality factors relating to frame rates, video motion properties including jerkiness motion, frame dropping impairments, packet loss impairments, freezing impairments, and/or ringing impairments.
  • the estimate of the predicted DMOS between the reference video and the target video, ΔQ̄, can be expressed in terms of a modified version of the second prediction function (see equation (3) above), as follows: ΔQ̄ = ƒ2(Δblr, Δblk, Δfps) (equation (12)), in which “Δblr” corresponds to the average change in the blurriness measurements between the reference video and the target video for a predetermined segment from a corresponding time frame within the reference video and the target video, “Δblk” corresponds to the average change in the blockiness measurements between the reference video and the target video for the predetermined segment, and “Δfps” corresponds to the average change in the frame rates between the reference video and the target video, (fps_ref − fps_tar), for the predetermined segment.
  • the Δfps can have a minimum bound at 0 (i.e., zero), such that there is essentially no penalty or benefit for the frame rate of the target video being higher than the frame rate of the reference video.
  • the quality assessor 116 can estimate the perceptual quality of the target video, Q̂_tar, using a fourth prediction function based on the first prediction function, ƒ1(QI_ref) (see equation (2) above), and the modified second prediction function, ƒ2(Δblr, Δblk, Δfps) (see equation (12) above).
  • the Q̂_tar can be expressed as Q̂_tar = ƒ4(QI_ref, Δblr, Δblk, Δfps) = a4·QI_ref + b4·Δblr + c4·Δblk + d4·Δfps + e4, in which “ƒ4(QI_ref, Δblr, Δblk, Δfps)” corresponds to the fourth prediction function, “QI_ref” corresponds to the QI for the reference video, and “a4,” “b4,” “c4,” “d4,” and “e4” each correspond to a parameter coefficient of the fourth prediction function.
  • the value of each of the parameter coefficients a4, b4, c4, d4, and e4 can be determined using a multi-variate linear regression approach, based on a plurality of predetermined target video bitstreams and their corresponding reference video bitstreams, and ground truth quality values, or any other suitable technique.
  • the parameter coefficients a4, b4, c4, d4, and e4 may be determined or set to be equal to about 0.048, 0.19, 1.08, −0.044, and 5.23, respectively, or any other suitable values.
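  • a small sketch of the fourth prediction function under these assumptions (linear form, Δfps bounded below at zero, and the exemplary coefficient values just given) follows; the inputs are hypothetical:

```python
# Hedged sketch of the fourth prediction function, f4, assuming the linear form
# a4*QI_ref + b4*delta_blr + c4*delta_blk + d4*delta_fps + e4, with delta_fps
# clamped at 0 so a higher target frame rate neither helps nor hurts.
A4, B4, C4, D4, E4 = 0.048, 0.19, 1.08, -0.044, 5.23  # exemplary values above

def predict_target_mos_with_fps(qi_ref, delta_blr, delta_blk, fps_ref, fps_tar):
    delta_fps = max(fps_ref - fps_tar, 0.0)  # minimum bound at 0
    return A4 * qi_ref + B4 * delta_blr + C4 * delta_blk + D4 * delta_fps + E4

print(predict_target_mos_with_fps(28.0, -0.4, -1.2, fps_ref=30.0, fps_tar=15.0))
```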
  • the video encoder 102 may be omitted from the video communications environment 100 , allowing the source video to take on the role of the reference video.
  • the estimate of the perceptual quality of the reference video, Q̂_ref, is assumed to be fixed and known. Further, because the source video is assumed to have the highest perceptual quality, the Q̂_ref estimate now corresponds to the highest perceptual quality in comparison to at least the estimate of the perceptual quality of the target video, Q̂_tar.
  • the quality assessor 116 can estimate the perceptual quality of the target video, Q̂_tar, using a fifth prediction function based on the modified second prediction function, ƒ2(Δblr, Δblk, Δfps) (see equation (12) above).
  • the Q̂_tar can be expressed as Q̂_tar = ƒ5(Δblr, Δblk, Δfps) = a5·Δblr + b5·Δblk + c5·Δfps + d5, in which “ƒ5(Δblr, Δblk, Δfps)” corresponds to the fifth prediction function, and “a5,” “b5,” “c5,” and “d5” each correspond to a parameter coefficient of the fifth prediction function.
  • each of the parameter coefficients a5, b5, c5, and d5 can be determined using a multi-variate linear regression approach, based on a plurality of predetermined target video bitstreams and their corresponding reference video bitstreams, and ground truth quality values, or any other suitable technique.
  • the fixed, known estimate of the perceptual quality of the reference video, Q̂_ref, can be incorporated into the parameter coefficient, d5.
  • the parameter coefficients a5, b5, c5, and d5 may be determined or set to be equal to about 0.17, 1.81, −0.04, and 4.1, respectively, or any other suitable values.
  • the ΔQ̄ can be expressed in terms of another modified version of the second prediction function (see equation (3) above), as follows: ΔQ̄ = ƒ2(Δblr, Δblk, Q̂_tar_temporal) (equation (15)), in which “Q̂_tar_temporal” corresponds to an estimate of the perceptual quality of the target video for a predetermined segment from a corresponding time frame within the reference video and the target video, taking into account the video motion properties of the target video.
  • the Q̂_tar_temporal can be expressed as a function involving two constants, “α” and “β,” and a temporal quality factor, “AMD,” which can be obtained by taking the sum of absolute mean differences of pixel values in a block-wise fashion between consecutive video frames in the predetermined segment of the target video.
  • the constants α and β may be set to be equal to about 9.412 and −0.1347, respectively, or any other suitable values.
  • the constants α and β may be set to be equal to about 8.526 and −0.0575, respectively, or any other suitable values.
  • the constants α and β may be set to be equal to about 6.283 and −0.1105, respectively, or any other suitable values.
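  • a hedged sketch of the AMD temporal quality factor described above follows; the 8x8 block size and the averaging over consecutive-frame pairs are illustrative assumptions, since the disclosure only specifies a block-wise sum of absolute mean differences:

```python
import numpy as np

def amd(frames, block: int = 8):
    """Block-wise sum of absolute mean differences between consecutive frames.

    frames: sequence of 2-D luma arrays for one segment of the target video.
    The 8x8 block size and the averaging over frame pairs are assumptions made
    for illustration.
    """
    totals = []
    for prev, cur in zip(frames[:-1], frames[1:]):
        h, w = prev.shape
        total = 0.0
        for y in range(0, h - block + 1, block):
            for x in range(0, w - block + 1, block):
                total += abs(cur[y:y + block, x:x + block].mean()
                             - prev[y:y + block, x:x + block].mean())
        totals.append(total)
    return float(np.mean(totals)) if totals else 0.0

# Three random frames standing in for decoded luma planes of a target segment.
rng = np.random.default_rng(1)
print(amd([rng.integers(0, 256, (72, 88)).astype(float) for _ in range(3)]))
```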
  • the quality assessor 116 can estimate the perceptual quality of the target video, Q̂_tar, using a sixth prediction function based on the modified second prediction function, ƒ2(Δblr, Δblk, Q̂_tar_temporal) (see equation (15) above).
  • the Q̂_tar can be expressed as Q̂_tar = ƒ6(Δblr, Δblk, Q̂_tar_temporal) = a6·Δblr + b6·Δblk + c6·Q̂_tar_temporal + d6, in which “ƒ6(Δblr, Δblk, Q̂_tar_temporal)” corresponds to the sixth prediction function, and “a6,” “b6,” “c6,” and “d6” each correspond to a parameter coefficient of the sixth prediction function.
  • each of the parameter coefficients a6, b6, c6, and d6 can be determined using a multi-variate linear regression approach, based on a plurality of predetermined target video bitstreams and their corresponding reference video bitstreams, and ground truth quality values, or any other suitable technique.
  • the parameter coefficients a6, b6, c6, and d6 may be determined or set to be equal to about 0.18, 1.74, 2.32, and 1.66, respectively, or any other suitable values.
  • the ΔQ̄ can be expressed in terms of still another modified version of the second prediction function (see equation (3) above), as follows: ΔQ̄ = ƒ2(Δblr, Δblk, NIFVQ(fps_tar)) (equation (19)), in which “NIFVQ(fps_tar)” is a temporal quality factor representative of the negative impact of such frame dropping impairments on the perceptual quality of the target video for a predetermined segment from a corresponding time frame within the reference video and the target video, and “fps_tar” corresponds to the frame rate of the target video, for a current video frame in the predetermined segment.
  • NIFVQ(fps_tar) can be expressed as NIFVQ(fps_tar) = [log(30) − log(fps_tar)] (equation (20)), or, alternatively, as NIFVQ(fps_tar) = AMD · [log(30) − log(fps_tar)] (equation (21)),
  • in which “AMD” is the temporal quality factor that can be obtained by taking the sum of absolute mean differences of pixel values in a block-wise fashion between consecutive video frames in the predetermined segment of the target video. It is noted that in the exemplary equations (20) and (21) above, it has been assumed that the maximum frame rate of the reference video is 30 frames per second.
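  • the two NIFVQ variants in equations (20) and (21) can be sketched as follows, assuming (as noted above) a maximum reference frame rate of 30 frames per second; the AMD value passed in is whatever temporal factor was computed for the segment:

```python
import math

def nifvq_simple(fps_tar: float) -> float:
    """Equation (20): penalty grows as the target frame rate drops below 30."""
    return math.log(30.0) - math.log(fps_tar)

def nifvq_motion_weighted(fps_tar: float, amd_value: float) -> float:
    """Equation (21): the same penalty weighted by the AMD temporal factor."""
    return amd_value * (math.log(30.0) - math.log(fps_tar))

print(nifvq_simple(15.0), nifvq_motion_weighted(15.0, amd_value=4.2))
```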
  • the quality assessor 116 can estimate the perceptual quality of the target video, Q̂_tar, using a seventh prediction function based on the modified second prediction function, ƒ2(Δblr, Δblk, NIFVQ(fps_tar)) (see equation (19) above).
  • the Q̂_tar can be expressed as Q̂_tar = ƒ7(Δblr, Δblk, NIFVQ(fps_tar)) = a7·Δblr + b7·Δblk + c7·NIFVQ(fps_tar) + d7, in which “ƒ7(Δblr, Δblk, NIFVQ(fps_tar))” corresponds to the seventh prediction function, and “a7,” “b7,” “c7,” and “d7” each correspond to a parameter coefficient of the seventh prediction function.
  • the value of each of the parameter coefficients a7, b7, c7, and d7 can be determined using a multi-variate linear regression approach, based on a plurality of predetermined target video bitstreams and their corresponding reference video bitstreams, and ground truth quality values, or any other suitable technique.
  • the parameter coefficients a7, b7, c7, and d7 may be determined or set to be equal to about 0.17, 1.82, −0.68, and 4.07, respectively, or any other suitable values.
  • the parameter coefficients a7, b7, c7, and d7 may be determined or set to be equal to about 0.18, 1.84, −0.02, and 3.87, respectively, or any other suitable values.
  • the ΔQ̄ can be expressed in terms of yet another modified version of the second prediction function (see equation (3) above), as follows: ΔQ̄ = ƒ2(Δblr, Δblk, JM) (equation (23)), in which “JM” is a temporal quality factor representative of a jerkiness measurement performed with regard to the target video for a predetermined segment from a corresponding time frame within the reference video and the target video.
  • for example, JM can be expressed as a function of “fps_tar,” the frame rate of the target video, “M” and “N,” the dimensions of each video frame in the target video, and a term that represents the direct frame difference between consecutive video frames at times “i” and “i−1” in the target video.
  • because the temporal quality factor, AMD, may be similar to the direct frame difference between consecutive video frames, the temporal quality factor, JM, can alternatively be expressed in terms of the AMD.
  • the quality assessor 116 can estimate the perceptual quality of the target video, Q̂_tar, using an eighth prediction function based on the modified second prediction function, ƒ2(Δblr, Δblk, JM) (see equation (23) above).
  • the Q̂_tar can be expressed as Q̂_tar = ƒ8(Δblr, Δblk, JM) = a8·Δblr + b8·Δblk + c8·JM + d8, in which “ƒ8(Δblr, Δblk, JM)” corresponds to the eighth prediction function, and “a8,” “b8,” “c8,” and “d8” each correspond to a parameter coefficient of the eighth prediction function.
  • each of the parameter coefficients a8, b8, c8, and d8 can be determined using a multi-variate linear regression approach, based on a plurality of predetermined target video bitstreams and their corresponding reference video bitstreams, and ground truth quality values, or any other suitable technique.
  • the parameter coefficients a8, b8, c8, and d8 may be set to be equal to about 0.19, 1.92, −0.006, and 3.87, respectively, or any other suitable values.
  • a target video whose perceptual quality is to be measured is received over at least one communications channel.
  • information pertaining to one or more target features is extracted from the target video by the target feature extractor 112 (see FIG. 1 ).
  • information pertaining to one or more reference features is extracted from a reference version of the target video by the reference feature extractor 114 (see FIG. 1 ).
  • a measurement of the perceptual quality of the reference video is provided, by the quality assessor 116 (see FIG. 1 ), based on a first prediction function for a predetermined segment from a corresponding time frame within the target video and the reference video, wherein the first prediction function involves at least one of the reference features.
  • a measurement of a predicted differential mean opinion score (DMOS) between at least a portion of the target video and at least a portion of the reference video is provided, by the quality assessor 116 , based on a second prediction function for the predetermined segment, wherein the second prediction function involves at least one of the target features and at least one of the reference features.
  • the perceptual quality of the target video is measured using a third prediction function for the predetermined segment, wherein the third prediction function is based on the first prediction function and the second prediction function.
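  • pulling these steps together, the following end-to-end sketch estimates a segment-level quality score under the same assumptions as the earlier snippets (linear third prediction function, exemplary coefficients, hypothetical per-frame features); it is illustrative only and not the disclosed implementation:

```python
import numpy as np

A3, B3, C3, D3 = 0.025, 0.19, 1.85, 4.28   # exemplary coefficients from above

def measure_segment_quality(ref_feats: dict, tar_feats: dict) -> float:
    """ref_feats/tar_feats hold per-segment frame-wise 'qi', 'blur', 'block'
    lists plus a 'width'; the field names are illustrative assumptions.
    Reference features would normally arrive over a side channel (FIGS. 2a-2c)."""
    qi_ref = np.mean(ref_feats["qi"])
    d_blr = (np.mean(ref_feats["blur"]) / ref_feats["width"]
             - np.mean(tar_feats["blur"]) / tar_feats["width"])
    d_blk = np.mean(ref_feats["block"]) - np.mean(tar_feats["block"])
    return A3 * qi_ref + B3 * d_blr + C3 * d_blk + D3   # predicted MOS

ref = {"qi": [27, 29, 28], "blur": [2.0, 2.2, 2.1], "block": [0.8, 0.9, 1.0], "width": 1280}
tar = {"qi": [33, 35, 34], "blur": [3.1, 3.3, 3.2], "block": [2.0, 2.1, 2.2], "width": 640}
print(round(measure_segment_quality(ref, tar), 2))
```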
  • any of the operations depicted and/or described herein that form part of the illustrative embodiments are useful machine operations.
  • the illustrative embodiments also relate to a device or an apparatus for performing such operations.
  • the apparatus can be specially constructed for the required purpose, or can be a general-purpose computer selectively activated or configured by a computer program stored in the computer.
  • various general-purpose machines employing one or more processors coupled to one or more computer readable media can be used with computer programs written in accordance with the teachings disclosed herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations.
  • the presently disclosed systems and methods can also be embodied as computer readable code on a computer readable medium.
  • the computer readable medium is any data storage device that can store data, which can thereafter be read by a computer system. Examples of such computer readable media include hard drives, read-only memory (ROM), random-access memory (RAM), CD-ROMs, CD-Rs, CD-RWs, magnetic tapes, and/or any other suitable optical or non-optical data storage devices.
  • the computer readable media can also be distributed over a network-coupled computer system, so that the computer readable code can be stored and/or executed in a distributed fashion.

Abstract

Systems and methods of objective video quality measurement that employ a reduced-reference approach to video quality measurement. Such systems and methods of objective video quality measurement can extract information pertaining to one or more features of a target video whose perceptual quality is to be measured, extract information pertaining to one or more features of a reference video, and employ one or more prediction functions involving the target features and the reference features to provide a measurement of the perceptual quality of the target video.

Description

    CROSS REFERENCE TO RELATED APPLICATIONS
  • —Not applicable—
  • STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
  • —Not applicable—
  • FIELD OF THE INVENTION
  • The present application relates generally to systems and methods of objective video quality measurement, and more specifically to systems and methods of objective video quality measurement that employ a reduced-reference approach.
  • BACKGROUND OF THE INVENTION
  • Systems and methods are known that employ a full-reference approach, a no-reference approach, and a reduced-reference approach to video quality measurement. For example, systems that employ a full-reference approach to video quality measurement typically receive target video content (also referred to herein as a/the “target video”) whose perceptual quality is to be measured, and compare information from the target video to corresponding information from a reference version (also referred to herein as a/the “reference video”) of the target video to provide a measurement of the perceptual quality of the target video. In such systems that employ a full-reference approach to video quality measurement, it is generally assumed that the systems have full access to all of the information from the reference video for comparison to the target video information. However, transmitting all of the information from the reference video over a network for comparison to the target video information at an endpoint device, such as a mobile phone, can consume an undesirably excessive amount of network bandwidth. Such a full-reference approach to video quality measurement is therefore generally considered to be impractical for use in measuring the perceptual quality of a target video at such an endpoint device.
  • In systems that employ a no-reference approach to video quality measurement, it is generally assumed that no information from any reference video is available to the systems for comparison to the target video information. Such systems that employ a no-reference approach to video quality measurement therefore typically provide measurements of the perceptual quality of the target video using only information from the target video. However, such systems that employ a no-reference approach to video quality measurement may be inaccurate, since certain assumptions made for the purpose of measuring the perceptual quality of the target video may be inaccurate.
  • Systems that employ a reduced-reference approach to video quality measurement typically have access to a reduced amount of information from the reference video for comparison to the target video information. For example, such information from the reference video can include a limited number of characteristics of the reference video, such as its spectral components, its variation of energy level, and/or its energy distribution in the frequency domain, each of which may be sensitive to degradation during processing and/or transmission of the target video. However, such known systems that employ a reduced-reference approach to video quality measurement can also be impractical for use in measuring the perceptual quality of a target video following its transmission over a network to an endpoint device, such as a mobile phone, due at least in part to constraints in the network bandwidth, and/or because of the limited processing power that is typically available in the endpoint device to perform the video quality measurement.
  • It would therefore be desirable to have improved systems and methods of objective video quality measurement that avoid at least some of the drawbacks of the various known video quality measurement systems and methods described above.
  • BRIEF SUMMARY OF THE INVENTION
  • In accordance with the present application, systems and methods of objective video quality measurement are disclosed that employ a reduced-reference approach. Such systems and methods of objective video quality measurement can extract information pertaining to one or more features (also referred to herein as “target features”) of a target video whose perceptual quality is to be measured, extract corresponding information pertaining to one or more features (also referred to herein as “reference features”) of a reference video, and employ one or more prediction functions involving the target features and the reference features to provide a measurement of the perceptual quality of the target video.
  • In accordance with a first aspect, a system for measuring the perceptual quality of a target video that employs a reduced-reference approach to video quality measurement comprises a plurality of functional components, including a target feature extractor, a reference feature extractor, and a quality assessor. The target feature extractor is operative to extract one or more target features from the target video by performing one or more objective measurements with regard to the target video. Similarly, the reference feature extractor is operative to extract one or more reference features from the reference video by performing one or more objective measurements with regard to the reference video. Such objective measurements performed on the target video and the reference video can include objective measurements of blocking artifacts in the respective target and reference videos (also referred to herein as “blockiness measurements”), objective measurements of blur in the respective target and reference videos (also referred to herein as “blurriness measurements”), objective measurements of an average quantization index for the respective target and reference videos, as examples, and/or any other suitable types of objective measurements. Such objective measurements can result in target features and reference features that can be represented by compact data sets, which may be transmitted over a network without consuming an undesirably excessive amount of network bandwidth. As employed herein, the term “quantization index” (also referred to herein as a/the “QI”) corresponds to any suitable parameter that can be adjusted to control the quantization step-size used by a video encoder. For example, such a QI can correspond to a quantization parameter (also referred to herein as a/the “QP”) for video bitstreams compressed according to the H.264 coding format, a quantization scale for video bitstreams compressed according to the MPEG-2 coding format, or any other suitable parameter for video bitstreams compressed according to any other suitable coding format. The quality assessor is operative to provide an assessment of the perceptual quality of the target video following its transmission over the network to an endpoint device, using one or more prediction functions involving the target features and the reference features. In accordance with an exemplary aspect, one or more of the prediction functions can be linear prediction functions or non-linear prediction functions. For example, such an endpoint device can be a mobile phone, a mobile or non-mobile computer, a tablet computer, or any other suitable type of mobile or non-mobile endpoint device capable of displaying video.
  • In accordance with another exemplary aspect, the perceptual quality of each of the target video and the reference video can be represented by a quality assessment score, such as a predicted mean opinion score (MOS). In accordance with such an exemplary aspect, the quality assessor is operative to estimate the perceptual quality of the target video by obtaining a difference between an estimate of the perceptual quality of the reference video, and an estimate of the predicted differential MOS (also referred to herein as a/the “DMOS”) between at least a portion of the reference video and at least a portion of the target video. For example, the estimate of the perceptual quality of the target video (also referred to herein as a/the “Q̂_tar”) can be expressed as

  • $\hat{Q}_{tar} = \hat{Q}_{ref} - \Delta\hat{Q},$
  • in which “Q̂_ref” corresponds to the estimate of the perceptual quality of the reference video, and “ΔQ̂” corresponds to the estimate of the DMOS between the reference video and the target video. It is noted that the DMOS between the reference video and the target video can be expressed as follows,

  • $\Delta Q = Q_{ref} - Q_{tar}.$
  • The quality assessor is further operative to calculate or otherwise determine the Q̂_ref using a first prediction function for a predetermined segment from a corresponding time frame within the reference video and the target video. For example, the Q̂_ref can be expressed as

  • $\hat{Q}_{ref} = f_1(QI_{ref}),$
  • in which “ƒ1(QIref)” corresponds to the first prediction function, and “QIref” corresponds to the QI for the reference video. For example, the first prediction function, ƒ1(QIref), can be a linear function of the QI for the reference video, and/or any other suitable reference feature(s). The quality assessor is also operative to calculate or otherwise determine the ΔQ̂ using a second prediction function for the predetermined segment. For example, the ΔQ̂ can be expressed as

  • Δ Q 2(Δblr,Δblk),
  • in which “ƒ2(Δblr,Δblk)” corresponds to the second prediction function, “Δblr” corresponds to the average change in frame-wise blurriness measurements between the reference video and the target video for the predetermined segment, and “Δblk” corresponds to the average change in frame-wise blockiness measurements between the reference video and the target video for the predetermined segment. For example, the second prediction function, ƒ2(Δblr,Δblk), can be a linear function of the average change in the frame-wise blurriness measurements for the respective target and reference videos, the average change in the frame-wise blockiness measurements for the respective target and reference videos, and/or any other suitable reference feature(s) and target feature(s).
  • In accordance with a further exemplary aspect, the quality assessor is operative to estimate the perceptual quality of the target video, Q̂_tar, using a third prediction function based on the first prediction function, ƒ1(QIref), and the second prediction function, ƒ2(Δblr,Δblk). For example, the Q̂_tar can be expressed as
  • $\hat{Q}_{tar} = f_3(QI_{ref}, \Delta blr, \Delta blk) = a_3 \cdot QI_{ref} + b_3 \cdot \Delta blr + c_3 \cdot \Delta blk + d_3,$
  • in which “ƒ3(QIref,Δblr,Δblk)” corresponds to the third prediction function, and “a3,” “b3,” “c3,” and “d3” each correspond to a parameter coefficient of the third prediction function. For example, the value of each of the parameter coefficients a3, b3, c3, and d3 can be determined using a multi-variate linear regression approach, based on a plurality of predetermined target video bitstreams and their corresponding reference video bitstreams, and ground truth quality values, or any other suitable technique.
  • In accordance with another aspect of the disclosed systems and methods, the target feature extractor, the reference feature extractor, and the quality assessor can be implemented in a distributed fashion within a video communications environment. In accordance with an exemplary aspect, the target feature extractor can be located proximate to or co-located with the quality assessor, such as within the endpoint device, and the reference feature extractor can be disposed at a distal or geographically remote location from the target feature extractor and the quality assessor. In accordance with such an exemplary aspect, the disclosed system can transmit the reference features from the reference feature extractor at the distal or geographically remote location to the quality assessor, which, in turn, can access the target features from the target feature extractor proximate thereto or co-located therewith for estimating the perceptual quality of the target video. In accordance with another exemplary aspect, the reference feature extractor can be located proximate to or co-located with the quality assessor, and the target feature extractor can be disposed at a distal or geographically remote location from the reference feature extractor and the quality assessor. In accordance with such an exemplary aspect, the disclosed system can transmit the target features from the target feature extractor at the distal or geographically remote location to the quality assessor, which, in turn, can access the reference features from the reference feature extractor proximate thereto or co-located therewith for estimating the perceptual quality of the target video. In accordance with a further exemplary aspect, the quality assessor may be disposed at a centralized location that is geographically remote from the target feature extractor and the reference feature extractor. In accordance with such an exemplary aspect, the disclosed system can transmit the target features from the target feature extractor to the quality assessor, and transmit the reference features from the reference feature extractor to the quality assessor, for estimating the perceptual quality of the target video within the quality assessor at the geographically remote, centralized location.
  • By extracting reference features and target features from a reference video and a target video, respectively, and representing the respective reference and target features as compact data sets, the disclosed systems and methods can operate to transmit the reference features and/or the target features over a network to a quality assessor for assessing the perceptual quality of the target video, without consuming an undesirably excessive amount of network bandwidth. Further, by providing for such a perceptual quality assessment of the target video using one or more prediction functions, the perceptual quality assessment of the target video can be performed within an endpoint device having limited processing power. Moreover, by using the average values of frame-wise objective measurements for a predetermined segment from a corresponding time frame within the reference video and the target video, fluctuations in the perceptual quality assessment of the target video can be reduced.
  • Other features, functions, and aspects of the invention will be evident from the Drawings and/or the Detailed Description of the Invention that follow.
  • BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS
  • The invention will be more fully understood with reference to the following Detailed Description of the Invention in conjunction with the drawings of which:
  • FIG. 1 is a block diagram of an exemplary video communications environment, in which an exemplary system for measuring the perceptual quality of a target video employing a reduced-reference approach to video quality measurement can be implemented, in accordance with an exemplary embodiment of the present application;
  • FIG. 2 a is a block diagram of an exemplary target feature extractor, an exemplary reference feature extractor, and an exemplary quality assessor included within the system of FIG. 1, illustrating an exemplary method of providing target features and reference features from the target feature extractor and the reference feature extractor, respectively, to the quality assessor;
  • FIG. 2 b is a block diagram of the exemplary target feature extractor, the exemplary reference feature extractor, and the exemplary quality assessor included within the system of FIG. 1, illustrating another exemplary method of providing the target features and the reference features from the target feature extractor and the reference feature extractor, respectively, to the quality assessor;
  • FIG. 2 c is a block diagram of the exemplary target feature extractor, the exemplary reference feature extractor, and the exemplary quality assessor included within the system of FIG. 1, illustrating a further exemplary method of providing the target features and the reference features from the target feature extractor and the reference feature extractor, respectively, to the quality assessor; and
  • FIG. 3 is a flow diagram of an exemplary method of operating the system of FIG. 1.
  • DETAILED DESCRIPTION OF THE INVENTION
  • Systems and methods of objective video quality measurement are disclosed that employ a reduced-reference approach. Such systems and methods of objective video quality measurement can extract information pertaining to one or more features (also referred to herein as “target features”) of a target video whose perceptual quality is to be measured, extract corresponding information pertaining to one or more features (also referred to herein as “reference features”) of a reference video, and employ one or more prediction functions involving the target features and the reference features to provide a measurement of the perceptual quality of the target video.
  • FIG. 1 depicts an exemplary video communications environment 100, in which an exemplary system 101 (also referred to herein as a/the “video quality measurement system”) for measuring the perceptual quality of a target video employing a reduced-reference approach to video quality measurement can be implemented, in accordance with the present application. As shown in FIG. 1, the exemplary video communications environment 100 includes a video encoder 102, a video transcoder 104, at least one communications channel 106, and a video decoder 108. The video encoder 102 is operative to generate a reference version (also referred to herein as a/the “reference video”) of target video content (also referred to herein as a/the “target video”) from a source video sequence (also referred to herein as a/the “source video”), and to provide the reference video, compressed according to a first predetermined coding format, to the video transcoder 104. For example, the source video can include a plurality of video frames such as YUV video frames, or any other suitable type of video frames. Further, the source video may include, by way of non-limiting example, one or more of television video, music video, performance video, webcam video, surveillance video, security video, unmanned aerial vehicle (UAV) video, teleconferencing video, or any other suitable type of video. The video transcoder 104 is operative to transcode the reference video into a transcoded version of the reference video (also referred to herein as a/the “transcoded reference video”), which is compressed according to a second predetermined coding format that is supported by the communications channel 106. By way of non-limiting example, the first and second predetermined coding formats of the reference video and the transcoded reference video, respectively, may be selected from the H.263 coding format, the H.264 coding format, the MPEG-2 coding format, the MPEG-4 coding format, and/or any other suitable coding format(s). The video transcoder 104 is further operative to provide the transcoded reference video for transmission over the communications channel 106, which, for example, can be wire-based, optical fiber-based, wireless, or any suitable combination thereof. Following its transmission over the communications channel 106, the transcoded reference video is referred to herein as the target video. The video decoder 108 is operative to receive the target video, and to decode the target video, thereby generating a decoded version of the target video (also referred to herein as a/the “decoded target video”).
  • It is noted that one or more types of degradation may be introduced into the source video during its processing within the video encoder 102 to generate the reference video. One or more types of degradation may also be introduced into the reference video during its processing within the video transcoder 104, and/or its transmission over the communication channel 106. By way of non-limiting example, such degradation of the source video and/or the reference video may be due to image rotation, additive noise, low-pass filtering, compression losses, transmission losses, and/or any other possible type of degradation. For example, the perceptual quality of each of the source video, the reference video, and the target video can be represented by a predicted mean opinion score (MOS), or any other suitable type of quality assessment score. It is noted that the perceptual quality of the reference video can be represented by a predetermined constant value. It is further noted that the source video is assumed to have the highest perceptual quality in comparison to the reference video and the target video.
  • FIG. 1 further depicts an illustrative embodiment of the exemplary video quality measurement system 101 within the video communications environment 100. As shown in FIG. 1, the video quality measurement system 101 includes a plurality of functional components that can be implemented in a distributed fashion within the video communications environment 100. The plurality of functional components include a target feature extractor 112, a reference feature extractor 114, and a quality assessor 116. The target feature extractor 112 is operative to extract one or more target features from the target video by performing one or more objective measurements with regard to the target video. Similarly, the reference feature extractor 114 is operative to extract one or more reference features from the reference video by performing one or more objective measurements with regard to the reference video. For example, such objective measurements performed with regard to the target video and the reference video can involve one or more spatial quality factors and/or temporal quality factors, and can include objective measurements of blocking artifacts in the respective target and reference videos (also referred to herein as “blockiness measurements”), objective measurements of blur in the respective target and reference videos (also referred to herein as “blurriness measurements”), objective measurements of an average quantization index for the respective target and reference videos, and/or any other suitable types of objective measurements performed with regard to the respective target and reference videos.
  • It is noted that such objective measurements performed with regard to the respective target and reference videos can result in target features and reference features that can be represented by compact data sets, which may be transmitted over a network without consuming an undesirably excessive amount of network bandwidth. It is further noted that the term “quantization index” (also referred to herein as a/the “QI”), as employed herein, corresponds to any suitable parameter that can be adjusted to control the quantization step-size used by a video encoder, such as the video encoder 102, or a video encoder (not shown) within the video transcoder 104. For example, such a QI can correspond to a quantization parameter (also referred to herein as a/the “QP”) for a video bitstream compressed according to the H.264 coding format, a quantization scale for a video bitstream compressed according to the MPEG-2 coding format, or any other suitable parameter for a video bitstream compressed according to any other suitable coding format.
  • The quality assessor 116 is operative to provide an assessment of the perceptual quality of the target video, after it has been processed and transmitted within the video communications environment 100, using one or more prediction functions involving the target features and the reference features. For example, one or more of the prediction functions can be linear prediction functions or non-linear prediction functions. In accordance with the illustrative embodiment of FIG. 1, the quality assessor 116 is operative to estimate the perceptual quality of the target video by obtaining a difference between an estimate of the perceptual quality of the reference video, and an estimate of the predicted differential MOS (also referred to herein as a/the “DMOS”) between at least a portion of the reference video and at least a portion of the target video. For example, the estimate of the perceptual quality of the target video (also referred to herein as a/the “Q̂_tar”) can be expressed as

  • $\hat{Q}_{tar} = \hat{Q}_{ref} - \Delta\hat{Q},$   (1a)
  • in which “Q̂_ref” corresponds to the estimate of the perceptual quality of the reference video, and “ΔQ̂” corresponds to the estimate of the DMOS between the reference video and the target video. It is noted that the DMOS between the reference video and the target video can be expressed as follows,

  • $\Delta Q = Q_{ref} - Q_{tar}.$   (1b)
  • The quality assessor 116 is further operative to calculate or otherwise determine the Q̂_ref using a first prediction function for a predetermined segment from a corresponding time frame within the reference video and the target video. For example, such a predetermined segment can have a duration of about 5 seconds, or any other suitable duration.
  • Moreover, the Q̂_ref can be expressed as

  • $\hat{Q}_{ref} = f_1(QI_{ref}),$   (2)
  • in which “ƒ1(QIref)” corresponds to the first prediction function, and “QIref” corresponds to the QI for the reference video. For example, the first prediction function, ƒ1(QIref), can be a linear function of the QI for the reference video, and/or any other suitable reference feature(s). The quality assessor 116 is further operative to calculate or otherwise determine the ΔQ̂ using a second prediction function for the predetermined segment. For example, the ΔQ̂ can be expressed as

  • Δ Q 2(Δblr,Δblk),   (3)
  • in which “f2(Δblr,Δblk)” corresponds to the second prediction function, “Δblr” corresponds to the average change in the blurriness measurements between the reference video and the target video for the predetermined segment, and “Δblk” corresponds to the average change in the blockiness measurements between the reference video and the target video for the predetermined segment.
  • For example, the second prediction function, ƒ2(Δblr,Δblk), can be a linear function of the average change in the blurriness measurements for the respective target and reference videos, the average change in the blockiness measurements for the respective target and reference videos, and/or any other suitable reference feature(s) and target feature(s). Such blurriness measurements for the respective target and reference videos can be performed using any suitable technique, such as the techniques described in U.S. patent application Ser. No. 12/706,165, filed Feb. 16, 2010, entitled UNIVERSAL BLURRINESS MEASUREMENT APPROACH FOR DIGITAL IMAGERY, which is assigned to the same assignee of the present application, and which is hereby incorporated herein by reference in its entirety. Further, such blockiness measurements for the respective target and reference videos can be performed using any suitable technique, such as the techniques described in U.S. patent application Ser. No. 12/757,389, filed Apr. 9, 2010, entitled BLIND BLOCKING ARTIFACT MEASUREMENT APPROACHES FOR DIGITAL IMAGERY, which is assigned to the same assignee of the present application, and which is hereby incorporated herein by reference in its entirety.
  • In accordance with the illustrative embodiment of FIG. 1, the quality assessor 116 is operative, for each predetermined segment from a corresponding time frame within the reference video and the target video, to perform blurriness measurements in a frame-wise fashion, and to take the average of the frame-wise blurriness measurements to obtain the average blurriness measurements, blrref and blrtar, for the reference video and the target video, respectively. To obtain the average change, Δblr, in the blurriness measurements for the reference video and the target video, the quality assessor 116 is further operative to take the difference between the average blurriness measurements, blrref and blrtar, as follows,

  • $\Delta blr = blr_{ref} - blr_{tar}.$   (4)
  • Similarly, to obtain the average change, Δblk, in the blockiness measurements for the reference video and the target video, the quality assessor 116 is operative, for each predetermined segment from a corresponding time frame within the reference video and the target video, to perform blockiness measurements in a frame-wise fashion, and to take the average of the frame-wise blockiness measurements to obtain the average blockiness measurements, blkref and blktar, for the reference video and the target video, respectively. The quality assessor 116 is further operative to take the difference between the average blockiness measurements, blkref and blktar, as follows,

  • $\Delta blk = blk_{ref} - blk_{tar}.$   (5)
  • In further accordance with the illustrative embodiment of FIG. 1, the quality assessor 116 is operative, at least for some types of objective measurements, to normalize the averaged objective measurements for the predetermined segment from the corresponding time frame within the reference video and the target video, before taking the difference between the averaged objective measurements. Such normalization can be performed to reflect any differences in resolution between the reference video and the target video. For example, the blurriness measurements for the reference video and the target video can each provide a measure in pixels of the spread of an edge of a video frame, or a measure of the change in gradient between adjacent pixels. Because such measures are typically resolution-dependent, the quality assessor 116 can normalize each of the average blurriness measurements, blrref and blrtar, before taking the difference between the average measurements blrref and blrtar, as follows,
  • $\Delta blr = \dfrac{blr_{ref}}{w_{ref}} - \dfrac{blr_{tar}}{w_{tar}},$   (6)
  • in which “w_ref” is the width of each video frame in the predetermined segment corresponding to the reference video, and “w_tar” is the width of each video frame in the predetermined segment corresponding to the target video.
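  • By way of a non-limiting illustration, the following Python sketch shows how the segment-level feature differences of equations (4) through (6) might be computed from frame-wise measurements; the function names are hypothetical, and the frame-wise blurriness and blockiness values are assumed to be produced by separate measurement routines not shown here.

```python
# Hypothetical sketch: segment-level feature differences per equations (4)-(6).
# Frame-wise blurriness/blockiness values are assumed to come from separate
# measurement routines applied to each frame of the segment.

def segment_average(frame_values):
    """Average a list of frame-wise measurements over one segment."""
    return sum(frame_values) / len(frame_values)

def delta_feature(ref_frame_values, tar_frame_values,
                  ref_width=None, tar_width=None):
    """Difference of segment averages; optionally normalized by frame width
    to account for resolution differences, as in equation (6)."""
    avg_ref = segment_average(ref_frame_values)
    avg_tar = segment_average(tar_frame_values)
    if ref_width is not None and tar_width is not None:
        avg_ref /= ref_width
        avg_tar /= tar_width
    return avg_ref - avg_tar

# Example (made-up values):
# delta_blr = delta_feature(blr_ref_frames, blr_tar_frames, 1280, 640)
# delta_blk = delta_feature(blk_ref_frames, blk_tar_frames)
```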
  • Moreover, to obtain the QIref, the quality assessor 116 is further operative, for each predetermined segment from a corresponding time frame within the reference video and the target video, to obtain the QI in a frame-wise fashion, and to take the average of the frame-wise QIs to obtain the QIref for the reference video. For example, each frame-wise QI can be determined by taking the average of the QIs for all of the coded macroblocks in a corresponding video frame.
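  • A minimal sketch of the frame-wise and segment-wise QI averaging described above is given below; it assumes that the per-macroblock quantization indices (for example, H.264 QPs) have already been parsed from the bitstream elsewhere, and the function names are illustrative only.

```python
# Hypothetical sketch: obtaining QI_ref for one segment. Each frame is
# represented by the list of quantization indices of its coded macroblocks.

def frame_qi(macroblock_qis):
    """Frame-wise QI: average QI over all coded macroblocks in the frame."""
    return sum(macroblock_qis) / len(macroblock_qis)

def segment_qi(frames_macroblock_qis):
    """Segment QI: average of the frame-wise QIs over the segment."""
    frame_values = [frame_qi(mb_qis) for mb_qis in frames_macroblock_qis]
    return sum(frame_values) / len(frame_values)

# Example (made-up values):
# qi_ref = segment_qi([[26, 28, 27], [29, 30, 28], [27, 27, 29]])
```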
  • In further accordance with the illustrative embodiment of FIG. 1, the quality assessor 116 is operative to estimate the perceptual quality of the target video, Q̂_tar, using a third prediction function based on the first prediction function, ƒ1(QIref) (see equation (2) above), and the second prediction function, ƒ2(Δblr,Δblk) (see equation (3) above). For example, the Q̂_tar can be expressed as
  • $\hat{Q}_{tar} = f_3(QI_{ref}, \Delta blr, \Delta blk) = a_3 \cdot QI_{ref} + b_3 \cdot \Delta blr + c_3 \cdot \Delta blk + d_3,$   (7)
  • in which “ƒ3(QIref,Δblr,Δblk)” corresponds to the third prediction function, and “a3,” “b3,” “c3,” and “d3” each correspond to a parameter coefficient of the third prediction function. It is noted that the value of each of the parameter coefficients a3, b3, c3, and d3 can be determined using a multi-variate linear regression approach, based on a plurality of predetermined target video bitstreams and their corresponding reference video bitstreams, and ground truth quality values, or any other suitable technique.
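  • For illustration only, the third prediction function of equation (7) might be evaluated as in the following sketch; the default coefficient values are the example values quoted later in this description, and any practical deployment would substitute coefficients fitted on its own training data.

```python
# Hypothetical sketch of evaluating the third prediction function, equation (7).
# The default coefficients are the example values given later in the text and
# are not claimed to be optimal for any particular codec or content.

def predict_q_tar_f3(qi_ref, delta_blr, delta_blk,
                     a3=0.025, b3=0.19, c3=1.85, d3=4.28):
    """Estimate the target-video MOS from QI_ref, delta-blur, delta-blockiness."""
    return a3 * qi_ref + b3 * delta_blr + c3 * delta_blk + d3
```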
  • By way of example, one such technique for determining the parameter coefficients, a3, b3, c3, and d3, includes, for a large number (e.g., greater than about 500) of predetermined segments, collecting the corresponding ground truth quality values and objective feature values for QIref, Δblr, and Δblk. A matrix, X, can then be formed, as follows,

  • $X = [\,QI_{ref} \mid \Delta blr \mid \Delta blk \mid \mathbf{1}\,],$   (8)
  • in which “QIref” is a vector of all of the corresponding QIref values, “Δblr” is a vector of all of the corresponding Δblr values, “Δblk” is a vector of all of the corresponding Δblk values, “1” is a vector of 1s, and “|” indicates column-wise concatenation of the vectors, QIref, Δblr, Δblk, and 1. Next, a vector, y, of all of the ground truth quality values can be formed, and can be related to the matrix, X, as follows,

  • $y = Xp,$   (9)
  • in which “p” is a parameter vector, which can be expressed as

  • $p = (a_3, b_3, c_3, d_3)^T.$   (10)
  • For example, the parameter vector, p, can be determined using a least-squares linear regression approach, as follows,

  • $p = (X^T X)^{-1} X^T y.$   (11)
  • In this way, exemplary values for the parameter coefficients, a3, b3, c3, and d3, can be determined to be equal to about 0.025, 0.19, 1.85, and 4.28, respectively, or any other suitable values.
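  • The least-squares fit of equations (8) through (11) can be sketched as follows, assuming that per-segment feature values and ground truth MOS values are available as arrays; numpy's least-squares solver is used here in place of forming (X^T X)^{-1} explicitly, which is numerically equivalent for well-conditioned data.

```python
# Hypothetical sketch of fitting the parameter coefficients (a3, b3, c3, d3)
# by least-squares linear regression, as in equations (8)-(11). The arrays of
# per-segment feature values and ground-truth MOS values are assumed given.

import numpy as np

def fit_f3_coefficients(qi_ref, delta_blr, delta_blk, ground_truth_mos):
    """Solve y = Xp in the least-squares sense for p = (a3, b3, c3, d3)."""
    qi_ref = np.asarray(qi_ref, dtype=float)
    delta_blr = np.asarray(delta_blr, dtype=float)
    delta_blk = np.asarray(delta_blk, dtype=float)
    y = np.asarray(ground_truth_mos, dtype=float)
    # Column-wise concatenation [QI_ref | delta_blr | delta_blk | 1], equation (8).
    X = np.column_stack([qi_ref, delta_blr, delta_blk, np.ones_like(qi_ref)])
    # Equivalent to p = (X^T X)^-1 X^T y, equation (11), but more stable.
    p, *_ = np.linalg.lstsq(X, y, rcond=None)
    return p  # array([a3, b3, c3, d3])
```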
  • FIGS. 2 a-2 c each depict an exemplary method of providing the target features and the reference features from the target feature extractor 112 and the reference feature extractor 114, respectively, to the quality assessor 116 (see also FIG. 1). As shown in FIG. 2 a, the target feature extractor 112, the reference feature extractor 114, and the quality assessor 116 can be implemented in a distributed fashion within the video communications environment 100 such that the target feature extractor 112 is located proximate to or co-located with the quality assessor 116, such as within an endpoint device, and the reference feature extractor 114 is disposed at a distal or geographically remote location from the target feature extractor 112 and the quality assessor 116. For example, such an endpoint device can be a mobile phone, a mobile or non-mobile computer, a tablet computer, or any other suitable type of mobile or non-mobile endpoint device capable of displaying video. Further, the video quality measurement system 101 comprising the target feature extractor 112, the reference feature extractor 114, and the quality assessor 116 can operate to transmit the reference features from the reference feature extractor 114 at the distal or geographically remote location over at least one side communications channel 202 a to the quality assessor 116, which, in turn, can access the target features from the target feature extractor 112 proximate thereto or co-located therewith for estimating the perceptual quality of the target video.
  • As shown in FIG. 2 b, the target feature extractor 112, the reference feature extractor 114, and the quality assessor 116 can also be implemented in a distributed fashion within the video communications environment 100 such that the reference feature extractor 114 is located proximate to or co-located with the quality assessor 116, and the target feature extractor 112 is disposed at a distal or geographically remote location from the reference feature extractor 114 and the quality assessor 116. Further, the video quality measurement system 101 can operate to transmit the target features from the target feature extractor 112 at the distal or geographically remote location over at least one side communications channel 202 b to the quality assessor 116, which, in turn, can access the reference features from the reference feature extractor 114 proximate thereto or co-located therewith for estimating the perceptual quality of the target video.
  • In addition, and as shown in FIG. 2 c, the target feature extractor 112, the reference feature extractor 114, and the quality assessor 116 can be implemented in a distributed fashion within the video communications environment 100 such that the quality assessor 116 is disposed at a centralized location that is geographically remote from the target feature extractor 112 and the reference feature extractor 114. Further, the video quality measurement system 101 can operate to transmit the target features from the target feature extractor 112 over at least one side communications channel 202 c to the quality assessor 116, and to transmit the reference features from the reference feature extractor 114 over the side communications channel 202 c to the quality assessor 116, for estimating the perceptual quality of the target video within the quality assessor 116 at the geographically remote, centralized location. It is noted that the video quality measurement system 101 can operate to transmit the target features and the reference features to the quality assessor 116 over the same communications channel, or over different communications channels.
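  • Purely by way of illustration, the compact per-segment feature set exchanged over such a side communications channel might resemble the following sketch; the field names and the JSON encoding are assumptions made here for clarity and are not mandated by the disclosed system.

```python
# Hypothetical sketch of the "compact data set" of reduced-reference features
# that one side (e.g., the reference feature extractor) might send over a side
# channel to the quality assessor. Field names are illustrative only.

import json
from dataclasses import dataclass, asdict

@dataclass
class SegmentFeatures:
    segment_index: int     # which time-aligned segment these features describe
    avg_qi: float          # average quantization index over the segment
    avg_blur: float        # average frame-wise blurriness measurement
    avg_blockiness: float  # average frame-wise blockiness measurement
    frame_rate: float      # frames per second
    frame_width: int       # used for resolution normalization

def encode_features(features: SegmentFeatures) -> bytes:
    """Serialize one segment's features; the payload is a few dozen bytes,
    far smaller than the video segment it summarizes."""
    return json.dumps(asdict(features)).encode("utf-8")

# Example (made-up values):
# payload = encode_features(SegmentFeatures(0, 28.4, 3.1, 0.42, 30.0, 1280))
```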
  • Having described the above illustrative embodiments of the video quality measurement system 101, other alternative embodiments or variations may be made. In accordance with one or more such alternative embodiments, the target feature extractor 112 and the reference feature extractor 114 can extract target features from the target video, and reference features from the reference video, respectively, by performing objective measurements with regard to the respective target and reference videos involving one or more additional temporal quality factors including, but not limited to, temporal quality factors relating to frame rates, video motion properties including jerkiness motion, frame dropping impairments, packet loss impairments, freezing impairments, and/or ringing impairments.
  • For example, taking into account the frame rate, in frames per second (fps), of the target video (also referred to herein as “fpstar”), and the frame rate of the reference video (also referred to herein as “fpsref”), the estimate of the predicted DMOS between the reference video and the target video, ΔQ̂, can be expressed in terms of a modified version of the second prediction function (see equation (3) above), as follows,

  • Δ Q 2(Δblr,Δblk,Δfps),   (12)
  • in which “Δblr” corresponds to the average change in the blurriness measurements between the reference video and the target video for a predetermined segment from a corresponding time frame within the reference video and the target video, “Δblk ” corresponds to the average change in the blockiness measurements between the reference video and the target video for the predetermined segment, and “Δfps” corresponds to the average change in the frame rates between the reference video and the target video, (fpsref−fpstar), for the predetermined segment. For example, the Δfps can have a minimum bound at 0 (i.e., zero), such that there is essentially no penalty or benefit for the frame rate of the target video being higher than the frame rate of the reference video.
  • Accordingly, the quality assessor 116 can estimate the perceptual quality of the target video, Q̂_tar, using a fourth prediction function based on the first prediction function, ƒ1(QIref) (see equation (2) above), and the modified second prediction function, ƒ2(Δblr,Δblk,Δfps) (see equation (12) above). For example, the Q̂_tar can be expressed as
  • $\hat{Q}_{tar} = f_4(QI_{ref}, \Delta blr, \Delta blk, \Delta fps) = a_4 \cdot QI_{ref} + b_4 \cdot \Delta blr + c_4 \cdot \Delta blk + d_4 \cdot \Delta fps + e_4,$   (13)
  • in which “f4(QIref,Δblr,Δblk,Δfps)” corresponds to the fourth prediction function, “QIref” corresponds to the QI for the reference video, and “a4,” “b4,” “c4,” “d4,” and “e4” each correspond to a parameter coefficient of the fourth prediction function. For example, the value of each of the parameter coefficients a4, b4, c4, d4, and e4 can be determined using a multi-variate linear regression approach, based on a plurality of predetermined target video bitstreams and their corresponding reference video bitstreams, and ground truth quality values, or any other suitable technique. For example, the parameter coefficients a4, b4, c4, d4, and e4 may be determined or set to be equal to about 0.048, 0.19, 1.08, −0.044, and 5.23, respectively, or any other suitable values.
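  • A hypothetical evaluation of the fourth prediction function of equation (13), including the lower bound of zero on Δfps noted above, might look as follows; the default coefficients are the example values quoted in the preceding paragraph.

```python
# Hypothetical sketch of the fourth prediction function f4, equation (13),
# with the frame-rate difference clamped at zero as described in the text.
# Default coefficients are the example values quoted above.

def predict_q_tar_f4(qi_ref, delta_blr, delta_blk, fps_ref, fps_tar,
                     a4=0.048, b4=0.19, c4=1.08, d4=-0.044, e4=5.23):
    delta_fps = max(0.0, fps_ref - fps_tar)  # no benefit when fps_tar > fps_ref
    return a4 * qi_ref + b4 * delta_blr + c4 * delta_blk + d4 * delta_fps + e4
```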
  • In accordance with one or more further alternative embodiments, the video encoder 102 may be omitted from the video communications environment 100, allowing the source video to take on the role of the reference video. In such a case, the estimate of the perceptual quality of the reference video, Q̂_ref, is assumed to be fixed and known. Further, because the source video is assumed to have the highest perceptual quality, the Q̂_ref estimate now corresponds to the highest perceptual quality in comparison to at least the estimate of the perceptual quality of the target video, Q̂_tar.
  • Accordingly, the quality assessor 116 can estimate the perceptual quality of the target video, Q̂_tar, using a fifth prediction function based on the modified second prediction function, ƒ2(Δblr,Δblk,Δfps) (see equation (12) above). For example, the Q̂_tar can be expressed as
  • $\hat{Q}_{tar} = f_5(\Delta blr, \Delta blk, \Delta fps) = a_5 \cdot \Delta blr + b_5 \cdot \Delta blk + c_5 \cdot \Delta fps + d_5,$   (14)
  • in which “ƒ5(Δblr,Δblk,Δfps)” corresponds to the fifth prediction function, and “a5,” “b5,” “c5,” and “d5” each correspond to a parameter coefficient of the fifth prediction function. For example, the value of each of the parameter coefficients a5, b5, c5, and d5 can be determined using a multi-variate linear regression approach, based on a plurality of predetermined target video bitstreams and their corresponding reference video bitstreams, and ground truth quality values, or any other suitable technique. It is noted that, in the fifth prediction function (see equation (14) above), the fixed, known estimate of the perceptual quality of the reference video, Q̂_ref, can be incorporated into the parameter coefficient, d5. For example, the parameter coefficients a5, b5, c5, and d5 may be determined or set to be equal to about 0.17, 1.81, −0.04, and 4.1, respectively, or any other suitable values.
  • In accordance with one or more additional alternative embodiments, taking into account the video motion properties of the target video, the ΔQ̂ can be expressed in terms of another modified version of the second prediction function (see equation (3) above), as follows,

  • Δ Q 2(Δblr,Δblk,{circumflex over (Q)} tar temporal),   (15)
  • in which “{circumflex over (Q)}tar temporal” corresponds to an estimate of the perceptual quality of the target video for a predetermined segment from a corresponding time frame within the reference video and the target video, taking into account the video motion properties of the target video. For example, {circumflex over (Q)}tar temporal can be expressed as
  • Q ^ tar _ temporal = 1 - - d f tar f max 1 - - d , ( 16 )
  • in which “ftar” corresponds to the frame rate of the target video, and “fmax” corresponds to a maximum frame rate. Further, in equation (16) above, “d” can be expressed as

  • $d = \alpha \cdot e^{\beta \cdot AMD},$   (17)
  • in which “α” and “β” are constants, and “AMD” is a temporal quality factor that can be obtained by taking the sum of absolute mean differences of pixel values in a block-wise fashion between consecutive video frames in the predetermined segment of the target video. For example, for video frames in the common intermediate format (CIF), the constants α and β may be set to be equal to about 9.412 and −0.1347, respectively, or any other suitable values. Further, for video frames in the video graphics array (VGA) format, the constants α and β may be set to be equal to about 8.526 and −0.0575, respectively, or any other suitable values. Moreover, for video frames in the high definition (HD) format, the constants α and β may be set to be equal to about 6.283 and −0.1105, respectively, or any other suitable values.
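  • The temporal quality term of equations (16) and (17) might be computed as in the following sketch, assuming that the AMD value has been obtained elsewhere and using the example (α, β) pairs quoted above for the CIF, VGA, and HD formats; the function and constant names are illustrative only.

```python
# Hypothetical sketch of the motion-aware temporal quality term of equations
# (16) and (17). The AMD value (sum of block-wise absolute mean differences
# between consecutive frames) is assumed to be computed elsewhere; the
# (alpha, beta) pairs are the example CIF/VGA/HD constants quoted above.

import math

FORMAT_CONSTANTS = {
    "CIF": (9.412, -0.1347),
    "VGA": (8.526, -0.0575),
    "HD":  (6.283, -0.1105),
}

def q_tar_temporal(f_tar, f_max, amd_value, video_format="VGA"):
    """Equation (16): (1 - exp(-d * f_tar / f_max)) / (1 - exp(-d))."""
    alpha, beta = FORMAT_CONSTANTS[video_format]
    d = alpha * math.exp(beta * amd_value)   # equation (17)
    return (1.0 - math.exp(-d * f_tar / f_max)) / (1.0 - math.exp(-d))
```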
  • Accordingly, the quality assessor 116 can estimate the perceptual quality of the target video, Q̂_tar, using a sixth prediction function based on the modified second prediction function, ƒ2(Δblr,Δblk,Q̂_tar_temporal) (see equation (15) above). For example, the Q̂_tar can be expressed as
  • $\hat{Q}_{tar} = f_6(\Delta blr, \Delta blk, \hat{Q}_{tar\_temporal}) = a_6 \cdot \Delta blr + b_6 \cdot \Delta blk + c_6 \cdot \hat{Q}_{tar\_temporal} + d_6,$   (18)
  • in which “ƒ6(Δblr,Δblk,Q̂_tar_temporal)” corresponds to the sixth prediction function, and “a6,” “b6,” “c6,” and “d6” each correspond to a parameter coefficient of the sixth prediction function. For example, the value of each of the parameter coefficients a6, b6, c6, and d6 can be determined using a multi-variate linear regression approach, based on a plurality of predetermined target video bitstreams and their corresponding reference video bitstreams, and ground truth quality values, or any other suitable technique. For example, the parameter coefficients a6, b6, c6, and d6 may be determined or set to be equal to about 0.18, 1.74, 2.32, and 1.66, respectively, or any other suitable values.
  • In accordance with one or more further alternative embodiments, taking into account the frame dropping impairments of the target video, the ΔQ̂ can be expressed in terms of still another modified version of the second prediction function (see equation (3) above), as follows,

  • Δ Q 2(Δblr,Δblk,NIFVQ(fpstar)),   (19)
  • in which “NIFVQ(fpstar)” is a temporal quality factor representative of the negative impact of such frame dropping impairments on the perceptual quality of the target video for a predetermined segment from a corresponding time frame within the reference video and the target video, and “fpstar” corresponds to the frame rate of the target video, for a current video frame in the predetermined segment. For example, NIFVQ(fpstar) can be expressed as

  • $NIFVQ(fps_{tar}) = \log(30) - \log(fps_{tar})$   (20)

  • or

  • $NIFVQ(fps_{tar}) = AMD \cdot [\log(30) - \log(fps_{tar})],$   (21)
  • in which “AMD” is the temporal quality factor that can be obtained by taking the sum of the block-wise absolute mean differences of pixel values between consecutive video frames in the predetermined segment of the target video. It is noted that in the exemplary equations (20) and (21) above, it has been assumed that the maximum frame rate of the reference video is 30 frames per second.
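  • The following sketch illustrates one plausible computation of the AMD factor for a single pair of consecutive frames, together with NIFVQ per equations (20) and (21); the frames are assumed to be 2-D numpy arrays of luma samples, and the 16×16 block size is an assumption made here rather than a value specified in the text.

```python
# Hypothetical sketch of the AMD temporal factor and of NIFVQ per equations
# (20) and (21). AMD is computed here for one consecutive-frame pair; values
# would be aggregated over the predetermined segment elsewhere.

import math
import numpy as np

def amd(frame_prev, frame_curr, block=16):
    """Sum of block-wise absolute differences of mean pixel values between
    two consecutive frames (block size is an illustrative assumption)."""
    frame_prev = np.asarray(frame_prev, dtype=float)
    frame_curr = np.asarray(frame_curr, dtype=float)
    h, w = frame_curr.shape
    total = 0.0
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            mean_prev = frame_prev[y:y + block, x:x + block].mean()
            mean_curr = frame_curr[y:y + block, x:x + block].mean()
            total += abs(mean_curr - mean_prev)
    return total

def nifvq(fps_tar, amd_value=None):
    """Equation (20) if amd_value is None, otherwise equation (21);
    assumes a 30 fps maximum reference frame rate, as in the text."""
    base = math.log(30.0) - math.log(fps_tar)
    return base if amd_value is None else amd_value * base
```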
  • Accordingly, the quality assessor 116 can estimate the perceptual quality of the target video, Q̂_tar, using a seventh prediction function based on the modified second prediction function, ƒ2(Δblr,Δblk,NIFVQ(fpstar)) (see equation (19) above). For example, the Q̂_tar can be expressed as
  • $\hat{Q}_{tar} = f_7(\Delta blr, \Delta blk, NIFVQ(fps_{tar})) = a_7 \cdot \Delta blr + b_7 \cdot \Delta blk + c_7 \cdot NIFVQ(fps_{tar}) + d_7,$   (22)
  • in which “ƒ7(Δblr,Δblk,NIFVQ(fpstar))” corresponds to the seventh prediction function, and “a7,” “b7,” “c7,” and “d7” each correspond to a parameter coefficient of the seventh prediction function. For example, the value of each of the parameter coefficients a7, b7, c7, and d7 can be determined using a multi-variate linear regression approach, based on a plurality of predetermined target video bitstreams and their corresponding reference video bitstreams, and ground truth quality values, or any other suitable technique. For example, in the event the temporal quality factor, NIFVQ(fpstar), is determined using equation (20) above, the parameter coefficients a7, b7, c7, and d7 may be determined or set to be equal to about 0.17, 1.82, −0.68, and 4.07, respectively, or any other suitable values. Further, in the event the temporal quality factor, NIFVQ(fpstar), is determined using equation (21) above, the parameter coefficients a7, b7, c7, and d7 may be determined or set to be equal to about 0.18, 1.84, −0.02, and 3.87, respectively, or any other suitable values.
  • In accordance with one or more additional alternative embodiments, taking into account the jerkiness motion in the target video, the ΔQ̂ can be expressed in terms of yet another modified version of the second prediction function (see equation (3) above), as follows,

  • Δ Q 2(Δblr,Δblk,JM),   (23)
  • in which “JM” is a temporal quality factor representative of a jerkiness measurement performed with regard to the target video for a predetermined segment from a corresponding time frame within the reference video and the target video. For example, JM can be expressed as
  • $JM = \dfrac{30}{fps_{tar}} \cdot \dfrac{1}{MN} \sum_{x=1}^{M} \sum_{y=1}^{N} \left| f_i(x,y) - f_{i-1}(x,y) \right|$   (24)
  • in which “fpstar” is the frame rate of the target video, “M” and “N” are the dimensions of each video frame in the target video, and “|ƒi(x,y)−ƒi−1(x,y)|” represents the direct frame difference between consecutive video frames at times “i” and “i−1” in the target video. It is noted that the temporal quality factor, AMD, may be similar to the direct frame difference, |ƒi(x,y)−ƒi−1(x,y)|, employed in equation (24) above. The temporal quality factor, JM, can therefore be alternatively expressed in terms of the AMD as follows,
  • $JM = \dfrac{30}{fps_{tar}} \cdot AMD$   (25)
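  • A minimal sketch of the jerkiness measurement of equations (24) and (25) follows, again assuming 2-D numpy luma arrays for consecutive frames of the target video; the 30 fps constant reflects the assumed maximum reference frame rate noted earlier, and the function names are illustrative only.

```python
# Hypothetical sketch of the jerkiness measurement JM of equations (24) and
# (25), for one pair of consecutive target-video frames.

import numpy as np

def jerkiness(frame_prev, frame_curr, fps_tar):
    """Equation (24): (30 / fps_tar) times the mean absolute frame difference."""
    frame_prev = np.asarray(frame_prev, dtype=float)
    frame_curr = np.asarray(frame_curr, dtype=float)
    m, n = frame_curr.shape
    frame_diff = np.abs(frame_curr - frame_prev)
    return (30.0 / fps_tar) * frame_diff.sum() / (m * n)

def jerkiness_from_amd(amd_value, fps_tar):
    """Equation (25): JM expressed in terms of the AMD temporal factor."""
    return (30.0 / fps_tar) * amd_value
```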
  • Accordingly, the quality assessor 116 can estimate the perceptual quality of the target video, Q̂_tar, using an eighth prediction function based on the modified second prediction function, ƒ2(Δblr,Δblk,JM) (see equation (23) above). For example, the Q̂_tar can be expressed as
  • $\hat{Q}_{tar} = f_8(\Delta blr, \Delta blk, JM) = a_8 \cdot \Delta blr + b_8 \cdot \Delta blk + c_8 \cdot JM + d_8,$   (26)
  • in which “ƒ8(Δblr,Δblk,JM)” corresponds to the eighth prediction function, and “a8,” “b8,” “c8,” and “d8” each correspond to a parameter coefficient of the eighth prediction function. For example, the value of each of the parameter coefficients a8, b8, c8, and d8 can be determined using a multi-variate linear regression approach, based on a plurality of predetermined target video bitstreams and their corresponding reference video bitstreams, and ground truth quality values, or any other suitable technique. For example, the parameter coefficients a8, b8, c8, and d8 may be set to be equal to about 0.19, 1.92, −0.006, and 3.87, respectively, or any other suitable values.
  • An illustrative method of operating the video quality measurement system 101 of FIG. 1 is described below with reference to FIG. 3, as well as FIG. 1. As depicted in step 302, a target video whose perceptual quality is to be measured is received over at least one communications channel. As depicted in step 304, information pertaining to one or more target features is extracted from the target video by the target feature extractor 112 (see FIG. 1). As depicted in step 306, information pertaining to one or more reference features is extracted from a reference version of the target video by the reference feature extractor 114 (see FIG. 1). As depicted in step 308, a measurement of the perceptual quality of the reference video is provided, by the quality assessor 116 (see FIG. 1), based on a first prediction function for a predetermined segment from a corresponding time frame within the target video and the reference video, wherein the first prediction function involves at least one of the reference features. As depicted in step 310, a measurement of a predicted differential mean opinion score (DMOS) between at least a portion of the target video and at least a portion of the reference video is provided, by the quality assessor 116, based on a second prediction function for the predetermined segment, wherein the second prediction function involves at least one of the target features and at least one of the reference features. As depicted in step 312, the perceptual quality of the target video is measured using a third prediction function for the predetermined segment, wherein the third prediction function is based on the first prediction function and the second prediction function.
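  • To make the flow of FIG. 3 concrete, the following end-to-end sketch ties the steps together for a single segment under the third prediction function of equation (7); the dictionary keys, the function name, and the default coefficients are illustrative assumptions, not elements of the claimed system.

```python
# Hypothetical end-to-end sketch of the method of FIG. 3 for one segment.
# The per-segment feature extraction (steps 302-306) is assumed to have been
# performed elsewhere; the default coefficients are the example values for
# equation (7) quoted earlier in the description.

def measure_segment_quality(ref_features, tar_features,
                            a3=0.025, b3=0.19, c3=1.85, d3=4.28):
    """Estimate the target-video MOS for one time-aligned segment."""
    # Step 308: reference-video quality term, driven by QI_ref (equation (2)).
    qi_ref = ref_features["avg_qi"]
    # Step 310: DMOS-related feature differences (equations (4) and (5)).
    d_blr = ref_features["avg_blur"] - tar_features["avg_blur"]
    d_blk = ref_features["avg_blockiness"] - tar_features["avg_blockiness"]
    # Step 312: combined third prediction function (equation (7)).
    return a3 * qi_ref + b3 * d_blr + c3 * d_blk + d3

# Example usage with made-up feature values:
# ref = {"avg_qi": 28.0, "avg_blur": 3.0, "avg_blockiness": 0.40}
# tar = {"avg_blur": 3.6, "avg_blockiness": 0.55}
# mos_estimate = measure_segment_quality(ref, tar)
```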
  • It is noted that the operations depicted and/or described herein are purely exemplary, and imply no particular order. Further, the operations can be used in any sequence, when appropriate, and/or can be partially used. With the above illustrative embodiments in mind, it should be understood that such illustrative embodiments can employ various computer-implemented operations involving data transferred or stored in computer systems. Such operations are those requiring physical manipulation of physical quantities. Typically, though not necessarily, such quantities take the form of electrical, magnetic, and/or optical signals capable of being stored, transferred, combined, compared, and/or otherwise manipulated.
  • Further, any of the operations depicted and/or described herein that form part of the illustrative embodiments are useful machine operations. The illustrative embodiments also relate to a device or an apparatus for performing such operations. The apparatus can be specially constructed for the required purpose, or can be a general-purpose computer selectively activated or configured by a computer program stored in the computer. In particular, various general-purpose machines employing one or more processors coupled to one or more computer readable media can be used with computer programs written in accordance with the teachings disclosed herein, or it may be more convenient to construct a more specialized apparatus to perform the required operations.
  • The presently disclosed systems and methods can also be embodied as computer readable code on a computer readable medium. The computer readable medium is any data storage device that can store data, which can thereafter be read by a computer system. Examples of such computer readable media include hard drives, read-only memory (ROM), random-access memory (RAM), CD-ROMs, CD-Rs, CD-RWs, magnetic tapes, and/or any other suitable optical or non-optical data storage devices. The computer readable media can also be distributed over a network-coupled computer system, so that the computer readable code can be stored and/or executed in a distributed fashion.
  • The foregoing description has been directed to particular illustrative embodiments of this disclosure. It will be apparent, however, that other variations and modifications may be made to the described embodiments, with the attainment of some or all of their associated advantages. Moreover, the procedures, processes, and/or modules described herein may be implemented in hardware, software, embodied as a computer-readable medium having program instructions, firmware, or a combination thereof. For example, the functions described herein may be performed by a processor executing program instructions out of a memory or other storage device.
  • It will be appreciated by those skilled in the art that modifications to and variations of the above-described systems and methods may be made without departing from the inventive concepts disclosed herein. Accordingly, the disclosure should not be viewed as limited except as by the scope and spirit of the appended claims.

Claims (33)

1. A method of measuring perceptual quality of video, the video being provided over at least one communications channel, the method comprising the steps of:
receiving a target video over the at least one communications channel;
extracting, from the target video, information pertaining to one or more target features of the target video;
receiving information pertaining to one or more reference features of a reference version of the target video;
providing a measurement of perceptual quality of the reference version of the target video based at least on the information pertaining to the one or more reference features;
providing a measurement of a predicted differential mean opinion score (DMOS) between at least a portion of the target video and at least a portion of the reference version of the target video based at least on the information pertaining to the one or more reference features and the information pertaining to the one or more target features; and
providing a measurement of perceptual quality of the target video, the providing of the measurement of perceptual quality of the target video including obtaining a difference between the measurement of perceptual quality of the reference version of the target video, and the measurement of the predicted DMOS between at least the portion of the target video and at least the portion of the reference version of the target video.
2. The method of claim 1 wherein the providing of the measurement of perceptual quality of the target video comprises:
providing the measurement of perceptual quality of the target video based at least on one or more prediction functions, each of the one or more prediction functions being a function of one or more of the information pertaining to the one or more reference features and the information pertaining to the one or more target features.
3. The method of claim 2 wherein the providing of the measurement of perceptual quality of the target video comprises:
providing the measurement of perceptual quality of the target video based at least on the one or more prediction functions, each of the one or more prediction functions pertaining to a predetermined segment from a corresponding time frame within the target video and the reference version of the target video.
4. The method of claim 2 wherein at least one of the one or more prediction functions is a function of at least one or more objective measurements performable with regard to the target video and the reference version of the target video, and wherein the providing of the measurement of perceptual quality of the target video comprises:
normalizing the one or more objective measurements to reflect differences in resolution between the target video and the reference version of the target video.
5. The method of claim 1 wherein the providing of the measurement of perceptual quality of the reference version of the target video comprises:
setting the measurement of perceptual quality of the reference version of the target video to a predetermined constant value.
6. The method of claim 1 wherein the providing of the measurement of perceptual quality of the reference version of the target video further comprises:
providing the measurement of perceptual quality of the reference version of the target video based at least on a first prediction function for a predetermined segment from a corresponding time frame within the target video and the reference version of the target video.
7. The method of claim 6 wherein the first prediction function is a function of at least a quantization index for the reference version of the target video.
8. The method of claim 7 wherein the quantization index corresponds to one of a quantization parameter and a quantization scale for the reference version of the target video.
9. The method of claim 8 wherein the quantization parameter corresponds to the H.264 video coding format.
10. The method of claim 8 wherein the quantization scale corresponds to the MPEG-2 video coding format.
11. The method of claim 6 wherein the providing of the measurement of the predicted DMOS between at least the portion of the target video and at least the portion of the reference version of the target video comprises:
providing the measurement of the predicted DMOS between at least the portion of the target video and at least the portion of the reference version of the target video based at least on a second prediction function for the predetermined segment from the corresponding time frame within the target video and the reference version of the target video.
12. The method of claim 11 wherein the second prediction function is a function of one or more of (a) an objective measurement of blur in the target video and an objective measurement of blur in the reference version of the target video, (b) an objective measurement of blocking artifacts in the target video and an objective measurement of blocking artifacts in the reference version of the target video, (c) a frame rate for the target video and a frame rate for the reference version of the target video, (d) video motion properties of the target video, (e) frame dropping impairments of the target video, (f) packet loss impairments of the target video, (g) frame freezing impairments of the target video, and (h) ringing impairments of the target video.
13. The method of claim 12 wherein the second prediction function is a linear prediction function.
14. The method of claim 12 wherein the video motion properties include jerkiness motion properties of the target video.
15. The method of claim 11 wherein the providing of the measurement of perceptual quality of the target video comprises:
measuring the perceptual quality of the target video using a third prediction function for the predetermined segment from the corresponding time frame within the target video and the reference version of the target video, the third prediction function being based at least on the first prediction function and the second prediction function.
16. The method of claim 1 wherein the extracting of the information pertaining to the one or more target features of the target video comprises:
performing one or more objective measurements with regard to the target video, the one or more objective measurements including one or more of (a) objective measurements of blur in the target video, (b) objective measurements of blocking artifacts in the target video, and (c) objective measurements of an average quantization index for the target video.
17. The method of claim 1 further comprising:
extracting, from the reference version of the target video, the information pertaining to the one or more reference features of the reference version of the target video.
18. The method of claim 17 wherein the extracting of the information pertaining to the one or more reference features of the reference version of the target video comprises:
performing one or more objective measurements with regard to the reference version of the target video, the one or more objective measurements including one or more of (a) objective measurements of blur in the reference version of the target video, (b) objective measurements of blocking artifacts in the reference version of the target video, and (c) objective measurements of an average quantization index for the reference version of the target video.
19. The method of claim 17 wherein the extracting of the information pertaining to the one or more target features of the target video comprises:
performing an objective measurement of blur in the target video, and
wherein the extracting of the information pertaining to the one or more reference features of the reference version of the target video comprises:
performing an objective measurement of blur in the reference version of the target video.
20. The method of claim 19 wherein the providing of the measurement of perceptual quality of the target video comprises:
normalizing the objective measurement of blur in the target video; and
normalizing the objective measurement of blur in the reference version of the target video, thereby reflecting differences in resolution between the target video and the reference version of the target video.
21. The method of claim 2 wherein at least one of the one or more prediction functions includes a plurality of parameter coefficients, and wherein the method further comprises:
determining the plurality of parameter coefficients using at least a multi-variate linear regression technique.
22. The method of claim 2 wherein at least one of the one or more prediction functions is a function of one or more of (a) an average change between a measurement of blur in the target video and a measurement of blur in the reference version of the target video, (b) an average change between a measurement of blocking artifacts in the target video and a measurement of blocking artifacts in the reference version of the target video, (c) an average change between a frame rate for the target video and a frame rate for the reference version of the target video, (d) video motion properties of the target video, (e) frame dropping impairments of the target video, (f) packet loss impairments of the target video, (g) frame freezing impairments of the target video, and (h) ringing impairments of the target video.
23. The method of claim 22 wherein at least one of the one or more prediction functions is a linear prediction function.
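Continuing the sketch, a linear prediction function of the kind recited in claims 22-23 might combine a few of the listed average feature changes; restricting it to three features here is an assumption made for brevity.

    def linear_dmos_prediction(coeffs, delta_blur, delta_blocking, delta_frame_rate):
        # Illustrative linear prediction function: an intercept plus a weighted
        # sum of average feature changes between the target video and its
        # reference version.  `coeffs` is assumed to come from a fit such as
        # fit_parameter_coefficients above.
        c0, c1, c2, c3 = coeffs
        return c0 + c1 * delta_blur + c2 * delta_blocking + c3 * delta_frame_rate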
24. A video quality measurement system, comprising:
a target feature extractor operative to extract, from a target video, information pertaining to one or more target features of the target video; and
a quality assessor operative:
to receive information pertaining to one or more reference features of a reference version of the target video;
to provide a measurement of perceptual quality of the reference version of the target video based at least on the information pertaining to the one or more reference features;
to provide a measurement of a predicted differential mean opinion score (DMOS) between at least a portion of the target video and at least a portion of the reference version of the target video based at least on the information pertaining to the one or more reference features and the information pertaining to the one or more target features; and
to obtain a difference between the measurement of perceptual quality of the reference version of the target video, and the measurement of the predicted DMOS between at least the portion of the target video and at least the portion of the reference version of the target video, thereby providing a measurement of perceptual quality of the target video.
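For illustration, a minimal sketch of the quality assessor of claim 24, assuming (hypothetically) that both the reference-quality model and the DMOS model are linear in their features; the class, method, and coefficient names are invented for this sketch.

    class QualityAssessor:
        # Minimal sketch of the quality assessor of claim 24.  The coefficient
        # vectors are hypothetical and would be obtained by training, e.g. the
        # regression sketched after claim 21.

        def __init__(self, ref_coeffs, dmos_coeffs):
            self.ref_coeffs = ref_coeffs
            self.dmos_coeffs = dmos_coeffs

        def assess(self, ref_features, target_features):
            # Predicted perceptual quality of the reference version, from the
            # reference features alone.
            ref_quality = self._linear(self.ref_coeffs, ref_features)
            # Predicted DMOS, from the change between target and reference features.
            deltas = [t - r for t, r in zip(target_features, ref_features)]
            predicted_dmos = self._linear(self.dmos_coeffs, deltas)
            # Final step of claim 24: the target quality is the difference
            # between the reference quality and the predicted DMOS.
            return ref_quality - predicted_dmos

        @staticmethod
        def _linear(coeffs, values):
            return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], values))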
25. The system of claim 24 wherein the quality assessor is further operative to provide the measurement of perceptual quality of the target video based at least on one or more prediction functions, each of the one or more prediction functions being a function of one or more of the information pertaining to the one or more target features and the information pertaining to the one or more reference features.
26. The system of claim 25 wherein the quality assessor is further operative to provide the measurement of perceptual quality of the target video based at least on the one or more prediction functions, each of the one or more prediction functions pertaining to a predetermined segment from a corresponding time frame within the target video and the reference version of the target video.
27. The system of claim 25 wherein at least one of the one or more prediction functions is a function of at least one or more objective measurements performable with regard to the target video and the reference version of the target video, and wherein the quality assessor is further operative to normalize the one or more objective measurements to reflect differences in resolution between the target video and the reference version of the target video.
28. The system of claim 25 wherein at least one of the one or more prediction functions is a function of one or more of (a) an average change between a measurement of blur in the target video and a measurement of blur in the reference version of the target video, (b) an average change between a measurement of blocking artifacts in the target video and a measurement of blocking artifacts in the reference version of the target video, (c) an average change between a frame rate for the target video and a frame rate for the reference version of the target video, (d) video motion properties of the target video, (e) frame dropping impairments of the target video, (f) packet loss impairments of the target video, (g) frame freezing impairments of the target video, and (h) ringing impairments of the target video.
29. The system of claim 28 wherein at least one of the one or more prediction functions is a linear prediction function.
30. A distributed system for measuring perceptual quality of video, the video having one or more features, the one or more features of the video being provided over at least one communications channel, the system comprising:
a target feature extractor operative to extract, from a target video whose perceptual quality is to be measured, information pertaining to one or more target features of the target video;
a reference feature extractor operative to extract, from a reference version of the target video, information pertaining to one or more reference features of the target video; and
a quality assessor that is geographically remote from the target feature extractor and the reference feature extractor, the quality assessor being operative:
to receive the one or more target features from the target feature extractor over the at least one communications channel;
to receive the one or more reference features from the reference feature extractor over the at least one communications channel; and
to provide a measurement of perceptual quality of the target video based at least on the one or more target features and the one or more reference features.
31. The system of claim 30 wherein the quality assessor is further operative:
to provide the measurement of perceptual quality of the target video based at least on one or more prediction functions involving the one or more target features and the one or more reference features.
32. A distributed system for measuring perceptual quality of video, the video having one or more features, the one or more features of the video being provided over at least one communications channel, the system comprising:
a target feature extractor operative to extract, from a target video whose perceptual quality is to be measured, information pertaining to one or more target features of the target video;
a reference feature extractor operative to extract, from a reference version of the target video, information pertaining to one or more reference features of the target video; and
a quality assessor that is co-located with the reference feature extractor and geographically remote from the target feature extractor, the quality assessor being operative:
to receive the one or more reference features from the reference feature extractor;
to receive the one or more target features from the target feature extractor over the at least one communications channel; and
to provide a measurement of perceptual quality of the target video based at least on the one or more target features and the one or more reference features.
33. The system of claim 32 wherein the quality assessor is further operative to provide the measurement of perceptual quality of the target video based at least on one or more prediction functions involving the one or more target features and the one or more reference features.
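In the distributed arrangements of claims 30-33, only a handful of scalar feature measurements need to cross the communications channel, not the video itself. The JSON wire format below is a hypothetical example of how a feature extractor might package those measurements for a remote (or co-located) quality assessor; it is not prescribed by the claims.

    import json

    def serialize_features(features: dict) -> bytes:
        # Hypothetical wire format: a small dictionary of scalar feature
        # measurements, encoded as UTF-8 JSON for transmission to the assessor.
        return json.dumps(features, sort_keys=True).encode("utf-8")

    def deserialize_features(payload: bytes) -> dict:
        return json.loads(payload.decode("utf-8"))

    # Example: the target feature extractor sends its measurements to the
    # quality assessor over the communications channel.
    payload = serialize_features({"blur": 3.2, "blocking": 0.7, "frame_rate": 24.0})
    received = deserialize_features(payload)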
US13/151,761 2011-06-02 2011-06-02 Method and apparatus for reduced reference video quality measurement Active 2031-10-10 US8520075B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US13/151,761 US8520075B2 (en) 2011-06-02 2011-06-02 Method and apparatus for reduced reference video quality measurement

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US13/151,761 US8520075B2 (en) 2011-06-02 2011-06-02 Method and apparatus for reduced reference video quality measurement

Publications (2)

Publication Number Publication Date
US20120307074A1 true US20120307074A1 (en) 2012-12-06
US8520075B2 US8520075B2 (en) 2013-08-27

Family

ID=47261397

Family Applications (1)

Application Number Title Priority Date Filing Date
US13/151,761 Active 2031-10-10 US8520075B2 (en) 2011-06-02 2011-06-02 Method and apparatus for reduced reference video quality measurement

Country Status (1)

Country Link
US (1) US8520075B2 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8867013B2 (en) * 2012-01-26 2014-10-21 Avaya Inc. System and method for measuring video quality degradation using face detection
US9369742B2 (en) * 2012-12-06 2016-06-14 Avaya Inc. System and method to estimate end-to-end video frame delays
US9414081B1 (en) * 2014-11-04 2016-08-09 Sprint Communications Company L.P. Adaptation of digital image transcoding based on estimated mean opinion scores of digital images
US9917952B2 (en) 2016-03-31 2018-03-13 Dolby Laboratories Licensing Corporation Evaluation of perceptual delay impact on conversation in teleconferencing system
US10735742B2 (en) 2018-11-28 2020-08-04 At&T Intellectual Property I, L.P. Adaptive bitrate video testing

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5446492A (en) * 1993-01-19 1995-08-29 Wolf; Stephen Perception-based video quality measurement system
US6496221B1 (en) * 1998-11-02 2002-12-17 The United States Of America As Represented By The Secretary Of Commerce In-service video quality measurement system utilizing an arbitrary bandwidth ancillary data channel
US7705881B2 (en) * 2003-08-22 2010-04-27 Nippon Telegraph And Telepone Corporation Video quality assessing apparatus, video quality assessing method, and video quality assessing program
US7668397B2 (en) * 2005-03-25 2010-02-23 Algolith Inc. Apparatus and method for objective assessment of DCT-coded video quality with or without an original video sequence

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9674263B2 (en) 2011-07-14 2017-06-06 Vmware, Inc. Measurement of remote display responsiveness to application display changes
US9614892B2 (en) 2011-07-14 2017-04-04 Vmware, Inc. Method and system for measuring display performance of a remote application
US9236024B2 (en) 2011-12-06 2016-01-12 Glasses.Com Inc. Systems and methods for obtaining a pupillary distance measurement using a mobile computing device
US20130148731A1 (en) * 2011-12-09 2013-06-13 General Instrument Corporation Encoding and decoding using perceptual representations
US9503756B2 (en) * 2011-12-09 2016-11-22 Arris Enterprises, Inc. Encoding and decoding using perceptual representations
US9286715B2 (en) 2012-05-23 2016-03-15 Glasses.Com Inc. Systems and methods for adjusting a virtual try-on
US9235929B2 (en) 2012-05-23 2016-01-12 Glasses.Com Inc. Systems and methods for efficiently processing virtual 3-D data
US9311746B2 (en) 2012-05-23 2016-04-12 Glasses.Com Inc. Systems and methods for generating a 3-D model of a virtual try-on product
US9378584B2 (en) 2012-05-23 2016-06-28 Glasses.Com Inc. Systems and methods for rendering virtual try-on products
US9483853B2 (en) 2012-05-23 2016-11-01 Glasses.Com Inc. Systems and methods to display rendered images
US9208608B2 (en) 2012-05-23 2015-12-08 Glasses.Com, Inc. Systems and methods for feature tracking
US10147233B2 (en) 2012-05-23 2018-12-04 Glasses.Com Inc. Systems and methods for generating a 3-D model of a user for a virtual try-on product
US20140098899A1 (en) * 2012-10-05 2014-04-10 Cheetah Technologies, L.P. Systems and processes for estimating and determining causes of video artifacts and video source delivery issues in a packet-based video broadcast system
US9674265B2 (en) 2013-11-04 2017-06-06 Vmware, Inc. Filtering unnecessary display updates for a networked client
US9674518B2 (en) * 2013-12-20 2017-06-06 Vmware, Inc. Measuring remote video display with embedded pixels
US20150181207A1 (en) * 2013-12-20 2015-06-25 Vmware, Inc. Measuring Remote Video Display with Embedded Pixels
US9699247B2 (en) 2014-06-17 2017-07-04 Vmware, Inc. User experience monitoring for application remoting
CN108139757A (en) * 2015-09-11 2018-06-08 深圳市大疆创新科技有限公司 Systems and methods for detecting and tracking movable objects
EP3573338A1 (en) * 2018-05-25 2019-11-27 Carrier Corporation Video device and network quality evaluation/diagnostic tool
US10944993B2 (en) 2018-05-25 2021-03-09 Carrier Corporation Video device and network quality evaluation/diagnostic tool
EP3855752A1 (en) * 2020-01-23 2021-07-28 Modaviti Emarketing Pvt Ltd Artificial intelligence based perceptual video quality assessment system
EP4068779A1 (en) * 2021-03-31 2022-10-05 Hulu, LLC Cross-validation of video encoding
JP2022158941A (en) * 2021-03-31 2022-10-17 フル・エルエルシー Cross-validation of video encoding
US11622116B2 (en) 2021-03-31 2023-04-04 Hulu, LLC Cross-validation of video encoding
JP7342166B2 (en) 2021-03-31 2023-09-11 フル・エルエルシー Cross-validation of video encoding

Also Published As

Publication number Publication date
US8520075B2 (en) 2013-08-27

Similar Documents

Publication Publication Date Title
US8520075B2 (en) Method and apparatus for reduced reference video quality measurement
US8804815B2 (en) Support vector regression based video quality prediction
US8737486B2 (en) Objective image quality assessment device of video quality and automatic monitoring device
JP4817246B2 (en) Objective video quality evaluation system
Engelke et al. Modelling saliency awareness for objective video quality assessment
US10009611B2 (en) Visual quality measure for real-time video processing
EP2553935B1 (en) Video quality measurement
KR101416265B1 (en) Apparatus for measuring quality of video data based on hybrid type and method thereof
US20140321552A1 (en) Optimization of Deblocking Filter Parameters
US20110255589A1 (en) Methods of compressing data and methods of assessing the same
US20010053186A1 (en) Computer-readable medium having image decoding program stored thereon
US8902973B2 (en) Perceptual processing techniques for video transcoding
US20130195206A1 (en) Video coding using eye tracking maps
US20170257635A1 (en) Rate control for content transcoding
Joskowicz et al. A parametric model for perceptual video quality estimation
US20170339410A1 (en) Quality Metric for Compressed Video
US20130121422A1 (en) Method And Apparatus For Encoding/Decoding Data For Motion Detection In A Communication System
US8755613B2 (en) Method for measuring flicker
KR20100071820A (en) Method and apparatus for measuring quality of video
JP5013487B2 (en) Video objective image quality evaluation system
EP2875640B1 (en) Video quality assessment at a bitstream level
EP2888877B1 (en) Method and apparatus for estimating content complexity for video quality assessment
US8472529B2 (en) Estimating complexity of video frames for encoding
Oelbaum et al. Building a reduced reference video quality metric with very low overhead using multivariate data analysis
Lu Image analysis for video artifact estimation and measurement

Legal Events

Date Code Title Description
AS Assignment

Owner name: DIALOGIC INC., NEW JERSEY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BHAGAVATHY, SITARAM;BLOOM, JEFFREY A;ZOU, DEKUN;AND OTHERS;SIGNING DATES FROM 20110526 TO 20110528;REEL/FRAME:026393/0959

AS Assignment

Owner name: DIALOGIC (US) INC., NEW JERSEY

Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE NAME OF THE RECEIVING PARTY IDENTIFIED PREVIOUSLY AS "DIALOGIC INC." PREVIOUSLY RECORDED ON REEL 026393 FRAME 0959. ASSIGNOR(S) HEREBY CONFIRMS THE NAME OF THE RECEIVING PARTY SHOULD READ "DIALOGIC (US) INC.";ASSIGNORS:BHAGAVATHY, SITARAM;BLOOM, JEFFREY A.;ZOU, DEKUN;AND OTHERS;SIGNING DATES FROM 20110526 TO 20110528;REEL/FRAME:027634/0920

AS Assignment

Owner name: DIALOGIC CORPORATION, CANADA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:DIALOGIC (US) INC.;REEL/FRAME:027655/0039

Effective date: 20120203

AS Assignment

Owner name: OBSIDIAN, LLC, AS COLLATERAL AGENT, CALIFORNIA

Free format text: SUPPLEMENTAL INTELLECTUAL PROPERTY SECURITY AGREEMENT;ASSIGNORS:DIALOGIC INC.;DIALOGIC CORPORATION;DIALOGIC NETWORKS (ISRAEL) LTD.;REEL/FRAME:027931/0001

Effective date: 20120322

STCF Information on status: patent grant

Free format text: PATENTED CASE

AS Assignment

Owner name: EXCEL SECURITIES CORPORATION, NEW JERSEY

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OBSIDIAN, LLC;REEL/FRAME:034468/0654

Effective date: 20141124

Owner name: DIALOGIC US HOLDINGS INC., NEW JERSEY

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OBSIDIAN, LLC;REEL/FRAME:034468/0654

Effective date: 20141124

Owner name: DIALOGIC RESEARCH INC., F/K/A EICON NETWORKS RESEA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OBSIDIAN, LLC;REEL/FRAME:034468/0654

Effective date: 20141124

Owner name: BROOKTROUT TECHNOLOGY, INC., NEW JERSEY

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OBSIDIAN, LLC;REEL/FRAME:034468/0654

Effective date: 20141124

Owner name: DIALOGIC CORPORATION, F/K/A EICON NETWORKS CORPORA

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OBSIDIAN, LLC;REEL/FRAME:034468/0654

Effective date: 20141124

Owner name: DIALOGIC INC., NEW JERSEY

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OBSIDIAN, LLC;REEL/FRAME:034468/0654

Effective date: 20141124

Owner name: EAS GROUP, INC., NEW JERSEY

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OBSIDIAN, LLC;REEL/FRAME:034468/0654

Effective date: 20141124

Owner name: SNOWSHORE NETWORKS, INC., NEW JERSEY

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OBSIDIAN, LLC;REEL/FRAME:034468/0654

Effective date: 20141124

Owner name: DIALOGIC JAPAN, INC., F/K/A CANTATA JAPAN, INC., N

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OBSIDIAN, LLC;REEL/FRAME:034468/0654

Effective date: 20141124

Owner name: SHIVA (US) NETWORK CORPORATION, NEW JERSEY

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OBSIDIAN, LLC;REEL/FRAME:034468/0654

Effective date: 20141124

Owner name: DIALOGIC MANUFACTURING LIMITED, F/K/A EICON NETWOR

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OBSIDIAN, LLC;REEL/FRAME:034468/0654

Effective date: 20141124

Owner name: BROOKTROUT SECURITIES CORPORATION, NEW JERSEY

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OBSIDIAN, LLC;REEL/FRAME:034468/0654

Effective date: 20141124

Owner name: BROOKTROUT NETWORKS GROUP, INC., NEW JERSEY

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OBSIDIAN, LLC;REEL/FRAME:034468/0654

Effective date: 20141124

Owner name: CANTATA TECHNOLOGY INTERNATIONAL, INC., NEW JERSEY

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OBSIDIAN, LLC;REEL/FRAME:034468/0654

Effective date: 20141124

Owner name: DIALOGIC DISTRIBUTION LIMITED, F/K/A EICON NETWORK

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OBSIDIAN, LLC;REEL/FRAME:034468/0654

Effective date: 20141124

Owner name: CANTATA TECHNOLOGY, INC., NEW JERSEY

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OBSIDIAN, LLC;REEL/FRAME:034468/0654

Effective date: 20141124

Owner name: EXCEL SWITCHING CORPORATION, NEW JERSEY

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OBSIDIAN, LLC;REEL/FRAME:034468/0654

Effective date: 20141124

Owner name: DIALOGIC (US) INC., F/K/A DIALOGIC INC. AND F/K/A

Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:OBSIDIAN, LLC;REEL/FRAME:034468/0654

Effective date: 20141124

AS Assignment

Owner name: SILICON VALLEY BANK, MASSACHUSETTS

Free format text: SECURITY AGREEMENT;ASSIGNORS:DIALOGIC (US) INC.;DIALOGIC INC.;DIALOGIC US HOLDINGS INC.;AND OTHERS;REEL/FRAME:036037/0165

Effective date: 20150629

FEPP Fee payment procedure

Free format text: PAT HOLDER CLAIMS SMALL ENTITY STATUS, ENTITY STATUS SET TO SMALL (ORIGINAL EVENT CODE: LTOS); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FPAY Fee payment

Year of fee payment: 4

FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 8