US9271016B2 - Reformatting media streams to include auxiliary data - Google Patents
Reformatting media streams to include auxiliary data
- Publication number
- US9271016B2 (application US14/213,919)
- Authority
- US
- United States
- Prior art keywords
- data
- media
- media stream
- amount
- auxiliary data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
- H04N21/2347—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving video stream encryption
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/154—Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/40—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/46—Embedding additional information in the video signal during the compression process
- H04N19/463—Embedding additional information in the video signal during the compression process by compressing encoding parameters before transmission
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
- H04N21/23424—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
- H04N21/234381—Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements by altering the temporal resolution, e.g. decreasing the frame rate by frame skipping
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/25—Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
- H04N21/266—Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel
- H04N21/26606—Channel or content management, e.g. generation and management of keys and entitlement messages in a conditional access system, merging a VOD unicast channel into a multicast channel for generating or managing entitlement messages, e.g. Entitlement Control Message [ECM] or Entitlement Management Message [EMM]
Definitions
- the present invention relates to media streams, and more specifically, to reformatting media streams to include auxiliary data.
- the encoding of content into media streams needs to satisfy multiple conditions. For example, the content needs to be compressed to a small size for efficient transmission. The content also needs to conform to specifications that allow for multiple different client devices to decode and present the content in a consistent manner. Further, it needs to provide high level information that allows for parsing, trickplay and navigation of the content. Frequently, it also needs to allow for time synchronization so that the client device is playing at the same rate that the server is delivering the media content.
- DRM digital rights management
- CA conditional access
- ECM entitlement control messages
- EMM entitlement management messages
- media data in the media stream can be identified and removed in an efficient and fast manner, with minimal and imperceptible quality loss, to allow for insertion of auxiliary information without disruption of the media stream consistency.
- proper PCR values are maintained for the media stream. Benefits of these embodiments include, for example, providing a lightweight process that can be applied in real time in software that has a very limited view of the stream and can therefore be applied with small delay to the video processing workflow. Further, eliminating the frequencies that are expensive to the data compression can result in a reduction in data with minimal amount of visual degradation.
- a method to reformat a media stream to include auxiliary data includes: receiving the auxiliary data to be inserted into the media stream; determining the amount of data in the auxiliary data; identifying media data in the media stream to be reduced in size; reformatting the media stream to reduce the amount of data in the media data such that the amount of data removed from the media data is at least equal to the amount of data in the auxiliary data while providing minimal impact to the quality of the media data; and adding the auxiliary data to the media stream.
- a system for reformatting a media stream to include auxiliary data includes: a head end server configured to receive the auxiliary data to be inserted into the media stream through a first network, the head end server determining the amount of data in the auxiliary data, identifying media data to be reduced in size in the media stream, and reformatting the media stream to reduce the amount of data in the media data such that the amount of data removed from the media data is at least equal to the amount of data in the auxiliary data while providing minimal impact to the quality of the media data; and a network interface module configured to add the auxiliary data to the reformatted media stream and distribute the media stream to a plurality of client devices through a second network while the media stream maintains a consistent size.
- a non-transitory storage medium storing a computer program to reformat a media stream to include auxiliary data.
- the computer program comprising executable instructions that cause the computer to: receive the auxiliary data to be inserted into the media stream; determine the amount of data in the auxiliary data; identify media data to be reduced in size; reformat the media stream to reduce the amount of data in the media data such that the amount of data removed from the media data is at least equal to the amount of data in the auxiliary data while providing minimal impact to the quality of the media data; and add the auxiliary data to the media stream.
- FIG. 1 is an overview of a content access or conditional access (CA) implementation which illustrates the problem of adding information to media streams;
- FIG. 2 illustrates a PCR correction showing the location of blocks with newly added data (New Data 1 and New Data 2) and the accumulated PCR adjustment in two locations;
- FIG. 3 is a functional block diagram of a system for reformatting selected locations of media data to make room in the data section that allows for insertion of auxiliary data without increase in the file size and bit rate;
- FIG. 4 is a flow diagram illustrating a process of adding auxiliary data to a media stream in accordance with one embodiment of the present invention.
- FIG. 5 is a flow diagram illustrating a process that shows the details of the identification process and the reformatting process in accordance with one embodiment of the present invention.
- media streams are files that contain digital media such as audio or video or combinations thereof.
- a media stream may include several streams of audio, video, and text in different tracks. The individual tracks may be included in a program stream like MPEG program stream (MPEG-PS). Different tracks may include different video programs or alternative audio tracks for different languages or text for subtitles.
- Content is typically compressed for efficient transmission and storage using codecs like MPEG-2, AVC or HEVC.
- Media streams can be provided as files for download by a client device or streamed such that the client device will not have access to the entire content before the playback begins but receives the content just before playback. Streaming may be applied to content prepared in advance or live, in a continuous process while recording.
- Auxiliary data is information that enhances media streams by adding information, for example, used to aid in content decryption such as digital rights management (DRM) information.
- Other examples of auxiliary data include subtitles, information for a video decoder or video renderer, thumbnails that can be displayed with the content, Internet links to websites with related information, and additional audio tracks.
- Media data encodes perceptual information, and is included in the media stream. Examples include video frames, video slices, prediction information, macroblocks, and motion vectors. Media data is not limited to video, and can be, for example, audio or other content data.
- in conditional access (CA), the data stream is scrambled with a secret key referred to as the control word. Knowing the value of a single control word at a given moment is of relatively little value, because content providers will change the control word several times per minute. The control word is generated automatically in such a way that successive values are not usually predictable.
- entitlement messages are used by a CA unit to authorize and communicate the decryption keys to the receiver of the stream.
- FIG. 1 is a functional block diagram of a content access or conditional access (CA) implementation 100 which illustrates the process of adding information to media streams with a specific example of a Moving Picture Experts Group 2 (MPEG-2) transport stream in digital video broadcast (DVB) format.
- CA Module 110 inserts information into the media stream via content encryption/distribution unit 120.
- the CA client module present, for example, in the subscriber's Set-Top Box (STB) 130 uses this information to determine if the end-user has sufficient digital rights to decrypt and view that media stream.
- CA specific information is encapsulated in the media stream as either an Entitlement Control Message (ECM) or an Entitlement Management Message (EMM). While ECMs are used to transmit the digital keys necessary to decrypt the media stream, EMMs are used to authorize an STB or a group of STBs to decrypt the media stream.
- a Program Clock Reference (PCR) is transmitted in the adaptation field of an MPEG-2 transport stream packet at arbitrary intervals, but at least once every 0.1 seconds.
- the PCR packet includes a PCR time base consisting of 48 bits (six bytes), which define a time stamp.
- the time stamp indicates the relative time that the PCR packet was sent by the program source.
- the value of the PCR, when properly used, is employed to generate a system_timing_clock in the decoder.
- the PCR packets have headers and a flag to enable their recovery at the receiver, for example, a set top box (STB), where they are used to synchronize the receiver clock to the source clock.
- the System Time Clock (STC) decoder, when properly implemented, provides a highly accurate time base that is used to synchronize audio and video elementary streams. Timing in MPEG-2 references this clock.
- the presentation time stamp (PTS) is intended to be relative to the PCR.
- the first 33 bits of the PCR are based on a 90 kHz clock.
- the last 9 bits are based on a 27 MHz clock; the 6 bits in between are reserved.
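The field layout described above can be illustrated with a short sketch. It assumes the 48-bit PCR field is laid out as a 33-bit base, 6 reserved bits, and a 9-bit extension, and that the two parts combine as base × 300 + extension in 27 MHz ticks. The function name `parse_pcr` is hypothetical and chosen only for this illustration.

```python
def parse_pcr(field: bytes):
    """Split a 6-byte PCR field into (base, extension, pcr_27mhz).

    Assumed layout: 33-bit base (90 kHz), 6 reserved bits, 9-bit extension (27 MHz).
    """
    if len(field) != 6:
        raise ValueError("PCR field must be exactly 6 bytes")
    value = int.from_bytes(field, "big")  # all 48 bits as one integer
    base = value >> 15                    # top 33 bits, counted at 90 kHz
    extension = value & 0x1FF             # low 9 bits, counted at 27 MHz
    pcr_27mhz = base * 300 + extension    # 27 MHz / 90 kHz = 300
    return base, extension, pcr_27mhz


def pcr_seconds(field: bytes) -> float:
    """Convert a PCR field to seconds on the 27 MHz time base."""
    return parse_pcr(field)[2] / 27_000_000
```

A base of 90000 with a zero extension therefore corresponds to one second of program time.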
- a standard maximum jitter permitted for the PCR is ±500 ns.
- the size of a media stream for a given duration is termed bit rate and it is expressed as the amount of data between two consecutive recordings of the PCR in the stream.
- When inserting or removing information from a media stream, it is therefore necessary to correct the PCR value of all successive PCR recordings to reflect the change in the amount of data. This action of adjusting all successive PCR values is called a PCR correction.
- FIG. 2 illustrates the PCR correction showing the location of blocks with newly added data (New Data 1 and New Data 2) and the accumulated PCR adjustment in two locations.
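As a rough illustration of the accumulated adjustment, the sketch below derives the bit rate from two PCR samples and then computes how far each later PCR would have to be shifted after data is inserted. It assumes a constant bit rate between samples, PCR values expressed in 27 MHz ticks, and no PCR wrap-around; the function names and the `(offset, size)` tuple format are hypothetical.

```python
PCR_HZ = 27_000_000  # PCR ticks per second


def bitrate_between(pcr_a: int, pos_a: int, pcr_b: int, pos_b: int) -> float:
    """Bits per second implied by two PCR samples and their byte offsets."""
    if pcr_b <= pcr_a:
        raise ValueError("second PCR must be later than the first")
    return (pos_b - pos_a) * 8 * PCR_HZ / (pcr_b - pcr_a)


def accumulated_pcr_adjustment(insertions, pcr_positions, bitrate: float):
    """Ticks to add to each PCR, given (byte_offset, n_bytes) insertions.

    Offsets refer to positions in the original stream. Each PCR is shifted by
    the transmission time of all data inserted before it.
    """
    adjustments = []
    for pos in pcr_positions:
        added = sum(n for offset, n in insertions if offset < pos)
        adjustments.append(round(added * 8 * PCR_HZ / bitrate))
    return adjustments
```

With two 188-byte insertions, a PCR located after both carries twice the adjustment of a PCR located between them, which is the accumulation that FIG. 2 is described as showing.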
- This method of adjusting the PCR values can achieve relatively high levels of throughput without the use of specialized hardware but it is prone to errors and its accuracy can be affected by the bit rate characteristics of the original stream. Inserting CA information in the media stream (in the form of ECM and EMM messages) is a relatively simple process. However, when the data is inserted, difficulty arises from the fact that the additional data causes the above mentioned inconsistencies.
- one approach to PCR correction involves modules (e.g., multiplexers that are inserting CA or other information in the stream) re-creating the PCR and Presentation Time Stamps (PTS) to account for the amount and location of the added data during a re-multiplexing process.
- Re-creating the PCR/PTS uses a real time clock as well as dedicated hardware capable of re-encoding multiple MPEG-2 streams in real time.
- Such hardware is not always available, and adds significantly to the cost of the overall CA implementation.
- another approach to PCR correction involves a software-only solution that corrects only the PCR value, in small increments based on the amount of data inserted in the stream.
- software-only PCR correction in a non-real-time environment is prone to the same problem it tries to address, since it still lacks accuracy in the calculation of the new PCR values. The problem worsens if only the PCR values are corrected and not the PTS values, which desynchronizes the two.
- NULL packets can be used instead of adjusting the PCR values.
- NULL packets are data chunks in an MPEG-2 TS that are inserted to correct the bit rate of the stream. The receiver is expected to ignore their contents.
- although the replacement of already existing NULL packets in an MPEG-2 TS with ECMs or EMMs is a plausible alternative, such replacement needs access to the content encoder, or re-encoding of the stream, to guarantee sufficient bandwidth of NULL packets in the stream. This is a time-consuming and complex process that delays the media workflow, which is critical, in particular, for live content.
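The NULL packet alternative can be sketched as follows. The example assumes 188-byte transport stream packets with sync byte 0x47 and NULL packets carried on PID 0x1FFF; `replace_first_null` is a hypothetical helper, and returning `None` stands in for the drawback noted above, namely that a stream may not contain a NULL packet where one is needed.

```python
from typing import Optional

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47
NULL_PID = 0x1FFF  # PID reserved for NULL packets in an MPEG-2 TS


def packet_pid(packet: bytes) -> int:
    """Extract the 13-bit PID from a transport stream packet header."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid 188-byte TS packet")
    return ((packet[1] & 0x1F) << 8) | packet[2]


def replace_first_null(stream: bytes, new_packet: bytes, start: int = 0) -> Optional[bytes]:
    """Replace the first NULL packet at or after `start` (packet aligned).

    Returns the modified stream, or None if no NULL packet is available.
    """
    if len(new_packet) != TS_PACKET_SIZE:
        raise ValueError("replacement must be one full TS packet")
    for offset in range(start, len(stream) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = stream[offset:offset + TS_PACKET_SIZE]
        if packet_pid(packet) == NULL_PID:
            return stream[:offset] + new_packet + stream[offset + TS_PACKET_SIZE:]
    return None
```

Because the replacement is one packet for one packet, the stream length and therefore the PCR values are untouched, but only when the encoder has left NULL packets in the right places.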
- Certain embodiments as disclosed herein provide for avoiding the above-mentioned drawbacks of re-creation, adjustment or using NULL packets for the insertion of auxiliary data.
- selected locations of media data are re-formatted to make room in the data section that allows for insertion of auxiliary data without increase in the file size and bit rate.
- the media data can be identified and reduced in size in an efficient and fast manner, with minimal and imperceptible quality loss, to allow for insertion of auxiliary information without disruption of the stream consistency.
- proper PCR values are maintained for the media stream. Benefits of these embodiments include, for example, providing a lightweight process that can be applied in real time in software that has a very limited view of the stream and can therefore be applied with small delay to the video processing workflow. Further, eliminating the frequencies that are expensive to the data compression can result in a reduction in data with minimal amount of visual degradation.
- FIG. 3 is a functional block diagram of a system 300 for reformatting selected locations of the media stream to make room in the data section that allows for insertion of auxiliary data without increase in the file size and bit rate.
- content is provided from a content server 320 (e.g., by a content owner such as a movie studio) to an operator or distributor (e.g., a head end 310).
- the head end 310 processes the content, prepares it for distribution, and distributes the content to end users' client devices 340, 342, 344.
- a client device is one of a desktop computer, a mobile device, and other computing devices such as a tablet device.
- a client device is an electronic device that contains a media player. It typically retrieves content from a server via a network but may also play back content from its local storage or physical media such as DVD, Blu-Ray, other optical discs, or USB memory sticks or other storage devices. Examples of client devices include Set Top Boxes, desktop and laptop computers, cell phones, mp3 players, and portable media players.
- the content server 320 communicates with the head end 310 by way of a first network 350 .
- the head end 310 communicates with the client devices 340, 342, 344 by way of a second network 360.
- the networks may be of various types, for example, telco, cable, satellite, and wireless including local area network (LAN) and wide area network (WAN).
- the processing at the head end 310 may include encoding of the content if the format provided by the content provider does not meet the operator's requirements. In one embodiment, the processing also includes the addition of auxiliary data before distribution.
- the head end 310 includes a network interface module 312 that provides communication.
- the network interface module 312 includes various elements, for example, according to the type of networks used.
- the head end 310 also includes a processor module 314 and a storage module 316 .
- the processor module 314 processes communications being received and transmitted by the head end 310 .
- the storage module 316 stores data for use by the processor module 314 .
- the storage module 316 is also used to store computer readable instructions for execution by the processor module 314 .
- the system 300 for reformatting selected locations of the media stream includes the head end server 310 and a network interface module 312 .
- the head end server 310 is configured to receive the auxiliary data to be inserted into the media stream through a first network 350 .
- the head end server 310 determines the amount of data in the auxiliary data, identifies media data to be reduced in the media stream, and reformats the media stream to reduce the amount of data in the media data such that the amount of data removed from the media data is at least equal to the amount of data in the auxiliary data while providing minimal impact to the quality of the media data.
- the network interface module 312 is configured to add the auxiliary data to the reformatted media stream and distribute the media stream to a plurality of client devices 340, 342, 344 through a second network 360.
- the computer readable instructions can be used by the head end 310 for accomplishing its various functions.
- the storage module 316 or parts of the storage module 316 is a non-transitory machine readable medium.
- the head end 310 or embodiments of it are described as having certain functionality. It will be appreciated that in some embodiments, this functionality is accomplished by the processor module 314 in conjunction with the storage module 316 and the network interface module 312.
- the processor module 314 may include specific purpose hardware to accomplish some functions.
- FIG. 4 is a flow diagram illustrating a process 400 of adding auxiliary data to a media stream in accordance with one embodiment of the present invention.
- the process 400 is implemented by the head end server 310 .
- auxiliary data is determined, read, or received from another component such as a content access system that provides entitlement messages to be inserted. This data also includes the relevant target location in the media stream as a byte location in the stream or time code.
- the process 400 determines, at step 420, the amount of data that needs to be inserted and the resulting space requirement. The determination takes into account the formatting and packaging of the data.
- the media data to be reduced is then identified, at step 430, to determine whether that media data can be reduced in size according to several specified criteria.
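To make the space requirement of step 420 concrete, here is a minimal sketch of one possible packaging rule. It assumes the auxiliary data is carried in whole 188-byte transport stream packets, each with a 4-byte header and no adaptation field; the source does not specify the packaging, so this rule and the name `required_space` are illustrative assumptions only.

```python
import math

TS_PACKET_SIZE = 188  # bytes per transport stream packet
TS_HEADER_SIZE = 4    # assumed fixed header, no adaptation field


def required_space(aux_len: int) -> int:
    """Bytes the stream must give up to carry `aux_len` bytes of auxiliary data."""
    if aux_len <= 0:
        return 0
    payload_per_packet = TS_PACKET_SIZE - TS_HEADER_SIZE
    packets = math.ceil(aux_len / payload_per_packet)
    return packets * TS_PACKET_SIZE
```

Under this assumption the space to free is always a multiple of 188 bytes, and it can be noticeably larger than the raw auxiliary payload.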
- the removal should be done with minimal impact to the quality of the stream, be close enough to the target location to be useful, and make enough room for the auxiliary data ("enough room" can be defined as the same size as or larger than the size of the auxiliary data).
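The three criteria can be read as a filter followed by a ranking. The sketch below assumes each candidate already carries an estimated quality cost and the number of bytes it could free; the `Candidate` type, the `max_distance` parameter, and the tie-breaking rule are hypothetical and not taken from the source.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    offset: int          # byte position of the media data in the stream
    freed_bytes: int     # bytes that reducing this media data would free
    quality_cost: float  # estimated visual impact, lower is better


def pick_candidate(candidates: List[Candidate], target_offset: int,
                   aux_size: int, max_distance: int) -> Optional[Candidate]:
    """Pick the lowest-cost candidate that is close enough and frees enough room."""
    usable = [
        c for c in candidates
        if c.freed_bytes >= aux_size
        and abs(c.offset - target_offset) <= max_distance
    ]
    if not usable:
        return None
    # Prefer minimal quality impact, then proximity to the target location.
    return min(usable, key=lambda c: (c.quality_cost, abs(c.offset - target_offset)))
```

Room and proximity are treated here as hard constraints and quality as the quantity to minimize; other weightings are equally plausible.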
- the specific approaches and selection of media data to reduce in size are discussed further below.
- the media data is then reduced in size, at step 440, by reformatting the media stream using the reduction that best suits the requirements.
- the details of the identification process (step 430) and the reformatting process (step 440) are illustrated in FIG. 5.
- Auxiliary data is added, and an adjustment to maintain a consistent media stream is applied, at step 450.
- a consistent media stream provides content playback without artifacts such as jitter, skips, or frozen playback.
- This adjustment, if any, is typically limited to the data between the removed information and the added auxiliary data. It corrects the values in the stream to account for the removed data until the auxiliary data is included, after which the stream has the same timing information as before the process.
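The locality of this adjustment can be sketched as follows. The example assumes constant-rate delivery, that the removal precedes the insertion, that exactly as many bytes are inserted as were removed, and that PCR wrap-around can be ignored; under those assumptions only PCRs lying between the two positions arrive early and need to be shifted back. The function name and tuple format are hypothetical.

```python
PCR_HZ = 27_000_000  # PCR ticks per second


def adjust_between(pcrs, removal_offset: int, removed_bytes: int,
                   insert_offset: int, bitrate: float):
    """Shift only the PCRs located between the removal and the insertion.

    `pcrs` is a list of (byte_offset, pcr_ticks) pairs in original stream
    coordinates. PCRs in between are transmitted `removed_bytes` earlier,
    so their values are decreased by the matching transmission time.
    """
    shift = round(removed_bytes * 8 * PCR_HZ / bitrate)
    adjusted = []
    for offset, ticks in pcrs:
        if removal_offset < offset <= insert_offset:
            ticks -= shift
        adjusted.append((offset, ticks))
    return adjusted
```

PCRs before the removal and after the insertion are left unchanged, which is why no accumulated correction is needed downstream.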
- given a piece of media data A (e.g., a bi-directionally predicted 'b' frame), the data A can be analyzed to remove the pieces least relevant to the final decoded picture quality, producing a new piece of data A′. The size of A minus the size of A′ is equal to or larger than the size of the auxiliary data, which can be inserted right before or after A′.
- the process 400 continues for all positions of the auxiliary data in the media stream.
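The size relationship between A, A′, and the auxiliary data can be checked mechanically. This is a minimal sketch that treats the stream as a flat byte string and places the auxiliary data right after A′; the `splice` helper and its arguments are hypothetical, and real streams would also need packet-level alignment that is not modeled here.

```python
def splice(stream: bytes, a_start: int, a_end: int,
           a_reduced: bytes, aux: bytes) -> bytes:
    """Replace A = stream[a_start:a_end] with A' followed by the auxiliary data.

    Raises ValueError unless size(A) - size(A') >= size(aux), so the
    resulting stream is never longer than the original.
    """
    original = stream[a_start:a_end]
    freed = len(original) - len(a_reduced)
    if freed < len(aux):
        raise ValueError("reduction does not free enough room for the auxiliary data")
    return stream[:a_start] + a_reduced + aux + stream[a_end:]
```

When the freed space exactly matches the auxiliary data, the output has the same length as the input, which corresponds to the consistent stream size described above.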
- the amount of data in the existing media stream is decreased to create space in the encoded domain.
- the video stream typically includes system information, audio data, and video data. Since the system information contains very little redundancy and audio data is commonly very small, usually the video data is reduced to accommodate auxiliary data.
- media data encoded as video using compression formats like MPEG-2 or advanced video coding (AVC) includes different encoding elements that are assembled into the final video during decode.
- the general approach is either removing those elements or reducing them in size. The decision is made depending on the size that is required to be made available and the impact on the resulting visual quality.
- the approach used is to remove an entire frame.
- bidirectional (‘b’) frames use information from neighboring frames in both directions but they are not referenced by other frames and are therefore suitable for removal since they do not create artifacts in neighboring frames.
- the frame may be removed in its entirety, or replaced with instructions that copy the frame from other elements, allowing for storage of this frame with much less data. Depending on the availability of the frames and their content this may create visual artifacts.
- Other encoding elements that may be removed or replaced with information that is much smaller include slices and macro-blocks, defined in several coding standards. The replacement can occur from neighboring encoding elements.
- elements of frames, slices and macro-blocks are re-encoded using a higher quantization value.
- the elements may be re-encoded individually or may be grouped and reduced in size together, distributing the quality loss over a larger area. Grouping may be more desirable for the resulting loss in visual perception, but it may result in accessing a larger area of the encoded bit stream which creates a more complex process with longer expected delay in the encoding pipeline.
- Video information of I (intra) and P (predicted) frames may be referenced by other frames that copy part of their visual data. Consequently, any modification by removal of the media data that has an impact on the decoded video information may propagate to other frames. B frames that are not referenced by other frames are the best target for modifications, since those modifications do not propagate to other frames.
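Locating B frames is the first step toward targeting them. The sketch below assumes an MPEG-2 video elementary stream, where each picture header begins with the start code 00 00 01 00, followed by a 10-bit temporal reference and a 3-bit picture coding type (1 = I, 2 = P, 3 = B). Other codecs signal frame types differently, and the function names are hypothetical.

```python
PICTURE_START_CODE = b"\x00\x00\x01\x00"
CODING_TYPES = {1: "I", 2: "P", 3: "B"}


def picture_types(es: bytes):
    """Return (offset, type) for every picture header found in the stream."""
    found = []
    pos = es.find(PICTURE_START_CODE)
    while pos != -1 and pos + 6 <= len(es):
        # Byte 4 and the top 2 bits of byte 5 hold the temporal reference;
        # the next 3 bits of byte 5 hold the picture coding type.
        code = (es[pos + 5] >> 3) & 0x07
        found.append((pos, CODING_TYPES.get(code, "?")))
        pos = es.find(PICTURE_START_CODE, pos + 4)
    return found


def b_frame_offsets(es: bytes):
    """Offsets of pictures coded as B frames, the preferred reduction targets."""
    return [offset for offset, kind in picture_types(es) if kind == "B"]
```

The offsets returned by `b_frame_offsets` would then be the candidates examined for removal or re-quantization.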
- a frame encoded as a B frame consists of several key elements including header data (typically less than 1% of the overall data), motion vector information (typically 1-5%), different flags and adaptive quantization data (typically 1-5%), and variable length code (VLC) discrete cosine transform (DCT) coefficients (~90%).
- a picture region is divided into 16×16 areas called macro-blocks. Each macro-block is subdivided into several 8×8 areas called blocks. There are between 6 and 12 blocks in a macro-block (depending on the chroma subsampling format).
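As a quick check of the 6-to-12 figure, the sketch below counts the 8×8 blocks in one 16×16 macro-block for the common chroma subsampling formats. The function name and the format table are illustrative and not taken from the patent.

```python
# Chroma subsampling format -> (horizontal, vertical) divisor for each chroma plane.
CHROMA_DIVISORS = {"4:2:0": (2, 2), "4:2:2": (2, 1), "4:4:4": (1, 1)}


def blocks_per_macroblock(chroma_format, mb_size=16, block_size=8):
    """Count the blocks (one luma plane plus two chroma planes) in one macro-block."""
    h_div, v_div = CHROMA_DIVISORS[chroma_format]
    luma = (mb_size // block_size) ** 2
    chroma_w = mb_size // h_div // block_size
    chroma_h = mb_size // v_div // block_size
    return luma + 2 * chroma_w * chroma_h


if __name__ == "__main__":
    for fmt in CHROMA_DIVISORS:
        print(fmt, blocks_per_macroblock(fmt))
```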
- a 2-dimensional DCT is applied to the blocks and then quantization is performed. Quantization reduces each DCT value by an amount that depends on its location in the block, which represents a frequency. Higher frequencies are often quantized more strongly, since they contribute less to the overall perceptual quality of the encoded media data.
- a useful feature of the encoding process is that usually, at this stage, a majority of the coefficients, in particular those at high frequencies, are zero. Higher frequencies are found in the lower right of the 2-D matrix.
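A simplified view of frequency-dependent quantization is sketched below. It assumes plain division by a per-position step size followed by rounding; real MPEG-2 and AVC quantizers include further details (scale codes, intra DC handling) that are omitted here. The table generator and its parameters are hypothetical.

```python
def example_table(n=8, base=8, step=4):
    """Hypothetical quantization table whose step size grows with the frequency index u + v."""
    return [[base + step * (u + v) for v in range(n)] for u in range(n)]


def quantize_block(dct_block, quant_table):
    """Divide each DCT coefficient by its table entry and round to an integer level."""
    return [
        [int(round(c / q)) for c, q in zip(row_c, row_q)]
        for row_c, row_q in zip(dct_block, quant_table)
    ]


if __name__ == "__main__":
    block = [[0] * 8 for _ in range(8)]
    block[0][0], block[0][1], block[7][7] = 160, -25, 20
    levels = quantize_block(block, example_table())
    # The high-frequency coefficient at (7, 7) is divided by a large step and becomes zero.
    print(levels[0][0], levels[0][1], levels[7][7])
```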
- the resulting 2-D 8×8 matrix is converted into a 1-D array by using a zigzag scan, so that higher frequencies are at the end of this array.
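The zigzag conversion can be illustrated with the short sketch below. It implements the classic zigzag order only; MPEG-2 also defines an alternate scan that is not shown. Function names are illustrative.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs of an n x n block in classic zigzag order."""
    coords = [(r, c) for r in range(n) for c in range(n)]
    # Walk the anti-diagonals (r + c constant); odd diagonals run top to bottom,
    # even diagonals run bottom to top.
    return sorted(
        coords,
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def zigzag_scan(block):
    """Flatten a square 2-D block into a 1-D list, low frequencies first."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


if __name__ == "__main__":
    print(zigzag_order(8)[:6])
```

Because the scan ends in the lower-right corner of the block, the zeros produced by quantization tend to gather at the end of the 1-D array.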
- the 1-D array is then run length encoded using a Huffman based algorithm that efficiently encodes runs of zeros.
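The run-length stage can be sketched as follows. This is only the run/level pairing step, under the assumption that each nonzero coefficient is paired with the number of zeros preceding it and that trailing zeros are covered by an end-of-block marker; the Huffman (variable length) coding of the pairs is not shown.

```python
def run_level_encode(coeffs):
    """Turn a 1-D coefficient list into (zero_run, level) pairs plus an end-of-block marker."""
    pairs, run = [], 0
    for value in coeffs:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")  # trailing zeros are implied by the marker
    return pairs


def run_level_decode(pairs, length=64):
    """Rebuild the coefficient list from run/level pairs, padding with zeros."""
    coeffs = []
    for item in pairs:
        if item == "EOB":
            break
        run, level = item
        coeffs.extend([0] * run + [level])
    return coeffs + [0] * (length - len(coeffs))


if __name__ == "__main__":
    data = [20, 0, 0, -3, 1] + [0] * 59
    print(run_level_encode(data))
```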
- the Huffman coding can be undone to arrive at DCT coefficients that can be re-quantized. That is, the quantization table can be applied again to further reduce the coefficient values for higher compression.
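A rough sketch of re-quantization on already decoded levels is shown below. It assumes a single uniform step size before and after and truncates toward zero; this is an illustrative simplification, not the patent's procedure, and the names are hypothetical.

```python
def requantize(levels, old_step, new_step):
    """Map levels quantized with old_step onto a coarser new_step, truncating toward zero."""
    return [int(level * old_step / new_step) for level in levels]


if __name__ == "__main__":
    # Doubling the step size halves the levels and turns small ones into zeros.
    print(requantize([10, 3, 1, 0, -3], old_step=8, new_step=16))
```

Small levels collapse to zero, which lengthens the zero runs and shortens the re-encoded block.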
- auxiliary data to be stored is often small (e.g. an ECM message of 188 bytes)
- the modifications to the DCT coefficients can be better targeted by optimizing the selection of targeted areas.
- the targeted areas are chosen, from all areas that can be modified, to give the best tradeoff in reducing the visible impact of data reduction, which allows for more precise targeting of the amount of data that needs to be removed.
- FIG. 5 is a flow diagram illustrating a process 500 that shows the details of the identification process (step 430 ) and the reformatting process (step 440 ) in accordance with one embodiment of the present invention.
- a location of the media data to be modified or removed is identified, at step 510 .
- the location is a data section within the media data that needs to have additional room, e.g., a data or time range in the compressed stream.
- the media data to be modified is identified, at step 520 .
- all blocks in frames that are in the location to be modified are identified.
- blocks contained in B frames or parts thereof are identified.
- B frames include DCT coefficients that can be targeted for data reduction.
- the DCT coefficients are generated and sorted by impact on the visual quality, at step 530 .
- the coefficients that are clustered in the high frequencies with high values that use the largest space to encode and contribute relatively little to the visual quality of the encoded picture are sorted.
- the sorting criterion can be the weighted sum of the DCT values, with more weight given to the DCT values corresponding to the higher frequencies.
- the coefficients with the largest values of the sorted list are removed and the coefficients corresponding to the highest frequencies are set to zero, which allows for compression to smaller size data during the Huffman encoding.
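Setting the highest-frequency coefficients to zero can be illustrated on a zigzag-ordered array, where the highest frequencies sit at the end. The sketch below is a simplified assumption of how such a step might look; the function name and the `count` parameter are hypothetical.

```python
def zero_trailing_coefficients(zigzag_coeffs, count=1):
    """Set the last `count` nonzero coefficients (the highest frequencies) to zero."""
    out = list(zigzag_coeffs)
    for i in range(len(out) - 1, -1, -1):
        if count == 0:
            break
        if out[i] != 0:
            out[i] = 0
            count -= 1
    return out


if __name__ == "__main__":
    print(zero_trailing_coefficients([20, 0, -3, 1, 0, 2, 0, 0], count=2))
```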
- the process 500 continues at step 550 by applying Huffman encoding, which re-encodes the modified block or blocks.
- the size of the unmodified block is compared, at step 560 , with the modified block to determine the amount of media data that has been removed. If the difference (i.e., the size of the media data removed) is determined, at step 570 , to be not yet large enough (i.e., the difference is smaller than the amount of auxiliary data), the process 500 continues back to step 540 . For example, given how many bytes should be made available at a certain frame interval, several passes can be applied until at least that amount of data is reduced from the encoded frames in the specified interval. Otherwise, the process 500 of removing the media data is complete.
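The iterate-until-enough-room loop can be sketched as below. The size estimate is a toy cost model (an assumption, standing in for real Huffman re-encoding), and each pass simply zeroes the highest-frequency nonzero AC coefficient of every block. All names are hypothetical.

```python
def encoded_size_bits(coeffs):
    """Toy cost model: each nonzero coefficient costs 4 bits plus its magnitude's bit length."""
    return sum(4 + abs(c).bit_length() for c in coeffs if c != 0)


def total_bits(blocks):
    return sum(encoded_size_bits(b) for b in blocks)


def free_space(blocks, needed_bits, max_passes=64):
    """Reduce zigzag-ordered blocks pass by pass until `needed_bits` are freed
    or nothing removable is left. Returns (modified_blocks, freed_bits)."""
    original = total_bits(blocks)
    work = [list(b) for b in blocks]
    for _ in range(max_passes):
        if original - total_bits(work) >= needed_bits:
            break
        changed = False
        for b in work:
            # Scan from the highest frequency down; index 0 (DC) is preserved.
            for i in range(len(b) - 1, 0, -1):
                if b[i] != 0:
                    b[i] = 0
                    changed = True
                    break
        if not changed:
            break
    return work, original - total_bits(work)


if __name__ == "__main__":
    blocks = [[20, 5, 0, 3, 1, 0, 0, 0], [15, 0, 2, 0, 0, 1, 0, 0]]
    print(free_space(blocks, needed_bits=10))
```

The caller compares the returned `freed_bits` with the size of the auxiliary data to decide whether the interval has enough room or a wider range of frames is needed.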
- the media data is removed from macro-blocks.
- the removed media data is prediction information of the macro-blocks.
- the auxiliary data can include digital rights management (DRM) information, which can include MPEG-2 entitlement management messages and entitlement control messages.
- the media stream is in a streaming media format.
- the streaming media format is an MPEG-2 transport stream.
- the foregoing approach can target combinations that are encoded using an escape signal Huffman code followed by raw representation of that pair, which is used to preserve high frequencies but is generally expensive in terms of data usage.
- the escape code sequences commonly encode noise present in the original or high frequency patterns.
- Such processes are generally not used by MPEG-2 encoders or systems that recompress the content and may not optimize DCT coefficients after quantization to fit with the Huffman table.
- the combinations often take a significant number of bits to encode but contribute a comparably small amount to the resulting video quality.
- RDO rate distortion optimization
- Areas of application where additional auxiliary information is inserted in the stream after it has been formatted include DRM applications where the content will be encrypted and additional data for authentication and decryption is useful. This data may come from a single or first DRM system, or from additional DRM systems that aim to include support for additional devices that only support the secondary DRM system.
- Another example includes copy control information that supplies information about how the content can be used, what the allowed outputs are, and how often a user may make a copy.
- Other areas include information that enhances the content and its consumption, such as subtitles or additional audio tracks that are later added to the stream, information for a video decoder or video renderer that enables more efficient decoding but is proprietary to a single decoder, or thumbnails that are added to be displayed with the content.
- Another example of data that may later be added to the content is Internet links to websites with related information.
- processors such as a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein.
- a general-purpose processor can be a microprocessor, but in the alternative, the processor can be any processor, controller, microcontroller, or state machine.
- a processor can also be implemented as a combination of computing devices, for example, a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
- a software module can reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium.
- An exemplary storage medium can be coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium can be integral to the processor.
- the processor and the storage medium can reside in an ASIC.
- devices, blocks, or modules that are described as coupled may be coupled via intermediary devices, blocks, or modules.
- a first device may be described as transmitting data to (or receiving data from) a second device when there are intermediary devices that couple the first and second devices, and also when the first device is unaware of the ultimate destination of the data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/213,919 US9271016B2 (en) | 2013-03-15 | 2014-03-14 | Reformatting media streams to include auxiliary data |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361798134P | 2013-03-15 | 2013-03-15 | |
US14/213,919 US9271016B2 (en) | 2013-03-15 | 2014-03-14 | Reformatting media streams to include auxiliary data |
Publications (2)
Publication Number | Publication Date |
---|---|
US20140270705A1 (en) | 2014-09-18 |
US9271016B2 (en) | 2016-02-23 |
Family
ID=51527445
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/213,919 Active US9271016B2 (en) | 2013-03-15 | 2014-03-14 | Reformatting media streams to include auxiliary data |
Country Status (1)
Country | Link |
---|---|
US (1) | US9271016B2 (en) |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE102013017031A1 (en) * | 2013-10-10 | 2015-04-16 | Bernd Korz | Method for playing and separately storing audio and video tracks on the Internet |
US20160048697A1 (en) * | 2014-08-18 | 2016-02-18 | Spatial Digital Systems, Inc. | Enveloping and de-enveloping for Digital Photos via Wavefront Muxing |
US20160048371A1 (en) * | 2014-08-18 | 2016-02-18 | Spatial Digital Systems, Inc. | Enveloping via Digital Audio |
US10264052B2 (en) * | 2014-08-18 | 2019-04-16 | Spatial Digital Systems, Inc. | Enveloping for device independence |
US20160048701A1 (en) * | 2014-08-18 | 2016-02-18 | Spatial Digital Systems, Inc. | Enveloping for remote Digital Camera |
US10674182B2 (en) * | 2015-06-05 | 2020-06-02 | Telefonaktiebolaget Lm Ericsson (Publ) | Pixel pre-processing and encoding |
US20170206933A1 (en) * | 2016-01-19 | 2017-07-20 | Arris Enterprises, Inc. | Systems and methods for indexing media streams for navigation and trick play control |
US10389786B1 (en) * | 2016-09-30 | 2019-08-20 | Amazon Technologies, Inc. | Output tracking for protected content-stream portions |
US20220321627A1 (en) * | 2021-03-31 | 2022-10-06 | Tencent America LLC | Methods and apparatus for just-in-time content preparation in 5g networks |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5691986A (en) * | 1995-06-07 | 1997-11-25 | Hitachi America, Ltd. | Methods and apparatus for the editing and insertion of data into an encoded bitstream |
US5708509A (en) * | 1993-11-09 | 1998-01-13 | Asahi Kogaku Kogyo Kabushiki Kaisha | Digital data processing device |
US5734589A (en) * | 1995-01-31 | 1998-03-31 | Bell Atlantic Network Services, Inc. | Digital entertainment terminal with channel mapping |
US20050157714A1 (en) * | 2002-02-22 | 2005-07-21 | Nds Limited | Scrambled packet stream processing |
US7292602B1 (en) * | 2001-12-27 | 2007-11-06 | Cisco Techonology, Inc. | Efficient available bandwidth usage in transmission of compressed video data |
US7912219B1 (en) * | 2005-08-12 | 2011-03-22 | The Directv Group, Inc. | Just in time delivery of entitlement control message (ECMs) and other essential data elements for television programming |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150040184A1 (en) * | 2014-10-17 | 2015-02-05 | Donald C.D. Chang | Digital Enveloping for Digital Right Management and Re-broadcasting |
US10289856B2 (en) * | 2014-10-17 | 2019-05-14 | Spatial Digital Systems, Inc. | Digital enveloping for digital right management and re-broadcasting |
US10200692B2 (en) | 2017-03-16 | 2019-02-05 | Cisco Technology, Inc. | Compressed domain data channel for watermarking, scrambling and steganography |
Also Published As
Publication number | Publication date |
---|---|
US20140270705A1 (en) | 2014-09-18 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: VERIMATRIX, INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: SOLONSKY, ALEXANDER; ELEFTHEIROU, ANDREAS; THORWIRTH, NIELS; SIGNING DATES FROM 20140219 TO 20140428; REEL/FRAME: 032852/0927 |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | AS | Assignment | Owner name: SILICON VALLEY BANK, CALIFORNIA. Free format text: SECURITY INTEREST; ASSIGNOR: VERIMATRIX, INC.; REEL/FRAME: 039801/0018. Effective date: 20150908 |
| | AS | Assignment | Owner name: VERIMATRIX, INC., CALIFORNIA. Free format text: RELEASE BY SECURED PARTY; ASSIGNOR: SILICON VALLEY BANK; REEL/FRAME: 048448/0374. Effective date: 20190214 |
| | AS | Assignment | Owner name: GLAS SAS, AS SECURITY AGENT, FRANCE. Free format text: SECURITY INTEREST; ASSIGNOR: VERIMATRIX, INC.; REEL/FRAME: 049041/0084. Effective date: 20190429 |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YR, SMALL ENTITY (ORIGINAL EVENT CODE: M2551); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY. Year of fee payment: 4 |
| | FEPP | Fee payment procedure | Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
| | FEPP | Fee payment procedure | Free format text: 7.5 YR SURCHARGE - LATE PMT W/IN 6 MO, SMALL ENTITY (ORIGINAL EVENT CODE: M2555); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YR, SMALL ENTITY (ORIGINAL EVENT CODE: M2552); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY. Year of fee payment: 8 |