|Publication number||US7386046 B2|
|Application number||US 10/076,013|
|Publication date||10 Jun 2008|
|Filing date||13 Feb 2002|
|Priority date||13 Feb 2001|
|Also published as||US8054879, US8073047, US8553759, US8867610, US8929442, US8934535, US20020191692, US20080232457, US20090154545, US20100316114, US20110235697, US20140023135, US20140105270, US20140105271|
|Publication number||076013, 10076013, US 7386046 B2, US 7386046B2, US-B2-7386046, US7386046 B2, US7386046B2|
|Inventors||James J. Fallon, Stephen J. McErlain|
|Original Assignee||Realtime Data Llc|
This application is based on U.S. provisional application Ser. No. 60/268,394, filed on Feb. 13, 2001, which is fully incorporated herein by reference.
1. Technical Field
The present invention relates generally to data compression and decompression and, in particular, to a system and method for compressing and decompressing data based on an actual or expected throughput (bandwidth) of a system that employs data compression. Additionally, the present invention relates to the subsequent storage, retrieval, and management of information in data storage devices utilizing compression and/or accelerated data storage and retrieval bandwidth.
2. Description of Related Art
There are a variety of data compression algorithms that are currently available, both well defined and novel. Many compression algorithms define one or more parameters that can be varied, either dynamically or a-priori, to change the performance characteristics of the algorithm. For example, with a typical dictionary based compression algorithm such as Lempel-Ziv, the size of the dictionary can affect the performance of the algorithm. Indeed, a large dictionary may be employed to yield very good compression ratios but the algorithm may take a long time to execute. If speed is more important than compression ratio, then the algorithm can be limited by selecting a smaller dictionary, thereby obtaining a much faster compression time, but at the possible cost of a lower compression ratio. The desired performance of a compression algorithm, and of the system in which the data compression is employed, will vary depending on the application.
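By way of illustration only, the dictionary-size trade-off described above can be observed with the DEFLATE implementation in Python's standard `zlib` module, where the sliding history window plays the role of the dictionary. The helper name `deflate_size` and the synthetic test data are assumptions made for this sketch and are not part of the described system.

```python
import random
import zlib

def deflate_size(data: bytes, window_bits: int) -> int:
    """Return the compressed size of `data` using a 2**window_bits byte history window."""
    compressor = zlib.compressobj(9, zlib.DEFLATED, window_bits)
    return len(compressor.compress(data) + compressor.flush())

# Synthetic data: a 2000-byte pseudo-random chunk repeated 20 times, so every
# repetition lies 2000 bytes behind the current position.
rng = random.Random(0)
chunk = bytes(rng.getrandbits(8) for _ in range(2000))
sample = chunk * 20

small_window = deflate_size(sample, 9)    # 512-byte window: repeats are out of reach
large_window = deflate_size(sample, 15)   # 32 KiB window: repeats are found
print(small_window, large_window)
```

In this sketch the larger window yields a much smaller output because matches 2000 bytes back are only visible to it; the smaller window needs less memory and search effort but cannot exploit those repeats.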
Thus, one challenge in employing data compression for a given application or system is selecting one or more optimal compression algorithms from the variety of available algorithms. Indeed, the desired balance between speed and efficiency is typically a significant factor that is considered in determining which algorithm to employ for a given set of data. Algorithms that compress particularly well usually take longer to execute whereas algorithms that execute quickly usually do not compress particularly well.
Accordingly, a system and method that would provide dynamic modification of compression system parameters so as to provide an optimal balance between execution speed of the algorithm (compression rate) and the resulting compression ratio, is highly desirable.
Yet another problem within the current art is data storage and retrieval bandwidth limitations. Modern computers utilize a hierarchy of memory devices. In order to achieve maximum performance levels, modern processors utilize onboard memory and onboard cache to obtain high bandwidth access to both program and data. Limitations in process technologies currently prohibit placing a sufficient quantity of onboard memory for most applications. Thus, in order to offer sufficient memory for the operating system(s), application programs, and user data, computers often use various forms of popular off-processor high speed memory including static random access memory (SRAM), synchronous dynamic random access memory (SDRAM), and synchronous burst static RAM (SBSRAM). Due to the prohibitive cost of high-speed random access memory, coupled with its power volatility, a third lower level of the hierarchy exists for non-volatile mass storage devices. While mass storage devices offer increased capacity and fairly economical data storage, their data storage and retrieval bandwidth is often much less in relation to the other elements of a computing system.
Computer systems represent information in a variety of manners. Discrete information such as text and numbers is easily represented in digital data. This type of data representation is known as symbolic digital data. Symbolic digital data is thus an absolute representation of data such as a letter, figure, character, mark, machine code, or drawing.
Continuous information such as speech, music, audio, images and video, frequently exists in the natural world as analog information. As is well known to those skilled in the art, recent advances in very large scale integration (VLSI) digital computer technology have enabled both discrete and analog information to be represented with digital data. Continuous information represented as digital data is often referred to as diffuse data. Diffuse digital data is thus a representation of data that is of low information density and is typically not easily recognizable to humans in its native form.
Modern computers utilize digital data representation because of its inherent advantages. For example, digital data is more readily processed, stored, and transmitted due to its inherently high noise immunity. In addition, the inclusion of redundancy in digital data representation enables error detection and/or correction. Error detection and/or correction capabilities are dependent upon the amount and type of data redundancy, available error detection and correction processing, and extent of data corruption.
One outcome of digital data representation is the continuing need for increased capacity in data processing, storage, and transmittal. This is especially true for diffuse data where increases in fidelity and resolution create exponentially greater quantities of data. Data compression is widely used to reduce the amount of data required to process, transmit, or store a given quantity of information. In general, there are two types of data compression techniques that may be utilized either separately or jointly to encode/decode data: lossless and lossy data compression.
Over the last decade, computer processor performance has improved by at least a factor of 50. During this same period, magnetic disk storage has only improved by a factor of 5. Thus one additional problem with the existing art is that memory storage devices severely limit the performance of consumer, entertainment, office, workstation, servers, and mainframe computers for all disk and memory intensive operations.
For example, magnetic disk mass storage devices currently employed in a variety of home, business, and scientific computing applications suffer from significant seek-time access delays along with profound read/write data rate limitations. Currently the fastest available 15,000 rpm disk drives support only a 40.0 megabyte per second (MB/sec) data rate. This is in stark contrast to the modern Personal Computer's Peripheral Component Interconnect (PCI) Bus's input/output capability of 512 MB/sec and internal local bus capability of 1600 MB/sec.
Another problem within the current art is that emergent high performance disk interface standards such as the Small Computer Systems Interface (SCSI-3), iSCSI, Fibre Channel, AT Attachment UltraDMA/100+, Serial Storage Architecture, and Universal Serial Bus offer only higher data transfer rates through intermediate data buffering in random access memory. These interconnect strategies do not address the fundamental problem that all modern magnetic disk storage devices for the personal computer marketplace are still limited by the same typical physical media restriction. In practice, faster disk access data rates are only achieved by the high cost solution of simultaneously accessing multiple disk drives with a technique known within the art as data striping and redundant array of independent disks (RAID).
RAID systems often afford the user the benefit of increased data bandwidth for data storage and retrieval. By simultaneously accessing two or more disk drives, data bandwidth may be increased at a maximum rate that is linear and directly proportional to the number of disks employed. Thus another problem with modern data storage systems utilizing RAID systems is that a linear increase in data bandwidth requires a proportional number of added disk storage devices.
Another problem with most modern mass storage devices is their inherent unreliability. Many modern mass storage devices utilize rotating assemblies and other types of electromechanical components that possess failure rates one or more orders of magnitude higher than equivalent solid state devices. RAID systems employ data redundancy distributed across multiple disks to enhance data storage and retrieval reliability. In the simplest case, data may be explicitly repeated in multiple places on a single disk drive, or in multiple places on two or more independent disk drives. More complex techniques are also employed that support various trade-offs between data bandwidth and data reliability.
Standard types of RAID systems currently available include RAID Levels 0, 1, and 5. The configuration selected depends on the goals to be achieved. Specifically data reliability, data validation, data storage/retrieval bandwidth, and cost all play a role in defining the appropriate RAID data storage solution. RAID level 0 entails pure data striping across multiple disk drives. This increases data bandwidth at best linearly with the number of disk drives utilized. Data reliability and validation capability are decreased. A failure of a single drive results in a complete loss of all data. Thus another problem with RAID systems is that low cost improved bandwidth requires a significant decrease in reliability.
RAID Level 1 utilizes disk mirroring where data is duplicated on an independent disk subsystem. Validation of data amongst the two independent drives is possible if the data is simultaneously accessed on both disks and subsequently compared. This tends to decrease data bandwidth from even that of a single comparable disk drive. In systems that offer hot swap capability, the failed drive is removed and a replacement drive is inserted. The data on the failed drive is then copied in the background while the entire system continues to operate in a performance degraded but fully operational mode. Once the data rebuild is complete, normal operation resumes. Hence, another problem with RAID systems is the high cost of increased reliability and associated decrease in performance.
RAID Level 5 employs disk data striping and parity error detection to increase both data bandwidth and reliability simultaneously. A minimum of three disk drives is required for this technique. In the event of a single disk drive failure, that drive may be rebuilt from parity and other data encoded on the remaining disk drives. In systems that offer hot swap capability, the failed drive is removed and a replacement drive is inserted. The data on the failed drive is then rebuilt in the background while the entire system continues to operate in a performance degraded but fully operational mode. Once the data rebuild is complete, normal operation resumes.
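As a minimal sketch of the rebuild described above, parity for one stripe can be computed as the byte-wise XOR of the data blocks, and a lost block recovered by XOR-ing the surviving block with the parity. The function name `xor_blocks` and the three-drive layout are illustrative assumptions; practical RAID Level 5 implementations also rotate the parity block across drives, which is omitted here.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe on a three-drive array: two data blocks plus one parity block.
block_drive0 = b"ABCD"
block_drive1 = b"EFGH"
parity_drive2 = xor_blocks([block_drive0, block_drive1])

# Drive 1 fails: rebuild its block from the surviving data block and the parity.
rebuilt_drive1 = xor_blocks([block_drive0, parity_drive2])
print(rebuilt_drive1)
```

Because every read of the missing block requires reading all surviving drives in the stripe, the array keeps working during the rebuild but at reduced bandwidth.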
Thus another problem with redundant modern mass storage devices is the degradation of data bandwidth when a storage device fails. Additional problems with bandwidth limitations and reliability similarly occur within the art by all other forms of sequential, pseudo-random, and random access mass storage devices. Typically mass storage devices include magnetic and optical tape, magnetic and optical disks, and various solid-state mass storage devices. It should be noted that the present invention applies to all forms and manners of memory devices including storage devices utilizing magnetic, optical, neural and chemical techniques or any combination thereof.
Yet another problem within the current art is the application and use of various data compression techniques. It is well known within the current art that data compression provides several unique benefits. First, data compression can reduce the time to transmit data by more efficiently utilizing low bandwidth data links. Second, data compression economizes on data storage and allows more information to be stored for a fixed memory size by representing information more efficiently.
For purposes of discussion, data compression is canonically divided into lossy and lossless techniques. Lossy data compression techniques provide for an inexact representation of the original uncompressed data such that the decoded (or reconstructed) data differs from the original unencoded/uncompressed data. Lossy data compression is also known as irreversible or noisy compression. Negentropy is defined as the quantity of information in a given set of data. Thus, one obvious advantage of lossy data compression is that the compression ratios can be larger than that dictated by the negentropy limit, all at the expense of information content. Many lossy data compression techniques seek to exploit various traits within the human senses to eliminate otherwise imperceptible data. For example, lossy data compression of visual imagery might seek to delete information content in excess of the display resolution or contrast ratio of the target display device.
On the other hand, lossless data compression techniques provide an exact representation of the original uncompressed data. Simply stated, the decoded (or reconstructed) data is identical to the original unencoded/uncompressed data. Lossless data compression is also known as reversible or noiseless compression. Thus, lossless data compression has, as its current limit, a minimum representation defined by the entropy of a given data set.
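For illustration, the entropy bound mentioned above can be estimated for a byte stream under a simple zero-order model in which each byte is treated as an independent symbol. That model, and the function names below, are assumptions of this sketch; data with inter-symbol structure can be coded below the zero-order figure by a model that captures the structure.

```python
import math
from collections import Counter

def entropy_bits_per_byte(data: bytes) -> float:
    """Zero-order Shannon entropy of `data`, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return abs(-sum((n / total) * math.log2(n / total) for n in counts.values()))

def zero_order_bound_bytes(data: bytes) -> float:
    """Smallest size, in bytes, that a zero-order lossless coder could approach."""
    return entropy_bits_per_byte(data) * len(data) / 8

print(entropy_bits_per_byte(b"aabb"), zero_order_bound_bytes(b"aabb"))
```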
A rich and highly diverse set of lossless data compression and decompression algorithms exists within the current art. These range from the simplest “ad hoc” approaches to highly sophisticated formalized techniques that span the sciences of information theory, statistics, and artificial intelligence. One fundamental problem with almost all modern approaches is the compression ratio achieved relative to the encoding and decoding speed. As previously stated, the current theoretical limit for data compression is the entropy limit of the data set to be encoded. However, in practice, many factors actually limit the compression ratio achieved. Most modern compression algorithms are highly content dependent. Content dependency exceeds the actual statistics of individual elements and often includes a variety of other factors including their spatial location within the data set.
Of popular compression techniques, arithmetic coding possesses the highest degree of algorithmic effectiveness, and as expected, is the slowest to execute. This is followed in turn by dictionary compression, Huffman coding, and run-length coding with respectively decreasing execution times. What is not apparent from these algorithms, and is also one major deficiency within the current art, is knowledge of their algorithmic efficiency. More specifically, given a compression ratio that is within the effectiveness of multiple algorithms, the question arises as to their corresponding efficiency.
Within the current art there also presently exists a strong inverse relationship between achieving the maximum (current) theoretical compression ratio, which we define as algorithmic effectiveness, and requisite processing time. For a given single algorithm the effectiveness over a broad class of data sets including text, graphics, databases, and executable object code is highly dependent upon the processing effort applied. Given a baseline data set, processor operating speed and target architecture, along with its associated supporting memory and peripheral set, we define algorithmic efficiency as the time required to achieve a given compression ratio. Algorithmic efficiency assumes that a given algorithm is implemented in an optimum object code representation executing from the optimum places in memory. This is almost never achieved in practice due to limitations within modern optimizing software compilers. It should be further noted that an optimum algorithmic implementation for a given input data set may not be optimum for a different data set. Much work remains in developing a comprehensive set of metrics for measuring data compression algorithmic performance, however for present purposes the previously defined terms of algorithmic effectiveness and efficiency should suffice.
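A rough way to observe algorithmic efficiency as defined above is to time several settings of one algorithm on a baseline data set and keep the quickest setting that reaches a target compression ratio. The sketch below uses `zlib` levels 1 through 9 as the candidate settings; the function names and the choice of `zlib` are assumptions for illustration, and measured times depend on the machine on which the code runs.

```python
import time
import zlib

def measure(data: bytes, level: int):
    """Return (compression_ratio, seconds) for one zlib level."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(data) / len(compressed), elapsed

def fastest_level_reaching(data: bytes, target_ratio: float):
    """Among levels that meet the target ratio, return (level, ratio, seconds)
    for the one with the shortest measured time, or None if none qualifies."""
    results = [(level, *measure(data, level)) for level in range(1, 10)]
    qualifying = [r for r in results if r[1] >= target_ratio]
    return min(qualifying, key=lambda r: r[2]) if qualifying else None

baseline = b"the quick brown fox jumps over the lazy dog\n" * 5000
print(fastest_level_reaching(baseline, 10.0))
```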
Various solutions to this problem of optimizing algorithmic implementation are found in U.S. Pat. Nos. 6,195,024 and 6,309,424, issued on Feb. 27, 2001 and Oct. 30, 2001, respectively, to James Fallon, both of which are entitled “Content Independent Data Compression Method and System,” and are incorporated herein by reference. These patents describe data compression methods that provide content-independent data compression, wherein an optimal compression ratio for an encoded stream can be achieved regardless of the data content of the input data stream. As more fully described in the above incorporated patents, a data compression protocol comprises applying an input data stream to each of a plurality of different encoders to, in effect, generate a plurality of encoded data streams. The plurality of encoders are preferably selected based on their ability to effectively encode different types of input data. The final compressed data stream is generated by selectively combining blocks of the compressed streams output from the plurality of encoders based on one or more factors such as the optimal compression ratios obtained by the plurality of encoders. The resulting compressed output stream can achieve the greatest possible compression, preferably in real-time, regardless of the data content.
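The general idea of applying each block to several encoders and keeping the best result can be sketched as follows. This is not the implementation of the incorporated patents; the encoder set (Python's standard `zlib`, `bz2`, and `lzma` modules plus a pass-through), the one-byte encoder identifier, and the function names are all assumptions made for illustration.

```python
import bz2
import lzma
import zlib

# Hypothetical encoder table: id -> (encode, decode). Id 0 stores the block unchanged.
ENCODERS = {
    0: (lambda b: b, lambda b: b),
    1: (zlib.compress, zlib.decompress),
    2: (bz2.compress, bz2.decompress),
    3: (lzma.compress, lzma.decompress),
}

def encode_block(block: bytes) -> bytes:
    """Run every encoder on the block and keep the smallest output,
    prefixed with a one-byte encoder id so the decoder knows which to use."""
    best_id, best = min(
        ((eid, enc(block)) for eid, (enc, _) in ENCODERS.items()),
        key=lambda pair: len(pair[1]),
    )
    return bytes([best_id]) + best

def decode_block(encoded: bytes) -> bytes:
    """Read the encoder id and apply the matching decoder."""
    _, dec = ENCODERS[encoded[0]]
    return dec(encoded[1:])

block = b"hello world " * 100
assert decode_block(encode_block(block)) == block
```

Including the pass-through entry means a block never grows by more than the one identifier byte, whatever its content.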
Yet another problem within the current art relates to data management and the use of existing file management systems. Present computer operating systems utilize file management systems to store and retrieve information in a uniform, easily identifiable, format. Files are collections of executable programs and/or various data objects. Files occur in a wide variety of lengths and must be stored within a data storage device. Most storage devices, and in particular, mass storage devices, work most efficiently with specific quantities of data. For example, modern magnetic disks are often divided into cylinders, heads and sectors. This breakout arises from legacy electromechanical considerations, with the format of an individual sector often some binary multiple of bytes (512, 1024, . . . ). A fixed or variable quantity of sectors is housed on an individual track. The number of sectors permitted on a single track is limited by the number of reliable flux reversals that can be encoded on the storage media per linear inch, often referred to as linear bit density. In disk drives with multiple heads and disk media, a single cylinder is comprised of multiple tracks.
A file allocation table is often used to organize both used and unused space on a mass storage device. Since a file often comprises more than one sector of data, and individual sectors or contiguous strings of sectors may be widely dispersed over multiple tracks and cylinders, a file allocation table provides a methodology of retrieving a file or portion thereof. File allocation tables are usually comprised of strings of pointers or indices that identify where various portions of a file are stored.
In order to provide greater flexibility in the management of disk storage at the media side of the interface, logical block addresses have been substituted for legacy cylinder, head, sector addressing. This permits the individual disk to optimize its mapping from the logical address space to the physical sectors on the disk drive. Advantages with this technique include faster disk accesses by allowing the disk manufacturer greater flexibility in managing data interleaves and other high-speed access techniques. In addition, the replacement of bad media sectors can take place at the physical level and need not be the concern of the file allocation table or host computer. Furthermore, these bad sector replacement maps are definable on a disk by disk basis.
Practical limitations in the size of the data required to both represent and process an individual data block address, along with the size of individual data blocks, govern the type of file allocation tables currently in use. For example, a 4096 byte logical block size (8 sectors) may be employed with 32 bit logical block addresses, yielding an addressable data space of 17.59 Terabytes (2^32 blocks of 4096 bytes each). Smaller logical blocks permit more efficient use of disk space. Larger logical blocks support a larger addressable data space. Thus one limitation within the current art is that disk file allocation tables and associated file management systems are a compromise between efficient data storage, access speed, and addressable data space.
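The addressable data space quoted above follows directly from the block size and the address width, as the short calculation below shows. The 512 byte sector size is an assumption consistent with the stated 8 sectors per 4096 byte block, and the terabyte figure uses the decimal convention of 10^12 bytes.

```python
BLOCK_SIZE_BYTES = 4096
SECTOR_SIZE_BYTES = 512   # assumed; gives the 8 sectors per block stated in the text
ADDRESS_BITS = 32

sectors_per_block = BLOCK_SIZE_BYTES // SECTOR_SIZE_BYTES
addressable_bytes = BLOCK_SIZE_BYTES * 2 ** ADDRESS_BITS
addressable_terabytes = addressable_bytes / 10 ** 12

print(sectors_per_block, addressable_bytes, round(addressable_terabytes, 2))
```

Doubling the block size doubles the addressable space at the cost of more unused space in the final block of each file, which is the compromise the paragraph describes.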
Data in a computer has various levels of information content. Even within a single file, many data types and formats are utilized. Each data representation has specific meaning and each may hold differing quantities of information. Within the current art, computers process data in a native, uncompressed, format. Thus compressed data must often be decompressed prior to performing various data processing functions or operations. Modern file systems have been designed to work with data in its native format. Thus another significant problem within the current art is that file systems are not able to randomly access compressed data in an efficient manner.
Further aggravating this problem is the fact that when data is decompressed, processed and recompressed it may not fit back into its original disk space, causing disk fragmentation or complex disk space reallocation requirements. Several solutions exist within the current art including file by file and block structured compressed data management.
In file by file compression, each file is compressed when stored on disk and decompressed when retrieved. For very small files this technique is often adequate, however for larger files the compression and decompression times are too slow, resulting in inadequate system level performance. In addition, the ability to randomly access data within a specific file is lost. The one advantage to file by file compression techniques is that they are easy to develop and are compatible with existing file systems. Thus file by file compressed data management is not an adequate solution.
Block structured disk compression operates by compressing and decompressing data in blocks. Block sizes are often fixed, but may be variable in size. A single file usually comprises multiple blocks; however, a file may be so small as to fit within a single block. Blocks are grouped together and stored in one or more disk sectors as a group of blocks (GOB). A group of blocks is compressed and decompressed as a unit, thus there exist practical limitations on the size of GOBs. Most compression algorithms achieve a higher level of algorithmic effectiveness when operating on larger quantities of data. Restated, the larger the quantity of data processed with a uniform information density, the higher the compression ratio achieved. If GOBs are small, compression ratios are low and processing time is short. Conversely, when GOBs are large, compression ratios are higher and processing time is longer. Large GOBs tend to perform in a manner analogous to file by file compression. The two obvious benefits to block structured disk compression are pseudo-random data access and reduced data compression/decompression processing time.
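The effect of group size on compression ratio, and the pseudo-random access that independent groups allow, can be sketched as follows. The function names, the use of `zlib`, and the synthetic record data are assumptions for illustration only.

```python
import zlib

def compress_in_groups(data: bytes, group_size: int):
    """Compress each fixed-size group independently so any group can be read alone."""
    return [zlib.compress(data[i:i + group_size]) for i in range(0, len(data), group_size)]

def read_byte(groups, group_size: int, offset: int) -> int:
    """Pseudo-random access: decompress only the group that holds `offset`."""
    return zlib.decompress(groups[offset // group_size])[offset % group_size]

text = b"".join(b"record %06d: status=OK\n" % i for i in range(5000))

for size in (512, 4096, 65536):
    groups = compress_in_groups(text, size)
    ratio = len(text) / sum(len(g) for g in groups)
    print(size, round(ratio, 2))
```

Reading one byte touches a single group, so smaller groups make each access cheaper, while larger groups give the compressor more context and a better ratio.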
Several problems exist within the current art for the management of compressed blocks. One method for storage of compressed files on disk is by contiguously storing all GOBs corresponding to a single file. However as files are processed within the computers, files may grow or shrink in size. Inefficient disk storage results when a substantial file size reduction occurs. Conversely when a file grows substantially, the additional space required to store the data may not be available contiguously. The result of this process is substantial disk fragmentation and slower access times.
An alternate method is to map compressed GOBs into the next logical free space on the disk. One problem with this method is that average file access times are substantially increased by this technique due to the random data storage. Peak access delays may be reduced since the statistics behave with a more uniform white spectral density, however this is not guaranteed.
A further layer of complexity is encountered when compressed information is to be managed on more than one data storage device. Competing requirements of data access bandwidth, data reliability/redundancy, and efficiency of storage space are encountered.
These and other limitations within the current art are solved with the present invention.
The present invention is directed to a system and method for compressing and decompressing based on the actual or expected throughput (bandwidth) of a system employing data compression and a technique of optimizing based upon planned, expected, predicted, or actual usage.
In one aspect of the present invention, a system for providing bandwidth sensitive data compression comprises:
a data compression system for compressing and decompressing data input to the system;
a plurality of compression routines selectively utilized by the data compression system; and
a controller for tracking the throughput of the system and generating a control signal to select a compression routine based on the system throughput. In a preferred embodiment, when the controller determines that the system throughput falls below a predetermined throughput threshold, the controller commands the data compression engine to use a compression routine providing a faster rate of compression so as to increase the throughput.
In another aspect, a system for providing bandwidth sensitive data compression comprises a plurality of access profiles, operatively accessible by the controller, that enable the controller to determine a compression routine that is associated with a data type of the data to be compressed. The access profiles comprise information that enables the controller to select a suitable compression algorithm that provides a desired balance between execution speed (rate of compression) and efficiency (compression ratio).
In yet another aspect, a system comprises a data storage controller for controlling the compression and storage of compressed data to a storage device and the retrieval and decompression of compressed data from the storage device. The system throughput tracked by the controller preferably comprises a number of pending access requests to a storage device.
In another aspect, the system comprises a data transmission controller for controlling the compression and transmission of compressed data, as well as the decompression of compressed data received over a communication channel. The system throughput tracked by the controller comprises a number of pending transmission requests over the communication channel.
In yet another aspect of the present invention, a method for providing bandwidth sensitive data compression in a data processing system, comprises the steps of:
compressing data using a first compression routine providing a first compression rate;
tracking the throughput of the data processing system to determine if the first compression rate provides a throughput that meets a predetermined throughput threshold; and
compressing data using a second compression routine providing a second compression rate that is greater than the first compression rate, if the tracked throughput does not meet the predetermined throughput threshold.
Preferably, the first compression routine comprises a default asymmetric routine and the second compression routine comprises a symmetric routine.
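The method steps above can be sketched as a small controller that tracks measured throughput and switches to a faster routine when the throughput falls below a threshold. The class name, the use of two `zlib` levels as stand-ins for the first and second routines, and the decision to switch back once throughput recovers are assumptions of this sketch rather than details taken from the described method.

```python
import zlib

class BandwidthController:
    """Hypothetical controller that selects a compression routine from throughput."""

    def __init__(self, threshold_bytes_per_sec: float):
        self.threshold = threshold_bytes_per_sec
        # Routine 0: default, higher ratio but slower.
        # Routine 1: faster rate of compression, typically a lower ratio.
        self.routines = [
            lambda data: zlib.compress(data, 9),
            lambda data: zlib.compress(data, 1),
        ]
        self.active = 0

    def report(self, bytes_processed: int, seconds: float) -> None:
        """Track throughput; use the faster routine while below the threshold."""
        throughput = bytes_processed / seconds
        self.active = 1 if throughput < self.threshold else 0

    def compress(self, data: bytes) -> bytes:
        """Compress with whichever routine is currently selected."""
        return self.routines[self.active](data)

controller = BandwidthController(threshold_bytes_per_sec=10_000_000)
controller.report(bytes_processed=1_000_000, seconds=1.0)   # 1 MB/s: below threshold
print(controller.active)
```

A system tracking pending access requests instead of bytes per second could feed a queue depth into `report` and compare it against a maximum depth in the same way.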
In another aspect, the method comprises processing a user command to load a user-selected compression routine for compressing data.
In another aspect, the method further comprises processing a user command to compress user-provided data and automatically selecting a compression routine associated with a data type of the user-provided data.
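Automatic selection by data type can be pictured as a lookup from a data type to a compression routine. The profile names, the routines assigned to them, and the default below are hypothetical and chosen only to illustrate the lookup; `zlib` levels stand in for routines with different speed and ratio balances.

```python
import zlib

# Hypothetical access profiles: data type -> compression routine.
ACCESS_PROFILES = {
    # Written rarely and read often: spend more time compressing for a better ratio.
    "operating_system": lambda data: zlib.compress(data, 9),
    # Read and written frequently: favor a fast rate of compression.
    "database": lambda data: zlib.compress(data, 1),
}

def default_routine(data: bytes) -> bytes:
    """Balanced routine used when no profile matches the data type."""
    return zlib.compress(data, 6)

def select_routine(data_type: str):
    """Return the routine associated with `data_type`, or the default."""
    return ACCESS_PROFILES.get(data_type, default_routine)

compressed = select_routine("database")(b"row,row,row\n" * 1000)
print(len(compressed))
```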
These and other aspects, features and advantages of the present invention will become apparent from the following detailed description of preferred embodiments, which is to be read in connection with the accompanying drawings.
The present invention is directed to a system and method for compressing and decompressing based on the actual or expected throughput (bandwidth) of a system employing data compression. Although one of ordinary skill in the art could readily envision various implementations for the present invention, a preferred system in which this invention is employed comprises a data storage controller that preferably utilizes a real-time data compression system to provide “accelerated” data storage and retrieval bandwidths. The concept of “accelerated” data storage and retrieval was introduced in co-pending U.S. patent application Ser. No. 09/266,394, filed Mar. 11, 1999, entitled “System and Methods For Accelerated Data Storage and Retrieval” and co-pending U.S. patent application Ser. No. 09/481,243, filed Jan. 11, 2000, entitled “System and Methods For Accelerated Data Storage and Retrieval,” both of which are commonly assigned and incorporated herein by reference.
In general, as described in the above-incorporated applications, “accelerated” data storage comprises receiving a digital data stream at a data transmission rate which is greater than the data storage rate of a target storage device, compressing the input stream at a compression rate that increases the effective data storage rate of the target storage device and storing the compressed data in the target storage device. For instance, assume that a mass storage device (such as a hard disk) has a data storage rate of 20 megabytes per second. If a storage controller for the mass storage device is capable of compressing (in real time) an input data stream with an average compression rate of 3:1, then data can be stored in the mass storage device at a rate of 60 megabytes per second, thereby effectively increasing the storage bandwidth (“storewidth”) of the mass storage device by a factor of three. Similarly, accelerated data retrieval comprises retrieving a compressed digital data stream from a target storage device at the rate equal to, e.g., the data access rate of the target storage device and then decompressing the compressed data at a rate that increases the effective data access rate of the target storage device. Advantageously, providing accelerated data storage and retrieval at (or close to) real-time can reduce or eliminate traditional bottlenecks associated with, e.g., local and network disk accesses.
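The arithmetic in the example above can be written as a one-line function. The optional cap by compressor throughput is an assumption added for this sketch, reflecting that the gain is only realized if compression keeps pace with the input stream; the function and parameter names are illustrative.

```python
def effective_storage_rate(device_mb_per_s: float,
                           compression_ratio: float,
                           compressor_mb_per_s: float = float("inf")) -> float:
    """Effective rate at which uncompressed data can be stored, in MB/s.

    The device writes compressed data at `device_mb_per_s`, so each second it
    absorbs `device_mb_per_s * compression_ratio` MB of original data, unless
    the compressor itself (assumed limit) is slower than that.
    """
    return min(device_mb_per_s * compression_ratio, compressor_mb_per_s)

print(effective_storage_rate(20, 3))        # the 20 MB/s disk with 3:1 compression
print(effective_storage_rate(20, 3, 45))    # same disk, compressor limited to 45 MB/s
```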
In a preferred embodiment, the present invention is implemented for providing accelerated data storage and retrieval. In one embodiment, a controller tracks and monitors the throughput (data storage and retrieval) of a data compression system and generates control signals to enable/disable different compression algorithms when, e.g., a bottleneck occurs so as to increase the throughput and eliminate the bottleneck.
In the following description of preferred embodiments, two categories of compression algorithms are defined: an “asymmetrical” data compression algorithm and a “symmetrical” data compression algorithm. An asymmetrical data compression algorithm is referred to herein as one in which the execution times for the compression and decompression routines differ significantly. In particular, with an asymmetrical algorithm, either the compression routine is slow and the decompression routine is fast or the compression routine is fast and the decompression routine is slow. Examples of asymmetrical compression algorithms include dictionary-based compression schemes such as Lempel-Ziv.
On the other hand, a “symmetrical” data compression algorithm is referred to herein as one in which the execution time for the compression and the decompression routines are substantially similar. Examples of symmetrical algorithms include table-based compression schemes such as Huffman. For asymmetrical algorithms, the total execution time to perform one compress and one decompress of a data set is typically greater than the total execution time of symmetrical algorithms. But an asymmetrical algorithm typically achieves higher compression ratios than a symmetrical algorithm.
It is to be appreciated that in accordance with the present invention, symmetry may be defined in terms of overall effective bandwidth, compression ratio, or time or any combination thereof. In particular, in instances of frequent data read/writes, bandwidth is the optimal parameter for symmetry. In asymmetric applications such as operating systems and programs, the governing factor is net decompression bandwidth, which is a function of both compression ratio, which governs data retrieval time, and decompression speed, wherein the total governs the net effective data read bandwidth. These factors work in an analogous manner for data storage where the governing factors are both compression ratio (storage time) and compression speed. The present invention applies to any combination or subset thereof, which is utilized to optimize overall bandwidth, storage space, or any operating point in between.
Referring now to
The compression system 12 is operatively connected to the storage medium 14 using suitable protocols to write and read compressed data to and from the storage medium 14. It is to be understood that the storage medium 14 may comprise any form of memory device including all forms of sequential, pseudo-random, and random access storage devices. The memory storage device 14 may be volatile or non-volatile in nature, or any combination thereof. Storage devices as known within the current art include all forms of random access memory, magnetic and optical tape, magnetic and optical disks, along with various other forms of solid-state mass storage devices. Thus it should be noted that the current invention applies to all forms and manners of memory devices including, but not limited to, storage devices utilizing magnetic, optical, and chemical techniques, or any combination thereof. The data compression system 12 preferably operates in real-time (or substantially real-time) to compress data to be stored on the storage device 14 and to decompress data that is retrieved from the storage device 14. In addition, the compression system 12 may receive data (compressed or not compressed) via an I/O (input/output) port 16 that is transmitted over a transmission line or communication channel from a remote location, and then process such data (e.g., decompress or compress the data). The compression system 12 may further transmit data (compressed or decompressed) via the I/O port 16 to another network device for remote processing or storage.
The controller 11 utilizes information comprising a plurality of data profiles 15 to determine which compression algorithms 13 should be used by the compression system 12. In a preferred embodiment, the compression algorithms 13 comprise one or more asymmetric algorithms. As noted above, with asymmetric algorithms, the compression ratio is typically greater than the compression ratios obtained using symmetrical algorithms. Preferably, a plurality of asymmetric algorithms are selected to provide one or more asymmetric algorithms comprising a slow compress and fast decompress routine, as well as one or more asymmetric algorithms comprising a fast compress and slow decompress routine.
The compression algorithms 13 further comprise one or more symmetric algorithms, each having a compression rate and corresponding decompression rate that are substantially equal. Preferably, a plurality of symmetric algorithms are selected to provide a desired range of compression and decompression rates for data to be processed by a symmetric algorithm.
In a preferred embodiment, the overall throughput (bandwidth) of the system 10 is one factor considered by the controller 11 in deciding whether to use an asymmetrical or symmetrical compression algorithm for processing data stored to, and retrieved from, the storage device 14. Another factor that is used to determine the compression algorithm is the type of data to be processed. In a preferred embodiment, the data profiles 15 comprise information regarding predetermined access profiles of different data sets, which enables the controller 11 to select a suitable compression algorithm based on the data type. For instance, the data profiles may comprise a map that associates different data types (based on, e.g., a file extension) with preferred one(s) of the compression algorithms 13. For example, preferred access profiles considered by the controller 11 are set forth in the following table.
|Access Profile 1||Access Profile 2||Access Profile 3|
|Data is written to a storage medium once (or very few times) but is read from the storage medium many times.||Data is written to the storage medium often but read few times.||The amount of times data is read from and written to the storage medium is substantially the same.|
With Access Profile 1, the decompression routine would be executed significantly more times than the corresponding compression routine. This is typical with operating systems, applications and websites, for example. Indeed, an asymmetrical algorithm can be used to compress (offline) an operating system (OS), application or website using a slow compression routine to achieve a high compression ratio. After the compressed OS, application or website is stored, the asymmetric algorithm is then used during runtime to decompress, at a high rate, the OS, application or website launched or accessed by a user.
Therefore, with data sets falling within Access Profile 1, it is preferable to utilize an asymmetrical algorithm that provides a slow compression routine and a fast decompression routine so as to provide an increase in the overall system performance as compared to the performance that would be obtained using a symmetrical algorithm. Further, the compression ratio obtained using the asymmetrical algorithm would likely be higher than that obtained using a symmetrical algorithm (thus effectively increasing the storage capacity of the storage device).
With Access Profile 2, the compression routine would be executed significantly more times than the decompression routine. This is typical with a system for automatically updating an inventory database, for example, wherein an asymmetric algorithm that provides a fast compression routine and a slow decompression routine would provide an overall faster (higher throughput) and more efficient (higher compression ratio) system performance than would be obtained using a symmetrical algorithm.
With Access Profile 3, where data is accessed with a similar number of reads and writes, the compression routine would be executed approximately the same number of times as the decompression routine. This is typical of most user-generated data such as documents and spreadsheets. Therefore, it is preferable to utilize a symmetrical algorithm that provides a relatively fast compression and decompression routine. This would result in an overall system performance that would be faster as compared to using an asymmetrical algorithm (although the compression ratio achieved may be lower).
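The following table summarizes the three data access profiles and the type of compression algorithm that would produce optimum throughput.
|Access Profile||Example Data||Compression Algorithm||Decompression Algorithm|
|1. Write few, read many||Operating systems, applications, websites||Asymmetrical (slow compress)||Asymmetrical (fast decompress)|
|2. Write many, read few||Automatically updated inventory database||Asymmetrical (fast compress)||Asymmetrical (slow decompress)|
|3. Similar number of reads and writes||User-generated documents and spreadsheets||Symmetrical||Symmetrical|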
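The relationship between the three access profiles and the preferred algorithm category can be sketched as follows. This is an illustrative sketch only. The function name, the observed read/write counters, and the `skew` factor used to decide what counts as “significantly more” are assumptions; the description does not specify a numeric criterion.

```python
# Preferred algorithm category for each access profile, as described above.
PREFERRED = {
    1: ("asymmetrical", "slow compress, fast decompress"),
    2: ("asymmetrical", "fast compress, slow decompress"),
    3: ("symmetrical", "fast compress, fast decompress"),
}


def classify_access_profile(reads: int, writes: int, skew: float = 4.0) -> int:
    """Return access profile 1, 2, or 3 from observed read/write counts.

    skew is a hypothetical factor: one operation must occur at least
    `skew` times as often as the other to count as dominant.
    """
    if reads < 0 or writes < 0:
        raise ValueError("counts must be non-negative")
    if reads >= skew * max(writes, 1):
        return 1  # written once or rarely, read many times
    if writes >= skew * max(reads, 1):
        return 2  # written often, read few times
    return 3      # reads and writes substantially the same


profile = classify_access_profile(reads=1000, writes=1)
category, routines = PREFERRED[profile]
```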
In accordance with the present invention, the access profile of a given data set is known a priori or determined prior to compression so that the optimum category of compression algorithm can be selected. As explained below, the selection process may be performed either manually or automatically by the controller 11 of the data compression system 12. Further, the decision regarding which routines will be used at compression time (write) and at decompression time (read) is preferably made before or at the time of compression. This is because once data is compressed using a certain algorithm, only the matching decompression routine can be used to decompress the data, regardless of how much processing time is available at the time of decompression.
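One way the automatic selection might look is sketched below, assuming a data-profile map keyed on file extension as mentioned earlier. The extensions, category names, and function names are hypothetical. The sketch records the chosen category with the data at write time because only the matching decompression routine can later be used.

```python
import os

# Hypothetical data-profile map: file extension -> algorithm category.
DATA_PROFILES = {
    ".exe": "asymmetric-slow-compress",  # programs: written once, read often
    ".dll": "asymmetric-slow-compress",
    ".log": "asymmetric-fast-compress",  # written often, rarely read
    ".doc": "symmetric",                 # user documents: mixed reads/writes
    ".xls": "symmetric",
}

# Assumed fallback, mirroring the asymmetrical default described in the text.
DEFAULT_CATEGORY = "asymmetric-slow-compress"


def select_category(filename: str) -> str:
    """Pick an algorithm category from the file extension."""
    ext = os.path.splitext(filename)[1].lower()
    return DATA_PROFILES.get(ext, DEFAULT_CATEGORY)


def tag_for_storage(filename: str) -> dict:
    """Decide at write time and keep the decision with the data, so the
    matching decompression routine can be identified at read time."""
    return {"name": filename, "algorithm": select_category(filename)}
```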
Referring now to
In a preferred embodiment, the default algorithm comprises an asymmetrical algorithm since an operating system and application programs will be read from hard disk memory and decompressed during the initial use of the system 10. Indeed, as discussed above, an asymmetric algorithm that provides slow compression and fast decompression is preferable for compressing operating systems and applications so as to obtain a high compression ratio (to effectively increase the storage capacity of the hard disk) and fast data access (to effectively increase the retrieval rate from the hard disk). The initial asymmetric routine that is applied (by, e.g., a vendor) to compress the operating system and applications is preferably set as the default. The operating system will be retrieved and then decompressed using the default asymmetric routine (step 21).
During initial runtime, the controller will continue to use the default algorithm until certain conditions are met. For instance, if a read command is received (affirmative result in step 22), the controller will determine whether the data to be read from disk can be decompressed using the current routine (step 23). For this determination, the controller could, e.g., read a flag value that indicates the algorithm that was used to compress the file. If the data can be decompressed using the current algorithm (affirmative determination in step 23), then the file will be retrieved and decompressed (step 25). On the other hand, if the data cannot be decompressed using the current algorithm (negative determination in step 23), the controller will issue the appropriate control signal to the compression system to load the algorithm associated with the file (step 24) and, subsequently, decompress the file (step 25).
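The read path of steps 22 through 25 can be sketched as follows. This is a minimal sketch under stated assumptions. The standard-library `zlib` and `bz2` codecs merely stand in for whatever routines a real system would load, and the `StoredFile` and `ReadController` names and the string flag are hypothetical.

```python
import bz2
import zlib
from dataclasses import dataclass

# Stand-in decompression routines keyed by the per-file flag value.
DECOMPRESSORS = {"zlib": zlib.decompress, "bz2": bz2.decompress}


@dataclass
class StoredFile:
    algorithm_flag: str  # flag recording which algorithm compressed the file
    payload: bytes


class ReadController:
    def __init__(self, default_algorithm: str):
        self.current = default_algorithm
        self.loads = 0  # number of times a different routine was loaded

    def read(self, stored: StoredFile) -> bytes:
        if stored.algorithm_flag not in DECOMPRESSORS:
            raise ValueError("unknown algorithm flag")
        # Step 23: can the current routine decompress this file?
        if stored.algorithm_flag != self.current:
            # Step 24: load the routine associated with the file.
            self.current = stored.algorithm_flag
            self.loads += 1
        # Step 25: retrieve and decompress.
        return DECOMPRESSORS[self.current](stored.payload)
```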
If a write command is received (affirmative result in step 26), the data to be stored will be compressed using the current algorithm (step 27). During the process of compressing and storing the data, the controller will track the throughput to determine whether the throughput is meeting a predetermined threshold (step 28). For example, the controller may track the number of pending disk accesses (access requests) to determine whether a bottleneck is occurring. If the throughput of the system is not meeting the desired threshold (e.g., the compression system cannot maintain the required or requested data rates) (negative determination in step 28), then the controller will command the data compression system to utilize a compression routine providing faster compression (e.g., a fast symmetric compression algorithm) (step 29) so as to mitigate or eliminate the bottleneck.
If, on the other hand, the system throughput is meeting or exceeding the threshold (affirmative determination in step 28) and the current algorithm being used is a symmetrical routine (affirmative determination in step 30), in an effort to achieve optimal compression ratios, the controller will command the data compression system to use an asymmetric compression algorithm (step 31) that may provide a slower rate of compression but provides more efficient compression.
This process is repeated such that whenever the controller determines that the compression system can maintain the required/requested data throughput using a slow (highly efficient) asymmetrical compression algorithm, the controller will allow the compression system to operate in the asymmetrical mode. This will allow the system to obtain maximum storage capacity on the disk. Further, the controller will command the compression system to use a symmetric routine comprising a fast compression routine when the desired throughput is not met. This will allow the system to, e.g., service the backlogged disk accesses. Then, when the controller determines that the required/requested data rates are subsequently lower and the compression system can maintain the data rate, the controller can command the compression system to use a slower (but more efficient) asymmetric compression algorithm.
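The write-side decision loop of steps 28 through 31 can be sketched as follows. This is a simplified illustration. The class name, the mode strings, and the use of a measured rate in MB/s as the throughput metric are assumptions; the description also mentions counting pending disk accesses as an alternative indicator.

```python
class WriteController:
    """Switches between a slow, efficient asymmetric routine and a fast
    symmetric routine based on measured throughput."""

    def __init__(self, threshold_mb_s: float, start_mode: str = "asymmetric"):
        self.threshold_mb_s = threshold_mb_s
        self.mode = start_mode

    def update(self, measured_mb_s: float) -> str:
        if measured_mb_s < self.threshold_mb_s:
            # Step 28 negative -> step 29: use the faster routine.
            self.mode = "symmetric"
        elif self.mode == "symmetric":
            # Step 28 affirmative and step 30 affirmative -> step 31:
            # return to the more efficient asymmetric routine.
            self.mode = "asymmetric"
        return self.mode
```

As written, the sketch switches modes on every measurement that crosses the threshold. Any smoothing or hysteresis to avoid rapid toggling would be a further design choice that the description does not specify.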
With the above-described method depicted in
It is to be appreciated that the present invention may be implemented in any data processing system, device, or apparatus using data compression. For instance, the present invention may be employed in a data transmission controller in a network environment to provide accelerated data transmission over a communication channel (i.e., effectively increase the transmission bandwidth by compressing the data at the source and decompressing data at the receiver, in real-time).
Further, the present invention can be implemented with a data storage controller utilizing data compression and decompression to provide accelerated data storage and retrieval from a mass storage device. Exemplary embodiments of preferred data storage controllers in which the present invention may be implemented are described, for example, in U.S. patent application Ser. No. 09/775,905, filed on Feb. 2, 2001, entitled “Data Storewidth Accelerator”, which is commonly assigned and fully incorporated herein by reference.
The data storage controller 120 further comprises a plurality of memory devices including a RAM (random access memory) device 123 and a ROM (read only memory) device 124 (or FLASH memory or other types of non-volatile memory). The RAM device 123 is utilized as on-board cache and is preferably implemented as SDRAM. The ROM device 124 is utilized for non-volatile storage of logic code associated with the DSP 121 and configuration data used by the DSP 121 to program the programmable logic device 122.
The DSP 121 is operatively connected to the memory devices 123, 124 and the programmable logic device 122 via a local bus 125. The DSP 121 is also operatively connected to the programmable logic device 122 via an independent control bus 126. The programmable logic device 122 provides data flow control between the DSP 121 and the host computer system attached to the bus 116, as well as data flow control between the DSP 121 and the storage device. A plurality of external I/O ports 127 are included for data transmission and/or loading of one or more programmable logic devices. Preferably, the disk interface 114 driven by the programmable logic device 122 supports a plurality of hard drives.
The storage controller 120 further comprises computer reset and power up circuitry 128 (or “boot configuration circuit”) for controlling initialization (either cold or warm boots) of the host computer system and storage controller 120. A preferred boot configuration circuit and preferred computer initialization systems and protocols are described in U.S. patent application Ser. No. 09/775,897, filed on Feb. 2, 2001, entitled “System and Methods For Computer Initialization,” which is commonly assigned and incorporated herein by reference. Preferably, the boot configuration circuit 128 is employed for controlling the initialization and programming of the programmable logic device 122 during configuration of the host computer system (i.e., while the CPU of the host is held in reset). The boot configuration circuit 128 ensures that the programmable logic device 122 (and possibly other volatile or partially volatile logic devices) is initialized and programmed before the bus 116 (such as a PCI bus) is fully reset. In particular, when power is first applied to the boot configuration circuit 128, the boot configuration circuit 128 generates a control signal to reset the local system (e.g., storage controller 120) devices such as a DSP, memory, and I/O interfaces. Once the local system is powered-up and reset, the controlling device (such as the DSP 121) will then proceed to automatically determine the system environment and configure the local system to work within that environment. By way of example, the DSP 121 of the disk storage controller 120 would sense that the data storage controller 120 is on a PCI computer bus (expansion bus) and has attached to it a hard disk on an IDE interface. The DSP 121 would then load the appropriate PCI and IDE interfaces into the programmable logic device 122 prior to completion of the host system reset.
Once the programmable logic device 122 is configured for its environment, the boot device controller is reset and ready to accept commands over the computer/expansion bus 116.
It is to be understood that the data storage controller 120 may be utilized as a controller for transmitting data (compressed or uncompressed) to and from remote locations over the DSP I/O ports 127 or system bus 116, for example. Indeed, the I/O ports 127 of the DSP 121 may be used for transmitting data (compressed or uncompressed) that is either retrieved from the disk or received from the host system via the bus 116, to remote locations for processing and/or storage. Indeed, the I/O ports may be operatively connected to other data storage controllers or to network communication channels. Likewise, the data storage controller 120 may receive data (compressed or uncompressed) over the I/O ports 127 of the DSP 121 from remote systems that are connected to the I/O ports 127 of the DSP, for local processing by the data storage controller 120. For instance, a remote system may remotely access the data storage controller 120 (via the I/O ports of the DSP or system bus 116) to utilize the data compression, in which case the data storage controller 120 would transmit the compressed data back to the system that requested compression.
In accordance with the present invention, the system (e.g., data storage controller 120) preferably boots-up in a mode using asymmetrical data compression. It is to be understood that the boot process would not be affected whether the system boots up defaulting to an asymmetrical mode or to a symmetrical mode. This is because, during the boot process, the computer is reading the operating system from the disk, not writing. However, once data is written to the disk using a given compression algorithm, the system must retrieve and read that data using the corresponding decompression algorithm.
As the user creates, deletes and edits files, the disk controller 120 will preferably utilize an asymmetrical compression routine that provides slow compression and fast decompression. Since using the asymmetrical compression algorithm will provide slower compression than a symmetrical algorithm, the file system of the computer will track whether the disk controller 120 has disk accesses pending. If the disk controller 120 does have disk accesses pending and the system is starting to slow down, the file management system will command the disk controller 120 to use a faster symmetrical compression algorithm. If there are no disk access requests pending, the file management system will leave the disk controller in the mode of using the asymmetrical compression algorithm.
If the disk controller 120 was switched to using a symmetrical algorithm, the file management system will preferably signal the controller to switch back to a default asymmetrical algorithm when, e.g., the rate of the disk access requests slows to the point where there are no pending disk accesses.
At some point a user may decide to install software or load files onto the hard disk. Before installing the software, for example, as described above, the user could indicate to the disk controller 120 (via a software utility) to enter and remain in an asymmetric mode using an asymmetric compression algorithm with a slow compression routine and a very fast decompression routine. The disk controller would continue to use the asymmetrical algorithm until commanded otherwise, regardless of the number of pending disk accesses. Then, after completing the software installation, the user would release the disk controller from this “asymmetrical only” mode of operation (via the software utility).
Again, when the user is not commanding the disk controller 120 to remain in a certain mode, the file management system will determine whether the disk controller should use the asymmetrical compression algorithms or the symmetrical compression algorithms based on the amount of backlogged disk activity. If the backlogged disk activity exceeds a threshold, then the file management system will preferably command the disk controller to use a faster compression algorithm, even though compression performance may suffer. Otherwise, the file management system will command the disk controller to use the asymmetrical algorithm that will yield greater compression performance.
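The interaction between the user-commanded mode and the backlog-based selection can be sketched as follows. This is an illustrative sketch only. The class and method names, the mode strings, and the use of a pending-access count as the backlog measure are assumptions.

```python
from typing import Optional


class FileManager:
    """Chooses the disk controller's compression mode from the backlog of
    pending disk accesses, unless the user has locked a mode."""

    def __init__(self, backlog_threshold: int):
        self.backlog_threshold = backlog_threshold
        self.locked_mode: Optional[str] = None

    def lock(self, mode: str) -> None:
        # E.g., "asymmetric" while installing software.
        self.locked_mode = mode

    def release(self) -> None:
        self.locked_mode = None

    def choose_mode(self, pending_accesses: int) -> str:
        if self.locked_mode is not None:
            # A user-commanded mode holds regardless of pending accesses.
            return self.locked_mode
        if pending_accesses > self.backlog_threshold:
            return "symmetric"   # faster compression to drain the backlog
        return "asymmetric"      # greater compression performance
```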
It is to be appreciated that the data compression methods described herein may be integrated or otherwise implemented with the content independent data compression methods described in the above-incorporated U.S. Pat. Nos. 6,195,024 and 6,309,424.
In yet another embodiment of the present invention a virtual file management system is utilized to store, retrieve, or transmit compressed and/or accelerated data. In one embodiment of the present invention, a physical or virtual disk is utilized employing a representative file system format as illustrated in
The “virtual block table address” denotes the physical address of the virtual block table. The “virtual block table size” denotes the size of the virtual block table.
The “allocation size” refers to the minimum number of contiguous sectors on the disk to reserve for each new data entry. For example, assuming that 4 sectors are reserved for each allocation and that a compressed data entry requires only 1 sector, then the remaining 3 sectors would be left unused. Then, if that piece of data were to be appended to, there would be room to increase the data while remaining contiguous on the disk. Indeed, by maintaining the data contiguously, the speed at which the disk can read and write the data will increase. Although the controller preferably attempts to keep these unused sectors available for expansion of the data, if the disk were to fill up, the controller could use such sectors to store new data entries. In this way, a system can be configured to achieve greater speed, while not sacrificing disk space. Setting the allocation size to 1 sector would effectively disable this feature.
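The allocation arithmetic can be sketched as follows. This is a minimal sketch. The 512-byte sector size and the function name are assumptions. The reservation is taken to be the larger of the sectors actually needed and the allocation size, which is one reading of “minimum number of contiguous sectors.”

```python
import math
from typing import Tuple


def reserve_sectors(compressed_bytes: int,
                    allocation_sectors: int = 4,
                    sector_bytes: int = 512) -> Tuple[int, int]:
    """Return (sectors_used, spare_sectors) for a new data entry.

    spare_sectors are the reserved but unused sectors that let the entry
    grow while staying contiguous on the disk.
    """
    if compressed_bytes < 0 or allocation_sectors < 1 or sector_bytes < 1:
        raise ValueError("invalid arguments")
    used = max(1, math.ceil(compressed_bytes / sector_bytes))
    reserved = max(used, allocation_sectors)
    return used, reserved - used
```

With an allocation size of 4 and an entry that fits in 1 sector, the sketch reports 3 spare sectors, matching the example above. With an allocation size of 1, it reports none.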
The “number of free sectors” denotes the number of physical free sectors remaining on the disk. The “ID (Magic) number” identifies this data as a Superblock. The “checksum” comprises a number that changes based on the data in the Superblock and is used for error checking. Preferably, this number is chosen so that all of the words in the Superblock (including the checksum), when added together, equal zero.
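A zero-sum checksum of this kind can be sketched as follows. The 32-bit word size, the wraparound (modulo 2^32) addition, and the function names are assumptions of this sketch; the description does not state the word width.

```python
from typing import Sequence

WORD_MASK = 0xFFFFFFFF  # assumed 32-bit words


def zero_sum_checksum(words: Sequence[int]) -> int:
    """Return the checksum word that makes the total of all words,
    including the checksum itself, equal zero modulo 2**32."""
    return (-sum(words)) & WORD_MASK


def is_valid(words_with_checksum: Sequence[int]) -> bool:
    """True if the words (checksum included) add up to zero modulo 2**32."""
    return sum(words_with_checksum) & WORD_MASK == 0
```

Changing any single word by a nonzero amount makes the sum nonzero, so the block fails validation.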
The “virtual block table” (VBT) comprises a number of “sector map” entries, one for each grouping of compressed data (or chunks). The VBT may reside anywhere on the disk. The size of the VBT will depend on how much data is on the disk. Each sector map entry comprises 8 bytes. Although there is preferably only one VBT on the disk, each chunk of compressed data will have a copy of its sector map entry in its header. If the VBT were to become corrupted, a new one could be created by scanning the disk for all sector maps.
The term “type” refers to the sector map type. For example, a value of “00” corresponds to this sector map definition. Other values are preferably reserved for future redefinitions of the sector map.
A “C Type” denotes a compression type. A value of “000” will correspond to no compression. Other values are defined as required depending on the application. This function supports the use of multiple compression algorithms along with the use of various forms of asymmetric data compression.
The “C Info” comprises the compression information needed for the given compression type. These values are defined depending on the application. In addition, the data may be tagged based on its use, for example, operating system “00”, program “01”, or data “10”. Frequency of use or access codes may also be included. The size of this field may be greatly expanded to encode statistics supporting these items including, for example, cumulative number of times accessed, number of times accessed within a given time period or CPU clock cycles, and other related data.
The “sector count” comprises the number of physical sectors on the disk that are used for this chunk of compressed data. The “LBA” refers to the logical block address, or physical disk address, for this chunk of compressed data.
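An 8-byte sector map entry with the fields named above might be packed as sketched below. Only the field names and the 8-byte total come from the description. The bit widths (2-bit type, 3-bit C Type, 11-bit C Info, 16-bit sector count, 32-bit LBA), the field order, and the big-endian layout are assumptions chosen solely so that the fields fill 64 bits.

```python
import struct
from typing import NamedTuple


class SectorMapEntry(NamedTuple):
    map_type: int      # assumed 2 bits ("00" = this definition)
    c_type: int        # assumed 3 bits ("000" = no compression)
    c_info: int        # assumed 11 bits
    sector_count: int  # assumed 16 bits
    lba: int           # assumed 32 bits


def pack_sector_map(e: SectorMapEntry) -> bytes:
    """Pack an entry into 8 bytes using the assumed layout."""
    if not (0 <= e.map_type < 4 and 0 <= e.c_type < 8 and
            0 <= e.c_info < 2048 and 0 <= e.sector_count < 65536 and
            0 <= e.lba < 2**32):
        raise ValueError("field out of range for the assumed layout")
    head = (e.map_type << 14) | (e.c_type << 11) | e.c_info
    return struct.pack(">HHI", head, e.sector_count, e.lba)


def unpack_sector_map(raw: bytes) -> SectorMapEntry:
    """Inverse of pack_sector_map."""
    head, sector_count, lba = struct.unpack(">HHI", raw)
    return SectorMapEntry(head >> 14, (head >> 11) & 0x7, head & 0x7FF,
                          sector_count, lba)
```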
Referring back to
The “sector map” comprises a copy of the sector map entry in the VBT for this data chunk. The “VBI” is the Virtual Block Index, which is the index into the VBT that corresponds to this data chunk. The “ID (Magic) Number” identifies this data as a data block header. The “checksum” number will change based on the data in the header and is used for error checking. This number is preferably chosen such that the addition of all the words in the header (including the checksum) will equal zero.
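Because each data block header carries both its VBI and a copy of its sector map, a corrupted VBT could in principle be rebuilt from the headers alone. The following is a minimal sketch of that idea. It assumes the headers have already been located and validated, and the function name and the (VBI, sector map) pair representation are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def rebuild_vbt(headers: Iterable[Tuple[int, bytes]]) -> List[Optional[bytes]]:
    """Rebuild the virtual block table from (vbi, sector_map) pairs taken
    from data block headers. Slots with no header found are left as None."""
    entries: Dict[int, bytes] = {}
    for vbi, sector_map in headers:
        if vbi < 0 or len(sector_map) != 8:
            raise ValueError("malformed header")
        entries[vbi] = sector_map
    size = max(entries) + 1 if entries else 0
    return [entries.get(i) for i in range(size)]
```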
It should be noted that the present invention is not limited to checksums but may employ any manner of error detection and correction techniques, utilizing greatly expanded fields for error detection and/or correction.
It should be further noted that additional fields may be employed to support encryption, specifically an identifier for encrypted or unencrypted data along with any parameters necessary for routing or processing the data to an appropriate decryption module or user.
The virtual size of the disk will depend on the physical size of the disk, the compress size selected, and the expected compression ratio. For example, assuming a 75 GB disk with a selected compress size expecting a 3:1 compression ratio, the virtual disk size would be 225 GB. This will be the maximum amount of uncompressed data that the file system will be able to store on the disk.
If the number chosen is too small, then the entire disk will not be utilized. Consider the above example where a system comprises a 75 GB disk and a 225 GB virtual size. Assume that in actuality during operation the average compression ratio obtained is 5:1. Whereas this could theoretically allow 375 GB to be stored on the 75 GB disk, in practice, only 225 GB would be able to be stored on the disk before a “disk full” message is received. Indeed, with a 5:1 compression ratio, the 225 GB of data would only take up 45 GB on the disk leaving 30 GB unused. Since the operating system would think the disk is full, it would not attempt to write any more information to the disk.
On the other hand, if the number chosen is too large, then the disk will fill up when the operating system would still indicate that there was space available on the disk. Again consider the above example where a system comprises a 75 GB disk and a 225 GB virtual size. Assume further that during operation, the average compression ratio actually obtained is only 2:1. In this case, the physical disk would be full after writing 150 GB to it, but the operating system would still think there is 75 GB remaining. If the operating system tried to write more information to the disk, an error would occur.
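The two failure cases above follow from simple arithmetic, sketched below. The function and key names are hypothetical. The sketch assumes a uniform compression ratio across all data written.

```python
from typing import Dict


def virtual_disk_outcome(physical_gb: float,
                         expected_ratio: float,
                         actual_ratio: float) -> Dict[str, float]:
    """Compare a virtual size set from the expected ratio with what the
    actual ratio allows. Ratios are the N in "N:1"."""
    virtual_gb = physical_gb * expected_ratio
    # Uncompressed data that can be written before either limit is hit.
    storable_gb = min(virtual_gb, physical_gb * actual_ratio)
    physical_used_gb = storable_gb / actual_ratio
    return {
        "virtual_gb": virtual_gb,
        "storable_gb": storable_gb,
        # Physical space stranded when the virtual size is too small.
        "physical_unused_gb": physical_gb - physical_used_gb,
        # Space the operating system still reports when the disk is full.
        "reported_but_unavailable_gb": virtual_gb - storable_gb,
    }
```

For the 75 GB disk with a 225 GB virtual size, an actual ratio of 5:1 leaves 30 GB of the physical disk unused, while an actual ratio of 2:1 fills the disk after 150 GB with 75 GB still reported as free.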
Thus, in another embodiment of the present invention, the virtual size of the disk is dynamically altered based upon the achieved compression ratio. In one embodiment, a running average may be utilized to reallocate the virtual disk size. Alternatively, certain portions of the ratios may already be known—such as a preinstalled operating system and programs. Thus, this ratio is utilized for that portion of the disk, and predictive techniques are utilized for the balance of the disk or disks.
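One possible form of such a dynamic adjustment is sketched below. This is an illustration under stated assumptions. It uses a cumulative average of the achieved ratio as the “running average,” counts data already written at its actual size, and applies the running ratio only to the remaining physical space. The class name, the decimal gigabyte, and the omission of deletions are simplifications of this sketch.

```python
GB = 10**9  # decimal gigabyte, an assumption of this sketch


class VirtualDiskSizer:
    def __init__(self, physical_gb: float, predicted_ratio: float):
        self.physical_gb = physical_gb
        self.predicted_ratio = predicted_ratio  # used until data is written
        self.uncompressed_bytes = 0
        self.compressed_bytes = 0

    def record_write(self, uncompressed_bytes: int, compressed_bytes: int) -> None:
        self.uncompressed_bytes += uncompressed_bytes
        self.compressed_bytes += compressed_bytes

    def running_ratio(self) -> float:
        """Cumulative achieved ratio, or the prediction if nothing is stored."""
        if self.compressed_bytes == 0:
            return self.predicted_ratio
        return self.uncompressed_bytes / self.compressed_bytes

    def virtual_size_gb(self) -> float:
        """Data already stored plus the free physical space scaled by the
        running ratio."""
        stored_gb = self.uncompressed_bytes / GB
        free_gb = max(self.physical_gb - self.compressed_bytes / GB, 0.0)
        return stored_gb + free_gb * self.running_ratio()
```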
In yet another embodiment, users are prompted for setup information and the computer selects the appropriate virtual disk size(s) or selects the best method of estimation based on, e.g., a high-level menu asking the purpose of the computer: home, home office, business, or server. Another submenu may ask for the expected data mix: Word, Excel, video, music, etc. Then, based upon expected usage and associated compression ratios (or the use of already compressed data in the event of certain forms of music and video), the results are utilized to set the virtual disk size.
It should be noted that the present invention is independent of the number or types of physical or virtual disks, and indeed may be utilized with any type of storage.
It is to be understood that the systems and methods described herein may be implemented in various forms of hardware, software, firmware, special purpose processors, or a combination thereof. In particular, the present invention may be implemented as an application comprising program instructions that are tangibly embodied on a program storage device (e.g., magnetic floppy disk, RAM, ROM, CD ROM, etc.) and executable by any device or machine comprising suitable architecture. It is to be further understood that, because some of the constituent system components and process steps depicted in the accompanying Figures are preferably implemented in software, the actual connections between such components and steps may differ depending upon the manner in which the present invention is programmed. Given the teachings herein, one of ordinary skill in the related art will be able to contemplate these and similar implementations or configurations of the present invention.
Although illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the present system and method is not limited to those precise embodiments, and that various other changes and modifications may be effected therein by one skilled in the art without departing from the scope or spirit of the invention. All such changes and modifications are intended to be included within the scope of the invention as defined by the appended claims.
|US5150430||15 Mar 1991||22 Sep 1992||The Board Of Trustees Of The Leland Stanford Junior University||Lossless data compression circuit and method|
|US5159336 *||13 Aug 1991||27 Oct 1992||Iomega Corporation||Tape controller with data compression and error correction sharing a common buffer|
|US5175543||25 Sep 1991||29 Dec 1992||Hewlett-Packard Company||Dictionary reset performance enhancement for data compression applications|
|US5179651||5 Feb 1991||12 Jan 1993||Massachusetts General Hospital||Apparatus for retrieval and processing of selected archived images for display at workstation terminals|
|US5187793||9 Jan 1989||16 Feb 1993||Intel Corporation||Processor with hierarchal memory and using meta-instructions for software control of loading, unloading and execution of machine instructions stored in the cache|
|US5191431 *||13 Aug 1990||2 Mar 1993||Canon Kabushiki Kaisha||Recording apparatus having plural operating modes involving diverse signal compression rates and different apportioning of pilot signal recording area|
|US5204756||30 Jul 1990||20 Apr 1993||International Business Machines Corporation||Method for high-quality compression of binary text images|
|US5209220||28 Aug 1990||11 May 1993||Olympus Optical Co., Ltd.||Endoscope image data compressing apparatus|
|US5212742||24 May 1991||18 May 1993||Apple Computer, Inc.||Method and apparatus for encoding/decoding image data|
|US5226176||20 Aug 1990||6 Jul 1993||Microsystems, Inc.||System for selectively aborting operation or waiting to load required data based upon user response to non-availability of network load device|
|US5227893||31 Oct 1990||13 Jul 1993||International Business Machines Corporation||Pseudo-bar code control of image transmission|
|US5231492||16 Mar 1990||27 Jul 1993||Fujitsu Limited||Video and audio multiplex transmission system|
|US5237460||4 Sep 1992||17 Aug 1993||Ceram, Inc.||Storage of compressed data on random access storage devices|
|US5237675||19 Mar 1992||17 Aug 1993||Maxtor Corporation||Apparatus and method for efficient organization of compressed data on a hard disk utilizing an estimated compression factor|
|US5243341||1 Jun 1992||7 Sep 1993||Hewlett Packard Company||Lempel-Ziv compression scheme with enhanced adapation|
|US5243348||27 Apr 1992||7 Sep 1993||Motorola, Inc.||Partitioned digital encoder and method for encoding bit groups in parallel|
|US5247638||18 Jun 1990||21 Sep 1993||Storage Technology Corporation||Apparatus for compressing data in a dynamically mapped virtual data storage subsystem|
|US5247646||22 Jul 1991||21 Sep 1993||Aquidneck Systems International, Inc.||Compressed data optical disk storage system|
|US5263168||3 Jun 1991||16 Nov 1993||Motorola, Inc.||Circuitry for automatically entering and terminating an initialization mode in a data processing system in response to a control signal|
|US5270832||14 Apr 1992||14 Dec 1993||C-Cube Microsystems||System for compression and decompression of video data using discrete cosine transform and coding techniques|
|US5287420||8 Apr 1992||15 Feb 1994||Supermac Technology||Method for image compression on a personal computer|
|US5293379||27 May 1993||8 Mar 1994||Gandalf Technologies, Inc.||Packet-based data compression method|
|US5307497||25 Jun 1990||26 Apr 1994||International Business Machines Corp.||Disk operating system loadable from read only memory using installable file system interface|
|US5309555||14 May 1992||3 May 1994||International Business Machines Corporation||Realtime communication of hand drawn images in a multiprogramming window environment|
|US5355498||30 Aug 1993||11 Oct 1994||Sun Microsystems, Inc.||Method and apparatus for booting a computer system without loading a device driver into memory|
|US5357614||17 Sep 1992||18 Oct 1994||Rexon/Tecmar, Inc.||Data compression controller|
|US5379036||1 Apr 1992||3 Jan 1995||Storer; James A.||Method and apparatus for data compression|
|US5379757||30 Nov 1992||10 Jan 1995||Olympus Optical Co. Ltd.||Method of compressing endoscope image data based on image characteristics|
|US5381145||10 Feb 1993||10 Jan 1995||Ricoh Corporation||Method and apparatus for parallel decoding and encoding of data|
|US5394534||11 Sep 1992||28 Feb 1995||International Business Machines Corporation||Data compression/decompression and storage of compressed and uncompressed data on a same removable data storage medium|
|US5396228||16 Jan 1992||7 Mar 1995||Mobile Telecommunications Technologies||Methods and apparatus for compressing and decompressing paging data|
|US5400401||30 Oct 1992||21 Mar 1995||Scientific Atlanta, Inc.||System and method for transmitting a plurality of digital services|
|US5403639||2 Sep 1992||4 Apr 1995||Storage Technology Corporation||File server having snapshot application data groups|
|US5406278||28 Feb 1992||11 Apr 1995||Intersecting Concepts, Inc.||Method and apparatus for data compression having an improved matching algorithm which utilizes a parallel hashing technique|
|US5406279||2 Sep 1992||11 Apr 1995||Cirrus Logic, Inc.||General purpose, hash-based technique for single-pass lossless data compression|
|US5412384||10 Aug 1994||2 May 1995||International Business Machines Corporation||Method and system for adaptively building a static Ziv-Lempel dictionary for database compression|
|US5414850||23 Aug 1991||9 May 1995||Stac Electronics, Inc.||System for transparently compressing data files in a computer system|
|US5420639||1 Apr 1993||30 May 1995||Scientific-Atlanta, Inc.||Rate adaptive huffman coding|
|US5434983||28 Aug 1992||18 Jul 1995||Fujitsu Limited||Data processing apparatus having first bus with bus arbitration independent of CPU, second bus for CPU, and gate between first and second buses|
|US5452287||20 Sep 1993||19 Sep 1995||Motorola, Inc.||Method of negotiation of protocols, classes, and options in computer and communication networks providing mixed packet, frame, cell, and circuit services|
|US5461679||14 May 1993||24 Oct 1995||Apple Computer, Inc.||Method and apparatus for encoding/decoding image data|
|US5467087||18 Dec 1992||14 Nov 1995||Apple Computer, Inc.||High speed lossless data compression system|
|US5471206||5 Dec 1994||28 Nov 1995||Ricoh Corporation||Method and apparatus for parallel decoding and encoding of data|
|US5479587||3 Sep 1992||26 Dec 1995||Hewlett-Packard Company||Page printer having adaptive data compression for memory minimization|
|US5483470||29 Jun 1992||9 Jan 1996||At&T Corp.||Timing verification by successive approximation|
|US5486826||19 May 1994||23 Jan 1996||Ps Venture 1 Llc||Method and apparatus for iterative compression of digital data|
|US5495244||12 Oct 1994||27 Feb 1996||Samsung Electronics Co., Ltd.||Device for encoding and decoding transmission signals through adaptive selection of transforming methods|
|US5506844||20 May 1994||9 Apr 1996||Compression Labs, Inc.||Method for configuring a statistical multiplexer to dynamically allocate communication channel bandwidth|
|US5506872||26 Apr 1994||9 Apr 1996||At&T Corp.||Dynamic compression-rate selection arrangement|
|US5530845||17 Jan 1995||25 Jun 1996||Southwestern Bell Technology Resources, Inc.||Storage control subsystem implemented with an application program on a computer|
|US5533051||12 Mar 1993||2 Jul 1996||The James Group||Method for data compression|
|US5535356||9 Sep 1992||9 Jul 1996||Samsung Electronics Co., Ltd.||Digital data recording/playback system utilizing EEPROM and ROM memories as a storage medium|
|US5537658||7 Jun 1995||16 Jul 1996||International Business Machines Corporation||Distributed directory method and structure for direct access storage device (DASD) data compression|
|US5557551 *||18 Jul 1995||17 Sep 1996||International Business Machines Corporation||Method and apparatus for a thermal protection unit|
|US5557668||25 May 1995||17 Sep 1996||Teledata Solutions, Inc.||Call distribution system with distributed control of calls and data distribution|
|US5557749||15 Oct 1992||17 Sep 1996||Intel Corporation||System for automatically compressing and decompressing data for sender and receiver processes upon determination of a common compression/decompression method understood by both sender and receiver processes|
|US5561824||4 Oct 1994||1 Oct 1996||International Business Machines Corporation||Storage management of data for ensuring communication of minimal length data|
|US5563961 *||3 Mar 1994||8 Oct 1996||Radius Inc.||Video data compression method and system which measures compressed data storage time to optimize compression rate|
|US5574952||11 May 1994||12 Nov 1996||International Business Machines Corporation||Data storage system and method for operating a disk controller including allocating disk space for compressed data|
|US5574953||19 Aug 1994||12 Nov 1996||Hewlett-Packard Company||Storing compressed data in non-contiguous memory|
|US5576953||7 Sep 1994||19 Nov 1996||Hugentobler; Max||Electronic translating device|
|US5583500||23 Dec 1993||10 Dec 1996||Ricoh Corporation||Method and apparatus for parallel encoding and decoding of data|
|US5590306||20 Jul 1993||31 Dec 1996||Fuji Photo Film Co., Ltd.||Memory card management system for writing data with usage and recording codes made significant|
|US5596674||26 May 1993||21 Jan 1997||Sony Corporation||State machine apparatus and methods for encoding data in serial form and decoding using multiple tables|
|US5604824||22 Sep 1994||18 Feb 1997||Houston Advanced Research Center||Method and apparatus for compression and decompression of documents and the like using splines and spline-wavelets|
|US5606706||22 Aug 1994||25 Feb 1997||Hitachi, Ltd.||Data storing system and data transfer method|
|US5611024||28 Aug 1992||11 Mar 1997||Compaq Computer Corporation||Data compression of bit map images|
|US5612788||18 May 1994||18 Mar 1997||Sony Corporation||Video data compression apparatus for recording and reproducing compressed video data at their various compressed data rates|
|US5613069||16 Dec 1994||18 Mar 1997||Tony Walker||Non-blocking packet switching network with dynamic routing codes having incoming packets diverted and temporarily stored in processor inputs when network ouput is not available|
|US5615017||18 Nov 1992||25 Mar 1997||Samsung Electronics Co., Ltd.||Method of and control circuit for compression recording and reproducing of multiple images|
|US5621820||25 Aug 1995||15 Apr 1997||Radius Inc.||Video data compression method and system which measures compressed data storage time to optimize compression rate|
|US5623623||11 May 1995||22 Apr 1997||Samsung Electronics Co., Ltd.||Digital storage system adopting semiconductor memory device|
|US5623701||6 Jun 1995||22 Apr 1997||International Business Machines Corporation||Data compression method and structure for a direct access storage device|
|US5627534||23 Mar 1995||6 May 1997||International Business Machines Corporation||Dual stage compression of bit mapped image data using refined run length and LZ compression|
|US5627995||1 Jun 1994||6 May 1997||Alfred P. Gnadinger||Data compression and decompression using memory spaces of more than one size|
|US5629732||29 Mar 1994||13 May 1997||The Trustees Of Columbia University In The City Of New York||Viewer controllable on-demand multimedia service|
|US5630092||2 Jan 1996||13 May 1997||International Business Machines||System and method for transferring compressed and uncompressed data between storage systems|
|US5635632||26 Apr 1994||3 Jun 1997||Cytec Technology Corp.||Settling process analysis device and method|
|US5638498 *||7 Jun 1995||10 Jun 1997||Adobe Systems Incorporated||Method and apparatus for reducing storage requirements for display data|
|US5675333 *||29 Aug 1995||7 Oct 1997||U.S. Philips Corporation||Digital compressed sound recorder|
|US5974235 *||31 Oct 1996||26 Oct 1999||Sensormatic Electronics Corporation||Apparatus having flexible capabilities for analysis of video information|
|US5982723 *||30 Sep 1996||9 Nov 1999||Laser Dynamics, Inc.||Data recording and reproducing method for multi-layered optical disk system|
|US6097520 *||30 Jun 1997||1 Aug 2000||Microsoft Corporation||Remote control receiver and method of operation|
|US6104389 *||12 Jun 1998||15 Aug 2000||Victor Company Of Japan, Ltd.||Broadcast receiving method and broadcast receiving apparatus therefor|
|US6487640 *||19 Jan 1999||26 Nov 2002||International Business Machines Corporation||Memory access request reordering to reduce memory access latency|
|US6601104 *||11 Mar 1999||29 Jul 2003||Realtime Data Llc||System and methods for accelerated data storage and retrieval|
|US6748457 *||2 Feb 2001||8 Jun 2004||Realtime Data, Llc||Data storewidth accelerator|
|EP0587437A2 *||10 Sep 1993||16 Mar 1994||International Business Machines Corporation||Data compression/decompression and storage of compressed and uncompressed data on a single data storage volume|
|No.||Reference|
|1||"Fast Dos Soft Boot", IBM Technical Disclosure Bulletin, Feb. 1994, vol. 37, Issue No. 2B, pp. 185-186.|
|2||"IBM boosts your memory", Geek.com [online], Jun. 26, 2000 [retrieved on Jul. 6, 2007], <URL:http://www.geek.com/ibm-boosts-your-memory/>.|
|3||"IBM Research Breakthrough Doubles Computer Memory Capacity", IBM Press Release [online], Jun. 26, 2000 [retrieved on Jul. 6, 2007], <URL:http://www-03.ibm.com/press/us/en/pressrelease/1653.wss>.|
|4||"Operating System Platform Abstraction Method", IBM Technical Disclosure Bulletin, Feb. 1995, vol. 38, Issue No. 2, pp. 343-344.|
|5||"ServerWorks To Deliver IBM's Memory eXpansion Technology in Next-Generation Core Logic Servers", ServerWorks Press Release [online], Jun. 27, 2000 [retrieved on Jul. 14, 2000], <URL:http://www.serverworks.com/news/press/000627.html>.|
|6||Abali, B., et al., "Memory Expansion Technology (MXT) Software support and performance", IBM Journal of Research and Development, vol. 45, Issue No. 2, Mar. 2001, pp. 287-301.|
|7||Anderson, J., et al., "Codec squeezes color teleconferencing through digital telephone lines", Electronics 1984, pp. 13-15.|
|8||Coene, W., et al., "A Fast Route for Application of a Rate-distortion Optimal Quantization in an MPEG Video Encoder", Proceedings of the International Conference on Image Processing, US., New York, Sep. 16, 1996, pp. 825-8286.|
|9||Franaszek, P. A., et al., "Algorithms and data structures for compressed-memory machines", IBM Journal of Research and Development, vol. 45, Issue No. 2, Mar. 2001, pp. 245-258.|
|10||Franaszek, P. A., et al., "On internal organization in compressed random-access memories", IBM Journal of Research and Development, vol. 45, Issue No. 2, Mar. 2001, pp. 259-270.|
|11||Millman, Howard, "Image and video compression", Computerworld, vol. 33, Issue No. 3, Jan. 18, 1999, p. 78.|
|12||Murashita, K., et al., "High-Speed Statistical Compression using Self-organized Rules and Predetermined Code Tables", IEEE, 1996 Data Compression Conference.|
|13||Rice, Robert F., "Some Practical Universal Noiseless Coding Techniques", Jet Propulsion Laboratory, Pasadena, California, JPL Publication 79-22, Mar. 15, 1979.|
|14||Rice, Robert, "Lossless Coding Standards for Space Data Systems", IEEE 1058-6393/97, pp. 577-585.|
|15||Smith, T.B., et al., "Memory Expansion Technology (MXT) Competitive impact", IBM Journal of Research and Development, vol. 45, Issue No. 2, Mar. 2001, pp. 303-309.|
|16||Tremaine, R. B., et al., "IBM Memory Expansion Technology (MXT)", IBM Journal of Research and Development, vol. 45, Issue No. 2, Mar. 2001, pp. 271-285.|
|17||Venbrux, Jack, "A VLSI Chip Set for High-Speed Lossless Data Compression", IEEE Trans. On Circuits and Systems for Video Technology, vol. 2, No. 44, Dec. 1992, pp. 381-391.|
|18||Yen, Pen-Shu, "The CCSDS Lossless Data Compression Recommendation for Space Applications", Chapter 16, Lossless Compression Handbook, Elsevier Science (USA), 2003, pp. 311-326.3.|
|Citing Patent||Filing date||Publication date||Applicant||Title|
|US7675646 *||31 May 2005||9 Mar 2010||Xerox Corporation||Flexible print data compression|
|US7714747||8 Jan 2007||11 May 2010||Realtime Data Llc||Data compression systems and methods|
|US7777651||2 Jun 2008||17 Aug 2010||Realtime Data Llc||System and method for data feed acceleration and encryption|
|US7831491||5 Nov 2004||9 Nov 2010||Chicago Mercantile Exchange Inc.||Market data message format|
|US7882122 *||17 Mar 2006||1 Feb 2011||Capital Source Far East Limited||Remote access of heterogeneous data|
|US8037528 *||17 Sep 2007||11 Oct 2011||Cisco Technology, Inc.||Enhanced server to client session inspection|
|US8054879||8 Jan 2010||8 Nov 2011||Realtime Data Llc||Bandwidth sensitive data compression and decompression|
|US8073047||19 May 2008||6 Dec 2011||Realtime Data, Llc||Bandwidth sensitive data compression and decompression|
|US8090936||19 Oct 2006||3 Jan 2012||Realtime Data, Llc||Systems and methods for accelerated loading of operating systems and application programs|
|US8112619||19 Oct 2006||7 Feb 2012||Realtime Data Llc||Systems and methods for accelerated loading of operating systems and application programs|
|US8203733||3 May 2007||19 Jun 2012||Fuji Xerox Co., Ltd.||Image processing apparatus, storage medium in which image processing program is stored, and image processing method|
|US8229820||22 Sep 2010||24 Jul 2012||Chicago Mercantile Exchange Inc.||Market data message format|
|US8275897||8 Apr 2006||25 Sep 2012||Realtime Data, Llc||System and methods for accelerated data storage and retrieval|
|US8321325||3 Mar 2010||27 Nov 2012||Verticlear, Inc.||Systems and methods for compression of trade-related records|
|US8341056||9 May 2012||25 Dec 2012||Chicago Mercantile Exchange Inc.||Distribution of market data|
|US8379841||22 Mar 2007||19 Feb 2013||Exegy Incorporated||Method and system for high throughput blockwise independent encryption/decryption|
|US8407122||31 Mar 2011||26 Mar 2013||Exegy Incorporated||High speed processing of financial information using FPGA devices|
|US8458081||31 Mar 2011||4 Jun 2013||Exegy Incorporated||High speed processing of financial information using FPGA devices|
|US8478680||31 Mar 2011||2 Jul 2013||Exegy Incorporated||High speed processing of financial information using FPGA devices|
|US8502707||9 Feb 2010||6 Aug 2013||Realtime Data, Llc||Data compression systems and methods|
|US8504710||26 Oct 2006||6 Aug 2013||Realtime Data Llc||System and methods for accelerated data storage and retrieval|
|US8533101 *||3 Mar 2010||10 Sep 2013||Verticlear, Inc.||Systems and methods for compression of trade-related records|
|US8553759||6 Jun 2011||8 Oct 2013||Realtime Data, Llc||Bandwidth sensitive data compression and decompression|
|US8560443||19 Nov 2012||15 Oct 2013||Chicago Mercantile Exchange, Inc.||Distribution of market data|
|US8595104||31 Mar 2011||26 Nov 2013||Ip Reservoir, Llc||High speed processing of financial information using FPGA devices|
|US8600856||31 Mar 2011||3 Dec 2013||Ip Reservoir, Llc||High speed processing of financial information using FPGA devices|
|US8620881||21 Jun 2011||31 Dec 2013||Ip Reservoir, Llc||Intelligent data storage and processing using FPGA devices|
|US8626624||31 Mar 2011||7 Jan 2014||Ip Reservoir, Llc||High speed processing of financial information using FPGA devices|
|US8626625||24 Jun 2011||7 Jan 2014||Chicago Mercantile Exchange Inc.||Trade engine processing of mass quote messages and resulting production of market data|
|US8643513||6 Jun 2011||4 Feb 2014||Realtime Data Llc||Data compression systems and methods|
|US8655764||31 Mar 2011||18 Feb 2014||Ip Reservoir, Llc||High speed processing of financial information using FPGA devices|
|US8692695||16 Aug 2010||8 Apr 2014||Realtime Data, Llc||Methods for encoding and decoding data|
|US8717203||24 Sep 2013||6 May 2014||Realtime Data, Llc||Data compression systems and methods|
|US8717204||24 Sep 2013||6 May 2014||Realtime Data Llc||Methods for encoding and decoding data|
|US8719438||5 May 2011||6 May 2014||Realtime Data Llc||System and methods for accelerated data storage and retrieval|
|US8723701||24 Sep 2013||13 May 2014||Realtime Data Llc||Methods for encoding and decoding data|
|US8737606||5 Feb 2013||27 May 2014||Ip Reservoir, Llc||Method and system for high throughput blockwise independent encryption/decryption|
|US8738416 *||16 Feb 2011||27 May 2014||Fuji Xerox Co., Ltd.||Information processing apparatus and computer readable medium|
|US8742958||24 Sep 2013||3 Jun 2014||Realtime Data Llc||Methods for encoding and decoding data|
|US8751452||6 Jan 2012||10 Jun 2014||Ip Reservoir, Llc||Intelligent data storage and processing using FPGA devices|
|US8756332||26 Oct 2006||17 Jun 2014||Realtime Data Llc||System and methods for accelerated data storage and retrieval|
|US8762249||7 Jun 2011||24 Jun 2014||Ip Reservoir, Llc||Method and apparatus for high-speed processing of financial market depth data|
|US8768805||7 Jun 2011||1 Jul 2014||Ip Reservoir, Llc||Method and apparatus for high-speed processing of financial market depth data|
|US8768888||6 Jan 2012||1 Jul 2014||Ip Reservoir, Llc||Intelligent data storage and processing using FPGA devices|
|US8843408||26 Oct 2010||23 Sep 2014||Ip Reservoir, Llc||Method and system for high speed options pricing|
|US8867610||19 Dec 2013||21 Oct 2014||Realtime Data Llc||System and methods for video and audio data distribution|
|US8879727||29 Aug 2008||4 Nov 2014||Ip Reservoir, Llc||Method and apparatus for hardware-accelerated encryption/decryption|
|US8880862||27 May 2011||4 Nov 2014||Realtime Data, Llc||Systems and methods for accelerated loading of operating systems and application programs|
|US8929442||19 Dec 2013||6 Jan 2015||Realtime Data, Llc||System and methods for video and audio data distribution|
|US8933825||11 Apr 2014||13 Jan 2015||Realtime Data Llc||Data compression systems and methods|
|US8934535||20 Sep 2013||13 Jan 2015||Realtime Data Llc||Systems and methods for video and audio data storage and distribution|
|US8983063||16 May 2014||17 Mar 2015||Ip Reservoir, Llc||Method and system for high throughput blockwise independent encryption/decryption|
|US9025261 *||18 Nov 2013||5 May 2015||International Business Machines Corporation||Writing and reading data in tape media|
|US9047243||5 Apr 2012||2 Jun 2015||Ip Reservoir, Llc||Method and apparatus for low latency data distribution|
|US9054728||24 Sep 2014||9 Jun 2015||Realtime Data, Llc||Data compression systems and methods|
|US9116908||12 Jun 2014||25 Aug 2015||Realtime Data Llc||System and methods for accelerated data storage and retrieval|
|US9141992||23 Feb 2012||22 Sep 2015||Realtime Data Llc||Data feed acceleration|
|US9143546||3 Oct 2001||22 Sep 2015||Realtime Data Llc||System and method for data feed acceleration and encryption|
|US20050096999 *||30 Jul 2004||5 May 2005||Chicago Mercantile Exchange||Trade engine processing of mass quote messages and resulting production of market data|
|US20060195464 *||28 Feb 2005||31 Aug 2006||Microsoft Corporation||Dynamic data delivery|
|US20060268322 *||31 May 2005||30 Nov 2006||Xerox Corporation||Print data compression|
|US20070083746 *||19 Oct 2006||12 Apr 2007||Realtime Data Llc||Systems and methods for accelerated loading of operating systems and application programs|
|US20110218900 *||3 Mar 2010||8 Sep 2011||Liam Cheung||Systems and methods for compression of trade-related records|
|US20120041799 *||16 Feb 2011||16 Feb 2012||Fuji Xerox Co., Ltd.||Information processing apparatus and computer readable medium|
|US20150110131 *||23 Oct 2013||23 Apr 2015||Google Inc.||Secure communications using adaptive data compression|
|International Classification||H04B1/66, H03M7/30|
|Cooperative Classification||H03M7/3084, H03M7/30, H03M7/6094|
|European Classification||H03M7/30, H03M7/30Z|
|Date||Code||Event||Description|
|8 Sep 2006||AS||Assignment||Owner name: REALTIME DATA, LLC, NEW YORK. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:FALLON, JAMES J.;MCERLAIN, STEPHEN J.;REEL/FRAME:018240/0813. Effective date: 20020807|
|15 Sep 2011||FPAY||Fee payment||Year of fee payment: 4|