WO1992002989A1 - Compound adaptive data compression system - Google Patents

Info

Publication number
WO1992002989A1
Authority
WO
WIPO (PCT)
Prior art keywords
character
encoding
font
string
characters
Prior art date
Application number
PCT/US1991/005659
Other languages
French (fr)
Inventor
Francis L. Bacon
Ernest R. Price
Original Assignee
Telcor Systems Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telcor Systems Corporation
Publication of WO1992002989A1

Links

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/46 Conversion to or from run-length codes, i.e. by representing the number of consecutive digits, or groups of digits, of the same kind by a code word and a digit indicative of that kind
    • H03M7/48 Conversion to or from run-length codes, i.e. by representing the number of consecutive digits, or groups of digits, of the same kind by a code word and a digit indicative of that kind alternating with other codes during the code conversion process, e.g. run-length coding being performed only as long as sufficiently long runs of digits of the same kind are present
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3084 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
    • H03M7/3086 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method employing a sliding window, e.g. LZ77
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/40 Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
    • H03M7/42 Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code using table look-up for the coding or decoding process, e.g. using read-only memory

Definitions

  • The invention relates to the field of data compression systems and particularly to apparatus and methods for compressing data signals and reconstituting the data signals.
  • Data compression systems are known in the prior art that encode a stream of digital data signals into compressed digital signals and decode the compressed digital data signals back into the original data signals.
  • Data compression refers to any process that converts data in one format into another format having fewer bits than the original.
  • The objective of data compression systems is to reduce the amount of storage required to hold a given body of digital information or to increase the speed of data transmission by permitting an effective data transmission rate that is greater than the rated capacity of a given data communication link.
  • Compression effectiveness is characterized by the compression ratio of the system.
  • Compression ratio is herein defined as the ratio of the number of bits in the input data to the number of bits in the encoded output data. The larger the compression ratio, the greater will be the reduction in storage space or transmission time.
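The definition above can be illustrated with a few lines of Python. This is only a sketch of the arithmetic; the function name `compression_ratio` and the sample figures are hypothetical and do not come from the patent.

```python
def compression_ratio(input_bits: int, output_bits: int) -> float:
    """Ratio of bits in the input data to bits in the encoded output."""
    if output_bits <= 0:
        raise ValueError("output_bits must be positive")
    return input_bits / output_bits


# Illustrative figures only: 1000 eight-bit characters encoded into 2000 bits.
ratio = compression_ratio(1000 * 8, 2000)
print(f"compression ratio = {ratio:.1f}")
```

A larger value means less storage or transmission time for the same input.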
  • Redundancy occurs both in the nonuniform usage of individual symbols, e.g. characters, bytes, or digits, and in frequent recurrence of symbol sequences, such as common words, blank record fields, and the like.
  • An effective data compression system should respond to both types of redundancy.
  • A typical data stream contains both types of redundancy in varying portions resulting in varying statistics.
  • An example of a data stream of varying statistics is a data stream wherein
  • A digital data compression system must possess the property of reversibility, i.e. it must be possible to reexpand or decompress the compressed data back into its original form without any alteration or loss of information.
  • The decompressed and the original information must be identical and indistinguishable with respect to each other.
  • It should satisfy several performance criteria.
  • The compression effectiveness should be high, and therefore the compression ratio should be large.
  • The system should provide high data rate performance with respect to the data rates provided by and accepted by the equipment with which the data compression and decompression systems are interfacing. For real-time, switched-network data communications applications, the rate at which data is compressed should preferably match the output data rate from the compression system.
  • The system should be adaptable, that is, capable of achieving high compression effectiveness and high performance on data having a variety of statistical characteristics.
  • Many prior art data compression procedures require prior knowledge of the statistics of the data being compressed. Some prior art procedures adapt to the statistics of the data as it is received. Adaptability in the prior art processes has either been limited to a narrow range of variation, e.g. character-by-character encoding, or has required a high degree of complexity with a resultant severe penalty in data rate performance.
  • The requirement for data compression systems suitable for use in modems in high speed data communication links is to accommodate a wide range of data characteristics without prior knowledge of data statistics and achieve both high compression ratios and high data rate performance.
  • Data compression and decompression systems and modems currently available are either not adaptable over a wide range of data characteristics or are severely limited in compression efficiency or data-rate performance and so are not suitable for general purpose usage.
  • The system should be responsively adaptable, that is, capable of reestablishing a high compression ratio quickly after the beginning of a new data file from a stream of data files, wherein each file has different statistical properties from the data in the immediately preceding file.
  • U.S. Patent 4,612,532 to Bacon et al., which is hereby incorporated herein by reference, discloses a system for adaptive compression and decompression of a data stream designed to compress redundancy resulting from non-uniform usage of individual symbols.
  • The Bacon invention uses an adaptive character-by-character compression technique wherein dynamically updated "followset" tables having Huffman codes are used to encode characters, using, on average, far fewer bits per character than is required by ASCII or EBCDIC encoding.
  • Each incoming character is encoded using information from the three preceding characters (character type, character type, character identity), i.e. (two bits, two bits, seven bits).
  • The Bacon invention has a high compression efficiency on a character-by-character basis and achieves high performance by using fewer processing steps, on average, to encode each character than other character-by-character encoding techniques.
  • U.S. Patent 4,558,302 to Welch discloses a string search system designed to compress redundancy resulting from frequent recurrence of symbol sequences.
  • The Welch invention includes a compressor which compresses a stream of data character signals into a compressed stream of code signals.
  • The compressor stores strings of data character signals parsed from the input data stream and searches the stream of data character signals by comparing the stream to the stored strings to determine the longest match. Having found the longest match, the compressor stores an extended string comprising the longest match plus the next data character signal following the longest match and assigns a code signal thereto.
  • A compressed stream of code signals is provided from the code signals corresponding to the stored longest matches.
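The longest-match scheme described above (find the longest stored string, emit its code, then store that string extended by the next character) can be sketched in Python as follows. This is a minimal illustration, not the cited patent's implementation: seeding the table with all 256 single-byte strings, the unbounded table size, and the name `longest_match_compress` are assumptions made for the example.

```python
def longest_match_compress(data: bytes) -> list[int]:
    """Emit one code per longest stored match, extending the table as we go."""
    # Assumption: the string table starts with every single-byte string.
    table = {bytes([i]): i for i in range(256)}
    codes = []
    current = b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            # Keep extending while the stored strings still match.
            current = candidate
        else:
            # Longest match found: emit its code, then store the
            # extended string (match + next character) under a new code.
            codes.append(table[current])
            table[candidate] = len(table)
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


print(longest_match_compress(b"ababab"))
```

For the input `ababab` the sketch emits the codes for "a" and "b", then reuses the newly stored code for "ab" twice.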
  • U.S. Patent 4,464,650 to Eastman et al. discloses an adaptive string search system designed to compress redundancy resulting from frequent recurrence of symbol sequences.
  • The Eastman invention uses the Lempel-Ziv algorithm to encode strings of characters without constraint on the length of the input or output word.
  • The Eastman invention suffers the disadvantage of requiring numerous RAM cycles per input character and utilizing time consuming and complex mathematical procedures such as multiplication and division to effect compression and decompression. These disadvantages tend to render the Eastman invention unsuitable for on-line data communications applications.
  • U.S. Patent 4,730,348 to McCrisken discloses a system for adaptive compression and decompression of a data stream using a combination of techniques to compress redundancy from non-uniform usage of individual symbols and frequent recurrence of symbol sequences.
  • The McCrisken implementation uses an adaptive character-by-character compression technique described as "bigram encoding" based on "pruned tree" Huffman and "running bigrams" to compress redundancy resulting from non-uniform usage of individual symbols.
  • McCrisken uses a plurality of encoding tables, on-line analysis of compression efficiency, an on-line table builder, a table changer and a table change code to permit rapid adaptation to changes when compressing data streams having varying statistics. McCrisken also uses a history buffer and a string substitution technique which identifies and further compresses matching strings of up to eighteen characters to compress redundancy resulting from frequent recurrence of symbol sequences. Both techniques are adaptive and therefore do not need prior knowledge of data statistics. In a preferred embodiment, some of the data stream is encoded on a character-by-character basis and some of the data stream is encoded with a string substitution code.
  • McCrisken also uses protocol emulation and packet size control to improve performance.
  • The McCrisken character-by-character compression technique has a low compression efficiency and a poor data rate performance compared to the Bacon method. This is partly because the encoding tables of McCrisken's character-by-character compression technique are updated on the basis of on-line explicit analysis of compression efficiency and this technique is very inefficient compared with the transposition heuristic used by Bacon to update his followset tables. McCrisken's use of a string substitution technique compensates to a large extent for the low compression efficiency and poor performance of the adaptive updating of the McCrisken encoding tables.
  • McCrisken does not achieve as good a compression ratio as the Eastman implementation of the Lempel-Ziv algorithm which uses no character-by-character encoding of any kind. Furthermore, because of its complexity the McCrisken implementation is inherently slow.
  • James A. Storer discusses in detail three on-line textual substitution methods (p.54), all of which use dynamically updated local dictionaries.
  • The three methods are the sliding dictionary method, the improved sliding dictionary method and the dynamic dictionary method.
  • The local dictionary of strings is stored in a "trie" structure (p.15) which is a tree where the edges are labeled by elements of the alphabet in such a way that children of a given parent are connected via edges that have distinct labels, all leaf nodes are labeled as "marked", and all internal nodes are labeled as either "marked" or "unmarked".
  • The set of strings represented by a trie are those that correspond to all root to marked node paths.
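The trie described above can be sketched as a small Python class. The names `TrieNode`, `insert` and `strings` are hypothetical, and using a Python dict for the outgoing edges is an implementation choice for this illustration only; the dict keys enforce the rule that edges under one parent carry distinct labels.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # edge label -> child node (labels are distinct)
        self.marked = False   # True if the root-to-here path is a stored string


def insert(root: TrieNode, s: str) -> None:
    """Add s to the trie, marking the node at the end of its path."""
    node = root
    for ch in s:
        node = node.children.setdefault(ch, TrieNode())
    node.marked = True


def strings(root: TrieNode, prefix: str = "") -> list[str]:
    """Return every string represented by a root-to-marked-node path."""
    out = [prefix] if root.marked else []
    for ch in sorted(root.children):
        out.extend(strings(root.children[ch], prefix + ch))
    return out


root = TrieNode()
for word in ["those", "this", "the"]:
    insert(root, word)
print(strings(root))
```

The shared prefix "th" is stored once as an unmarked internal path, so it is not itself reported as a dictionary string.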
  • The sliding dictionary method contains within its local dictionary all strings contained within a portion of the source string defined by a "sliding window" technique well known (but used for other purposes) in data communications systems. This method is similar to the method using a history buffer described by McCrisken except for the method of storing pointers to strings. It is a practical realization of the first of two universal data compression algorithms proposed by Lempel and Ziv designated by Storer (p.67) as LZ1.
  • The LZ1 algorithm works as follows. At each stage, the longest prefix of the (unread portion of the) input stream that matches a substring of the input already seen is identified as the current match.
  • A triple (d, l, c) is transmitted where d is the displacement back to a previous occurrence of this match, l is the length of the match, and c is the next input character following the current match (the transmission of c is pointer guaranteed progress).
  • The input is then advanced past the current match and the character following the current match.
  • The sliding dictionary method can be viewed as a practical implementation of LZ1 that uses fixed size pointers; instead of remembering the entire input stream the system remembers only a fixed number of characters back, and instead of pointer guaranteed progress, the system uses dictionary guaranteed progress by reserving codes for the characters of the alphabet.
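The LZ1 steps above can be sketched in Python as an encoder that emits (displacement, length, next character) triples and a matching decoder. This is a naive illustration under stated assumptions: it searches the entire input already seen (no fixed window), restricts a match to text strictly before the current position, and always reserves the final character for the third field. The names `lz1_encode` and `lz1_decode` are hypothetical.

```python
def lz1_encode(data: str) -> list[tuple[int, int, str]]:
    """Encode data as (displacement, length, next_char) triples."""
    triples = []
    pos = 0
    while pos < len(data):
        best_len, best_disp = 0, 0
        max_len = len(data) - pos - 1  # keep one character to send as c
        for start in range(pos):
            length = 0
            while (length < max_len
                   and start + length < pos
                   and data[start + length] == data[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_disp = length, pos - start
        triples.append((best_disp, best_len, data[pos + best_len]))
        pos += best_len + 1  # advance past the match and the next character
    return triples


def lz1_decode(triples: list[tuple[int, int, str]]) -> str:
    """Rebuild the text by copying each match, then appending c."""
    out = []
    for disp, length, ch in triples:
        for _ in range(length):
            out.append(out[-disp])
        out.append(ch)
    return "".join(out)


encoded = lz1_encode("abababc")
print(encoded)
print(lz1_decode(encoded))
```

Because every triple carries a literal character, the encoder always advances by at least one position, which is the "pointer guaranteed progress" mentioned in the text.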
  • The improved sliding dictionary method contains a heuristic that eliminates duplicate strings. It too requires that the alphabet be added initially in the local dictionary.
  • The dynamic dictionary method uses update and deletion heuristics that maintain a collection of strings that do not, in general, form a contiguous portion of the input stream.
  • Update and delete heuristics, i.e. mechanisms which provide learning capability
  • Both the improved sliding dictionary method and the dynamic dictionary method create and maintain a dictionary that is different from the history buffer of McCrisken.
  • Most of the heuristics described by Storer are directed to the maintenance of pointer sets for the special dictionaries.
  • Textual substitution methods achieve higher compression ratios with large files and dictionaries.
  • Storer, in his patent, describes a string search data compression system that uses a sliding dictionary that is stored as a tree ("trie") structure. This approach provides fast access to dictionary entries, but updating the tree structure loads the processor heavily, so Storer uses sophisticated update heuristics.
  • McCrisken describes a string search data compression system that uses a history buffer. The McCrisken approach provides fast updating of the history buffer but, in this case, string matching loads the processor heavily. McCrisken resolves this with arbitrary cut-off of his search process.
  • The invention provides a system for the dynamic encoding of a character stream.
  • A preferred embodiment of the system comprises a single character encoder which includes a plurality of fonts, a string encoder which includes a history buffer, and an output selector which compares encodings from the single character encoder and the string encoder and selects the least cost encoding for output.
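The role of the output selector can be sketched as a comparison of bit costs. The text here does not specify how the costs are compared, so the rule below (total single-character bits for the span versus the bits of one string encoding covering the same span) and the name `select_encoding` are assumptions made purely for illustration.

```python
from typing import Optional, Sequence


def select_encoding(font_bits: Sequence[int],
                    string_bits: Optional[int]) -> tuple[str, int]:
    """Pick the cheaper encoding for one span of characters.

    font_bits   -- bit cost of each character if encoded singly
    string_bits -- bit cost of one string code covering the same span,
                   or None when the string encoder found no match
    """
    single_cost = sum(font_bits)
    if string_bits is not None and string_bits < single_cost:
        return ("string", string_bits)
    return ("single", single_cost)


print(select_encoding([3, 4, 5, 6], 12))  # string code is cheaper here
print(select_encoding([2, 2, 2], 12))     # single characters are cheaper here
```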
  • The single character encoder generates and stores hash codes which it uses for font access.
  • The string encoder retrieves these same hash codes and uses them for history buffer access.
  • The hash codes are generated by applying a CRC algorithm to a character pair and are given the name "CRC hash".
  • The single character encoder maintains a position in a font for all characters not otherwise listed in the font, such characters herein called "new character", and four tables are maintained for the encoding of such characters.
  • The single character encoder also maintains a position in a font for a symbol representing a string, which position directly follows the position of new character in the font. Three or more consecutive like characters are represented in the history buffer by three characters only.
  • A pair encoder is provided that encodes character pairs using the font number. The pair encoder may be active at the same time as the string encoder. Two string encoding modes are provided.
  • A switch controls activation and deactivation of string search processes based on a comparison of the average bit cost of new character encoding with a predetermined value.
  • A hash-link/hash-test table is provided in the string search encoder having entries corresponding to every second character position in the history buffer. This table uses properties of the CRC hash to access matching strings in the history buffer. String match testing starts "n" characters beyond the current character where "n" is the length of the longest match found so far. Accordingly, the string search encoder, in addition to searching forward, also searches back. The string search encoder discards a string match that has less than a predetermined number of characters. Linked lists of pointers to candidate strings are maintained and the end of the linked list is determined using a property of the CRC hash.
  • Fig. 1A is a block diagram and overview of the main buffers, tables and processes of the preferred embodiment of the present invention.
  • Fig. 1B shows the two phases of the encoding process of Fig. 1A.
  • Fig. 2A illustrates the loading of the data stream into the CC buffer.
  • Fig. 2B illustrates the relationships among the encoding buffers, tables and processes of the preferred embodiment of the present invention.
  • Fig. 3 illustrates the fonts used in the font encoder.
  • Fig. 4A shows the global (Huffman) font encoding tables.
  • Fig. 4B shows the Huffman Tables used for encoding New Character and for encoding String Length in Mode A.
  • Fig. 4C shows the Huffman Tables used for encoding String Length in Mode B.
  • Fig. 4D shows the Huffman Tables used for encoding Zone Code in Mode A and Mode B.
  • Fig. 5 illustrates the font access tables used in the font encoding process.
  • Fig. 6 illustrates the generation and use of the CRC hash.
  • Fig. 7A illustrates the new character encoding process.
  • Fig. 7B illustrates the Pair Encoding, Mode A process.
  • Fig. 7C illustrates the String Encoding, Mode A process.
  • Fig. 8A illustrates the use of the history buffer access tables for mode A string encoding.
  • Fig. 8B illustrates the use of the history buffer access tables for mode B string encoding.
  • Fig. 9 shows the start points for string searches.
  • Figs. 10A and 10B show the decoding logic.
  • Fig. 11 shows the dual processor configuration.
  • Fig. 12 shows the prior art processor configuration.
  • Fig. 13 shows a conventional two-processor configuration.
  • The present invention in a preferred embodiment combines a novel adaptive font encoding single-character compression technique with a repeat character compression technique and several novel string encoding compression techniques. It includes an adaptive font encoding process that is an improved version of the efficient, high performance font encoding process disclosed by Bacon et al. in U.S. Patent 4,612,532. It includes several novel string encoding processes. It further includes a novel data compressibility trending function which is used to select the most effective encoding process according to the compressibility of the data.
  • The font encoding process and the string encoding process of a preferred embodiment share memory and processes associated with the generation of a novel "CRC hash" using a CRC algorithm, a portion of the CRC hash being used as a hash code for font and dictionary addressing and another portion being used for identification.
  • The present invention achieves superior compression ratios and superior performance over the prior art described above.
  • A copy of the source code of the preferred embodiment of the present invention, expressed in the assembly language of the Rockwell C-19 processor, is attached hereto as Appendix 1.
  • A guide to the source code listing is given in Appendix 2.
  • A general overview of a preferred embodiment of the system is shown in Fig. 1A.
  • The system provides full duplex operation and it is generally divided into an encoder 1 and a decoder 2 such that each contains its own set of buffers (encoder: PC In Buffer 4, Process Buffer 5, History Buffer 6, and Modem Out Buffer 7; decoder: Modem In Buffer 8,
  • Fig. 1A shows the main tasks performed by the encoder software (Load Process Buffers, Do Font Encoding, Update Fonts, Do String Encoding, Select Least-Cost Encoding, Update History Buffers, and Format and Output) 15 and the decoder software (Receive Bit Stream, Interpret Escape Codes, Decode Single Characters and Strings, Load PC Out Buffer, Update History Buffer, and Update Fonts) 16.
  • Fig. 1B shows the two phases of encoding.
  • Phase 1 processes (steps 1-10) 17, including Loading Process Buffer, Doing Font Encoding and Repeat Character Encoding, and Updating Font, are performed once for each character of input.
  • Phase 2 processes (steps 11-20) 18, including String Encoding, Selecting Least-Cost Encoding, Formatting For Output, and Updating Buffers, are performed, typically, when the process buffer is full.
  • Test 19 following Phase 1 is "Process Buffer Full or Flush".
  • Test 20 following Phase 2 is "Flush and Process Buffer not Empty".
  • String encoding includes string encoding mode A, or string encoding mode B which combines string encoding with pair encoding. The decoder performs corresponding decoding processes.
  • The character stream 20 enters the CC buffer as shown in Fig. 2A.
  • The CC buffer consists of ECChar 21 which contains 256 bytes representing the most recent characters from the data stream and ECCharCopy 22 which contains an identical copy of the content of ECChar.
  • ECCharCopy is provided to remove the necessity for boundary checking in the string matching process.
  • Fig. 2A shows string continuation for searching 23 extending into ECCharCopy.
  • ECCharCopy is contiguous with ECChar in memory.
  • Fig. 2A also shows the next store location in ECChar 24 and in ECCharCopy 25, and old data 26.
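The mirrored layout of ECChar and ECCharCopy can be sketched as follows. This is a simplified model under assumptions: the class name `MirroredBuffer` and its methods are hypothetical, and only the 256-byte size and the contiguous identical copy are taken from the text. Each stored byte is written to both halves, so a string that starts anywhere in the first half can be read forward without any wrap-around test.

```python
SIZE = 256  # bytes in ECChar, per the description


class MirroredBuffer:
    def __init__(self):
        # First half models ECChar, second half models ECCharCopy.
        self.mem = bytearray(2 * SIZE)
        self.next = 0  # next store location within the first half

    def store(self, byte: int) -> None:
        """Write the byte to both halves and advance circularly."""
        self.mem[self.next] = byte
        self.mem[self.next + SIZE] = byte
        self.next = (self.next + 1) % SIZE

    def window(self, start: int, length: int) -> bytes:
        """Read up to SIZE bytes from start (< SIZE) with no boundary check."""
        return bytes(self.mem[start:start + length])


buf = MirroredBuffer()
for i in range(300):
    buf.store(i % 251)
# A read that begins near the end of the first half continues into the copy.
print(list(buf.window(250, 10)))
```

The cost of the technique is the doubled memory and the second write per character; the benefit is that the inner comparison loop of string matching needs no modulo arithmetic.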
  • The ECChar and ECCharCopy buffers are two of nine process buffers, shown in Fig. 2B, which operate in parallel and share input and output pointers. These buffers are used by the font encoding and string encoding processes.
  • Fonts 31 are shown in Fig. 3.
  • Fig. 3 shows a table of fonts 31 having 1024 font numbers 32, an FTLink field 33, an FTMatch field 34, an FTNC field (NewCharPosition) 35, an FTSize field 36, and Font Character fields (6 per Font max) 37.
  • Huffman encoding tables are shown in Figs. 4A-4D.
  • Fig. 4A shows global (Huffman) font encoding tables including an Access Table 41 having an index 42, a Font Code table 43 and a Font Bits table 44.
  • Fig. 4B shows the Huffman Global Code (Frequency) Tables, used for encoding New Character and for encoding String Length in Mode A.
  • The tables have 256 Table Entries, a Code Length of 4-13 bits and are referenced as "Global Code High; Global Code Low" in the source code.
  • Fig. 4C shows the Huffman Tables used for encoding String Length in Mode B. These tables have 10 table entries, a code length of 1-6 bits, and are referenced as "LengthBCode" in the source code.
  • Fig. 4D shows the Huffman tables used for encoding Zone Code in Mode A and Mode B. These tables have 32 table entries, a Code Length of 2-7 bits and are referenced as "ZoneCode" in the source code. Font access tables 51 along with a font table 31 and an input data stream 52 are shown in Fig. 5.
  • The font access tables include a CRC Hash Table 53 having a CRC Hash 54, MatchVal data 55 and RoughAdr data 56.
  • The font access tables also include an FTRough Table 59 having an index 57 and FTRough data 58.
  • The history buffer and history buffer access tables used for string search are shown in Figs. 8A and 8B.
  • The entire compound encoding process includes: 1. Repeat character encoding;
  • Font encoding uses a set of fonts having character symbols stored in approximate order of the frequency of occurrence of such character after the occurrence of a pair of characters with which the font is associated. For example, if the input data stream contained the words "this" and "those", then a font would exist associated with the pair of characters "th" and the font would contain the letter "i" and the letter "o".
  • A single font consists of pointers, links, characters, etc. whose selection (font number) is based on the prior two characters in the input stream and which contains a list of historically occurring candidate characters to be matched with Encoder Current Character.
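The "this"/"those" example above can be reproduced with a small Python sketch that groups each character under the pair of characters preceding it. This is a deliberate simplification: it ignores frequency ordering, the per-font size limit, the NewChar position and the hash-based font numbering, and the name `build_fonts` is hypothetical.

```python
from collections import defaultdict


def build_fonts(text: str) -> dict[str, list[str]]:
    """Map each two-character context to the characters seen after it."""
    fonts = defaultdict(list)
    for i in range(2, len(text)):
        pair, ch = text[i - 2:i], text[i]
        if ch not in fonts[pair]:
            fonts[pair].append(ch)  # first-seen order, not frequency order
    return fonts


fonts = build_fonts("this those")
print(fonts["th"])
```

In this sketch the font for the pair "th" ends up holding "i" and "o", matching the example in the text.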
  • Fig. 3 shows an array of fonts. A single font is illustrated by a single row.
  • New Character, i.e., any character that is not otherwise listed in the table, is also assigned a position in the table in approximate order of such characters' local frequency of occurrence after the occurrence of a pair of characters with which the table is associated.
  • New Character is hereinbelow referred to as "NewChar" and sometimes abbreviated as "NC".
  • The occurrence of NewChar is a font encoding event wherein either the Encoder Current Character is not found in the selected font or the selected font does not exist.
  • NewChar Position is a dynamic value in the range of 0 through n (where n is the maximum number of characters per font) meaning "Character Not in Table". NewChar does not occupy a character position in the font: it is assigned a "virtual position".
  • Fig. 3 shows how the position of NewChar is stored in field FTNC in the font.
  • In mode A, each font includes a virtual position for a string directly following the NewChar position.
  • In mode B, each font includes a virtual position for a "pair encoding" directly following the NewChar position and includes another virtual position for string encoding following the pair encoding.
  • Font Encoding, CRC Hash and Font Access
  • Font access tables are shown in Fig. 5.
  • Fig. 6 shows how the hash pointer (RoughAdr) and the match value (MatchVal) are derived from the CRC hash.
  • Font encoding includes the following steps:
  • The CRC hash for the two prior characters ("S" and "P") is created as follows.
  • A CRC function (CCITT polynomial x^16 + x^12 + x^5 + 1) is performed on the character S and then on P, yielding (65) a sixteen-bit CRC result (64, see Fig. 6), hereinbelow referred to as "the CRC hash", indicative of its function in the present invention.
  • The CRC hash is used as follows: a) The ten least significant bits of the CRC hash are extracted and stored as RoughAdr (62, see Fig. 6) for use as a hash pointer.
  • The CRC hash has two very important properties: i) Its ten least significant bits provide a hash code having excellent statistical properties for use in hashing. ii) The sixteen-bit result produced by every possible two-byte combination is unique. No two-byte combination shares a sixteen-bit result with another two-byte combination so the sixteen-bit result may be used to provide one-to-one mapping with the original two bytes.
  • The ten least significant bits may be used as a hash code to access a table and the remaining six bits may be used to test if this is the specific font assigned to that exact character pair.
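The generation and splitting of the CRC hash can be sketched in Python. Several details are assumptions rather than statements from the text: the CRC is computed most-significant-bit first with an initial value of zero, and MatchVal is represented as the high byte with its two low bits cleared, which is inferred from the worked values given later in the description (7DD6 yielding 7C and 1D6, and 05D6 yielding 04 and 1D6). The function names are hypothetical, and the bit ordering of the original assembly implementation may differ.

```python
CCITT_POLY = 0x1021  # x^16 + x^12 + x^5 + 1


def crc16_step(crc: int, byte: int) -> int:
    """Feed one byte into a 16-bit CRC (MSB first; an assumed convention)."""
    crc ^= byte << 8
    for _ in range(8):
        if crc & 0x8000:
            crc = ((crc << 1) ^ CCITT_POLY) & 0xFFFF
        else:
            crc = (crc << 1) & 0xFFFF
    return crc


def crc_hash(first: int, second: int) -> int:
    """CRC of the two prior characters, starting from zero (assumed)."""
    return crc16_step(crc16_step(0, first), second)


def split_hash(h: int) -> tuple[int, int]:
    """Return (MatchVal, RoughAdr) for a sixteen-bit CRC hash."""
    rough_adr = h & 0x03FF        # ten least significant bits
    match_val = (h >> 8) & 0xFC   # remaining six bits, kept in place
    return match_val, rough_adr


print([hex(v) for v in split_hash(0x7DD6)])
# Every two-byte combination yields a distinct sixteen-bit result.
print(len({crc_hash(a, b) for a in range(256) for b in range(256)}))
```

The uniqueness check counts 65536 distinct results for the 65536 possible character pairs, which is what allows RoughAdr plus MatchVal to identify a pair exactly.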
  • The CRC hash is used in font encoding and for history buffer access in string encoding mode A and string encoding mode B. It provides significant benefit in reducing the average amount of processor time consumed in accessing the fonts and history buffer, thereby enabling a given processor to handle higher encoding throughput rates.
  • The use of the CRC hash, as described herein below, by virtue of the throughput rate benefits, also provides a practical realization of trigram font encoding. The combination of the CRC hash and MatchVal will always identify uniquely the font associated with the prior two characters.
  • Fig. 4A shows a set of global Huffman tables and the associated access table.
  • The Access Table 41 is indexed by Font Size 42 and contains pointers to the several Huffman tables for Font Code 43 and Font Bits 44 (the bit cost of the encoding).
  • The Access Table is named "Encoding Table" in the source code. Index 0, and the corresponding Font Code (1,0) and Font Bits (1,1) are not used. ECFontIndex is computed and stored during font encoding. Later, during string encoding, FontBits is retrieved and, during the output process, FontCode is retrieved.
  • Figs. 4B, 4C and 4D each show a single Huffman table.
  • Fig. 4B shows the table used for new character encoding, for string length encoding and for repeats encoding.
  • Fig. 4C shows the tables used for string length, mode B encoding.
  • Fig. 4D shows the table used for the zone portion of string address encoding, mode A and mode B.
  • Fig. 5 shows the static state of the Font Encoding and Access Tables directly after processing the string. Beginning at an initial state having empty fonts, the process of encoding the first character proceeds as follows. 1. Initialization and Assignment of the first Font. As described above, each new character to be encoded is associated with a CRC hash. The ten least significant bits of the CRC hash 56 are used as a pointer to the ECFTRoughTable 59 (Encoder Font Rough Table). Since all fonts are empty at the outset, the ECFTRoughTable is initially null, indicating the need for new font creation. A font number is assigned and stored in the ECFTRoughTable in the position pointed to by the hash ("000" in the example given in Fig. 5). This font number is either the next available not-in-use font or an old font selected as described later.
  • Font Character Encoder Current Character
  • Font Characters N/A Following table reset, the first character to be processed is the " ⁇ ". The prior two characters and the CRC are assumed to be 0. Thus a MatchVal and ten-bit RoughAdr of 0 are used. This points to FTRoughTable entry number 0 (which was initially null) and font number 1 was assigned. Font number 1 was initialized as specified above and has not changed since, as indicated by Fig. 5.
  • the current character is found, its position, the size of the font and other pertinent data are stored in the process buffers for later use by the encoding selection process.
  • the character matching Encoder Current Character is promoted towards the top of the table (higher frequency) by exchange with the next higher frequency entity (character or NC) .
  • NC next higher frequency entity
  • the current character is not found, it is added to the table in the next available position (overwriting the last character when the table is full) and the table size is incremented (if not full) .
  • the NC value is promoted one position towards the top of the table unless already at the top (highest frequency) .
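  • The update rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not the listing of Appendix 1: it assumes the NC (new character) marker is kept as a sentinel entry inside one ordered list whose index 0 is the highest-frequency position. The class name `Font`, the `capacity` parameter and the `NC` sentinel value are hypothetical.

```python
NC = -1  # hypothetical sentinel standing for the "new character" position


class Font:
    """Ordered list of entities; index 0 is the highest-frequency slot."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity   # assumed maximum number of characters
        self.entries = [NC]        # an empty font holds only the NC marker

    def _promote(self, i):
        # Exchange entry i with the next higher-frequency entity, if any.
        if i > 0:
            self.entries[i - 1], self.entries[i] = self.entries[i], self.entries[i - 1]

    def update(self, ch):
        """Return the position where ch was found, or None for a new character."""
        if ch in self.entries:
            pos = self.entries.index(ch)
            self._promote(pos)                 # found: move one step towards the top
            return pos
        if len(self.entries) - 1 < self.capacity:
            self.entries.append(ch)            # next available position
        else:
            # Table full: overwrite the last character (skip NC if it is last).
            last = len(self.entries) - 1
            if self.entries[last] == NC:
                last -= 1
            self.entries[last] = ch
        self._promote(self.entries.index(NC))  # NC moves one step towards the top
        return None
```

In this sketch a character that keeps recurring climbs one position per occurrence, while a run of unseen characters pushes the NC marker upward, so whichever event is more common ends up with the shorter code.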
  • Font Encoding Example 2. Finding the Current Font Using the Link Table. If FTMatch does not equal MatchVal, FTLink is examined. If FTLink is null, then the FTLink field is assigned the next font number and the flow joins step 1 above for the creation of a new font. If FTLink is not null, control proceeds to FTMatch comparison in step 2 with the FTLink field as the new font number. Linking and match comparison repeat until either the desired font is found or a new one is created.
  • the last line of the input data stream in Fig. 5 details the "o" character from the sequence "A A do”.
  • the calculated CRC hash for "*d” is 7DD6 which yields a MatchValue of 7C and a RoughAdr of 1D6.
  • the sequence "in”, seven characters earlier, produced a CRC hash of 05D6, MatchValue of 04 and RoughAdr of 1D6.
  • Access to entry 1D6 in the ECFTHashRough Table yields a pointer to font number 000C, but the FTMatch field in font 000C does not equal the desired value of 7C.
  • The FTLink field of font number 000C was set to NULL. Consequently, font number 0013 was assigned, set to initial state and the character "o" was added to it.
  • A future occurrence of the sequence " d" can link to font number 0013 via font number 000C and search for or add characters as required.
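  • The lookup in this example can be sketched as follows. The split of the hash is inferred from the two worked values above (7DD6 and 05D6 both give RoughAdr 1D6, with MatchValues 7C and 04): the low ten bits form the rough address and the remaining six bits, left in place in the high byte, form the match value. The class `FontDirectory`, its dictionaries and the sequential font numbering are hypothetical stand-ins for the rough table and the FTMatch/FTLink fields.

```python
def split_hash(crc_hash):
    """Split a 16-bit hash into (MatchVal, RoughAdr) as in the Fig. 5 example."""
    rough_adr = crc_hash & 0x3FF          # ten least significant bits
    match_val = (crc_hash >> 8) & 0xFC    # remaining six bits, kept in place
    return match_val, rough_adr


class FontDirectory:
    """Rough table plus per-font match and link fields (illustrative only)."""

    def __init__(self):
        self.rough = {}     # RoughAdr -> font number (missing key means null)
        self.match = {}     # font number -> FTMatch
        self.link = {}      # font number -> FTLink (missing key means null)
        self.next_font = 1  # assumed: fonts are handed out sequentially

    def _new_font(self, match_val):
        number = self.next_font
        self.next_font += 1
        self.match[number] = match_val
        return number

    def find_or_create(self, crc_hash):
        match_val, rough_adr = split_hash(crc_hash)
        if rough_adr not in self.rough:          # null rough entry: new font
            number = self._new_font(match_val)
            self.rough[rough_adr] = number
            return number
        number = self.rough[rough_adr]
        while self.match[number] != match_val:   # follow the link chain
            if number not in self.link:          # null link: create and attach
                created = self._new_font(match_val)
                self.link[number] = created
                return created
            number = self.link[number]
        return number
```

Two contexts that collide on the rough address, as "in" and the later two-character context do in the example, end up as consecutive fonts on one link chain.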
  • Fig. 7A shows the encoding of character "w” which follows, in the character stream 701, "No".
  • Mode A font size (SZ) + 2.
  • Position "2" 715 in these tables yields 709 the bit string "000" in the Global Font Encoding Tables 710 for either Mode A 711 or Mode B 712.
  • String "000” will be transmitted by the encoder and will be recognized by the decoder as the "new character escape”. This will indicate to the decoder that the next bits to be received are the encoding of a new character.
  • Bits 5 and 6 706 from the prior character are used to select one of these four NC to frequency tables (in this example NC to FreqTable 11 707).
  • The binary value of "w" (77 in hexadecimal, 708 in Fig. 7A) is used to enter the selected NC to Frequency table, yielding a position (or frequency) of 15, which defines an entry into the Global Code High/Low Table 716.
  • This table, in turn, yields the Huffman code 01111, the font encoding of new character "w" following "No".
  • The output bit stream sequence 717 is therefore 000 (font) followed by 01111 (frequency).
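  • The table selection in this example can be checked with a short sketch. It assumes bits are numbered from 0 at the least significant end, which is consistent with the prior character "o" (6F hexadecimal) selecting table 11. The function names are hypothetical, and the two code strings are simply the values quoted in the example, not entries read from the actual tables.

```python
def select_nc_table(prior_char):
    """Return the index (0-3) of the NC-to-frequency table chosen by bits 5 and 6."""
    return (prior_char >> 5) & 0b11


def new_char_bits(escape_code, frequency_code):
    """Concatenate the new character escape and the frequency code."""
    return escape_code + frequency_code


table = select_nc_table(ord("o"))          # prior character in the "No" example
bits = new_char_bits("000", "01111")       # codes quoted in the Fig. 7A example
```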
  • The use of four tables instead of the one table described in U.S. Patent 4,612,532 to Bacon et al. is found to improve compression efficiency. Of course, more or fewer than four tables could be used.
  • The process buffers, shown in Fig. 2B, consist of nine "First In/First Out" buffers 201-209, each having 256 locations, which operate in parallel and share input and output pointers. These buffers are used by the font encoding and string encoding processes.
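  • One way to picture nine parallel FIFO buffers that share a single pair of pointers is a set of equal-length columns indexed by the same slot number. The sketch below is illustrative only: the field names follow the buffer names used in this description (with ECHashRaw and ECHashX2 each treated as one two-byte field), while the class `ProcessBuffers` and its `push`/`pop` methods are hypothetical.

```python
BUFFER_SLOTS = 256

# Buffers 201-209 as named in the description.
FIELDS = ("ECRepeats", "ECCharCopy", "ECChar", "ECHashRaw", "ECHashX2",
          "ECNewIndex", "ECFontIndex", "ECFrequency", "ECType")


class ProcessBuffers:
    """Parallel circular buffers sharing one input and one output pointer."""

    def __init__(self):
        self.columns = {name: [0] * BUFFER_SLOTS for name in FIELDS}
        self.in_ptr = 0    # next slot to fill
        self.out_ptr = 0   # oldest slot not yet output
        self.count = 0

    def push(self, **values):
        if self.count == BUFFER_SLOTS:
            raise OverflowError("process buffers are full")
        for name in FIELDS:
            # Fields not supplied are cleared so stale data never leaks through.
            self.columns[name][self.in_ptr] = values.get(name, 0)
        self.in_ptr = (self.in_ptr + 1) % BUFFER_SLOTS
        self.count += 1

    def pop(self):
        if self.count == 0:
            raise IndexError("process buffers are empty")
        row = {name: col[self.out_ptr] for name, col in self.columns.items()}
        self.out_ptr = (self.out_ptr + 1) % BUFFER_SLOTS
        self.count -= 1
        return row
```

Because every column is indexed by the same slot, the later phases can read all the data recorded for a character with one pointer.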
  • Fig. 2B shows the flow of font encoding data among the process buffers and various tables. The contents and significance of the several buffers are as follows:
  • the ECChar buffer 203 contains the most recent 256 characters from the input stream to be encoded. Characters are received singly from the input stream, placed in rotation in ECChar, font encoded, and later string encoded. Least-cost selection and output formatting follow.
  • The value range of ECChar is 0 - 255.
  • the ECCharCopy 202 buffer contains an exact copy of the ECChar buffer. It is contiguous with ECChar to facilitate string searching.
  • the value range of ECCharCopy is 0 - 255.
  • ECType 209 is a steering value which is set by the font encoding and/or the string encoding process.
  • ECType is used by the output format process to control the output bit stream.
  • ECType may have any one of the following values:
      0 - String or pair continuation (the second or subsequent character of a mode A string or a mode B string, or the second character of a pair encoding).
      2 - Font encoding. The encoding is the relative offset of the character in the selected Font.
  • ECFontIndex 207 is the zero relative index into the FontCode or FontBits tables for this character. By using the value of ECFontIndex as an index, either the encoding size in bits or the actual encoding bit pattern can be accessed quickly.
  • The value range of ECFontIndex is 2 - 43 as shown in Fig. 4A.
  • ECFrequency 208 is the frequency value of the character. It is obtained by using the character as an index into the NC to FreqTables (Fig. 7A) . The value range of ECFrequency is 0 - 255.
  • ECHashRaw0 2040 contains the eight least significant bits of the CRC hash computed from the prior two characters in the input stream.
  • The value range of ECHashRaw0 is 0 - 255. Data is shown in hexadecimal in Fig. 2B.
  • ECHashRaw1 2041 contains the eight most significant bits of the CRC hash computed from the prior two characters in the input stream.
  • The value range of ECHashRaw1 is 0 - 255.
  • ECHashX20 2050 contains the eight least significant bits of zero relative font number multiplied by two. This value is maintained for quick access to the ECFTHashNext table.
  • the value range of ECHashX20 is 0 - 254, even numbers. Data is shown in hexadecimal in Fig. 2B.
  • ECHashX21 2051 contains the eight most significant bits of zero relative font number multiplied by two. This value is maintained for quick access to the ECFTHashNext table.
  • The value range of ECHashX21 is 0 - (((MaxFontTable-1)*2)/256).
  • ECNewIndex 206 is the zero relative index into FontCode or FontBits representing the New Character position in this Font.
  • the value of ECNewIndex is derived from Font Size and font-relative new character position. (During font encoding, ECNewIndex is computed and stored. Later, during string encoding, FontBits is retrieved and, during the output process, FontCode is retrieved. See Fig. 4A.) Similarly for pair encoding and/or string escapes, the value of ECNewIndex is incremented by 1 or 2 and the bit cost or pattern quickly determined. The value range of ECNewIndex is 2 - 41.
  • ECRepeats 201 is the count of repeats of this character beyond two. That is, the two prior characters are the same as this one. The buffer pointer will not advance as long as subsequent input characters remain the same and ECRepeats is less than or equal to 255. The value range of ECRepeats is 0 - 255. Font Encoding Process Flow
  • Font encoding process flow is shown in Fig. 1B, first phase, steps 1 through 10. Font encoding and font update processing are performed in steps 1 through 10. This series of steps occurs once for each character of input. In this process, known as "refill", a character is added to the process buffer and the current input pointer is advanced by one.
  • The steps (shown in Fig. 2B as S1, S2, S3, etc., corresponding to step 1, step 2, step 3, etc.) are as follows:
      1. A character from the input stream is fetched and stored in the current input ECChar field 210.
      2. The same character is stored in the current ECCharCopy field 211 (the relationship of ECChar and ECCharCopy is shown in Fig. 2A).
      3. The current character is compared with the two prior characters in the input stream. If equal, the ECRepeats field is incremented (e.g. 212 in Fig. 2B) and, if the ECRepeats field is less than or equal to 255, flow proceeds to step 1 above. This loop ensures that no more than three consecutive like characters are stored in the history buffer (except when the number of consecutive like characters exceeds 258).
      4. The CRC hash is computed on the two prior characters in the input stream (as described under
  • the appropriate font 214 is accessed or created (as described hereinabove) . If the font exists, the font number from FT Rough Table 215 is stored in the ECHashX2 table high and low bytes (217 and
  • the index value fetched in step 6 is added to the NC (NewChar position) 219 from the font accessed in step 5.
  • the result 220 is stored in the ECNewIndex for later use as a NewChar or String Escape. If the current character (from step 1) was not found in the accessed font, the ECType field is set to 4 denoting a NewChar encoding.
  • The raw position 221 of that character in the font is added to the index value 222 fetched in step 6 and the result 223 stored in the current ECFontIndex field. If the character position is greater than or equal to the NC (NewChar) value from the font, the ECFontIndex field is incremented by 2 if in Mode A and by 3 if in Mode B, allowing for the virtual positions of the NewChar, Pair Encoding and/or String Escapes.
  • the ECType field is set to 2 224 denoting that the character was found in the font, a "Font Encoding".
  • If the number of characters in the process buffer array is now 256, or the Font Trending Switch changed from Mode A to B (or vice versa), or a timer-initiated flush occurs, flow proceeds to step 11 below for string processing and output. Otherwise flow proceeds to step 1 above.
  • Steps 11 through 20, including string search, least cost encoding selection and output, are described hereinbelow under "Second Phase Processing". Font Reallocation: As input context changes, old fonts go out of use and new ones are created. Since there is a limit to the number of practical (actual) fonts in a preferred embodiment (e.g. 1024), a method for reassigning fonts is required. In the preferred embodiment this is a circular (low to high then back to low) replacement heuristic. An alternative embodiment may also include a "less recently used" heuristic. The next three paragraphs describe the combination. (The source listing of Appendix 1 details the circular heuristic only.)
  • an unused bit of the FTHashNext field is set to 1 indicating activity.
  • As the reallocation process traverses the fonts, it will reset the activity bit if it is set and link to the next candidate font. If the activity bit is already reset, the font will be reallocated as a new font.
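  • The combined heuristic (circular traversal plus an activity bit) behaves like the familiar "second chance" or clock replacement scheme. The sketch below illustrates that idea only. The class `FontAllocator`, the method names and the list of booleans standing in for the unused FTHashNext bit are assumptions, and the rule that a font's activity bit is set whenever the font is used is inferred from the surrounding text.

```python
class FontAllocator:
    """Circular font reallocation with one activity bit per font."""

    def __init__(self, max_fonts):
        self.active = [False] * max_fonts  # stand-in for the spare FTHashNext bit
        self.hand = 0                      # circular scan position, low to high

    def touch(self, font):
        # Assumed: called whenever a font is used, marking it as active.
        self.active[font] = True

    def reallocate(self):
        """Return the number of the font to be reused as a new font."""
        while self.active[self.hand]:
            self.active[self.hand] = False             # reset the bit, spare the font
            self.hand = (self.hand + 1) % len(self.active)
        victim = self.hand                             # bit already reset: reuse it
        self.hand = (self.hand + 1) % len(self.active)
        return victim
```

A font survives one pass of the scan each time it is used, so fonts belonging to the current context are skipped while idle fonts are recycled first.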
  • the string encoder of the present invention uses a circular history buffer to store a sliding dictionary.
  • the history buffer is a dictionary of all the strings it contains.
  • String encoding may operate in one of two modes, mode A (using the tables in Fig. 8A) for use on relatively compressible text or mode B (using the tables in Fig. 8B) for use on less compressible text. In both modes, string encoding is designed to achieve near-optimum compression efficiency under the time constraints of on-line operation.
  • the history buffer is tagged at regular intervals and, in a preferred embodiment, is tagged every second character position.
  • the string encoder of the present invention also uses a novel dictionary access structure having a set of tables for accessing the history buffer.
  • Updating the history buffer involves very little processing because it involves no more than accepting the next character and incrementing a pointer.
  • updating the dictionary access structure is as challenging a problem as updating the sliding dictionary in string encoding systems which store the sliding dictionary as a tree structure.
  • the present invention addresses this problem by the use of a novel history buffer access method.
  • the method is based on the structure of the history buffer access tables as shown in Figs. 8A and 8B and it retrieves and uses the same CRC hash codes created and used in the font update process during font encoding. Accordingly, by use of this method, updating of the dictionary access structure is faster and requires less processing than updating a tree structure would require.
  • a tagged history buffer provides additional benefit for accessing and matching strings.
  • String encoding mode A using a tagged history buffer, locates longer strings in a shorter time than earlier methods. While the process searches the same number of candidates, the process encounters shorter linked lists in the access buffers than would otherwise occur. Processing time spent building and searching access tables is beneficially reduced.
  • the history buffer/dictionary access structure includes a history buffer and access tables.
  • the history buffer and the access tables shown in Figs. 8A and 8B are used by the string encoding process of mode A and the string encoding process of mode B respectively.
  • Both Figs. 8A and 8B show an ECRR (History) Buffer (1 byte wide) 81 with a Next Available Buffer Position 82 and an ECRR Suffix (256 positions) 83. Both Figs. show an ECRR Hash Head Buffer (2 bytes wide) 84. Both Figs. show an ECRR Hash Buffer containing an ECRR Hash Link portion (2 bytes wide) 85 and an ECRR Hash Test Portion (2 bytes wide) 86. Both Figs. show the derivation 87 and 88 of the CRC hash used as entry to the ECRR Hash Head Buffer 84. Both string encoding mode A and string encoding mode B use the CRC hash created earlier during font encoding and stored in the ECHashRaw table (see Fig. 2B).
  • String encoding mode B uses the CRC hash (a hash based on two consecutive characters) directly.
  • String encoding mode A uses a novel algorithm (which includes the CRC hash) to create a hash based on two consecutive pairs of characters (four consecutive characters) as illustrated by the following example for the four characters "THEY”: "TH" [CRC hash] yields XXXX (16 bits)
  • FIG. 9 shows a Data Stream 91, a History Buffer ECRR 92 with a Next Available Buffer Position 93, a CC Buffer 94 with a Current Character 95, and a Pointer "p" 96.
  • The pointer 96 is shown for Mode A to have a First Start Point for String Search 901 displaced 3 characters from the position of the current character and a Second Start Point for String Search 902 displaced 2 characters from the position of the current character.
  • the linked list to be searched is, on average, only one-half the size it would otherwise be (the list is drawn from a population of candidates only one-half the size it would otherwise be) .
  • Less memory is required for the ECRR Hash-Link/Hash-Test Table because, in the preferred embodiment, it is only one-half the size it would otherwise be.
  • the end of the linked list is determined dynamically by comparing the current hash code with the content of the ECRR Hash Test field. Thus the need to maintain end of list pointers or link length pointers or the like is eliminated. Because the end of the linked list is determined dynamically, no maintenance is required for the overwritten string.
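  • The idea of ending a hash chain by a test-field mismatch, rather than by a stored end marker, can be illustrated with a simplified sketch. It is not the patent's table layout: it omits tagging and the four-character mode A hash, uses a trivial two-character stand-in for the CRC hash, and adds an age check (a safeguard of this sketch, not taken from the text) so that stale links cannot loop. All names are hypothetical, and every candidate returned would still have to be verified by comparing characters in the history buffer.

```python
HASH_SPACE = 1 << 16


class HistoryIndex:
    """Circular history buffer with hash-head, hash-link and hash-test tables."""

    def __init__(self, size):
        self.size = size
        self.history = bytearray(size)
        self.head = [0] * HASH_SPACE   # hash -> newest position filed under it
        self.link = [0] * size         # position -> next older position
        self.test = [-1] * size        # hash recorded for the pair starting here
        self.next_pos = 0
        self.written = 0

    @staticmethod
    def pair_hash(a, b):
        return (a << 8) | b            # stand-in for the CRC hash (assumption)

    def append(self, ch):
        pos = self.next_pos
        self.history[pos] = ch
        if self.written:
            start = (pos - 1) % self.size      # the pair starts at the prior character
            h = self.pair_hash(self.history[start], ch)
            self.link[start] = self.head[h]
            self.test[start] = h               # replaces any older owner's hash
            self.head[h] = start
        self.written += 1
        self.next_pos = (pos + 1) % self.size

    def _age(self, pos):
        return (self.next_pos - 1 - pos) % self.size

    def candidates(self, a, b, max_candidates=8):
        """Return candidate start positions for the pair (a, b), newest first."""
        h = self.pair_hash(a, b)
        pos, last_age, found = self.head[h], -1, []
        while len(found) < max_candidates:
            if self.test[pos] != h:            # slot now belongs to another hash: end
                break
            if self._age(pos) <= last_age:     # safeguard against stale loops
                break
            found.append(pos)
            last_age = self._age(pos)
            pos = self.link[pos]
        return found
```

When the buffer wraps and a slot is rewritten, its test field changes with it, so any older chain that ran through that slot simply stops there and nothing has to be unlinked.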
  • Select from the outputs of step 4 and step 5 the string which: a) is the longest; b) if the strings are of equal length, is the most recently stored.
  • the history buffer location of the first string character is subtracted from the Next Buffer Store location. Buffer wraparound, if any, is corrected such that the result is the displacement from the found string to the Next Buffer Store location and is in the range 0 through BufferSize-1. Note that strings closest in recent history (newer) have lesser displacements than do older strings. Example 1. (using 8192 character buffer)
  • the calculated displacement can be expressed in thirteen bits.
  • the displacement is further broken into two components.
  • A) A zone portion from the most significant five bits. B) An offset portion from the least significant eight bits.
  • the five bit zone is Huffman coded using the ZoneCode table and the eight bit offset is inserted directly into the output stream.
  • the Zone may be encoded using from 2 to 7 bits depending upon zone value with the strings closest in recent history getting favorably shorter encodings.
  • Fig. 7C shows a History Buffer 753 having a character string beginning with "w" at location 933 (hexadecimal) 754.
  • Fig. 7C shows that a string of nine characters 752 in the character stream match the nine characters in the history buffer beginning at location 933 754.
  • the string location 933 (hexadecimal) 754 is subtracted from the location of the Next History Buffer Location 1201 (hexadecimal) 757 to yield 8CE (hexadecimal) whose 13 least significant bits 758 comprise the displacement which is broken into two components: i) a zone portion from the most significant five bits 759 and ii) an offset portion from the least significant eight bits 760.
  • the five bit zone portion is Huffman coded using the Zone Code table 761 and the eight bit offset is inserted directly in the output stream.
  • the Output Bit Stream Sequence 752 includes 1st: Font (001), 2nd: Length (1011), 3rd: Zone (01001) and 4th: Offset (11001110).
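  • The displacement arithmetic of this example can be reproduced with a few lines. The sketch assumes the 8192 character buffer of Example 1 and uses a modulo operation for the wraparound correction. The function name is hypothetical, and the Huffman coding of the zone (which turns zone value 01000 into the transmitted code 01001 in Fig. 7C) is not modelled.

```python
BUFFER_SIZE = 8192  # 13-bit displacement, as in Example 1


def encode_displacement(next_store, string_start, buffer_size=BUFFER_SIZE):
    """Return (displacement, zone, offset) for a string found in the history buffer."""
    displacement = (next_store - string_start) % buffer_size  # corrects wraparound
    zone = displacement >> 8       # most significant five bits, Huffman coded later
    offset = displacement & 0xFF   # least significant eight bits, sent as-is
    return displacement, zone, offset


# Values from the Fig. 7C example: string at 933h, next store location 1201h.
displacement, zone, offset = encode_displacement(0x1201, 0x933)
```

Recent strings give small displacements and therefore low zone numbers, which is why the low zones are given the shorter codes.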
  • the string search process discards matches having less than a predetermined number of characters, the predetermined number being greater than the hash length.
  • the minimum string length can be greater than the hash length and it is advantageous to make it so.
  • In mode A the hash length is 4 and the predetermined number is 6.
  • In mode B the hash length is 2 and the predetermined number is 3. Setting a lower limit on the length of the string reduces the bit-cost of encoding longer strings because the top (shortest code) entry into the Huffman table is used to represent a string of the minimum length.
  • the decoder maintains a table of the actual two characters which are associated with each font. Thus it can do a direct lookup when directed by the encoded bit stream.
  • FIG. 7B shows an input character stream 731 with a character pair "w " 732, Font (No) 722 (font address hex 195) and a Global Font Encoding Table 723 yielding, at entry point 3 725, a Font String Code 001 724.
  • the character pair "w " lower case w and caret
  • font number 215 hexadecimal
  • a font escape is a bit encoded sequence which serves as a signal from the encoder that the subsequent item is to be treated differently from that normally expected.
  • a font encoded sequence that signifies that a NewChar follows in the data stream is an escape. It is used as an Escape to signal GlobalNC encoding.
  • Another escape is String Escape. This is a bit sequence specifically to condition the decoder for reception of a string.
  • String Escape has a value equal to NewChar Escape + 1 when String "A" mode is active.
  • In string mode B the font has 3 escapes: 1) New character. Value = Font NC. 2) Pair encoding. Value = Font NC + 1. 3) String mode B.
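  • The virtual positions reserved for the escapes can be summarized in a short sketch. It follows the values stated in this description: in mode A the String Escape is NewChar + 1; in mode B the Pair Encoding escape is NewChar + 1 and the String Escape is NewChar + 2 (per the SEB entry of the table key); and a character at or beyond the NC position is shifted by 2 in mode A or 3 in mode B. The function names and the dictionary keys are hypothetical, and the base index derived from the font size is left out.

```python
def escape_positions(nc, mode):
    """Return the font-relative positions of the escapes for the given mode."""
    if mode == "A":
        return {"new_char": nc, "string": nc + 1}
    if mode == "B":
        return {"new_char": nc, "pair": nc + 1, "string": nc + 2}
    raise ValueError("mode must be 'A' or 'B'")


def font_relative_index(position, nc, mode):
    """Shift a character position past the virtual escape positions."""
    if mode not in ("A", "B"):
        raise ValueError("mode must be 'A' or 'B'")
    reserved = 2 if mode == "A" else 3
    return position + reserved if position >= nc else position
```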
  • Second Phase Processing Second Phase Processing includes string search, least cost encoding selection, formatting and output. Throughout these steps, the pointer into the process buffer is the current output pointer which is from 1 to 256 characters behind (older than) the current input pointer.
  • 12. If a string shorter than the minimum length (3 in Mode B, 6 in Mode A) is found in step 11, proceed to step 15. Otherwise, the bit cost of the string is computed by summing the costs of String Escape,
  • 13. Compute the bit cost of equivalent font encoding for each position corresponding to a character in the string using step a or b below. Subtract this bit cost from the total from step 12. If underflow occurs (the result goes negative) at any point, exit step 13 since the string encoding wins over the font encoding. If all corresponding positions are examined without underflow occurring, font encoding has a lesser or equal bit cost and will be used, so proceed to step 15. a. If the ECType field is 4, use the ECNewIndex field to access the FontBits table for the bit cost of NewChar Escape. Use the ECFrequency field to access the GlobalBits table for the bit cost of the NewChar. b. If the ECType field is 2, use the ECFontIndex field to access the FontBits table for the bit cost of a font encoding.
  • 14. If string encoding wins as indicated in step 13, change the ECType field corresponding to the first character of the string to an 8 (denoting String Encoding) and then change the ECType field corresponding to all remaining characters of the string to a 0 (denoting string continuation).
  • 15. a. Fetch the ECNewIndex value corresponding to the current output position and add 1. The result is used to index the FontBits table. This is the bit cost for the Pair Encoding Escape. b. Add 10 to the result of step a. This is the total Pair Encoding cost.
  • 16. Compute the bit cost of equivalent font encoding for each of the two characters in the pair (current output position and current output position + 1) using step a or step b below. Subtract this bit cost from the total in step 15. If underflow occurs (the result goes negative) at any point, exit step 16 since the pair encoding wins over the font encoding. If the two positions are examined without underflow occurring, font encoding has a lesser or equal bit cost and will be used, so proceed to step 18. a. If the ECType field is 4, use the ECNewIndex field to access the FontBits table for the bit cost of NewChar Escape. Use the ECFrequency field to access the GlobalBits table for the bit cost of the NewChar. b. If the ECType field is 2, use the ECFontIndex field to access the FontBits table for the bit cost of a font encoding.
  • 17. If pair encoding wins as indicated in step 16, change the ECType field corresponding to the first character of the pair to a 6 (denoting Pair Encoding) and then change the ECType field corresponding to the next character of the pair to a 0 (denoting string/pair continuation).
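  • The subtract-until-underflow comparison used for both string and pair encoding can be sketched as a single helper. The function names and the `font_bits` list are hypothetical; the bit costs in any real use would come from the FontBits and GlobalBits tables, which are not reproduced here.

```python
def alternative_wins(alternative_cost, font_costs):
    """Return True if the string or pair encoding is strictly cheaper.

    font_costs holds the font-encoding bit cost of each character covered
    by the alternative. The scan stops as soon as the running total goes
    negative, so long strings rarely need every position examined.
    """
    remaining = alternative_cost
    for cost in font_costs:
        remaining -= cost
        if remaining < 0:        # underflow: the alternative encoding wins
            return True
    return False                 # equal or higher cost: font encoding is kept


def pair_cost(font_bits, new_index):
    """Total pair cost: the escape at ECNewIndex + 1 plus the constant 10 of step 15."""
    return font_bits[new_index + 1] + 10
```

Because ties return False, font encoding is kept whenever the two costs are equal, which matches the rule that font encoding is chosen when bit costs are the same.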
  • Every forty-eight NewChars (which may be more than forty-eight input characters), the current sum in NCBitsNew is added to the previous forty-eight character sum from NCBitsPrior and the result compared to the constant 96 * 7.5 (representing 96 characters at 7.5 bits per character). If there are fewer than 96 * 7.5 bits in the result, the Compressibility Trending Switch is turned OFF (or remains OFF). If the result is 96 * 7.5 or greater, the Compressibility Trending Switch is turned ON (or remains ON). After the calculation, the current NCBitsNew is stored in NCBitsPrior in preparation for the next cycle forty-eight NewChars later.
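  • The switching rule can be sketched as a small state machine. The class and method names are hypothetical, and two details are assumptions not stated in the text: NCBitsPrior starts at zero, and NCBitsNew is cleared after each comparison.

```python
CYCLE = 48               # NewChars per evaluation cycle
THRESHOLD = 96 * 7.5     # 720 bits: 96 characters at 7.5 bits per character


class TrendingSwitch:
    """Compressibility Trending Switch driven by NewChar bit costs."""

    def __init__(self):
        self.on = False
        self.bits_new = 0      # NCBitsNew
        self.bits_prior = 0    # NCBitsPrior (assumed to start at zero)
        self.count = 0

    def add_new_char(self, bit_cost):
        """Record one NewChar encoding and return the current switch state."""
        self.bits_new += bit_cost
        self.count += 1
        if self.count == CYCLE:
            self.on = (self.bits_new + self.bits_prior) >= THRESHOLD
            self.bits_prior = self.bits_new   # keep for the next comparison
            self.bits_new = 0                 # assumed reset for the next cycle
            self.count = 0
        return self.on
```

Comparing two consecutive 48-NewChar windows smooths the decision, so one unusually costly window does not flip the switch unless the window before it was also costly.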
  • Font encoding is active.
  • String mode A is active.
  • Pair encoding is active.
  • Tables 1A through 1D. The several encodings produced by the present invention in addition to font encoding (NewChar, Pair and String) are shown in Tables 1A through 1D below.
  • Table 2 provides the key to the data in these tables.
  • EEE A Huffman pattern from the FontCode table from one to four bits in length encoding a value not equal to the Font NewChar or Font NewChar plus one and representing the font relative position of the encoded character in the Font.
  • FFF A Huffman pattern from the FontCode table from one to five bits in length encoding a value not equal to the Font NewChar, Font NewChar plus one, or Font NewChar plus two and representing the font relative position of the encoded character in the Font.
  • GGG A Huffman pattern from the GlobalHigh/GlobalLow table from four to thirteen bits in length, encoding a value from 0 to 255 and representing a character frequency.
  • PEB A Huffman pattern from the FontCode table from two to four bits in length encoding a value equal to the Font NewChar plus one and representing a Pair Encoding Escape, Mode B. PPP The remainder of a Huffman pattern from the FontCode table
  • GlobalHigh/GlobalLow table excepting the first two bits which are emitted separately, from two to eleven bits in length, encoding a value (in consideration of the prior two bits) from 0 to 255 and representing a character frequency.
  • SEA A Huffman pattern from the FontCode table from two to four bits in length encoding a value equal to the Font NewChar plus one and representing a String Escape, Mode A.
  • SEB A Huffman pattern from the FontCode table from three to five bits in length encoding a value equal to the Font NewChar plus two and representing a String Escape, Mode B.
  • SSS A Huffman pattern from the GlobalHigh/GlobalLow table excepting the first four entries (those beginning with 11), from four to thirteen bits in length, encoding a value from 4 to 255 and representing a string length of 6 - 253 characters.
  • ZZZ A Huffman pattern from the ZoneCode table, from four to thirteen bits in length, encoding a value from 0 to 31 and representing the five most significant bits of the string displacement. Used with the oooo pp, described above to identify a string position in the history buffer.
  • The least-cost encoding is built by selecting the encoding that has the fewest bits. If the bit costs of the two encodings are the same, font encoding is chosen.
  • Table 3 indicates the action taken for each character output.
  • Switch On (Transparent Mode) Switch Off (Mode B Encoding) SUM > 0 No Change j SUM ⁇ 0 No Change
  • Figs. 10A and 10B provide a flowchart of the decoding process.
  • Table 4 provides the key to the flowchart of Figs. 10A and 10B.
  • D G Decode Global. Decode a Huffman pattern which was selected and encoded from the
  • D F Decode Font Decode a Huffman pattern which was selected and encoded from the appropriate Font Encoding tables (Fig. 4A) .
  • D S Decode Short Decode a Huffman pattern which was selected and encoded from the GlobalCode encoding table. Same as the DG (Decode Global) except that two bits have already been fetched (F2) and are in DCCode. Used for length of 'A' type strings.
  • D L Decode Length Decode a Huffman pattern which was selected and encoded from the LengthBCode encoding table. Used for length of 'B' type strings.
  • D Z Decode Zone Decode a Huffman pattern which was selected and encoded from the lowest 64 entries in the GlobalCode encoding table.
  • the system uses two processors connected in series between the computer (the DTE interface, 111) and the telephone line (the DCE interface, 112) , each processor having its own memory.
  • One processor, the DCE Interface Processor 113, a Zilog Z80180, performs DCE interface processes (modem control and data flow management).
  • The other processor, the Compression/Decompression and DTE Interface Processor 114, performs data compression, data decompression and DTE interface processes (data interchange with the PC).
  • This configuration is shown for duplex operation in Fig. 11.
  • Fig. 11 shows a data rate of 11,500 characters/second at the DTE Interface and a data rate of 1,500 characters/second at the DCE Interface.
  • the conventional (prior art) approach using a single processor 121 to perform all functions (DTE interface, DCE interface and Compression/Decompression) is shown in Fig. 12.
  • the single processor approach involves using a more powerful, albeit more expensive, processor.
  • the general problem of sharing tasks among multiple processors is known to be a difficult problem in computer science.
  • FIG. 13 shows a conventional two-processor configuration having a DTE/DCE Interface Processor 131 and a Compression/Decompression Processor 132.
  • the present invention achieves the sharing of tasks by a simple but, nonetheless, unexpectedly effective configuration.
  • the preferred embodiment shown in Fig. 11, achieves efficient control over all processes occurring in the system.
  • This configuration utilizes the insight that compression, decompression and interface with the terminal all occur at a high, error-free data rate, whereas modem control and the data line interface processes operate at a lower data rate and involve error detection and repeat transmission to cope with transmission errors.
  • a first relatively high speed processor is used for both control of the terminal interface and for data compression and decompression; and a second processor is used for the processes involved in control of the data line interface including error detection and retransmission.
  • Font One record of an array of records each record consisting of pointers, links, characters, etc., each record having an address based on the prior two characters in the input stream, each record containing a list of historically occurring candidate characters to be matched with characters from the input stream.
  • Huffman Codes As used in this document, this term refers to any variable length bit representation having fewer bits corresponding to higher frequency of occurrence, including but not limited to codes created by a tree algorithm.
  • NewChar The occurrence of "NewChar” is a font encoding event wherein either the encoder current character, "ECChar”, is not found in the selected font or there is no font in existence (and it thus contains 0 characters based on the font selection scheme) .
  • NewChar Symbol A dynamic value in the range of 0 through n (where n is the maximum number of characters per font) which represents the current virtual position in the font which represents "character not in table". It is used as an escape to signal GlobalNC encoding.
  • NewChar Escape Specifically an encoding representing the NewChar Symbol.
  • String Escape An escape sequence specifically to condition the decoder for reception of a string. When used in the context of a font encoding or a font decoding, a value equal to NewChar Escape + 1.
  • .sfcond include ITEcl9 include TCdfmOOl
  • NC8BitCycle EQU 64 ; J (64) controls A-String/B-String
  • ZoneTestA EQU 1 ; J (1) 0 - HIGH; 1 - HIGH &
  • FontTables EQU 1024 ; may be 512-1024 provided that
  • BufferSuffix EQU 1 ; 0 - nulls; 1 - maintained
  • Failsafe EQU 0 ; 0 - no failsafe; 1 - output failsafe
  • FailSafeSets EQU 4 ; output every (n * 256) encodings
  • MinimumAString EQU 6 ; minimum length A string
  • MinimumAUpdate EQU 3 ; bytes advanced if no A string found
  • MinimumBString EQU 2 ; minimum length B string
  • MinimumBUpdate EQU 1 ; bytes advanced if no B string found
  • NCFreqSets EQU 4 ; uses 256*2*2*Sets bytes
  • NCFreqSetsHigh EQU 0 ; if used, 0 gives best result ?????
  • NCFreqSetsReset EQU 1 ; 0 - no; 1 - reset on B to A change
  • Repeats EQU 1 ; 0 - no repeat logic; 1 - repeat logic
  • ProdCycle EQU 67 ; prod every ProdCycle characters
  • Load8250 EQU serial loader/debugger ; parallel loader/debugger
  • This block from 4000h to 0bfffh inclusive, is the 32 kbyte page
  • FontlstUse IF "&YY” EQ "FU” ⁇ LDA ECFontBase+1 ADD #HIGH(ECFontTables) STA ECFontBase+1 LDA #000h STA ECCharacters STA ECNCIndex
  • DCKCharEQNCIndex IJMP DCCharSwap DCKCharEQNCIndex: INC .
  • TXA output is NC frequency value STA ECFrequency,Y ; save for consistent
  • NCBitsPrior set to

Abstract

A system for the dynamic encoding of a character stream has a single character encoder that includes a plurality of fonts, a string encoder that includes a history buffer, and an output selector that compares encodings from the single character encoder and the string encoder and selects the least cost encoding for output. The single character encoder generates and stores hash codes used for font access and the string encoder retrieves these same hash codes and uses them for history buffer access.

Description

COMPOUND ADAPTIVE DATA COMPRESSION SYSTEM
DESCRIPTION
Technical Field
The invention relates to the field of data compression systems and particularly to apparatus and methods for compressing data signals and reconstituting the data signals.
Background Art
Data Compression System Requirements
Data compression systems are known in the prior art that encode a stream of digital data signals into compressed digital signals and decode the compressed digital data signals back into the original data signals. Data compression refers to any process that converts data in one format into another format having fewer bits than the original. The objective of data compression systems is to reduce the amount of storage required to hold a given body of digital information or to increase the speed of data transmission by permitting an effective data transmission rate that is greater than the rated capacity of a given data communication link. Compression effectiveness is characterized by the compression ratio of the system. Compression ratio is herein defined as the ratio of the number of bits in the input data to the number of bits in the encoded output data. The larger the compression ratio, the greater will be the reduction in storage space or transmission time.
In order for data to be compressible, the data must contain redundancy. Compression effectiveness is determined by how effectively the compression procedure matches the forms of redundancy in the input data. In typical computer stored data, e.g. English text, computer programs, arrays of integers and the like, redundancy occurs both in the nonuniform usage of individual symbols, e.g. characters, bytes, or digits, and in frequent recurrence of symbol sequences, such as common words, blank record fields, and the like. An effective data compression system should respond to both types of redundancy. A typical data stream contains both types of redundancy in varying proportions, resulting in varying statistics. An example of a data stream of varying statistics is a data stream wherein "normal" English text is immediately followed by a computer program, for example source code in the "C" programming language.
To be of practical and general utility, a digital data compression system must possess the property of reversibility, i.e. it must be possible to reexpand or decompress the compressed data back into its original form without any alteration or loss of information. The decompressed and the original information must be identical and indistinguishable with respect to each other. In addition, it should satisfy several performance criteria. First, the compression effectiveness should be high, and therefore the compression ratio should be large. Second, the system should provide high data rate performance with respect to the data rates provided by and accepted by the equipment with which the data compression and decompression systems are interfacing. For real-time, switched-network data communications applications, the compressor should preferably process input data fast enough to keep the output channel supplied at its full data rate. The input processing rate must therefore exceed the output (compressed) rate in proportion to the compression effectiveness, typically 6:1. The higher the compression effectiveness, the faster the input data must be processed to provide sufficient output data to fully utilize the capacity of the output channel. Thus high data rate performance of data compression processing is necessary to match the line speed of today's communication systems and the compression effectiveness of modern data compression methods. The data rate performance of data compression and decompression systems is typically limited by the time required to perform the processing steps associated with encoding each incoming character, which in turn is limited by serial processing and the speed of the compression processor. High performance for a given compression processor is achieved by a compression method that uses fewer processing steps, on average, to encode each incoming character. The fewer processing steps, the higher the performance.
However, complex methods are needed to achieve high compression effectiveness for data streams of varying statistics. Such methods tend to increase the number of processing steps and therefore tend to reduce data compression processing performance.
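The relationship between compression ratio and required input processing rate described above can be illustrated with a small calculation. This is only a sketch: the function names and the 9600 bit/s line rate are illustrative assumptions, not values taken from the specification; the 6:1 ratio is the "typical" figure quoted in the text.

```python
def compression_ratio(input_bits, output_bits):
    """Ratio of input bits to encoded output bits, as defined in the text."""
    return input_bits / output_bits


def required_input_rate(output_rate_bps, ratio):
    """Input rate needed to keep an output channel fully supplied.

    Assumption: the channel carries only compressed data and the ratio
    is constant over the interval considered.
    """
    return output_rate_bps * ratio


# Hypothetical example: a 9600 bit/s link and a 6:1 compression ratio.
print(compression_ratio(12000, 2000))   # 6.0
print(required_input_rate(9600, 6))     # 57600
```

Under these assumptions the encoder must consume input six times faster than the line rate, which is why the number of processing steps per input character matters.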
Third, the system should be adaptable, that is, capable of achieving high compression effectiveness and high performance on data having a variety of statistical characteristics. Many prior art data compression procedures require prior knowledge of the statistics of the data being compressed. Some prior art procedures adapt to the statistics of the data as it is received. Adaptability in the prior art processes has either been limited to a narrow range of variation e.g. character-by-character encoding or has required a high degree of complexity with resultant severe penalty in data rate performance. The requirement for data compression systems suitable for use in modems in high speed data communication links is to accommodate a wide range of data characteristics without prior knowledge of data statistics and achieve both high compression ratios and high data rate performance. Data compression and decompression systems and modems currently available are either not adaptable over a wide range of data characteristics or are severely limited in compression efficiency or data-rate performance and so are not suitable for general purpose usage.
Finally, the system should be responsively adaptable, that is, capable of reestablishing a high compression ratio quickly after the beginning of a new data file from a stream of data files, wherein each file has different statistical properties from the data in the immediately preceding file.
Prior Art Systems
U.S. Patent 4,612,532 to Bacon et al., which is hereby incorporated herein by reference, discloses a system for adaptive compression and decompression of a data stream designed to compress redundancy resulting from non-uniform usage of individual symbology. The Bacon invention uses an adaptive character-by-character compression technique wherein dynamically updated "followset" tables having Huffman codes are used to encode characters, using, on average, far fewer bits per character than is required by ASCII or EBCDIC encoding. Each incoming character is encoded using information from the three preceding characters (character type, character type, character identity), i.e. (two bits, two bits, seven bits). Thus, for each incoming character, information from the three preceding characters is used to select the appropriate followset table. The Bacon invention has a high compression efficiency on a character-by-character basis and achieves high performance by using fewer processing steps, on average, to encode each character than other character-by-character encoding techniques.
U.S. Patent 4,558,302 to Welch discloses a string search system designed to compress redundancy resulting from frequent recurrence of symbol sequences. The Welch invention includes a compressor which compresses a stream of data character signals into a compressed stream of code signals. The compressor stores strings of data character signals parsed from the input data stream and searches the stream of data character signals by comparing the stream to the stored strings to determine the longest match. Having found the longest match, the compressor stores an extended string comprising the longest match plus the next data character signal following the longest match and assigns a code signal thereto. A compressed stream of code signals is provided from the code signals corresponding to the stored longest matches.
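The longest-match parse described in the preceding paragraph can be sketched as follows. This is a minimal illustration of the general technique rather than the patented implementation; the function names are hypothetical, and seeding the string table with all 256 single-byte strings and using unbounded integer codes are assumptions made for simplicity.

```python
def lzw_compress(data):
    """Greedy longest-match parse with a growing string table."""
    # Assumption: the table starts with every single-byte string.
    table = {bytes([i]): i for i in range(256)}
    codes = []
    current = b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate              # keep extending the match
        else:
            codes.append(table[current])     # code for the longest match
            table[candidate] = len(table)    # store match + next character
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the same table on the fly and recover the input."""
    table = {i: bytes([i]) for i in range(256)}
    if not codes:
        return b""
    prev = table[codes[0]]
    out = [prev]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            entry = prev + prev[:1]          # code defined by this very step
        else:
            raise ValueError("invalid code")
        out.append(entry)
        table[len(table)] = prev + entry[:1]
        prev = entry
    return b"".join(out)


print(lzw_compress(b"ABABABA"))  # [65, 66, 256, 258]
```

The decoder never receives the table; it reconstructs each entry one step behind the encoder, which is what makes the scheme adaptive without side information.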
U.S. Patent 4,464,650 to Eastman et al. discloses an adaptive string search system designed to compress redundancy resulting from frequent recurrence of symbol sequences. The Eastman invention uses the Lempel-Ziv algorithm to encode strings of characters without constraint on the length of the input or output word. However, the Eastman invention suffers the disadvantage of requiring numerous RAM cycles per input character and utilizing time consuming and complex mathematical procedures such as multiplication and division to effect compression and decompression. These disadvantages tend to render the Eastman invention unsuitable for on-line data communications applications.
U.S. Patent 4,730,348 to McCrisken discloses a system for adaptive compression and decompression of a data stream using a combination of techniques to compress redundancy from non-uniform usage of individual symbols and frequent recurrence of symbol sequences. The McCrisken implementation uses an adaptive character-by-character compression technique described as "bigram encoding" based on "pruned tree" Huffman and "running bigrams" to compress redundancy resulting from non-uniform usage of individual symbology. As part of his adaptive character-by-character compression technique, McCrisken uses a plurality of encoding tables, on-line analysis of compression efficiency, an on-line table builder, a table changer and a table change code to permit rapid adaptation to changes when compressing data streams having varying statistics. McCrisken also uses a history buffer and a string substitution technique which identifies and further compresses matching strings of up to eighteen characters to compress redundancy resulting from frequent recurrence of symbol sequences. Both techniques are adaptive and therefore do not need prior knowledge of data statistics. In a preferred embodiment, some of the data stream is encoded on a character-by-character basis and some of the data stream is encoded with a string substitution code. McCrisken also uses protocol emulation and packet size control to improve performance. The McCrisken character-by-character compression technique has a low compression efficiency and a poor data rate performance compared to the Bacon method. This is partly because the encoding tables of McCrisken's character-by-character compression technique are updated on the basis of on-line explicit analysis of compression efficiency and this technique is very inefficient compared with the transposition heuristic used by Bacon to update his followset tables. 
McCrisken's use of a string substitution technique compensates to a large extent for the low compression efficiency and poor performance of the adaptive updating of the McCrisken encoding tables. However, the processing required to perform the search for the longest match in the McCrisken implementation is time-consuming, and the search is limited, in McCrisken's preferred embodiment, to the first twenty items in the list. Also, because of the McCrisken string substitution code, the longest matching string that can be encoded as such is eighteen characters long (column 14, lines 13-18). Because of these disadvantages, McCrisken does not achieve as good a compression ratio as the Eastman implementation of the Lempel-Ziv algorithm, which uses no character-by-character encoding of any kind. Furthermore, because of its complexity the McCrisken implementation is inherently slow.
James A. Storer, in his book Data Compression: Methods and Theory, Computer Science Press, 1988, which is hereby incorporated herein by reference, discusses methods and theories pertaining to lossless data compression over a noiseless channel with serial I/O. Storer describes a family of character-by-character techniques (p.20) and notes (p.21) that (i) the performance of Huffman codes has been well studied and can serve as a useful benchmark on which to judge the effectiveness of more complex methods and (ii) for several applications it will be useful to combine more sophisticated techniques with Huffman codes. A dynamic Huffman codes method is discussed (p.40) in which "tries" (special tree structures - see p.15) are built dynamically and maintained based on characters appearing in the data stream. Storer describes the "unseen leaf" (the equivalent of "new character" in the Bacon patent) but does not describe the floating position characteristic of Bacon's "new character." Higher order Huffman codes are described (p.44) along with the "transposition" heuristic (p.45), correctly attributed to Bacon (p.52).
Storer discusses in detail three on-line textual substitution methods (p.54), all of which use dynamically updated local dictionaries. The three methods are the sliding dictionary method, the improved sliding dictionary method and the dynamic dictionary method. The local dictionary of strings is stored in a "trie" structure (p.15) which is a tree where the edges are labeled by elements of the alphabet in such a way that children of a given parent are connected via edges that have distinct labels, all leaf nodes are labeled as "marked", and all internal nodes are labeled as either "marked" or "unmarked". The set of strings represented by a trie are those that correspond to all root to marked node paths. The sliding dictionary method (p.64) contains within its local dictionary all strings contained within a portion of the source string defined by a "sliding window" technique well known (but used for other purposes) in data communications systems. This method is similar to the method using a history buffer described by McCrisken except for the method of storing pointers to strings. It is a practical realization of the first of two universal data compression algorithms proposed by Lempel and Ziv designated by Storer (p.67) as LZ1. The LZ1 algorithm works as follows. At each stage, the longest prefix of the (unread portion of the) input stream that matches a substring of the input already seen is identified as the current match. Then a triple (d, l, c) is transmitted where d is the displacement back to a previous occurrence of this match, l is the length of the match, and c is the next input character following the current match (the transmission of c is pointer guaranteed progress). The input is then advanced past the current match and the character following the current match.
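The LZ1 scheme described above, which emits a triple of displacement, length and next character at each stage, can be sketched as follows. This is a deliberately naive illustration with hypothetical function names: it remembers the entire input, uses an exhaustive quadratic-time search, and allows a match to run on into the text currently being encoded, which is a common convention assumed here rather than something stated in the text.

```python
def lz1_encode(data):
    """Emit (displacement, length, next_char) triples for the input."""
    triples = []
    pos = 0
    while pos < len(data):
        best_len, best_dist = 0, 0
        # Always leave one character to transmit as c (guaranteed progress).
        max_len = len(data) - pos - 1
        for start in range(pos):
            length = 0
            while length < max_len and data[start + length] == data[pos + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, pos - start
        triples.append((best_dist, best_len, data[pos + best_len]))
        pos += best_len + 1   # advance past the match and the character c
    return triples


def lz1_decode(triples):
    """Rebuild the input by copying from already decoded output."""
    out = bytearray()
    for dist, length, char in triples:
        for _ in range(length):
            out.append(out[-dist])   # byte-by-byte copy handles overlap
        out.append(char)
    return bytes(out)


print(lz1_encode(b"abababx"))  # [(0, 0, 97), (0, 0, 98), (2, 4, 120)]
```

Because every triple carries a literal character, the encoder always advances by at least one position even when no match exists.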
The sliding dictionary method can be viewed as a practical implementation of LZ1 that uses fixed size pointers; instead of remembering the entire input stream the system remembers only a fixed number of characters back, and instead of pointer guaranteed progress, the system uses dictionary guaranteed progress by reserving codes for the characters of the alphabet. The improved sliding dictionary method (p.67) contains a heuristic that eliminates duplicate strings. It too requires that the alphabet be added initially in the local dictionary. Storer also suggests using Huffman coding of output pointers. The dynamic dictionary method (p.69) uses update and deletion heuristics that maintain a collection of strings that do not, in general, form a contiguous portion of the input stream. Various update and delete heuristics (i.e. mechanisms which provide learning capability) are described which are used to implement the methods. Both the improved sliding dictionary method and the dynamic dictionary method create and maintain a dictionary that is different from the history buffer of McCrisken. Apart from the heuristic for locating the longest match (Storer's "greedy match heuristic") most of the heuristics described by Storer are directed to the maintenance of pointer sets for the special dictionaries. Difficulties encountered by the use of heuristics such as "pruning" to remove "dead strings" relate also to the special nature of these dictionaries. Storer's experimental data shows that sliding dictionary methods provide significantly better compression ratios than Huffman coding methods especially on spread-sheet data; the improved sliding dictionary method provides a higher compression ratio by 1% to 2% over the sliding dictionary method; and the best performance of the dynamic dictionary methods is better than the best performance of the sliding dictionary and the improved sliding dictionary methods.
Storer's textual substitution methods provide compression ratios of typically between 3-to-1 and 2-to-1 on English text and between 5-to-1 and 2.5-to-1 on programming language text. U.S. Patent No. 4,876,541 to Storer discloses and claims the AP (all-prefixes) heuristic, modifications of the LRU (least recently used) strategy, limited look ahead, and the use of the MaxChildren parameter.
Textual substitution methods achieve higher compression ratios with large files and dictionaries. However, as the files and dictionaries grow, so too does the time taken to access and update them. Storer, in his patent, describes a string search data compression system that uses a sliding dictionary that is stored as a tree ("trie") structure. This approach provides fast access to dictionary entries but updating the tree structure loads the processor heavily so Storer uses sophisticated update heuristics. McCrisken describes a string search data compression system that uses a history buffer. The McCrisken approach provides fast updating of the history buffer but, in this case, string matching loads the processor heavily. McCrisken resolves this with arbitrary cut-off of his search process. Practical on-line, prior art, textual substitution techniques are thus limited by the trade-off between the size of the files and dictionaries on the one hand and the speed of the access algorithms and update heuristics on the other. To the extent that access and update processing can be done more efficiently, i.e. faster, then larger files and dictionaries can be maintained with a corresponding improvement in compression ratios for a given data rate.
Disclosure of Invention
The invention provides a system for the dynamic encoding of a character stream. A preferred embodiment of the system comprises a single character encoder which includes a plurality of fonts, a string encoder which includes a history buffer, and an output selector which compares encodings from the single character encoder and the string encoder and selects the least cost encoding for output. The single character encoder generates and stores hash codes which it uses for font access. The string encoder retrieves these same hash codes and uses them for history buffer access. The hash codes are generated by applying a CRC algorithm to a character pair and are given the name "CRC hash". The single character encoder maintains a position in a font for all characters not otherwise listed in the font, such characters herein called "new character", and four tables are maintained for the encoding of such characters. The single character encoder also maintains a position in a font for a symbol representing a string, which position directly follows the position of new character in the font. Three or more consecutive like characters are represented in the history buffer by three characters only. A pair encoder is provided that encodes character pairs using the font number. The pair encoder may be active at the same time as the string encoder. Two string encoding modes are provided. A switch controls activation and deactivation of string search processes based on a comparison of the average bit cost of new character encoding with a predetermined value. A hash-link/hash-test table is provided in the string search encoder having entries corresponding to every second character position in the history buffer. This table uses properties of the CRC hash to access matching strings in the history buffer. String match testing starts "n" characters beyond the current character where "n" is the length of the longest match found so far. 
Accordingly, the string search encoder, in addition to searching forward, also searches back. The string search encoder discards a string match that has less than a predetermined number of characters. Linked lists of pointers to candidate strings are maintained and the end of the linked list is determined using a property of the CRC hash.
Brief Description of the Drawings
Fig. 1A is a block diagram and overview of the main buffers, tables and processes of the preferred embodiment of the present invention.
Fig. 1B shows the two phases of the encoding process of Fig. 1A.
Fig. 2A illustrates the loading of the data stream into the CC buffer.
Fig. 2B illustrates the relationships among the encoding buffers, tables and processes of the preferred embodiment of the present invention.
Fig. 3 illustrates the fonts used in the font encoder.
Fig. 4A shows the global (Huffman) font encoding tables.
Fig. 4B shows the Huffman Tables used for encoding New Character and for encoding String Length in Mode A.
Fig. 4C shows the Huffman Tables used for encoding String Length in Mode B.
Fig. 4D shows the Huffman Tables used for encoding Zone Code in Mode A and Mode B.
Fig. 5 illustrates the font access tables used in the font encoding process.
Fig. 6 illustrates the generation and use of the CRC hash.
Fig. 7A illustrates the new character encoding process.
Fig. 7B illustrates the Pair Encoding, Mode A process.
Fig. 7C illustrates the String Encoding, Mode A process.
Fig. 8A illustrates the use of the history buffer access tables for mode A string encoding.
Fig. 8B illustrates the use of the history buffer access tables for mode B string encoding.
Fig. 9 shows the start points for string searches.
Figs. 10A and 10B show the decoding logic.
Fig. 11 shows the dual processor configuration.
Fig. 12 shows the prior art processor configuration.
Fig. 13 shows a conventional two-processor configuration.
Detailed Description of Specific Embodiments
The present invention in a preferred embodiment combines a novel adaptive font encoding single-character compression technique with a repeat character compression technique and several novel string encoding compression techniques. It includes an adaptive font encoding process that is an improved version of the efficient, high performance font encoding process disclosed by Bacon et al. in U.S. Patent 4,612,532. It includes several novel string encoding processes. It further includes a novel data compressibility trending function which is used to select the most effective encoding process according to the compressibility of the data. The font encoding process and the string encoding process of a preferred embodiment share memory and processes associated with the generation of a novel "CRC hash" using a CRC algorithm, a portion of the CRC hash being used as a hash code for font and dictionary addressing and another portion being used for identification. The present invention achieves superior compression ratios and superior performance over the prior art described above. A copy of the source code of the preferred embodiment of the present invention, expressed in the assembly language of the Rockwell C-19 processor, is attached hereto as Appendix 1. A guide to the source code listing is given in Appendix 2.
A general overview of a preferred embodiment of the system is shown in Fig. 1A. The system provides full duplex operation and it is generally divided into an encoder 1 and a decoder 2 such that each contains its own set of buffers (encoder: PC In Buffer 4, Process Buffer 5, History Buffer 6, and Modem Out Buffer 7; decoder: Modem In Buffer 8, History Buffer 9, and PC Out Buffer 10), character fonts 11 and 12 and access tables 13 and 14. Both the encoder and the decoder are operated by control software 3 that runs on a single, shared Rockwell C-19 processor. Fig. 1A shows the main tasks performed by the encoder software (Load Process Buffers, Do Font Encoding, Update Fonts, Do String Encoding, Select Least-Cost Encoding, Update History Buffers, and Format and Output) 15 and the decoder software (Receive Bit Stream, Interpret Escape Codes, Decode Single Characters and Strings, Load PC Out Buffer, Update History Buffer, and Update Fonts) 16. Fig. 1B shows the two phases of encoding. Phase 1 processes (steps 1-10) 17, including Loading Process Buffer, Doing Font Encoding and Repeat Character Encoding and Updating Font, are performed once for each character of input. Phase 2 processes (steps 11-20) 18, including String Encoding, Selecting Least-Cost Encoding, Formatting For Output, and Updating Buffers, are performed, typically, when the process buffer is full. Test 19 following Phase 1 is "Process Buffer Full or Flush". Test 20 following Phase 2 is "Flush and Process Buffer not Empty". String encoding includes string encoding mode A, or string encoding mode B which combines string encoding with pair encoding. The decoder performs corresponding decoding processes.
The character stream 20 enters the CC buffer as shown in Fig. 2A. The CC buffer consists of ECChar 21 which contains 256 bytes representing the most recent characters from the data stream and ECCharCopy 22 which contains an identical copy of the content of ECChar. ECCharCopy is provided to remove the necessity for boundary checking in the string matching process. Fig. 2A shows string continuation for searching 23 extending into ECCharCopy. ECCharCopy is contiguous with ECChar in memory. Fig. 2A also shows the next store location in ECChar 24 and in ECCharCopy 25, and old data 26.
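The benefit of keeping a contiguous copy of the buffer can be sketched as follows. This is an illustrative model only: the class and method names are hypothetical, and the real encoder is written in assembly language. Every byte is stored twice, 256 positions apart, so a comparison that starts near the end of the first half simply runs on into the copy without any wrap-around test.

```python
SIZE = 256  # size of ECChar as given in the text


class MirroredBuffer:
    """Circular buffer stored twice, back to back (hypothetical model)."""

    def __init__(self):
        self.buf = bytearray(2 * SIZE)   # first half, then its copy
        self.next = 0                    # next store location

    def store(self, byte):
        self.buf[self.next] = byte
        self.buf[self.next + SIZE] = byte        # keep the copy identical
        self.next = (self.next + 1) % SIZE

    def match_length(self, a, b, limit):
        """Count equal bytes starting at a and b (both in 0..SIZE-1).

        Assumption: limit <= SIZE, so a + n and b + n never leave the
        doubled buffer and no boundary check is needed inside the loop.
        """
        n = 0
        while n < limit and self.buf[a + n] == self.buf[b + n]:
            n += 1
        return n


buf = MirroredBuffer()
for i in range(SIZE):
    buf.store(i % 4)
# A comparison starting four bytes before the end continues into the copy.
print(buf.match_length(252, 0, 8))  # 8
```

The cost is one extra store per input character; the saving is a test removed from the inner loop of every string comparison.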
The ECChar and ECCharCopy buffers are two of nine process buffers, shown in Fig. 2B, which operate in parallel and share input and output pointers. These buffers are used by the font encoding and string encoding processes.
Fonts 31 are shown in Fig. 3. Fig. 3 shows a table of fonts 31 having 1024 font numbers 32, an FTLink field 33, an FTMatch field 34, an FTNC field (NewCharPosition) 35, an FTSize field 36, and Font Character fields (6 per Font max) 37.
Huffman encoding tables are shown in Figs. 4A-4D. Fig. 4A shows global (Huffman) font encoding tables including an Access Table 41 having an index 42, a Font Code table 43 and a Font Bits table 44. Fig. 4B shows the Huffman Global Code (Frequency) Tables, used for encoding New Character and for encoding String Length in Mode A. The tables have 256 Table Entries, a Code Length of 4-13 bits and are referenced as "Global Code High; Global Code Low" in the source code. Fig. 4C shows the Huffman Tables used for encoding String Length in Mode B. These tables have 10 table entries, a code length of 1-6 bits, and are referenced as "LengthBCode" in the source code. Fig. 4D shows the Huffman tables used for encoding Zone Code in Mode A and Mode B. These tables have 32 table entries, a Code Length of 2-7 bits and are referenced as "ZoneCode" in the source code.
Font access tables 51 along with a font table 31 and an input data stream 52 are shown in Fig. 5. The font access tables include a CRC Hash Table 53 having a CRC Hash 54, MatchVal data 55 and RoughAdr data 56. The font access tables also include an FTRough Table 59 having an index 57 and FTRough data 58. CRC(^v)=2963, CRC(in)=05D6 and CRC(^d)=7DD6 provide entry points 501, 502 and 503 respectively to the CRC Hash Table from the input data stream. The history buffer and history buffer access tables used for string search are shown in Figs. 8A and 8B.
The entire compound encoding process includes:
1. Repeat character encoding;
2. Font (single-character) encoding;
3. Monitoring compressibility of data stream;
4. Selecting encoding processes dynamically (mode A or mode B);
5. String encoding (longest match, Mode A);
6. String encoding (longest match, Mode B);
7. Pair encoding;
8. Anti-expansion process (mode B only);
9. Selecting and concatenating encodings having fewest bits.
These processes will now be described in detail starting with font encoding.
Font Encoding
In a preferred embodiment of the present invention, font encoding uses a set of fonts having character symbols stored in approximate order of the frequency of occurrence of such character after the occurrence of a pair of characters with which the font is associated. For example, if the input data stream contained the words "this" and "those", then a font would exist associated with the pair of characters "th" and the font would contain the letter "i" and the letter "o". A single font consists of pointers, links, characters, etc. whose selection (font number) is based on the prior two characters in the input stream and which contains a list of historically occurring candidate characters to be matched with Encoder Current Character. Fig. 3 shows an array of fonts. A single font is illustrated by a single row. "New Character", i.e., any character that is not otherwise listed in the table, is also assigned a position in the table in approximate order of such characters' local frequency of occurrence after the occurrence of a pair of characters with which the table is associated. "New Character" is hereinbelow referred to as "NewChar" and sometimes abbreviated as "NC". Just as the occurrence of a particular character in the data stream is a font encoding event, so the occurrence of NewChar is a font encoding event. NewChar is a font encoding event wherein either the Encoder Current Character is not found in the selected font or the selected font does not exist. The value of NewChar Position is a dynamic value in the range of 0 through n (where n is the maximum number of characters per font) meaning "Character Not in Table". NewChar does not occupy a character position in the font: it is assigned a "virtual position". Fig. 3 shows how the position of NewChar is stored in field FTNC in the font. In mode A, each font includes a virtual position for a string directly following the NewChar position.
In mode B, each font includes a virtual position for a "pair encoding" directly following the NewChar and includes another virtual position for string encoding following the pair encoding.
Font Encoding, CRC Hash and Font Access
Font access tables are shown in Fig. 5. Fig. 6 shows how the hash pointer (RoughAdr) and the match value (MatchVal) are derived from the CRC hash.
Font encoding includes the following steps:
1. Computing a CRC hash using a CRC algorithm applied to the prior two characters;
2. Using a portion of the CRC hash (RoughAdr) as a rough selector for a linked list of fine entries and using the remaining portion of the CRC hash (MatchVal) to identify a font;
3. Determining and storing the position of the current character in the selected font;
4. Selecting a global Huffman table according to the current size of the font. FTSize from Fig. 5 is used to enter the Access Table of Fig. 4A.
The Font Encoding process occurs once for each character of input data. Fig. 6 shows the data stream 61 including the current character to be encoded "N" and its two predecessors "P" and "S". Encoder Current Character "N" is the most recent character from the input stream which is being processed by the font encoder. At the end of each encoder cycle "Encoder Current Character" becomes Char1Prior and the fetch and encoding process continues with the next character from the input stream as the new Encoder Current Character. In the example of Fig. 6, in the input data stream 61, Encoder Current Character is "N", Char1Prior (character immediately prior to Encoder Current Character) is "P" and Char2Prior is "S".
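The idea of selecting a font by the two prior characters and reporting either a position or a NewChar event can be sketched as follows. This is a simplified model under several assumptions: fonts are kept in a dictionary keyed directly by the character pair instead of the CRC hash tables, the update rule is a plain one-step transposition toward the front, a full font drops its last entry, and the floating virtual position of NewChar is not modelled. All names are hypothetical.

```python
MAX_CHARS = 6  # "6 per Font max" in Fig. 3


class FontSet:
    """Hypothetical model of fonts selected by the two prior characters."""

    def __init__(self):
        self.fonts = {}   # (Char2Prior, Char1Prior) -> candidate characters

    def encode(self, char2_prior, char1_prior, current):
        font = self.fonts.setdefault((char2_prior, char1_prior), [])
        if current in font:
            pos = font.index(current)
            if pos > 0:
                # Assumed update rule: swap one step toward the front.
                font[pos - 1], font[pos] = font[pos], font[pos - 1]
            return ("char", pos)
        # NewChar event: the character is not listed in the selected font.
        if len(font) == MAX_CHARS:
            font.pop()            # assumption: discard the last entry
        font.append(current)
        return ("newchar", None)


def encode_stream(text):
    fonts = FontSet()
    events = []
    for i in range(2, len(text)):
        events.append(fonts.encode(text[i - 2], text[i - 1], text[i]))
    return fonts, events


fonts, events = encode_stream("this those this")
print(fonts.fonts[("t", "h")])   # ['i', 'o']
print(events[-3:])               # [('char', 0), ('char', 0), ('char', 0)]
```

On the second occurrence of "this" every character is found at position 0 of its font, which is the case a short Huffman code would reward.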
After the initial value of the CRC hash is seeded to zero, the CRC hash for the two prior characters ("S" and "P") is created as follows. A CRC function (CCITT polynomial x^16 + x^12 + x^5 + 1) is performed on the character S and then on P, yielding (65) a sixteen-bit CRC result (64, see Fig. 6), hereinbelow referred to as "the CRC hash" to indicate its function in the present invention. The CRC hash is used as follows:
a) The ten least significant bits of the CRC hash are extracted and stored as RoughAdr (62, see Fig. 6) for use as a hash pointer.
b) The six most significant bits of the CRC hash are extracted and stored as MatchVal (63, see Fig. 6) to be used as a match value with the selected font.
c) The CRC hash is also stored for later use in constructing hashes for string encoding.
The CRC hash has two very important properties: i) Its ten least significant bits provide a hash code having excellent statistical properties for use in hashing. ii) The sixteen-bit result produced by every possible two-byte combination is unique. No two-byte combination shares a sixteen-bit result with another two-byte combination so the sixteen-bit result may be used to provide one-to-one mapping with the original two bytes.
Accordingly, the ten least significant bits may be used as a hash code to access a table and the remaining six bits may be used to test if this is the specific font assigned to that exact character pair. The CRC hash is used in font encoding and for history buffer access in string encoding mode A and string encoding mode B. It provides significant benefit in reducing the average amount of processor time consumed in accessing the fonts and history buffer, thereby enabling a given processor to handle higher encoding throughput rates. The use of the CRC hash, as described hereinbelow, by virtue of the throughput rate benefits, also provides a practical realization of trigram font encoding. The combination of RoughAdr and MatchVal will always identify uniquely the font associated with the prior two characters.
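The generation and splitting of the CRC hash can be sketched as follows. The bit ordering of the CRC is an assumption (most significant bit first, zero seed, no final XOR), so this function need not reproduce the hash values quoted for Fig. 5; the function names are hypothetical. The split into a ten-bit RoughAdr and a six-bit MatchVal follows the text, and the one-to-one property for two-byte inputs holds for this variant.

```python
def crc16_ccitt(data, crc=0):
    """CRC with polynomial x^16 + x^12 + x^5 + 1, seeded to zero.

    Assumption: bits are processed most significant first.
    """
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def split_hash(crc_hash):
    """Return (RoughAdr, MatchVal) from a sixteen-bit CRC hash."""
    rough_adr = crc_hash & 0x3FF     # ten least significant bits
    match_val = crc_hash >> 10       # six most significant bits
    return rough_adr, match_val


# Hash values quoted in the text for Fig. 5 (hexadecimal).
for value in (0x2963, 0x05D6, 0x7DD6):
    rough, match = split_hash(value)
    print(f"{value:04X}: RoughAdr={rough:03X} MatchVal={match}")
# 2963: RoughAdr=163 MatchVal=10
# 05D6: RoughAdr=1D6 MatchVal=1
# 7DD6: RoughAdr=1D6 MatchVal=31
```

The second and third quoted values share a RoughAdr and differ only in MatchVal, so they select the same rough table entry and must be told apart by the match value.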
We found experimentally that the use of all sixteen bits of the prior two characters to identify a font gives an 8%-12% improvement in font encoding compression efficiency on "normal" English text when compared with the Type/Type/Prior Character method described in U.S. Patent 4,612,532 to Bacon et al. We also found experimentally that use of ten bits from the CRC hash, in the manner described hereinabove, produces fewer synonyms and therefore reduces execution time. This benefit is achieved because less time is spent linking through the fonts via the FTLink fields (see Fig. 5).
Huffman Encoding Tables
Fig. 4A shows a set of global Huffman tables and the associated access table. The Access Table 41 is indexed by Font Size 42 and contains pointers to the several Huffman tables for Font Code 43 and Font Bits 44 (the bit cost of the encoding). The Access Table is "Encoding Table" in the source code. Index 0, and the corresponding Font Code (1,0) and Font Bits (1,1), are not used. ECFontIndex is computed and stored during font encoding. Later, during string encoding, FontBits is retrieved and, during the output process, FontCode is retrieved. Figs. 4B, 4C and 4D each show a single Huffman table. Fig. 4B shows the table used for new character encoding, for string length encoding and for repeats encoding. Fig. 4C shows the table used for string length, mode B encoding. Fig. 4D shows the table used for the zone portion of string address encoding, mode A and mode B.
Font Encoding, Example 1. Finding the Current Character in the Current Font
Referring now to Fig. 5, let us consider the encoding of the following string:
"^Veni,^Vidi,^Vinci.^A^do"
In this string the caret character "^" has been substituted for the space character " " to reduce ambiguity. Fig. 5 shows the static state of the Font Encoding and Access Tables directly after processing the string. Beginning at an initial state having empty fonts, the process of encoding the first character proceeds as follows.
1. Initialization and Assignment of the First Font. As described above, each new character to be encoded is associated with a CRC hash. The ten least significant bits of the CRC hash 56 are used as a pointer to the ECFTRoughTable 59 (Encoder Font Rough Table). Since all fonts are empty at the outset, the ECFTRoughTable is initially null, indicating the need for new font creation. A font number is assigned and stored in the ECFTRoughTable in the position pointed to by the hash ("000" in the example given in Fig. 5). This font number is either the next available not-in-use font or an old font selected as described later.
The newly created font is initialized as follows:
FTLink = NULL
FTMatch = MatchVal from CRC calculation
FTNC = 0 (Most frequent)
FTSize = 1
First Font Character = Encoder Current Character
Other Font Characters = N/A
Following table reset, the first character to be processed is the "^". The prior two characters and the CRC are assumed to be 0. Thus a MatchVal and ten-bit RoughAdr of 0 are used. This points to FTRoughTable entry number 0 (which was initially null) and font number 1 was assigned. Font number 1 was initialized as specified above and has not changed since, as indicated by Fig. 5.
2. Finding the Current Font and the Current Character in the Font
a. The current font is accessed as follows. When "e" becomes the current character, a CRC hash is performed on "^" and "V". The result is hexadecimal 2963 (third row of the hash table in Fig. 5), giving a MatchVal of 28 and a RoughAdr of 163.
b. The RoughAdr of 163 is used to enter the FTRoughTable and yield the font number 0003, the font to be tested to determine if it is the font "^V".
c. To test if the selected font is the font "^V", MatchVal is compared with the FTMatch from the selected font. If these are equal, the font is searched for the occurrence of Encoder Current Character.
3. Storing the Current Character
a. If the current character is found, its position, the size of the font and other pertinent data are stored in the process buffers for later use by the encoding selection process. The character matching Encoder Current Character is promoted towards the top of the table (higher frequency) by exchange with the next higher frequency entity (character or NC).
b. If the current character is not found, it is added to the table in the next available position (overwriting the last character when the table is full) and the table size is incremented (if not full). The NC value is promoted one position towards the top of the table unless already at the top (highest frequency).
Font Encoding, Example 2. Finding the Current Font Using the Link Table
If FTMatch does not equal MatchVal, FTLink is examined. If FTLink is null, then the FTLink field is assigned the next font number and the flow joins step 1 above for the creation of a new font. If FTLink is not null, control proceeds to FTMatch comparison in step 2 with the FTLink field as the new font number. Linking and match comparison repeat until either the desired font is found or a new one is created.
The last line of the input data stream in Fig. 5 details the "o" character from the sequence "A^do". The calculated CRC hash for "^d" is 7DD6, which yields a MatchValue of 7C and a RoughAdr of 1D6. Note that the sequence "in", seven characters earlier, produced a CRC hash of 05D6, a MatchValue of 04 and a RoughAdr of 1D6. Access to entry 1D6 in the ECFTHashRough Table yields a pointer to font number 000C, but the FTMatch field in font 000C does not equal the desired value of 7C. At that point in time, the FTLink field of font number 000C was set to NULL. Consequently, font number 0013 was assigned, set to initial state and the character "o" was added to it. A future occurrence of the sequence "^d" can link to font number 0013 via font number 000C and search for or add characters as required.
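The font access of Examples 1 and 2 may be sketched, for illustration only, as the following Python fragment. The class Font and the function find_or_create_font are hypothetical names. The sketch numbers fonts sequentially from zero, models the rough table as a dictionary, and omits font reallocation and character promotion, so it is a simplified model of the lookup and not the method of the source listing.

```python
class Font:
    """Minimal model of a font as initialized in step 1."""

    def __init__(self, match_val, first_char):
        self.ft_link = None          # FTLink: next font sharing this RoughAdr
        self.ft_match = match_val    # FTMatch: MatchVal from the CRC hash
        self.nc = 0                  # FTNC: New Character position
        self.chars = [first_char]    # FTSize is len(self.chars)


def find_or_create_font(rough_table, fonts, rough_adr, match_val, ch):
    """Return (font_number, created) for the font of the prior two characters."""

    def new_font():
        fonts.append(Font(match_val, ch))
        return len(fonts) - 1

    number = rough_table.get(rough_adr)
    if number is None:                       # null rough table entry
        number = new_font()
        rough_table[rough_adr] = number
        return number, True
    while fonts[number].ft_match != match_val:
        if fonts[number].ft_link is None:    # end of chain: link a new font
            created = new_font()
            fonts[number].ft_link = created
            return created, True
        number = fonts[number].ft_link       # follow FTLink and test again
    return number, False


if __name__ == "__main__":
    table, fonts = {}, []
    print(find_or_create_font(table, fonts, 0x1D6, 0x04, "c"))
    print(find_or_create_font(table, fonts, 0x1D6, 0x7C, "o"))
```

The two calls in the usage example mirror the collision at RoughAdr 1D6 described above: the second font is reached only through the FTLink field of the first.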
Font Encoding, Example 3. New Character
When a character is encountered in the data stream that does not appear in the font defined by its prior two characters, it is encoded using one of four frequency encoding tables.
Fig. 7A shows the encoding of character "w" which follows, in the character stream 701, "No". As shown in Fig. 7A, looking at Font (No) 702, "w" does not appear, and NC 703 = 2, indicating that "New Character" has a virtual position between the position of "v" and the position of "n" in the font. Also, Font (No) contains four characters, so SZ 704 = 4. The two Global Font Encoding Tables shown in Fig. 7A 710 are two of the tables from Fig. 4A, corresponding to font size SZ = 4 (from Font (No)) + 2 (for NC and ST in mode A) 713 or SZ = 4 + 3 (for NC, PE and ST in mode B) 714. Mode A font size = (SZ) + 2. Mode B font size = (SZ) + 3. Position "2" 715 in these tables yields 709 the bit string "000" in the Global Font Encoding Tables 710 for either Mode A 711 or Mode B 712. String "000" will be transmitted by the encoder and will be recognized by the decoder as the "new character escape". This will indicate to the decoder that the next bits to be received are the encoding of a new character.
In a preferred embodiment, there are four NC to Frequency Encoding tables 705, identified as 00, 01, 10 and 11. Bits 5 and 6 706 from the prior character (in this example "o", and "o" = 6F in hexadecimal) are used to select one of these four NC to Frequency tables (in this example NC to FreqTable 11 707). The binary value of "w" (77 in hexadecimal, 708 in Fig. 7A) is used to enter the selected NC to Frequency table, yielding a position (or frequency) of 15, which defines an entry into the Global Code High/Low Table 716. This table, in turn, yields the Huffman code 01111, the font encoding of new character "w" following "No". The output bit stream sequence 717 is therefore 000 (font) followed by 01111 (frequency). The use of four tables, instead of the one table described in U.S. Patent 4,612,532 to Bacon et al, is found to improve compression efficiency. Of course, more or fewer than four tables could be used.
Process Buffers
The process buffers, shown in Fig. 2B, consist of nine "First In/First Out" buffers 201-209, each having 256 locations, which operate in parallel and share input and output pointers. These buffers are used by the font encoding and string encoding processes. Fig. 2B shows the flow of font encoding data among the process buffers and various tables. The contents and significance of the several buffers are as follows:
The ECChar buffer 203 contains the most recent 256 characters from the input stream to be encoded. Characters are received singly from the input stream, placed in rotation in ECChar, font encoded, and later string encoded. Least-cost selection and output formatting follow. The value range of ECChar is 0 - 255.
The ECCharCopy 202 buffer contains an exact copy of the ECChar buffer. It is contiguous with ECChar to facilitate string searching. The value range of ECCharCopy is 0 - 255.
ECType 209 is a steering value which is set by the font encoding and/or the string encoding process. ECType is used by the output format process to control the output bit stream. ECType may have any one of the following values:
0 - String or pair continuation (the second or subsequent character of a mode A string or a mode B string or the second character of a pair encoding).
2 - Font encoding. The encoding is the relative offset of the character in the selected Font.
4 - New character.
6 - First character of a pair encoding.
8 - First character of a string encoding.
ECFontIndex 207 is the zero relative index into the FontCode or FontBits tables for this character. By using the value of ECFontIndex as an index, either the encoding size in bits or the actual encoding bit pattern can be accessed quickly. The value range of ECFontIndex is 2 - 43 as shown in Fig. 4A.
ECFrequency 208 is the frequency value of the character. It is obtained by using the character as an index into the NC to FreqTables (Fig. 7A) . The value range of ECFrequency is 0 - 255.
ECHashRaw0 2040 contains the eight least significant bits of the CRC hash computed from the prior two characters in the input stream. The value range of ECHashRaw0 is 0 - 255. Data is shown in hexadecimal in Fig. 2B.
ECHashRaw1 2041 contains the eight most significant bits of the CRC hash computed from the prior two characters in the input stream. The value range of ECHashRaw1 is 0 - 255.
ECHashX20 2050 contains the eight least significant bits of zero relative font number multiplied by two. This value is maintained for quick access to the ECFTHashNext table. The value range of ECHashX20 is 0 - 254, even numbers. Data is shown in hexadecimal in Fig. 2B.
ECHashX21 2051 contains the eight most significant bits of zero relative font number multiplied by two. This value is maintained for quick access to the ECFTHashNext table. The value range of ECHashX21 is 0 - ((MaxFontTable-1)*2)/256.
ECNewIndex 206 is the zero relative index into FontCode or FontBits representing the New Character position in this Font. The value of ECNewIndex is derived from Font Size and font-relative new character position. (During font encoding, ECNewIndex is computed and stored. Later, during string encoding, FontBits is retrieved and, during the output process, FontCode is retrieved. See Fig. 4A.) Similarly, for pair encoding and/or string escapes, the value of ECNewIndex is incremented by 1 or 2 and the bit cost or pattern quickly determined. The value range of ECNewIndex is 2 - 41.
ECRepeats 201 is the count of repeats of this character beyond two. That is, the two prior characters are the same as this one. The buffer pointer will not advance as long as subsequent input characters remain the same and ECRepeats is less than or equal to 255. The value range of ECRepeats is 0 - 255.
Font Encoding Process Flow
Font encoding process flow is shown in Fig. 1B, first phase, steps 1 through 10. Font encoding and font update processing are performed in steps 1 through 10. This series of steps occurs once for each character of input. In this process, known as "refill", a character is added to the process buffer and the current input pointer is advanced by one. The steps (shown in Fig. 2B as S1, S2, S3, etc., corresponding to step 1, step 2, step 3, etc.) are as follows:
1. A character from the input stream is fetched and stored in the current input ECChar field 210.
2. The same character is stored in the current ECCharCopy field 211. (The relationship of ECChar and ECCharCopy is shown in Fig. 2A.)
3. The current character is compared with the two prior characters in the input stream. If equal, the ECRepeats field is incremented (e.g. 212 in Fig. 2B) and, if the ECRepeats field is less than or equal to 255, flow proceeds to step 1 above. This loop ensures that no more than three consecutive like characters are stored in the history buffer (except when the number of consecutive like characters exceeds 258).
4. The CRC hash is computed on the two prior characters in the input stream (as described under "Font Encoding, CRC Hash and Font Access" hereinabove) and the result is stored in the low and high bytes of ECHashRaw 213 for later use.
5. The appropriate font 214 is accessed or created (as described hereinabove). If the font exists, the font number from FT Rough Table 215 is stored in the ECHashX2 table high and low bytes (217 and 216) and the character fetched in step 1 above is looked up in the font 31. If the font is created (new font), ECHashX2 is set to NULL.
6. Using SZ (the number of characters in the font) from the accessed font, the Access Table of Fig. 4A 41 is accessed for a pointer 218 to be used as an index value. Neither the FontCode nor the FontBits table is used at this time.
7. The index value fetched in step 6 is added to the NC (NewChar position) 219 from the font accessed in step 5. The result 220 is stored in ECNewIndex for later use as a NewChar or String Escape. If the current character (from step 1) was not found in the accessed font, the ECType field is set to 4, denoting a NewChar encoding.
8. If the current character (from step 1) was found in the accessed font, the raw position 221 of that character in the font is added to the index value 222 fetched in step 6 and the result 223 is stored in the current ECFontIndex field. If the character position is greater than or equal to the NC (NewChar) value from the font, the ECFontIndex field is incremented by 2 if in Mode A and by 3 if in Mode B, allowing for the virtual positions of the NewChar, Pair Encoding and/or String Escapes. The ECType field is set to 2 224, denoting that the character was found in the font, a "Font Encoding".
9. If in Mode B or if the current character (from step 1) was not found in the font, the appropriate one of four ECNCFrequency tables 225 (selected from bits 5 and 6 of the immediately prior character) is selected (Fig. 7A). The frequency value 226 corresponding to the current character is fetched from the selected table and stored in the current input position of the ECFrequency field 227. This is for later use as a new character encoding or for 8-bit output in antiexpansion mode.
10. The current input pointer 228 into the process buffer is incremented by one. If the number of characters in the process buffer array is now 256, or the Font Trending Switch has changed from Mode A to B (or vice versa), or a timer-initiated flush occurs, flow proceeds to step 11 below for string processing and output. Otherwise flow proceeds to step 1 above.
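The index arithmetic of steps 6 through 9 may be sketched in Python as follows, purely as an illustration. All function names are hypothetical. The Access Table pointer of step 6 is passed in as the parameter base because the contents of Fig. 4A are not reproduced here, and bits 5 and 6 are assumed to be counted from the least significant bit as bit 0, which is consistent with "o" (6F hexadecimal) selecting table 11.

```python
def nc_table_select(prior_char: int) -> int:
    """Select one of the four NC to Frequency tables (0..3).

    Assumption: bits are numbered from 0 at the least significant bit.
    """
    return (prior_char >> 5) & 0b11


def new_index(base: int, nc: int) -> int:
    """Step 7: ECNewIndex = Access Table pointer + NC position."""
    return base + nc


def font_index(base: int, position: int, nc: int, mode: str) -> int:
    """Step 8: ECFontIndex for a character found at `position` in the font.

    Characters at or beyond NC are shifted past the virtual escape slots:
    two slots in Mode A, three in Mode B.
    """
    if mode not in ("A", "B"):
        raise ValueError("mode must be 'A' or 'B'")
    virtual_slots = 2 if mode == "A" else 3
    shift = virtual_slots if position >= nc else 0
    return base + position + shift


if __name__ == "__main__":
    print(nc_table_select(0x6F))
    print(font_index(base=10, position=2, nc=2, mode="A"))
```

The base value of 10 in the usage example is arbitrary and serves only to show the offsets.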
Steps 11 through 20, including string search, least cost encoding selection and output, are described hereinbelow under "Second Phase Processing".
Font Reallocation: As input context changes, old fonts go out of use and new ones are created. Since there is a limit to the number of practical (actual) fonts in a preferred embodiment (e.g. 1024), a method for reassigning fonts is required. In the preferred embodiment this is a circular (low to high then back to low) replacement heuristic. An alternative embodiment may also include a "less recently used" heuristic. The next three paragraphs describe the combination. (The source listing of Appendix 1 details the circular heuristic only.) Since the fonts are linked in chains starting at FTRoughTable and forward-only linked via FTLink, the circular reallocation process points into the FTRoughTable, advancing from 0 through 1023 and back to 0. The selected font, and subsequently linked fonts (if any) as indicated by FTLink, are examined for potential reuse.
Each time a font is accessed by the previously described Font Encoding Process, an unused bit of the FTHashNext field is set to 1, indicating activity. As the reallocation process traverses the fonts, it will reset the activity bit if it is set and link to the next candidate font. If the activity bit is already reset, the font will be reallocated as a new font. By the use of the single activity bit, any given font has the opportunity to survive permanently provided that it is used at least once per pass of the reallocation search process.
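The activity-bit scan just described may be sketched in Python as follows. This is a simplified model under stated assumptions, not the source listing: the function name pick_font_to_reuse is hypothetical, fonts are modeled as dictionaries with "active" and "link" entries, every font is assumed reachable from the rough table, and the scan position is tracked only by rough table slot and not by a pointer into the middle of a chain.

```python
def pick_font_to_reuse(rough_table, fonts, start=0):
    """Return (font_number, slot) for the next font to reallocate.

    Scans rough table slots circularly. A font whose activity bit is set
    has the bit cleared and is skipped; the first font found with the bit
    already clear is unlinked from its chain and returned.
    """
    if all(entry is None for entry in rough_table):
        raise ValueError("no fonts are allocated")
    size = len(rough_table)
    slot = start
    while True:
        prev = None
        number = rough_table[slot]
        while number is not None:
            font = fonts[number]
            if font["active"]:
                font["active"] = False           # give it one more pass
                prev, number = number, font["link"]
                continue
            if prev is None:                     # head of chain: patch table
                rough_table[slot] = font["link"]
            else:                                # mid-chain: patch predecessor
                fonts[prev]["link"] = font["link"]
            font["link"] = None
            return number, slot
        slot = (slot + 1) % size


if __name__ == "__main__":
    table = [None, None, 0x0C, None]
    fonts = {0x0C: {"active": False, "link": 0x13},
             0x13: {"active": True, "link": None}}
    print(pick_font_to_reuse(table, fonts, start=2))
```

Because a full pass clears every activity bit, the scan in this sketch terminates within two passes whenever at least one font is allocated.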
For example, referring to Fig. 5, assume that the main reallocation pointer is pointing to the FTRoughTable at hexadecimal 1D6. The FTLink field of font 000C will be examined for the activity bit. Assuming it to be reset, font 000C will be the next assigned for a new font. This is done by copying the contents of the FTLink field (in this case 0013) into the FTRoughTable at 1D6, thus freeing font 000C. The reallocation pointer is moved to 0013 for use in the next allocation cycle.
String Encoding
The string encoder of the present invention uses a circular history buffer to store a sliding dictionary. The history buffer is a dictionary of all the strings it contains. String encoding may operate in one of two modes, mode A (using the tables in Fig. 8A) for use on relatively compressible text or mode B (using the tables in Fig. 8B) for use on less compressible text. In both modes, string encoding is designed to achieve near-optimum compression efficiency under the time constraints of on-line operation. The history buffer is tagged at regular intervals and, in a preferred embodiment, is tagged every second character position. The string encoder of the present invention also uses a novel dictionary access structure having a set of tables for accessing the history buffer. Updating the history buffer involves very little processing because it involves no more than accepting the next character and incrementing a pointer. However, updating the dictionary access structure is as challenging a problem as updating the sliding dictionary in string encoding systems which store the sliding dictionary as a tree structure. The present invention addresses this problem by the use of a novel history buffer access method. The method is based on the structure of the history buffer access tables as shown in Figs. 8A and 8B and it retrieves and uses the same CRC hash codes created and used in the font update process during font encoding. Accordingly, by use of this method, updating of the dictionary access structure is faster and requires less processing than updating a tree structure would require.
The use of a tagged history buffer provides additional benefit for accessing and matching strings. String encoding mode A, using a tagged history buffer, locates longer strings in a shorter time than earlier methods. While the process searches the same number of candidates, the process encounters shorter linked lists in the access buffers than would otherwise occur. Processing time spent building and searching access tables is beneficially reduced.
The history buffer/dictionary access structure, in a preferred embodiment, includes a history buffer and access tables. The history buffer and the access tables shown in Figs. 8A and 8B are used by the string encoding process of mode A and the string encoding process of mode B respectively. Both Figs. 8A and 8B show an ECRR (History) Buffer (1 byte wide) 81 with a Next Available Buffer Position 82 and an ECRR Suffix (256 positions) 83. Both Figs. show an ECRR Hash Head Buffer (2 bytes wide) 84. Both Figs. show an ECRR Hash Buffer containing an ECRR Hash Link portion (2 bytes wide) 85 and an ECRR Hash Test Portion (2 bytes wide) 86. Both Figs. show the derivation 87 and 88 of the CRC hash used as entry to the ECRR Hash Head Buffer 84. Both string encoding mode A and string encoding mode B use the CRC hash created earlier during font encoding and stored in the ECHashRaw table (see Fig. 2B). However, each of these processes uses the CRC hash in a slightly different way. String encoding mode B uses the CRC hash (a hash based on two consecutive characters) directly. String encoding mode A uses a novel algorithm (which includes the CRC hash) to create a hash based on two consecutive pairs of characters (four consecutive characters), as illustrated by the following example for the four characters "THEY":
"TH" [CRC hash] yields XXXX (16 bits)
"EY" [CRC hash] yields YYYY (16 bits)
XXXX ⊕ (0-YYYY) yields ZZZZ (16 bits)
where ⊕ is exclusive OR, 0-YYYY is zero minus YYYY and ZZZZ is the resultant hash.
String Encoding. Mode A
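The four-character hash used by mode A, formed as XXXX exclusive-ORed with zero minus YYYY, may be sketched in Python as follows. The function name mode_a_hash is hypothetical, and the sketch assumes that "zero minus YYYY" is taken modulo 2^16 (two's complement in sixteen bits), which the text implies by stating that the result is sixteen bits wide.

```python
def mode_a_hash(first_pair_crc: int, second_pair_crc: int) -> int:
    """Combine the CRC hashes of two consecutive character pairs.

    first_pair_crc  -- XXXX, the CRC hash of the first pair (e.g. "TH")
    second_pair_crc -- YYYY, the CRC hash of the second pair (e.g. "EY")
    Returns ZZZZ = XXXX XOR (0 - YYYY), kept to sixteen bits.
    """
    negated = (0 - second_pair_crc) & 0xFFFF   # zero minus YYYY, modulo 2**16
    return (first_pair_crc ^ negated) & 0xFFFF


if __name__ == "__main__":
    print(f"{mode_a_hash(0x1234, 0x0001):04X}")
```

Negating the second hash before the exclusive OR means that two identical pairs (XXXX equal to YYYY) do not in general cancel to zero, as they would under a plain exclusive OR.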
Every second character position in the history buffer is tagged and the tags are used to index the string search process. Each tagged position has corresponding Hash Link and Hash Test fields. String encoding for mode A includes the following steps:
1. Set LookAhead = 3. (Fig. 9 shows a Data Stream 91, a History Buffer ECRR 92 with a Next Available Buffer Position 93, a CC Buffer 94 with a Current Character 95, and a Pointer "p" 96. The pointer 96 is shown for Mode A to have a First Start Point for String Search 901 displaced 3 characters from the position of the current character and a Second Start Point for String Search 902 displaced 2 characters from the position of the current character.) Set pointer p to CCBuffer pointer (ECNextChar pointer in Fig. 2) plus a number of characters equal to LookAhead.
2. Create the hash for the string of four characters starting at the "p"th character as described hereinabove.
3. Use the least significant eleven bits of the hash (ZZZZ in the example above) as a pointer (e.g. 1811 in Fig. 8A) to enter the ECRR Hash Head Table of Fig. 8A. Set pointer "h" to the first potential match by using the contents of the ECRR Hash Head field (e.g. 7300 in Fig. 8A) to point to the most recent four-character string in the history buffer, starting at a tagged location, that hashes to that same hash.
4. Find the longest match:
a) Set n = 3
b) Set x = 0
c) Compare (one character at a time) the character at (p + n - x) in the CC buffer with the character at (h + n - x) in the history buffer, incrementing x by 1 until x = n or no match. The "fast reject step" is when x = 0.
d) Increment n by 1 and compare the character at (p + n) in the CC buffer with the character at (h + n) in the history buffer until no match.
Continue to search for the longest match as follows. Use pointer "h" to enter the ECRRHashLink table (at 7300 in Fig. 8A). Reset pointer "h" from the content of the ECRRHashLink table so that pointer "h" points to the next most recent four-character string in the history buffer (7284 in Fig. 8A). In each search, using steps b through d above, begin comparing characters for match starting at character n, where n is the length of the current longest match. Continue until the end of the linked list, as indicated by a non-match of the hash with the corresponding entry in the ECRR Hash Test field or, to prevent looping, until MaximumASearches (eight in the preferred embodiment) have been performed.
Store length and location of longest match if n (length of longest match) > 3.
5. Backmatch, as follows, to maximize the length of the string:
a) First time through (LookAhead = 3), check until no match: the character preceding the first character, the character preceding that and then the character preceding that (the current character).
b) Within repeat steps (from step 6, LookAhead = 2), check until no match: the character preceding the first character and then the character preceding that (the current character).
6. Repeat steps 1-5 with LookAhead = 2.
7. Select from the outputs of steps 5a and 5b the string which:
a) is the longest;
b) if the strings are equal, the one that is most recently stored.
The following advantages follow from the structure and method of string encoding mode A:
a) History buffer update processing time is reduced when the history buffer is accessed at fewer entry points than every character. In the preferred embodiment, the history buffer update processing time is reduced by a factor of two because the history buffer access table update takes place every second character instead of every character.
b) String search processing time is reduced when the history buffer is accessed at fewer entry points than every character. In the preferred embodiment, the linked list to be searched is, on average, only one-half the size it would otherwise be (the list is drawn from a population of candidates only one-half the size it would otherwise be).
c) Less memory is required for the ECRR Hash-Link/Hash-Test Table because, in the preferred embodiment, it is only one-half the size it would otherwise be.
d) The end of the linked list is determined dynamically by comparing the current hash code with the content of the ECRR Hash Test field. Thus the need to maintain end of list pointers or link length pointers or the like is eliminated. Because the end of the linked list is determined dynamically, no maintenance is required for the overwritten string.
e) Non-matches are eliminated faster and with fewer processing steps because each search starts at p + n. This "fast reject" technique ensures that the candidate string is rejected immediately if it cannot be at least one character longer than the previous longest match.
String Encoding. Mode B
Every second character position in the history buffer is tagged and the tags are used to index the string search process. Each tagged position has corresponding Hash Link and Hash Test fields. String encoding for mode B includes the following steps:
1. Set pointer p to CCBuffer pointer + 1. (Fig. 9 shows a Data Stream 91, a History Buffer ECRR 92 with a Next Available Buffer Position 93, a CC Buffer 94 with a Current Character 95, and a Pointer "p" 96. The pointer 96 is shown for Mode B to have a First Start Point for String Search 903 displaced 1 character from the position of the current character and a Second Start Point for String Search 904 coincident with the position of the current character.)
2. Retrieve the CRC hash from ECHashRaw (Fig. 2B) for the string of two characters starting at the "p"th character.
3. Use the least significant eleven bits of the (16 bit) CRC hash as a pointer (2048 positions) to enter ECRR Hash Head Table (0012 in Fig. 8B) . Set pointer "h" to the start of the first potential match by using the contents of ECRR Hash Head field (0006 in Fig. 8B) to point to the most recent two-character string in the history buffer, starting at a tagged location that hashes to a CRC hash that has the same least significant eleven bits (ZQ in Fig. 8B) .
4. Find the longest match having three or more characters:
a) Compare the character at (p - 1) in the CC buffer with the character at (h - 1) in the history buffer and terminate if no match. This is the "fast reject step".
b) Set n = 0
c) Compare (one character at a time) the character at (p + n) in the CC buffer with the character at (h + n) in the history buffer, incrementing n by 1 until no match.
Continue to search for the longest match as follows. Use pointer "h" to enter the ECRRHashLink table (at 0006 in Fig. 8B). Reset pointer "h" from the content of the ECRRHashLink table so that pointer "h" points to the next most recent two-character string in the history buffer (3750 in Fig. 8B). In each search, use steps 4a through 4c above (or steps 5a through 5c below). Continue until the end of the linked list, as indicated by a non-match of the hash with the corresponding entry in the ECRR Hash Test field or, to prevent looping, until MaximumBSearches (sixteen in the preferred embodiment) have been performed. Store length and location of longest match if n (length of longest match) > 2.
5. Set p to CCBuffer pointer (the current character) and repeat steps 2 through 4, using the following steps a, b and c instead of steps 4a, 4b and 4c to find the longest match:
a) Compare the character at (p + 2) in the CC buffer with the character at (h + 2) in the history buffer and terminate if no match. This is the "fast reject step".
b) Set n = 0
c) Compare (one character at a time) the character at (p + n) in the CC buffer with the character at (h + n) in the history buffer, incrementing n by 1 until no match.
6. Select from the outputs of step 4 and step 5 the string which:
a) is the longest;
b) if the strings are equal, the one that is most recently stored.
String Length Encoding
String lengths are encoded differently for mode A string encoding and mode B string encoding.
In mode A, string lengths are encoded using the GlobalHigh/Low table. Further, the encoding is slightly different depending upon the method of string escape.
i) If the escape follows creation of a font, MinimumAString (which is 6) is subtracted from the actual length of the string and the result is used to index the GlobalHigh/Low table.
ii) If the escape follows an old (existing) font, MinimumAString (which is 6) is subtracted from the actual length of the string, four is added, and the result is used to index the GlobalHigh/Low table. The four is added because the bit pattern 11, which begins the first four entries in the GlobalHigh/Low table, is reserved to signify a Pair Encoding.
The selected Huffman pattern from the GlobalHigh/Low table is placed into the output stream. String length encoding, mode A, old font, is illustrated as the second operation in Fig. 7C.
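The mode A index calculation described above may be sketched in Python as follows. The function name mode_a_length_index and the parameter old_font are hypothetical, and the GlobalHigh/Low table itself is not modeled; the sketch computes only the index into it.

```python
MINIMUM_A_STRING = 6   # minimum mode A string length stated in the text


def mode_a_length_index(length: int, old_font: bool) -> int:
    """Return the GlobalHigh/Low table index for a mode A string length.

    old_font is True when the string escape follows an existing font, in
    which case the first four table entries are skipped because they are
    reserved for Pair Encoding.
    """
    if length < MINIMUM_A_STRING:
        raise ValueError("mode A strings are at least 6 characters long")
    index = length - MINIMUM_A_STRING
    return index + 4 if old_font else index


if __name__ == "__main__":
    print(mode_a_length_index(9, old_font=True))
```

For the nine-character string of Fig. 7C following an existing font, the index is 9 - 6 + 4 = 7.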
In mode B, string lengths are encoded by subtracting MinimumBString (which is 3) from the actual length of the string. If the result is less than nine, the LengthBCode table is used to encode the string length. If the result is greater than or equal to nine, the further escape 0010 is output, an additional nine is subtracted from the result above, and the new result is used to index the GlobalHigh/Low table. The selected Huffman pattern from the GlobalHigh/Low or LengthBCode table is placed into the output stream.
String Pointer Encoding
String pointer encoding for both mode A and mode B proceeds as follows:
The history buffer location of the first string character is subtracted from the Next Buffer Store location. Buffer wraparound, if any, is corrected such that the result is the displacement from the found string to the Next Buffer Store location and is in the range 0 through BufferSize-1. Note that strings closest in recent history (newer) have lesser displacements than do older strings.
Example 1. (using 8192 character buffer)
                         Decimal   Hexadecimal
Next Store location        3152       0C50
Found string location     -1511      -05E7
String Displacement        1641       0669
Example 2. (using 8192 character buffer)
                         Decimal   Hexadecimal
Next Store location        0052       0034
Found string location     -8157      -1FDD
                          -8105      -1FA9
Correction                +8192      +2000
String Displacement          87       0057
With the BufferSize in the preferred embodiment selected as 8192, the calculated displacement can be expressed in thirteen bits.
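The displacement calculation, including the wraparound correction, may be sketched in Python as follows. The function name string_displacement is hypothetical. The sketch relies on the fact that Python's modulo operator returns a non-negative result for a positive modulus, which performs the correction of Example 2 in a single step.

```python
BUFFER_SIZE = 8192   # history buffer size of the preferred embodiment


def string_displacement(next_store: int, found: int) -> int:
    """Distance from the found string back to the Next Buffer Store location.

    The result is always in the range 0 through BUFFER_SIZE - 1, so it fits
    in thirteen bits.
    """
    return (next_store - found) % BUFFER_SIZE


if __name__ == "__main__":
    print(string_displacement(3152, 1511))   # Example 1
    print(string_displacement(52, 8157))     # Example 2 (wraparound)
```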
The displacement is further broken into two components:
A) A zone portion from the most significant five bits.
B) An offset portion from the least significant eight bits.
In the proper string encoding context (i.e. after appropriate string encoding escapes) the five bit zone is Huffman coded using the ZoneCode table and the eight bit offset is inserted directly into the output stream. Thus the Zone may be encoded using from 2 to 7 bits depending upon zone value, with the strings closest in recent history getting favorably shorter encodings.
Example.
               Hexadecimal   Binary
String Disp       0057       0000 0000 0101 0111
                                Z ZZZZ OOOO OOOO
OOOO OOOO = Offset = 57
Z ZZZZ = Zone = 0
String offset encoding is also illustrated as the third and fourth operations in Fig. 7C.
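The division of the thirteen-bit displacement into zone and offset may be sketched in Python as follows. The function name split_displacement is hypothetical, and the Huffman coding of the zone through the ZoneCode table is not modeled.

```python
def split_displacement(displacement: int) -> tuple:
    """Split a thirteen-bit displacement into (zone, offset).

    zone   -- the most significant five bits (Huffman coded on output)
    offset -- the least significant eight bits (output directly)
    """
    if not 0 <= displacement < (1 << 13):
        raise ValueError("displacement must fit in thirteen bits")
    zone = (displacement >> 8) & 0x1F
    offset = displacement & 0xFF
    return zone, offset


if __name__ == "__main__":
    zone, offset = split_displacement(0x0057)
    print(f"zone={zone} offset={offset:02X}")
```

Applied to the displacement 8CE (hexadecimal) of Fig. 7C, the same split gives an uncoded zone value of 8 and an offset of CE, whose binary form 11001110 is the offset shown in that figure.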
Fig. 7C illustrates string encoding, mode A, and shows a Character Stream 751 with a character string beginning with "w" 752, Font (No) 742 and Global Font Encoding Table 743 yielding, for a font size value SZ = 4, at entry point 3 (3 = String Escape = NC + 1) 744, a Font String code 001 724. Fig. 7C shows a History Buffer 753 having a character string beginning with "w" at location 933 (hexadecimal) 754. Fig. 7C shows that a string of nine characters 752 in the character stream match the nine characters in the history buffer beginning at location 933 754. The "Global Code" or Global Frequency Encoding Table 745 is entered at entry point 7 (9 - 6 + 4 = 7) 755 to create a Length Code of 1011 756. The string location 933 (hexadecimal) 754 is subtracted from the location of the Next History Buffer Location 1201 (hexadecimal) 757 to yield 8CE (hexadecimal), whose 13 least significant bits 758 comprise the displacement, which is broken into two components: i) a zone portion from the most significant five bits 759 and ii) an offset portion from the least significant eight bits 760. The five bit zone portion is Huffman coded using the Zone Code table 761 and the eight bit offset is inserted directly in the output stream. The Output Bit Stream Sequence 752 includes 1st: Font (001), 2nd: Length (1011), 3rd: Zone (01001) and 4th: Offset (11001110).
Minimum String Length and Search Advance
In both mode A and mode B string encoding, the string search process discards matches having fewer than a predetermined number of characters, the predetermined number being greater than the hash length. Thus, we define a minimum string length. The minimum string length can be greater than the hash length and it is advantageous to make it so. In mode A the hash length is 4 and the predetermined number is 6. In mode B the hash length is 2 and the predetermined number is 3. Setting a lower limit on the length of the string reduces the bit-cost of encoding longer strings because the top (shortest code) entry into the Huffman table is used to represent a string of the minimum length. On completion of a string search, if no match is found, a predetermined number of characters (3 if mode A and 1 if mode B) are released (in font encoded or pair encoded form) and the search pointer is advanced by a corresponding number of positions before the next search.
Pair Encoding
Pair Encoding is a novel method for encoding character pairs. Up to 1024 fonts, those associated with recently encountered character pairs, are maintained in memory principally for the purpose of font encoding. Pair encoding takes advantage of the unambiguous one-to-one mapping between the input character pairs and the fonts effected using the CRC hash and the MatchVal. Since there are 1024 fonts maximum, ten bits (2^10 = 1024) may be used to encode any of the character pairs that these fonts represent. Thus, other than escape bit sequences, ten bits is all that is required to encode many character pairs. Assuming an average escape sequence of three bits, the resulting thirteen bit encoding compares quite favorably with the sixteen bits for two uncompressed characters, especially in computer binary code files (e.g., .COM and .EXE).
In addition to the fonts and access structure maintained by the encoder, the decoder maintains a table of the actual two characters which are associated with each font. Thus it can do a direct lookup when directed by the encoded bit stream.
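A minimal sketch of how a pair encoding could be assembled: the escape bits are followed by the font number written as a fixed ten-bit field. The function name `pair_encoding_bits` and the use of character strings to hold bits are assumptions for illustration only. The values passed in are those quoted for the Fig. 7B example in this section (String Escape 001, Pair Encoding Escape 11, font number 215 hexadecimal).

```python
def pair_encoding_bits(escape_bits, font_number, max_fonts=1024):
    """Return the bit string for one pair encoding: escapes + font number."""
    if not 0 <= font_number < max_fonts:
        raise ValueError("font number out of range")
    width = (max_fonts - 1).bit_length()  # 10 bits for 1024 fonts
    return escape_bits + format(font_number, "0{}b".format(width))


# String Escape "001" + Pair Encoding Escape "11" + ten-bit font number.
bits = pair_encoding_bits("001" + "11", 0x215)
print(bits, len(bits))
```

On the decoder side the ten-bit field would index the table of character pairs directly, so no search is needed to recover the two characters.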
Example. Refer to Fig. 7B, which shows an input character stream 731 with a character pair "w^" 732, Font (No) 722 (font address hex 195) and a Global Font Encoding Table 723 yielding, at entry point 3 725, a Font String Code 001 724. Assume that the character pair "w^" (lower case w and caret) has occurred previously in the input character stream and has font number 215 (hexadecimal) assigned to it by the font encoding process. The sequence "w^" 732 has occurred again following "No" in the input stream and is next to be processed for output by the encoder. After determining that the pair "w^" exists, and that Pair Encoding is the least cost, the encoder, entering the Global Font Encoding Table 723 of Font Size 6 (Font Size = SZ + 2 for Mode A) at NC+1 (String Escape 724), emits the String Escape "001" from font "No" 735, followed by the Pair Encoding Escape "11" 736 and the ten bit value "1000010101" (from binary of hex 215, the font number of "w^") 737, creating output bit stream 738.
Font Escapes
A font escape is a bit encoded sequence which serves as a signal from the encoder that the subsequent item is to be treated differently from that normally expected. A font encoded sequence that signifies that a NewChar follows in the data stream is an escape. It is used as an Escape to signal GlobalNC encoding. Another escape is String Escape. This is a bit sequence specifically to condition the decoder for reception of a string. When used in the context of a Font encoding/decoding, String Escape has a value equal to NewChar Escape + 1 when String "A" mode is active. In string mode B the font has 3 escapes:
1) New character. Value = Font NC.
2) Pair encoding. Value = Font NC + 1.
3) String mode B. Value = Font NC + 2.
Other escapes are described under Detail of Specific Encodings hereinbelow.
Second Phase Processing
Second Phase Processing, steps 11 through 20, includes string search, least cost encoding selection, formatting and output. Throughout these steps, the pointer into the process buffer is the current output pointer which is from 1 to 256 characters behind (older than) the current input pointer.
11. According to the state of the mode switch, the correct string search routine is invoked: String Search Mode A or String Search Mode B.
12. If a string of less than the minimum length (3 if Mode B and 6 if Mode A) is found in step 11, proceed to step 15. Otherwise, the bit cost of the string is computed by summing the costs of String Escape, String Length, Zone Code and the String Offset of 8, as follows:
a. Fetch the ECNewIndex value corresponding to the first character of the string and add 1 if Mode A or 2 if Mode B. Use the result to access the FontBits section of the Global Font Encoding Table of Fig. 4A. The retrieved value from FontBits is the bit cost for the String Escape.
b. If Mode A is active, subtract 6 and add 4 to the string length and use this result to access the GlobalBits table. If Mode B is active, subtract 3 from the string length and use this result to access the LengthBBits table. This is the bit cost for the string length.
c. Subtract the position of the first character of the string from the next history buffer store location and divide the result by 256, giving the zone. Using the computed zone, access the ZoneBits table. This is the bit cost for the Zone encoding.
d. The bit cost of the String Offset is 8.
e. Add items a through d. This sum is the total bit cost of the string.
13. Compute the bit cost of equivalent font encoding for each position corresponding to a character in the string using step a or b below. Subtract this bit cost from the total from step 12. If underflow occurs (the result goes negative) at any point, exit step 13 since the string encoding wins over the font encoding. If all corresponding positions are examined without underflow occurring, font encoding has a lesser or equal bit cost and will be used, so proceed to step 15.
a. If the ECType field is 4, use the ECNewIndex field to access the FontBits table for the bit cost of NewChar Escape. Use the ECFrequency field to access the GlobalBits table for the bit cost of the NewChar.
b. If the ECType field is 2, use the ECFontIndex field to access the FontBits table for the bit cost of a font encoding.
14. If string encoding wins as indicated in step 13, change the ECType field corresponding to the first character of the string to an 8 (denoting String Encoding) and then change the ECType field corresponding to all remaining characters of the string to a 0 (denoting string continuation). Set UpdateLength to string length. Proceed to step 19.
15. Examine the ECHashX2 field corresponding to the character at the current output position + 2. If NULL (the font exists in the encoder but does not yet exist in the decoder) proceed to step 18; otherwise compute the cost of a Pair Encoding as follows:
a. Fetch the ECNewIndex value corresponding to the current output position and add 1. The result is used to index the FontBits table. This is the bit cost for the Pair Encoding Escape.
b. Add 10 to the result of step a. This is the total Pair Encoding cost.
16. Compute the bit cost of equivalent font encoding for each of the two characters in the pair (current output position and current output position + 1) using step a or step b below. Subtract this bit cost from the total in step 15. If underflow occurs (the result goes negative) at any point, exit step 16 since the pair encoding wins over the font encoding. If the two positions are examined without underflow occurring, font encoding has a lesser or equal bit cost and will be used, so proceed to step 18.
a. If the ECType field is 4, use the ECNewIndex field to access the FontBits table for the bit cost of NewChar Escape. Use the ECFrequency field to access the GlobalBits table for the bit cost of the NewChar.
b. If the ECType field is 2, use the ECFontIndex field to access the FontBits table for the bit cost of a font encoding.
17. If pair encoding wins as indicated in step 16, change the ECType field corresponding to the first character of the pair to a 6 (denoting Pair Encoding) and then change the ECType field corresponding to the next character of the pair to a 0 (denoting string/pair continuation). Set UpdateLength to 2. Proceed to step 19.
18. Set UpdateLength to 1. This is to be a font or NewChar encoding.
19. Access the ECType field at current output position, format and output the bit sequences illustrated in Figs. 10A through 10D. Access the ECRepeats field at current output position; if greater than 0, output the repeat count using the GlobalCode (High and Low) table. Add each output character to the history buffer and associated access tables. Increment current output position, decrement UpdateLength. Repeat step 19 while UpdateLength is greater than 0.
20. If a Flush or Mode change operation is in process, repeat steps 11 through 19 until the process buffer is empty (current output position equals current input position). Otherwise proceed to step 1.
Compressibility and Encoding Process Switching
The following process is found to provide a useful measure of data compressibility. Every forty-eight new characters (i.e. characters not found in the font associated with the previous two characters) , the cumulative bit-cost of encoding the previous ninety-six such characters is compared with a preset value. It is of no consequence to this process that such character might later be encoded as part of a string.
Every forty-eight NewChars (which may be more than forty-eight input characters) , the current sum in NCBitsNew is added to the previous forty-eight character sum from NCBitsPrior and the result compared to the constant 96 * 7.5 (representing 96 characters at 7.5 bits per character). If there are less than 96 * 7.5 bits in the result, the Compressibility Trending Switch is turned OFF (or remains OFF). If the result is 96 * 7.5 or greater, the Compressibility Trending Switch is turned ON (or remains ON) . After the calculation, the current NCBitsNew is stored in NCBitsPrior in preparation for the next cycle forty-eight NewChars later.
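The trending calculation above can be sketched as follows. This is an illustrative model only: the class name `CompressibilityTrend`, the zero starting values, and clearing the current sum after each evaluation are assumptions not stated in the text. The names `nc_bits_new` and `nc_bits_prior` mirror NCBitsNew and NCBitsPrior.

```python
class CompressibilityTrend:
    """Tracks NewChar bit costs in blocks and derives the trending switch."""

    def __init__(self, block=48, bits_per_char=7.5):
        self.block = block
        self.threshold = 2 * block * bits_per_char  # 96 * 7.5 = 720 bits
        self.nc_bits_new = 0     # cost accumulated in the current block
        self.nc_bits_prior = 0   # cost of the previous block (assumed start: 0)
        self.count = 0
        self.switch_on = False   # assumed initial state

    def add_newchar(self, bit_cost):
        """Record one NewChar; re-evaluate the switch every `block` NewChars."""
        self.nc_bits_new += bit_cost
        self.count += 1
        if self.count == self.block:
            total = self.nc_bits_new + self.nc_bits_prior
            self.switch_on = total >= self.threshold
            self.nc_bits_prior = self.nc_bits_new
            self.nc_bits_new = 0  # assumed: the current sum restarts each cycle
            self.count = 0
        return self.switch_on


trend = CompressibilityTrend()
for _ in range(96):
    trend.add_newchar(8)      # poorly compressible data: 8 bits per NewChar
print(trend.switch_on)
for _ in range(48):
    trend.add_newchar(5)      # better compressible data: 5 bits per NewChar
print(trend.switch_on)
```

Because each evaluation covers the last ninety-six NewChars but runs every forty-eight, a single unusual block influences two consecutive decisions, which smooths the switching.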
If the compressibility trending switch is on, the following are in effect:
1. Font encoding is active.
2. String mode B is active.
3. Pair encoding is active.
4. Anti-expansion mode is active.
If the compressibility trending switch is off, the following are in effect:
1. Font encoding is active.
2. String mode A is active.
3. Pair encoding is active.
4. Anti-expansion mode is inactive.
Detail of Specific Encodings
The several encodings produced by the present invention, in addition to font encoding (NewChar, Pair and String), are shown in Tables 1A through 1D below. Table 2 provides the key to the data in these tables.
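Preconditions: Old Font, Mode A; Old Font, Mode B
[Encoding columns of this table are not recoverable from the source.]
Preconditions: Old Font, Mode A; New Font, Mode A; New Font, Mode A; Old Font, Mode B; New Font, Mode B; New Font, Mode B; Mode B, Antiexpan
[Encoding columns of this table are not recoverable from the source.]
Table 1B - New Character Encodings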
Preconditions       Escapes   10 Bit Font Number
Old Font, Mode A    SEA 11    bb bbbb bbbb
New Font, Mode A    01 0      bb bbbb bbbb
Old Font, Mode B    PEB       bb bbbb bbbb
New Font, Mode B    11 0      bb bbbb bbbb
Table 1C - Pair Encodings
[Table image not reproduced in the source text.]
Table 1D - String Encodings
Key to Tables 1A through 1D
bb bbbb bbbb    A ten bit number representing the font number with which the encoded pair is associated.
[symbol not recoverable]    A single bit emitted to comprise the second of two bits which serve as a prefix to the PPP encoding.
ffff ffff    Eight bits representing a character frequency in the range 0 - 255.
hhh hhhh    Seven bits representing a character frequency in the range 128 - 255.
iii iiii    Seven bits representing a character frequency in the range 0 - 127.
oooo oooo    The eight least significant bits of the buffer (relative to the Next Buffer Store Location) displacement of the first character of a string. Used with a ZZZ encoding to identify a string position.
EEE    A Huffman pattern from the FontCode table from one to four bits in length encoding a value not equal to the Font NewChar or Font NewChar plus one and representing the font relative position of the encoded character in the Font.
FFF    A Huffman pattern from the FontCode table from one to five bits in length encoding a value not equal to the Font NewChar, Font NewChar plus one, or Font NewChar plus two and representing the font relative position of the encoded character in the Font.
GGG    A Huffman pattern from the GlobalHigh/GlobalLow table from four to thirteen bits in length: in mode A, encoding a value from 0 to 249 and representing a string length of 6 - 255 characters; in mode B, encoding a value from 0 to 243 and representing a string length of 12 - 255 characters.
LLL    A Huffman pattern from the LengthBBits table, from one to six bits in length, encoding a value from 0 to 8 and representing a string length from three to eleven characters.
NES    A Huffman pattern from the FontCode table from one to four bits in length encoding a value equal to the Font NewChar and representing a NewChar Escape.
NNN    A Huffman pattern from the GlobalHigh/GlobalLow table from four to thirteen bits in length, encoding a value from 0 to 255 and representing a character frequency.
PEB    A Huffman pattern from the FontCode table from two to four bits in length encoding a value equal to the Font NewChar plus one and representing a Pair Encoding Escape, Mode B.
PPP    The remainder of a Huffman pattern from the GlobalHigh/GlobalLow table, excepting the first two bits which are emitted separately, from two to eleven bits in length, encoding a value (in consideration of the prior two bits) from 0 to 255 and representing a character frequency.
SEA    A Huffman pattern from the FontCode table from two to four bits in length encoding a value equal to the Font NewChar plus one and representing a String Escape, Mode A.
SEB    A Huffman pattern from the FontCode table from three to five bits in length encoding a value equal to the Font NewChar plus two and representing a String Escape, Mode B.
SSS    A Huffman pattern from the GlobalHigh/GlobalLow table, excepting the first four entries (those beginning with 11), from four to thirteen bits in length, encoding a value from 4 to 255 and representing a string length of 6 - 253 characters.
ZZZ    A Huffman pattern from the ZoneCode table, from four to thirteen bits in length, encoding a value from 0 to 31 and representing the five most significant bits of the string displacement. Used with the oooo oooo, described above, to identify a string position in the history buffer.
Table 2 - Key to Encodings of Tables 1A through 1D
Selecting and Assembling Encodings
The least-cost encoding is built by selecting the encoding that has the fewest bits. If the bit costs of two encodings are the same, font encoding is chosen.
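The selection rule, together with the cost sums of steps 12 and 15, can be sketched as below. The function names are hypothetical and the per-character font costs in the usage lines are made-up illustrative values; only the string cost components (3, 4 and 5 bits plus the 8-bit offset) follow the code lengths quoted for Fig. 7C.

```python
def string_bit_cost(escape_bits, length_bits, zone_bits, offset_bits=8):
    """Step 12: String Escape + String Length + Zone Code + String Offset."""
    return escape_bits + length_bits + zone_bits + offset_bits


def pair_bit_cost(escape_bits, font_number_bits=10):
    """Step 15: Pair Encoding Escape + ten-bit font number."""
    return escape_bits + font_number_bits


def choose_encoding(candidate_cost, font_costs):
    """Return 'candidate' only if it is strictly cheaper than font encoding.

    Subtracting each font cost from the candidate total and testing for
    underflow (steps 13 and 16) is equivalent to this comparison; a tie
    goes to font encoding.
    """
    return "candidate" if candidate_cost < sum(font_costs) else "font"


cost = string_bit_cost(3, 4, 5)             # 001 + 1011 + 01001 + 8-bit offset
print(cost)
print(choose_encoding(cost, [3] * 9))       # nine characters at an assumed 3 bits each
print(choose_encoding(cost, [4] * 5))       # equal cost: font encoding is kept
print(pair_bit_cost(3))                     # average three-bit escape
```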
If string encoding is selected, there may be up to three prefix characters not included in the string (e.g., the current character to the character immediately prior to the beginning of the string). Any such prefix characters are font encoded or pair encoded and their code is transmitted ahead of the string encoding.
Anti-expansion
Because certain data streams may exhibit very little patterning, data expansion is a possible outcome of font encoding and string encoding systems. To counter this possibility, a running computation of the output bit count for Mode B minus 8 (bits per character) is maintained, i.e., for each equivalent character output, SUM = SUM + BitCost - 8. Thus a positive result indicates poor compression and a negative result indicates good compression. A switch is maintained which controls the output stream such that, when the switch is on, the eight-bit frequency is output instead of the normal font, string, or pair encoding for mode B. A command (frequency 0FEh followed by a single 1 bit) is used to signal the decoder to change state.
Table 3 indicates the action taken for each character output.
Switch On (Transparent Mode)         Switch Off (Mode B Encoding)
SUM >= 0        No Change            SUM < 0        No Change
SUM -1 to -19   No Change            SUM 0 to 19    No Change
SUM < -19       Set Switch Off       SUM > 19       Set Switch On
Table 3 - Anti-expansion Actions
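The hysteresis of Table 3 can be modelled with a short sketch. The class name `AntiExpansion`, the zero starting value of SUM, and the absence of any reset of SUM when the switch changes are assumptions; the text specifies only the running sum and the two thresholds.

```python
class AntiExpansion:
    """Running SUM of (BitCost - 8) with the thresholds of Table 3."""

    def __init__(self, threshold=19):
        self.threshold = threshold
        self.total = 0             # SUM (assumed to start at zero)
        self.transparent = False   # False = switch off = Mode B encoding

    def update(self, mode_b_bit_cost):
        """Account for one output character; return the new switch state."""
        self.total += mode_b_bit_cost - 8
        if self.transparent:
            if self.total < -self.threshold:
                self.transparent = False   # compression is paying off again
        elif self.total > self.threshold:
            self.transparent = True        # output is expanding: go transparent
        return self.transparent


ae = AntiExpansion()
print([ae.update(12) for _ in range(5)])   # +4 per character
print([ae.update(2) for _ in range(7)])    # -6 per character
```

The gap between the two thresholds keeps the switch from toggling on every small fluctuation, which matters because each change costs a command in the output stream.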
Decoder Process
Figs. 10A and 10B provide a flowchart of the decoding process. Table 4 provides the key to the flowchart of Figs. 10A and 10B.
F<n>    Fetch the next <n> bits from the input stream (where n is an integer).
D G    Decode Global. Decode a Huffman pattern which was selected and encoded from the GlobalCode encoding table.
D F    Decode Font. Decode a Huffman pattern which was selected and encoded from the appropriate Font Encoding tables (Fig. 4A).
D S    Decode Short. Decode a Huffman pattern which was selected and encoded from the GlobalCode encoding table. Same as D G (Decode Global) except that two bits have already been fetched (F2) and are in DCCode. Used for length of 'A' type strings.
D L    Decode Length. Decode a Huffman pattern which was selected and encoded from the LengthBCode encoding table. Used for length of 'B' type strings.
D Z    Decode Zone. Decode a Huffman pattern which was selected and encoded from the lowest 64 entries in the GlobalCode encoding table. Used for length of 'B' type strings.
Table 4 - Key to Decoder Flow
In the decoder flow, there are only four possible endpoints to a single decoder (and implicitly encoder) cycle. These four different endpoints are shown in Figs. 10A and 10B by an integer inside a triangle. They correspond to the four methods of encoding, shown in Tables 1A through 1C hereinabove: Font, NewChar, Pair and String.
Dual Processor Configuration
As discussed under Background Art hereinabove, the combination of higher data rates in data transmission systems, the achievement of high data compression efficiencies and the use of complex process-intensive algorithms for data compression increases the processing throughput required to perform modem control and data compression/decompression tasks. In a preferred embodiment, referring to Fig. 11, the system uses two processors connected in series between the computer (the DTE interface, 111) and the telephone line (the DCE interface, 112), each processor having its own memory. One processor, the DCE Interface Processor 113, a Zilog Z80180, performs DCE interface processes (modem control and data flow management). The other processor, the
Compression/Decompression and DTE Interface Processor 114, a Rockwell C19, performs data compression, data decompression and DTE interface processes (data interchange with the PC) . This configuration is shown for duplex operation in Fig. 11. Fig. 11 shows a data rate of 11,500 characters/second at the DTE Interface and a data rate of 1,500 characters/second at the DCE Interface. The conventional (prior art) approach using a single processor 121 to perform all functions (DTE interface, DCE interface and Compression/Decompression) is shown in Fig. 12. The single processor approach involves using a more powerful, albeit more expensive, processor. The general problem of sharing tasks among multiple processors is known to be a difficult problem in computer science. A conventional solution that might be applied to data compression modem applications is shown in Fig. 13. Fig. 13 shows a conventional two-processor configuration having a DTE/DCE Interface Processor 131 and a Compression/Decompression Processor 132. The present invention achieves the sharing of tasks by a simple but, nonetheless, unexpectedly effective configuration.
The preferred embodiment, shown in Fig. 11, achieves efficient control over all processes occurring in the system. This configuration utilizes the insight that compression and decompression and interface with the terminal all occur at a high error-free data rate, whereas modem control and the data line interface processes operate at a lower data rate and involve error detection and repeat transmission to cope with transmission errors. Accordingly, a first relatively high speed processor is used for both control of the terminal interface and for data compression and decompression; and a second processor is used for the processes involved in control of the data line interface including error detection and retransmission. Thus loading peaks occurring in either processor cannot interfere with the other.
Glossary
Encoder Current Character The most recent character from the input stream which is being processed by the encoder font system. At the end of each encoder cycle, encoder current character, "ECChar", becomes Char1Prior and the fetch and encode process continues with the next character from the input stream as encoder current character.
Escape A bit encoded sequence which serves as a signal from the encoder that the subsequent item is to be treated differently from that normally expected. Example: a font encoded sequence that signifies that a NewChar follows in the data stream.
Font One record of an array of records, each record consisting of pointers, links, characters, etc., each record having an address based on the prior two characters in the input stream, each record containing a list of historically occurring candidate characters to be matched with characters from the input stream.
Huffman Codes As used in this document, this term refers to any variable length bit representation having fewer bits corresponding to higher frequency of occurrence, including but not limited to codes created by a tree algorithm.
NewChar The occurrence of "NewChar" is a font encoding event wherein either the encoder current character, "ECChar", is not found in the selected font or there is no font in existence (and it thus contains 0 characters based on the font selection scheme) .
NewChar Symbol A dynamic value in the range of 0 through n (where n is the maximum number of characters per font) which represents the current virtual position in the font which represents "character not in table". It is used as an escape to signal GlobalNC encoding.
NewChar Escape Specifically an encoding representing the NewChar Symbol.
String Escape An escape sequence specifically to condition the decoder for reception of a string. When used in the context of a font encoding or a font decoding, a value equal to NewChar Escape + 1.
APPENDIX 1
SOURCE CODE StringA, StringB with full separation in StringTime calls
From TC90F2K2.MAC
IF2
.printx /C19 Encoder and Decoder/
ENDIF
.xlist
.C18 ; assembler, please do C18 instructions
lodr6 equ 1 ; 6.144mhz clock
.sfcond
include ITEcl9
include TCdfmOOl
printstat macro a,b,c,d
if2
.printx /a b c d/
endif
endm
.list
pagealign macro
if (tblofs and 255) ne 0
fred defl (0-tblofs) and 255
printstat <Page Align Waste =>,%fred
tblofs defl tblofs + fred
endif
endm
************ A S S E M B L Y O P T I O N S ************
OPTIONS WHICH CHANGE COMPRESSION/SPEED
AHashX2 EQU 0 ; J (0) 0 - no; 1 - yes
MaximumASearches EQU 8 ; J (8) maximum A hashes searched
MaximumBSearches EQU 16 ; J (16) maximum B hashes searched
NC8BitCycle EQU 64 ; J (64) controls A-String/B-String
TwoBytes EQU 1 ; J (1) 0 - one-byte font controls
; 1 - two-byte font controls
ZoneTestA EQU 1 ; J (1) 0 - HIGH; 1 - HIGH & LOW
ZoneTestB EQU 1 ; J (1) 0 - HIGH; 1 - HIGH & LOW
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
OPTIONS WHICH PROBABLY ARE NOT GOING TO CHANGE
Fontsize EQU 8 ; only 8,16 are supported; this
; keeps fonts on page boundaries
FontTables EQU 1024 ; may be 512-1024 provided that
; (FontTables*FontSize) MOD 256 =
IF Fontsize EQ 16
CharsPerFont EQU 13 ; otherwise need 17,18-index tables
ELSE
CharsPerFont EQU FontSize-TwoBytes-1
ENDIF
SetLength EQU 128 ; refill to SetLength*2 bytes after
; SetLength(+) bytes have been encoded
AntiEx EQU 1 ; 0 - off; 1 - on
BufferSize EQU 8192 ; size of Round Robin buffer
BufferHashes EQU 2048 ; # of Round Robin 4-byte hashes
BufferSuffix EQU 1 ; 0 - nulls; 1 - maintained
Failsafe EQU 0 ; 0 - no failsafe; 1 - output failsafe
IF Failsafe
FailSafeSets EQU 4 ; output every (n * 256) encodings
ENDIF
FTHashes EQU 2048 ; # of Round Robin 2-byte hashes
MinimumAString EQU 6 ; minimum length A string
MinimumAUpdate EQU 3 ; bytes advanced if no A string found
MinimumBString EQU 2 ; minimum length B string
MinimumBUpdate EQU 1 ; bytes advanced if no B string found
NCFreqSets EQU 4 ; uses 256*2*2*Sets bytes
NCFreqSetsHigh EQU 0 ; if used, 0 gives best result ?????
NCFreqSetsReset EQU 1 ; 0 - no; 1 - reset on B to A change
Repeats EQU 1 ; 0 - no repeat logic; 1 - repeat logic
IF FontTables GT 512
MatchMask EQU 0FC00H
NextMask EQU 7FEH ; after ASL A
ELSE
MatchMask EQU 0FE00H
NextMask EQU 3FEH ; after ASL A
ENDIF
* * * * * * * * * * * * * * * * * * * * * * * * * * * * *
DEBUG AND TEST OPTIONS
Debug EQU 1 ; set to 0 to skip statistics
DbgDum EQU Debug XOR 1
EOFControl EQU 1 ; 0 - endless data flow; 1 - file by file
Macros EQU 1 ; 0 - use subroutines; 1 - use macros
Prodder EQU 0 ; 0 - no prods; 1 - force prods
IF Prodder
ProdCycle EQU 67 ; prod every ProdCycle characters
ENDIF
Test EQU 0 ; 0 - no test code; 1 - test code
*********** OS EQUATES and ASSEMBLY OPTIONS *************
Load8250 EQU serial loader/debugger ; parallel loader/debugger
; [listing lines missing from source]
********* H O S T I N T E R F A C E M A P ***********
HOST INTERFACE MAP definition (16450 mode)
W8250_RXD equ 00020h
W8250_TXD equ 00021h
; [listing lines missing from source]
W8250_MCR equ 00024h
ln_stat equ 00030h
mdm_stat equ 00031h
HostContrl equ 00032h
******** F O N T T A B L E S T R U C T U R E ********
ENCODER / DECODER STRUCTURE MAPS
Map of 1 FONT entry
tblbgn
IF TwoBytes
tbyte NCIndex
tbyte Characters
ELSE ; Bits 7-4 = Characters
tbyte CharsNCIndex ; Bits 3-0 = NCIndex
ENDIF
tstor CharTable,CharsPerFont
tblend TestFontSize ; size of a font table
if TestFontSize NE FontSize
db 256,Font size not 16
else
printstat <Font Size =>,%FontSize
printstat <Chars per font =>,%CharsPerFont
endif
************ P A G E 1 V A R I A B L E S *************
ENCODER / DECODER RAM PAGE 1 VARIABLES ( 8H through 07fH inclusive)
Miscellaneous Variables
tblbgn RamPtr1
IF EOFControl ; !!!!! must be in 48h
tbyte HostLCR ; BBS,BBR
ENDIF
tbyte FetchPtr
; [listing lines missing from source]
IF EOFControl
tstor BytesIn,3
tstor BytesOut,3
tbyte DCStack
tbyte ECStack
tbyte OutFetch
tbyte OutStore
ENDIF
IF Prodder
tbyte ProdCounter
ENDIF
if tblofs gt 80h
db 256,Ram Window Error
else
MemoryOne equ tblofs - RamPtr1
printstat <Page 1 Window Free =>,%080h-tblofs
endif
************ P A G E 0 V A R I A B L E S *************
ENCODER / DECODER RAM PAGE 0 VARIABLES (83h through 0ffh inclusive)
tblbgn RamPtr0
Decoder Variables
tbyte DCABStatus
tbyte DCBuffer
tbyte DCCharacters
tbyte DCCharCount
tbyte DCChar1Prior
tbyte DCChar2Prior
tbyte DCCommand
tbyte DCCurrentChar
tbyte DCCurrentFreq
tword DCCurrentHash
IF Failsafe
tword DCFailSafe
ENDIF
tword DCFontBase
tbyte DCFontIndex
tword DCFTLastHash
tword DCFTNextRough
tword DCFTParent
tword DCFTChild
tword DCNCBitsNew
tword DCNCBitsPrior
tbyte DCNCCounter
tbyte DCNCIndex
tword DCRRPtr
Encoder Variables
; [listing lines missing from source]
tbyte ECABChange
IF AntiEx
tbyte ECAntiEStatus
ENDIF
; [listing lines missing from source]
if tblofs gt 100h
db 256,Page 0 Ram Error
else
MemoryZero equ tblofs - RamPtr0
printstat <Page 0 Ram Free =>,%0100h-tblofs
endif
***** 0 8 0 0 - 4 0 0 0 h M E M O R Y B L O C K *****
ENCODER / DECODER TABLES
tblbgn 0800h
tstor ECChar,256 ; 256
tstor ECCharCopy,256 ; 256
IF Repeats
tstor ECRepeats,256 ; 256
tstor ECRepeatSW,256 ; 256
ENDIF
tstor ECType,256 ; 256
tstor FTHashMatch,FontTables*2 ; 2048
tstor ECRRHashHead,BufferHashes*2 ; 4096
tstor InBuffer,256 ; 256
IF EOFControl;{
tstor OutBuffer,256 ; 256
ENDIF ;}
tstor DCGlobalHigh,4 ; 4
tstor ECGlobalHigh,4
; 7944
; [listing lines missing from source]
ENDIF ;}
IF tblofs GT 4000h
DB 256,Addr 800h - 4000h Block Error
ELSE
fred defl 4000h-tblofs
printstat <800h-4000h Block Free =>,%fred
ENDIF
;**** 4 0 0 0 - C 0 0 0 h E N C O D E R B L O C K ****
; This block, from 4000h to 0bfffh inclusive, is the 32 kbyte page
; area. Access to this block or its alter-ego is controlled by the setting of PB2.
; Must be page aligned
; Encoder Font Tables:
tblbgn 4000h
; MAX
tstor ECRRHashLink,BufferSize/2*2 ; 8192
tstor ECFontTables,FontTables*FontSize ; 8192
IF FontTables GT 512;{
tstor ECFTHashRough,1024*2 ; 2048
ELSE ;{}
tstor ECFTHashRough,512*2
ENDIF ;}
tstor ECFTHashNext,FontTables*2 ; 2048
tstor ECRRBuffer,BufferSize ; 8192
tstor ECRRSuffix,256 ; 256
tstor ECNCChar,256 ; 256
tstor ECNCFreq,256 ; 256
tstor ECNCCandF,(NCFreqSets-1)*512 ; sets ; 1536
tstor ECFontIndex,256 ; 256
tstor ECFrequency,256 ; 256
tstor ECHashRaw0,256 ; 256
tstor ECHashRaw1,256 ; 256
tstor ECHashX20,256 ; 256
tstor ECHashX21,256 ; 256
tstor ECNewIndex,256 ; 256
;32768
IF tblofs GT 0C000h
DB 256,Addr 4000h-C000h Block Error
ELSE
fred defl 0C000h-tblofs
printstat <Encoder Main Ram Free =>,%fred
ENDIF
**** 4 0 0 0 - C 0 0 0 h D E C O D E R B L O C K ****
Decoder Font Tables:
tblbgn 4000h
; MAX
tstor ECRRHashTest,BufferSize/2*2 ; 8192
tstor DCFontTables,FontTables*FontSize ; 8192
IF FontTables GT 512;{
tstor DCFTHashRough,1024*2 ; 2048
; [listing lines missing from source]
tstor DCNCChar,256 ; 256
tstor DCNCFreq,256 ; 256
tstor DCNCCandF,(NCFreqSets-1)*512 ; sets ; 1536
;32768
IF tblofs GT 0C000h
DB 256,Addr 4000h-C000h Block Error
ELSE
fred defl 0C000h-tblofs
printstat <Decoder Main Ram Free =>,%fred
ENDIF
************* B A S E O F P R O G R A M *************
BASE OF PROGRAM
cb equ $
***** E N C / D E C I N T E R F A C E S U B S ** ****
IF EOFControl XOR 1;{ all code in this section is active only in modem operation
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
READ FROM PC VIA INTERRUPT
HostInt:
RTI
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
ENCODER SENDS PROD/COMMAND TO REMOTE
SendProdCommand:
STI #0A0h,ECCommand ; Prod is 10
STI #0C8h,ECCommand ; Command is 11nn
JSR ECProdCommand
etc.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
DECODER PROCESSES COMMAND FROM REMOTE
ProcessCommand: ;
; process command and then
; return to DCFontParams
STI #000h,DCCommand
JMP DCFontParams
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
DO WHATEVER IS REQUIRED WHEN DECODER FAILS
FailSafeFailed:
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
ENCODER READ FROM PC
ECReadCharacter:
RTS
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
ENCODER WRITE TO PC
ECWriteCharacter:
RTS
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
DECODER READ FROM PC
DCReadCharacter:
RTS
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
DECODER WRITE TO PC
DCWriteCharacter:
RTS
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
ENDIF ;} end of modem enc/dec routines
*********** I N I T I A L I Z E M E M O R Y ***********
Initialize Working Storage
Initialize:
LDX #MemoryZero
LDA #0
ClearMemory0:
STA RamPtr0-1,X
DEX
BNE ClearMemory0
LDX #MemoryOne
DEX
ClearMemory1:
STA RamPtr1,X ; leave HostLCR unreset
DEX
BNE ClearMemory1
;
; CLEAR 800h-3FFFh
STI #008h,DCWord1+1
LDY #038h
JSR BlockReset
; CLEAR 4000h-BFFFh - DECODER
DecBankSelect
STI #040h,DCWord1+1
LDY #080h
JSR BlockReset
STI #HIGH(DCNCChar),DCWord1+1
JSR NCCharFreqReset
IF Failsafe
STI #000h, DCFailSafe+0
STI #FailSafeSets, DCFailSafe+1
ENDIF
STI #001h, DCABStatus
STI #HIGH(DCFontTables), DCFontBase+1
IF FontSize EQ 8
STI #008h, DCFontBase+0
ELSE
STI #010h, DCFontBase+0
ENDIF
LDA #HIGH(DCFTHashRough)
STA DCFTNextRough+1
STA DCFTParent+1
LDA #HIGH(DCFTHashNext)
STA DCFTHashRough+1
STA DCFTLastHash+1
LDA #002h
STA DCFTHashRough+0
STA DCCurrentHash+0
; [listing lines missing from source]
STI #HIGH(DCRRBuffer), DCRRPtr+1
STI #HIGH(NC8BitCycle*8), DCNCBitsPrior+1
STI #LOW(NC8BitCycle*8), DCNCBitsPrior+0
STI #NC8BitCycle, DCNCCounter
; CLEAR 4000h-BFFFh - ENCODER
EncBankSelect
STI #040h,DCWord1+1
LDY #080h
JSR BlockReset
STI #HIGH(ECNCChar),DCWord1+1
JSR NCCharFreqReset
IF Prodder
STI #ProdCycle, ProdCounter
ENDIF
IF Failsafe
STI #000h, ECFailSafe+0
STI #FailSafeSets, ECFailSafe+1
ENDIF
STI #001h, ECABStatus
IF AntiEx
STI #001h, ECAntiEStatus
ENDIF
LDA #HIGH(ECFTHashRough)
STA ECFTNextRough+1
STA ECFTParent+1
STI #HIGH(ECFTHashNext), ECFTLastHash+1
STI #002h, ECFTLastHash+0
LDA #HIGH(ECFTHashNext)
STA ECFTHashRough+1
LDA #002h
STA ECFTHashRough+0
STA ECCurrentHash+0
STA ECHashX20+255
LDA #080h
STA ECCurrentHash+1
STA ECHashX21+255
STI #001h, ECBuffer
STI #HIGH(ECRRBuffer), ECRRPtr+1
STI #HIGH(NC8BitCycle*8), ECNCBitsPrior+1
STI #LOW(NC8BitCycle*8), ECNCBitsPrior+0
STI #NC8BitCycle, ECNCCounter
RTS
BlockReset:
STI #000h,DCWord1+0
LDA #000
BRLoop:
STA (DCWord1)
INC DCWord1+0
BNE BRLoop
DEY
BEQ BRExit
INC DCWord1+1
!JMP BRLoop
BRExit:
RTS
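BlockReset zeroes Y consecutive 256-byte pages, starting at the page whose number was stored in DCWord1+1. A minimal Python sketch of that loop, assuming a flat `bytearray` as a stand-in for the banked RAM (the names `block_reset`, `ram`, `start_page` and `page_count` are illustrative and do not appear in the listing):

```python
def block_reset(memory, start_page, page_count):
    """Zero page_count 256-byte pages beginning at address start_page * 256."""
    start = start_page << 8
    for addr in range(start, start + (page_count << 8)):
        memory[addr] = 0
    return memory


# Stand-in RAM filled with a marker value so cleared bytes are visible.
ram = bytearray([0xAA]) * 0x4000
block_reset(ram, 0x08, 0x38)  # mirrors the "CLEAR 800h-3FFFh" call in Initialize
```

With a start page of 008h and a count of 038h the cleared range is 0800h through 3FFFh, which agrees with the comment in Initialize.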
NCCharFreqReset:
LDA #NCFreqSets
STA DCByte1
STI #000h,DCWord1+0
NCCFRLoop0:
LDX #000h
NCCFRLoop1:
LDA Best128,X
STA (DCWord1)
INC DCWord1+0
INX
BPL NCCFRLoop1
NCCFRLoop2:
TXA
STA (DCWord1)
INC DCWord1+0
INX
BMI NCCFRLoop2
LDA DCWord1+1
ADD #001h
STA DCWord2+1
STI #000h,DCWord2+0
LDY #000h
NCCFRLoop3:
LDA (DCWord1)
TAX
TYA
STA (DCWord2),X
INC DCWord1+0
INY
BNE NCCFRLoop3
DEC DCByte1
BEQ NCCFRExit
INC DCWord1+1
INC DCWord1+1
!JMP NCCFRLoop0
NCCFRExit:
RTS
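NCCharFreqReset fills each 512-byte set with two 256-entry tables: NCChar (rank to character, seeded from Best128 for ranks 0-127 and the identity for 128-255) and, one page higher, NCFreq (character to rank). The sketch below builds one such pair in Python. It assumes Best128 is a permutation of the codes 0-127, which the listing does not state; the reversed ordering used here is a placeholder, not the real table.

```python
def nc_char_freq_reset(best128):
    """Return (nc_char, nc_freq): rank -> character, and its inverse."""
    nc_char = list(best128[:128]) + list(range(128, 256))
    nc_freq = [0] * 256
    for rank, ch in enumerate(nc_char):
        nc_freq[ch] = rank  # same store as STA (DCWord2),X with X = character
    return nc_char, nc_freq


# Placeholder ordering (assumption): the actual Best128 contents are not shown.
best128 = list(reversed(range(128)))
nc_char, nc_freq = nc_char_freq_reset(best128)
```

Because the two tables are inverses, the encoder can look up a character's rank and the decoder can map a rank back to a character with a single indexed load each.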
************** M A I N D A T A F L O W **************
StrtUp:
LDX #0FFh ; set stack pointer
TXS
mask_gen <bcr_fast_es2,bcr_fast_es1>
STI #mask,bcr ; set C18/C19 to fast execution
mask_gen <cir_fast_es3>
STI #mask,clint
STI #007h,HostContrl ; enable 16450 mode + interrupts
STI #05Fh,ln_stat ; 8250.THRE = 1
ResetMemory:
BBS 4,HostLCR,ResetHostLCR
JSR Initialize
ResetHostLCR:
STI #080h,HostLCR ; set bit 7
LCRLoop:
STI #000h,FetchPtr
STI #000h,StorePtr
BBR 7,HostLCR,SetBreak ; reset if host wrote LCR
!JMP LCRLoop
SetBreak:
LDA w8250_LCR ; save host command info
STA HostLCR
BBR 5,HostLCR,SetBreakCont
BBS 6,HostLCR,SetBreakCont
BBR 2,HostLCR,SetBreakCont
STI #0F6h,HostContrl ; no ints during Memory Load
SetBreakCont:
BBS 4,HostLCR,SetBreakNoReset
JSR Initialize ; HostLCR(4) = 0 reset memory
SetBreakNoReset:
BBS 6,ln_stat,SetBreakTSRE
STI #02Fh,ln_stat ; set 4, leave 6 at 0
!JMP WhichProcess
SetBreakTSRE:
STI #06Fh,ln_stat ; set 4, leave 6 at 1
WhichProcess:
BBS 6,HostLCR,LoopBack
BBS 5,HostLCR,DumpLoadMemory
BBR 2,HostLCR,ECStart ; HostLCR(2) = 0 Encoder
DCStart: ; = 1 Decoder
DecBankSelect
JMP DCFontParams
ECStart:
EncBankSelect
JMP ECRefill
DCOrECEOF:
LDX #0FFh ; reset primary stack
TXS
BBS 6,ln_stat,EOFTSRE
STI #02Fh,ln_stat ; set 4, leave 6 at 0
!JMP EOFStats
EOFTSRE:
STI #06Fh,ln_stat ; set 4, leave 6 at 1
EOFStats:
BBS 3,HostLCR,EOFAcked ; set if host set LCR bit 3
!JMP EOFStats
EOFAcked:
LDA BytesIn+0
JSR SubWriteToPC
LDA BytesIn+1
JSR SubWriteToPC
LDA BytesIn+2
JSR SubWriteToPC
LDA #000h
JSR SubWriteToPC
LDA BytesOut+0
JSR SubWriteToPC
LDA BytesOut+1
JSR SubWriteToPC
LDA BytesOut+2
JSR SubWriteToPC
LDA #000h
JSR SubWriteToPC
LDA #000h
STA BytesIn+0
STA BytesIn+1
STA BytesIn+2
STA BytesOut+0
STA BytesOut+1
STA BytesOut+2
JMP ResetMemory
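At EOFAcked the firmware sends the 24-bit BytesIn counter low byte first, followed by a zero byte, and then BytesOut in the same form. Read that way, the host receives two little-endian 32-bit values. The Python sketch below shows how a host program might pack and unpack such a report; `eof_report` and `parse_eof_report` are hypothetical helper names, and the host-side handling is an assumption, since the listing shows only the device side.

```python
import struct


def eof_report(bytes_in, bytes_out):
    """Pack two 24-bit counters as the device sends them: 3 bytes + a zero, twice."""
    return struct.pack("<II", bytes_in & 0xFFFFFF, bytes_out & 0xFFFFFF)


def parse_eof_report(report):
    """Recover (bytes_in, bytes_out) from the 8-byte report."""
    return struct.unpack("<II", report)


report = eof_report(0x025050, 0x00016B)
```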
*********** p c L O O P B A C K C O D E ************
IF EOFControl ;{ all code in this Section is
; active only in loopback operation
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
DumpLoadMemory:
BBS 2,HostLCR,LoadMemory
DumpMemory:
JSR MemoryDump
!JMP DCOrECEOF
LoadMemory:
JSR MemoryLoad
!JMP DCOrECEOF
LoopBack:
BBR 5,HostLCR,LoopBackNoDump
JSR MemoryDump
BBS 6,ln_stat,LoopBackTSRE
STI #02Fh,ln_stat ; set 4, leave 6 at 0
!JMP LoopBackWait
LoopBackTSRE:
STI #06Fh,ln_stat ; set 4, leave 6 at 1
LoopBackWait:
BBS 3,HostLCR,LoopBackAcked ; set if host set LCR bit 3
!JMP LoopBackWait
LoopBackAcked:
LDA HostLCR
AND #0F7h
STA HostLCR
LoopBackNoDump:
LDX #07Fh
TXS
LDA #HIGH(ECRefill)
PHA
LDA #LOW(ECRefill)
PHA
PSH
TSX
STX ECStack
LDX #0FFh
TXS
DecBankSelect
JMP DCFontParams
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
SwitchToDecode:
PSH
TSX
STX ECStack
LDX DCStack
TXS
DecBankSelect
PUL
RTS
;
SwitchToEncode:
PSH
TSX
STX DCStack
LDX ECStack
TXS
EncBankSelect
PUL
RTS
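In loopback operation the encoder and decoder behave as two coroutines: each keeps its own stack pointer (ECStack, DCStack), and SwitchToDecode / SwitchToEncode save one context and resume the other. The Python sketch below is an analogy rather than a translation. It uses a generator for the encoder side and a plain loop for the decoder side, with a `deque` standing in for OutBuffer; the `capacity` parameter and all function names are assumptions made for illustration.

```python
from collections import deque


def encoder(source, pipe, capacity):
    """Produce bytes into pipe; yield control whenever the pipe fills."""
    for byte in source:
        pipe.append(byte)
        if len(pipe) >= capacity:
            yield  # analogous to JSR SwitchToDecode


def decoder(pipe, sink, enc):
    """Consume bytes from pipe; resume the encoder whenever the pipe is empty."""
    while True:
        if pipe:
            sink.append(pipe.popleft())
            continue
        try:
            next(enc)  # analogous to JSR SwitchToEncode
        except StopIteration:
            if not pipe:
                return  # encoder finished and nothing is left to drain


def run_loopback(data, capacity=4):
    pipe, sink = deque(), []
    decoder(pipe, sink, encoder(data, pipe, capacity))
    return bytes(sink)
```

The listing is symmetric (either side may switch to the other), whereas this sketch lets only the decoder drive; that simplification keeps the example short.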
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
MISC READ FROM PC; USED ONLY FOR MEMORY LOAD; INTS ARE OFF
SubReadFromPC:
LDA HostContrl
BPL SubReadFromPC
LDA W8250_TXD
STI #076h,HostContrl
STI #01Fh,ln_stat
RTS
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
MISC WRITE TO PC
SubWriteToPC:
BBS 0,ln_stat,SubWriteToPC
BBS 0,ln_stat,SubWriteToPC ; twice for SPERRY et al
STA W8250_RXD
BBS 6,ln_stat,SubWritePCTSRE
STI #03Eh,ln_stat ; set 0, leave 6 at 0
!JMP SubWritePCCont
SubWritePCTSRE:
STI #07Eh,ln_stat ; set 0, leave 6 at 1
SubWritePCCont:
RTS
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
MemoryDump:
LDX #000h
LDY #((RamPtr1+1)-000h)
MDLoop1FF:
TXA
JSR SubWriteToPC
INX
DEY
BNE MDLoop1FF
LDY #(080h-(RamPtr1+1)) ; X = #RamPtr1+1
MDLoop1:
LDA PortA,X
JSR SubWriteToPC
INX
DEY
BNE MDLoop1
LDY #(RamPtr0-080h)
MDLoop2FF:
TXA
JSR SubWriteToPC
INX
DEY
BNE MDLoop2FF
LDY #(100h-RamPtr0) ; X = #RamPtr0
MDLoop2:
; [figure imgf000080_0001: this part of the listing is reproduced only as an image in the source]
; X = #RamPtr1+1
; [figures imgf000081_0001 and imgf000082_0001: reproduced only as images in the source]
MLLoop4:
JSR SubReadFromPC
STA (ECWord1)
INC ECWord1+0
ifEQ
INC ECWord1+1
LDA ECWord1+1
CMP #0C0h
BEQ MLLoop4Exit
fi
!JMP MLLoop4
MLLoop4Exit:
DecBankSelect
STI #040h,ECWord1+1
STI #000h,ECWord1+0
MLLoop8:
JSR SubReadFromPC
; [figure imgf000083_0001: this part of the listing is reproduced only as an image in the source]
MLLoop8Exit:
LDA MDSave+0
STA ECWord1+0
LDA MDSave+1
STA ECWord1+1
STI #0F7h,HostContrl
RTS
MDSave:
ORG $+2
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
READ FROM PC VIA INTERRUPT
HostInt:
PSH
LDA HostContrl
ifMI
LDX StorePtr
LDA W8250_TXD
STA InBuffer,X
INX
STX StorePtr
INX
CPX FetchPtr
ifNE
STI #01Fh,ln_stat
fi
fi
BBR 5,HostContrl,HostInt1
LDA w8250_LCR ; save host command info
STA HostLCR
HostInt1:
STI #007h,HostContrl
PUL
RTI
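HostInt stores each received byte at InBuffer[StorePtr], advances StorePtr with 8-bit wraparound, and rewrites ln_stat only while at least one further slot remains before FetchPtr. The Python sketch below models that as a 256-entry ring with a `host_may_send` flag. Treating the ln_stat write as a "host may send" signal is an interpretation on my part, not something the listing states, and the class and method names are illustrative.

```python
class InputRing:
    SIZE = 256  # 8-bit indices wrap naturally on the target CPU

    def __init__(self):
        self.buf = bytearray(self.SIZE)
        self.store = 0  # StorePtr
        self.fetch = 0  # FetchPtr
        self.host_may_send = True

    def on_receive(self, byte):
        """Interrupt side: queue one byte and update the flow-control flag."""
        self.buf[self.store] = byte
        self.store = (self.store + 1) % self.SIZE
        # Signal readiness only if the slot after the next one is not the reader.
        self.host_may_send = ((self.store + 1) % self.SIZE) != self.fetch

    def read(self):
        """Foreground side: return the next byte, or None if the ring is empty."""
        if self.fetch == self.store:
            return None
        byte = self.buf[self.fetch]
        self.fetch = (self.fetch + 1) % self.SIZE
        self.host_may_send = True
        return byte
```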
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
ENCODER READ FROM PC
ECReadCharacter:
IF Test ;{
LDA BytesIn+0
CMP #050h
BNE ECReadChar
LDA BytesIn+1
CMP #002h
BNE ECReadChar
LDA BytesIn+2
CMP #000h
BNE ECReadChar
NOP ; set breakpoint here
ENDIF ;}
ECReadChar:
LDX FetchPtr
CPX StorePtr
ifEQ
BBR 1,HostLCR,ECReadChar ; HostLCR is read in interrupt
SMB 7,HostLCR
!JMP ECReadCharExit ; bit 1 set when EOF and ptrs =
; flush input when Decoder
; has FailedSafe
; [figure imgf000085_0001: this part of the listing is reproduced only as an image in the source]
; A = char from PC
ECReadCharExit:
RTS
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
ENCODER WRITE TO PC
ECWriteCharacter:
PSH
IF Test ;{
LDA BytesOut+0
CMP #06Bh
BNE ECWriteSearched
LDA BytesOut+1
CMP #001h
BNE ECWriteSearched
LDA BytesOut+2
CMP #000h
BNE ECWriteSearched
NOP ; set breakpoint here
ECWriteSearched:
ENDIF ;}
BBR 6,HostLCR,ECWriteChar
LDX OutStore
LDA ECBuffer
STA OutBuffer,X
INX
STX OutStore
INX
CPX OutFetch
ifEQ
JSR SwitchToDecode
fi
!JMP ECWriteCont
ECWriteChar:
BBS 0,HostLCR,ECWriteCont
BBS 0,ln_stat,ECWriteChar
BBS 0,ln_stat,ECWriteChar ; twice for SPERRY et al
LDA ECBuffer
STA W8250_RXD
BBS 6,ln_stat,ECWriteTSRE
STI #03Eh,ln_stat ; set 0, leave 6 at 0
!JMP ECWriteCont
ECWriteTSRE:
STI #07Eh,ln_stat ; set 0, leave 6 at 1
ECWriteCont:
PUL
STI #001h,ECBuffer
INC BytesOut+0
ifEQ
INC BytesOut+1
ifEQ
INC BytesOut+2
fi
fi
RTS
* * * * * * * * * * * * * * * * * * * * * * * * * * * * *
DECODER READ FROM PC
DCReadCharacter:
PSH
IF Test ;{
LDA BytesIn+0
CMP #0A0h
BNE DCReadChar
LDA BytesIn+1
CMP #005h
BNE DCReadChar
LDA BytesIn+2
CMP #002h
BNE DCReadChar
NOP ; set breakpoint here
ENDIF ;}
DCReadChar:
BBR 6,HostLCR,DCReadCharNLB
LDX OutFetch
CPX OutStore
ifEQ
JSR SwitchToEncode
!JMP DCReadChar
fi
INC OutFetch
LDA OutBuffer,X ; A = char from Encoder
STA DCBuffer
!JMP DCReadCharExit
DCReadCharNLB:
LDX FetchPtr
CPX StorePtr
ifEQ
BBR 1,HostLCR,DCReadChar ; HostLCR is read in interrupt
SMB 7,HostLCR ; NOTE: not a normal EOF
!JMP DCReadCharExit ; bit 1 set when EOF and ptrs =
fi
BBS 5,ln_stat,DCReadCharLS
STI #01Fh,ln_stat
DCReadCharLS:
INC FetchPtr
INC BytesIn+0
ifEQ
INC BytesIn+1
ifEQ
INC BytesIn+2
fi
fi
LDA ECCommand ; flush input when Decoder
BMI DCReadChar ; has FailedSafe
LDA InBuffer,X ; A = char from PC
STA DCBuffer
DCReadCharExit:
PUL
RTS
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
DECODER WRITE TO PC
DCWriteCharacter:
IF Test ;{
PHA
LDA BytesOut+0
CMP #027h
BNE DCWriteSearched
LDA BytesOut+1
CMP #017h
BNE DCWriteSearched
LDA BytesOut+2
CMP #000h
BNE DCWriteSearched
NOP ; set breakpoint here
DCWriteSearched:
PLA
ENDIF ;}
DCWriteChar:
BBS 0,HostLCR,DCWriteCont
BBS 0,ln_stat,DCWriteChar
BBS 0,ln_stat,DCWriteChar ; twice for SPERRY et al
STA W8250_RXD
BBS 6,ln_stat,DCWriteTSRE
STI #03Eh,ln_stat ; set 0, leave 6 at 0
!JMP DCWriteCont
DCWriteTSRE:
STI #07Eh,ln_stat ; set 0, leave 6 at 1
DCWriteCont:
STI #001h,ECBuffer
BBS 6,HostLCR,DCWriteCharNLB
INC BytesOut+0
ifEQ
INC BytesOut+1
ifEQ
INC BytesOut+2
fi
fi
DCWriteCharNLB:
RTS
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
ENDIF ;} end of loopback enc/dec routines
************* E N C O D E R M A C R O S **************
Write17 MACRO
IF Macros
MSWrite17
ELSE
JSR MSWrite17
ENDIF
ENDM
Write8 MACRO
IF Macros
MSWrite8
ELSE
JSR MSWrite8
ENDIF
ENDM
Write817 MACRO
IF Macros
MSWrite817
ELSE
JSR MSWrite817
ENDIF
ENDM
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
WRITE 1-7 BITS PER GUARD BIT POSITION
IF Macros
MSWrite17 MACRO
LOCAL Write17Loop,Write17Exit
ELSE
MSWrite17:
ENDIF
Write17Loop:
ASL A
BEQ Write17Exit
ROL ECBuffer
BCC Write17Loop
JSR ECWriteCharacter
!JMP Write17Loop
Write17Exit:
IF Macros
ENDM
ELSE
RTS
ENDIF
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
WRITE ONE BYTE OF BITS
IF Macros
MSWrite8 MACRO
LOCAL Write8Loop,Write8Skip,Write8Exit
ELSE
MSWrite8:
ENDIF
ASL A
ORA #001h
!JMP Write8Skip
Write8Loop:
ASL A
BEQ Write8Exit
Write8Skip:
ROL ECBuffer
BCC Write8Loop
JSR ECWriteCharacter
!JMP Write8Loop
Write8Exit:
IF Macros
ENDM
ELSE
RTS
ENDIF
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
WRITE ONE+ BYTE(S) OF BITS
IF Macros
MSWrite817 MACRO
LOCAL Write817Loop1,Write817Skip,Write817Exit1,Write817Loop2,Write817Exit2
ELSE
MSWrite817:
ENDIF
ASL A
ORA #001h
!JMP Write817Skip
Write817Loop1:
ASL A
BEQ Write817Exit1
Write817Skip:
ROL ECBuffer
BCC Write817Loop1
JSR ECWriteCharacter
!JMP Write817Loop1
Write817Exit1:
TXA
Write817Loop2:
ASL A
BEQ Write817Exit2
ROL ECBuffer
BCC Write817Loop2
JSR ECWriteCharacter
!JMP Write817Loop2
Write817Exit2:
IF Macros
ENDM
ELSE
RTS
ENDIF
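The three write routines share one device: a guard bit. A code of 1 to 7 bits is held left-justified in A and followed by a single 1 bit, so the shift loop stops as soon as A becomes zero. ECBuffer in turn starts at 001h, so the carry out of ROL signals that eight data bits have accumulated. The Python sketch below mirrors that logic. `GuardBitWriter` and its attributes are illustrative names, the byte counters and host handshake of ECWriteCharacter are omitted, and `flush` follows the zero-padding loop at ECProdShiftLoop.

```python
class GuardBitWriter:
    def __init__(self):
        self.buffer = 0x01  # ECBuffer: guard bit only, no data bits yet
        self.out = bytearray()

    def _push_bit(self, bit):
        self.buffer = (self.buffer << 1) | bit
        if self.buffer & 0x100:  # guard bit shifted out: eight data bits collected
            self.out.append(self.buffer & 0xFF)
            self.buffer = 0x01

    def write17(self, a):
        """Write 1-7 bits: a = data bits left-justified, then a 1 guard bit, then zeros."""
        a &= 0xFF
        while True:
            bit = (a >> 7) & 1
            a = (a << 1) & 0xFF
            if a == 0:  # the bit just shifted out was the guard bit
                return
            self._push_bit(bit)

    def write8(self, a):
        """Write all eight bits of a, most significant first."""
        for shift in range(7, -1, -1):
            self._push_bit((a >> shift) & 1)

    def write817(self, a, x):
        """Write eight bits from a, then a guard-terminated code from x."""
        self.write8(a)
        self.write17(x)

    def flush(self):
        """Pad the partial byte with zero bits (write17(0x40) emits a single 0)."""
        while self.buffer != 0x01:
            self.write17(0x40)


w = GuardBitWriter()
w.write817(0xBF, 0xC0)  # the 9-bit pattern 101111111 used at ECProdBNF
w.flush()
```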
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
SET POINTER TO NCChar,NCFreq TABLES
SetCharFreq MACRO ED,BW ; NCFreqSets =
; [figure imgf000094_0001: this part of the listing is reproduced only as an image in the source]
; HIGH(base of NCChar)
ADD #001h
STA ED&Word2+1 ; Word2+1 = HIGH(base of NCFreq)
STI #000h,ED&Word1+0
STI #000h,ED&Word2+0
ENDIF
IF "&BW" EQ "W1"
ASL A ; HIGH((0-3)*512)
ADD #HIGH(ED&NCChar)
STA ED&Word1+1 ; Word1+1 = HIGH(base of NCChar)
ENDIF
IF "&BW" EQ "W2"
ASL A ; HIGH((0-n)*512)
ADD #(HIGH(ED&NCChar)+1)
STA ED&Word2+1 ; Word2+1 = HIGH(base of NCFreq)
ENDIF
ENDM
;********** F O N T U P D A T E M A C R O **********
FONT UPDATE MACRO
IF EC, Y = ECNextChar
FontUpdate MACRO XX,YY
LOCAL Font1stUse,FontActive,ECNC8Bit,ECNCGlobal,ECNCCoded
LDA XX&CurrentHash+1 ; HIGH of prior hash
IF "&XX" EQ "DC" ;{
BPL FontActive
JMP DCNewCharacter
ELSE ;{}
STA ECFontBase+1 ; (stored as * 2)
LDA ECCurrentHash+0 ; LOW of prior hash
ASL A
ROL ECFontBase+1 ; now = * 4
ASL A
ROL ECFontBase+1 ; now = * 8
IF FontSize EQ 16
ASL A
ROL ECFontBase+1 ; now = * 16
ENDIF
STA ECFontBase+0
TAX ; save for PHX
LDA ECCurrentHash+1 ; HIGH of prior hash
BPL FontActive
Font1stUse:
IF "&YY" EQ "FU" ;{
LDA ECFontBase+1
ADD #HIGH(ECFontTables)
STA ECFontBase+1
LDA #000h
STA ECCharacters
STA ECNCIndex
ENDIF ;}
LDA #000h
STA ECNewIndex,Y ; zero when new font
IF "&YY" EQ "FU" ;{
JMP ECNewCharacter ; Characters = 0
ELSE ;{}
JMP ECNewCharCommand
ENDIF ;}
ENDIF ;}
FontActive:
IF "&XX" EQ "DC" ;{
LDA DCFontIndex
ifPL ;{
LDA DCFontIndex
STI #080h,DCFontIndex
CMP DCNCIndex
BCC DCKCharLTNCIndex
BEQ DCKCharEQNCIndex
DCKCharGTNCIndex:
!JMP DCCharSwap
DCKCharEQNCIndex:
INC DCNCIndex ; bump NCIndex
JMP DCFontUpdated
DCKCharLTNCIndex:
AND #0FFh
BNE DCCharSwap
JMP DCFontUpdated
els ;{}
LDA DCFontBase+1
LDX DCFontBase+0
PHA ; push address back on stack
PHX ; and pull to I
PLI
LAN ; NCIndex or CharsNCIndex
IF TwoBytes ;{
LAN ; Characters
ENDIF ;}
fi ;}
ELSE ;{}
LDA ECFontBase+1
ADD #HIGH(ECFontTables)
STA ECFontBase+1
PHA ; push address back on stack
PHX ; and pull to I
PLI
LAN ; NCIndex or CharsNCIndex
IF TwoBytes ;{
; Characters
; W = CharsNCIndex
; NCIndex in bits 3-0
; Characters in bits 7-4
; +1 if 8-bit active
; [figure imgf000097_0001: this part of the listing is reproduced only as an image in the source]
; = # of font indices (base 0)
LDA EncodingTable,X ; A = FontCode(Bits) offset
STA ECByte1 ; Byte1 = EncodingTable base ptr
ADD ECNCIndex ; Byte1 + NCIndex
STA ECNewIndex,Y
; I = ptr to 1st
; X = # of characters to
; W = A = character table index, base 0
; [figure imgf000098_0001: this part of the listing is reproduced only as an image in the source]
; (= character encoding index
BCC XX&CharLTNCIndex ; if < NCIndex)
BEQ XX&CharEQNCIndex
XX&CharGTNCIndex:
IF "&XX" EQ "EC"
ADD XX&ABStatus ; 0 or 1
ADD #002h ; A = character encoding index
ADD XX&Byte1 ; + character table index
STA ECFontIndex,Y ; W = A = character table index
TWA ; for table swap
ENDIF
!JMP XX&CharSwap
XX&CharEQNCIndex:
IF "&XX" EQ "EC"
ADD XX&ABStatus ; 0 or 1
ADD #002h ; A = character encoding index
ADD XX&Byte1 ; + character table index
STA ECFontIndex,Y
ENDIF
INC XX&NCIndex ; bump NCIndex
!JMP XX&FontEncoding
XX&CharLTNCIndex:
IF "&XX" EQ "EC"
ADD XX&Byte1 ; A = character encoding index
STA ECFontIndex,Y ; + character table index
TWA ; W = character table index
ELSE
AND #0FFh
ENDIF
BEQ XX&FontEncoding ; no swap if already index 0
XX&CharSwap: ; A = character index, base 0
ADD #(CharTable-1) ; ptr to previous character
TAX
LDA (XX&FontBase),X
INX
STA (XX&FontBase),X
LDA XX&CurrentChar
DEX
STA (XX&FontBase),X
XX&FontEncoding:
IF "&XX" EQ "DC"
JMP DCFontUpdated
ELSE
LDA #002h
STA ECType,Y ; Type 2 - normal font encoding
BBS 0,ECABStatus,ECFontSaveEight
JMP ECFontUpdated
ECFontSaveEight:
SetCharFreq XX,W2 ; output is NC frequency value
LDA ECCurrentChar ; save for consistent strings
STA ECWord2+0 ; off* output write code
LDA (ECWord2)
STA ECFrequency,Y
JMP ECFontUpdated
ENDIF
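The XX&CharSwap step promotes a character that was found at a non-zero index in the font's character table by exchanging it with its immediate predecessor, so frequently seen characters drift toward index 0 one position at a time. The Python sketch below shows only that exchange. It uses a plain list as a stand-in for the table at (FontBase)+CharTable and does not model NCIndex or the encoding indices; `promote` is an illustrative name.

```python
def promote(font_chars, index, current_char):
    """Swap the character at index with its predecessor; return its new index."""
    if index == 0:
        return 0  # "no swap if already index 0"
    font_chars[index] = font_chars[index - 1]
    font_chars[index - 1] = current_char
    return index - 1


chars = list(b"etaoin")
new_index = promote(chars, 3, ord("o"))
```

A single-step exchange adapts more slowly than moving the character straight to the front, but it is less disturbed by a one-off occurrence of a rare character.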
XX&NewCharacter:
LDA XX&NCIndex
ifNE
DEC XX&NCIndex
fi
SetCharFreq XX,WB
LDX XX&CurrentChar
LDA (XX&Word2),X ; frequency of current character
STA XX&CurrentFreq
BEQ XX&NewCharOK
LDX XX&Byte3 ; 0-n where n = 0, 3, 7 or 15
LDA XX&GlobalHigh,X
CMP XX&CurrentFreq
BCS XX&NewCharSwap ; CurrentFreq <= GlobalHigh
XX&NewCharExchange:
INC XX&GlobalHigh,X
TAX ; X = high frequency
LDA (XX&Word1),X
TAW ; W = high character
LDA XX&CurrentChar
STA (XX&Word1),X ; current char -> high char
TXA
LDX XX&CurrentChar
STA (XX&Word2),X ; high freq -> char freq
LDX XX&CurrentFreq
TWA
STA (XX&Word1),X ; high char -> current char
TAX
LDA XX&CurrentFreq
STA (XX&Word2),X ; current freq -> high freq
!JMP XX&NewCharOK
XX&NewCharSwap:
LDX XX&CurrentFreq
DEX ; X = lower freq
LDA (XX&Word1),X
TAW ; W = lower char
LDA XX&CurrentChar
STA (XX&Word1),X ; current char -> lower char
TXA
LDX XX&CurrentChar
STA (XX&Word2),X ; lower freq -> char freq
LDX XX&CurrentFreq
TWA
STA (XX&Word1),X ; lower char -> current char
TAX
LDA XX&CurrentFreq
STA (XX&Word2),X ; current freq -> lower freq
XX&NewCharOK:
LDA XX&Characters
CMP #CharsPerFont
BEQ XX&NewCharOverflow ; check for font table full
INC XX&Characters ; if not, add to char count
ADD #001h
XX&NewCharOverflow:
ADD #(CharTable-1)
TAX
LDA XX&CurrentChar ; store current char in font
STA (XX&FontBase),X
LDX XX&CurrentFreq ; X = ECCurrentFreq
LDA GlobalBits,X
ADD XX&NCBitsNew+0
STA XX&NCBitsNew+0 ; update NC trending total
ifCS
INC XX&NCBitsNew+1
fi
ELSE ;{}
ECNewCharCommand:
LDX #0FFh ; X = ECCurrentFreq for command
ENDIF ;}
IF "&XX" EQ "EC"
BBR 0,ECABStatus,ECNCGlobal
ECNC8Bit:
TXA ; output is NC frequency value
STA ECFrequency,Y ; save for consistent strings
!JMP ECNCCoded ; off* output write code
ECNCGlobal:
TXA ; global index for write
STA ECFontIndex,Y
ECNCCoded: ; prod/commands do not affect
LDA #004h ; any of the trending totals,
STA ECType,Y ; tables or hashes
ENDIF
IF "&YY" EQ "FU" ;{
XX&NCTrending:
DEC XX&NCCounter
BNE XX&FontUpdated
STI #NC8BitCycle,XX&NCCounter
LDY #000h ; Word1 =
LDX XX&NCBitsNew+0 ; NCBitsPrior + NCBitsNew
TXA
ADD XX&NCBitsPrior+0 ; NCBitsPrior set to
STA XX&Word1+0 ; NCBitsNew
STX XX&NCBitsPrior+0
STY XX&NCBitsNew+0 ; NCBitsNew set to 0
LDX XX&NCBitsNew+1
TXA
ifCS
ADD #001h ; if low order carry
fi
ADD XX&NCBitsPrior+1
STA XX&Word1+1
STX XX&NCBitsPrior+1
STY XX&NCBitsNew+1
BBR 0,XX&ABStatus,XX&NCOff
XX&NCOn:
LDA #HIGH(NC8BitCycle*15)
CMP XX&Word1+1
BCC XX&FontUpdated ; HIGH(Word1) > A
BNE XX&TurnNCOff ; HIGH(Word1) < A
LDA #LOW(NC8BitCycle*15)
CMP XX&Word1+0
BCC XX&FontUpdated
XX&TurnNCOff:
STI #000h,XX&ABStatus
IF "&XX" EQ "EC"
STI #001h,ECABChange
IF Test
INC SwitchToA+0
ifEQ
INC SwitchToA+1
fi
ENDIF
ENDIF
!JMP XX&FontUpdated
XX&NCOff:
LDA #HIGH(NC8BitCycle*15)
CMP XX&Word1+1
BCC XX&TurnNCOn ; HIGH(Word1) > A
BNE XX&FontUpdated ; HIGH(Word1) < A
LDA #LOW(NC8BitCycle*15)
CMP XX&Word1+0
BCS XX&FontUpdated
XX&TurnNCOn:
IF "&XX" EQ "EC"
STI #001h,ECABStatus
STI #001h,ECABChange
IF Test
INC SwitchToB+0
ifEQ
INC SwitchToB+1
fi
ENDIF
ELSE
STI #001h,DCABStatus
ENDIF
IF NCFreqSetsReset
LDA #NCFreqSetsHigh
LDX #NCFreqSets
XX&ResetGlobalHigh:
STA XX&GlobalHigh-1,X
DEX
BNE XX&ResetGlobalHigh
ENDIF
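The trending logic above runs once every NC8BitCycle calls. It sums the new-character bit totals of the previous and current windows, compares the sum with NC8BitCycle*15, and sets ABStatus to 1 when the sum exceeds that threshold and to 0 otherwise. The Python sketch below restates that decision under the assumption that the totals fit in 16 bits. The `NCTrend` class and its method names are illustrative, and the GlobalHigh reset and ECABChange flag are left out.

```python
class NCTrend:
    def __init__(self, cycle):
        self.cycle = cycle              # NC8BitCycle
        self.counter = cycle            # NCCounter
        self.bits_prior = cycle * 8     # NCBitsPrior, as set by Initialize
        self.bits_new = 0               # NCBitsNew
        self.status = 1                 # ABStatus, as set by Initialize
        self.threshold = cycle * 15

    def add_new_char_bits(self, bits):
        """Accumulate the code length of a new character (GlobalBits,X)."""
        self.bits_new = (self.bits_new + bits) & 0xFFFF

    def tick(self):
        """Call once per font update; re-evaluates status every `cycle` calls."""
        self.counter -= 1
        if self.counter:
            return self.status
        self.counter = self.cycle
        total = (self.bits_prior + self.bits_new) & 0xFFFF
        self.bits_prior, self.bits_new = self.bits_new, 0
        self.status = 1 if total > self.threshold else 0
        return self.status


trend = NCTrend(cycle=4)
history = []
for bits in (0, 40, 30):
    trend.add_new_char_bits(bits)
    for _ in range(4):
        state = trend.tick()
    history.append(state)
```

Summing two windows smooths the decision: one expensive window does not flip the mode unless the neighbouring window is also costly.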
XX&FontUpdated:
IF TwoBytes
LDA XX&NCIndex
STA (XX&FontBase)
LDX #001h
LDA XX&Characters
STA (XX&FontBase),X
ELSE
LDA XX&Characters
ASL A
ASL A
ASL A
ASL A
ORA XX&NCIndex
STA (XX&FontBase)
ENDIF
XX&PlusHash:
LDY XX&Char1Prior
STY XX&Char2Prior
LDX CRC_TH,Y
LDA XX&CurrentChar
STA XX&Char1Prior
NEG A ; extra NEG over 1st try ?????
EOR CRC_TL,Y
TAY
TXA
EOR CRC_TL,Y
TAW ; W = LOW(rough hash)
IF "&XX" EQ "EC"
LDX ECNextChar ; ECHashRaw is bits 15-0 of
STA ECHashRaw0,X ; the CRC
ENDIF
LDA CRC_TH,Y ; A = HIGH(rough hash)
IF "&XX" EQ "EC"
STA ECHashRaw1,X
ENDIF
STA XX&Word1+1
AND #HIGH(MatchMask)
STA XX&Byte1 ; Byte1 = match bits
TWA
ASL A
ROL XX&Word1+1
STA XX&Word1+0
LDA XX&Word1+1
AND #HIGH(NextMask)
ADD #HIGH(XX&FTHashRough) ; Word1 = ptr to
STA XX&Word1+1 ; FTHashRough
LDX #001h ; X = 1 for all of PlusHash
XX&PlusFineLoop:
LDA (XX&Word1),X ; direct ptr, can't be 0
BEQ XX&PlusNewHash
TAY ; Word1 may be either Rough
LDA (XX&Word1) ; or Next; is always the
AND #0FEh ; predecessor to new hash
STA XX&Word1+0
TYA
ADD #(HIGH(FTHashMatch)-HIGH(XX&FTHashNext))
STA XX&Word1+1 ; Word1 = ptr to
IF "&XX" EQ "EC" ; FTHashMatch
LDA (ECWord1),X
ELSE ; HashMatch values are inter-
LDA (DCWord1) ; mixed Decoder/Encoder
ENDIF
CMP XX&Byte1
BEQ XX&PlusFineFound
STY XX&Word1+1 ; Word1 = ptr to
!JMP XX&PlusFineLoop ; FTHashNext
XX&PlusFineFound:
TYA
STA XX&Word1+1
ADD #(0-HIGH(XX&FTHashNext))
STA XX&CurrentHash+1
IF "&XX" EQ "EC"
LDY ECNextChar
STA ECHashX21,Y
ENDIF
LDA XX&Word1+0
STA XX&CurrentHash+0
IF "&XX" EQ "EC"
STA ECHashX20,Y
ENDIF
JMP XX&PlusHashExit
XX&PlusNewHash: ; Byte1 = match bits
LDY XX&FTLastHash+1 ; Word1 = rough hash * 2
BEQ XX&PlusNewSearch
XX&PlusNew1stPass:
LDA XX&FTLastHash+O LastHash initialized to 2
ADD #002h + HIGH(FTHashNext) STA XX&FTLastHash+O BNE XX&PlusNewlstCont INY CPY #(HIGH(XX&FTHashNext)+HIGH(FontTables*2)) ifEQ STI #000h,XX&FTLastHash+l ; 0 when wrapped to force search
IJMP XX&PlusNewSearch els
STY XX&FTLastHash+l APPENDIX 1
fi XX&PlusNewlstCont:
STA XX&Word2+0 Word2 = ptr to new's IF "&XX" EQ "DC" FTHashMatch
STA DCWord3+0 ENDIF Word3 = ptr to new's TYA DCFTHashChars
STA (XX&Wordl) ,X
ADD #(HIGH(FTHashMatch)-HIGH(XX&FTHashNext)) STA XX&Word2+l IF "&XX" EQ "EC"
ADD #(0-HIGH(FTHashMatch) ) ELSE
ADD #(HIGH(DCFTHashChars)-HIGH(FTHashMatch) )
STA DCWord3+l
ADD #(0-HIGH(DCFTHashChars)) ENDIF
ORA #0δ0h set bit 7 for new hash
STA XX&CurrentHash+l IF "&XX" EQ "EC"
LDY ECNextChar store new hash in
ECHashX2
STA ECHashX21,Y or in DCCurrentHash ENDIF
LDA XX&FTLastHash+O STA (XX&Wordl) STA XX&CurrentHash+O IF "&XX" EQ "EC"
STA ECHashX20,Y ELSE
LDA DCChar2Prior ; store prior/current chars
STA (DCWord3) ; in DCFTHashChars LDA DCCharlPrior STA (DCWord3),X ENDIF APPENDIX 1
LDA XX&Bytel IF "&XX" EQ "EC" STA (ECWord2) ,X ; HashMatch values are inter- ELSE mixed Decoder/Encoder
STA (DCWord2) ENDIF
JMP XX&PlusHashExit XX&PlusNewSearch: LDA (XX&FTParent) ,X ; initialized to
FTHashRough
BNE XX&PlusNewParent XX&PlusNewRoughAdvance:
LDA XX&FTNextRough+O initialized to FTHashRough
ADD #002h
STA XX&FTNextRough+O ifEQ INC XX&FTNextRough+l ; memory allocation dependent
LDY #HIGH(XX&FTHashNext) ; i.e. FTHashRough table must
CPY XX&FTNextRough+l ; be right before FTHashNext ifEQ STI #HIGH(XX&FTHashRough) ,XX&FTNextRough+l fi fi
LDA (XX&FTNextRough) ,X BEQ XX&PlusNewRoughAdvance
LDY XX&FTNextRough+O STY XX&FTParent+O LDY XX&FTNextRough+l STY XX&FTParent+l XX&PlusNewParent:
STA XX&FTChild+l - lOδ - APPENDIX 1
LDA (XX&FTParent) STA XX&FTChild+O
TAY ; Y = FTChild+0
IJMP XX&PlusNewNext2Found XX&PlusNewNextAdvance:
LDA XX&FTChild+l STA XX&FTParent+l LDA XX&FTChild+O STA XX&FTParent+O IJMP XX&PlusNewSearch
XX&PlusNewNext2Found:
CPY XX&Wordl+O ifEQ LDA XX&FTChild+l CMP XX&Wordl+l
BEQ XX&PlusNewNextAdvance fi
LDA (XX&FTChild) STA (XX&FTParent) LDA (XX&FTChild) ,X
STA (XX&FTParent) ,X
STY XX&Word2+0 ; Word2 = ptr to new's IF "&XX" EQ "DC" ; FTHashMatch STY DCWord3+0 ENDIF ; Word3 = ptr to new's
TYA ; DCFTHashChars
STA (XX&Wordl) LDA XX&FTChild+l STA (XX&Wordl),X ADD #(HIGH(FTHashMatch)-HIGH(XX&FTHashNext) )
STA XX&Word2+l IF "&XX" EQ "EC"
ADD #(0-HIGH(FTHashMatch) ) ELSE ADD #(HIGH(DCFTHashChars)-HIGH(FTHashMatch) )
STA DCWord3+l APPENDIX 1
ADD #(0-HIGH(DCFTHashChars)) ENDIF
ORA #0δ0h set bit 7 for new hash
STA XX&CurrentHash+l IF "&XX" EQ "EC"
LDY ECNextChar ; store new hash in
ECHashX2
STA ECHashX21,Y ; or in DCCurrentHash ENDIF
LDA XX&Word2+0 STA XX&CurrentHash+O IF "&XX" EQ "EC"
STA ECHashX20,Y ELSE
LDA DCChar2Prior ; store prior/current chars
STA (DCWord3) in DCFTHashChars LDA DCCharlPrior STA (DCWord3),X ENDIF
LDA XX&Bytel IF "&XX" EQ "EC" STA (ECWord2),X ; HashMatch values are inter-
ELSE ; mixed Decoder/Encoder
STA (DCWord2) ENDIF LDA #000h STA (XX&FTChild) ,X STA (XX&FTChild) XX&PlusHashExit:
IF "&XX" EQ "DC" ;{
LDA DCCurrentHash+0 ; LOW of prior hash
STA DCFontBase+0
LDA DCCurrentHash+1 ; HIGH of prior hash
AND #07Fh
CLC
ROL DCFontBase+0
ROL A ; now = * 4
ROL DCFontBase+0
ROL A ; now = * 8
IF FontSize EQ 16 ;{
ROL DCFontBase+0
ROL A ; now = * 16
ENDIF ;}
ADD #HIGH(DCFontTables)
STA DCFontBase+1
LDA DCCurrentHash+1 ; HIGH of prior hash
ifMI
LDA #000h ; new font
STA DCCharacters
STA DCNCIndex
els
LDA (DCFontBase) ; old font
IF TwoBytes ;{
; NCIndex
; Characters
; W = CharsNCIndex
; NCIndex in bits 3-0
; [figure imgf000112_0001: this part of the listing is reproduced only as an image in the source]
; Characters in bits 7-4
ENDIF ;}
; [figure imgf000113_0001: this part of the listing is reproduced only as an image in the source]
*************** E N C O D E R R E F I L L **************
ENCODER REFILL
ECRefill:
IF Prodder
DEC ProdCounter
ifEQ
STI #ProdCycle,ProdCounter
STI #0A0h,ECCommand
JMP ECProdCommand
fi
ENDIF
JSR ECReadCharacter ; A = char from PC
IF EOFControl
BBR 7,HostLCR,ECRefillRepeats
STI #0C8h,ECCommand
JMP ECProdCommand
ENDIF
ECRefillRepeats:
IF Repeats ;{
CMP ECChar2Prior
BNE ECRefillUpdate
CMP ECChar1Prior
BNE ECRefillUpdate
LDY ECRepeatCount ; 3 in a row are equal
BEQ ECRefill1stRepeat
INC ECRepeatCount
BEQ ECRefill256thRepeat ; ECRepeats = 100h
!JMP ECRefill
ECRefill1stRepeat:
STI #001h,ECRepeatCount
!JMP ECRefill
ECRefill256thRepeat:
STI #0FFh,ECRepeatCount
ENDIF ;}
ECRefillUpdate:
LDY ECNextChar
IF Repeats
LDX ECRepeatCount ; ECRepeatCount = 0 to 0FFh
; repeat character S
; [figure imgf000114_0001: this part of the listing is reproduced only as an image in the source]
; swapped with new character
fi
ENDIF
STA ECChar,Y ; A = new character
STA ECCharCopy,Y
STA ECCurrentChar
IF Test
LDA ECABStatus
ifEQ
INC AStringsOn+0
ifEQ
INC AStringsOn+1
fi
els
INC BStringsOn+0
ifEQ
INC BStringsOn+1
fi
fi
ENDIF
FontUpdate EC,FU
LDA ECABChange ; either condition requires
ORA ECCommand ; flushing the ECChar buffer
ifEQ ;{
INC ECAvailable
ifEQ ; 256 characters available
LDY ECABStatus
ifEQ
JSR StringATime
els
JSR StringBTime
fi
fi
els
STA ECFlush
INC ECAvailable
LDA ECABChange
ifNE ; if ABStatus change, this pass
EOR ECABStatus ; is cleaning up remaining
ifEQ ; characters from prior status
; [figure imgf000115_0001: this part of the listing is reproduced only as an image in the source]
ifEQ
JSR StringATime
els
JSR StringBTime
fi
fi
STI #000h,ECABChange
STI #000h,ECFlush
LDA ECCommand
BEQ ECRefillReturn
RTS
fi ;}
ECRefillReturn:
INC ECNextChar ; INC here to avoid ECChar
IF Repeats ; buffer advance on
LDA ECRepeatCount ; prods/commands
ifNE
STI #000h,ECRepeatCount
LDA ECCharSave ; use saved character which
JMP ECRefillRepeats ; forced repeat output
fi
ENDIF
JMP ECRefill
ECProdCommand:
LDA ECAvailable
ifNE
STI #0FFh,ECFlush
LDY ECABStatus
ifEQ
JSR StringATime
els
JSR StringBTime
fi
STI #000h,ECFlush
fi
IF Repeats
LDY ECRepeatCount
ifNE
LDA ECChar1Prior ; set up repeat character
JSR ECRefillUpdate ; as new character
STI #000h,ECRepeatCount
INC ECNextChar
fi
ENDIF
LDY ECNextChar
FontUpdate EC,PC
LDA ECABStatus
ifNE
JMP ECProdB
fi
ECProdA:
LDX ECNewIndex,Y
ifEQ
JMP ECProdANCNF
fi
LDA FontCode,X
Write17
ECProdANCOF:
LDA GlobalCodeHigh+255
LDX GlobalCodeLow+255
BEQ ECProdANCOFHigh
Write817
JMP ECProdShift
ECProdANCOFHigh:
Write17
JMP ECProdShift
ECProdANCNF: ; plain 0FFh
; [figure imgf000118_0001: this part of the listing is reproduced only as an image in the source]
ECProdB:
LDA ECAntiEStatus
BMI ECProdNoStrings
ELSE
ECProdB:
ENDIF
LDX ECNewIndex,Y
BEQ ECProdBNF
LDA FontCode,X
Write17
LDA #0FFh ; FFh = 11111111
Write8
!JMP ECProdShift
ECProdBNF:
LDA #0BFh ; FFh = 101111111
LDX #0C0h
Write817
ECProdShift:
LDA ECCommand
STI #000h,ECCommand
Write17
ECProdShiftLoop:
LDA ECBuffer
CMP #001h
ifEQ
JMP ECProdDone
fi
LDA #040h
Write17
!JMP ECProdShiftLoop
ECProdDone:
IF EOFControl
BBR 7,HostLCR,ECToRefill
BBR 6,HostLCR,ECProdNLB
JSR SwitchToDecode
ECProdNLB:
JMP DCOrECEOF
ECToRefill:
JMP ECRefill
ELSE
RTS
ENDIF
************ A - S T R I N G M A C R O S **************
A-STRING HASH HEAD AND SEARCH MACRO
FindAString MACRO ; A = 1st ECChar ptr
LOCAL FindABackOK,FindAMoreLoop,FindAUpdate,FindASkip,FindABackMatch,FindABackLoop,FindAReturn
STA ECByte1
STI #003h,ECByte3
ADD #001h
TAX ; X = ECNextOut+ECFind4s+1
ADD #002h
TAY ; Y = ECNextOut+ECFind4s+3
LDA ECHashRaw0,X
NEG A
EOR ECHashRaw0,Y
STA ECFindHash
STA ECWord3+0 ; Word3 = ptr to
LDA ECHashRaw1,X ; RRHashHead
NEG A
EOR ECHashRaw1,Y
AND #HIGH(BufferHashes-1)
ASL ECWord3+0
ROL A
ADD #HIGH(ECRRHashHead)
STA ECWord3+1
LDX #001h ; Word2 = ptr to 1st
LDA (ECWord3),X ; RRBuffer location
ifNE ; for this hash
STA ECWord4+1
ADD #(HIGH(ECRRBuffer)-HIGH(ECRRHashLink))
STA ECWord2+1
LDA (ECWord3) ; Word4 = ptr to 1st
STA ECWord2+0 ; RRHashLink location
STA ECWord4+0 ; for this hash
STI #MaximumASearches,ECWord1+1
!JMP FindABackMatch
fi
IF Test
ifEQ
INC FSNoHash+0
ifEQ
INC FSNoHash+1
ifEQ
INC FSNoHash+2
fi
fi
fi
ENDIF
!JMP FindAReturn
FindASkip:
DEC ECWord1+1
BEQ FindAReturn
LDX #001h
LDA (ECWord4) ; use RRHashLink to find
TAY ; next RRHashLink and
LDA (ECWord4),X ; next RRBuffer offset
BEQ FindAReturn
STY ECWord4+0
STA ECWord4+1
DecBankSelect
LDA (ECWord4)
EncBankSelect
CMP ECFindHash
BNE FindAReturn
STY ECWord2+0
LDA ECWord4+1
ADD #(HIGH(ECRRBuffer)-HIGH(ECRRHashLink))
STA ECWord2+1
FindABackMatch:
LDX ECByte3
FindABackLoop:
LDA (ECWord2),X ; check byte at longest
CMP (ECByte1),X ; string + 1 and work
IF Test ; backwards to origin
ifNE
INC FSSkips+0
ifEQ
INC FSSkips+1
ifEQ
INC FSSkips+2
fi
fi
!JMP FindASkip
fi
ELSE
BNE FindASkip
ENDIF
DEX
BNE FindABackLoop
LDA (ECWord2)
CMP (ECByte1)
IF Test
BEQ FindABackOK
INC FSSkips+0
ifEQ
INC FSSkips+1
ifEQ
INC FSSkips+2
fi
fi
!JMP FindASkip
ELSE
BNE FindASkip
ENDIF
FindABackOK:
IF Test ; Word1 = RRHashCount (+0)
INC FSSearches+0 ; Word2 = RRBuffer offset
ifEQ ; Word3 = RRBuffer best string
INC FSSearches+1 ; Word4 = 1st RRHashLink
ifEQ ; Byte1 = 1st unmatched ECChar
INC FSSearches+2 ; Byte3 = best string length
fi
fi
ENDIF
LDX ECByte3
FindAMoreLoop:
INX
ifNE
LDA (ECWord2),X
CMP (ECByte1),X
BEQ FindAMoreLoop
els
LDX #0FFh
fi
FindAUpdate:
LDA ECWord2+0
STA ECWord3+0 ; Word3 = RRBuffer offset of
LDA ECWord2+1 ; best string
STA ECWord3+1
CPX ECMaxLength
ifCC
STX ECByte3 ; Byte3 = best string length
!JMP FindASkip
fi
LDA ECMaxLength ; string length at maximum
STA ECByte3
FindAReturn:
ENDM
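FindAString is a bounded hash-chain search. It looks up the first candidate through a hash head table, follows link entries for at most MaximumASearches candidates, rejects a candidate quickly by testing the byte just past the current best length and working backwards, and then extends any surviving match forwards. The Python sketch below illustrates the same strategy in simplified form. It keys the chains on the first four bytes themselves rather than on the CRC-derived hash, it does not model the stored-hash check that ends the chain walk in the listing, and every name in it is illustrative.

```python
def build_chains(history, key_len=4):
    """Return (heads, links): newest position per key, and a link to the previous one."""
    heads = {}
    links = [None] * len(history)
    for pos in range(len(history) - key_len + 1):
        key = bytes(history[pos:pos + key_len])
        links[pos] = heads.get(key)
        heads[key] = pos
    return heads, links


def find_a_string(history, heads, links, target,
                  max_searches=8, key_len=4, max_length=255):
    """Return (position, length) of the best match, or (None, key_len - 1)."""
    best_pos, best_len = None, key_len - 1
    pos = heads.get(bytes(target[:key_len]))
    searches = 0
    while pos is not None and searches < max_searches:
        searches += 1
        limit = min(max_length, len(target), len(history) - pos)
        # Test index best_len first, then work backwards to the origin.
        if limit > best_len and all(
            history[pos + i] == target[i] for i in range(best_len, -1, -1)
        ):
            length = best_len + 1
            while length < limit and history[pos + length] == target[length]:
                length += 1
            best_pos, best_len = pos, length
        pos = links[pos]
    return best_pos, best_len


history = b"abcdefg_abcdxyz_abcdefgh"
heads, links = build_chains(history)
best = find_a_string(history, heads, links, b"abcdefgh!")
```

Checking the byte at the current best length first means a candidate that cannot beat the best match is usually discarded after a single comparison.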
************* A - S T R I N G S E A R C H *************
SkipAStrings:
CMP #(MinimumAUpdate+1)
ifCS
JMP NoAFound
fi
LDY ECNextOut
INC ECNextOutSave
DEC ECAvailable
JMP WriteAEncodings ; Y = ECNextOut
StringATime:
LDY ECNextOut
STY ECNextOutSave
!JMP StringASearch1st
StringASearch:
LDY ECNextOut
StringASearch1st:
LDA ECAvailable
ifEQ
LDA #0FFh
els
CMP #007h
BCC SkipAStrings
fi
ADD #(0-003h)
STA ECMaxLength ; 255 - 3 is MaxLeng
STI #0FFh,ECStringOrigin
STI #HIGH(ECChar),ECByte2
IF Test
INC FSEntries+0
ifEQ
INC FSEntries+1
ifEQ
INC FSEntries+2
fi
fi
ENDIF
FindA43:
LDA ECNextOut ADD #003h FindAString LDY ECByte3 ; Y = string length
CPY #004h BCC FindA42 LDA ECNextOut STA ECBytel LDX #002h ; X = ECOrigin - 1
FindA43Loop: APPENDIX 1
LDA ECWord3+0 ADD #OFFh STA ECWord3+0 ifCC LDA ECWord3+l
ADD #0FFh
CMP #HIGH(ECRRBuffer) ifCC
ADD #HIGH(BufferSize) fi
STA ECWord3+l fi
LDA (ECWord3) CMP (ECBytel) ,X BNE FindA43Adjust
INY DEX
BPL FindA43Loop IJMP FindA43Done FindA43Adjust:
INC ECWord3+0 ifEQ
INC ECWord3+l fi FindA43Done:
STY ECStringLength INX
STX ECStringOrigin LDA ECWord3+0 STA ECFound+0
LDA ECWord3+l STA ECFound+1 SEC
IF ZoneTestA LDA ECRRPtr+0
SBC ECWord3+0 APPENDIX 1
ENDIF
LDA ECRRPtr+1
SBC ECWord3+l
AND #HIGH(BufferSize-l)
STA ECZone
FindA42;
tring length
a little more compression if string length governs ?????
X = ECOrigin - 1
Figure imgf000126_0001
INY DEX
BPL FindA42Loop IJMP FindA42Done FindA42Adjust:
INC ECWord3+0 ifEQ
INC ECWord3+l fi FindA42Done: INX
STX ECBytel ; ECOrigin
SEC
IF ZoneTestA LDA ECRRPtr+0
SBC ECWord3+0 ENDIF
LDA ECRRPtr+1 SBC ECWord3+l AND #HIGH(BufferSize-l)
TAX
LDA ECStringOrigin ifPL ;{ CPY ECStringLength BCC FindA4Exit ifEQ ;{ CPX ECZone BCS FindA4Exit fi ;} fi ;}
STY ECStringLength
STX ECZone
LDA ECBytel
STA ECStringOrigin LDA ECWord3+0
STA ECFound+0 APPENDIX 1
LDA ECWord3+l STA ECFound+1 FindA4Exit: __
LDA ECStringOrigin BMI NoAFound
LDA ECStringLength CMP #MinimumAString BCC NoAFound StringAOverlap: LDX ECRRPtr+1'
LDA ECStringOrigin ADD ECRRPtr+0 ifCS INX fi
SEC
SBC ECFound+0
STA ECWordl+0 ; Wordl+0 = LOW(Diff) TAW TXA
SBC ECFound+1 AND #HIGH(Buffersize-1) STA ECWordl+1 ifEQ ;{ ; Delta(ECWordl) < 256 LDA ECStringOrigin ifNE ;{ ADD ECStringLength STA ECByte4
TWA ; W = ECWordl+0 (saved later) ifNE ;{ CMP ECByte4 ; UL = StringLength+StringOrigin
BCC NoAFound fi ;} fi APPENDIX 1
fi ;}
JMP ProcessAString NoAFound:
IF AHashX2 XOR 1 LDA ECNextOut
ADD #MinimumAUpdate STA ECNextOut JMP ResetACharCounts ELSE LDA #MinimumAUpdate
NoAFoundHashX2:
STA ECByte4
JSR HashAX2 ; A = -1 or -2 ADD ECByte4 BMI NoAFoundHashX2Negative
BNE NoAFoundHashX2 NoAFoundHashX2Negative:
JMP ResetACharCounts HashAX2: Y = index to reach
ECNextOut+l data items
2(11 length) + 10
3(001 length) + 10
ype 2
Figure imgf000130_0001
INX
DEC ECByte3 BNE HashAX2SumBits HashAX2Null: INC ECNextOut
LDA #(0-001h) RTS HashAX20K:
LDX ECNextOut LDY ECNextOut
INY
IF Test INC AHashX2s+0 ifEQ INC AHashX2s+l fi ENDIF LDA #006h Type 6 STA ECType,X LDA ECWord2+l
ASR A
ROR ECWord2+0 AND #003h CLC ROR A
ROR A ROR A ORA #02Oh STA ECHashX21,Y ; ECHashX21 of 2nd character =
LDA ECWord2+0 2 high-order hash bits STA ECHashX20,Y LDA ECNextOut ECHashX20 of 2nd character = ADD #002h ; δ low-order hash bits STA ECNextOut
LDA #(0-002h) APPENDIX 1
RTS ENDIF ProcessAString:
IF AHashX2 XOR 1 LDA ECNextOut
ADD ECStringOrigin STA ECNextOut ELSE LDA ECStringOrigin IJMP ProcessABestXlst
ProcessABestXLoop:
ADD ECStringOrigin STA ECStringOrigin ProcessABestXlst: CMP #002h
BCC ProcessABestX
JSR HashAX2 ; A = -1 OR -2
IJMP ProcessABestXLoop ProcessABestX: AND #0FFh ifNE
INC ECNextOut fi ENDIF IF Test
INC AStringsFound+O ifEQ
INC AStringsFound+1 fi ENDIF
DirectAString:
LDY ECNextOut
LDA ECStringLength
STA ECByte3 ; Byte3 = StringLength
LDX ECNewIndex,Y
ifNE
IF AHashX2
ADD #(0-(MinimumAString-4))
ELSE
ADD #(0-MinimumAString)
ENDIF
STA ECByte4 ; Byte4 = Global length index
INX
LDA FontBits,X
els
ADD #(0-MinimumAString)
STA ECByte4 ; Byte4 = Global length index
; [listing continues in an image in the source (imgf000133_0001); comment fragments: "011", "01", "A = total string", "Type 2", "Type 4"]
SBC FontBits,X
LDX ECFontlndex,Y
SBC GlobalBits,X els ;{} LDX ECFontlndex,Y
SBC GlobalBits,X TAW
LDA GlobalCodeHigh,X ifPL ;{ TWA
SBC #001h els ;{} TWA fi ;} fi ;} fi ;}
BMI DirectAUse INY
DEC ECByte3 BNE DirectASumBits
DirectAReject:
LDA ECNextOut ADD #MinimumAUpdate
STA ECNextOut ; try HashX2'S ????? JMP ResetACharCounts
DirectAUse:
LDY ECNextOut IF Test INC AStringsUsed+O ifEQ
INC AStringsUsed+1 fi ENDIF
LDA #008h ; Type 8
STA ECType,Y
LDA ECByte4 ; Global or LengthB index
STA ECHashX20,Y ; saved for ECWrite
INY
LDA ECWord1+0 ; save Zone codes for ECWrite
STA ECHashX20,Y ; in 2nd character's ECHashX2
LDA ECWord1+1
STA ECHashX21,Y
LDA ECStringLength
ADD ECNextOut STA ECNextOut
ResetACharCounts:
LDA ECAvailable
ADD ECNextOutSave ; update ECAvailable
SEC
SBC ECNextOut
STA ECAvailable
LDY ECNextOutSave ; interchange ECNextOut and
LDA ECNextOut ; ECNextOutSave
STA ECNextOutSave
STY ECNextOut ; Y = ECNextOut
************* A - S T R I N G O U T P U T *************
ENCODER WRITE ROUTINE
WriteAEncodings: ; Y = ECNextOut
LDA ECType,Y
ORA ECRepeatSW,Y ; bit 3 on if repeats
TAX
JMP (WriteAJumps),X
WriteAJumps:
IF AHashX2
DW WriteANull ; 0 - HashX2(2) - no repeats or repeats
ELSE
DW 0 ; 0 - HashX2 inactive
ENDIF
DW WriteAOFont ; 2 - Font char - no repeats
DW WriteAONewChar ; 4 - New char - no repeats
IF AHashX2
DW WriteAHashX2 ; 6 - HashX2(1) - no repeats
ELSE
DW 0 ; 6 - HashX2 inactive
ENDIF
DW WriteAString ; 8 - String(1) - no repeats or repeats
IF Repeats
DW WriteAlFont ; 10 - Font char - repeats
DW WriteAlNewChar ; 12 - New char - repeats
IF AHashX2
DW WriteAHashX2 ; 14 - HashX2(1) - repeats
ELSE
DW 0 ; 14 - HashX2 inactive
ENDIF
ENDIF
IF AHashX2
WriteANull:
IF Repeats
LDA ECRepeats,Y
ifNE
JMP WriteAORepeats
els
JMP UpdateAOBuffer
fi
ELSE
JMP UpdateAOBuffer
ENDIF
ENDIF
WriteAOFont:
LDX ECFontIndex,Y ; char encoding index - font
LDA FontCode,X
Write17
JMP UpdateAOBuffer
IF Repeats WriteAlFont:
LDX ECFontlndex,Y char encoding index - font LDA FontCode,X
Writel7
JMP WriteAORepeats ENDIF WriteAONewChar: LDX ECNewIndex,Y NC encoding index
BEQ WriteAONCNF LDA FontCode,X Writel7 WriteAONCOF: LDX ECFontlndex,Y char encoding index - global
LDA GlobalCodeHigh,X TAW
LDA GlobalCodeLow,X BEQ WriteAONCOFHigh
TAX TWA
Writeδ17
IJMP WriteAOCommand
WriteAONCOFHigh:
TWA
; [part of the listing survives only as an image in the source (imgf000138_0001); comment fragments: "char encoding index", "leading 0 bit"]
Write17
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X
BEQ WriteAONCNFHighl
TAX TWA
Write817
IJMP WriteAOCommand WriteAONCNFHighl:
TWA WriteA0NCNFHigh2:
Writel7 WriteAOCommand:
LDA ECFontIndex,Y ; char encoding index - global
CMP #0FEh
ifCC
JMP UpdateAOBuffer
fi
LDA #040h
Write17
JMP UpdateAOBuffer
IF Repeats
WriteAlNewChar:
LDX ECNewIndex,Y ; NC encoding index
BEQ WriteAlNCNF
LDA FontCode,X
Write17
WriteAlNCOF:
LDX ECFontIndex,Y ; char encoding index - global
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X BEQ WriteAlNCOFHigh TAX TWA
Write817
IJMP WriteAlCommand WriteAlNCOFHigh: TWA Writel7
JMP WriteAlCommand WriteAlNCNF:
LDX ECFontlndex,Y ; char encoding index - global LDA GlobalCodeHigh,X
BMI WriteAlNCNFHigh2
LDA #040h ; leading 0 bit
Write17
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X
BEQ WriteAlNCNFHighl
TAX
TWA
Write817
IJMP WriteAlCommand
WriteAlNCNFHighl:
TWA
WriteAlNCNFHigh2:
Write17
WriteAlCommand:
LDA ECFontIndex,Y ; char encoding index - global
CMP #0FEh
ifCC
JMP WriteAORepeats
fi
LDA #040h
Write17
JMP WriteAORepeats
ENDIF
IF AHashX2
WriteAHashX2:
LDX ECNewIndex,Y
BEQ WriteAHNF WriteAHOF:
INX
LDA FontCode,X
Writel7
LDA #0E0h ; 11 length
Write17
IJMP WriteAHMain
WriteAHNF:
LDA #050h ; 010
Write17
WriteAHMain:
INY
LDA ECHashX21,Y
Write17
LDA ECHashX20,Y
Write8
LDA #000h
STA ECType,Y ; Type 0 (2nd byte)
STA ECRepeatSW,Y
DEY
IF Repeats
LDA ECRepeats,Y
ifNE
JMP WriteAORepeats
els
JMP UpdateAOBuffer
fi
ELSE
JMP UpdateAOBuffer
ENDIF
ENDIF
WriteAString:
LDX ECNewIndex,Y BEQ WriteASNF WriteASOF:
INX LDA FontCode,X
Writel7
IJMP WriteASLength WriteASNF:
IF AHashX2
LDA #070h ; 011
ELSE
LDA #060h ; 01
ENDIF
Write17
WriteASLength:
LDX ECHashX20,Y
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X
BEQ WriteASLHigh
TAX
TWA
Write817
IJMP WriteASMain
WriteASLHigh:
TWA
Write17
WriteASMain:
INY
LDX ECHashX21,Y
LDA ZoneCode,X
Write17
LDA ECHashX20,Y
Write8
DEY
IF Repeats
LDA ECRepeats,Y
ifNE
JMP WriteAlRepeats
els
JMP UpdateAlBuffer
fi
ELSE
JMP UpdateAlBuffer
ENDIF
****** A - S T R I N G B U F F E R U P D A T E *****
IF Repeats
WriteAORepeats:
LDA ECRepeats,Y
CMP #001h
BNE WriteAOAreRepeats
WriteAONoRepeats:
LDA #040h
Write17
IJMP WriteAORClear
WriteAOAreRepeats:
LDA #0C0h
Write17
LDA ECRepeats,Y
ADD #0FEh
TAX
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X
BEQ WriteAORHigh
TAX
TWA
Write817
IJMP WriteAORClear
WriteAORHigh:
TWA
Write17
WriteAORClear:
LDA #000h
STA ECRepeats,Y
STA ECRepeatSW,Y
ENDIF
UpdateAOBuffer: ; Y = ECNextOut
LDA ECChar,Y
STA (ECRRPtr)
IF BufferSuffix
LDX ECRRPtr+1
CPX #HIGH(ECRRBuffer)
ifEQ
STI #(HIGH(ECRRBuffer)+HIGH(BufferSize)),ECRRPtr+1
STA (ECRRPtr)
STI #HIGH(ECRRBuffer),ECRRPtr+1
fi
ENDIF
BBS 0,ECRRPtr+0,UpdateAOHead
JMP UpdateAOBufferPtr
UpdateAOHead:
LDX #001h
LDA ECRRPtr+0 ; Word3 = ptr to RRHashLink
ADD #(0-003h) ; at location RRPtr-3
STA ECWord3+0
LDA ECRRPtr+1
ifCC
ADD #0FFh
CMP #HIGH(ECRRBuffer)
ifCC
LDA #(HIGH(ECRRBuffer)+HIGH(BufferSize)-1)
fi
fi
ADD #(HIGH(ECRRHashLink)-HIGH(ECRRBuffer))
STA ECWord3+1
LDA ECHashRaw0,Y ; Word4 = ptr to
EOR ECPriorHash0 ; RRHashHead
STA ECWord4+0
DecBankSelect
STA (ECWord3) ; store LOW(Hash) in
EncBankSelect ; RRHashTest table
LDA ECHashRaw1,Y
EOR ECPriorHash1
AND #HIGH(BufferHashes-1)
ASL ECWord4+0
ROL A
ADD #HIGH(ECRRHashHead)
STA ECWord4+1
LDA ECHashRaw0,Y
NEG A
STA ECPriorHash0
LDA ECHashRaw1,Y
NEG A
STA ECPriorHash1
UpdateAOLink:
LDA (ECWord4) ; transfer RRHashHead to
STA (ECWord3) ; RRHashLink table
LDA (ECWord4),X
STA (ECWord3),X
; [listing continues in an image in the source (imgf000145_0001)]
UpdateAOBufferPtr:
INC ECRRPtr+0
ifEQ
INC ECRRPtr+1
LDA ECRRPtr+1
CMP #(HIGH(ECRRBuffer)+HIGH(BufferSize))
ifEQ
STI #HIGH(ECRRBuffer),ECRRPtr+1
fi
fi
IF Failsafe
DEC ECFailSafe+0
ifEQ
DEC ECFailSafe+1
ifEQ
STI #FailSafeSets,ECFailSafe+1
LDA #008h
Write17
fi
fi
ENDIF
OutputAOControl:
INY
STY ECNextOut
CPY ECNextOutSave
ifNE
JMP WriteAEncodings ; Y = ECNextOut
fi
LDA ECFlush
BNE OutputAOFlush
LDA #SetLength
CMP ECAvailable
ifCS
RTS
fi
JMP StringASearch
OutputAOFlush:
LDA ECAvailable
ifEQ
RTS
fi
JMP StringASearch
IF Repeats
WriteAlRepeats:
CMP #001h
BNE WriteAlAreRepeats
WriteAlNoRepeats:
LDA #040h
Write17
IJMP WriteAlRClear
WriteAlAreRepeats:
LDA #0C0h
Write17
LDA ECRepeats,Y
ADD #0FEh
TAX
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X
BEQ WriteAlRHigh
TAX
TWA
Write817
IJMP WriteAlRClear
WriteAlRHigh:
TWA
Write17
WriteAlRClear:
LDA #000h
STA ECRepeats,Y
STA ECRepeatSW,Y
ENDIF
UpdateAlBuffer: ; Y = ECNextOut
LDA ECChar,Y
STA (ECRRPtr)
IF BufferSuffix
LDX ECRRPtr+1
CPX #HIGH(ECRRBuffer)
ifEQ
STI #(HIGH(ECRRBuffer)+HIGH(BufferSize)),ECRRPtr+1
STA (ECRRPtr)
STI #HIGH(ECRRBuffer),ECRRPtr+1
fi
ENDIF
BBS 0,ECRRPtr+0,UpdateAlHead
JMP UpdateAlBufferPtr
UpdateAlHead:
LDX #001h
LDA ECRRPtr+0 ; Word3 = ptr to RRHashLink
ADD #(0-003h) ; at location RRPtr-3
STA ECWord3+0
LDA ECRRPtr+1
ifCC
ADD #0FFh
CMP #HIGH(ECRRBuffer)
ifCC
LDA #(HIGH(ECRRBuffer)+HIGH(BufferSize)-1)
fi
fi
ADD #(HIGH(ECRRHashLink)-HIGH(ECRRBuffer))
; [listing continues in an image in the source (imgf000148_0001)]
STI #HIGH(ECRRBuffer),ECRRPtr+1
fi
fi
IF Failsafe
DEC ECFailSafe+0
ifEQ
DEC ECFailSafe+1
ifEQ
STI #FailSafeSets,ECFailSafe+1
LDA #008h
Write17
fi
fi
ENDIF
OutputAlControl:
INY
STY ECNextOut
CPY ECNextOutSave
ifNE
LDA ECRepeats,Y
ifEQ
JMP UpdateAlBuffer
fi
JMP WriteAlRepeats ; Y = ECNextOut
fi
LDA ECFlush
BNE OutputAlFlush
LDA #SetLength
CMP ECAvailable
ifCS
RTS
fi
JMP StringASearch
OutputAlFlush:
LDA ECAvailable
ifEQ
RTS
fi
JMP StringASearch
************ B - S T R I N G M A C R O S ************
B-STRING SEARCH MACRO - BYTE 2
FindB2String MACRO
STI #002h,ECByte3
INX ; Word3 = ptr to
LDA ECHashRaw0,X ; FTHashHead
STA ECFindHash
STA ECWord3+0
LDA ECHashRaw1,X ; Word2 = ptr to 1st
AND #HIGH(BufferHashes-1) ; (RRBuffer+1) location
ASL ECWord3+0 ; for this hash
ROL A
ADD #HIGH(ECRRHashHead)
STA ECWord3+1 ; Word4 = ptr to 1st
LDX #001h ; (FTHashLink+1) location
LDA (ECWord3),X ; for this hash
ifNE
STA ECWord4+1
LDA (ECWord3)
STA ECWord4+0
TAY
STI #MaximumBSearches,ECWord1+1
IJMP FindB2Skiplst
fi
IJMP FindB2Return
FindB2Skip:
DEC ECWord1+1
; [listing continues in an image in the source (imgf000151_0001)]
ifEQ
INC FSSkips+2
fi
fi
IJMP FindB2Skip
ELSE
BNE FindB2Skip
ENDIF
FindB2More:
IF Test ; Word1 = emergency max hashes
INC FSSearches+0 ; Word2 = RRBuffer ptr
ifEQ ; Word3 = RRBuffer best string
INC FSSearches+1 ; Word4 = FTHashLink ptr
ifEQ ; Byte1 = 1st unmatched ECChar
INC FSSearches+2 ; Byte3 = best string length
fi
fi
ENDIF
LDX #000h
FindB2MoreLoop:
INX
ifNE
LDA (ECWord2),X
CMP (ECByte1),X
BEQ FindB2MoreLoop
els
LDX #0FFh
fi
FindB2Update:
CPX ECByte3
BCC FindB2Skip ; reset ECChar offset used in FindnnStart routine
; [part of the listing survives only as an image in the source (imgf000153_0001); comment fragments: "Word3 = RRBuffer offset of best string", "Byte3 = best string length"]
IJMP FindB2Skip
fi
LDA ECMaxLength ; string length at maximum
STA ECByte3
FindB2Return:
ENDM
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
B-STRING SEARCH MACRO - BYTE 1
; [the start of this macro survives only as an image in the source (imgf000153_0002); comment fragments: "Word3 = ptr to FTHashHead", "Word2 = ptr to 1st (RRBuffer+1) ... this hash", "Word4 = ptr to 1st (FTHashLink+1)"]
LDA (ECWord3),X ; for this hash
ifNE
STA ECWord4+1
LDA (ECWord3)
STA ECWord4+0
TAY
STI #MaximumBSearches,ECWord1+1
IJMP FindBlSkiplst
fi
IJMP FindBlReturn
FindBlSkip:
DEC ECWord1+1
BEQ FindBlReturn
LDX #001h
LDA (ECWord4) ; use FTHashLink to find
TAY ; next FTHashLink and
LDA (ECWord4),X ; next RRBuffer offset
BEQ FindBlReturn
STY ECWord4+0
STA ECWord4+1
DecBankSelect
LDA (ECWord4)
EncBankSelect
CMP ECFindHash
BNE FindBlReturn
FindBlSkiplst:
STY ECWord2+0 ; Y = ECWord4+0
LDA ECWord4+1
ADD #(HIGH(ECRRBuffer)-HIGH(ECRRHashLink))
STA ECWord2+1
LDX #002h
LDA (ECWord2),X
CMP (ECByte1),X
IF Test
BEQ FindBlMore
INC FSSkips+0
ifEQ
INC FSSkips+1
ifEQ
INC FSSkips+2
fi
fi
IJMP FindBlSkip
ELSE
BNE FindBlSkip
ENDIF
FindBlMore:
IF Test ; Word1 = emergency max hashes
INC FSSearches+0 ; Word2 = RRBuffer ptr
ifEQ ; Word3 = RRBuffer best string
INC FSSearches+1 ; Word4 = FTHashLink ptr
ifEQ ; Byte1 = 1st unmatched ECChar
INC FSSearches+2 ; Byte3 = best string length
fi
fi
ENDIF
LDX #000h
IJMP FindBlMorelst
FindBlMoreLoop:
INX
ifNE
FindBlMorelst:
LDA (ECWord2),X
CMP (ECByte1),X
BEQ FindBlMoreLoop
els
LDX #0FFh
fi
FindBlUpdate:
CPX ECByte3
BCC FindBlSkip ; reset ECChar offset used
BEQ FindBlSkip ; in FindnnStart routine
LDA ECWord2+0
STA ECWord3+0 ; Word3 = RRBuffer offset of
LDA ECWord2+1 ; best string
STA ECWord3+1
CPX ECMaxLength
ifCC
STX ECByte3 ; Byte3 = best string length
IJMP FindBlSkip
fi
LDA ECMaxLength ; string length at maximum
STA ECByte3
FindBlReturn:
ENDM
************ B - S T R I N G S E A R C H *************
SkipBStrings:
CMP #(MinimumBUpdate+1)
ifCS
JMP NoBFound
fi
LDY ECNextOut
INC ECNextOutSave
DEC ECAvailable
IF AntiEx
JMP TotalBBits
ELSE
JMP UpdateBBuffer ; Y = ECNextOut
ENDIF
; [listing continues in an image in the source (imgf000157_0001)]
STY ECNextOutStart
STI #000h,ECExcessBits
ENDIF
IJMP StringBSearchlst StringBSearch: LDY ECNextOut
StringBSearchlst:
LDA ECAvailable
ifEQ
LDA #0FFh
els
CMP #003h
BCC SkipBStrings
fi
STA ECMaxLength ; 255 is MaxLength
STY ECByte1
STI #HIGH(ECChar),ECByte2
STI #000h,ECStringLength
IF Test
INC FSEntries+0
ifEQ
INC FSEntries+1
ifEQ
INC FSEntries+2
fi
fi
ENDIF
FindB2:
LDX ECNextOut
INX
FindB2String
LDX ECByte3
CPX #(MinimumBString+1)
BCC FindBl
SEC
IF ZoneTestB
LDA ECRRPtr+0
SBC ECWord3+0
ENDIF
LDA ECRRPtr+1
SBC ECWord3+1
AND #HIGH(BufferSize-1)
STA ECZone
STX ECStringLength
LDA ECWord3+0
STA ECFound+0
LDA ECWord3+1
STA ECFound+1
LDA ECWord3+0
FindBl:
LDX ECNextOut
FindBlString
LDX ECByte3
CPX #(MinimumBString+1)
BCC FindBlExit
CPX ECStringLength
BCC StringBOverlap
ifEQ
SEC
IF ZoneTestB
LDA ECRRPtr+0
SBC ECWord3+0
ENDIF
LDA ECRRPtr+1
SBC ECWord3+1
AND #HIGH(BufferSize-1)
CMP ECZone
BCS StringBOverlap
fi
STX ECStringLength
LDA ECWord3+0
STA ECFound+0
LDA ECWord3+1
STA ECFound+1
IJMP StringBOverlap
FindBlExit:
LDA ECStringLength
BEQ NoBFound
StringBOverlap:
LDA ECRRPtr+0
SEC
SBC ECFound+0
STA ECWord1+0 ; Word1+0 = LOW(Diff)
LDA ECRRPtr+1
SBC ECFound+1
AND #HIGH(BufferSize-1)
STA ECWord1+1
JMP ProcessBString
NoBFound:
JSR HashBX2
JMP ResetBCharCounts
HashBX2:
LDY ECNextOut ; Y = index to reach
INY ; ECNextOut+1 data items
LDA ECHashX21,Y
BMI HashBX2Null
LDX ECNextOut
STA ECWord2+1
ORA #080h
CMP ECHashX21,X
ifEQ
LDA ECHashX20,Y
CMP ECHashX20,X
BEQ HashBX2Null
els
LDA ECHashX20,Y
fi
STA ECWord2+0
HashBX2Bits:
LDY ECNewIndex,X ; NC encoding index
ifNE
INY
LDA FontBits,Y ; ST bits + Hash(=10)
ADD #00Ah
els
LDA #00Dh ; 3(110 length) + 10
fi
IF AntiEx
TAW
ENDIF
STI #002h,ECByte3
SEC
HashBX2SumBits:
; [listing continues in an image in the source (imgf000160_0001); comment fragments: "Type 2", "Type 4", "NC encoding index"]
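; [listing continues in an image in the source (imgf000161_0001); comment fragments: "total bits for 2-byte", "less cost of 8-bit", "save bit length for", "in 2nd character", "Type 6", "Type 0 (skip)"]
; [listing continues in an image in the source (imgf000162_0001); comment fragments: "ECHashX21 of 2nd", "high-order hash bits", "ECHashX20 of 2nd character = 8 low-order hash bits"]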
; Byte3 = StringLength
ADD #(0-(MinimumBString+1))
STA ECByte4 ; Byte4 = LengthB index
CMP #009h
ifCC
TAX
LDA LengthBBits,X ; length bits from table
els
ADD #(0-009h)
TAX
LDA GlobalBits,X
; [listing continues in an image in the source (imgf000163_0001)]
SBC #009h
fi ;}
fi ;}
fi
BMI DirectBUse
INY
DEC ECByte3
BNE DirectBSumBits
DirectBReject:
JSR HashBX2
JMP ResetBCharCounts
DirectBUse:
LDY ECNextOut
IF AntiEx
TWA
STA ECHashX21,Y ; save bit length for AntiEx
; [listing continues in an image in the source (imgf000164_0001); comment fragments: "Type 8", "Global or LengthB index saved for ECWrite", "save Zone codes for ECWrite in 2nd character's ECHashX2"]
DEX
LDA #000h
UseBStringLoop:
STA ECType,Y ; set Type to 0 in the
INY ; (ECStringLength-1) chars
DEX ; which generate no output
BNE UseBStringLoop
ENDIF
LDA ECStringLength
ADD ECNextOut
STA ECNextOut
ResetBCharCounts:
LDA ECAvailable
ADD ECNextOutSave ; update ECAvailable
SEC
SBC ECNextOut
STA ECAvailable
LDY ECNextOutSave ; interchange ECNextOut and
LDA ECNextOut ; ECNextOutSave
STA ECNextOutSave
STY ECNextOut ; Y = ECNextOut
************ B - S T R I N G O U T P U T *************
TOTAL B BITS FOR AntiExpansion
IF AntiEx
TotalBBits: ; Y = ECNextOut
LDA ECExcessBits
ADD #(0-008h)
LDX ECType,Y
JMP (TotalBJumps),X
TotalBJumps:
DW TotalBDone
DW TotalBFont
; [listing continues in an image in the source (imgf000166_0001)]
; char encoding index - font
ADC FontBits,X
IJMP TotalBDone
TotalBNewChar:
CLC
LDX ECNewIndex,Y ; NC encoding index
BEQ TotalBNCMainNF
ADC FontBits,X
TotalBNCMainOF:
ADD #008h ; char encoding index - 8-bit
IJMP TotalBDone
TotalBNCMainNF:
LDX ECFrequency,Y ; char encoding index - 8-bit
ifPL
ADD #008h
els
ADD #009h
fi
IJMP TotalBDone
TotalBHashX2:
TotalBString:
CLC
ADC ECHashX21,Y
IF AntiEx
STA ECExcessBits
JMP UpdateBlBuffer
ENDIF
TotalBDone:
; [listing continues in an image in the source (imgf000167_0001)]
***** B - S T R I N G B U F F E R U P D A T E *****
UPDATE BUFFER
UpdateBBuffer: ; Y = ECNextOut
LDA ECChar,Y
STA (ECRRPtr)
IF BufferSuffix
LDX ECRRPtr+1
CPX #HIGH(ECRRBuffer)
ifEQ
STI #(HIGH(ECRRBuffer)+HIGH(BufferSize)),ECRRPtr+1
STA (ECRRPtr)
STI #HIGH(ECRRBuffer),ECRRPtr+1
fi
ENDIF
BBS 0,ECRRPtr+0,UpdateBHead
JMP UpdateBBufferPtr
; [listing continues in an image in the source (imgf000168_0001)]
DecBankSelect
STA (ECWord3) ; store LOW(Hash) in
EncBankSelect ; RRHashTest table
LDA ECHashRaw1,Y
AND #HIGH(BufferHashes-1)
ASL ECWord4+0
ROL A
ADD #HIGH(ECRRHashHead)
STA ECWord4+1
UpdateBLink:
LDA (ECWord4) ; transfer RRHashHead to
STA (ECWord3) ; RRHashLink table
LDA (ECWord4),X
STA (ECWord3),X
LDA ECWord3+0 ; reset RRHashHead to new
STA (ECWord4) ; RRHashLink ptr
LDA ECWord3+1
STA (ECWord4),X
UpdateBBufferPtr:
INC ECRRPtr+0
ifEQ
INC ECRRPtr+1
LDA ECRRPtr+1
CMP #(HIGH(ECRRBuffer)+HIGH(BufferSize))
ifEQ
STI #HIGH(ECRRBuffer),ECRRPtr+1
fi
fi
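The pointer update performed by UpdateBBufferPtr can be restated at a higher level. The sketch below is illustrative only: it is written in Python because the listing's macro assembler is not generally available, the function name `advance_ring_pointer` and the example addresses are hypothetical, and it assumes the buffer base and size are multiples of 256 so that only the high byte has to be compared, as the listing does.

```python
def advance_ring_pointer(ptr: int, buffer_base: int, buffer_size: int) -> int:
    """Advance a 16-bit ring-buffer pointer by one, wrapping at the buffer end.

    Mirrors the shape of UpdateBBufferPtr: increment the low byte, and only
    when it rolls over to zero touch the high byte and test it against the
    page just past the end of the buffer.
    """
    low = (ptr + 1) & 0xFF            # INC ECRRPtr+0
    high = (ptr >> 8) & 0xFF
    if low == 0:                      # ifEQ
        high = (high + 1) & 0xFF      # INC ECRRPtr+1
        end_page = ((buffer_base >> 8) + (buffer_size >> 8)) & 0xFF
        if high == end_page:          # CMP #(HIGH(base)+HIGH(size)) / ifEQ
            high = (buffer_base >> 8) & 0xFF   # STI #HIGH(base),ECRRPtr+1
    return (high << 8) | low


# Hypothetical layout: a 4 KB buffer starting at 4000h.
next_ptr = advance_ring_pointer(0x4FFF, 0x4000, 0x1000)
```

Because the wrap test runs only on a low-byte rollover, the common case costs a single increment.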
OutputBControl:
INY
STY ECNextOut
CPY ECNextOutSave
ifNE ; Y = ECNextOut
IF AntiEx
JMP TotalBBits
ELSE
JMP WriteBEncodings
ENDIF
fi
IF AntiEx XOR 1 ;{
LDA ECFlush
BNE OutputBFlush
LDA #SetLength
CMP ECAvailable
ifCS
RTS
fi
JMP StringBSearch
OutputBFlush:
LDA ECAvailable
ifEQ
RTS
fi
JMP StringBSearch
ELSE ;{}
LDA ECAntiEStatus ; saved at start of StringTime
BMI OutputBSTOff
OutputBSTOn:
LDY ECExcessBits
IF Macros
ifMI
JMP OutputBCurrent ; if minus, write current
fi
ELSE
BMI OutputBCurrent ; if minus, write current
ENDIF
CPY #014h
IF Macros
ifCC
JMP OutputBDefer ; if a bit plus, defer writing
fi
ELSE
BCC OutputBDefer ; if a bit plus, defer writing
ENDIF
ORA #080h
STA ECAntiEStatus ; if too plus, turn off
LDX ECNextOutStart
LDY ECNewIndex,X ; NC encoding index
ifNE ;{
LDA FontCode,Y
Write17
LDA #0FEh ; 0FEh = 11111110
LDX #0C0h ; + 1
Write817
els ;{}
LDA #0BFh ; 0FEh = (10)111111
LDX #060h ; + 01
Write817
fi ;}
IF Test ;{
INC AntiExOn+0
ifEQ ;{
INC AntiExOn+1
fi ;}
ENDIF ;}
IJMP OutputBCurrent
OutputBSTOff:
LDY ECExcessBits
BPL OutputBDefer ; if losing, no change
CPY #(0-013h) ; should be 9 ?????
BCS OutputBDefer ; if a bit minus, no change
AND #07Fh
STA ECAntiEStatus ; if too minus, turn on
LDA #0FEh ; strings and write current
LDX #0C0h
Write817 ; 0FEh, 1 to turn on
IF Test ;{
INC AntiExOff+0
ifEQ ;{
INC AntiExOff+1
fi ;}
ENDIF ;}
OutputBCurrent:
JSR OutputBWrite
LDA ECFlush
ifEQ ;{
RTS
els ;{}
LDA ECAvailable
ifEQ ;{
RTS
fi ;}
fi ;}
JMP StringBSearch
OutputBDefer:
LDA ECAvailable
BEQ OutputBWrite
LDY ECFlush
ifEQ ;{
CMP #SetLength
BCC OutputBWrite
fi ;}
JMP StringBSearch
OutputBWrite:
LDA ECNextOut
TAX
SEC
SBC ECNextOutStart
STA ECByte1
LDY ECNextOutStart
STX ECNextOutStart
STI #000h,ECExcessBits
ENDIF ;}
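The OutputBSTOn and OutputBSTOff branches above appear to implement a hysteresis switch on ECExcessBits. The sketch below restates only that decision, as far as it can be read from the comparisons and their comments; it is written in Python because the listing's macro assembler is not generally available. The function name, the boolean `strings_on`, and the treatment of ECExcessBits as a signed integer are assumptions made for illustration, and the sketch does not model what is written to the output stream.

```python
TURN_OFF_AT = 0x14       # CPY #014h in OutputBSTOn
TURN_ON_BELOW = -0x13    # CPY #(0-013h) in OutputBSTOff


def next_strings_on(strings_on: bool, excess_bits: int) -> bool:
    """Return the string-encoding mode after one anti-expansion check.

    excess_bits is assumed to be the signed running total kept in
    ECExcessBits: positive when the encodings cost more than 8 bits per
    character, negative when they cost less.
    """
    if strings_on:
        # Negative: write current. 0..13h: defer. 14h and above: turn off.
        return excess_bits < TURN_OFF_AT
    # Non-negative ("losing") or only slightly negative: no change.
    # More negative than -13h: turn strings back on.
    return excess_bits < TURN_ON_BELOW


mode = next_strings_on(True, 0x20)   # well above the threshold, so strings turn off
```

The two thresholds are separated by a dead band, so a total hovering near zero does not toggle the mode on every check.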
WriteBEncodings: ; Y = ECNextOut
IF AntiEx
LDA ECAntiEStatus
BPL WriteBStringsOn
WriteB8Bit:
LDA ECFrequency,Y ; character frequency
Write8
LDA ECFrequency,Y
CMP #0FEh
ifCC
JMP WriteBOTestRepeats
fi
LDA #040h
Write17
JMP WriteBOTestRepeats
WriteBStringsOn: ; Y = ECNextOut
LDA ECType,Y
ORA ECRepeatSW,Y ; bit 3 on if repeats
; [jump table survives only as an image in the source (imgf000173_0001); truncated comment fragments follow]
; String(n) - (no
; 0 - HashX2(2) - or
; 2 - Font char - no
; 4 - New char - no
; 6 - HashX2(1) - no
; 8 - String(1) - no
; or repeats
; 10 - Font char -
; 12 - New char -
; 14 - HashX2(1) -
; char encoding index - font
LDA FontCode,X
Write17
IF AntiEx
JMP WriteBODone
ELSE
IF Repeats
JMP WriteBOTestRepeats
ELSE
JMP WriteBODone
ENDIF
ENDIF
IF AntiEx
IF Repeats
WriteBlFont:
LDX ECFontIndex,Y ; char encoding index - font
LDA FontCode,X
Write17
LDA ECRepeats,Y
JMP WriteBORepeats
ENDIF
ENDIF
WriteBONewChar:
LDX ECNewIndex,Y ; NC encoding index
BEQ WriteBONCMainNF
LDA FontCode,X
Write17
WriteBONCMainOF:
LDA ECFrequency,Y ; char encoding index - 8-bit
Write8
IJMP WriteBOCommand
WriteBONCMainNF:
LDA ECFrequency,Y ; char encoding index - 8-bit
ifPL
Write8
els
STI #080h,ECByte4
ASR A
ROR ECByte4
AND #0BFh
LDX ECByte4
Write817
fi
WriteBOCommand:
LDA ECFrequency,Y ; character frequency CMP #0FEh ifCC WriteBOCommandX:
IF AntiEx
JMP WriteBODone ELSE IF Repeats JMP WriteBOTestRepeats
ELSE
JMP WriteBODone ENDIF ENDIF fi
LDA #040h Writel7
BRA WriteBOCommandX IF AntiEx IF Repeats
WriteBlNewChar:
LDX ECNewIndex,Y ; NC encoding index
BEQ WriteBlNCMainNF
LDA FontCode,X
Write17
WriteBlNCMainOF:
LDA ECFrequency,Y ; char encoding index - 8-bit
Write8
IJMP WriteBlCommand
WriteBlNCMainNF:
LDA ECFrequency,Y ; char encoding index - 8-bit
ifPL
Write8
els
STI #080h,ECByte4
ASR A
ROR ECByte4
AND #0BFh
LDX ECByte4
Write817
fi
WriteBlCommand:
LDA ECFrequency,Y ; character frequency
CMP #0FEh
ifCC
WriteBlCommandX:
LDA ECRepeats,Y
JMP WriteBORepeats
fi
LDA #040h
Write17
BRA WriteBlCommandX
ENDIF
ENDIF
WriteBHashX2:
LDX ECNewIndex,Y ; NC encoding index
BEQ WriteBX2NF
WriteBX20F:
INX
LDA FontCode,X
Write17
IJMP WriteBX2Main
WriteBX2NF:
LDA #0D0h ; 110
Write17
WriteBX2Main:
INY
LDA ECHashX21,Y
Write17
LDA ECHashX20,Y
Write8
IF AntiEx
LDA #000h
STA ECType,Y
IF Repeats
STA ECRepeatSW,Y
ENDIF
ENDIF
DEY
IF Repeats
JMP WriteBOTestRepeats
ELSE
JMP WriteBODone
ENDIF
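The immediate operands handed to the 1-to-7-bit write macro in the routines above are paired with comments giving the intended bit string, for example `#0D0h` with `110`. That pairing is consistent with a byte holding the code left-justified and terminated by a single 1 bit. The write macro itself is not shown in this part of the listing, so the sketch below is an inference from those operand and comment pairs, not the author's implementation; it is written in Python because the listing's macro assembler is not generally available, and the function name is hypothetical.

```python
def guarded_code_bits(value: int) -> str:
    """Return the code bits of a left-justified, guard-terminated byte.

    The lowest set bit is taken to be the terminating guard; everything
    above it is the code itself, most significant bit first.
    """
    if not 0 < value <= 0xFF:
        raise ValueError("expected a non-zero byte")
    bits = format(value, "08b")
    return bits[: bits.rindex("1")]   # drop the guard bit and the padding


code = guarded_code_bits(0xD0)   # operand commented "110" in WriteBX2NF
```

Under this reading a byte can carry between one and seven code bits, which is why a separate path is needed for 8-bit and longer fields.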
WriteBSXtraLength:
LDA LengthBCode+9
Write17
LDA ECHashX20,Y ; length index
ADD #(0-009h)
TAX
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X
BEQ WriteBSXLHigh
TAX
TWA
Write817
JMP WriteBSMain
WriteBSXLHigh:
TWA
Write17
JMP WriteBSMain
WriteBString:
LDX ECNewIndex,Y ; NC encoding index
BEQ WriteBSNF
WriteBSOF:
INX
INX
LDA FontCode,X
Write17
IJMP WriteBSLength
WriteBSNF:
LDA #0F0h ; 111
Write17
WriteBSLength:
IF AntiEx
LDA ECHashX20,Y ; length index
TAX
ADD #(MinimumBString+1)
STA ECByte4
NEG A
ADD ECByte1
STA ECByte1
ELSE
LDX ECHashX20,Y ; length index
ENDIF
CPX #009h
ifCS
JMP WriteBSXtraLength
fi
LDA LengthBCode,X
Write17
WriteBSMain:
INY
LDX ECHashX21,Y
LDA ZoneCode,X
Write17
LDA ECHashX20,Y
Write8
DEY
IF AntiEx
IF Repeats
LDA ECRepeats,Y
ifEQ
JMP WriteBlDone
els
JMP WriteBlRepeats
fi
ELSE
JMP WriteBlDone
ENDIF
ELSE
JMP WriteBOTestRepeats
ENDIF
IF Repeats
WriteBOAreRepeats:
LDA #0C0h
Write17
LDA ECRepeats,Y
ADD #0FEh
TAX
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X
BEQ WriteBORHigh
TAX
TWA
Write817
JMP WriteBORClear
WriteBORHigh:
TWA
Write17
JMP WriteBORClear
WriteBOTestRepeats:
LDA ECRepeats,Y
BEQ WriteBODone
WriteBORepeats:
CMP #001h
BNE WriteBOAreRepeats
WriteBONoRepeats:
LDA #040h
Write17
WriteBORClear:
LDA #000h
STA ECRepeats,Y
STA ECRepeatSW,Y
ENDIF
WriteBODone:
IF Failsafe
DEC ECFailSafe+0
ifEQ
DEC ECFailSafe+1
ifEQ
STI #FailSafeSets,ECFailSafe+1
LDA #008h
Write17
fi
fi
ENDIF
IF AntiEx
DEC ECByte1
ifEQ
RTS
fi
INY
JMP WriteBEncodings
ELSE
JMP UpdateBBuffer
ENDIF
IF AntiEX IF Repeats
WriteBlAreRepeats:
LDA #0C0h
Write17
LDA ECRepeats,Y
ADD #0FEh
TAX
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X
BEQ WriteBlRHigh
TAX
TWA
Write817
JMP WriteBlRClear
WriteBlRHigh:
TWA
Write17
JMP WriteBlRClear
WriteBlRepeats:
CMP #001h
BNE WriteBlAreRepeats
WriteBlNoRepeats:
LDA #040h
Write17
WriteBlRClear:
LDA #000h
STA ECRepeats,Y
STA ECRepeatSW,Y
ENDIF
WriteBlDone:
IF Failsafe
DEC ECFailSafe+0
ifEQ
DEC ECFailSafe+1
ifEQ
STI #FailSafeSets,ECFailSafe+1
LDA #008h
Write17
fi
fi
ENDIF
DEC ECByte4
ifNE
INY
IF Repeats
LDA ECRepeats,Y
BEQ WriteBlDone
JMP WriteBlRepeats
ELSE
JMP WriteBlDone
ENDIF
fi
LDA ECByte1
ifEQ
RTS
fi
INY
JMP WriteBEncodings
ENDIF
IF AntiEX
UpdateBlBuffer: ; Y = ECNextOut
LDA ECChar,Y
STA (ECRRPtr)
IF BufferSuffix
LDX ECRRPtr+1
CPX #HIGH(ECRRBuffer)
ifEQ
STI #(HIGH(ECRRBuffer)+HIGH(BufferSize)),ECRRPtr+1
STA (ECRRPtr)
STI #HIGH(ECRRBuffer),ECRRPtr+1
fi
ENDIF
BBS 0,ECRRPtr+0,UpdateBlHead
JMP UpdateBlBufferPtr
UpdateBlHead:
LDX #001h
LDA ECRRPtr+0 ; Word3 = ptr to RRHashLink
ADD #(0-001h) ; at location RRPtr-1
STA ECWord3+0
LDA ECRRPtr+1
ADD #(HIGH(ECRRHashLink)-HIGH(ECRRBuffer))
STA ECWord3+1
LDA ECHashRaw0,Y ; Word4 = ptr to
STA ECWord4+0 ; RRHashHead
DecBankSelect
STA (ECWord3) ; store LOW(Hash) in
EncBankSelect ; RRHashTest table
LDA ECHashRaw1,Y
AND #HIGH(BufferHashes-1)
ASL ECWord4+0
ROL A
ADD #HIGH(ECRRHashHead)
STA ECWord4+1
UpdateBlLink:
LDA (ECWord4) ; transfer RRHashHead to
STA (ECWord3) ; RRHashLink table
LDA (ECWord4),X
STA (ECWord3),X
LDA ECWord3+0 ; reset RRHashHead to new
STA (ECWord4) ; RRHashLink ptr
LDA ECWord3+1
STA (ECWord4),X
UpdateBlBufferPtr:
INC ECRRPtr+0
ifEQ
INC ECRRPtr+1
LDA ECRRPtr+1
CMP #(HIGH(ECRRBuffer)+HIGH(BufferSize))
ifEQ
STI #HIGH(ECRRBuffer),ECRRPtr+1
fi
fi
;
OutputBlControl:
DEC ECStringLength
ifNE
INY
LDA ECExcessBits
ADD #(0-008h)
ifPL ;{
CMP #040h
ifCS ;{
LDA #040h
fi ;}
els ;{}
CMP #(0-040h)
ifCC ;{
LDA #(0-040h)
fi ;}
fi ;}
STA ECExcessBits
JMP UpdateBlBuffer
fi
JMP OutputBControl
ENDIF
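The ECExcessBits update in OutputBlControl above subtracts 8 for each further character covered by a string and then saturates the result. A minimal restatement follows, written in Python because the listing's macro assembler is not generally available. It assumes ECExcessBits is read as a signed value, and the function name is hypothetical.

```python
EXCESS_LIMIT = 0x40   # CMP #040h and CMP #(0-040h) in OutputBlControl


def credit_string_character(excess_bits: int) -> int:
    """Credit one string-covered character (8 bits saved), saturating at +/-40h."""
    return max(-EXCESS_LIMIT, min(EXCESS_LIMIT, excess_bits - 8))


total = credit_string_character(0)   # one covered character from a zero total
```

The saturation keeps the running total inside a single signed byte no matter how long a run of strings lasts.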
************* D E C O D E R M A C R O S **************
DCGlobalShort MACRO
IF Macros
MSDCGlobalShort
ELSE
JSR MSDCGlobalShort
ENDIF
ENDM
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
; Enter: A = guard bit at proper shift point (e.g. 80h for 1 bit)
; Exit: A = fetched bits (right justified)
;       Z flag properly set (reset) for [A]
DecodeNBits MACRO
LOCAL DecodeNBl
DecodeNBl:
ASL DCBuffer
ifEQ
JSR DCReadCharacter
SEC
ROL DCBuffer
fi
ROL A
BCC DecodeNBl
ENDM
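The guard-bit convention of DecodeNBits can be modelled directly. The sketch below is an illustrative reading of the macro, not a transcription. It is written in Python because the listing's macro assembler is not generally available, and the class and method names are hypothetical. It assumes DCReadCharacter delivers the next input byte into DCBuffer, that DCBuffer keeps a sentinel 1 bit below the unread data bits, and that the buffer starts out empty, holding the sentinel alone.

```python
class GuardBitReader:
    """Bit reader modelled on the DecodeNBits macro (illustrative names)."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        self._pos = 0
        self._buffer = 0x80   # assumption: starts holding only the sentinel bit

    def _next_bit(self) -> int:
        carry = (self._buffer >> 7) & 1              # ASL DCBuffer
        self._buffer = (self._buffer << 1) & 0xFF
        if self._buffer == 0:                        # ifEQ: that was the sentinel
            byte = self._data[self._pos]             # stands in for DCReadCharacter
            self._pos += 1
            carry = (byte >> 7) & 1                  # SEC / ROL DCBuffer
            self._buffer = ((byte << 1) | 1) & 0xFF  # new sentinel enters at bit 0
        return carry

    def decode_n_bits(self, guard: int) -> int:
        """Shift bits into A until the guard bit is shifted out of bit 7."""
        a = guard & 0xFF
        while True:
            bit = self._next_bit()
            guard_out = (a >> 7) & 1                 # ROL A
            a = ((a << 1) | bit) & 0xFF
            if guard_out:                            # BCC loops while carry is clear
                return a


reader = GuardBitReader(bytes([0b10110010]))
first = reader.decode_n_bits(0x80)   # guard at bit 7: fetch 1 bit
```

Placing the guard lower in A fetches more bits: 40h fetches 2, 20h fetches 3, and 01h fetches a full 8, matching the "get n bits" comments used throughout the decoder.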
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
DCFreqToChar MACRO
LOCAL DCFToCExit
STA DCWord1+0
CMP #0FEh
ifCS
LDA #080h ; get 1 bit
DecodeNBits
ifNE ; set DCCommand = 1
STA DCCommand ; NOTE: this is the valid EOF
IJMP DCFToCExit
fi
fi
SetCharFreq DC,W1 ; sets HIGH(NCFreq) in DCWord1+1
LDA (DCWord1)
DCFToCExit:
ENDM
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
; Enter: A = # of font table indices (2-16) - 2
; Exit: A = font index (0-15)
DCReadFontFreq MACRO
LOCAL DCRNextBit
TAY
LDX DCFontTblIndex,Y
DCRNextBit:
ASL DCBuffer
ifEQ
JSR DCReadCharacter
SEC
ROL DCBuffer
fi
ifCS
INX
fi
LDA DCFontNext,X
ifNE
TXA
CLC
ADC DCFontNext,X
TAX
IJMP DCRNextBit
fi
LDA DCFontValue,X ; A = font index
ENDM
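DCReadFontFreq walks a pair of parallel tables one input bit at a time: a 1 bit selects the odd entry of the current pair, a non-zero "next" entry is a forward offset to the child pair, and a zero "next" entry marks a leaf whose value is returned. The sketch below reproduces that walk. It is written in Python because the listing's macro assembler is not generally available. The contents of DCFontNext and DCFontValue are not shown in this part of the listing, so the small tables here are hypothetical and encode the codes 0, 10 and 11; the function name is likewise illustrative.

```python
from typing import Iterator, Sequence


def decode_with_next_table(
    bits: Iterator[int],
    next_table: Sequence[int],
    value_table: Sequence[int],
    start: int = 0,
) -> int:
    """Decode one prefix-coded symbol by walking next/value tables."""
    x = start
    for bit in bits:
        if bit:
            x += 1                    # ifCS / INX: take the "1" entry of the pair
        step = next_table[x]
        if step == 0:
            return value_table[x]     # leaf: LDA DCFontValue,X
        x += step                     # internal node: X = X + DCFontNext,X
    raise ValueError("bit stream ended inside a code")


# Hypothetical tables: code 0 -> value 0, code 10 -> value 1, code 11 -> value 2.
NEXT = [0, 1, 0, 0]
VALUE = [0, 0, 1, 2]

stream = iter([1, 0, 0, 1, 1])
symbols = [decode_with_next_table(stream, NEXT, VALUE) for _ in range(3)]
```

The same walk appears again in DCLengthLong and DCZoneLong with their own tables, which suggests one table format serves all three code families.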
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
DCGlobalLong MACRO
LDA #040h ; get 2 bits
DecodeNBits ; DCGlobalShort expects that
DCGlobalShort ; 1st 2 bits of the Global
ENDM ; code are in A and that
; the Z flag is based on [A]
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
IF Macros
MSDCGlobalShort MACRO
LOCAL DGS00Zones,DGS0001,DGS00001,DGS000001,DGS000000,DGSExit
ELSE
MSDCGlobalShort:
ENDIF
BEQ DGS00Zones ; Z flag set for [A]
CMP #002h
ifCC ; zone = 01
LDA #020h ; get 3 bits
STI #008h,DCByte1 ; DCByte1 = base (DL)
els
ifNE ; zone = 11
STI #000h,DCByte1
els ; zone = 10
STI #004h,DCByte1
fi
; [listing continues in an image in the source (imgf000188_0001); comment fragments: "get 2 bits", "get 4 bits", "base 16", "append 1 bit", "base 24", "append 4 bits", "base 32"]
IJMP DGSExit
DGS000001:
LDA #004h ; get 6 bits
DecodeNBits
ADD #040h ; base 64
IJMP DGSExit
DGS000000:
LDA #002h ; get 7 bits
DecodeNBits
ADD #080h ; base 128
DGSExit:
IF Macros
ENDM
ELSE
RTS
ENDIF
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
DCLengthLong MACRO
LOCAL DCLNextBit,DCLExit
LDX #000h
DCLNextBit:
ASL DCBuffer
ifEQ
JSR DCReadCharacter
SEC
ROL DCBuffer
fi
ifCS
INX
fi
LDA LengthBNext,X
ifNE
TXA
CLC
ADC LengthBNext,X
TAX
IJMP DCLNextBit
fi
LDA LengthBValue,X
CMP #009h
ifNE
JMP DCLExit
fi
STA DCByte2
DCGlobalLong
ADD DCByte2
DCLExit:
ENDM
;* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
DCZoneLong MACRO
LOCAL DCZNextBit
LDX #000h
DCZNextBit:
ASL DCBuffer
ifEQ
JSR DCReadCharacter
SEC
ROL DCBuffer
fi
ifCS
INX
fi
LDA ZoneNext,X
ifNE
TXA
CLC
ADC ZoneNext,X
TAX
IJMP DCZNextBit
fi
LDA ZoneValue,X
STA DCWord1+1
ENDM
************* D E C O D E R R E F I L L **************
DCFontParams:
LDY DCABStatus
BPL DCFontsActive
LDA #001h ; get 8 bits
DecodeNBits
JMP DCNewAChar
DCFontsActive:
LDA DCCurrentHash+1
BPL DCOldFont
TYA ; Y = DCABStatus
ifEQ
JMP DCNewAFont
fi
JMP DCNewBFont
DCOldFont:
LDA DCCharacters
ADD DCABStatus
; ADD DCSTIndex ; always 1 if strings on
ADD #0FFh ; A is # of font indices (0-16)
DCReadFontFreq ; [A] is returned as index
LDY DCABStatus
ifNE
CMP DCNCIndex
ifEQ
LDA #001h ; get 8 bits
DecodeNBits
JMP DCNewAChar
fi
BCC DCReadOldChar
ADD #0FFh
CMP DCNCIndex
ifEQ
JMP DC2ByteHash
fi
ADD #0FFh
CMP DCNCIndex
ifEQ
JMP DCReadBString
fi
ADD #0FFh
els
CMP DCNCIndex
BCC DCReadOldChar
ifEQ
JMP DCNewACharLong
fi
ADD #0FFh
CMP DCNCIndex
ifEQ
JMP DCReadAString
fi
ADD #0FFh
fi
DCReadOldChar:
STA DCFontIndex ; used in FontUpdate
ADD #(TwoBytes+1)
ADD DCFontBase+0
STA DCWord1+0
LDA DCFontBase+1
STA DCWord1+1
LDA (DCWord1)
STA (DCRRPtr)
STI #001h,DCCharCount
JMP DCResetFont
DCNewAFont:
LDA #040h ; get 2 bits
DecodeNBits
CMP #001h
ifCC ; 00
LDA #080h ; get 1 bit
DecodeNBits
IJMP DCNewACharShort
fi
BNE DCNewACharShort ; 10,11
IF AHashX2 XOR 1 ; 01
JMP DCReadAString
ELSE
LDA #080h ; get 1 bit
DecodeNBits
ifEQ
JMP DC2ByteHash
fi
DCGlobalLong
ADD #MinimumAString
JMP DCDirectString
ENDIF
DCNewACharShort:
AND #0FFh ; reset Z flag for [A]
DCGlobalShort
JMP DCNewAChar
DCNewACharLong:
DCGlobalLong
DCNewAChar:
DCFreqToChar ; [A] is input
LDY DCCommand
ifEQ
STA (DCRRPtr) ; character is output
STI #001h,DCCharCount
fi
JMP DCResetFont
DCNewBFont:
LDA #040h ; get 2 bits
DecodeNBits
CMP #002h
ifCC ; 00,01
ORA #004h ; append 6 bits
DecodeNBits
JMP DCNewAChar
els
ifEQ ; 10
LDA #003h ; append 7 bits to 1
DecodeNBits
JMP DCNewAChar
fi
fi ; 11
LDA #080h ; get 1 bit
DecodeNBits
ifEQ
JMP DC2ByteHash
fi
DCReadBString:
DCLengthLong
ADD #(MinimumBString+1)
JMP DCDirectString
DCReadAString:
IF AHashX2
LDA #040h ; get 2 bits
DecodeNBits
CMP #003h
ifEQ
JMP DC2ByteHash
fi
AND #0FFh ; reset Z flag for [A]
DCGlobalShort
ADD #(MinimumAString-4)
ELSE
DCGlobalLong
ADD #MinimumAString
ENDIF
DCDirectString:
STA DCCharCount
DCZoneLong ; DCWord1 = offset from RRPtr
LDA #001h ; get 8 bits
DecodeNBits
STA DCByte1
LDA DCRRPtr+0
STA DCWord2+0
SEC
SBC DCByte1
STA DCWord3+0 ; DCWord3 = source string offset
TAX ; X = save DCWord3+0 for LAN
LDA DCRRPtr+1
STA DCWord2+1 ; DCWord2 = object string offset
SBC DCWord1+1
CMP #HIGH(DCRRBuffer)
ifCC
ADD #HIGH(BufferSize) ; A = save DCWord3+1 for LAN
fi
LDY DCWord1+1 ; DCWord1+1: 0 - Right to Left
BEQ DCDirectBackward ; <> 0 - Left to Right
DCDirectForward:
LDY DCCharCount
DCDirectForwardl:
CMP #(HIGH(DCRRBuffer)+HIGH(BufferSize)-1)
BEQ DCDirectForward2
DCDirectForwardOK:
PHA
PHX
PLI
DCDirectFLoopl: LAN
STA (DCWord2) INC DCWord2+0 ifEQ
LDA DCWord2+l ADD #001h
CMP #(HIGH(DCRRBuffer)+HIGH(BufferSize) ) ifEQ ADD #(0-HIGH(BufferSize)) fi
STA DCWord2+l fi DEY BNE DCDirectFLoopl
JMP DCResetFont DCDirectForward2: TAW
TYA ; A = Y = DCCharCount ADD #(0-001h)
ADD DCWord3+0 TWA
BCC DCDirectForwardOK STA DCWord3+l DCDirectFLoop2:
LDA (DCWord3)
STA (DCWord2)
INC DCWord3+0
ifEQ
LDA DCWord3+1
ADD #001h
CMP #(HIGH(DCRRBuffer)+HIGH(BufferSize))
ifEQ
ADD #(0-HIGH(BufferSize))
fi
STA DCWord3+1
fi
INC DCWord2+0
ifEQ
LDA DCWord2+1
ADD #001h
CMP #(HIGH(DCRRBuffer)+HIGH(BufferSize))
ifEQ
ADD #(0-HIGH(BufferSize))
fi
STA DCWord2+1
fi
DEY
BNE DCDirectFLoop2
JMP DCResetFont
DCDirectBackward:
LDY DCCharCount CPY DCBytel BCC DCDirectForwardl BEQ DCDirectForwardl STA DCWord3+l
DEY TYA
ADD DCWord3+0 STA DCWord3+0 ifCS
LDA DCWord3+l ADD #001h
CMP #(HIGH(DCRRBuffer)+HIGH(BufferSize))
ifEQ
ADD #(0-HIGH(BufferSize))
fi
STA DCWord3+1
fi
TYA
ADD DCWord2+0 STA DCWord2+0 ifCS LDA DCWord2+l ADD #001h
CMP #(HIGH(DCRRBuffer)+HIGH(BufferSize)) ifEQ
ADD #(0-HIGH(BufferSize)) fi
STA DCWord2+l fi INY
DCDirectBLoop:
LDA (DCWord3) STA (DCWord2) LDA DCWord3+0 ADD #0FFh
STA DCWord3+0 ifCC LDA DCWord3+l CMP #HIGH(DCRRBuffer) ifEQ
ADD #HIGH(BufferSize) fi
ADD #0FFh STA DCWord3+l fi
LDA DCWord2+0 ADD #0FFh STA DCWord2+0 ifCC LDA DCWord2+l
CMP #HIGH(DCRRBuffer)
; [listing continues in an image in the source (imgf000199_0001)]
; [listing continues in an image in the source (imgf000200_0001); truncated comment fragments follow]
; get 1 bit
; 0 = prod, 1 = command
; Prod resets DCBuffer,
; DCCommand and returns
; DCFontParams
; get next 2 bits and
; DCCommand (right
; rocessCommand does as named
; and then JMP's back to
; DCFontParams
; rod/command encountered
LDA DCABStatus ifPL
ORA #080h els AND #07Fh fi
STA DCABStatus STI #00Oh,DCCommand JMP DCFontParams ELSE
JMP DCProdCommand ENDIF fi
LDA (DCRRPtr) STA DCCurrentChar
JSR DCWriteCharacter IF Repeats CMP DCChar2Prior ifNE JMP DCUpdateFont fi
CMP DCCharlPrior ifNE JMP DCUpdateFont fi
LDA #08Oh ; get 1 bit
DecodeNBits ifEQ JMP DCUpdateFont fi
DCGlobalLong TAY INY
LDA DCCurrentChar DCWriteRepeatsLoop:
JSR DCWriteCharacter APPENDIX 1
DEY
BNE DCWriteRepeatsLoop ENDIF "~
DCUpdateFont: FontUpdate DC,FU
IF Failsafe DEC DCFailSafe+0 ifEQ DEC DCFailSafe+1 ifEQ
STI #FailSafeSets,DCFailSafe+1 LDA #010h ; get 4 bits
DecodeNBits ifNE IF EOFControl
FailSafeTrap:
STI #OFFh,ECCommand BBR 6,HostLCR,FailSafeNLB JSR ECReadCharacter STI #OOOh,ECCommand
JMP DCOrECEOF FailSafeNLB:
Figure imgf000202_0001
        LDA DCRRPtr+1
        ADD #001h
        CMP #(HIGH(DCRRBuffer)+HIGH(BufferSize))
        ifEQ
        ADD #(0-HIGH(BufferSize))
        fi
        STA DCRRPtr+1
        fi
        DEC DCCharCount
        ifEQ
        JMP DCFontParams
        fi
        JMP DCResetFont
printstat Code,size,is,%$-cb
************* I N C L U D E T A B L E S **************
tb equ $ ; include TCtab011
************** E N C O D E R T A B L E S **************
EncodingTable:
;
plantl  macro q,r
q&r:
        endm
        IF FontSize EQ 8
sepno   defl 0
        irp y,<e0,e1,e2,e3,e4,e5,e6,e7,e8>
        db y-FontCode
        endm
        ENDIF
        IF FontSize EQ 16
sepno   defl 0
        irp y,<e0,e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,e11,e12,e13,e14>
        db y-FontCode
        endm
        ENDIF
;
etbase  macro
        plantl e,%sepno
sepno   defl sepno+1
        endm
;
FontCode:
        etbase ; 2
        db 11000000b,01000000b
        etbase ; 3
        db 11000000b,00100000b,01100000b
        etbase ; 4
        db 11000000b,00100000b,01010000b,01110000b
        etbase ; 5
        db 11000000b,00100000b,01010000b,01101000b,01111000b
        etbase ; 6
        db 10100000b,11100000b,00010000b,00110000b,01010000b
        db 01110000b
        etbase ; 7
        db 10100000b,11100000b,00010000b,00110000b,01010000b
        db 01101000b,01111000b
        etbase ; 8
        db 10100000b,11100000b,00010000b,00110000b,01001000b
        db 01011000b,01101000b,01111000b
        etbase ; 9
        db 10100000b,11100000b,00010000b,00110000b,01001000b
        db 01011000b,01101000b,01110100b,01111100b
        etbase ; 10
        db 10100000b,11100000b,00010000b,00110000b,01001000b
        db 01011000b,01100100b,01101100b,01110100b,01111100b
        IF FontSize EQ 16
        etbase ; 11
        db 10100000b,11100000b,00010000b,00110000b,01001000b
        db 01011000b,01100100b,01101100b,01110100b,01111010b
        db 01111110b
        etbase ; 12
        db 10100000b,11100000b,00010000b,00110000b,01001000b
        db 01011000b,01100100b,01101100b,01110010b,01110110b
        db 01111010b,01111110b
        etbase ; 13 ERP
        db 10100000b,11100000b,00010000b,00101000b,00111000b
        db 01001000b,01011000b,01100100b,01101100b,01110010b
        db 01110110b,01111010b,01111110b
;       etbase ; 13 FLB
        db 10100000b,11100000b,00010000b,00101000b,00111000b
        db 01001000b,01010100b,01011100b,01100100b,01101100b
        db 01110100b,01111010b,01111110b
        etbase ; 14
        db 10100000b,11100000b,00010000b,00101000b,00111000b
        db 01001000b,01010100b,01011100b,01100100b,01101100b
        db 01110010b,01110110b,01111010b,01111110b
        etbase ; 15
        db 10100000b,11100000b,00010000b,00101000b,00111000b
        db 01000100b,01001100b,01010100b,01011100b,01100100b
        db 01101100b,01110010b,01110110b,01111010b,01111110b
        etbase ; 16
        db 10100000b,11100000b,00010000b,00101000b,00111000b
        db 01000100b,01001100b,01010100b,01011100b,01100100b
        db 01101010b,01101110b,01110010b,01110110b,01111010b
        db 01111110b
        ENDIF
;
fontesz equ $-FontCode
************** D E C O D E R T A B L E S ************** ;
DCFontTblIndex:
        DB 000,002,006,012
        DB 020,030,042,056
        DB 072
        IF FontSize EQ 16
        DB 090,110,132
        DB 156,182,210
        ENDIF
DCFontNext:
DB 0, 0 ; 2 APPENDIX 1
DB 2 0, 0, 0 ; 3
DB 2 0, 0, 1, 0, 0 ; 4
DB 2 0, 0, 1, 0, 1 ; 5
DB 4 1, o, 0, 2, 3 ; 6
DB 0 0
DB 4 1, 0, 0, 2, 3 ; 7
DB 0 1, 0, 0
DB 4 1, o, 0, 2, 3 ; 8
DB 2 3, 0, 0, 0, 0
DB 4 1, 0, 0, 2, 3 0 ; 9
DB 2 3, 0, 0, 0, 1 0
DB 4 1, o, 0, 2, 3 0 ; ιo
DB 2 3, 0, 0, 2, 3 0
DB 0 0
IF Fontsize EQ 16
Figure imgf000207_0001
DB 2 0 0 2 0 DB 0 0 DB 4 0 2 0 ; 12 DB 2 0 2 0 DB 2 0 0 DB 4 0 2 1 ; 13 ERP DB 0 3 0 3 DB 0 3 0 0 DB 4 0 2 0 ; 13 FLB DB 4 0 0 5 DB 0 0 0 0 DB 4 0 2 3 14 DB 4 0 0 5 DB 0 0 2 0 DB 0 DB 4 0 2 3 ; 15 DB 4 0 4 7 DB 0 0 0 3 DB 0 0 DB 4 o. 2, 3, 0, 3 ; 16 DB DB DB
Figure imgf000208_0001
ENDIF ;
DCFontValue:
DB 0 ; 2
DB 0, 1, 2 ; 3
DB 0, 1, 0, 2, 3 ; 4 DB 0, 1, 0, 2, 0, 3, 4 ; 5
DB 0, 0, 1, 0, 0, 2, 3 ; 6
DB 5
DB 0, 0, 1, 0, 0, 2, 3 ; 7
DB 0, 5, 6 DB 0, 0, 1, 0, 0, 2, 3 ; δ
DB 0, 4, 5, 6, 7
DB 0, 0, 1, 0, 0, 2, 3 ; 9
DB 0, 4, 5, 6, 0, 7, 8
DB 0, 0, 1, 0, 0, 2, 3 ; ιo DB 0, 4, 5, 0, 0, 6, 7
DB 9
IF FontSize EQ 16
DB 1, 0, 0, 2, 3 ; il
DB 5, 0, 0, 6, 7 DB A
DB 1, 0, 0, 2, 3 ; 12
DB 5, 0, 0, 6, 7
DB 9, A, B
DB 1, 0, 0, 2, 0 ; 13 ERP DB 0, 5, 6, 0, 0
DB 0, 9, A, B, C
DB 1, 0, 0, 2, 0 ; 13 FLB
DB 4, 5, 0, 0, 0
DB 9, A, 0, B, C DB 1, 0, 0, 2, 0 ; 14
DB 4, 5, 0, 0, 0 APPENDIX 1
DB 6, 7, 8, 9, 0, 0, A, B
DB C, D
DB 0, 0, 0, 1, 0, 0, 2, 0 ; 15
DB 0, 0, 3, 4, 0, 0, 0, 0 DB 5, 6, 7, 8, 9, A, 0, 0
DB B, C, D, E
DB 0, 0, 0, 1, 0, 0, 2, 0 ; 16
DB 0, 0, 3, 4, 0, 0, 0, 0
DB 5, 6, 7, 8, 9, 0, 0, 0 DB A, B, C, D, E, F
ENDIF
************* S H A R E D T A B L E S **************
Best128:
        DB 20h,30h,45h,65h,0Ah,0Dh,31h,54h,74h,52h,32h,61h,49h,53h,41h,4Fh
        DB 72h,43h,4Eh,6Eh,4Ch,6Fh,69h,73h,09h,2Ch,44h,4Dh,35h,2Dh,33h,64h
        DB 46h,2Eh,68h,50h,6Ch,38h,34h,29h,28h,39h,63h,55h,2Fh,3Dh,48h,36h
        DB 75h,66h,6Dh,42h,37h,70h,47h,57h,67h,58h,56h,62h,59h,77h,22h,79h
        DB 2Ah,2Bh,5Fh,76h,27h,4Bh,25h,3Eh,21h,3Bh,5Ah,3Ch,24h,40h,3Ah,6Bh
        DB 4Ah,78h,26h,51h,5Bh,5Dh,23h,71h,7Ah,1Ah,6Ah,19h,3Fh,5Ch,00h,01h
        DB 02h,03h,04h,05h,06h,07h,08h,0Bh,0Ch,0Eh,0Fh,10h,11h,12h,13h,14h
        DB 15h,16h,17h,18h,1Bh,1Ch,1Dh,1Eh,1Fh,5Eh,60h,7Bh,7Ch,7Dh,7Eh,7Fh
FontBits:
        db 1,1 ; 2
        db 1,2,2 ; 3
        db 1,2,3,3 ; 4
        db 1,2,3,4,4 ; 5
        db 2,2,3,3,3,3 ; 6
        db 2,2,3,3,3,4,4 ; 7
        db 2,2,3,3,4,4,4,4 ; 8
        db 2,2,3,3,4,4,4,5,5 ; 9
        db 2,2,3,3,4,4,5,5,5,5 ; 10
        IF FontSize EQ 16
        db 2,2,3,3,4,4,5,5,5,6,6 ; 11
        db 2,2,3,3,4,4,5,5,6,6,6,6 ; 12
        db 2,2,3,4,4,4,4,5,5,6,6,6,6 ; 13 ERP
        db 2,2,3,4,4,4,5,5,5,5,5,6,6 ; 13 FLB
        db 2,2,3,4,4,4,5,5,5,5,6,6,6,6 ; 14
        db 2,2,3,4,4,5,5,5,5,5,5,6,6,6,6 ; 15
        db 2,2,3,4,4,5,5,5,5,5,6,6,6,6,6,6 ; 16
        ENDIF
GlobalBits:
        DB 04,04,04,04,04,04,04,04,05,05,05,05,05,05,05,05
        DB 06,06,06,06,06,06,06,06,07,07,07,07,07,07,07,07
        DB 10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10
        DB 10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10
        DB 12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12
        DB 12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12
        DB 12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12
        DB 12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12
        DB 13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13
        DB 13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13
        DB 13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13
        DB 13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13
        DB 13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13
        DB 13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13
        DB 13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13
        DB 13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13
;
GlobalCodeHigh:
        DB 11001000B,11011000B ; 0- 7
        DB 11101000B,11111000B
        DB 10001000B,10011000B
        DB 10101000B,10111000B
        DB 01000100B,01001100B ; 8- 15
        DB 01010100B,01011100B
        DB 01100100B,01101100B
        DB 01110100B,01111100B
        DB 00100010B,00100110B ; 16- 23
        DB 00101010B,00101110B
        DB 00110010B,00110110B
        DB 00111010B,00111110B
        DB 00010001B,00010011B ; 24- 31
        DB 00010101B,00010111B
        DB 00011001B,00011011B
        DB 00011101B,00011111B
        DB 00001000B,00001000B ; 32- 47
        DB 00001000B,00001000B
        DB 00001001B,00001001B
        DB 00001001B,00001001B
        DB 00001010B,00001010B
        DB 00001010B,00001010B
        DB 00001011B,00001011B
        DB 00001011B,00001011B
        DB 00001100B,00001100B ; 48- 63
        DB 00001100B,00001100B
        DB 00001101B,00001101B
        DB 00001101B,00001101B
        DB 00001110B,00001110B
        DB 00001110B,00001110B
        DB 00001111B,00001111B
        DB 00001111B,00001111B
        DB 00000100B,00000100B ; 64- 79
        DB 00000100B,00000100B
        DB 00000100B,00000100B
        DB 00000100B,00000100B
        DB 00000100B,00000100B
        DB 00000100B,00000100B
        DB 00000100B,00000100B
        DB 00000100B,00000100B
        DB 00000101B,00000101B ; 80- 95
        DB 00000101B,00000101B
        DB 00000101B,00000101B
        DB 00000101B,00000101B
        DB 00000101B,00000101B
        DB 00000101B,00000101B
        DB 00000101B,00000101B
        DB 00000101B,00000101B
        DB 00000110B,00000110B ; 96-111
        DB 00000110B,00000110B
        DB 00000110B,00000110B
        DB 00000110B,00000110B
        DB 00000110B,00000110B
        DB 00000110B,00000110B
        DB 00000110B,00000110B
        DB 00000110B,00000110B
        DB 00000111B,00000111B ; 112-127
        DB 00000111B,00000111B
        DB 00000111B,00000111B
        DB 00000111B,00000111B
        DB 00000111B,00000111B
        DB 00000111B,00000111B
        DB 00000111B,00000111B
        DB 00000111B,00000111B
        DB 00000000B,00000000B ; 128-143
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000000B,00000000B ; 144-159
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000001B,00000001B ; 160-175
        DB 00000001B,00000001B
        DB 00000001B,00000001B
        DB 00000001B,00000001B
        DB 00000001B,00000001B
        DB 00000001B,00000001B
        DB 00000001B,00000001B
        DB 00000001B,00000001B
        DB 00000001B,00000001B ; 176-191
        DB 00000001B,00000001B
        DB 00000001B,00000001B
        DB 00000001B,00000001B
        DB 00000001B,00000001B
        DB 00000001B,00000001B
        DB 00000001B,00000001B
        DB 00000001B,00000001B
        DB 00000010B,00000010B ; 192-207
        DB 00000010B,00000010B
        DB 00000010B,00000010B
        DB 00000010B,00000010B
        DB 00000010B,00000010B
        DB 00000010B,00000010B
        DB 00000010B,00000010B
        DB 00000010B,00000010B
        DB 00000010B,00000010B ; 208-223
        DB 00000010B,00000010B
        DB 00000010B,00000010B
        DB 00000010B,00000010B
        DB 00000010B,00000010B
        DB 00000010B,00000010B
        DB 00000010B,00000010B
        DB 00000010B,00000010B
        DB 00000011B,00000011B ; 224-239
        DB 00000011B,00000011B
        DB 00000011B,00000011B
        DB 00000011B,00000011B
        DB 00000011B,00000011B
        DB 00000011B,00000011B
        DB 00000011B,00000011B
        DB 00000011B,00000011B
        DB 00000011B,00000011B ; 240-255
        DB 00000011B,00000011B
        DB 00000011B,00000011B
        DB 00000011B,00000011B
        DB 00000011B,00000011B
        DB 00000011B,00000011B
        DB 00000011B,00000011B
        DB 00000011B,00000011B
;
GlobalCodeLow:
        DB 00000000B,00000000B ; 0- 7
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000000B,00000000B ; 8- 15
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000000B,00000000B ; 16- 23
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000000B,00000000B ; 24- 31
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00000000B,00000000B
        DB 00100000B,01100000B ; 32- 47
        DB 10100000B,11100000B
        DB 00100000B,01100000B
        DB 10100000B,11100000B
        DB 00100000B,01100000B
        DB 10100000B,11100000B
        DB 00100000B,01100000B
        DB 10100000B,11100000B
        DB 00100000B,01100000B ; 48- 63
        DB 10100000B,11100000B
        DB 00100000B,01100000B
        DB 10100000B,11100000B
        DB 00100000B,01100000B
        DB 10100000B,11100000B
        DB 00100000B,01100000B
        DB 10100000B,11100000B
        DB 00001000B,00011000B ; 64- 79
        DB 00101000B,00111000B
        DB 01001000B,01011000B
        DB 01101000B,01111000B
        DB 10001000B,10011000B
        DB 10101000B,10111000B
        DB 11001000B,11011000B
        DB 11101000B,11111000B
        DB 00001000B,00011000B ; 80- 95
        DB 00101000B,00111000B
        DB 01001000B,01011000B
        DB 01101000B,01111000B
        DB 10001000B,10011000B
        DB 10101000B,10111000B
        DB 11001000B,11011000B
        DB 11101000B,11111000B
        DB 00001000B,00011000B ; 96-111
        DB 00101000B,00111000B
        DB 01001000B,01011000B
        DB 01101000B,01111000B
        DB 10001000B,10011000B
        DB 10101000B,10111000B
        DB 11001000B,11011000B
        DB 11101000B,11111000B
        DB 00001000B,00011000B ; 112-127
        DB 00101000B,00111000B
        DB 01001000B,01011000B
        DB 01101000B,01111000B
        DB 10001000B,10011000B
        DB 10101000B,10111000B
        DB 11001000B,11011000B
Figure imgf000217_0001
DB 00110100B,00111100B
DB 01000100B,01001100B
DB 01010100B,01011100B
DB 01100100B,01101100B
DB 01110100B,01111100B
DB 10000100B,10001100B ; 208-223
DB 10010100B,10011100B
DB 10100100B,10101100B
DB 10110100B,10111100B
DB 11000100B,11001100B
DB 11010100B,11011100B
DB 11100100B,11101100B
DB 11110100B,11111100B
DB 00000100B,00001100B ; 224-239
DB 00010100B,00011100B
DB 00100100B,00101100B
DB 00110100B,00111100B
DB 01000100B,01001100B
DB 01010100B,01011100B
DB 01100100B,01101100B
DB 01110100B,01111100B
DB 10000100B,10001100B ; 240-255
DB 10010100B,10011100B
DB 10100100B,10101100B
DB 10110100B,10111100B
DB 11000100B,11001100B
DB 11010100B,11011100B
DB 11100100B,11101100B
DB 11110100B,11111100B
IF 1 EQ 0 LengthABits: DB 02,03,03,03,04,04,04,05,05,06,06,07,07,07,07,08 DB
08,08,08,09,09,09,09,09,09,09,10,10,10,10,10,10 APPENDIX 1
DB 10 , 10 , 10 , 11, 11, 11 , 11 , 11 , 11 , 11 , 11 , 11 , 11, 12 , 12 , 12
DB 12,12,12,12,12,12,12,12,12,12,12,13,13,13,13,06 ;
LengthACodeHigh:
DB lllOOOOOB,10110000B,10010000B,01110000B
DB 01011000B,01001000B,00111000B,00101100B
DB 00100100B,00011110B,00011010B,00010011B DB 00010001B,00001111B,00001101B,OOOOIOIIB
DB OOOOIOIOB,00001001B,OOOOIOOOB,00000111B
DB 00000111B,OOOOOllOB,OOOOOllOB,00000101B
DB 00000101B,00000100B,00000100B,00000100B
DB OOOOOOllB,OOOOOOllB,OOOOOOllB,OOOOOOllB DB 00000010B,00000010B,00000010B,00000010B
DB 00000010B,OOOOOOOIB,OOOOOOOIB,OOOOOOOIB
DB OOOOOOOIB,OOOOOOOIB,OOOOOOOIB,OOOOOOOIB
DB OOOOOOOIB,OOOOOOOOB,OOOOOOOOB,OOOOOOOOB
DB OOOOOOOOB,OOOOOOOOB,OOOOOOOOB,OOOOOOOOB DB OOOOOOOOB,OOOOOOOOB,OOOOOOOOB,OOOOOOOOB
DB OOOOOOOOB,OOOOOOOOB,OOOOOOOOB,OOOOOOOOB
DB OOOOOOOOB,OOOOOOOOB,OOOOOOOOB,OOOIOIOOB ; no guard bit ; ; on index 63 LengthACodeLow:
DB OOOOOOOOB,OOOOOOOOB,OOOOOOOOB,OOOOOOOOB
DB OOOOOOOOB,OOOOOOOOB,OOOOOOOOB,OOOOOOOOB
DB OOOOOOOOB,OOOOOOOOB,OOOOOOOOB,OOOOOOOOB
DB OOOOOOOOB,OOOOOOOOB,OOOOOOOOB,10Θ-00000B DB 10000000B,10000000B,10000000B,11000000B •
DB 01000000B,11000000B,01000000B,11000000B
DB 01000000B,11000000B,OllOOOOOB,OOIOOOOOB
DB lllOOOOOB,lOlOOOOOB,OllOOOOOB,OOIOOOOOB
DB lllOOOOOB,lOlOOOOOB,OllOOOOOB,00110000B DB 00010000B,11110000B,11010000B,10110000B
DB 10010000B,01110000B,01010000B,00110000B APPENDIX 1
DB OOOIOOOOB,lllllOOOB,IIIOIOOOB,IIOIIOOOB
DB IIOOIOOOB,10111000B,10101000B,10011000B
DB 10001000B,OllllOOOB,OllOlOOOB,OlOllOOOB
DB OIOOIOOOB,OOlllOOOB,OOIOIOOOB,OOOlllOOB
DB OOOIOIOOB,OOOOllOOB,OOOOOIOOB,OOOOOOOOB
o, o, 0, 6, 1
0, 6, 1, 2, 0
1, 4, 1, 0, 0 0,12, 1, 4, 1 1, 0, 0, 0, 0 1, 4, 1 , 0, 0 1, 0, 0, 2, 0 1, 8, 1, 4, 1 0, 4, 1, 0, 0 0,16, 1, 8, 1 0, 0, 0, 4, 1 0,16, 1, 8, 1 0, 0, 0, 4, 1 0, 8, 1, 4, 1 0, 4, 1, 0, 0
Figure imgf000220_0001
0, 0, 0
LengthAValue:
DB 0, 0, 0, 0, 2, 1, 0, 0
DB 0, 3, 5, 4, 0, 0, 0, 6
DB 8, 7, 0, 0, 0, 0,10, 9
DB 0,63, 12,11, 0, 0, 0, 0 DB 14,13, 0, 0,16,15,18,17
DB 0, 0, 0, 0, 0, 0,20,19
DB 22,21, 0, 0,24,23, 0,25
DB 27,26, 0, 0, 0, 0, 0, 0
DB 29,28, 31,30, 0, 0,33,32 DB 0,34, 36,35, 0, 0, 0, 0
DB 0, 0, 38,37,40,39, 0, 0 APPENDIX 1
DB 42,41,44,43, 0, 0, 0, 0
DB 0, 0,46,45,48,47, 0, 0
DB 50,49,52,51, 0, 0, 0, 0
DB 54,53,56,55, 0, 0,58,57 DB 0, 0,60,59,62,61
ENDIF
.
LengthBBits:
        DB 01,03,03,04,05,05,05,06,06,04
;
LengthBCode:
        DB 11000000B,01110000B,01010000B,00111000B,00011100B
        DB 00010100B,00001100B,00000110B,00000010B,00101000B
LengthBNext:
        DB 2, 0, 4, 1, 0, 0, 4, 1, 0, 0
        DB 4, 1, 0, 0, 2, 0, 0, 0
;
LengthBValue:
        DB 0, 0, 0, 0, 2, 1, 0, 0, 9, 3
        DB 0, 0, 5, 4, 0, 6, 8, 7
        IF BufferSize EQ 8192
ZoneBits:
        DB 02,03,03,04,05,05,05,05,05,05,06,06,06,06,06,06
        DB 06,06,06,06,07,07,07,07,07,07,07,07,07,07,07,07
;
ZoneCode:
        DB 11100000B,10110000B,10010000B,01111000B
        DB 01101100B,01100100B,01011100B,01010100B
        DB 01001100B,01000100B,00111110B,00111010B
        DB 00110110B,00110010B,00101110B,00101010B
        DB 00100110B,00100010B,00011110B,00011010B
        DB 00010111B,00010101B,00010011B,00010001B
        DB 00001111B,00001101B,00001011B,00001001B
        DB 00000111B,00000101B,00000011B,00000001B
ZoneNext:
        DB 6, 1, 2, 0, 0, 0,14, 1
        DB 6, 1, 2, 0, 0, 0, 4, 1
        DB 0, 0, 0, 0,16, 1, 8, 1
        DB 4, 1, 0, 0, 0, 0, 4, 1
        DB 0, 0, 0, 0,12, 1, 4, 1
        DB 0, 0, 4, 1, 0, 0, 0, 0
        DB 8, 1, 4, 1, 0, 0, 0, 0
        DB 4, 1, 0, 0, 0, 0
ZoneValue:
        DB 0, 0, 0, 0, 2, 1, 0, 0
        DB 0, 0, 0, 3, 5, 4, 0, 0
        DB 7, 6, 9, 8, 0, 0, 0, 0
        DB 0, 0,11,10,13,12, 0, 0
        DB 15,14,17,16, 0, 0, 0, 0
        DB 19,18, 0, 0,21,20,23,22
        DB 0, 0, 0, 0,25,24,27,26
        DB 0, 0,29,28,31,30
        ELSE
ZoneBits:
        DB 02,02,03,04,04,05,05,05,05,05,06,06,06,06,06,06
;
ZoneCode:
        DB 11100000B,10100000B,01110000B,01011000B
        DB 01001000B,00111100B,00110100B,00101100B
        DB 00100100B,00011100B,00010110B,00010010B
        DB 00001110B,00001010B,00000110B,00000010B
ZoneNext:
        DB 4, 1, 0, 0, 6, 1, 2, 0
        DB 0, 0, 8, 1, 4, 1, 0, 0
        DB 0, 0, 6, 1, 2, 0, 0, 0
        DB 4, 1, 0, 0, 0, 0
ZoneValue:
        DB 0, 0, 1, 0, 0, 0, 0, 2
        DB 4, 3, 0, 0, 0, 0, 6, 5
        DB 8, 7, 0, 0, 0, 9,11,10
        DB 0, 0,13,12,15,14
        ENDIF
CRC_TH:
        DB 000H,011H,023H,032H,046H,057H,065H,074H ; 000
        DB 08CH,09DH,0AFH,0BEH,0CAH,0DBH,0E9H,0F8H ; 008
        DB 010H,001H,033H,022H,056H,047H,075H,064H ; 010
        DB 09CH,08DH,0BFH,0AEH,0DAH,0CBH,0F9H,0E8H ; 018
        DB 021H,030H,002H,013H,067H,076H,044H,055H ; 020
        DB 0ADH,0BCH,08EH,09FH,0EBH,0FAH,0C8H,0D9H ; 028
        DB 031H,020H,012H,003H,077H,066H,054H,045H ; 030
        DB 0BDH,0ACH,09EH,08FH,0FBH,0EAH,0D8H,0C9H ; 038
        DB 042H,053H,061H,070H,004H,015H,027H,036H ; 040
        DB 0CEH,0DFH,0EDH,0FCH,088H,099H,0ABH,0BAH ; 048
        DB 052H,043H,071H,060H,014H,005H,037H,026H ; 050
        DB 0DEH,0CFH,0FDH,0ECH,098H,089H,0BBH,0AAH ; 058
        DB 063H,072H,040H,051H,025H,034H,006H,017H ; 060
        DB 0EFH,0FEH,0CCH,0DDH,0A9H,0B8H,08AH,09BH ; 068
        DB 073H,062H,050H,041H,035H,024H,016H,007H ; 070
        DB 0FFH,0EEH,0DCH,0CDH,0B9H,0A8H,09AH,08BH ; 078
        DB 084H,095H,0A7H,0B6H,0C2H,0D3H,0E1H,0F0H ; 080
        DB 008H,019H,02BH,03AH,04EH,05FH,06DH,07CH ; 088
        DB 094H,085H,0B7H,0A6H,0D2H,0C3H,0F1H,0E0H ; 090
        DB 018H,009H,03BH,02AH,05EH,04FH,07DH,06CH ; 098
        DB 0A5H,0B4H,086H,097H,0E3H,0F2H,0C0H,0D1H ; 0A0
        DB 029H,038H,00AH,01BH,06FH,07EH,04CH,05DH ; 0A8
        DB 0B5H,0A4H,096H,087H,0F3H,0E2H,0D0H,0C1H ; 0B0
        DB 039H,028H,01AH,00BH,07FH,06EH,05CH,04DH ; 0B8
        DB 0C6H,0D7H,0E5H,0F4H,080H,091H,0A3H,0B2H ; 0C0
        DB 04AH,05BH,069H,078H,00CH,01DH,02FH,03EH ; 0C8
        DB 0D6H,0C7H,0F5H,0E4H,090H,081H,0B3H,0A2H ; 0D0
        DB 05AH,04BH,079H,068H,01CH,00DH,03FH,02EH ; 0D8
        DB 0E7H,0F6H,0C4H,0D5H,0A1H,0B0H,082H,093H ; 0E0
        DB 06BH,07AH,048H,059H,02DH,03CH,00EH,01FH ; 0E8
        DB 0F7H,0E6H,0D4H,0C5H,0B1H,0A0H,092H,083H ; 0F0
        DB 07BH,06AH,058H,049H,03DH,02CH,01EH,00FH ; 0F8
CRC_TL:
        DB 000H,089H,012H,09BH,024H,0ADH,036H,0BFH ; 000
        DB 048H,0C1H,05AH,0D3H,06CH,0E5H,07EH,0F7H ; 008
        DB 081H,008H,093H,01AH,0A5H,02CH,0B7H,03EH ; 010
        DB 0C9H,040H,0DBH,052H,0EDH,064H,0FFH,076H ; 018
        DB 002H,08BH,010H,099H,026H,0AFH,034H,0BDH ; 020
        DB 04AH,0C3H,058H,0D1H,06EH,0E7H,07CH,0F5H ; 028
        DB 083H,00AH,091H,018H,0A7H,02EH,0B5H,03CH ; 030
        DB 0CBH,042H,0D9H,050H,0EFH,066H,0FDH,074H ; 038
        DB 004H,08DH,016H,09FH,020H,0A9H,032H,0BBH ; 040
        DB 04CH,0C5H,05EH,0D7H,068H,0E1H,07AH,0F3H ; 048
        DB 085H,00CH,097H,01EH,0A1H,028H,0B3H,03AH ; 050
        DB 0CDH,044H,0DFH,056H,0E9H,060H,0FBH,072H ; 058
        DB 006H,08FH,014H,09DH,022H,0ABH,030H,0B9H ; 060
        DB 04EH,0C7H,05CH,0D5H,06AH,0E3H,078H,0F1H ; 068
        DB 087H,00EH,095H,01CH,0A3H,02AH,0B1H,038H ; 070
        DB 0CFH,046H,0DDH,054H,0EBH,062H,0F9H,070H ; 078
        DB 008H,081H,01AH,093H,02CH,0A5H,03EH,0B7H ; 080
        DB 040H,0C9H,052H,0DBH,064H,0EDH,076H,0FFH ; 088
        DB 089H,000H,09BH,012H,0ADH,024H,0BFH,036H ; 090
        DB 0C1H,048H,0D3H,05AH,0E5H,06CH,0F7H,07EH ; 098
        DB 00AH,083H,018H,091H,02EH,0A7H,03CH,0B5H ; 0A0
        DB 042H,0CBH,050H,0D9H,066H,0EFH,074H,0FDH ; 0A8
        DB 08BH,002H,099H,010H,0AFH,026H,0BDH,034H ; 0B0
        DB 0C3H,04AH,0D1H,058H,0E7H,06EH,0F5H,07CH ; 0B8
        DB 00CH,085H,01EH,097H,028H,0A1H,03AH,0B3H ; 0C0
        DB 044H,0CDH,056H,0DFH,060H,0E9H,072H,0FBH ; 0C8
        DB 08DH,004H,09FH,016H,0A9H,020H,0BBH,032H ; 0D0
        DB 0C5H,04CH,0D7H,05EH,0E1H,068H,0F3H,07AH ; 0D8
        DB 00EH,087H,01CH,095H,02AH,0A3H,038H,0B1H ; 0E0
        DB 046H,0CFH,054H,0DDH,062H,0EBH,070H,0F9H ; 0E8
        DB 08FH,006H,09DH,014H,0ABH,022H,0B9H,030H ; 0F0
        DB 0C7H,04EH,0D5H,05CH,0E3H,06AH,0F1H,078H ; 0F8
printstat Data, size, is, %$-tb
******************** D E B U G G E R ********************
DEBUGGER or DUMMY INCLUSION
        include TCdbg001
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
unplanned_int:
        brk
;break:
;break_out:
        nop
        nop
        nop
        rti
*************** V E C T O R T A B L E ****************
VECTOR TABLE
        ds 0-progaddr-($-cb)-32,0
; Jsb 0
        dw unplanned_int
; Jsb 1
        dw unplanned_int
; Jsb 2
        dw unplanned_int
; Jsb 3
        dw unplanned_int
; Jsb 4
        dw unplanned_int
; Jsb 5
        dw unplanned_int
; Jsb 6
        dw unplanned_int
; Jsb 7
        dw unplanned_int
;
; Irq6,break,PTGA,PTGb,bE
        dw break
; Irq5,Serin Stat, TimerA
        dw break_out
; Irq4,PA3,Edge/bF
        dw unplanned_int
; Irq3,Host/Timerb
        dw Hostlnt
; Irq2,Pb§ Edge
        dw unplanned_int
; Irq1,Pd7 Edge
        dw unplanned_int
; NMI
        dw unplanned_int
; unplanned_int
reset:
        dw dbginit ; start in debugger
        printstat <C000-FFFFh Block Free =>,%16384-($-cb)
;
        end
APPENDIX 2
SOURCE LISTING GUIDE
Page 1, lines 1 through 11
Define the assembly environment.
Page 1, line 12
The "include ITEC19" statement copies a source file which uses the MACRO facility in the assembler to provide some higher-level-language-type constructs.
Page 1, line 13
The "include TCDFM001" statement copies a source file which defines the internal register and I/O structure in the C19.
Page 1, lines 14 through 28
Assembly macros used to manage and display the assembly environment and status.
Page 1, lines 33 through 40
Setting of symbols which control some assembly-time features of the algorithm. These are used to enable or disable various structures and code in order to evaluate compression effectiveness.
Page 1, line 46 through Page 2, line 32
More assembly-time controls which affect compression mechanisms and establish the sizes of certain memory structures.
Page 2, line 38 through Page 3, line 8
Assembly-time controls for diagnostics and speed of execution, having little or no effect on compression effectiveness.
Page 3, lines 12 through 42
Definition (mapping) of some structures used by the algorithm.
Page 3, line 47 through Page 6, line 36
Declaration of byte (8-bit) and word (16-bit) variables used by the encoding and decoding processes. Variables beginning with "EC" are used by the encoder (compression process) and those beginning with "DC" are used by the decoder (decompression process). Other prefixes are for general use.
Page 6, line 42 through Page 7, line 36
Declaration of encoder/decoder structures which are a multiple of 256 bytes in length.
Page 7, line 40 through Page 8, line 27
Declaration of more encoder structures that are a multiple of 256 bytes in length. This block is from absolute address 4000h to 0C000h (size = 32768 bytes) and is bank switched, alternating with the next described block.
Page 8, line 33 through Page 9, line 5
The decoder bank-switched block (size = 32768 bytes).
Page 9, line 10 through Page 11, line 48
Interface points to the operating system code for the production implementation of the algorithm. In production, these hooks replace the development environment code on pages 22 through 25 inclusive.
Page 11, line 11 through Page 14, line 14
Table initialization code. All compression/decompression tables and variables are set to initial conditions.
Page 14, line 18 through Page 16, line 46
Program startup code which sets the environment and initializes stacks for alternate execution of the encoder and decoder.
Page 17, lines 1 through 19
Context switch subroutines.
Page 17, line 25 through Page 26, line 3
Development environment routines for memory dump to the PC and character transfer to/from the PC bus. Characters transferred are to be compressed or decompressed. The PC interface is an emulation of a standard PC asynchronous communications IC, an INS16450.
Page 26, line 11 through Page 28, line 24
Macro declarations which facilitate the generation of certain microcode routines for bit stream output as either in-line code or as subroutines.
Page 28, line 30 through Page 29, line 11
A macro which embodies the microcode to select the appropriate one of four NCToFrequency tables based on the prior character of the input stream, leaving the base address of the table ECNCChar in ECWord1 and the base address of the table ECNCFreq in ECWord2.
Page 29, line 17 through Page 41, line 21
The body of the FontUpdate macro. This generates all of the microcode to perform the processes of CRC hash generation, font access, font creation, etc.; in general, all of the processes (steps) 3 through 9 as described with Figure 2B.
Page 41, line 27 through Page 45, line 44
The Encode main loop, first phase. This is the Refill process, which accepts characters from the input stream, stores them in the process buffer (ECChar, ECCharCopy), and invokes the FontUpdate macro (process). As required by flush operations and ProcessBuffer-full conditions, this process invokes the second phase of execution.
Page 46, line 1 through Page 48, line 37
The Mode A string search macro. This embodies the code to locate the longest string in the history buffer matching the string that begins at position A of the ECChar buffer (A being the value in the C19 accumulator register).
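The longest-match search described for the Mode A macro can be illustrated with a brute-force sketch. This is an illustration only, not the listing's method: the listing locates candidates through hash-linked access tables, whereas the sketch simply scans every start position. The function name `longest_match` and the `min_len` cutoff are hypothetical.

```python
def longest_match(history: bytes, target: bytes, min_len: int = 2):
    """Return (position, length) of the longest prefix of `target` found in `history`.

    Brute-force scan for illustration; returns None when no match reaches
    `min_len` (a hypothetical cutoff, not a value taken from the listing).
    """
    best_pos, best_len = -1, 0
    for start in range(len(history)):
        length = 0
        while (start + length < len(history)
               and length < len(target)
               and history[start + length] == target[length]):
            length += 1
        if length > best_len:  # strict '>' keeps the earliest of equal-length matches
            best_pos, best_len = start, length
    return (best_pos, best_len) if best_len >= min_len else None


if __name__ == "__main__":
    print(longest_match(b"abcabcd", b"abcdx"))
```

With the history `abcabcd` and the object string `abcdx`, the scan first finds a three-character match at position 0 and then replaces it with the four-character match at position 3.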
Page 48, line 41 through Page 52, line 43
The Mode A string find routines. These routines perform two iterations of the above macro, reject strings overlapping the next history buffer stream location, and select the longer of the two if two were found.
Page 52, line 44 through Page 54, line 44
The Mode A pair encoding and bit cost comparison routines.
Page 54, line 45 through Page 57, line 19
Mode A string bit cost computation and comparison with Font encoding.
Page 57, line 25 through Page 62, line 8
Mode A bit stream format and output routines.
Page 62, line 12 through Page 67, line 34
Mode A repeats output, history buffer and access table update routine, and phase 2 iteration (flush mode or normal) control.
Page 67, line 40 through Page 72, line 3
Mode B string search macro routines.
Page 72, line 6 through Page 74, line 16
Mode B string find routines. These perform the Mode B string search macros and reject strings which overlap the next history buffer store location.
Page 74, line 17 through Page 76, line 13
Mode B pair encoding and bit cost comparison subroutines.
Page 76, line 14 through Page 78, line 27
Mode B string bit cost comparison routines.
Page 78, line 32 through Page 79, line 40
Mode B antiexpansion summing routines.
Page 79, line 46 through Page 80, line 49
Mode B history buffer and access table update.
Page 81, line 1 through Page 92, line 18
Mode B bit stream format and output.
Page 92, line 17 through Page 96, line 29
Decoder macros for character input/output and bit stream fetch.
Page 96, line 33 through Page 105, line 17
Decoder main body.
Page 105, line 26 through Page 107, line 9
EncodingTable and FontCode (Huffman Font codes) tables. Used to emit Font encoding bit patterns.
Page 107, line 14 through Page 109, line 9
Decoder Huffman Font decoding trees.
Page 109, line 13 through Page 109, line 21
New Character to Frequency preload tables.
Page 109, line 23 through Page 109, line 41
FontBits table. Used for computing the bit cost of Font encodings.
Page 109, line 43 through Page 110, line 10
GlobalBits table. Used to compute the bit cost of NewChar and any other encodings which use the GlobalBits tables.
Page 110, line 12 through Page 115, line 25
The GlobalBits High and Low Huffman tables (GlobalCodeHigh and GlobalCodeLow in the listing), used as a pair to encode items such as NewChar, Mode A string length, and Repeat count.
Page 115, line 27 through Page 117, line 7
The LengthA encoding tables. Not in use by the preferred embodiment.
Page 117, line 9 through Page 117, line 22
The LengthBBits, LengthBCode, LengthBNext and LengthBValue tables. LengthBBits and LengthBCode are used for encoding the Mode B string length. LengthBNext and LengthBValue are used for decoding Mode B string lengths.
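The relationship between the LengthBBits and LengthBCode tables can be checked with a short sketch. It assumes, as the listing's "no guard bit" comment suggests, that each code byte holds the code bits left-justified and terminated by a single 1 "guard" bit. The table values are transcribed from the Appendix 1 listing with OCR-damaged digits read as 0 or 1; the helper names `code_bits` and `is_prefix_free` are hypothetical.

```python
# Values transcribed from the LengthBBits / LengthBCode tables in Appendix 1
# (OCR-damaged digits interpreted as 0/1; treat the transcription as an assumption).
LENGTH_B_BITS = [1, 3, 3, 4, 5, 5, 5, 6, 6, 4]
LENGTH_B_CODE = [
    0b11000000, 0b01110000, 0b01010000, 0b00111000, 0b00011100,
    0b00010100, 0b00001100, 0b00000110, 0b00000010, 0b00101000,
]


def code_bits(entry: int, width: int = 8) -> str:
    """Strip the trailing guard bit and padding; return the code as a bit string.

    Assumes `entry` is nonzero, i.e. that a guard bit is present.
    """
    guard = (entry & -entry).bit_length() - 1   # index of the lowest set bit
    nbits = width - 1 - guard                   # code bits sit above the guard
    return format(entry >> (guard + 1), "b").zfill(nbits)


def is_prefix_free(codes) -> bool:
    """True when no code is a proper prefix of another."""
    return not any(a != b and b.startswith(a) for a in codes for b in codes)


if __name__ == "__main__":
    codes = [code_bits(c) for c in LENGTH_B_CODE]
    print(codes)
    print([len(c) for c in codes] == LENGTH_B_BITS, is_prefix_free(codes))
```

Under this reading, the code lengths recovered from LengthBCode agree entry by entry with LengthBBits, which is consistent with the bits table being used only for cost estimates while the code table supplies the emitted pattern.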
Page 117, line 24 through Page 118, line 32
The ZoneBits, ZoneCode, ZoneNext and ZoneValue tables. ZoneBits and ZoneCode are used for encoding the Zone portion of Mode A and Mode B string location offsets. ZoneNext and ZoneValue are used for decoding the Zone portion of Mode A and Mode B string location offsets.
Page 118, line 34 through Page 120, line 3
The precalculated CRC table. Used for rapid CRC hash calculations in the FontAccess routines.
Page 120, line 11
Inclusion of the C19 debugger (soft monitor) file.
Page 120, line 15 through Page 121, line 17
Vector jump tables for the C19 hardware vectoring system.

Claims

What is claimed is:
1. A system for the dynamic encoding of a character stream, the system comprising:
an input for receiving the character stream;
an output for providing encoded data;
single character encoding means, connected to the input, for providing, for a given character, an encoded signal indicative of the given character, including
a) means, hereinafter referred to as "font means," connected to the input, associated with a character pair, hereinafter referred to as "the given character pair", for storing, accessing and updating for each given character of a plurality of characters, a table listing the set of candidates for the character that may follow the given character pair in the stream, such table hereinafter referred to as a "font"; wherein all the candidates in such font are stored in approximate order of their local frequency of occurrence after the given character pair with which the font is associated;
b) font identification means, connected to the input, for identifying the font, hereinafter referred to as the "given font", for that character in the stream at the input; and
c) position encoding means for providing, for one given character, a signal indicative of the position, occupied by the given character, in the given font;
string encoding means, connected to the input, for providing, for a given string of characters, an encoded signal indicative of the given string of characters, including:
a) a history buffer;
b) history buffer access means for finding a candidate string in the history buffer; and
c) longest match search means for searching for longest match by comparing an object string in the character stream with a candidate string in the history buffer; and
output selection means for accepting encoded signals from the single character encoding means and encoded signals from the string encoding means and selectively sending these encoded signals to the output;
wherein the font identification means further includes hash encoding means for producing hash codes and hash code storage means for storing hash codes and the history buffer access means further includes means for retrieving hash codes from the hash code storage means, such that a common hash code is used by both the font encoding means and the string encoding means.
2. A system according to claim 1, wherein the hash encoding means includes means for applying a CRC algorithm to an ordered character pair to produce a hash code.
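A table-driven CRC over an ordered character pair can be sketched as follows. The first entries of the CRC_TH and CRC_TL tables in Appendix 1 (high bytes 000H, 011H, 023H and low bytes 000H, 089H, 012H) match the widely used reflected 16-bit CRC table for polynomial 0x8408, so the sketch generates that table. How the listing seeds the CRC and which bits of the result it keeps as the hash are not stated in this section; the zero seed and the function name `pair_hash` are assumptions.

```python
def make_crc_table(poly: int = 0x8408):
    """Build the 256-entry table for a reflected (LSB-first) 16-bit CRC."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


CRC_TABLE = make_crc_table()
CRC_TH = [entry >> 8 for entry in CRC_TABLE]    # high bytes, as in the listing
CRC_TL = [entry & 0xFF for entry in CRC_TABLE]  # low bytes, as in the listing


def pair_hash(first: int, second: int) -> int:
    """CRC of an ordered pair of 8-bit characters (zero seed is an assumption)."""
    crc = 0
    for byte in (first, second):
        crc = (crc >> 8) ^ CRC_TABLE[(crc ^ byte) & 0xFF]
    return crc


if __name__ == "__main__":
    print([hex(v) for v in CRC_TH[:4]], [hex(v) for v in CRC_TL[:4]])
    print(hex(pair_hash(ord("t"), ord("h"))), hex(pair_hash(ord("h"), ord("t"))))
```

Because the pair is ordered, `pair_hash(ord("t"), ord("h"))` and `pair_hash(ord("h"), ord("t"))` give different codes; a practical implementation would presumably keep only part of the 16-bit result as a table index.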
3. A system according to claim 1, further including: means for maintaining a value for the position of any character that is not otherwise listed in the font, such character hereinafter referred to as "new character" or "NC", in relation to other candidates in a given font, in approximate order of such new character's local frequency of occurrence after the given character pair; such that new character is assigned a "virtual position" in the font, as distinct from a position that is associated with a location in the font capable of storing a specific candidate character; and such that the address of the position of each candidate character below the new character position in the table is incremented by 1.
4. A system according to claim 3, further including: a plurality of NC fonts, each font listing the candidates for the new character which may follow a given set of characters in the character stream wherein all the candidates in such font are stored in approximate order of their local frequency of occurrence after the given set of characters with which the font is associated; and NC font selection means for selecting the NC font to be used to encode a given new character based on predefined bits from the set of characters preceding the given character in the character stream.
5. A system according to claim 4, wherein the number of NC fonts is four and the predefined bits are bits 5 and 6 from the character prior to the given character.
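The selection of one of four NC fonts from two bits of the prior character can be sketched as below. The sketch assumes bits are numbered from 0 at the least significant bit, so that bits 5 and 6 form a two-bit index; the claim does not state the numbering convention, and the function name `nc_font_index` is hypothetical.

```python
def nc_font_index(prior_char: int) -> int:
    """Return 0..3 from bits 5 and 6 of the prior character (bit 0 = LSB, assumed)."""
    return (prior_char >> 5) & 0b11


if __name__ == "__main__":
    for ch in ("\n", "1", "A", "a"):
        print(repr(ch), nc_font_index(ord(ch)))
```

Under that numbering the four groups for 7-bit ASCII are control characters (0), digits and most punctuation (1), upper-case letters (2) and lower-case letters (3), which gives each NC font a distinct kind of context.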
6. A system according to claim 1, further including: means for maintaining a value for the position of a string in a given font.
7. A system according to claim 1, further including: means for maintaining a value for the position of a new character, i.e., any character that is not otherwise listed in the font, in relation to other candidates in a given font, in approximate order of such new character's local frequency of occurrence after the given character pair; and means for maintaining a value for the position of a string; such that the value for the position of a string is one greater than the value for the position of a new character; such that the string is assigned a "virtual position" in the font as distinct from a position associated with a location in the font capable of storing a specific candidate character; and such that the address of the position of each candidate character below the new character position in the font is incremented by 2.
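The "virtual position" arrangement in this claim can be made concrete with a small sketch. It assumes stored candidates occupy slots 0, 1, 2, ... in rank order, that the new character (NC) holds virtual position `nc_pos`, and that the string holds `nc_pos + 1`; every candidate stored at or after `nc_pos` is therefore reported two positions later. The function name and the dictionary keys "NC" and "STRING" are hypothetical.

```python
def encoded_positions(candidates, nc_pos: int):
    """Map each stored candidate, plus NC and STRING, to its encoded position.

    `candidates` is the font in rank order; `nc_pos` is the virtual position
    of the new character (an illustrative model of claim 7, not the listing's code).
    """
    positions = {}
    for slot, ch in enumerate(candidates):
        # Candidates ranked below NC are shifted past the two virtual entries.
        positions[ch] = slot if slot < nc_pos else slot + 2
    positions["NC"] = nc_pos
    positions["STRING"] = nc_pos + 1
    return positions


if __name__ == "__main__":
    print(encoded_positions(["e", "a", "o"], nc_pos=1))
```

For a font holding `e`, `a`, `o` with NC at position 1, the encoded positions are e=0, NC=1, STRING=2, a=3, o=4, so no storage slot is spent on the two virtual entries.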
8. A system according to claim 1, further including: repeat character encoding means for encoding repeat character sequences, i.e. characters all alike, found in the character stream; wherein the history buffer stores characters found in the character stream; and wherein repeat character sequences having three or more characters are represented in the history buffer by three characters only.
9. A system according to claim 3, wherein the string encoding means has a plurality of modes of operation, the system further including: means for summing, over a predetermined number of new character occurrences, the bit-count of the code for each new character encoded; means for comparing the sum with a predetermined value; and switch means for switching modes whenever the bit-count exceeds the predetermined value.
10. A system according to claim 9, wherein the predetermined value has a value between seven bits per character and eight bits per character.
11. A system according to claim 9, wherein the predetermined value is 7.5 bits per character.
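The mode-switching test of claims 9 to 11 might be sketched as follows. The window of 16 new-character occurrences, the class name `NewCharCostMonitor`, and the conversion of the per-character threshold into a total (window times 7.5 bits) are assumptions made for illustration; the claims fix only the summing, the comparison, and the 7.5 bits-per-character value.

```python
class NewCharCostMonitor:
    """Sum new-character code lengths over a window and flag an overrun."""

    def __init__(self, window: int = 16, bits_per_char: float = 7.5) -> None:
        self.window = window                  # assumed number of occurrences
        self.limit = window * bits_per_char   # predetermined value as a total
        self.total_bits = 0
        self.count = 0

    def record(self, code_bits: int) -> bool:
        """Add one new-character code length.

        Returns True only when a full window has been summed and the sum
        exceeds the limit; the caller would then switch encoding mode.
        """
        self.total_bits += code_bits
        self.count += 1
        if self.count < self.window:
            return False
        exceeded = self.total_bits > self.limit
        self.total_bits = 0
        self.count = 0
        return exceeded
```

What the encoder does on a switch is left to the caller here, since the claims above do not name the individual modes.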
12. A system for providing, for a given string of characters, an encoded signal indicative of the given string of characters, comprising: a) a history buffer tagged at regular intervals; b) history buffer access means for finding a candidate string in the history buffer; and c) longest match search means for searching for longest match by comparing an object string in the character stream with a candidate string in the history buffer; d) a hash head table, which may be entered by a hash code derived from consecutive characters; e) a hash link/test table, having a number of records equal to the number of tagged entries in the history buffer, each record having a link field and a test field and an address related to the address of the corresponding tagged entry in the history buffer; wherein the hash head table contains pointers, consisting of part of the hash code, each pointing to the first candidate match in a linked list of candidates in the Hash Link field and the Hash Test field contains a match value consisting of another part of the hash code.
13. A system according to claim 12, wherein the longest match search means includes means for testing for a match beginning at a character in the candidate string at least one character ahead of the first character in such string.
14. A system according to claim 13, wherein the longest match search means includes means for testing for a match beginning at character "n" ahead of the first character of the candidate string in the history buffer, where "n" is the length of the longest match found so far, and searching forward to identify the longest match.
15. A system according to claim 14, wherein the longest match search means further includes means for searching back for the longest match.
16. A system according to claim 12, further including means for discarding string matches having less than a predetermined number of characters.
17. A system according to claim 16, wherein the predetermined number is 3.
18. A system according to claim 12, wherein the linked list is terminated by non-match of the contents of the Hash Test field with its corresponding part of the hash code.
19. A system according to claim 1, further including pair encoding means for encoding two characters by presenting the two characters in sequence to a CRC algorithm.
20. A system according to claim 19, wherein pair encoding processes and string encoding processes may be active at the same time.
21. An improved data compression modem of the type having terminal interface control means for controlling an interface with a terminal, data compression means for compressing data from the terminal, line control means for controlling data flow over a data line, line interface means for interfacing with a data line, wherein the improvement comprises: (a) first processor means for controlling both flow of data over the interface with the terminal and for compressing data from the terminal, and (b) second processor means for controlling flow of data over the data line.
22. An improved data compression modem of the type having terminal interface control means for controlling an interface with a terminal, data compression and decompression means for compressing data received from the terminal and for decompressing data going to the terminal, line control means for controlling data flow over a data line, line interface means for interfacing with a data line, wherein the improvement comprises: (a) first processor means for controlling both flow of data over the interface with the terminal and for compressing data received from the terminal, and for decompressing data going to the terminal, and (b) second processor means for controlling flow of data over the data line.
23. An improved data compression modem according to claim 21, wherein the first processor and the second processor access a common memory.
24. A method for dynamically encoding a character stream, in an encoder having a history buffer and fonts, comprising the following steps: a) receiving the character stream; b) creating, from a two-character string, having a first character and a second character, a hash code; c) associating each font with a pair of characters; d) maintaining the position of a candidate character in a font in approximate order of the local frequency of occurrence of the candidate character in the character stream after the pair of characters with which the font is associated; e) encoding a given character using the hash code to access the font associated with the pair of characters immediately preceding the given character in the character stream; and f) encoding a given string of characters using the hash code to access a matching string in the history buffer.
25. A method for dynamically encoding a character stream, in an encoder having a history buffer and having fonts that are dynamically created and updated, comprising: a) receiving the character stream; b) creating, from a two-character string, having a first character and a second character, a hash code having desirable statistical properties and a match code; c) associating each font with a pair of characters; d) maintaining the position of a candidate character in a font in approximate order of the local frequency of occurrence of the candidate character in the character stream after the pair of characters with which the font is associated; e) encoding a given character using the hash code and the match code to access the font associated with the pair of characters immediately preceding the given character in the character stream; and f) encoding a given string of characters using the hash code and the match code to access a matching string in the history buffer.
26. A method for creating, from a two-byte string having a first byte and a second byte, a hash code having desirable statistical properties, comprising: encoding the two-byte string by presenting the two bytes in sequence to a CRC algorithm to produce a CRC hash; and designating selected bits from the CRC hash for use as a hash code.
27. A method for creating, from a two-byte string having a first byte and a second byte, a hash code having desirable statistical properties and a match code for resolving ambiguity, comprising: encoding the two-byte string by presenting the two bytes in sequence to a CRC algorithm to produce a CRC hash; designating selected bits from the CRC hash for use as a hash code; and designating the remaining bits from the CRC hash for use as a match code.
28. A method according to claim 27, wherein ten bits are selected for use as a hash code.
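The hash construction of claims 26 to 28 can be illustrated with a short sketch. Several details are assumptions, not taken from the claims: a 16-bit CRC (inferred from the ten-bit and six-bit split recited in claim 39), the polynomial 0x1021 with a zero initial value, and the choice of the low ten bits as the hash code. The names `crc16_update` and `pair_hash` are hypothetical.

```python
from typing import Tuple


def crc16_update(crc: int, byte: int, poly: int = 0x1021) -> int:
    """Feed one byte into a 16-bit CRC register (MSB-first, bitwise)."""
    crc ^= (byte & 0xFF) << 8
    for _ in range(8):
        if crc & 0x8000:
            crc = ((crc << 1) ^ poly) & 0xFFFF
        else:
            crc = (crc << 1) & 0xFFFF
    return crc


def pair_hash(first: int, second: int) -> Tuple[int, int]:
    """Return (hash_code, match_code) for a two-byte string."""
    crc = crc16_update(0, first)      # present the first byte
    crc = crc16_update(crc, second)   # then the second byte
    hash_code = crc & 0x3FF           # assumed: low ten bits index the table
    match_code = crc >> 10            # remaining six bits resolve ambiguity
    return hash_code, match_code


if __name__ == "__main__":
    print(pair_hash(ord("t"), ord("h")))
```

With a zero initial value and a polynomial whose constant term is 1, a 16-bit CRC maps the 65,536 possible byte pairs to distinct values, so under these assumptions the hash code and match code together identify the pair exactly.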
29. A method for accessing a specific font within a data processing system, the system having a link table and a plurality of fonts, each font being uniquely associated with a specific character pair, comprising: accepting a pair of characters, having a first character and a second character, each character represented by a single byte; encoding the pair of characters using a CRC algorithm to produce a CRC hash; selecting a first part of the CRC hash as a look-up code; linking, in the link table, those fonts that are associated with pairs of characters whose encoding produces the same first part of the CRC hash; entering the hash table with the look-up code to access a linked list of fonts; and identifying, from among the fonts in the linked list, the specific font corresponding to the pair of characters, by matching the remainder of the CRC hash.
30. A method according to claim 29, wherein the method of encoding the pair of characters includes: encoding the two-byte string by presenting the two bytes in sequence to a CRC algorithm to produce a CRC hash; and designating selected bits from the CRC hash for use as a hash code.
31. A method according to claim 29, wherein the first part of the CRC hash consists of ten bits.
32. A method, for accessing a specific pair of characters in a history buffer within a system for the dynamic encoding of a character stream, the system having a history buffer containing characters from the character stream, and a link table, comprising: accepting a pair of characters from the character stream, hereinbelow referred to as "the given pair of characters", each pair having a first character and a second character, each character represented by a single byte; encoding the given pair of characters using a CRC algorithm to produce a CRC hash; selecting a first part of the CRC hash as a look-up code; linking, in the link table, history buffer entry points that have pairs of characters in the history buffer whose encoding produces the same first part of the CRC hash; entering the hash table with the look-up code to access a linked list of history buffer entry points; and identifying, from among the history buffer entry points in the linked list, points corresponding to the given pair of characters, by matching the remainder of the CRC hash.
33. A method according to claim 32, wherein the method of encoding the given pair of characters using the CRC algorithm to produce the CRC hash comprises: encoding the byte representing the first character using the CRC algorithm to produce an intermediate CRC hash; encoding the second character using the CRC algorithm and the intermediate CRC hash to produce the CRC hash.
34. A method according to claim 32, wherein the first part of the CRC hash consists of ten bits.
35. A method, for accessing a specific sequence of four characters in a history buffer within a system for the dynamic encoding of a character stream, the system having a history buffer containing characters from the character stream, and a link table, comprising: accepting four consecutive characters, hereinbelow referred to as "the given four characters", comprising a first pair of consecutive characters and a second pair of consecutive characters, each pair having a first character and a second character, each character represented by a single byte, from the character stream; encoding the given four characters using a CRC algorithm to produce a hash code; selecting a first part of the hash code as a look-up code; linking, in the link table, history buffer entry points that have four sequential characters in the history buffer whose encoding produces the same first part of the hash code; entering the link table with the look-up code to access a linked list of history buffer entry points; and identifying, from among the history buffer entry points in the linked list, points corresponding to the given four characters, by matching the remainder of the hash code.
36. A method according to claim 35, wherein the method of encoding the given four characters using a CRC algorithm to produce a hash code comprises: encoding the first pair of characters by presenting the characters in sequence to a CRC algorithm to produce a first pair CRC hash; encoding the second pair of characters by presenting the characters in sequence to a CRC algorithm to produce a second pair CRC hash; subtracting the second pair CRC hash from zero to produce a negated second pair CRC hash; and performing an Exclusive OR operation on the first pair CRC hash and the negated second pair CRC hash to produce a hash code.
37. A method according to claim 35, wherein the first part of the hash code consists of ten bits.
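The four-character hash of claims 35 to 37 combines two pair hashes. The sketch below follows the steps of claim 36 under stated assumptions: a 16-bit CRC with polynomial 0x1021 and zero initial value, negation taken modulo 2^16, and the low ten bits used as the look-up code. The function names are hypothetical.

```python
from typing import Tuple


def crc16_update(crc: int, byte: int, poly: int = 0x1021) -> int:
    """Feed one byte into a 16-bit CRC register (MSB-first, bitwise)."""
    crc ^= (byte & 0xFF) << 8
    for _ in range(8):
        if crc & 0x8000:
            crc = ((crc << 1) ^ poly) & 0xFFFF
        else:
            crc = (crc << 1) & 0xFFFF
    return crc


def pair_crc(first: int, second: int) -> int:
    """CRC of two bytes presented in sequence."""
    return crc16_update(crc16_update(0, first), second)


def four_char_hash(c1: int, c2: int, c3: int, c4: int) -> Tuple[int, int]:
    """Return (look_up_code, remainder) for four consecutive characters."""
    first = pair_crc(c1, c2)
    second = pair_crc(c3, c4)
    negated = (0 - second) & 0xFFFF   # subtract from zero, kept to 16 bits
    code = first ^ negated            # Exclusive OR of the two values
    return code & 0x3FF, code >> 10   # assumed split: ten bits, then six


if __name__ == "__main__":
    print(four_char_hash(*b"this"))
```

One plausible effect of the negation step is that a repeated pair such as "abab" does not always hash to zero, which is what a plain Exclusive OR of two equal pair hashes would give.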
38. A method for controlling the selection of alternative string encoding modes in a system for the dynamic encoding of a character stream, comprising: maintaining a set of fonts, each font being associated with a pair of characters, wherein all the candidates in such font are stored in approximate order of their local frequency of occurrence after the given character pair with which the font is associated, the fonts further including means for maintaining the position of a symbol for a new character, i.e., any character that is not otherwise listed in the font, in relation to other candidates in a given font in approximate order of such symbol's local frequency of occurrence after the given character pair; maintaining a new character encoding table; encoding new characters from the character stream according to the position of the new character in the new character encoding table; summing the bit cost of encoding each new character over a predetermined plurality of new character occurrences; comparing the sum with a predetermined value; and switching modes whenever the bit-count exceeds the predetermined value.
39. A method for encoding a character pair, within a system for the dynamic encoding of a character stream, the system having a link table and a plurality of fonts, each font being uniquely associated with a specific character pair, comprising: accepting a pair of characters, having a first character and a second character, each character represented by a single byte; encoding the pair of characters using a CRC algorithm to produce a CRC hash; selecting ten bits of the CRC hash as a look-up code; linking, in the link table, those fonts that are associated with pairs of characters whose encoding produces the same first part of the CRC hash; entering the hash table with the look-up code to access a linked list of fonts; identifying, from among the fonts in the linked list, the specific font corresponding to the pair of characters, by matching the remaining six bits of the CRC hash; and encoding the pair of characters as the relative address of the identified font.
40. In a system for dynamic encoding of a character stream having a link table and a plurality of fonts, each font being associated with a unique, ordered character pair, the system encoding a given character by means of the font associated with the pair of characters immediately preceding a given character in the character stream, a method for maintaining fonts that are most recently used, comprising: accepting a pair of characters, having a first character and a second character, each character represented by a single byte; encoding the pair of characters using a CRC algorithm to produce a CRC hash; selecting a first part of the CRC hash as a look-up code; linking, in the link table, fonts that are associated with pairs of characters whose encoding produces the same first part of the CRC hash; entering the hash table with the look-up code to access a linked list of fonts; identifying, from among the fonts in the linked list, the specific font corresponding to the pair of characters, by matching the remainder of the CRC hash; and discarding a font, when a font must be discarded, whose associated character pair was least recently encountered in the character stream.
41. A method for use in a data processing system for finding an object string within a data string comprising: presenting the object string to a CRC algorithm to produce an object string hash code; presenting each of a plurality of candidate strings within the data string to a CRC algorithm to produce a candidate string hash code for each candidate string; identifying candidate strings whose hash code matches the object string hash code; testing a candidate string, whose hash code matches the object string hash code, for a match with the object string.
42. A method for finding the longest match between an object string in a stream of characters and candidate strings in a buffer, comprising: comparing a character in the object string with a character in a first candidate string; comparing, if the prior comparison yields a match, each next character in the object string with each next character in the first candidate string until the comparison fails to yield a match; storing the number of characters so matched as the length of the longest match; comparing a character in the object string with a character in a second candidate string, starting at a character ahead of the origin of each string by a number of characters substantially equal to the length of the longest match.
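The skip-ahead comparison of claim 42, together with the forward and backward searches of claims 14 and 15, can be sketched as below. The function name, the use of `bytes` for the buffer, and the list of candidate start positions (which in the claimed system would come from the hash link table) are assumptions for illustration.

```python
from typing import Iterable, Tuple


def longest_match(buffer: bytes, candidates: Iterable[int], obj: bytes) -> Tuple[int, int]:
    """Return (position, length) of the longest match of obj among candidates."""
    best_pos, best_len = -1, 0
    for pos in candidates:
        if best_len >= len(obj):
            break  # the whole object string is already matched
        probe = best_len
        # Probe first at the offset a longer match would have to reach.
        if pos + probe >= len(buffer) or buffer[pos + probe] != obj[probe]:
            continue
        # Search back: confirm the characters before the probe also match.
        if buffer[pos:pos + probe] != obj[:probe]:
            continue
        # Search forward from the probe to extend the match.
        length = probe + 1
        while (length < len(obj) and pos + length < len(buffer)
               and buffer[pos + length] == obj[length]):
            length += 1
        best_pos, best_len = pos, length
    return best_pos, best_len


if __name__ == "__main__":
    print(longest_match(b"abcabcdabcde", [0, 3, 7], b"abcdx"))
```

A candidate that cannot beat the current best is usually rejected after a single character comparison, because it fails at the probe offset before any of its earlier characters are examined.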
PCT/US1991/005659 1990-08-09 1991-08-08 Compounds adaptive data compression system WO1992002989A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US56515590A 1990-08-09 1990-08-09
US565,155 1990-08-09

Publications (1)

Publication Number Publication Date
WO1992002989A1 (en) 1992-02-20

Family

ID=24257433

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1991/005659 WO1992002989A1 (en) 1990-08-09 1991-08-08 Compounds adaptive data compression system

Country Status (1)

Country Link
WO (1) WO1992002989A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0643491A1 (en) * 1993-08-02 1995-03-15 Microsoft Corporation Method and system for data compression
CN116032292A (en) * 2023-03-27 2023-04-28 山东智慧译百信息技术有限公司 Efficient big data storage method based on translation file

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4730348A (en) * 1986-09-19 1988-03-08 Adaptive Computer Technologies Adaptive data compression system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MINI MICRO SYSTEMS. vol. 21, no. 2, February 1988, BOSTON US pages 77 - 81; BACON: 'How to quadruple dial-up communications efficiency' see page 79, middle column, last paragraph page 81, right column, last paragraph; figure 2 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0643491A1 (en) * 1993-08-02 1995-03-15 Microsoft Corporation Method and system for data compression
US5521597A (en) * 1993-08-02 1996-05-28 Microsoft Corporation Data compression for network transport
CN116032292A (en) * 2023-03-27 2023-04-28 山东智慧译百信息技术有限公司 Efficient big data storage method based on translation file

Legal Events

Date Code Title Description
AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IT LU NL SE