WO1992002989A1

WO1992002989A1 - Compound adaptive data compression system

Info

Publication number: WO1992002989A1
Application number: PCT/US1991/005659
Authority: WO
Inventors: Francis L. Bacon; Ernest R. Price
Original assignee: Telcor Systems Corporation
Priority date: 1990-08-09
Filing date: 1991-08-08
Publication date: 1992-02-20

Abstract

A system for the dynamic encoding of a character stream has a single character encoder that includes a plurality of fonts, a string encoder that includes a history buffer, and an output selector that compares encodings from the single character encoder and the string encoder and selects the least cost encoding for output. The single character encoder generates and stores hash codes used for font access and the string encoder retrieves these same hash codes and uses them for history buffer access.

Description

COMPOUND ADAPTIVE DATA COMPRESSION SYSTEM

DESCRIPTION

Technical Field

The invention relates to the field of data compression systems and particularly to apparatus and methods for compressing data signals and reconstituting the data signals.

Background Art

Data Compression System Requirements

Data compression systems are known in the prior art that encode a stream of digital data signals into compressed digital signals and decode the compressed digital data signals back into the original data signals. Data compression refers to any process that converts data in one format into another format having fewer bits than the original. The objective of data compression systems is to reduce the amount of storage required to hold a given body of digital information or to increase the speed of data transmission by permitting an effective data transmission rate that is greater than the rated capacity of a given data communication link. Compression effectiveness is characterized by the compression ratio of the system. Compression ratio is herein defined as the ratio of the number of bits in the input data to the number of bits in the encoded output data. The larger the compression ratio, the greater will be the reduction in storage space or transmission time.

In order for data to be compressible, the data must contain redundancy. Compression effectiveness is determined by how effectively the compression procedure matches the forms of redundancy in the input data. In typical computer stored data, e.g. English text, computer programs, arrays of integers and the like, redundancy occurs both in the nonuniform usage of individual symbols, e.g. characters, bytes, or digits, and in frequent recurrence of symbol sequences, such as common words, blank record fields, and the like. An effective data compression system should respond to both types of redundancy. A typical data stream contains both types of redundancy in varying proportions, resulting in varying statistics. An example of a data stream of varying statistics is a data stream wherein "normal" English text is immediately followed by a computer program, for example source code in the "C" programming language.

To be of practical and general utility, a digital data compression system must possess the property of reversibility, i.e. it must be possible to reexpand or decompress the compressed data back into its original form without any alteration or loss of information. The decompressed and the original information must be identical and indistinguishable with respect to each other. In addition, it should satisfy several performance criteria. First, the compression effectiveness should be high, and therefore the compression ratio should be large. Second, the system should provide high data rate performance with respect to the data rates provided by and accepted by the equipment with which the data compression and decompression systems are interfacing. For real time, switched network, data communications applications, preferably the rate at which data should be compressed should match the output data rate from the compression system. Because it should match the output (compressed) rate, it should be higher in proportion to the compression effectiveness, typically 6:1. The higher the compression effectiveness, the faster the input data must be processed to provide sufficient output data to fully utilize the capacity of the output channel. Thus high data rate performance of data compression processing is necessary to match the line speed of today's communication systems and the compression effectiveness of modern data compression methods. The data rate performance of data compression and decompression systems is typically limited by the time required to perform the processing steps associated with encoding each incoming character, which in turn is limited by serial processing and the speed of the compression processor. High performance for a given compression processor is achieved by a compression method that uses fewer processing steps, on average, to encode each incoming character. The fewer processing steps, the higher the performance. 
However, complex methods are needed to achieve high compression effectiveness for data streams of varying statistics. Such methods tend to increase the number of processing steps and therefore tend to reduce data compression processing performance.
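
The relationship between compression ratio and the input rate the compressor must sustain can be illustrated with a little arithmetic. The following is a minimal sketch only: the function names and the 9600 bit/s line rate used in the usage note are assumptions chosen for illustration, while the 6:1 ratio is the typical figure mentioned above.

```python
def compression_ratio(input_bits: int, output_bits: int) -> float:
    """Ratio of input bits to encoded output bits, as defined in the text."""
    if output_bits <= 0:
        raise ValueError("output_bits must be positive")
    return input_bits / output_bits


def required_input_rate(line_rate_bps: float, ratio: float) -> float:
    """Input rate (bits/s) needed to keep an output line of the given rate full.

    Assumes the compressor's output exactly fills the line, so the input
    side must run `ratio` times faster than the line.
    """
    return line_rate_bps * ratio
```

For example, under these assumptions a 6:1 ratio on a hypothetical 9600 bit/s link would call for `required_input_rate(9600, 6.0)`, i.e. 57600 bit/s of uncompressed input.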

Third, the system should be adaptable, that is, capable of achieving high compression effectiveness and high performance on data having a variety of statistical characteristics. Many prior art data compression procedures require prior knowledge of the statistics of the data being compressed. Some prior art procedures adapt to the statistics of the data as it is received. Adaptability in the prior art processes has either been limited to a narrow range of variation e.g. character-by-character encoding or has required a high degree of complexity with resultant severe penalty in data rate performance. The requirement for data compression systems suitable for use in modems in high speed data communication links is to accommodate a wide range of data characteristics without prior knowledge of data statistics and achieve both high compression ratios and high data rate performance. Data compression and decompression systems and modems currently available are either not adaptable over a wide range of data characteristics or are severely limited in compression efficiency or data-rate performance and so are not suitable for general purpose usage.

Finally, the system should be responsively adaptable, that is, capable of reestablishing a high compression ratio quickly after the beginning of a new data file from a stream of data files, wherein each file has different statistical properties from the data in the immediately preceding file.

Prior Art Systems

U.S. Patent 4,612,532 to Bacon et al., which is hereby incorporated herein by reference, discloses a system for adaptive compression and decompression of a data stream designed to compress redundancy resulting from non-uniform usage of individual symbology. The Bacon invention uses an adaptive character-by-character compression technique wherein dynamically updated "followset" tables having Huffman codes are used to encode characters, using, on average, far fewer bits per character than is required by ASCII or EBCDIC encoding. Each incoming character is encoded using information from the three preceding characters (character type, character type, character identity), i.e. (two bits, two bits, seven bits). Thus, for each incoming character, information from the three preceding characters is used to select the appropriate followset table. The Bacon invention has a high compression efficiency on a character-by-character basis and achieves high performance by using fewer processing steps, on average, to encode each character than other character-by-character encoding techniques.

U.S. Patent 4,558,302 to Welch discloses a string search system designed to compress redundancy resulting from frequent recurrence of symbol sequences. The Welch invention includes a compressor which compresses a stream of data character signals into a compressed stream of code signals. The compressor stores strings of data character signals parsed from the input data stream and searches the stream of data character signals by comparing the stream to the stored strings to determine the longest match. Having found the longest match, the compressor stores an extended string comprising the longest match plus the next data character signal following the longest match and assigns a code signal thereto. A compressed stream of code signals is provided from the code signals corresponding to the stored longest matches.
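
The longest-match and extended-string procedure summarized above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the cited patent's implementation: the string table is a Python dictionary seeded with the 256 single-byte strings, codes are unbounded integers, and the names `lzw_compress` and `lzw_decompress` are hypothetical.

```python
def lzw_compress(data: bytes) -> list:
    """Emit one code per longest match, then store the match plus next byte."""
    table = {bytes([i]): i for i in range(256)}  # assumed initial string table
    next_code = 256
    codes = []
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate                  # keep extending the match
        else:
            codes.append(table[current])         # code for the longest match
            table[candidate] = next_code         # extended string gets a new code
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list) -> bytes:
    """Rebuild the same string table from the code stream alone."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code refers to the string that is only now being defined.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid code in stream")
        out += entry
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return bytes(out)
```

Because the decompressor derives each new table entry from codes it has already received, no table needs to be transmitted alongside the compressed stream.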

U.S. Patent 4,464,650 to Eastman et al. discloses an adaptive string search system designed to compress redundancy resulting from frequent recurrence of symbol sequences. The Eastman invention uses the Lempel-Ziv algorithm to encode strings of characters without constraint on the length of the input or output word. However, the Eastman invention suffers the disadvantage of requiring numerous RAM cycles per input character and utilizing time consuming and complex mathematical procedures such as multiplication and division to effect compression and decompression. These disadvantages tend to render the Eastman invention unsuitable for on-line data communications applications.

U.S. Patent 4,730,348 to McCrisken discloses a system for adaptive compression and decompression of a data stream using a combination of techniques to compress redundancy from non-uniform usage of individual symbols and frequent recurrence of symbol sequences. The McCrisken implementation uses an adaptive character-by-character compression technique described as "bigram encoding" based on "pruned tree" Huffman and "running bigrams" to compress redundancy resulting from non-uniform usage of individual symbology. As part of his adaptive character-by-character compression technique, McCrisken uses a plurality of encoding tables, on-line analysis of compression efficiency, an on-line table builder, a table changer and a table change code to permit rapid adaptation to changes when compressing data streams having varying statistics. McCrisken also uses a history buffer and a string substitution technique which identifies and further compresses matching strings of up to eighteen characters to compress redundancy resulting from frequent recurrence of symbol sequences. Both techniques are adaptive and therefore do not need prior knowledge of data statistics. In a preferred embodiment, some of the data stream is encoded on a character-by-character basis and some of the data stream is encoded with a string substitution code. McCrisken also uses protocol emulation and packet size control to improve performance. The McCrisken character-by-character compression technique has a low compression efficiency and a poor data rate performance compared to the Bacon method. This is partly because the encoding tables of McCrisken's character-by-character compression technique are updated on the basis of on-line explicit analysis of compression efficiency and this technique is very inefficient compared with the transposition heuristic used by Bacon to update his followset tables. 
McCrisken's use of a string substitution technique compensates to a large extent for the low compression efficiency and poor performance of the adaptive updating of the McCrisken encoding tables. However, the processing required to perform the search for the longest list in the McCrisken implementation is time-consuming and the search is limited, in McCrisken's preferred embodiment, to the first twenty items in the list. Also, because of the McCrisken string substitution code, the longest matching string that can be encoded as such is eighteen characters long (column 14, lines 13-18). Because of these disadvantages, McCrisken does not achieve as good a compression ratio as the Eastman implementation of the Lempel-Ziv algorithm which uses no character-by-character encoding of any kind. Furthermore, because of its complexity the McCrisken implementation is inherently slow.

James A. Storer, in his book Data Compression: Methods and Theory, Computer Science Press, 1988, which is hereby incorporated herein by reference, discusses methods and theories pertaining to lossless data compression over a noiseless channel with serial I/O. Storer describes a family of character-by-character techniques (p.20) and notes (p.21) that (i) the performance of Huffman codes has been well studied and can serve as a useful benchmark on which to judge the effectiveness of more complex methods and (ii) for several applications it will be useful to combine more sophisticated techniques with Huffman codes. A dynamic Huffman codes method is discussed (p.40) in which "tries" (special tree structures - see p.15) are built dynamically and maintained based on characters appearing in the data stream. Storer describes the "unseen leaf" (the equivalent of "new character" in the Bacon patent) but does not describe the floating position characteristic of Bacon's "new character." Higher order Huffman codes are described (p.44) along with the "transposition" heuristic (p.45), correctly attributed to Bacon (p.52).

Storer discusses in detail three on-line textual substitution methods (p.54), all of which use dynamically updated local dictionaries. The three methods are the sliding dictionary method, the improved sliding dictionary method and the dynamic dictionary method. The local dictionary of strings is stored in a "trie" structure (p.15) which is a tree where the edges are labeled by elements of the alphabet in such a way that children of a given parent are connected via edges that have distinct labels, all leaf nodes are labeled as "marked", and all internal nodes are labeled as either "marked" or "unmarked". The set of strings represented by a trie are those that correspond to all root to marked node paths. The sliding dictionary method (p.64) contains within its local dictionary all strings contained within a portion of the source string defined by a "sliding window" technique well known (but used for other purposes) in data communications systems. This method is similar to the method using a history buffer described by McCrisken except for the method of storing pointers to strings. It is a practical realization of the first of two universal data compression algorithms proposed by Lempel and Ziv designated by Storer (p.67) as LZ1. The LZ1 algorithm works as follows. At each stage, the longest prefix of the (unread portion of the) input stream that matches a substring of the input already seen is identified as the current match. Then a triple (d, l, c) is transmitted where d is the displacement back to a previous occurrence of this match, l is the length of the match, and c is the next input character following the current match (the transmission of c is pointer guaranteed progress). The input is then advanced past the current match and the character following the current match. 
The sliding dictionary method can be viewed as a practical implementation of LZ1 that uses fixed size pointers; instead of remembering the entire input stream the system remembers only a fixed number of characters back, and instead of pointer guaranteed progress, the system uses dictionary guaranteed progress by reserving codes for the characters of the alphabet. The improved sliding dictionary method (p.67) contains a heuristic that eliminates duplicate strings. It too requires that the alphabet be added initially in the local dictionary. Storer also suggests using Huffman coding of output pointers. The dynamic dictionary method (p.69) uses update and deletion heuristics that maintain a collection of strings that do not, in general, form a contiguous portion of the input stream. Various update and delete heuristics (i.e. mechanisms which provide learning capability) are described which are used to implement the methods. Both the improved sliding dictionary method and the dynamic dictionary method create and maintain a dictionary that is different from the history buffer of McCrisken. Apart from the heuristic for locating the longest match (Storer's "greedy match heuristic") most of the heuristics described by Storer are directed to the maintenance of pointer sets for the special dictionaries. Difficulties encountered by the use of heuristics such as "pruning" to remove "dead strings" relate also to the special nature of these dictionaries. Storer's experimental data shows that sliding dictionary methods provide significantly better compression ratios than Huffman coding methods especially on spread-sheet data; the improved sliding dictionary method provides a higher compression ratio by 1% to 2% over the sliding dictionary method; and the best performance of the dynamic dictionary methods is better than the best performance of the sliding dictionary and the improved sliding dictionary methods. 
Storer's textual substitution methods provide compression ratios of typically between 3-to-1 and 2-to-1 on English text and between 5-to-1 and 2.5-to-1 on programming language text. U.S. Patent No. 4,876,541 to Storer discloses and claims the AP (all-prefixes) heuristic, modifications of the LRU (least recently used) strategy, limited look ahead, and the use of the MaxChildren parameter.
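
The LZ1 procedure described above, in which each step transmits a displacement, a match length and the next input character, can be sketched as follows. This is a simplified illustration rather than Storer's or Lempel and Ziv's own formulation: the brute-force search, the `window` parameter (standing in for the "fixed number of characters back" of the sliding dictionary method), the restriction to non-overlapping matches, and the function names are all assumptions.

```python
def lz1_encode(data: bytes, window: int = 255) -> list:
    """Return (displacement, length, next_byte) triples for the input."""
    triples = []
    pos = 0
    while pos < len(data):
        best_len, best_disp = 0, 0
        max_len = len(data) - pos - 1        # always leave one byte to send as c
        for cand in range(max(0, pos - window), pos):
            length = 0
            # Match only against input already seen (no overlap with pos).
            while (length < max_len and cand + length < pos
                   and data[cand + length] == data[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_disp = length, pos - cand
        triples.append((best_disp, best_len, data[pos + best_len]))
        pos += best_len + 1                  # skip the match and the byte c
    return triples


def lz1_decode(triples: list) -> bytes:
    """Replay the triples: copy `length` bytes from `disp` back, then append c."""
    out = bytearray()
    for disp, length, next_byte in triples:
        for _ in range(length):
            out.append(out[-disp])
        out.append(next_byte)
    return bytes(out)
```

Sending the literal byte in every triple is what guarantees progress even when no match is found, at the cost of spending bits on a literal after every match.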

Textual substitution methods achieve higher compression ratios with large files and dictionaries. However, as the files and dictionaries grow, so too does the time taken to access and update them. Storer, in his patent, describes a string search data compression system that uses a sliding dictionary that is stored as a tree ("trie") structure. This approach provides fast access to dictionary entries but updating the tree structure loads the processor heavily so Storer uses sophisticated update heuristics. McCrisken describes a string search data compression system that uses a history buffer. The McCrisken approach provides fast updating of the history buffer but, in this case, string matching loads the processor heavily. McCrisken resolves this with arbitrary cut-off of his search process. Practical on-line, prior art, textual substitution techniques are thus limited by the trade-off between the size of the files and dictionaries on the one hand and the speed of the access algorithms and update heuristics on the other. To the extent that access and update processing can be done more efficiently, i.e. faster, then larger files and dictionaries can be maintained with a corresponding improvement in compression ratios for a given data rate.

Disclosure of Invention

The invention provides a system for the dynamic encoding of a character stream. A preferred embodiment of the system comprises a single character encoder which includes a plurality of fonts, a string encoder which includes a history buffer, and an output selector which compares encodings from the single character encoder and the string encoder and selects the least cost encoding for output. The single character encoder generates and stores hash codes which it uses for font access. The string encoder retrieves these same hash codes and uses them for history buffer access. The hash codes are generated by applying a CRC algorithm to a character pair and are given the name "CRC hash". The single character encoder maintains a position in a font for all characters not otherwise listed in the font, such characters herein called "new character", and four tables are maintained for the encoding of such characters. The single character encoder also maintains a position in a font for a symbol representing a string, which position directly follows the position of new character in the font. Three or more consecutive like characters are represented in the history buffer by three characters only. A pair encoder is provided that encodes character pairs using the font number. The pair encoder may be active at the same time as the string encoder. Two string encoding modes are provided. A switch controls activation and deactivation of string search processes based on a comparison of the average bit cost of new character encoding with a predetermined value. A hash-link/hash-test table is provided in the string search encoder having entries corresponding to every second character position in the history buffer. This table uses properties of the CRC hash to access matching strings in the history buffer. String match testing starts "n" characters beyond the current character where "n" is the length of the longest match found so far. 
Accordingly, the string search encoder, in addition to searching forward, also searches back. The string search encoder discards a string match that has less than a predetermined number of characters. Linked lists of pointers to candidate strings are maintained and the end of the linked list is determined using a property of the CRC hash.

Brief Description of the Drawings

Fig. 1A is a block diagram and overview of the main buffers, tables and processes of the preferred embodiment of the present invention.

Fig. 1B shows the two phases of the encoding process of Fig. 1A.

Fig. 2A illustrates the loading of the data stream into the CC buffer.

Fig. 2B illustrates the relationships among the encoding buffers, tables and processes of the preferred embodiment of the present invention.

Fig. 3 illustrates the fonts used in the font encoder.

Fig. 4A shows the global (Huffman) font encoding tables.

Fig. 4B shows the Huffman Tables used for encoding New Character and for encoding String Length in Mode A.

Fig. 4C shows the Huffman Tables used for encoding String Length in Mode B.

Fig. 4D shows the Huffman Tables used for encoding Zone Code in Mode A and Mode B.

Fig. 5 illustrates the font access tables used in the font encoding process.

Fig. 6 illustrates the generation and use of the CRC hash.

Fig. 7A illustrates the new character encoding process.

Fig. 7B illustrates the Pair Encoding, Mode A process.

Fig. 7C illustrates the String Encoding, Mode A process.

Fig. 8A illustrates the use of the history buffer access tables for mode A string encoding.

Fig. 8B illustrates the use of the history buffer access tables for mode B string encoding.

Fig. 9 shows the start points for string searches.

Figs. 10A and 10B show the decoding logic.

Fig. 11 shows the dual processor configuration.

Fig. 12 shows the prior art processor configuration.

Fig. 13 shows a conventional two-processor configuration.

Detailed Description of Specific Embodiments

The present invention in a preferred embodiment combines a novel adaptive font encoding single-character compression technique with a repeat character compression technique and several novel string encoding compression techniques. It includes an adaptive font encoding process that is an improved version of the efficient, high performance font encoding process disclosed by Bacon et al. in U.S. Patent 4,612,532. It includes several novel string encoding processes. It further includes a novel data compressibility trending function which is used to select the most effective encoding process according to the compressibility of the data. The font encoding process and the string encoding process of a preferred embodiment share memory and processes associated with the generation of a novel "CRC hash" using a CRC algorithm, a portion of the CRC hash being used as a hash code for font and dictionary addressing and another portion being used for identification. The present invention achieves superior compression ratios and superior performance over the prior art described above. A copy of the source code of the preferred embodiment of the present invention, expressed in the assembly language of the Rockwell C-19 processor, is attached hereto as Appendix 1. A guide to the source code listing is given in Appendix 2.

A general overview of a preferred embodiment of the system is shown in Fig. 1A. The system provides full duplex operation and it is generally divided into an encoder 1 and a decoder 2 such that each contains its own set of buffers (encoder: PC In Buffer 4, Process Buffer 5, History Buffer 6, and Modem Out Buffer 7; decoder: Modem In Buffer 8, History Buffer 9, and PC Out Buffer 10), character fonts 11 and 12 and access tables 13 and 14. Both the encoder and the decoder are operated by control software 3 that runs on a single, shared Rockwell C-19 processor. Fig. 1A shows the main tasks performed by the encoder software (Load Process Buffers, Do Font Encoding, Update Fonts, Do String Encoding, Select Least-Cost Encoding, Update History Buffers, and Format and Output) 15 and the decoder software (Receive Bit Stream, Interpret Escape Codes, Decode Single Characters and Strings, Load PC Out Buffer, Update History Buffer, and Update Fonts) 16.

Fig. 1B shows the two phases of encoding. Phase 1 processes (steps 1-10) 17, including Loading Process Buffer, Doing Font Encoding and Repeat Character Encoding and Updating Font, are performed once for each character of input. Phase 2 processes (steps 11-20) 18, including String Encoding, Selecting Least-Cost Encoding, Formatting For Output, and Updating Buffers, are performed, typically, when the process buffer is full. Test 19 following Phase 1 is "Process Buffer Full or Flush". Test 20 following Phase 2 is "Flush and Process Buffer not Empty". String encoding includes string encoding mode A, or string encoding mode B which combines string encoding with pair encoding. The decoder performs corresponding decoding processes.

The character stream 20 enters the CC buffer as shown in Fig. 2A. The CC buffer consists of ECChar 21 which contains 256 bytes representing the most recent characters from the data stream and ECCharCopy 22 which contains an identical copy of the content of ECChar. ECCharCopy is provided to remove the necessity for boundary checking in the string matching process. Fig. 2A shows string continuation for searching 23 extending into ECCharCopy. ECCharCopy is contiguous with ECChar in memory. Fig. 2A also shows the next store location in ECChar 24 and in ECCharCopy 25, and old data 26.
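
The benefit of keeping a contiguous copy of the buffer can be seen in a small sketch. The class below is a hypothetical illustration, not the appended assembly source: it stores each incoming byte at the same offset in both halves of a 512-byte array, so a comparison that starts anywhere in the first half can run forward for up to 256 bytes without any wrap-around test. The names `MirroredBuffer`, `store` and `match_length` are assumptions.

```python
BUF_SIZE = 256  # size of ECChar as stated in the text


class MirroredBuffer:
    """Circular buffer (first half) followed by an identical copy (second half)."""

    def __init__(self) -> None:
        self.data = bytearray(2 * BUF_SIZE)
        self.next_store = 0

    def store(self, value: int) -> None:
        # Write the byte at the same offset in both halves.
        self.data[self.next_store] = value
        self.data[self.next_store + BUF_SIZE] = value
        self.next_store = (self.next_store + 1) % BUF_SIZE

    def match_length(self, a: int, b: int, limit: int) -> int:
        """Count equal bytes starting at offsets a and b, with no wrap checks.

        Both offsets must lie in the first half and limit must not exceed
        BUF_SIZE, so every index stays inside the 2 * BUF_SIZE array.
        """
        if not (0 <= a < BUF_SIZE and 0 <= b < BUF_SIZE and 0 <= limit <= BUF_SIZE):
            raise ValueError("offsets or limit out of range")
        n = 0
        while n < limit and self.data[a + n] == self.data[b + n]:
            n += 1
        return n
```

The cost of this layout is one extra store per input character, traded for a simpler inner comparison loop.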

The ECChar and ECCharCopy buffers are two of nine process buffers, shown in Fig. 2B, which operate in parallel and share input and output pointers. These buffers are used by the font encoding and string encoding processes.

Fonts 31 are shown in Fig. 3. Fig. 3 shows a table of fonts 31 having 1024 font numbers 32, an FTLink field 33, an FTMatch field 34, an FTNC field (NewCharPosition) 35, an FTSize field 36, and Font Character fields (6 per Font max) 37. Huffman encoding tables are shown in Figs. 4A-4D. Fig. 4A shows global (Huffman) font encoding tables including an Access Table 41 having an index 42, a Font Code table 43 and a Font Bits table 44. Fig. 4B shows the Huffman Global Code (Frequency) Tables, used for encoding New Character and for encoding String Length in Mode A. The tables have 256 Table Entries, a Code Length of 4-13 bits and are referenced as "Global Code High; Global Code Low" in the source code. Fig. 4C shows the Huffman Tables used for encoding String Length in Mode B. These tables have 10 table entries, a code length of 1-6 bits, and are referenced as "LengthBCode" in the source code. Fig. 4D shows the Huffman tables used for encoding Zone Code in Mode A and Mode B. These tables have 32 table entries, a Code Length of 2-7 bits and are referenced as "ZoneCode" in the source code. Font access tables 51 along with a font table 31 and an input data stream 52 are shown in Fig. 5. The font access tables include a CRC Hash Table 53 having CRC Hash 54, MatchVal data 55 and RoughAdr data 56. The font access tables also include an FTRough Table 59 having an index 57 and FTRough data 58. CRC(^Λv)=2963, CRC(in)=05D6 and CRC(^Λd)=7DD6 provide entry points 501, 502 and 503 respectively to the CRC Hash Table from the input data stream. The history buffer and history buffer access tables used for string search are shown in Figs. 8A and 8B.

The entire compound encoding process includes:

1. Repeat character encoding;

2. Font (single-character) encoding;

3. Monitoring compressibility of data stream;

4. Selecting encoding processes dynamically (mode A or mode B);

5. String encoding (longest match, Mode A);

6. String encoding (longest match, Mode B);

7. Pair encoding;

8. Anti-expansion process (mode B only);

9. Selecting and concatenating encodings having fewest bits.

These processes will now be described in detail starting with font encoding.

Font Encoding

In a preferred embodiment of the present invention, font encoding uses a set of fonts having character symbols stored in approximate order of the frequency of occurrence of such character after the occurrence of a pair of characters with which the font is associated. For example, if the input data stream contained the words "this" and "those", then a font would exist associated with the pair of characters "th" and the font would contain the letter "i" and the letter "o". A single font consists of pointers, links, characters, etc. whose selection (font number) is based on the prior two characters in the input stream and which contains a list of historically occurring candidate characters to be matched with Encoder Current Character. Fig. 3 shows an array of fonts. A single font is illustrated by a single row. "New Character", i.e., any character that is not otherwise listed in the table, is also assigned a position in the table in approximate order of such characters' local frequency of occurrence after the occurrence of a pair of characters with which the table is associated. "New Character" is hereinbelow referred to as "NewChar" and sometimes abbreviated as "NC". Just as the occurrence of a particular character in the data stream is a font encoding event, so the occurrence of NewChar is a font encoding event. NewChar is a font encoding event wherein either the Encoder Current Character is not found in the selected font or the selected font does not exist. The value of NewChar Position is a dynamic value in the range of 0 through n (where n is the maximum number of characters per font) meaning "Character Not in Table". NewChar does not occupy a character position in the font: it is assigned a "virtual position". Fig. 3 shows how the position of NewChar is stored in field FTNC in the font. In mode A, each font includes a virtual position for a string directly following the NewChar position. 
In mode B, each font includes a virtual position for a "pair encoding" directly following the NewChar and includes another virtual position for string encoding following the pair encoding.

Font Encoding, CRC Hash and Font Access

Font access tables are shown in Fig. 5. Fig. 6 shows how the hash pointer (RoughAdr) and the match value (MatchVal) are derived from the CRC hash.

Font encoding includes the following steps:

1. Computing a CRC hash using a CRC algorithm applied to the prior two characters;

2. Using a portion of the CRC hash (RoughAdr) as a rough selector for a linked list of fine entries and using the remaining portion of the CRC hash (MatchVal) to identify a font;

3. Determining and storing the position of the current character in the selected font;

4. Selecting a global Huffman table according to the current size of the font. FTSize from Fig. 5 is used to enter the Access Table of Fig. 4A.

The Font Encoding process occurs once for each character of input data. Fig. 6 shows the data stream 61 including the current character to be encoded "N" and its two predecessors "P" and "S". Encoder Current Character "N" is the most recent character from the input stream which is being processed by the font encoder. At the end of each encoder cycle "Encoder Current Character" becomes Char1Prior and the fetch and encoding process continues with the next character from the input stream as the new Encoder Current Character. In the example of Fig. 6, in the input data stream 61, Encoder Current Character is "N", Char1Prior (character immediately prior to Encoder Current Character) is "P" and Char2Prior is "S".
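
Step 3 above, determining the position of the current character in the selected font, can be illustrated with a simplified model. The sketch below is an interpretation under stated assumptions rather than the preferred embodiment: fonts are held in a dictionary keyed directly by the two prior characters (instead of being reached through the CRC hash), the virtual NewChar position is modelled by splicing a marker into the character list at the index stored in FTNC, and the mode A and mode B virtual positions for strings and pairs are omitted. All class, function and constant names are hypothetical.

```python
NEWCHAR = "<NewChar>"  # hypothetical marker for the virtual NewChar position
MAX_CHARS = 6          # "6 per Font max" from the font table description


class Font:
    def __init__(self, chars=None, nc_pos=0):
        self.chars = list(chars or [])[:MAX_CHARS]  # roughly most frequent first
        self.nc_pos = nc_pos                        # FTNC: NewChar position

    def ordering(self):
        """Character list with the NewChar marker at its virtual position."""
        return self.chars[:self.nc_pos] + [NEWCHAR] + self.chars[self.nc_pos:]

    def position_of(self, ch):
        """Return (is_newchar, position) for the character to be encoded."""
        order = self.ordering()
        if ch in self.chars:
            return False, order.index(ch)
        return True, order.index(NEWCHAR)


def font_event(fonts, char2prior, char1prior, current):
    """Look up the font for the prior pair and locate the current character."""
    font = fonts.get((char2prior, char1prior))
    if font is None:
        return True, None   # no font exists for this pair: a NewChar event
    return font.position_of(current)
```

With a font for the pair "th" holding "i" and "o" and NewChar at position 1, the character "i" would be found at position 0, "o" at position 2, and any other character would be reported as NewChar at position 1.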

After the initial value of the CRC hash is seeded to zero, the CRC hash for the two prior characters ("S" and "P") is created as follows. A CRC function (CCITT polynomial x^16 + x^12 + x^5 + 1) is performed on the character S and then on P, yielding (65) a sixteen-bit CRC result (64, see Fig. 6), hereinbelow referred to as "the CRC hash", indicative of its function in the present invention. The CRC hash is used as follows:

a) The ten least significant bits of the CRC hash are extracted and stored as RoughAdr (62, see Fig. 6) for use as a hash pointer.

b) The six most significant bits of the CRC hash are extracted and stored as MatchVal (63, see Fig. 6) to be used as a match value with the selected font.

c) The CRC hash is also stored for later use in constructing hashes for string encoding.

The CRC hash has two very important properties: i) Its ten least significant bits provide a hash code having excellent statistical properties for use in hashing. ii) The sixteen-bit result produced by every possible two-byte combination is unique. No two-byte combination shares a sixteen-bit result with another two-byte combination so the sixteen-bit result may be used to provide one-to-one mapping with the original two bytes.

Accordingly, the ten least significant bits may be used as a hash code to access a table and the remaining six bits may be used to test if this is the specific font assigned to that exact character pair. The CRC hash is used in font encoding and for history buffer access in string encoding mode A and string encoding mode B. It provides significant benefit in reducing the average amount of processor time consumed in accessing the fonts and history buffer, thereby enabling a given processor to handle higher encoding throughput rates. The use of the CRC hash, as described herein below, by virtue of the throughput rate benefits, also provides a practical realization of trigram font encoding. The combination of the CRC hash and MatchVal will always identify uniquely the font associated with the prior two characters.
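The hash computation and its split into RoughAdr and MatchVal can be sketched as follows. This is a minimal illustration, not the source listing of Appendix 1. The text names only the polynomial and the zero seed, so the bit ordering used here (most significant bit first) is an assumption, and the sketch is not claimed to reproduce the hash values shown in Fig. 5. The function names are hypothetical. MatchVal is kept in place within the high byte, which is how the values quoted in the examples read (hash 2963 hexadecimal gives MatchVal 28 and RoughAdr 163).

```python
def crc16_ccitt_update(crc: int, byte: int) -> int:
    """Feed one byte into a CRC using polynomial x^16 + x^12 + x^5 + 1.

    Assumption: most-significant-bit-first processing. The text does not
    state the bit order, only the polynomial and the zero seed.
    """
    crc ^= (byte & 0xFF) << 8
    for _ in range(8):
        if crc & 0x8000:
            crc = ((crc << 1) ^ 0x1021) & 0xFFFF
        else:
            crc = (crc << 1) & 0xFFFF
    return crc


def pair_hash(char2_prior: int, char1_prior: int) -> int:
    """CRC hash of the two prior characters, seeded with zero."""
    return crc16_ccitt_update(crc16_ccitt_update(0, char2_prior), char1_prior)


def split_hash(crc_hash: int) -> tuple:
    """Return (RoughAdr, MatchVal) for a sixteen-bit CRC hash.

    RoughAdr is the ten least significant bits. MatchVal keeps the six most
    significant bits in their original place within the high byte.
    """
    rough_adr = crc_hash & 0x03FF
    match_val = (crc_hash >> 8) & 0xFC
    return rough_adr, match_val


# Values quoted in the text: hash 2963 -> MatchVal 28, RoughAdr 163 (hex).
rough, match = split_hash(0x2963)
print(f"{rough:03X} {match:02X}")
```

Because a sixteen-bit CRC seeded with zero maps the 65,536 possible two-byte inputs onto 65,536 distinct results, RoughAdr and MatchVal together identify the character pair exactly, which is the one-to-one property relied on above.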

We found experimentally that the use of all sixteen bits of the prior two characters to identify a font gives an 8%-12% improvement in font encoding compression efficiency on "normal" English text when compared with the Type/Type/Prior Character method described in U.S. Patent 4,612,532 to Bacon et al. We also found experimentally that use of ten bits from the CRC hash, in the manner described hereinabove, produces fewer synonyms and therefore reduces execution time. This benefit is achieved because less time is spent linking through the fonts via the FTLink fields (see Fig. 5).

Huffman Encoding Tables

Fig. 4A shows a set of global Huffman tables and the associated access table. The Access Table 41 is indexed by Font Size 42 and contains pointers to the several Huffman tables for Font Code 43 and Font Bits 44 (the bit cost of the encoding). The Access Table is named "Encoding Table" in the source code. Index 0, and the corresponding Font Code (1,0) and Font Bits (1,1), are not used. ECFontIndex is computed and stored during font encoding. Later, during string encoding, FontBits is retrieved and, during the output process, FontCode is retrieved. Figs. 4B, 4C and 4D each show a single Huffman table. Fig. 4B shows the table used for new character encoding, for string length encoding and for repeats encoding. Fig. 4C shows the tables used for string length, mode B encoding. Fig. 4D shows the table used for the zone portion of string address encoding, mode A and mode B.

Font Encoding, Example 1. Finding the Current Character in the Current Font

Referring now to Fig. 5, let us consider the encoding of the following string:

"^Veni, Vidi,^Vinci.^A^do"

In this string the caret character "^" has been substituted for the space character " " to reduce ambiguity. Fig. 5 shows the static state of the Font Encoding and Access Tables directly after processing the string. Beginning at an initial state having empty fonts, the process of encoding the first character proceeds as follows.

1. Initialization and Assignment of the First Font. As described above, each new character to be encoded is associated with a CRC hash. The ten least significant bits of the CRC hash 56 are used as a pointer to the ECFTRoughTable 59 (Encoder Font Rough Table). Since all fonts are empty at the outset, the ECFTRoughTable is initially null, indicating the need for new font creation. A font number is assigned and stored in the ECFTRoughTable in the position pointed to by the hash ("000" in the example given in Fig. 5). This font number is either the next available not-in-use font or an old font selected as described later.

The newly created font is initialized as follows:

FTLink = NULL

FTMatch = MatchVal from CRC calculation

FTNC = 0 (Most frequent)

FTSize = 1

First Font Character = Encoder Current Character

Other Font Characters = N/A

Following table reset, the first character to be processed is the "^". The prior two characters and the CRC are assumed to be 0. Thus a MatchVal and ten-bit RoughAdr of 0 are used. This points to FTRoughTable entry number 0 (which was initially null) and font number 1 was assigned. Font number 1 was initialized as specified above and has not changed since, as indicated by Fig. 5.

2. Finding the Current Font and the Current Character in the Font

a. The current font is accessed as follows. When "e" becomes the current character, a CRC hash is performed on " " and "V". The result is hexadecimal 2963 (third row of the hash table in Fig. 5), giving a MatchVal of 28 and a RoughAdr of 163.

b. The RoughAdr of 163 is used to enter the FTRough Table and yield the font number 0003, the font to be tested to determine if it is the font "^V".

c. To test if the selected font is the font "^V", MatchVal is compared with the FTMatch from the selected font. If these are equal, the font is searched for the occurrence of Encoder Current Character.

3. Storing the Current Character

a. If the current character is found, its position, the size of the font and other pertinent data are stored in the process buffers for later use by the encoding selection process. The character matching Encoder Current Character is promoted towards the top of the table (higher frequency) by exchange with the next higher frequency entity (character or NC).

b. If the current character is not found, it is added to the table in the next available position (overwriting the last character when the table is full) and the table size is incremented (if not full). The NC value is promoted one position towards the top of the table unless already at the top (highest frequency).

Font Encoding, Example 2. Finding the Current Font Using the Link Table

If FTMatch does not equal MatchVal, FTLink is examined. If FTLink is null, then the FTLink field is assigned the next font number and the flow joins step 1 above for the creation of a new font. If FTLink is not null, control proceeds to FTMatch comparison in step 2 with the FTLink field as the new font number. Linking and match comparison repeat until either the desired font is found or a new one is created.

The last line of the input data stream in Fig. 5 details the "o" character from the sequence "A^do". The calculated CRC hash for "^d" is 7DD6, which yields a MatchVal of 7C and a RoughAdr of 1D6. Note that the sequence "in", seven characters earlier, produced a CRC hash of 05D6, a MatchVal of 04 and a RoughAdr of 1D6. Access to entry 1D6 in the ECFTHashRough Table yields a pointer to font number 000C, but the FTMatch field in font 000C does not equal the desired value of 7C. At that point in time, the FTLink field of font number 000C was set to NULL. Consequently, font number 0013 was assigned, set to initial state and the character "o" was added to it. A future occurrence of the sequence "^d" can link to font number 0013 via font number 000C and search for or add characters as required.
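Examples 1 and 2 can be sketched together as follows. This is an illustrative reading of the text, not the patented implementation: the class and method names, the dictionary standing in for the rough table, the zero-based font numbers and the capacity MAX_CHARS are all assumptions. The promotion rule treats NewChar as a virtual slot that sits just ahead of the character whose raw position equals FTNC.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

MAX_CHARS = 16  # hypothetical font capacity (the "n" of the text)


@dataclass
class Font:
    match: int                                       # FTMatch
    link: Optional[int] = None                       # FTLink
    nc: int = 0                                      # FTNC, NewChar position
    chars: List[str] = field(default_factory=list)   # FTSize is len(chars)


class FontStore:
    def __init__(self) -> None:
        self.rough = {}               # RoughAdr -> first font number in chain
        self.fonts: List[Font] = []

    def _new_font(self, match_val: int) -> int:
        self.fonts.append(Font(match=match_val))
        return len(self.fonts) - 1

    def find_or_create(self, rough_adr: int, match_val: int) -> Tuple[int, bool]:
        """Return (font number, created) by following the FTLink chain."""
        if rough_adr not in self.rough:
            num = self._new_font(match_val)
            self.rough[rough_adr] = num
            return num, True
        num = self.rough[rough_adr]
        while self.fonts[num].match != match_val:
            nxt = self.fonts[num].link
            if nxt is None:           # end of chain: create and link a new font
                nxt = self._new_font(match_val)
                self.fonts[num].link = nxt
                return nxt, True
            num = nxt
        return num, False

    def update(self, num: int, ch: str) -> Optional[int]:
        """Return the raw position of ch before promotion, or None for NewChar."""
        font = self.fonts[num]
        if ch in font.chars:
            pos = font.chars.index(ch)
            if pos == font.nc:
                font.nc += 1          # exchange with the NewChar virtual slot
            elif pos > 0:             # exchange with the character just above
                font.chars[pos - 1], font.chars[pos] = ch, font.chars[pos - 1]
            return pos
        if len(font.chars) < MAX_CHARS:
            font.chars.append(ch)
        else:
            font.chars[-1] = ch       # overwrite the last character when full
        font.nc = max(0, font.nc - 1)  # promote NewChar unless already at top
        return None
```

Used on the collision described above, a first call with RoughAdr 1D6 and MatchVal 04 creates a font, and a later call with the same RoughAdr and MatchVal 7C creates a second font reached through the first font's link.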

Font Encoding, Example 3. New Character

When a character is encountered in the data stream that does not appear in the font defined by its prior two characters, it is encoded using one of four frequency encoding tables.

Fig. 7A shows the encoding of character "w" which follows, in the character stream 701, "No". As shown in Fig. 7A, looking at Font (No) 702, "w" does not appear, and NC 703 = 2, indicating that "New Character" has a virtual position between the position of "v" and the position of "n" in the font. Also Font (No) contains four characters so SZ 704 = 4. The two Global Font Encoding Tables shown in Fig. 7A 710 are two of the tables from Fig. 4A, corresponding to font size SZ = 4 (from Font (No)) + 2 (for NC and ST in mode A) 713 or SZ = 4 + 3 (for NC, PE and ST in mode B) 714. Mode A font size = (SZ) + 2. Mode B font size = (SZ) + 3. Position "2" 715 in these tables yields 709 the bit string "000" in the Global Font Encoding Tables 710 for either Mode A 711 or Mode B 712. String "000" will be transmitted by the encoder and will be recognized by the decoder as the "new character escape". This will indicate to the decoder that the next bits to be received are the encoding of a new character.

In a preferred embodiment, there are four NC to Frequency Encoding tables 705, identified as 00, 01, 10 and 11. Bits 5 and 6 706 from the prior character (in this example "o", and "o" = 6F in hexadecimal) are used to select one of these four NC to frequency tables (in this example NC to FreqTable 11 707). The binary value of "w" (77 in hexadecimal, 708 in Fig. 7A) is used to enter the selected NC to Frequency table, yielding a position (or frequency) of 15, which defines an entry into the Global Code High/Low Table 716. This table, in turn, yields the Huffman code 01111, the font encoding of new character "w" following "No". The output bit stream sequence 717 is therefore 000 (font) followed by 01111 (frequency). The use of four tables, instead of the one table described in U.S. Patent 4,612,532 to Bacon et al., is found to improve compression efficiency. Of course, more or fewer than four tables could be used.

Process Buffers

The process buffers, shown in Fig. 2B, consist of nine "First In/First Out" buffers 201-209, each having 256 locations, which operate in parallel and share input and output pointers. These buffers are used by the font encoding and string encoding processes. Fig. 2B shows the flow of font encoding data among the process buffers and various tables. The contents and significance of the several buffers are as follows:

The ECChar buffer 203 contains the most recent 256 characters from the input stream to be encoded. Characters are received singly from the input stream, placed in rotation in ECChar, font encoded, and later string encoded. Least-cost selection and output formatting follow. The value range of ECChar is 0 - 255.

The ECCharCopy 202 buffer contains an exact copy of the ECChar buffer. It is contiguous with ECChar to facilitate string searching. The value range of ECCharCopy is 0 - 255.

ECType 209 is a steering value which is set by the font encoding and/or the string encoding process. ECType is used by the output format process to control the output bit stream. ECType may have any one of the following values:

0 - String or pair continuation (the second or subsequent character of a mode A string or a mode B string or the second character of a pair encoding).

2 - Font encoding. The encoding is the relative offset of the character in the selected Font.

4 - New character.

6 - First character of a pair encoding.

8 - First character of a string encoding.

ECFontIndex 207 is the zero relative index into the FontCode or FontBits tables for this character. By using the value of ECFontIndex as an index, either the encoding size in bits or the actual encoding bit pattern can be accessed quickly. The value range of ECFontIndex is 2 - 43 as shown in Fig. 4A.

ECFrequency 208 is the frequency value of the character. It is obtained by using the character as an index into the NC to FreqTables (Fig. 7A) . The value range of ECFrequency is 0 - 255.

ECHashRaw0 2040 contains the eight least significant bits of the CRC hash computed from the prior two characters in the input stream. The value range of ECHashRaw0 is 0 - 255. Data is shown in hexadecimal in Fig. 2B.

ECHashRaw1 2041 contains the eight most significant bits of the CRC hash computed from the prior two characters in the input stream. The value range of ECHashRaw1 is 0 - 255.

ECHashX20 2050 contains the eight least significant bits of zero relative font number multiplied by two. This value is maintained for quick access to the ECFTHashNext table. The value range of ECHashX20 is 0 - 254, even numbers. Data is shown in hexadecimal in Fig. 2B.

ECHashX21 2051 contains the eight most significant bits of zero relative font number multiplied by two. This value is maintained for quick access to the ECFTHashNext table. The value range of ECHashX21 is 0 - (((MaxFontTable-1)*2)/256).

ECNewIndex 206 is the zero relative index into FontCode or FontBits representing the New Character position in this Font. The value of ECNewIndex is derived from Font Size and font-relative new character position. (During font encoding, ECNewIndex is computed and stored. Later, during string encoding, FontBits is retrieved and, during the output process, FontCode is retrieved. See Fig. 4A.) Similarly for pair encoding and/or string escapes, the value of ECNewIndex is incremented by 1 or 2 and the bit cost or pattern quickly determined. The value range of ECNewIndex is 2 - 41.

ECRepeats 201 is the count of repeats of this character beyond two. That is, the two prior characters are the same as this one. The buffer pointer will not advance as long as subsequent input characters remain the same and ECRepeats is less than or equal to 255. The value range of ECRepeats is 0 - 255.

Font Encoding Process Flow

Font encoding process flow is shown in Fig. 1B, first phase, steps 1 through 10. Font encoding and font update processing are performed in steps 1 through 10. This series of steps occurs once for each character of input. In this process, known as "refill", a character is added to the process buffer and the current input pointer is advanced by one. The steps (shown in Fig. 2B as S1, S2, S3, etc. corresponding to step 1, step 2, step 3, etc.) are as follows:

1. A character from the input stream is fetched and stored in the current input ECChar field 210.

2. The same character is stored in the current ECCharCopy field 211. (The relationship of ECChar and ECCharCopy is shown in Fig. 2A).

3. The current character is compared with the two prior characters in the input stream. If equal, the ECRepeats field is incremented (e.g. 212 in Fig. 2B) and, if the ECRepeats field is less than or equal to 255, flow proceeds to step 1 above. This loop ensures that no more than three consecutive like characters are stored in the history buffer (except when the number of consecutive like characters exceeds 258).

4. The CRC hash is computed on the two prior characters in the input stream (as described under "Font Encoding, CRC Hash and Font Access" hereinabove) and the result is stored in the low and high bytes of ECHashRaw 213 for later use.

5. The appropriate font 214 is accessed or created (as described hereinabove). If the font exists, the font number from FT Rough Table 215 is stored in the ECHashX2 table high and low bytes (217 and 216) and the character fetched in step 1 above is looked up in the font 31. If the font is created (new font), ECHashX2 is set to NULL.

6. Using SZ (the number of characters in the font) from the accessed font, the Access Table of Fig. 4A 41 is accessed for a pointer 218 to be used as an index value. Neither the FontCode nor the FontBits table is used at this time.

7. The index value fetched in step 6 is added to the NC (NewChar position) 219 from the font accessed in step 5. The result 220 is stored in the ECNewIndex for later use as a NewChar or String Escape. If the current character (from step 1) was not found in the accessed font, the ECType field is set to 4 denoting a NewChar encoding.

8. If the current character (from step 1) was found in the accessed font, the raw position 221 of that character in the font is added to the index value 222 fetched in step 6 and the result 223 stored in the current ECFontIndex field. If the character position is greater than or equal to the NC (NewChar) value from the font, the ECFontIndex field is incremented by two if in Mode A and by three if in Mode B, allowing for the virtual positions of the NewChar, Pair Encoding and/or String Escapes. The ECType field is set to 2 224 denoting that the character was found in the font, a "Font Encoding".

9. If in Mode B or if the current character (from step 1) was not found in the font, the appropriate one of four ECNCFrequency tables 225 (selected from bits 5 and 6 of the immediately prior character) is selected (Fig. 7A). The frequency value 226 corresponding to the current character is fetched from the selected table and stored in the current input position of the ECFrequency field 227. This is for later use as a new character encoding or for 8-bit output in antiexpansion mode.

10. The current input pointer 228 into the process buffer is incremented by one. If the number of characters in the process buffer array is now 256, or the Font Trending Switch changed from Mode A to B (or vice versa), or a timer-initiated flush occurs, flow proceeds to step 11 below for string processing and output. Otherwise flow proceeds to step 1 above.
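The index arithmetic of steps 7 through 9 can be sketched as follows. The function names are hypothetical, `base` stands for the pointer fetched from the Access Table in step 6 (its actual values come from Fig. 4A and are not reproduced here), and numbering bits from 0 at the least significant end is an assumption. That numbering selects table 11 for the character "o" (6F hexadecimal), in agreement with the Fig. 7A example.

```python
def virtual_slots(mode: str) -> int:
    """Virtual positions per font: NC and ST in mode A; NC, PE and ST in mode B."""
    return 2 if mode == "A" else 3


def table_size(sz: int, mode: str) -> int:
    """Font size used to choose a global Huffman table: SZ + 2 or SZ + 3."""
    return sz + virtual_slots(mode)


def new_index(base: int, nc: int) -> int:
    """Step 7: ECNewIndex is the Access Table pointer plus the NewChar position."""
    return base + nc


def font_index(base: int, raw_pos: int, nc: int, mode: str) -> int:
    """Step 8: ECFontIndex, skipping the virtual positions at or after NC."""
    idx = base + raw_pos
    if raw_pos >= nc:
        idx += virtual_slots(mode)
    return idx


def nc_table_selector(char1_prior: int) -> int:
    """Step 9: choose one of four NC-to-frequency tables from bits 5 and 6.

    Assumption: bit 0 is the least significant bit of the prior character.
    """
    return (char1_prior >> 5) & 0b11


# Font (No) of Fig. 7A: SZ = 4, NC = 2, prior character "o".
print(table_size(4, "A"), table_size(4, "B"), nc_table_selector(ord("o")))
```

With SZ = 4 and NC = 2, the characters at raw positions 0 and 1 keep their positions, NewChar takes position 2, and the characters at raw positions 2 and 3 are shifted past the virtual positions.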

Steps 11 through 20, including string search, least cost encoding selection and output, are described hereinbelow under "Second Phase Processing".

Font Reallocation: As input context changes, old fonts go out of use and new ones are created. Since there is a limit to the number of practical (actual) fonts in a preferred embodiment (e.g. 1024), a method for reassigning fonts is required. In the preferred embodiment this is a circular (low to high then back to low) replacement heuristic. An alternative embodiment may also include a "less recently used" heuristic. The next three paragraphs describe the combination. (The source listing of Appendix 1 details the circular heuristic only). Since the fonts are linked in chains starting at FTRoughTable and forward-only linked via FTLink, the circular reallocation process points into the FTRoughTable advancing from 0 through 1023 and back to 0. The selected font, and subsequently linked fonts (if any) as indicated by FTLink, are examined for potential reuse.

Each time a font is accessed by the previously described Font Encoding Process, an unused bit of the FTHashNext field is set to 1 indicating activity. As the reallocation process traverses the fonts, it will reset the activity bit if it is set and link to the next candidate font. If the activity bit is reset, the font will be reallocated as a new font. By the use of the single activity bit, any given font has the opportunity to survive permanently provided that it is used at least once per pass of the reallocation search process.
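The activity-bit scan resembles what is commonly called a second-chance or clock replacement scheme. The following is a minimal sketch under stated assumptions: fonts are held in a dictionary keyed by font number, the activity bit is modelled as a separate boolean rather than a spare bit of a link field, and the unlinking of a victim from the middle of a chain is an assumption (the text only illustrates a victim at the head of a chain). The name pick_victim is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

ROUGH_SIZE = 1024


@dataclass
class Font:
    link: Optional[int] = None   # FTLink
    active: bool = False         # activity bit, set on each font access


def pick_victim(rough: list, fonts: dict, start: int) -> tuple:
    """Return (font number to reuse, rough-table slot where the scan stopped).

    rough[slot] holds the first font number of a chain, or None.
    """
    slot = start
    # One pass clears every set bit, so a second pass must find a victim.
    for _ in range(2 * ROUGH_SIZE + 1):
        prev = None
        num = rough[slot]
        while num is not None:
            font = fonts[num]
            if font.active:
                font.active = False          # give the font a second chance
                prev, num = num, font.link
                continue
            if prev is None:
                rough[slot] = font.link      # victim was the head of the chain
            else:
                fonts[prev].link = font.link
            font.link = None
            return num, slot
        slot = (slot + 1) % ROUGH_SIZE
    raise LookupError("no fonts are allocated")
```

A font that is accessed at least once between two visits of the scan always has its bit set when examined, so it is passed over, which matches the survival property stated above.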

For example, referring to Fig. 5, assume that the main reallocation pointer is pointing to the FTRoughTable at hexadecimal 1D6. The FTLink field of font 000C will be examined for the activity bit. Assuming it to be reset, font 000C will be the next assigned for a new font. This is done by copying the contents of the FTLink field (in this case 0013) into the FTRoughTable at 1D6, thus freeing font 000C. The reallocation pointer is moved to 0013 for use in the next allocation cycle.

String Encoding

The string encoder of the present invention uses a circular history buffer to store a sliding dictionary. The history buffer is a dictionary of all the strings it contains. String encoding may operate in one of two modes, mode A (using the tables in Fig. 8A) for use on relatively compressible text or mode B (using the tables in Fig. 8B) for use on less compressible text. In both modes, string encoding is designed to achieve near-optimum compression efficiency under the time constraints of on-line operation. The history buffer is tagged at regular intervals and, in a preferred embodiment, is tagged every second character position. The string encoder of the present invention also uses a novel dictionary access structure having a set of tables for accessing the history buffer. Updating the history buffer involves very little processing because it involves no more than accepting the next character and incrementing a pointer. However, updating the dictionary access structure is as challenging a problem as updating the sliding dictionary in string encoding systems which store the sliding dictionary as a tree structure. The present invention addresses this problem by the use of a novel history buffer access method. The method is based on the structure of the history buffer access tables as shown in Figs. 8A and 8B and it retrieves and uses the same CRC hash codes created and used in the font update process during font encoding. Accordingly, by use of this method, updating of the dictionary access structure is faster and requires less processing than updating a tree structure would require.

The use of a tagged history buffer provides additional benefit for accessing and matching strings. String encoding mode A, using a tagged history buffer, locates longer strings in a shorter time than earlier methods. While the process searches the same number of candidates, the process encounters shorter linked lists in the access buffers than would otherwise occur. Processing time spent building and searching access tables is beneficially reduced.

The history buffer/dictionary access structure, in a preferred embodiment, includes a history buffer and access tables. The history buffer and the access tables shown in Figs. 8A and 8B are used by the string encoding process of mode A and the string encoding process of mode B respectively. Both Figs. 8A and 8B show an ECRR (History) Buffer (1 byte wide) 81 with a Next Available Buffer Position 82 and an ECRR Suffix (256 positions) 83. Both figures show an ECRR Hash Head Buffer (2 bytes wide) 84. Both figures show an ECRR Hash Buffer containing an ECRR Hash Link portion (2 bytes wide) 85 and an ECRR Hash Test portion (2 bytes wide) 86. Both figures show the derivation 87 and 88 of the CRC hash used as entry to the ECRR Hash Head Buffer 84.

Both string encoding mode A and string encoding mode B use the CRC hash created earlier during font encoding and stored in the ECHashRaw table (see Fig. 2B). However, each of these processes uses the CRC hash in a slightly different way. String encoding mode B uses the CRC hash (a hash based on two consecutive characters) directly. String encoding mode A uses a novel algorithm (which includes the CRC hash) to create a hash based on two consecutive pairs of characters (four consecutive characters) as illustrated by the following example for the four characters "THEY":

"TH" [CRC hash] yields XXXX (16 bits)

"EY" [CRC hash] yields YYYY (16 bits)

XXXX ⊕ (0-YYYY) yields ZZZZ (16 bits)

where ⊕ is exclusive OR, 0-YYYY is zero minus YYYY and ZZZZ is the resultant hash.

String Encoding. Mode A
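The four-character hash used by mode A can be sketched as follows. The function name is hypothetical, and treating "zero minus YYYY" as sixteen-bit wraparound arithmetic is an assumption consistent with the sixteen-bit result stated above.

```python
def mode_a_hash(crc_first_pair: int, crc_second_pair: int) -> int:
    """Combine two pair hashes: ZZZZ = XXXX xor (0 - YYYY), in sixteen bits."""
    negated = (0 - crc_second_pair) & 0xFFFF   # zero minus YYYY, wrapped
    return (crc_first_pair ^ negated) & 0xFFFF


# Arbitrary illustrative inputs, not values taken from the figures.
print(f"{mode_a_hash(0x1234, 0x5678):04X}")
```

One visible effect in this sketch is that the result depends on the order of the two pairs, which a plain exclusive OR of XXXX and YYYY would not.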

Every second character position in the history buffer is tagged and the tags are used to index the string search process. Each tagged position has corresponding Hash Link and Hash Test fields. String encoding for mode A includes the following steps:

1. Set LookAhead = 3. (Fig. 9 shows a Data Stream 91, a History Buffer ECRR 92 with a Next Available Buffer Position 93, a CC Buffer 94 with a Current Character 95, and a Pointer "p" 96. The pointer 96 is shown for Mode A to have a First Start Point for String Search 901 displaced 3 characters from the position of the current character and a Second Start Point for String Search 902 displaced 2 characters from the position of the current character.) Set pointer p to CCBuffer pointer (ECNextChar pointer in Fig. 2) plus a number of characters equal to LookAhead.

2. Create the hash for the string of four characters starting at the "p"th character as described hereinabove.

3. Use the least significant eleven bits of the hash (ZZZZ in the example above) as a pointer (e.g. 1811 in Fig. 8A) to enter ECRR Hash Head Table of Fig. 8A. Set pointer "h" to the first potential match by using the contents of ECRR Hash Head field (e.g. 7300 in Fig. 8A) to point to the most recent four-character string in the history buffer, starting at a tagged location, that hashes to that same hash.

4. Find the longest match:

a) Set n = 3

b) Set x = 0

c) Compare (one character at a time) the character at (p + n - x) in the CC buffer with the character at (h + n - x) in the history buffer, incrementing x by 1 until x = n or no match. The "fast reject step" is when x = 0.

d) Increment n by 1 and compare the character at (p + n) in the CC buffer with the character at (h + n) in the history buffer until no match.

Continue to search for the longest match as follows. Use pointer "h" to enter the ECRRHashLink table (at 7300 in Fig. 8A). Reset pointer "h" from the content of the ECRRHashLink table so that pointer "h" points to the next most recent four-character string in the history buffer (7284 in Fig. 8A). In each search, using steps b through d above, begin comparing characters for match starting at character n, where n is the length of the current longest match. Continue until the end of the linked list, as indicated by a non-match of the hash with the corresponding entry in the ECRR Hash Test field or, to prevent looping, until MaximumASearches (eight in the preferred embodiment) have been performed.

Store length and location of longest match if n (length of longest match) > 3.

5. Backmatch, as follows, to maximize the length of the string:

a) First time through (LookAhead = 3), check until no match: the character preceding the first character, the character preceding that and then the character preceding that (the current character).

b) Within repeat steps (from step 6, LookAhead = 2), check until no match: the character preceding the first character and then the character preceding that (the current character).

6. Repeat steps 1-5 with LookAhead = 2.

7. Select from the outputs of steps 5a and 5b the string which:

a) is the longest;

b) if the strings are equal, the one that is most recently stored.

The following advantages follow from the structure and method of string encoding mode A:

a) History buffer update processing time is reduced when the history buffer is accessed at fewer entry points than every character. In the preferred embodiment, the history buffer update processing time is reduced by a factor of two because the history buffer access table update takes place every second character instead of every character.

b) String search processing time is reduced when the history buffer is accessed at fewer entry points than every character. In the preferred embodiment, the linked list to be searched is, on average, only one-half the size it would otherwise be (the list is drawn from a population of candidates only one-half the size it would otherwise be).

c) Less memory is required for the ECRR Hash-Link/Hash-Test Table because, in the preferred embodiment, it is only one-half the size it would otherwise be.

d) The end of the linked list is determined dynamically by comparing the current hash code with the content of the ECRR Hash Test field. Thus the need to maintain end of list pointers or link length pointers or the like is eliminated. Because the end of the linked list is determined dynamically, no maintenance is required for the overwritten string.

e) Non-matches are eliminated faster and with fewer processing steps because each search starts at p + n. This "fast reject" technique ensures that the candidate string is rejected immediately if it cannot be at least one character longer than the previous longest match.

String Encoding. Mode B

Every second character position in the history buffer is tagged and the tags are used to index the string search process. Each tagged position has corresponding Hash Link and Hash Test fields. String encoding for mode B includes the following steps:

1. Set pointer p to CCBuffer pointer + 1 (Fig. 9 shows a Data Stream 91, a History Buffer ECRR 92 with a Next Available Buffer Position 93, a CC Buffer 94 with a Current Character 95, and a Pointer "p" 96. The pointer 96 is shown for Mode B to have a First Start Point for String Search 903 displaced 1 character from the position of the current character and a Second Start Point for String Search 904 coincident with the position of the current character).

2. Retrieve the CRC hash from ECHashRaw (Fig. 2B) for the string of two characters starting at the "p"th character.

3. Use the least significant eleven bits of the (16 bit) CRC hash as a pointer (2048 positions) to enter ECRR Hash Head Table (0012 in Fig. 8B) . Set pointer "h" to the start of the first potential match by using the contents of ECRR Hash Head field (0006 in Fig. 8B) to point to the most recent two-character string in the history buffer, starting at a tagged location that hashes to a CRC hash that has the same least significant eleven bits (ZQ in Fig. 8B) .

4. Find the longest match having three or more characters:

a) Compare the character at (p - 1) in the CC buffer with the character at (h - 1) in the history buffer and terminate if no match. This is the "fast reject step".

b) Set n = 0

c) Compare (one character at a time) the character at (p + n) in the CC buffer with the character at (h + n) in the history buffer, incrementing n by 1 until no match.

Continue to search for the longest match as follows. Use pointer "h" to enter the ECRRHashLink table (at 0006 in Fig. 8B). Reset pointer "h" from the content of the ECRRHashLink table so that pointer "h" points to the next most recent two-character string in the history buffer (3750 in Fig. 8B). In each search, use steps 4a through 4c above (or steps 5a through 5c below). Continue until the end of the linked list, as indicated by a non-match of the hash with the corresponding entry in the ECRR Hash Test field or, to prevent looping, until MaximumBSearches (sixteen in the preferred embodiment) have been performed. Store length and location of longest match if n (length of longest match) > 2.

5. Set p to CCBuffer pointer (the current character) and repeat steps 2 through 4, using the following steps a, b and c instead of steps 4a, 4b and 4c to find the longest match:

a) Compare the character at (p + 2) in the CC buffer with the character at (h + 2) in the history buffer and terminate if no match. This is the "fast reject step".

b) Set n = 0

c) Compare (one character at a time) the character at (p + n) in the CC buffer with the character at (h + n) in the history buffer, incrementing n by 1 until no match.

6. Select from the outputs of step 4 and step 5 the string which:

a) is the longest;

b) if the strings are equal, the one that is most recently stored.

String Length Encoding

String lengths are encoded differently for mode A string encoding and mode B string encoding.

In mode A, string lengths are encoded using the GlobalHigh/Low table. Further, the encoding is slightly different depending upon the method of string escape: i) If the escape follows creation of a font, MinimumAString (which is 6) is subtracted from the actual length of the string and the result is used to index the GlobalHigh/Low table. ii) If the escape follows an old (existing) font, MinimumAString (which is 6) is subtracted from the actual length of the string, four is added, and the result is used to index the GlobalHigh/Low table. The addition of four is required because the bit pattern 11, which begins the first four entries in the GlobalHigh/Low table, is reserved to signify a Pair Encoding. The selected Huffman pattern from the GlobalHigh/Low table is placed into the output stream. String length encoding, mode A, old font, is illustrated as the second operation in Fig. 7C.
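The index arithmetic for string lengths can be sketched as follows, covering the mode A rules above and the mode B rule stated in the paragraph that follows. The function names and the returned tuple layout are hypothetical. The Huffman tables themselves are not reproduced, so the sketch stops at the table name and index.

```python
MINIMUM_A_STRING = 6
MINIMUM_B_STRING = 3


def mode_a_length_index(length: int, new_font: bool) -> int:
    """Index into the GlobalHigh/Low table for a mode A string length."""
    idx = length - MINIMUM_A_STRING
    if not new_font:
        idx += 4      # skip the four entries reserved for Pair Encoding
    return idx


def mode_b_length_code(length: int) -> tuple:
    """Return (escape bits, table name, index) for a mode B string length."""
    result = length - MINIMUM_B_STRING
    if result < 9:
        return ("", "LengthBCode", result)
    return ("0010", "GlobalHigh/Low", result - 9)


print(mode_a_length_index(6, new_font=False), mode_b_length_code(12))
```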

In mode B, string lengths are encoded by subtracting MinimumBString (which is 3) from the actual length of the string. If the result is less than nine, the LengthBCode table is used to encode the string length. If the result is greater than or equal to nine, the further escape 0010 is output, an additional nine is subtracted from the result above, and the new result is used to index the GlobalHigh/Low table. The selected Huffman pattern from the GlobalHigh/Low or LengthBCode table is placed into the output stream.

String Pointer Encoding

String pointer encoding for both mode A and mode B proceeds as follows:

The history buffer location of the first string character is subtracted from the Next Buffer Store location. Buffer wraparound, if any, is corrected such that the result is the displacement from the found string to the Next Buffer Store location and is in the range 0 through BufferSize-1. Note that strings closest in recent history (newer) have lesser displacements than do older strings.

Example 1 (using 8192 character buffer)

                         Decimal   Hexadecimal
Next Store location        3152       0C50
Found string location    - 1511     - 05E7
String Displacement        1641       0669

Example 2 (using 8192 character buffer)

                         Decimal   Hexadecimal
Next Store location        0052       0034
Found string location    - 8157     - 1FDD
Difference               - 8105     - 1FA9
Correction               + 8192     + 2000
String Displacement          87         57

With the BufferSize in the preferred embodiment selected as 8192, the calculated displacement can be expressed in thirteen bits.
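The wraparound correction can be illustrated with a short sketch. It assumes only what the text states (a circular buffer of BufferSize characters and a result in the range 0 through BufferSize-1); the function name is hypothetical. Python's modulo operator returns a non-negative result for a positive modulus, which performs the "add BufferSize if negative" correction in one step.

```python
BUFFER_SIZE = 8192  # BufferSize in the preferred embodiment


def string_displacement(next_store, found, buffer_size=BUFFER_SIZE):
    """Displacement from the found string to the Next Buffer Store location."""
    # A negative difference means the buffer wrapped between the two
    # locations; the modulo adds buffer_size back in that case.
    return (next_store - found) % buffer_size


# The two worked examples from the text.
example_1 = string_displacement(3152, 1511)  # no wraparound
example_2 = string_displacement(52, 8157)    # wraparound corrected
```

With a buffer of 8192 characters every displacement is below 2^13, which is why thirteen bits suffice.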

The displacement is further broken into two components:
A) a zone portion from the most significant five bits;
B) an offset portion from the least significant eight bits.

In the proper string encoding context (i.e. after appropriate string encoding escapes) the five bit zone is Huffman coded using the ZoneCode table and the eight bit offset is inserted directly into the output stream. Thus the Zone may be encoded using from 2 to 7 bits depending upon zone value, with the strings closest in recent history getting favorably shorter encodings.

Example.

              Hexadecimal   Binary
String Disp   0057          0000 0000 0101 0111
Zone          0             0 0000 (the five bits above the offset)
Offset        57            0101 0111 (the least significant eight bits)

String offset encoding is also illustrated as the third and fourth operations in Fig. 7C.
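The split of a thirteen-bit displacement into zone and offset can be sketched as below. The function names are hypothetical, and the ZoneCode Huffman table is not reproduced, so the sketch stops at the raw five-bit zone value and the eight offset bits that are placed directly into the output stream.

```python
def split_displacement(displacement):
    """Split a 13-bit displacement into (zone, offset)."""
    if not 0 <= displacement < 8192:
        raise ValueError("displacement must fit in thirteen bits")
    zone = displacement >> 8      # most significant five bits
    offset = displacement & 0xFF  # least significant eight bits
    return zone, offset


def offset_bits(offset):
    """Eight-bit pattern inserted directly into the output stream."""
    return format(offset, "08b")


def join_displacement(zone, offset):
    """Decoder-side inverse: rebuild the displacement from zone and offset."""
    return (zone << 8) | offset
```

For the displacement 0057 (hexadecimal) of the example, `split_displacement(0x57)` returns zone 0 and offset 57 (hexadecimal).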

Fig. 7C illustrates string encoding, mode A, and shows a Character Stream 751 with a character string beginning with "w" 752, Font (No) 742 and Global Font Encoding Table 743 yielding, for a font size value SZ = 4, at entry point 3 (3 = String Escape = NC + 1) 744, a Font String code 001 724. Fig. 7C shows a History Buffer 753 having a character string beginning with "w" at location 933 (hexadecimal) 754. Fig. 7C shows that a string of nine characters 752 in the character stream match the nine characters in the history buffer beginning at location 933 754. The "Global Code" or Global Frequency Encoding Table 745 is entered at entry point 7 (9 - 6 + 4 = 7) 755 to create a Length Code of 1011 756. The string location 933 (hexadecimal) 754 is subtracted from the location of the Next History Buffer Location 1201 (hexadecimal) 757 to yield 8CE (hexadecimal) whose 13 least significant bits 758 comprise the displacement which is broken into two components: i) a zone portion from the most significant five bits 759 and ii) an offset portion from the least significant eight bits 760. The five bit zone portion is Huffman coded using the Zone Code table 761 and the eight bit offset is inserted directly in the output stream. The Output Bit Stream Sequence 752 includes 1st: Font (001), 2nd: Length (1011), 3rd: Zone (01001) and 4th: Offset (11001110).

Minimum String Length and Search Advance

In both mode A and mode B string encoding, the string search process discards matches having fewer than a predetermined number of characters, the predetermined number being greater than the hash length. This predetermined number is the minimum string length. The minimum string length can be greater than the hash length and it is advantageous to make it so. In mode A the hash length is 4 and the minimum string length is 6. In mode B the hash length is 2 and the minimum string length is 3. Setting a lower limit on the length of the string reduces the bit-cost of encoding longer strings because the top (shortest code) entry into the Huffman table is used to represent a string of the minimum length. On completion of a string search, if no match is found, a predetermined number of characters (3 if mode A and 1 if mode B) are released (in font encoded or pair encoded form) and the search pointer is advanced by a corresponding number of positions before the next search.

Pair Encoding

Pair Encoding is a novel method for encoding character pairs. Up to 1024 fonts, those associated with recently encountered character pairs, are maintained in memory principally for the purpose of font encoding. Pair encoding takes advantage of the unambiguous one-to-one mapping between the input character pairs and the fonts effected using the CRC hash and the MatchVal. Since there are 1024 fonts maximum, ten bits (2¹⁰ = 1024) may be used to encode any of the character pairs that these fonts represent. Thus, other than escape bit sequences, ten bits are all that is required to encode many character pairs. Assuming an average escape sequence of three bits, the resulting thirteen bit encoding compares quite favorably with the sixteen bits for two uncompressed characters, especially in computer binary code files (e.g., .COM and .EXE).

In addition to the fonts and access structure maintained by the encoder, the decoder maintains a table of the actual two characters which are associated with each font. Thus it can do a direct lookup when directed by the encoded bit stream.
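A minimal sketch of the pair encoding bit layout follows. It assumes the escape bits have already been chosen by the font encoder and are passed in as a string of 0s and 1s; the function names and the dictionary standing in for the decoder's font-number-to-pair table are hypothetical.

```python
FONT_NUMBER_BITS = 10  # 2**10 = 1024 fonts maximum


def encode_pair(font_number, escape_bits):
    """Return the escape bits followed by the ten-bit font number."""
    if not 0 <= font_number < 2 ** FONT_NUMBER_BITS:
        raise ValueError("font number must fit in ten bits")
    return escape_bits + format(font_number, "010b")


def decode_pair(font_bits, pair_table):
    """Look up the two characters the decoder associates with a font number."""
    return pair_table[int(font_bits, 2)]


# Font number 215 (hexadecimal) with a String Escape "001" and a
# Pair Encoding Escape "11", as in the Fig. 7B example.
bits = encode_pair(0x215, "001" + "11")
pair = decode_pair(bits[-FONT_NUMBER_BITS:], {0x215: "w^"})
```

The decoder side is a direct table lookup, so no hash computation is needed to recover the pair.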

Example. Refer to Fig. 7B, which shows an input character stream 731 with a character pair "w^" 732, Font (No) 722 (font address hex 195) and a Global Font Encoding Table 723 yielding, at entry point 3 725, a Font String Code 001 724. Assume that the character pair "w^" (lower case w and caret) has occurred previously in the input character stream and has font number 215 (hexadecimal) assigned to it by the font encoding process. The sequence "w^" 732 has occurred again following "No" in the input stream and is next to be processed for output by the encoder. After determining that the pair "w^" exists, and that Pair Encoding is the least cost, the encoder, entering the Global Font Encoding Table 723 of Font Size 6 (Font Size = SZ + 2 for Mode A) at NC+1 (String Escape 724), emits the String Escape "001" from font "No" 735, followed by the Pair Encoding Escape "11" 736 and the ten bit value "1000010101" (the binary form of hex 215, the font number of "w^") 737, creating output bit stream 738.

Font Escapes

A font escape is a bit encoded sequence which serves as a signal from the encoder that the subsequent item is to be treated differently from that normally expected. A font encoded sequence that signifies that a NewChar follows in the data stream is an escape. It is used as an Escape to signal GlobalNC encoding. Another escape is String Escape. This is a bit sequence specifically to condition the decoder for reception of a string. When used in the context of a Font encoding/decoding, String Escape has a value equal to NewChar Escape + 1 when String "A" mode is active. In string mode B the font has 3 escapes:
1) New character. Value = Font NC.
2) Pair encoding. Value = Font NC + 1.
3) String mode B. Value = Font NC + 2.

Other escapes are described under Detail of Specific Encodings hereinbelow.

Second Phase Processing

Second Phase Processing, steps 11 through 20, includes string search, least cost encoding selection, formatting and output. Throughout these steps, the pointer into the process buffer is the current output pointer which is from 1 to 256 characters behind (older than) the current input pointer.

11. According to the state of the mode switch, the correct string search routine is invoked, String Search Mode A or String Search Mode B.

12. If the string found in step 11 is shorter than the minimum length (3 if mode B and 6 if mode A), proceed to step 15. Otherwise, the bit cost of the string is computed by summing the costs of the String Escape, String Length, Zone Code and String Offset (8 bits), as follows:
a. Fetch the ECNewIndex value corresponding to the first character of the string and add 1 if Mode A or 2 if Mode B. Use the result to access the FontBits section of the Global Font Encoding Table of Fig. 4A. The retrieved value from FontBits is the bit cost for the String Escape.
b. If Mode A is active, subtract 6 and add 4 to the string length and use this result to access the GlobalBits table. If Mode B is active, subtract 3 from the string length and use this result to access the LengthBBits table. This is the bit cost for the string length.
c. Subtract the position of the first character of the string from the next history buffer store location and divide the result by 256 giving the zone. Using the computed zone, access the ZoneBits table. This is the bit cost for the Zone encoding.
d. The bit cost of the String Offset is 8.
e. Add items a through d. This sum is the total bit cost of the string.

13. Compute the bit cost of equivalent font encoding for each position corresponding to a character in the string using step a or b below. Subtract this bit cost from the total from step 12. If underflow occurs (the result goes negative) at any point, exit step 13 since the string encoding wins over the font encoding. If all corresponding positions are examined without underflow occurring, font encoding has a lesser or equal bit cost and will be used, so proceed to step 15.
a. If the ECType field is 4, use the ECNewIndex field to access the FontBits table for the bit cost of NewChar Escape. Use the ECFrequency field to access the GlobalBits table for the bit cost of the NewChar.
b. If the ECType field is 2, use the ECFontIndex field to access the FontBits table for the bit cost of a font encoding.

14. If string encoding wins as indicated in step 13, change the ECType field corresponding to the first character of the string to an 8 (denoting String Encoding) and then change the ECType field corresponding to all remaining characters of the string to a 0 (denoting string continuation). Set UpdateLength to string length. Proceed to step 19.

15. Examine the ECHashX2 field corresponding to the character at the current output position + 2. If NULL (the font exists in the encoder but does not yet exist in the decoder) proceed to step 18, otherwise compute the cost of a Pair Encoding as follows:
a. Fetch the ECNewIndex value corresponding to the current output position and add 1. The result is used to index the FontBits table. This is the bit cost for the Pair Encoding Escape.
b. Add 10 to the result of step a. This is the total Pair Encoding cost.

16. Compute the bit cost of equivalent font encoding for each of the two characters in the pair (current output position and current output position + 1) using step a or step b below. Subtract this bit cost from the total in step 15. If underflow occurs (the result goes negative) at any point, exit step 16 since the pair encoding wins over the font encoding. If the two positions are examined without underflow occurring, font encoding has a lesser or equal bit cost and will be used, so proceed to step 18.
a. If the ECType field is 4, use the ECNewIndex field to access the FontBits table for the bit cost of NewChar Escape. Use the ECFrequency field to access the GlobalBits table for the bit cost of the NewChar.
b. If the ECType field is 2, use the ECFontIndex field to access the FontBits table for the bit cost of a font encoding.

17. If pair encoding wins as indicated in step 16, change the ECType field corresponding to the first character of the pair to a 6 (denoting Pair Encoding) and then change the ECType field corresponding to the next character of the pair to a 0 (denoting string/pair continuation). Set UpdateLength to 2. Proceed to step 19.

18. Set UpdateLength to 1. This is to be a font or NewChar encoding.

19. Access the ECType field at current output position, format and output the bit sequences illustrated in Figs. 10A through 10D. Access the ECRepeats field at current output position; if greater than 0, output the repeat count using the GlobalCode (High and Low) table. Add each output character to the history buffer and associated access tables. Increment current output position, decrement UpdateLength. Repeat step 19 while UpdateLength is greater than 0.

20. If a Flush or Mode change operation is in process, repeat steps 11 through 19 until the process buffer is empty (current output position equals current input position). Otherwise proceed to step 1.

Compressibility and Encoding Process Switching

The following process is found to provide a useful measure of data compressibility. Every forty-eight new characters (i.e. characters not found in the font associated with the previous two characters), the cumulative bit-cost of encoding the previous ninety-six such characters is compared with a preset value. It is of no consequence to this process that such a character might later be encoded as part of a string.

Every forty-eight NewChars (which may be more than forty-eight input characters), the current sum in NCBitsNew is added to the previous forty-eight character sum from NCBitsPrior and the result is compared to the constant 96 * 7.5 (representing 96 characters at 7.5 bits per character). If the result is less than 96 * 7.5 bits, the Compressibility Trending Switch is turned OFF (or remains OFF). If the result is 96 * 7.5 or greater, the Compressibility Trending Switch is turned ON (or remains ON). After the calculation, the current NCBitsNew is stored in NCBitsPrior in preparation for the next cycle forty-eight NewChars later.
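The trending computation can be sketched as a small state machine. This is an illustration under stated assumptions, not the source implementation: the class and method names are hypothetical, and the initial switch state (OFF), the initial NCBitsPrior of zero, and the clearing of NCBitsNew after each cycle are assumptions that the text does not spell out.

```python
class CompressibilityTrend:
    CYCLE = 48            # NewChars per evaluation cycle
    THRESHOLD = 96 * 7.5  # 96 characters at 7.5 bits per character

    def __init__(self):
        self.nc_bits_new = 0    # NCBitsNew: bit cost in the current cycle
        self.nc_bits_prior = 0  # NCBitsPrior: bit cost of the prior cycle
        self.count = 0
        self.switch_on = False  # assumed initial state

    def add_new_char(self, bit_cost):
        """Account for one NewChar; return the switch state afterwards."""
        self.nc_bits_new += bit_cost
        self.count += 1
        if self.count == self.CYCLE:
            total = self.nc_bits_new + self.nc_bits_prior
            self.switch_on = total >= self.THRESHOLD
            self.nc_bits_prior = self.nc_bits_new
            self.nc_bits_new = 0  # assumed: start the next cycle from zero
            self.count = 0
        return self.switch_on
```

Because two consecutive forty-eight character sums are combined, a single expensive cycle does not necessarily turn the switch ON.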

If the compressibility trending switch is on, the following are in effect:

1. Font encoding is active.

2. String mode B is active.

3. Pair encoding is active.

4. Anti-expansion mode is active.

If the compressibility trending switch is off, the following are in effect:

1. Font encoding is active.

2. String mode A is active.

3. Pair encoding is active.

4. Anti-expansion mode is inactive.

Detail of Specific Encodings

The several encodings produced by the present invention, in addition to font encoding (NewChar, Pair and String), are shown in Tables 1A through 1D below. Table 2 provides the key to the data in these tables.

Preconditions
Old Font, Mode A
Old Font, Mode B

Preconditions
Old Font, Mode A
New Font, Mode A
New Font, Mode A
Old Font, Mode B
New Font, Mode B
New Font, Mode B
Mode B, Antiexpan

Table 1B - New Character Encodings

Preconditions        Escapes   10 Bit Font Number
Old Font, Mode A     SEA 11    bb bbbb bbbb
New Font, Mode A     01 0      bb bbbb bbbb
Old Font, Mode B     PEB       bb bbbb bbbb
New Font, Mode B     11 0      bb bbbb bbbb

Table 1C - Pair Encodings

Table 1D - String Encodings

Key to Tables 1A through 1D

bb bbbb bbbb  A ten bit number representing the font number with which the encoded pair is associated.

A single bit emitted to comprise the second of two bits which serve as a prefix to the PPP encoding.

ffff ffff  Eight bits representing a character frequency in the range 0 - 255.

hhh hhhh  Seven bits representing a character frequency in the range 128 - 255.

iii iiii  Seven bits representing a character frequency in the range 0 - 127.

oooo oooo  The eight least significant bits of the buffer (relative to the Next Buffer Store Location) displacement of the first character of a string. Used with a ZZZ encoding to identify a string position.

EEE  A Huffman pattern from the FontCode table from one to four bits in length encoding a value not equal to the Font NewChar or Font NewChar plus one and representing the font relative position of the encoded character in the Font.

FFF  A Huffman pattern from the FontCode table from one to five bits in length encoding a value not equal to the Font NewChar, Font NewChar plus one, or Font NewChar plus two and representing the font relative position of the encoded character in the Font.

GGG  A Huffman pattern from the GlobalHigh/GlobalLow table from four to thirteen bits in length, in mode A, encoding a value from 0 to 249 and representing a string length of 6 - 255 characters; in mode B, encoding a value from 0 to 243 and representing a string length of 12 - 255 characters.

LLL  A Huffman pattern from the LengthBBits table, from one to six bits in length, encoding a value from 0 to 8 and representing a string length from three to eleven characters.

NES  A Huffman pattern from the FontCode table from one to four bits in length encoding a value equal to the Font NewChar and representing a NewChar Escape.

NNN  A Huffman pattern from the GlobalHigh/GlobalLow table from four to thirteen bits in length, encoding a value from 0 to 255 and representing a character frequency.

PEB  A Huffman pattern from the FontCode table from two to four bits in length encoding a value equal to the Font NewChar plus one and representing a Pair Encoding Escape, Mode B.

PPP  The remainder of a Huffman pattern from the GlobalHigh/GlobalLow table, excepting the first two bits which are emitted separately, from two to eleven bits in length, encoding a value (in consideration of the prior two bits) from 0 to 255 and representing a character frequency.

SEA  A Huffman pattern from the FontCode table from two to four bits in length encoding a value equal to the Font NewChar plus one and representing a String Escape, Mode A.

SEB  A Huffman pattern from the FontCode table from three to five bits in length encoding a value equal to the Font NewChar plus two and representing a String Escape, Mode B.

SSS  A Huffman pattern from the GlobalHigh/GlobalLow table, excepting the first four entries (those beginning with 11), from four to thirteen bits in length, encoding a value from 4 to 255 and representing a string length of 6 - 253 characters.

ZZZ  A Huffman pattern from the ZoneCode table, from four to thirteen bits in length, encoding a value from 0 to 31 and representing the five most significant bits of the string displacement. Used with the oooo oooo, described above, to identify a string position in the history buffer.

Table 2 - Key to Encodings of Tables 1A through 1D

Selecting and Assembling Encodings

The least-cost encoding is built by selecting the encoding that has the fewest bits. If the bit costs of the two encodings are the same, font encoding is chosen.
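The comparison performed in steps 12 through 16 of Second Phase Processing can be sketched as a subtract-until-underflow loop. The function names are hypothetical, and the per-character font costs are passed in directly because the FontBits, GlobalBits, LengthBBits and ZoneBits tables are not reproduced here.

```python
STRING_OFFSET_BITS = 8      # the offset is always eight bits
PAIR_FONT_NUMBER_BITS = 10  # the ten-bit font number of a pair


def string_bit_cost(escape_bits, length_bits, zone_bits):
    """Total string cost: escape + length + zone + 8-bit offset (step 12)."""
    return escape_bits + length_bits + zone_bits + STRING_OFFSET_BITS


def pair_bit_cost(escape_bits):
    """Total pair cost: Pair Encoding Escape + 10 bits (step 15)."""
    return escape_bits + PAIR_FONT_NUMBER_BITS


def candidate_wins(candidate_cost, font_costs):
    """Return True if the string or pair encoding beats font encoding.

    The font cost of each covered character is subtracted in turn; an
    underflow (negative result) means the candidate is strictly cheaper.
    A tie leaves the result at zero, so font encoding is chosen.
    """
    remaining = candidate_cost
    for cost in font_costs:
        remaining -= cost
        if remaining < 0:
            return True
    return False
```

The early exit on underflow means the encoder need not price every character of a long string once the string is already known to win.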

If string encoding is selected, there may be up to three prefix characters not included in the string (e.g., the current character to the character immediately prior to the beginning of the string). Any such prefix characters are font encoded or pair encoded and their code is transmitted ahead of the string encoding.

Anti-expansion

Because certain data streams exhibit very little patterning, data expansion is a possible outcome of font encoding and string encoding systems. To counter this possibility, a running computation of the output bit count for Mode B minus 8 (bits per character) is maintained, i.e., for each equivalent character output, SUM = SUM + BitCost - 8. Thus a positive result indicates poor compression and a negative result indicates good compression. A switch is maintained which controls the output stream such that, when the switch is on, the eight-bit frequency is output instead of the normal font, string, or pair encoding for mode B. A command (frequency 0FEh followed by a single 1 bit) is used to signal the decoder to change state.

Table 3 indicates the action taken for each character output.

Switch On (Transparent Mode)       Switch Off (Mode B Encoding)
SUM >= 0        No Change          SUM < 0         No Change
SUM -1 to -19   No Change          SUM 0 to 19     No Change
SUM < -19       Set Switch Off     SUM > 19        Set Switch On

Table 3 - Anti-expansion Actions
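The switching behavior of Table 3 amounts to a hysteresis band of -19 through +19 around zero. The following sketch illustrates it under assumptions that the text does not confirm: the class and method names are hypothetical, the switch is assumed to start OFF with SUM at zero, SUM is assumed to be neither reset nor clamped when the switch changes, and the bit cost supplied while in transparent mode is assumed to be the cost the Mode B encoding would have had.

```python
class AntiExpansion:
    LIMIT = 19  # SUM must leave the band -19..19 to change the switch

    def __init__(self):
        self.sum = 0              # running SUM of (BitCost - 8)
        self.transparent = False  # assumed initial state: Mode B encoding

    def account(self, bit_cost):
        """Update SUM for one output character; return the switch state."""
        self.sum += bit_cost - 8
        if self.transparent and self.sum < -self.LIMIT:
            self.transparent = False  # compression is paying off again
        elif not self.transparent and self.sum > self.LIMIT:
            self.transparent = True   # encoding is expanding the data
        return self.transparent
```

The band prevents the encoder from toggling, and from spending a state-change command, on every small fluctuation of SUM around zero.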

Decoder Process

Figs. 10A and 10B provide a flowchart of the decoding process. Table 4 provides the key to the flowchart of Figs. 10A and 10B.

F<n>  Fetch the next <n> bits from the input stream (where n is an integer).

D G  Decode Global. Decode a Huffman pattern which was selected and encoded from the GlobalCode encoding table.

D F  Decode Font. Decode a Huffman pattern which was selected and encoded from the appropriate Font Encoding tables (Fig. 4A).

D S  Decode Short. Decode a Huffman pattern which was selected and encoded from the GlobalCode encoding table. Same as the DG (Decode Global) except that two bits have already been fetched (F2) and are in DCCode. Used for length of 'A' type strings.

D L  Decode Length. Decode a Huffman pattern which was selected and encoded from the LengthBCode encoding table. Used for length of 'B' type strings.

D Z  Decode Zone. Decode a Huffman pattern which was selected and encoded from the lowest 64 entries in the GlobalCode encoding table. Used for length of 'B' type strings.

Table 4 - Key to Decoder Flow

In the decoder flow, there are only four possible endpoints to a single decoder (and implicitly encoder) cycle. These four different endpoints are shown in Figs. 10A and 10B by an integer inside a triangle. They correspond to the four methods of encoding, shown in Tables 1A through 1D hereinabove: Font, NewChar, Pair and String.

Dual Processor Configuration

As discussed under Background Art hereinabove, the combination of higher data rates in data transmission systems, the achievement of high data compression efficiencies and the use of complex process-intensive algorithms for data compression increases the processing throughput required to perform modem control and data compression/decompression tasks. In a preferred embodiment, referring to Fig. 11, the system uses two processors connected in series between the computer (the DTE interface, 111) and the telephone line (the DCE interface, 112), each processor having its own memory. One processor, the DCE Interface Processor 113, a Zilog Z80180, performs DCE interface processes (modem control and data flow management). The other processor, the Compression/Decompression and DTE Interface Processor 114, a Rockwell C19, performs data compression, data decompression and DTE interface processes (data interchange with the PC). This configuration is shown for duplex operation in Fig. 11. Fig. 11 shows a data rate of 11,500 characters/second at the DTE Interface and a data rate of 1,500 characters/second at the DCE Interface.

The conventional (prior art) approach using a single processor 121 to perform all functions (DTE interface, DCE interface and Compression/Decompression) is shown in Fig. 12. The single processor approach involves using a more powerful, albeit more expensive, processor. The general problem of sharing tasks among multiple processors is known to be a difficult problem in computer science. A conventional solution that might be applied to data compression modem applications is shown in Fig. 13. Fig. 13 shows a conventional two-processor configuration having a DTE/DCE Interface Processor 131 and a Compression/Decompression Processor 132. The present invention achieves the sharing of tasks by a simple but, nonetheless, unexpectedly effective configuration.

The preferred embodiment, shown in Fig. 11, achieves efficient control over all processes occurring in the system. This configuration utilizes the insight that compression, decompression and interface with the terminal all occur at a high error-free data rate, whereas modem control and the data line interface processes operate at a lower data rate and involve error detection and repeat transmission to cope with transmission errors. Accordingly, a first relatively high speed processor is used for both control of the terminal interface and for data compression and decompression; and a second processor is used for the processes involved in control of the data line interface including error detection and retransmission. Thus loading peaks occurring in either processor cannot interfere with the other.

Glossary

Encoder Current Character The most recent character from the input stream which is being processed by the encoder font system. At the end of each encoder cycle, encoder current character, "ECChar", becomes Char1Prior and the fetch and encode process continues with the next character from the input stream as encoder current character.

Escape A bit encoded sequence which serves as a signal from the encoder that the subsequent item is to be treated differently from that normally expected. Example: a font encoded sequence that signifies that a NewChar follows in the data stream.

Font One record of an array of records, each record consisting of pointers, links, characters, etc., each record having an address based on the prior two characters in the input stream, each record containing a list of historically occurring candidate characters to be matched with characters from the input stream.

Huffman Codes As used in this document, this term refers to any variable length bit representation having fewer bits corresponding to higher frequency of occurrence, including but not limited to codes created by a tree algorithm.

NewChar The occurrence of "NewChar" is a font encoding event wherein either the encoder current character, "ECChar", is not found in the selected font or there is no font in existence (and it thus contains 0 characters based on the font selection scheme) .

NewChar Symbol A dynamic value in the range of 0 through n (where n is the maximum number of characters per font) which represents the current virtual position in the font which represents "character not in table". It is used as an escape to signal GlobalNC encoding.

NewChar Escape Specifically an encoding representing the NewChar Symbol.

String Escape An escape sequence specifically to condition the decoder for reception of a string. When used in the context of a font encoding or a font decoding, a value equal to NewChar Escape + 1.

APPENDIX 1

SOURCE CODE StringA, StringB with full separation in StringTime calls

From TC90F2K2.MAC

IF2

.printx /C19 Encoder and Decoder/

ENDIF

.xlist
.C18 ; assembler, please do C18 instructions
lodr6 equ 1 ; 6.144mhz clock

.sfcond
include ITEcl9
include TCdfmOOl

printstat macro a,b,c,d
if2
.printx /a b c d/
endif
endm
.list
pagealign macro
if (tblofs and 255) ne 0
fred defl (0-tblofs) and 255
printstat <Page Align Waste =>,%fred
tblofs defl tblofs + fred
endif
endm

************ A S S E M B L Y O P T I O N S ************

OPTIONS WHICH CHANGE COMPRESSION/SPEED

AHashX2 EQU 0 ; J (0) 0 - no; 1 - yes

MaximumASearches EQU 8 ; J (8) maximum A hashes searched

MaximumBSearches EQU 16 ; J (16) maximum B hashes searched

NC8BitCycle EQU 64 ; J (64) controls A-String/B-String

TwoBytes EQU 1 ; 7 (1) 0 - one-byte font controls
; 1 - two-byte font controls
ZoneTestA EQU 1 ; J (1) 0 - HIGH; 1 - HIGH & LOW
ZoneTestB EQU 1 ; J (1) 0 - HIGH; 1 - HIGH & LOW

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

OPTIONS WHICH PROBABLY ARE NOT GOING TO CHANGE

Fontsize EQU 8 ; only 8,16 are supported; this

; keeps fonts on page boundaries

FontTables EQU 1024 ; may be 512-1024 provided that

; (FontTables*FontSize) MOD 256 =

IF Fontsize EQ 16

CharsPerFont EQU 13 ; otherwise need 17,18-index tables

ELSE

CharsPerFont EQU FontSize-TwoBytes-1

ENDIF

SetLength EQU 128 ; refill to SetLength*2 bytes after
; SetLength(+) bytes have been encoded
AntiEx EQU 1 ; 0 - off; 1 - on
BufferSize EQU 8192 ; size of Round Robin buffer
BufferHashes EQU 2048 ; # of Round Robin 4-byte hashes
BufferSuffix EQU 1 ; 0 - nulls; 1 - maintained
Failsafe EQU 0 ; 0 - no failsafe; 1 - output failsafe
IF Failsafe
FailSafeSets EQU 4 ; output every (n * 256) encodings
ENDIF
FTHashes EQU 2048 ; # of Round Robin 2-byte hashes
MinimumAString EQU 6 ; minimum length A string
MinimumAUpdate EQU 3 ; bytes advanced if no A string found
MinimumBString EQU 2 ; minimum length B string
MinimumBUpdate EQU 1 ; bytes advanced if no B string found
NCFreqSets EQU 4 ; uses 256*2*2*Sets bytes
NCFreqSetsHigh EQU 0 ; if used, 0 gives best result ?????
NCFreqSetsReset EQU 1 ; 0 - no; 1 - reset on B to A change
Repeats EQU 1 ; 0 - no repeat logic; 1 - repeat logic

IF FontTables GT 512
MatchMask EQU 0FC00H
NextMask EQU 7FEH ; after ASL A
ELSE
MatchMask EQU 0FE00H
NextMask EQU 3FEH ; after ASL A
ENDIF

* * * * * * * * * * * * * * * * * * * * * * * * * * * * *

DEBUG AND TEST OPTIONS

Debug EQU 1 ; set to 0 to skip statistics
DbgDum EQU Debug XOR 1
EOFControl EQU 1 ; 0 - endless data flow; 1 - file by file
Macros EQU 1 ; 0 - use subroutines; 1 - use macros
Prodder EQU 0 ; 0 - no prods; 1 - force prods

IF Prodder

ProdCycle EQU 67 ; prod every ProdCycle characters

ENDIF

Test EQU 0 ; 0 - no test code; 1 - test code

*********** OS EQUATES and ASSEMBLY OPTIONS *************

Load8250 EQU serial loader/debugger ; parallel loader/debugger

********* H O S T I N T E R F A C E M A P ***********

HOST INTERFACE MAP definition (16450 mode)

W8250_RXD equ 00020h

W8250_TXD equ 00021h

W8250_MCR equ 00024h
ln_stat equ 00030h
mdm_stat equ 00031h

HostContrl equ 00032h

******** F O N T T A B L E S T R U C T U R E ********

ENCODER / DECODER STRUCTURE MAPS

Map of 1 FONT entry

tblbgn

IF TwoBytes
tbyte NCIndex
tbyte Characters
ELSE ; Bits 7-4 = Characters
tbyte CharsNCIndex ; Bits 3-0 = NCIndex
ENDIF
tstor CharTable,CharsPerFont
tblend TestFontSize ; size of a font table

if TestFontSize NE FontSize
db 256,Font size not 16
else
printstat <Font Size =>,%FontSize
printstat <Chars per font =>,%CharsPerFont
endif

************ P A G E 1 V A R I A B L E S *************

ENCODER / DECODER RAM PAGE 1 VARIABLES ( 8H through 07fH inclusive)

Miscellaneous Variables

tblbgn RamPtr1
IF EOFControl ; !!!!! must be in 48h
tbyte HostLCR ; BBS,BBR
ENDIF
tbyte FetchPtr

IF EOFControl
tstor BytesIn,3
tstor BytesOut,3
tbyte DCStack
tbyte ECStack
tbyte OutFetch
tbyte OutStore
ENDIF

IF Prodder
tbyte ProdCounter
ENDIF

if tblofs gt 80h
db 256,Ram Window Error
else
MemoryOne equ tblofs - RamPtr1
printstat <Page 1 Window Free =>,%080h-tblofs
endif

************ P A G E 0 V A R I A B L E S *************

ENCODER / DECODER RAM PAGE 0 VARIABLES (83h through 0ffh inclusive)

tblbgn RamPtr0

Decoder Variables

tbyte DCABStatus
tbyte DCBuffer
tbyte DCCharacters
tbyte DCCharCount
tbyte DCChar1Prior
tbyte DCChar2Prior
tbyte DCCommand
tbyte DCCurrentChar
tbyte DCCurrentFreq
tword DCCurrentHash
IF Failsafe
tword DCFailSafe
ENDIF
tword DCFontBase
tbyte DCFontIndex
tword DCFTLastHash
tword DCFTNextRough
tword DCFTParent
tword DCFTChild
tword DCNCBitsNew
tword DCNCBitsPrior
tbyte DCNCCounter
tbyte DCNCIndex
tword DCRRPtr

Encoder Variables

tbyte ECABChange
IF AntiEx
tbyte ECAntiEStatus
ENDIF

if tblofs gt 100h
db 256,Page 0 Ram Error
else
MemoryZero equ tblofs - RamPtr0
printstat <Page 0 Ram Free =>,%0100h-tblofs
endif

***** 0 8 0 0 - 4 0 0 0 h M E M O R Y B L O C K *****

ENCODER / DECODER TABLES

tblbgn 0800h

tstor ECChar,256 ; 256
tstor ECCharCopy,256 ; 256
IF Repeats
tstor ECRepeats,256 ; 256
tstor ECRepeatSW,256 ; 256
ENDIF
tstor ECType,256 ; 256
tstor FTHashMatch,FontTables*2 ; 2048
tstor ECRRHashHead,BufferHashes*2 ; 4096
tstor InBuffer,256 ; 256
IF EOFControl ;{
tstor OutBuffer,256 ; 256
ENDIF ;}
tstor DCGlobalHigh,4 ; 4
tstor ECGlobalHigh,4
; 7944

ENDIF ; }

IF tblofs GT 4000h
DB 256,Addr 800h - 4000h Block Error
ELSE
fred defl 4000h-tblofs
printstat <800h-4000h Block Free =>,%fred
ENDIF

;**** 4 0 0 0 - C 0 0 0 h E N C O D E R B L O C K ****

; This block, from 4000h to 0bfffh inclusive, is the 32 kbyte page
; area. Access to this block or its alter-ego is controlled by the setting of PB2.

Must be page aligned

; Encoder Font Tables:

tblbgn 4000h

MAX
tstor ECRRHashLink,BufferSize/2*2 ; 8192
tstor ECFontTables,FontTables*FontSize ; 8192
IF FontTables GT 512 ;{
tstor ECFTHashRough,1024*2 ; 2048
ELSE ;{}
tstor ECFTHashRough,512*2
ENDIF ;}
tstor ECFTHashNext,FontTables*2 ; 2048
tstor ECRRBuffer,BufferSize ; 8192
tstor ECRRSuffix,256 ; 256
tstor ECNCChar,256 ; 256
tstor ECNCFreq,256 ; 256
tstor ECNCCandF,(NCFreqSets-1)*512 ; sets ; 1536
tstor ECFontIndex,256 ; 256
tstor ECFrequency,256 ; 256
tstor ECHashRaw0,256 ; 256
tstor ECHashRaw1,256 ; 256
tstor ECHashX20,256 ; 256
tstor ECHashX21,256 ; 256
tstor ECNewIndex,256 ; 256
;32768

IF tblofs GT 0C000h
DB 256,Addr 4000h-C000h Block Error
ELSE
fred defl 0C000h-tblofs
printstat <Encoder Main Ram Free =>,%fred
ENDIF

**** 4 0 0 0 - C O O O h D E C O D E R B L O C K ****

Decoder Font Tables:

tblbgn 4000h
MAX
tstor ECRRHashTest,BufferSize/2*2 ; 8192
tstor DCFontTables,FontTables*FontSize ; 8192
IF FontTables GT 512 ;{
tstor DCFTHashRough,1024*2 ; 2048
tstor DCNCChar,256 ; 256
tstor DCNCFreq,256 ; 256
tstor DCNCCandF,(NCFreqSets-1)*512 ; sets ; 1536
;32768
IF tblofs GT 0C000h
DB 256,Addr 4000h-C000h Block Error
ELSE
fred defl 0C000h-tblofs
printstat <Decoder Main Ram Free =>,%fred
ENDIF

************* B A S E O F P R O G R A M *************

BASE OF PROGRAM

cb equ $

***** E N C / D E C I N T E R F A C E S U B S ** ****

IF EOFControl XOR 1;{ all code in this Section

IS active only in modem operation

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

READ FROM PC VIA INTERRUPT

HostInt:

RTI

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

ENCODER SENDS PROD/COMMAND TO REMOTE

SendProdCommand:

STI #0A0h,ECCommand ; Prod is 10
STI #0C8h,ECCommand ; Command is 11nn

JSR ECProdCommand

etc.

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

DECODER PROCESSES COMMAND FROM REMOTE

ProcessCommand:
; process command and then
; return to DCFontParams
STI #000h,DCCommand
JMP DCFontParams

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

DO WHATEVER IS REQUIRED WHEN DECODER FAILS

FailSafeFailed:

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

ENCODER READ FROM PC

ECReadCharacter:

RTS

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

ENCODER WRITE TO PC

ECWriteCharacter:

RTS

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

DECODER READ FROM PC

DCReadCharacter:

RTS

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

DECODER WRITE TO PC

DCWriteCharacter:

RTS

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

ENDIF ;} end of modem enc/dec routines

*********** I N I T I A L I Z E M E M O R Y ***********

Initialize Working Storage

Initialize:

LDX #MemoryZero
LDA #0
ClearMemory0:
STA RamPtr0-1,X
DEX
BNE ClearMemory0
LDX #MemoryOne
DEX
ClearMemory1:
STA RamPtr1,X ; leave HostLCR unreset
DEX
BNE ClearMemory1
;
; CLEAR 800h-3FFFh
STI #008h,DCWord1+1
LDY #038h
JSR BlockReset

; CLEAR 4000h-BFFFh - DECODER
DecBankSelect
STI #040h,DCWord1+1
LDY #080h
JSR BlockReset
STI #HIGH(DCNCChar),DCWord1+1
JSR NCCharFreqReset
IF Failsafe
STI #000h,DCFailSafe+0
STI #FailSafeSets,DCFailSafe+1
ENDIF
STI #001h,DCABStatus
STI #HIGH(DCFontTables),DCFontBase+1
IF FontSize EQ 8
STI #008h,DCFontBase+0
ELSE
STI #010h,DCFontBase+0
ENDIF
LDA #HIGH(DCFTHashRough)
STA DCFTNextRough+1
STA DCFTParent+1
LDA #HIGH(DCFTHashNext)
STA DCFTHashRough+1
STA DCFTLastHash+1
LDA #002h
STA DCFTHashRough+0
STA DCCurrentHash+0
STI #HIGH(DCRRBuffer),DCRRPtr+1
STI #HIGH(NC8BitCycle*8),DCNCBitsPrior+1
STI #LOW(NC8BitCycle*8),DCNCBitsPrior+0
STI #NC8BitCycle,DCNCCounter

; CLEAR 4000h-BFFFh - ENCODER
EncBankSelect
STI #040h,DCWord1+1
LDY #080h
JSR BlockReset
STI #HIGH(ECNCChar),DCWord1+1
JSR NCCharFreqReset
IF Prodder
STI #ProdCycle,ProdCounter
ENDIF
IF Failsafe
STI #000h,ECFailSafe+0
STI #FailSafeSets,ECFailSafe+1
ENDIF
STI #001h,ECABStatus
IF AntiEx
STI #001h,ECAntiEStatus
ENDIF
LDA #HIGH(ECFTHashRough)
STA ECFTNextRough+1
STA ECFTParent+1
STI #HIGH(ECFTHashNext),ECFTLastHash+1
STI #002h,ECFTLastHash+0
LDA #HIGH(ECFTHashNext)
STA ECFTHashRough+1
LDA #002h
STA ECFTHashRough+0
STA ECCurrentHash+0
STA ECHashX20+255
LDA #080h
STA ECCurrentHash+1
STA ECHashX21+255
STI #001h,ECBuffer
STI #HIGH(ECRRBuffer),ECRRPtr+1
STI #HIGH(NC8BitCycle*8),ECNCBitsPrior+1
STI #LOW(NC8BitCycle*8),ECNCBitsPrior+0
STI #NC8BitCycle,ECNCCounter
RTS

BlockReset:
STI #000h,DCWord1+0
LDA #000
BRLoop:
STA (DCWord1)
INC DCWord1+0
BNE BRLoop
DEY
BEQ BRExit
INC DCWord1+1
IJMP BRLoop
BRExit:
RTS
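The BlockReset routine above appears to zero a run of 256-byte pages: the caller stores the starting page number in the high byte of the pointer and passes the page count in Y. A minimal Python sketch of that behaviour follows; the function name, the `bytearray` memory model, and the parameter names are illustrative assumptions and do not appear in the listing.

```python
def block_reset(memory, start_page, page_count):
    """Zero page_count consecutive 256-byte pages, starting at start_page.

    memory is a hypothetical flat bytearray standing in for processor RAM.
    """
    addr = start_page << 8              # low byte of the pointer starts at 0
    for _ in range(page_count):         # outer loop mirrors DEY / BEQ BRExit
        for offset in range(256):       # inner loop mirrors INC low byte / BNE
            memory[addr + offset] = 0
        addr += 256                     # mirrors INC of the pointer high byte
    return memory
```

Under this reading, the call made with page 08h and a count of 38h would clear 0800h through 3FFFh, which agrees with the comment in the initialization code.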

NCCharFreqReset:
LDA #NCFreqSets
STA DCByte1
STI #000h,DCWord1+0
NCCFRLoop0:
LDX #000h
NCCFRLoop1:
LDA Best128,X
STA (DCWord1)
INC DCWord1+0
INX
BPL NCCFRLoop1
NCCFRLoop2:
TXA
STA (DCWord1)
INC DCWord1+0
INX
BMI NCCFRLoop2
LDA DCWord1+1
ADD #001h
STA DCWord2+1
STI #000h,DCWord2+0
LDY #000h
NCCFRLoop3:
LDA (DCWord1)
TAX
TYA
STA (DCWord2),X
INC DCWord1+0
INY
BNE NCCFRLoop3
DEC DCByte1
BEQ NCCFRExit
INC DCWord1+1
INC DCWord1+1
!JMP NCCFRLoop0
NCCFRExit:
RTS
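As read from the loops above, NCCharFreqReset seems to fill each NCChar table with the 128 entries of Best128 followed by the byte values 128 through 255, and then to build the adjacent NCFreq table as the inverse mapping (character to position). The Python sketch below illustrates that reading. The function name is hypothetical, and the contents of Best128 are not shown in this section, so the caller must supply an ordering; it is assumed here to be a permutation of 0 through 127.

```python
def nc_char_freq_reset(best128, freq_sets):
    """Build freq_sets pairs of (nc_char, nc_freq) tables.

    nc_char maps position -> character; nc_freq maps character -> position.
    best128 is an assumed permutation of 0..127 (its real contents are not
    given in this section of the listing).
    """
    if sorted(best128) != list(range(128)):
        raise ValueError("best128 must be a permutation of 0..127")
    tables = []
    for _ in range(freq_sets):
        nc_char = list(best128) + list(range(128, 256))
        nc_freq = [0] * 256
        for position, ch in enumerate(nc_char):
            nc_freq[ch] = position      # inverse table, as in NCCFRLoop3
        tables.append((nc_char, nc_freq))
    return tables
```

Keeping both directions of the mapping lets the coder look up a character's position and the character at a position with a single indexed load each.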

************** M A I N D A T A F L O W **************

StrtUp:
LDX #0FFh ; set stack pointer
TXS
mask_gen <bcr_fast_es2,bcr_fast_es1>
STI #mask,bcr ; set C18/C19 to fast execution
mask_gen <cir_fast_es3>
STI #mask,clint
STI #007h,HostContrl ; enable 16450 mode + interrupts
STI #05Fh,ln_stat ; 8250.THRE = 1
ResetMemory:
BBS 4,HostLCR,ResetHostLCR
JSR Initialize
ResetHostLCR:
STI #080h,HostLCR ; set bit 7
LCRLoop:
STI #000h,FetchPtr
STI #000h,StorePtr
BBR 7,HostLCR,SetBreak ; reset if host wrote LCR
!JMP LCRLoop
SetBreak:
LDA w8250_LCR ; save host command info
STA HostLCR
BBR 5,HostLCR,SetBreakCont
BBS 6,HostLCR,SetBreakCont
BBR 2,HostLCR,SetBreakCont
STI #0F6h,HostContrl ; no ints during Memory Load
SetBreakCont:
BBS 4,HostLCR,SetBreakNoReset
JSR Initialize ; HostLCR(4) = 0 reset memory
SetBreakNoReset:
BBS 6,ln_stat,SetBreakTSRE
STI #02Fh,ln_stat ; set 4, leave 6 at 0
!JMP WhichProcess
SetBreakTSRE:
STI #06Fh,ln_stat ; set 4, leave 6 at 1
WhichProcess:
BBS 6,HostLCR,LoopBack
BBS 5,HostLCR,DumpLoadMemory
BBR 2,HostLCR,ECStart ; HostLCR(2) = 0 Encoder
DCStart: ; = 1 Decoder
DecBankSelect
JMP DCFontParams
ECStart:
EncBankSelect
JMP ECRefill
DCOrECEOF:

LDX #0FFh ; reset primary stack
TXS
BBS 6,ln_stat,EOFTSRE
STI #02Fh,ln_stat ; set 4, leave 6 at 0
IJMP EOFStats
EOFTSRE:
STI #06Fh,ln_stat ; set 4, leave 6 at 1
EOFStats:
BBS 3,HostLCR,EOFAcked ; set if host set LCR bit 3
IJMP EOFStats
EOFAcked:
LDA BytesIn+0
JSR SubWriteToPC
LDA BytesIn+1
JSR SubWriteToPC
LDA BytesIn+2
JSR SubWriteToPC
LDA #000h
JSR SubWriteToPC
LDA BytesOut+0
JSR SubWriteToPC
LDA BytesOut+1
JSR SubWriteToPC
LDA BytesOut+2
JSR SubWriteToPC
LDA #000h
JSR SubWriteToPC
LDA #000h
STA BytesIn+0
STA BytesIn+1
STA BytesIn+2
STA BytesOut+0
STA BytesOut+1
STA BytesOut+2
JMP ResetMemory
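The end-of-file sequence above sends the two 24-bit counters to the host low byte first, each followed by a zero byte, so the host effectively receives two 32-bit little-endian values. The sketch below shows that framing in Python; the function name is an assumption made for illustration.

```python
def eof_report(bytes_in, bytes_out):
    """Return the 8 status bytes sent after EOF is acknowledged.

    Each 24-bit counter goes out least significant byte first, padded with
    one zero byte, matching the order of the SubWriteToPC calls above.
    """
    def frame(counter):
        return (counter & 0xFFFFFF).to_bytes(3, "little") + b"\x00"
    return frame(bytes_in) + frame(bytes_out)
```

A host program could recover the counts with `int.from_bytes(report[0:4], "little")` and `int.from_bytes(report[4:8], "little")`.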

*********** p c L O O P B A C K C O D E ************

IF EOFControl ;{ all code in this Section is

; active only in loopback operation

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

DumpLoadMemory:
BBS 2,HostLCR,LoadMemory
DumpMemory:
JSR MemoryDump
!JMP DCOrECEOF
LoadMemory:
JSR MemoryLoad
!JMP DCOrECEOF

LoopBack:

BBR 5,HostLCR,LoopBackNoDump
JSR MemoryDump
BBS 6,ln_stat,LoopBackTSRE
STI #02Fh,ln_stat ; set 4, leave 6 at 0
!JMP LoopBackWait
LoopBackTSRE:
STI #06Fh,ln_stat ; set 4, leave 6 at 1
LoopBackWait:
BBS 3,HostLCR,LoopBackAcked ; set if host set LCR bit 3
!JMP LoopBackWait
LoopBackAcked:
LDA HostLCR
AND #0F7h
STA HostLCR
LoopBackNoDump:
LDX #07Fh
TXS
LDA #HIGH(ECRefill)
PHA
LDA #LOW(ECRefill)
PHA
PSH
TSX
STX ECStack
LDX #0FFh
TXS
DecBankSelect
JMP DCFontParams

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

SwitchToDecode:
PSH
TSX
STX ECStack
LDX DCStack
TXS
DecBankSelect
PUL
RTS
;
SwitchToEncode:
PSH
TSX
STX DCStack
LDX ECStack
TXS
EncBankSelect
PUL
RTS
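SwitchToDecode and SwitchToEncode look like a hand-built coroutine switch: each saves the registers and the current stack pointer, restores the other side's stack pointer and memory bank, and returns into the other routine where it last stopped. The Python sketch below shows the same control pattern, with generators swapped in for raw stack switching. All names, the buffer capacity, and the pass-through "decoder" are illustrative assumptions, not part of the listing.

```python
from collections import deque


def encoder(source, out_buffer, capacity):
    """Produce bytes into out_buffer, yielding whenever it fills up."""
    for byte in source:
        out_buffer.append(byte)
        if len(out_buffer) >= capacity:
            yield                       # plays the role of SwitchToDecode


def decoder(out_buffer, sink):
    """Consume bytes from out_buffer, yielding whenever it runs dry."""
    while True:
        while out_buffer:
            sink.append(out_buffer.popleft())
        yield                           # plays the role of SwitchToEncode


def run_loopback(data, capacity=4):
    """Alternate between the two coroutines until all data has passed through."""
    out_buffer, sink = deque(), []
    dec = decoder(out_buffer, sink)
    for _ in encoder(data, out_buffer, capacity):
        next(dec)                       # buffer full: let the decoder drain it
    next(dec)                           # final drain after the encoder finishes
    return bytes(sink)
```

The point of the pattern is that each side keeps its own call depth and local state across a switch, which is why the listing reserves a separate stack region for each.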

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

MISC READ FROM PC; USED ONLY FOR MEMORY LOAD; INTS ARE OFF
;
SubReadFromPC:
LDA HostContrl
BPL SubReadFromPC
LDA W8250_TXD
STI #076h,HostContrl
STI #01Fh,ln_stat
RTS

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

MISC WRITE TO PC

SubWriteToPC:

BBS 0,ln_stat,SubWriteToPC
BBS 0,ln_stat,SubWriteToPC ; twice for SPERRY et al
STA W8250_RXD
BBS 6,ln_stat,SubWritePCTSRE
STI #03Eh,ln_stat ; set 0, leave 6 at 0
!JMP SubWritePCCont
SubWritePCTSRE:
STI #07Eh,ln_stat ; set 0, leave 6 at 1
SubWritePCCont:
RTS

. ;* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

.

MemoryDump:

LDX #000h

LDY #((RamPtr1+1)-000h)
MDLoop1FF:
TXA
JSR SubWriteToPC
INX
DEY
BNE MDLoop1FF
LDY #(080h-(RamPtr1+1)) ; X = #RamPtr1+1
MDLoop1:
LDA PortA,X
JSR SubWriteToPC
INX
DEY
BNE MDLoop1
LDY #(RamPtr0-080h)
MDLoop2FF:
TXA
JSR SubWriteToPC
INX
DEY
BNE MDLoop2FF
LDY #(100h-RamPtr0) ; X = #RamPtr0
MDLoop2:

; X = #RamPtrl+l


MLLoop4:

JSR SubReadFromPC

STA (ECWordl)

INC ECWordl+0 ifEQ INC ECWordl+1 LDA ECWordl+1 CMP #0C0h BEQ MLLoop4Exit fi

1JMP MLLoop4 MLLoop4Exit:

DecBankSelect

STI #040h,ECWordl+l STI #000h,ECWordl+0

MLLoopδ:

JSR SubReadFromPC

MLLoopδExit: LDA MDSave+0 STA ECWordl+0 LDA MDSave+1

STA ECWordl+1 STI #0F7h,HostContrl RTS

MDSave:

ORG $+2

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

READ FROM PC VIA INTERRUPT

HostInt:
PSH
LDA HostContrl
ifMI
LDX StorePtr
LDA W8250_TXD
STA InBuffer,X
INX
STX StorePtr
INX
CPX FetchPtr
ifNE
STI #01Fh,ln_stat
fi
fi
BBR 5,HostContrl,HostInt1
LDA w8250_LCR ; save host command info
STA HostLCR
HostInt1:
STI #007h,HostContrl
PUL
RTI
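The interrupt handler above appears to treat InBuffer as a 256-byte ring: it stores the incoming byte at StorePtr, advances the pointer with 8-bit wraparound, and only signals the host that another byte may be sent when one more store would not collide with FetchPtr. A minimal Python sketch of that scheme follows; the class and method names are assumptions for illustration.

```python
class RingBuffer:
    """256-entry ring buffer with 8-bit store and fetch pointers."""

    SIZE = 256

    def __init__(self):
        self.data = bytearray(self.SIZE)
        self.store = 0                  # next slot to write (StorePtr)
        self.fetch = 0                  # next slot to read (FetchPtr)

    def put(self, byte):
        """Store one byte; return True if the sender may transmit another."""
        self.data[self.store] = byte
        self.store = (self.store + 1) % self.SIZE
        # Ready is withheld when the following store would reach fetch.
        return (self.store + 1) % self.SIZE != self.fetch

    def empty(self):
        """Pointers equal means nothing is waiting, as in ECReadChar."""
        return self.fetch == self.store

    def get(self):
        """Remove and return the oldest byte."""
        if self.empty():
            raise IndexError("ring buffer is empty")
        byte = self.data[self.fetch]
        self.fetch = (self.fetch + 1) % self.SIZE
        return byte
```

Because the pointers are single bytes, the wraparound costs nothing on the target processor; the modulo in the sketch stands in for that natural overflow.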

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

ENCODER READ FROM PC

ECReadCharacter:

IF Test ;{
LDA BytesIn+0
CMP #050h
BNE ECReadChar
LDA BytesIn+1
CMP #002h
BNE ECReadChar
LDA BytesIn+2
CMP #000h
BNE ECReadChar
NOP ; set breakpoint here
ENDIF ;}
ECReadChar:
LDX FetchPtr
CPX StorePtr
ifEQ
BBR 1,HostLCR,ECReadChar ; HostLCR is read in interrupt
SMB 7,HostLCR
!JMP ECReadCharExit ; bit 1 set when EOF and ptrs =
; flush input when Decoder
; has FailedSafe
; A = char from PC
ECReadCharExit:
RTS

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

ENCODER WRITE TO PC

ECWriteCharacter:
PSH
IF Test ;{
LDA BytesOut+0
CMP #06Bh
BNE ECWriteSearched
LDA BytesOut+1
CMP #001h
BNE ECWriteSearched
LDA BytesOut+2
CMP #000h
BNE ECWriteSearched
NOP ; set breakpoint here
ECWriteSearched:
ENDIF ;}
BBR 6,HostLCR,ECWriteChar
LDX OutStore
LDA ECBuffer
STA OutBuffer,X
INX
STX OutStore
INX
CPX OutFetch
ifEQ
JSR SwitchToDecode
fi
!JMP ECWriteCont
ECWriteChar:
BBS 0,HostLCR,ECWriteCont
BBS 0,ln_stat,ECWriteChar
BBS 0,ln_stat,ECWriteChar ; twice for SPERRY et al
LDA ECBuffer
STA W8250_RXD
BBS 6,ln_stat,ECWriteTSRE
STI #03Eh,ln_stat ; set 0, leave 6 at 0
!JMP ECWriteCont
ECWriteTSRE:
STI #07Eh,ln_stat ; set 0, leave 6 at 1
ECWriteCont:
PUL
STI #001h,ECBuffer
INC BytesOut+0
ifEQ
INC BytesOut+1
ifEQ
INC BytesOut+2
fi
fi
RTS

* * * * * * * * * * * * * * * * * * * * * * * * * * * * *

DECODER READ FROM PC

DCReadCharacter:
PSH
IF Test ;{
LDA BytesIn+0
CMP #0A0h
BNE DCReadChar
LDA BytesIn+1
CMP #005h
BNE DCReadChar
LDA BytesIn+2
CMP #002h
BNE DCReadChar
NOP ; set breakpoint here
ENDIF ;}
DCReadChar:
BBR 6,HostLCR,DCReadCharNLB
LDX OutFetch
CPX OutStore
ifEQ
JSR SwitchToEncode
IJMP DCReadChar
fi
INC OutFetch
LDA OutBuffer,X ; A = char from Encoder
STA DCBuffer
IJMP DCReadCharExit
DCReadCharNLB:
LDX FetchPtr
CPX StorePtr
ifEQ
BBR 1,HostLCR,DCReadChar ; HostLCR is read in interrupt
SMB 7,HostLCR ; NOTE: not a normal EOF
IJMP DCReadCharExit ; bit 1 set when EOF and ptrs =
fi
BBS 5,ln_stat,DCReadCharLS
STI #01Fh,ln_stat
DCReadCharLS:
INC FetchPtr
INC BytesIn+0
ifEQ
INC BytesIn+1
ifEQ
INC BytesIn+2
fi
fi
LDA ECCommand ; flush input when Decoder
BMI DCReadChar ; has FailedSafe
LDA InBuffer,X ; A = char from PC
STA DCBuffer
DCReadCharExit:
PUL
RTS

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

DECODER WRITE TO PC

DCWriteCharacter:

IF Test ;{

PHA

LDA BytesOut+0

CMP #027h

BNE DCWriteSearched

LDA BytesOut+1

CMP #017h

BNE DCWriteSearched

LDA BytesOut+2

CMP #000h

BNE DCWriteSearched

NOP ; set breakpoint here

DCWriteSearched:

PLA
ENDIF ;}
DCWriteChar:
BBS 0,HostLCR,DCWriteCont
BBS 0,ln_stat,DCWriteChar
BBS 0,ln_stat,DCWriteChar ; twice for SPERRY et al
STA W8250_RXD
BBS 6,ln_stat,DCWriteTSRE
STI #03Eh,ln_stat ; set 0, leave 6 at 0
IJMP DCWriteCont
DCWriteTSRE:
STI #07Eh,ln_stat ; set 0, leave 6 at 1
DCWriteCont:
STI #001h,ECBuffer
BBS 6,HostLCR,DCWriteCharNLB
INC BytesOut+0
ifEQ
INC BytesOut+1
ifEQ
INC BytesOut+2
fi
fi
DCWriteCharNLB:
RTS

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

ENDIF ;} end of loopback enc/dec routines

************* E N C O D E R M A C R O S **************

Write17 MACRO
IF Macros
MSWrite17
ELSE
JSR MSWrite17
ENDIF
ENDM

Write8 MACRO
IF Macros
MSWrite8
ELSE
JSR MSWrite8
ENDIF
ENDM

Write817 MACRO
IF Macros
MSWrite817
ELSE
JSR MSWrite817
ENDIF
ENDM

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

WRITE 1-7 BITS PER GUARD BIT POSITION

IF Macros
MSWrite17 MACRO
LOCAL Write17Loop,Write17Exit
ELSE
MSWrite17:
ENDIF
Write17Loop:
ASL A
BEQ Write17Exit
ROL ECBuffer
BCC Write17Loop
JSR ECWriteCharacter
IJMP Write17Loop
Write17Exit:
IF Macros
ENDM
ELSE
RTS
ENDIF
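The routine above relies on two sentinel bits. The accumulator holds 1 to 7 data bits left-justified and followed by a single guard bit; shifting left until the accumulator becomes zero means the bit just shifted out was the guard, so the loop stops. ECBuffer starts at 001h, and that 1 reaches the carry after exactly eight data bits have been rotated in, which triggers the byte write. The Python sketch below models this; the class name and the `out` list standing in for ECWriteCharacter are assumptions.

```python
class BitWriter:
    """Bit packer using a guard bit in the input and a sentinel in the buffer."""

    def __init__(self):
        self.buffer = 0x01              # sentinel marks how many bits are pending
        self.out = bytearray()          # stands in for ECWriteCharacter output

    def _push_bit(self, bit):
        self.buffer = (self.buffer << 1) | bit      # ROL ECBuffer
        if self.buffer & 0x100:                     # sentinel reached the carry
            self.out.append(self.buffer & 0xFF)     # a full byte is ready
            self.buffer = 0x01                      # STI #001h,ECBuffer

    def write17(self, a):
        """Write the 1-7 bits that sit above the lowest set (guard) bit of a."""
        a &= 0xFF
        while True:
            carry = (a >> 7) & 1
            a = (a << 1) & 0xFF                     # ASL A
            if a == 0:                              # only the guard was left
                return
            self._push_bit(carry)
```

For example, `write17(0b10110000)` emits the three bits 1, 0, 1, because the guard is the 1 in bit 4. No separate length argument is needed, which saves a register on the target processor.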

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

WRITE ONE BYTE OF BITS

IF Macros
MSWrite8 MACRO
LOCAL Write8Loop,Write8Skip,Write8Exit
ELSE
MSWrite8:
ENDIF
ASL A
ORA #001h
IJMP Write8Skip
Write8Loop:
ASL A
BEQ Write8Exit
Write8Skip:
ROL ECBuffer
BCC Write8Loop
JSR ECWriteCharacter
IJMP Write8Loop
Write8Exit:
IF Macros
ENDM
ELSE
RTS
ENDIF

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

WRITE ONE+ BYTE(S) OF BITS

IF Macros
MSWrite817 MACRO
LOCAL Write817Loop1,Write817Skip,Write817Exit1,Write817Loop2,Write817Exit2
ELSE
MSWrite817:
ENDIF
ASL A
ORA #001h
IJMP Write817Skip
Write817Loop1:
ASL A
BEQ Write817Exit1
Write817Skip:
ROL ECBuffer
BCC Write817Loop1
JSR ECWriteCharacter
IJMP Write817Loop1
Write817Exit1:
TXA
Write817Loop2:
ASL A
BEQ Write817Exit2
ROL ECBuffer
BCC Write817Loop2
JSR ECWriteCharacter
IJMP Write817Loop2
Write817Exit2:
IF Macros
ENDM
ELSE
RTS
ENDIF

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

SET POINTER TO NCChar,NCFreq TABLES

SetCharFreq MACRO ED,BW ; NCFreqSets =
; HIGH(base of NCChar)
ADD #001h
STA ED&Word2+1 ; Word2+1 = HIGH(base of NCFreq)
STI #000h,ED&Word1+0
STI #000h,ED&Word2+0
ENDIF
IF "&BW" EQ "W1"
ASL A ; HIGH((0-3)*512)
ADD #HIGH(ED&NCChar)
STA ED&Word1+1 ; Word1+1 = HIGH(base of NCChar)
ENDIF
IF "&BW" EQ "W2"
ASL A ; HIGH((0-n)*512)
ADD #(HIGH(ED&NCChar)+1)
STA ED&Word2+1 ; Word2+1 = HIGH(base of NCFreq)
ENDIF
ENDM

;********** F O N T U P D A T E M A C R O **********

FONT UPDATE MACRO

IF EC, Y = ECNextChar

FontUpdate MACRO XX,YY LOCAL FontlstUse,FontActive,ECNC8Bit,ECNCGlobal,ECNCCoded

LDA XX&CurrentHash+l HIGH of prior hash IF "&XX" EQ "DC" ;{

BPL FontActive

JMP DCNewCharacter ELSE ;{}

STA ECFontBase+1 (stored as * 2)

LDA ECCurrentHash+O ; LOW of prior hash

ASL A

ROL ECFontBase+1 now = * 4

ASL A

ROL ECFontBase+1 ; now = * 8

IF FontSize EQ 16 ASL A ROL ECFontBase+1 now = * 16

ENDIF

STA ECFontBase+0

TAX ; save for PHX

LDA ECCurrentHash+1 HIGH of prior hash BPL FontActive

FontlstUse: IF "&YY" EQ "FU" { LDA ECFontBase+1 ADD #HIGH(ECFontTables) STA ECFontBase+1 LDA #000h STA ECCharacters STA ECNCIndex

ENDIF ;}

LDA #000h APPENDIX 1

STA ECNewIndex,Y ; zero when new font IF "&YY" EQ "FU" ;{

JMP ECNewCharacter ; Characters = 0 ELSE ;{} JMP ECNewCharCommand

ENDIF ;}

FontActive:

IF "&XX" EQ "DC" ;{ LDA DCFontIndex ifPL ;{ LDA DCFontlndex STI #0δOh,DCFontlndex CMP DCNCIndex BCC DCKCharLTNCIndex

BEQ DCKCharEQNCIndex DCKCharGTNCIndex: IJMP DCCharSwap DCKCharEQNCIndex: INC . DCNCIndex bump NCIndex

JMP DCFontUpdated

DCKCharLTNCIndex:

AND #0FFh

BNE DCCharSwap JMP DCFontUpdated els ;{} LDA DCFontBase+1 LDX DCFontBase+0 PHA push address back on stack

PHX and pull to I PLI LAN NCIndex or CharsNCIndex IF TwoBytes ;{

LAN Characters ENDIF ;} APPENDIX 1

fi ?} ELSE ;{}

LDA ECFontBase+1 ADD #HIGH(ECFontTables) STA ECFontBase+1 PHA push address back on stack PHX and pull to I PLI LAN NCIndex or CharsNCIndex IF TwoBytes ;{

Characters

W - CharsNCIndex

; NCIndex in bits

3-0

; Characters in bits

7-4

; +1 if δ-bit active

= # of font indices (base 0)

LDA EncodingTable,X A = FontCode(Bits) offset

STA ECBytel ; Bytel = EncodingTable base ptr

ADD ECNCIndex ; Bytel + NCIndex STA ECNewIndex,Y r I = ptr to 1st

X # of characters to

; W = A = character le index, base 0

(= character encoding index

BCC XX&CharLTNCIndex if < NCIndex) BEQ XX&CharEQNCIndex

XX&CharGTNCIndex:

IF "&XX" EQ "EC" ADD XX&ABStatus ; 0 or 1 ADD #002h ; A = character encoding index

ADD XX&Bytel + character table index

STA ECFontlndex,Y ; W = A = character table index

TWA for table swap ENDIF IJMP XX&CharSwap APPENDIX 1

XX&CharEQNCInde :

ADD XX&Bytel + character table index STA ECFontlndex,Y ENDIF

INC XX&NCIndex ; bump NCIndex IJMP XX&FontEncoding

XX&CharLTNCIndex:

IF "&XX" EQ "EC" ADD XX&Bytel ; A = character encoding index STA ECFontlndex,Y ; + character table index

TWA ; W = character table index ELSE

AND #0FFh ENDIF BEQ XX&FontEncoding ; no swap if already index 0 XX&CharSwap: A = character index. base 0

ADD #(CharTable-l) ; ptr to previous character

TAX

LDA (XX&FontBase),X INX

STA (XX&FontBase) ,X LDA XX&CurrentChar

DEX

STA (XX&FontBase) ,X XX&FontEncoding:

IF "&XX" EQ "DC" JMP DCFontUpdated

ELSE

LDA #002h

STA ECType,Y ; Type 2 - normal font encoding

BBS 0,ECABStatus,ECFontSaveEight 5 JMP ECFontUpdated

ECFontSaveEight:

SetCharFreq XX,W2 ; output is NC frequency value

LDA ECCurrentChar ; save for consistent 0 'strings

STA ECWord2+0 ; off* ouput write code LDA (ECWord2) STA ECFrequency,Y JMP ECFontUpdated ENDIF

XX&NewCharacter:
LDA XX&NCIndex
ifNE
DEC XX&NCIndex
fi
SetCharFreq XX,WB
LDX XX&CurrentChar
LDA (XX&Word2),X ; frequency of current character
STA XX&CurrentFreq
BEQ XX&NewCharOK
LDX XX&Byte3 ; 0-n where n = 0, 3, 7 or 15
LDA XX&GlobalHigh,X
CMP XX&CurrentFreq
BCS XX&NewCharSwap ; CurrentFreq <= GlobalHigh
XX&NewCharExchange:
INC XX&GlobalHigh,X
TAX ; X = high frequency
LDA (XX&Word1),X
TAW ; W = high character
LDA XX&CurrentChar
STA (XX&Word1),X ; current char > high char
TXA
LDX XX&CurrentChar
STA (XX&Word2),X ; high freq > char freq
LDX XX&CurrentFreq
TWA
STA (XX&Word1),X ; high char > current char
TAX
LDA XX&CurrentFreq
STA (XX&Word2),X ; current freq > high freq
IJMP XX&NewCharOK
XX&NewCharSwap:
LDX XX&CurrentFreq
DEX ; X = lower freq
LDA (XX&Word1),X
TAW ; W = lower char
LDA XX&CurrentChar
STA (XX&Word1),X ; current char > lower char
TXA
LDX XX&CurrentChar
STA (XX&Word2),X ; lower freq > char freq
LDX XX&CurrentFreq
TWA
STA (XX&Word1),X ; lower char > current char
TAX
LDA XX&CurrentFreq
STA (XX&Word2),X ; current freq > lower freq
XX&NewCharOK:

LDA XX&Characters
CMP #CharsPerFont
BEQ XX&NewCharOverflow ; check for font table full
INC XX&Characters ; if not, add to char count
ADD #001h
XX&NewCharOverflow:
ADD #(CharTable-1)
TAX
LDA XX&CurrentChar ; store current char in font
STA (XX&FontBase),X
LDX XX&CurrentFreq ; X = ECCurrentFreq
LDA GlobalBits,X
ADD XX&NCBitsNew+0
STA XX&NCBitsNew+0 ; update NC trending total
ifCS
INC XX&NCBitsNew+1
fi
ELSE ;{}
ECNewCharCommand:
LDX #0FFh ; X = ECCurrentFreq for command
ENDIF ;}
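The NewCharSwap and NewCharExchange paths in the new-character code above appear to keep the NCChar table (position to character) and the NCFreq table (character to position) in step while moving the current character toward the front. When the character already sits at or inside the GlobalHigh boundary, it trades places with its immediate predecessor; otherwise it jumps to the boundary position and the boundary grows by one. The Python sketch below follows that reading. The function name is hypothetical, and "position" is used for what the listing calls frequency.

```python
def promote(nc_char, nc_freq, ch, global_high):
    """Move ch toward the front of the ranking; return the new global_high.

    nc_char[position] -> character, nc_freq[character] -> position.
    Both lists are updated in place so they remain inverse to each other.
    """
    position = nc_freq[ch]
    if position == 0:                   # already first (BEQ NewCharOK)
        return global_high
    if global_high >= position:         # NewCharSwap: step up one place
        target = position - 1
    else:                               # NewCharExchange: jump to the boundary
        target = global_high
        global_high += 1
    other = nc_char[target]
    nc_char[target], nc_char[position] = ch, other
    nc_freq[ch], nc_freq[other] = target, position
    return global_high
```

Moving one place at a time inside the boundary is a transposition heuristic: it adapts slowly but never lets a single rare character displace a frequent one by more than one place.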

IF "&XX" EQ "EC"
BBR 0,ECABStatus,ECNCGlobal
ECNC8Bit:
TXA ; output is NC frequency value
STA ECFrequency,Y ; save for consistent strings
IJMP ECNCCoded ; off* output write code
ECNCGlobal:
TXA ; global index for write
STA ECFontIndex,Y
ECNCCoded: ; prod/commands do not affect
LDA #004h ; any of the trending totals,
STA ECType,Y ; tables or hashes
ENDIF

IF "&YY" EQ "FU" ;{
XX&NCTrending:
DEC XX&NCCounter
BNE XX&FontUpdated
STI #NC8BitCycle,XX&NCCounter
LDY #000h ; Word1 = NCBitsPrior + NCBitsNew
LDX XX&NCBitsNew+0
TXA
ADD XX&NCBitsPrior+0 ; NCBitsPrior set to NCBitsNew
STA XX&Word1+0
STX XX&NCBitsPrior+0
STY XX&NCBitsNew+0 ; NCBitsNew set to 0
LDX XX&NCBitsNew+1
TXA
ifCS
ADD #001h ; if low order carry
fi
ADD XX&NCBitsPrior+1
STA XX&Word1+1
STX XX&NCBitsPrior+1
STY XX&NCBitsNew+1
BBR 0,XX&ABStatus,XX&NCOff
XX&NCOn:
LDA #HIGH(NC8BitCycle*15)
CMP XX&Word1+1
BCC XX&FontUpdated ; HIGH(Word1) > A
BNE XX&TurnNCOff ; HIGH(Word1) < A
LDA #LOW(NC8BitCycle*15)
CMP XX&Word1+0
BCC XX&FontUpdated
XX&TurnNCOff:
STI #000h,XX&ABStatus
IF "&XX" EQ "EC"
STI #001h,ECABChange
IF Test
INC SwitchToA+0
ifEQ
INC SwitchToA+1
fi
ENDIF
ENDIF
IJMP XX&FontUpdated
XX&NCOff:
LDA #HIGH(NC8BitCycle*15)
CMP XX&Word1+1
BCC XX&TurnNCOn ; HIGH(Word1) > A
BNE XX&FontUpdated ; HIGH(Word1) < A
LDA #LOW(NC8BitCycle*15)
CMP XX&Word1+0
BCS XX&FontUpdated
XX&TurnNCOn:
IF "&XX" EQ "EC"
STI #001h,ECABStatus
STI #001h,ECABChange
IF Test
INC SwitchToB+0
ifEQ
INC SwitchToB+1
fi
ENDIF
ELSE
STI #001h,DCABStatus
ENDIF

IF NCFreqSetsReset
LDA #NCFreqSetsHigh
LDX #NCFreqSets
XX&ResetGlobalHigh:
STA XX&GlobalHigh-1,X
DEX
BNE XX&ResetGlobalHigh
ENDIF
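The NCTrending code above seems to run once every NC8BitCycle new characters. It adds the bit cost accumulated over the current cycle to that of the previous cycle and compares the 16-bit sum with NC8BitCycle*15. Both the NCOn and NCOff branches reduce to the same rule: bit 0 of ABStatus ends up set exactly when the sum exceeds the threshold, that is, when new characters have averaged more than 7.5 bits each over the last two cycles. The Python sketch below expresses that rule. The function name is hypothetical, and the value of NC8BitCycle is not given in this section, so the caller must supply it.

```python
def nc_trending_update(bits_prior, bits_new, cycle):
    """Evaluate the trending test at the end of one cycle.

    Returns (status_on, next_bits_prior, next_bits_new), where status_on
    corresponds to bit 0 of ABStatus in this reading of the listing.
    cycle stands for NC8BitCycle, whose value is an input assumption.
    """
    total = (bits_prior + bits_new) & 0xFFFF    # 16-bit sum held in Word1
    status_on = total > cycle * 15
    return status_on, bits_new, 0               # prior := new, new := 0
```

With a purely illustrative cycle of 32, the threshold is 480, and the initial prior total of 32*8 = 256 set by the initialization code means the first cycle alone must cost more than 224 bits to turn the status on.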

XX&FontUpdated:
IF TwoBytes
LDA XX&NCIndex
STA (XX&FontBase)
LDX #001h
LDA XX&Characters
STA (XX&FontBase),X
ELSE
LDA XX&Characters
ASL A
ASL A
ASL A
ASL A
ORA XX&NCIndex
STA (XX&FontBase)
ENDIF
XX&PlusHash:

LDY XX&Char1Prior
STY XX&Char2Prior
LDX CRC_TH,Y
LDA XX&CurrentChar
STA XX&Char1Prior
NEG A ; extra NEG over 1st try ?????
EOR CRC_TL,Y
TAY
TXA
EOR CRC_TL,Y
TAW ; W = LOW(rough hash)
IF "&XX" EQ "EC"
LDX ECNextChar ; ECHashRaw is bits 15-0 of
STA ECHashRaw0,X ; the CRC
ENDIF
LDA CRC_TH,Y ; A = HIGH(rough hash)
IF "&XX" EQ "EC"
STA ECHashRaw1,X
ENDIF
STA XX&Word1+1
AND #HIGH(MatchMask)
STA XX&Byte1 ; Byte1 = match bits
TWA
ASL A
ROL XX&Word1+1
STA XX&Word1+0
LDA XX&Word1+1
AND #HIGH(NextMask)
ADD #HIGH(XX&FTHashRough) ; Word1 = ptr to
STA XX&Word1+1 ; FTHashRough
LDX #001h ; X = 1 for all of PlusHash
XX&PlusFineLoop:

LDA (XX&Wordl) ,X ; direct ptr, can't be 0

BEQ XX&PlusNewHash

TAY ; Wordl may be either Rough

LDA (XX&Wordl) ; or Next; is always the AND #0FEh ; predecessor to new hash

STA XX&Wordl+O

TYA

ADD #(HIGH(FTHashMatch)-HIGH(XX&FTHashNext) ) STA XX&Wordl+l ; Wordl = ptr to

IF "&XX" EQ "EC" ; FTHashMatch

LDA (ECWordl),X ELSE ; HashMatch values are inter-

LDA (DCWordl) ; mixed Decoder/Encoder ENDIF

CMP XX&Bytel APPENDIX 1

BEQ XX&PlusFineFound STY XX&Wordl+l Wordl •= ptr to IJMP XX&PlusFineLoop FTHashNext XX&PlusFineFound: TYA

STA XX&Wordl+l ADD #(0-HIGH(XX&FTHashNext)) STA XX&CurrentHash+l IF "&XX" EQ "EC" LDY ECNextChar

STA ECHashX21,Y ENDIF

LDA XX&Wordl+O STA XX&CurrentHash+O IF "&XX" EQ "EC"

STA ECHashX20,Y ENDIF

JMP XX&PlusHashExit XX&PlusNewHash: Bytel = match bits LDY XX&FTLastHash+l Wordl = rough hash

* 2

BEQ XX&PlusNewSearch XX&PlusNewlstPass:

LDA XX&FTLastHash+O LastHash initialized to 2

ADD #002h + HIGH(FTHashNext) STA XX&FTLastHash+O BNE XX&PlusNewlstCont INY CPY #(HIGH(XX&FTHashNext)+HIGH(FontTables*2)) ifEQ STI #000h,XX&FTLastHash+l ; 0 when wrapped to force search

IJMP XX&PlusNewSearch els

STY XX&FTLastHash+l APPENDIX 1

fi XX&PlusNewlstCont:

STA XX&Word2+0 Word2 = ptr to new's IF "&XX" EQ "DC" FTHashMatch

STA DCWord3+0 ENDIF Word3 = ptr to new's TYA DCFTHashChars

STA (XX&Wordl) ,X

ADD #(HIGH(FTHashMatch)-HIGH(XX&FTHashNext)) STA XX&Word2+l IF "&XX" EQ "EC"

ADD #(0-HIGH(FTHashMatch) ) ELSE

ADD #(HIGH(DCFTHashChars)-HIGH(FTHashMatch) )

STA DCWord3+l

ADD #(0-HIGH(DCFTHashChars)) ENDIF

ORA #080h ; set bit 7 for new hash

STA XX&CurrentHash+l IF "&XX" EQ "EC"

LDY ECNextChar store new hash in

ECHashX2

STA ECHashX21,Y or in DCCurrentHash ENDIF

LDA XX&FTLastHash+O STA (XX&Wordl) STA XX&CurrentHash+O IF "&XX" EQ "EC"

STA ECHashX20,Y ELSE

LDA DCChar2Prior ; store prior/current chars

STA (DCWord3) ; in DCFTHashChars LDA DCCharlPrior STA (DCWord3),X ENDIF APPENDIX 1

LDA XX&Bytel IF "&XX" EQ "EC" STA (ECWord2) ,X ; HashMatch values are inter- ELSE mixed Decoder/Encoder

STA (DCWord2) ENDIF

JMP XX&PlusHashExit XX&PlusNewSearch: LDA (XX&FTParent) ,X ; initialized to

FTHashRough

BNE XX&PlusNewParent XX&PlusNewRoughAdvance:

LDA XX&FTNextRough+O initialized to FTHashRough

ADD #002h

STA XX&FTNextRough+O ifEQ INC XX&FTNextRough+l ; memory allocation dependent

LDY #HIGH(XX&FTHashNext) ; i.e. FTHashRough table must

CPY XX&FTNextRough+l ; be right before FTHashNext ifEQ STI #HIGH(XX&FTHashRough) ,XX&FTNextRough+l fi fi

LDA (XX&FTNextRough) ,X BEQ XX&PlusNewRoughAdvance

LDY XX&FTNextRough+O STY XX&FTParent+O LDY XX&FTNextRough+l STY XX&FTParent+l XX&PlusNewParent:

STA XX&FTChild+1

LDA (XX&FTParent) STA XX&FTChild+O

TAY ; Y = FTChild+0

IJMP XX&PlusNewNext2Found XX&PlusNewNextAdvance:

LDA XX&FTChild+l STA XX&FTParent+l LDA XX&FTChild+O STA XX&FTParent+O IJMP XX&PlusNewSearch

XX&PlusNewNext2Found:

CPY XX&Wordl+O ifEQ LDA XX&FTChild+l CMP XX&Wordl+l

BEQ XX&PlusNewNextAdvance fi

LDA (XX&FTChild) STA (XX&FTParent) LDA (XX&FTChild) ,X

STA (XX&FTParent) ,X

STY XX&Word2+0 ; Word2 = ptr to new's IF "&XX" EQ "DC" ; FTHashMatch STY DCWord3+0 ENDIF ; Word3 = ptr to new's

TYA ; DCFTHashChars

STA (XX&Wordl) LDA XX&FTChild+l STA (XX&Wordl),X ADD #(HIGH(FTHashMatch)-HIGH(XX&FTHashNext) )

STA XX&Word2+l IF "&XX" EQ "EC"

ADD #(0-HIGH(FTHashMatch) ) ELSE ADD #(HIGH(DCFTHashChars)-HIGH(FTHashMatch) )

STA DCWord3+l APPENDIX 1

ADD #(0-HIGH(DCFTHashChars)) ENDIF

ORA #080h ; set bit 7 for new hash

STA XX&CurrentHash+l IF "&XX" EQ "EC"

LDY ECNextChar ; store new hash in

ECHashX2

STA ECHashX21,Y ; or in DCCurrentHash ENDIF

LDA XX&Word2+0 STA XX&CurrentHash+O IF "&XX" EQ "EC"

STA ECHashX20,Y ELSE

LDA DCChar2Prior ; store prior/current chars

STA (DCWord3) in DCFTHashChars LDA DCCharlPrior STA (DCWord3),X ENDIF

LDA XX&Bytel IF "&XX" EQ "EC" STA (ECWord2),X ; HashMatch values are inter-

ELSE ; mixed Decoder/Encoder

STA (DCWord2) ENDIF LDA #000h STA (XX&FTChild) ,X STA (XX&FTChild) XX&PlusHashExit:

IF "&XX" EQ "DC";{

LDA DCCurrentHash+0 LOW of prior hash

STA DCFontBase+0 LDA DCCurrentHash+1 HIGH of prior hash

AND #07Fh APPENDIX 1

CLC

ROL DCFontBase+0

ROL A ; now * 4

ROL DCFontBase+0

ROL A ; now = * 8

IF FontSize EQ 16;{

ROL DCFontBase+0

ROL A now = * 16 ENDIF ;}

ADD #HIGH(DCFontTables) STA DCFontBase+1 LDA DCCurrentHash+1 HIGH of prior hash ifMI

LDA #000h new font

STA DCCharacters

STA DCNCIndex els

LDA (DCFontBase) old font

IF TwoBytes ;{

NCIndex

Characters

W = CharsNCIndex

NCIndex in bits 3-0

Characters in bits 7-4

ENDIF ;}
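When TwoBytes is not set, the font header is a single byte with Characters in bits 7-4 and NCIndex in bits 3-0, as the comments above state and as the four ASL A instructions followed by ORA in FontUpdated confirm. A short Python sketch of the packing and unpacking follows; the function names are assumptions for illustration.

```python
def pack_font_header(characters, nc_index):
    """Combine Characters (bits 7-4) and NCIndex (bits 3-0) into one byte."""
    if not (0 <= characters <= 15 and 0 <= nc_index <= 15):
        raise ValueError("both fields must fit in four bits")
    return (characters << 4) | nc_index


def unpack_font_header(header):
    """Split a packed header byte back into (characters, nc_index)."""
    return header >> 4, header & 0x0F
```

The single-byte form limits each field to 15, which is presumably why the listing also provides the TwoBytes variant that stores the two fields in separate bytes.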

*************** E N C O D E R R E F I L L **************

ENCODER REFILL

ECRefill:
IF Prodder
DEC ProdCounter
ifEQ
STI #ProdCycle,ProdCounter
STI #0A0h,ECCommand
JMP ECProdCommand
fi
ENDIF
JSR ECReadCharacter ; A = char from PC
IF EOFControl
BBR 7,HostLCR,ECRefillRepeats
STI #0C8h,ECCommand
JMP ECProdCommand
ENDIF
ECRefillRepeats:
IF Repeats ;{
CMP ECChar2Prior
BNE ECRefillUpdate
CMP ECChar1Prior
BNE ECRefillUpdate
LDY ECRepeatCount ; 3 in a row are equal
BEQ ECRefill1stRepeat
INC ECRepeatCount
BEQ ECRefill256thRepeat ; ECRepeats = 100h
IJMP ECRefill
ECRefill1stRepeat:
STI #001h,ECRepeatCount
IJMP ECRefill
ECRefill256thRepeat:
STI #0FFh,ECRepeatCount
ENDIF ;}

ECRefillUpdate:

LDY ECNextChar IF Repeats LDX ECRepeatCount ECRepeatCount = 0 to OFFh

repeat character S

swapped with new character fi ENDIF

STA ECChar,Y ; A = new character STA ECCharCopy ,Y STA ECCurrentChar IF Test LDA ECABStatus ifEQ INC AStringsOn+0 ifEQ

INC AStringsOn+1 fi els INC BStringsθn+0 ifEQ

INC BStringsOn+1 fi APPENDIX 1

fi ENDIF

FontUpdate EC,FU
LDA ECABChange ; either condition requires
ORA ECCommand ; flushing the ECChar buffer
ifEQ ;{
INC ECAvailable
ifEQ ; 256 characters available
LDY ECABStatus
ifEQ
JSR StringATime
els
JSR StringBTime
fi
fi
els
STA ECFlush
INC ECAvailable
LDA ECABChange
ifNE ; if ABStatus change, this pass

EOR ECABStatus ; is cleaning up remaining ifEQ ; characters from prior status

APPENDIX 1

ifEQ JSR StringATime els JSR StringBTime fi fi

STI #00Oh,ECABChange STI #000h,ECFlush LDA ECCommand BEQ ECRefillReturn

RTS fi } ECRefillReturn:

INC ECNextChar ; INC here to avoid ECChar

IF Repeats ; buffer advance on LDA ECRepeatCount ; prods/commands ifNE STI #00Oh,ECRepeatCount LDA ECCharSave use saved character which JMP ECRefillRepeats forced repeat output fi ENDIF

JMP ECRefill

ECProdCommand:

LDA ECAvailable ifNE

STI #OFFh,ECFlush LDY ECABStatus ifEQ

JSR StringATime els

JSR StringBTime APPENDIX 1

fi

STI #OOOh,ECFlush fi

IF Repeats LDY ECRepeatCount ifNE

LDA ECCharlPrior ; set up repeat character

JSR ECRefillUpdate ; as new character STI #OOOh,ECRepeatCount

INC ECNextChar fi ENDIF

LDY ECNextChar FontUpdate EC,PC

LDA ECABStatus ifNE JMP ECProdB fi ECProdA:

LDX ECNewIndex,Y ifEQ

JMP ECProdANCNF fi LDA FontCode,X

Writel7 ECProdANCOF:

LDA GlobalCodeHigh+255 LDX GlobalCodeLow+255 BEQ ECProdANCOFHigh

Writeδ17

JMP ECProdShift ECProdANCOFHigh: Writel7 JMP ECProdShift

ECProdANCNF: plain OFFh

ECProdB:

LDA ECAntiEStatus BMI ECProdNoStrings ELSE

ECProdB:

ENDIF LDX ECNewIndex,Y

BEQ ECProdBNF

LDA FontCode,X

Writel7

LDA #0FFh FFh = 11111111 Writeδ

IJMP ECProdShift ECProdBNF:

LDA #0BFh ; FFh = 101111111

LDX #0C0h Writeδ17 ECProdShift:

LDA ECCommand

STI #000h,ECCommand

Writel7 ECProdShiftLoop:

LDA ECBuffer APPENDIX 1

CMP #001h ifEQ

JMP ECProdDone fi LDA #04Oh

Writel7

IJMP ECProdShiftLoop ECProdDone:

IF EOFControl BBR 7,HostLCR,ECToRefill

BBR 6,HostLCR,ECProdNLB JSR SwitchToDecode ECProdNLB:

JMP DCOrECEOF ECToRefill:

JMP ECRefill ELSE RTS ENDIF

************ A - S T R I N G M A C R O S **************

A-STRING HASH HEAD AND SEARCH MACRO

FindAString MACRO ; A = 1st ECChar ptr

LOCAL FindABackOK,FindAMoreLoop,FindAUpdate,FindASkip,FindABackMatch,FindABackLoop,FindAReturn

STA ECBytel STI #003h,ECByte3

ADD #001h

TAX ; Y = ECNextOut+ECFind4s+l

ADD #002h

TAY ; Y = ECNextOut+ECFind4s+3 LDA ECHashRawO,X

NEG A APPENDIX 1

EOR ECHashRawO,Y STA ECFindHash

STA ECWord3+0 ; Word3 = ptr to LDA ECHashRawl,X ; RRHashHead NEG A

EOR ECHashRawl,Y AND #HIGH(BufferHashes-l) ASL ECWord3+0 ROL A ADD #HIGH(ECRRHashHead)

STA ECWord3+l

LDX #001h ; Word2 = ptr to 1st

LDA (ECWord3) ,X ; RRBuffer location ifNE ; for this hash STA ECWord4+l

ADD #(HIGH(ECRRBuffer)-HIGH(ECRRHashLink) ) STA ECWord2+l

LDA (ECWord3) ; Word4 = ptr to 1st STA ECWord2+0 ; RRHashLink location STA ECWord4+0 ; for this hash

STI #MaximumASearches,ECWordl+1 IJMP FindABackMatch fi

IF Test ifEQ

INC FSNoHash+0 ifEQ INC FSNoHash+1 ifEQ INC FSNθHash+2 fi fi fi ENDIF IJMP FindAReturn

FindASkip: APPENDIX 1

DEC ECWordl+1 BEQ FindAReturn LDX #001h LDA (ECWord4) use RRHashLink to find TAY next RRHashLink and LDA (ECWord4),X ;; next RRBuffer offset BEQ FindAReturn STY ECWord4+0 STA ECWord4+l DecBankSelect LDA (ECWord4) EncBankSelect CMP ECFindHash BNE FindAReturn STY ECWord2+0 LDA ECWord4+l

ADD #(HIGH(ECRRBuffer)-HIGH(ECRRHashLink)) STA ECWord2+l

FindABackMatch:

LDX ECByte3 FindABackLoop:

LDA (ECWord2),X ; check byte at longest CMP (ECBytel) ,X ; string + 1 and work IF Test backwards to origin ifNE

INC FSSkips+0 ifEQ INC FSSkips+1 ifEQ

INC FSSkips+2 fi fi

IJMP FindASkip fi ELSE BNE FindASkip APPENDIX 1

ENDIF DEX

BNE FindABackLoop LDA (ECWord2) CMP (ECBytel)

IF Test BEQ FindABackOK INC FSSkips+0 ifEQ INC FSSkips+1 ifEQ

INC FSSkips+2 fi fi IJMP FindASkip

ELSE

BNE FindASkip ENDIF FindABackOK: IF Test Wordl = RRHashCount (+0)

INC FSSearches+0 ; Word2 = RRBuffer offset ifEQ ; Word3 = RRBuffer best string

INC FSSearches+1 ; Word4 = 1st RRHashLink ifEQ ; Bytel = 1st unmatched ECChar

INC FSSearches+2 ; Byte3 = best string length fi fi ENDIF

LDX ECByte3 FindAMoreLoop: INX ifNE APPENDIX 1

LDA (ECWord2),X CMP (ECBytel) ,X BEQ FindAMoreLoop els

LDX #0FFh fi FindAUpdate:

LDA ECWord2+0 STA ECWord3+0 Word3 = RRBuffer offset of LDA ECWord2+l best string STA ECWord3+l CPX ECMaxLength ifCC STX ECByte3 Byte3 = best string length

IJMP FindASkip fi LDA ECMaxLength string length at maximum

STA ECByte3

FindAReturn:

ENDM
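The search pattern used by the FindAString macro (follow a hash chain, visit a bounded number of candidates, keep the longest match, stop at a maximum length) can be illustrated in a higher-level language. The Python sketch below is an illustration only and not a transcription of the listing. It assumes a dictionary `head` in the role of RRHashHead, a list `link` in the role of RRHashLink, and a plain two-byte key in place of the listing's hash; the function names `build_chains` and `find_longest_match` are hypothetical.

```python
def build_chains(history: bytes, width: int = 2):
    """Index every `width`-byte prefix; head[key] is the newest position,
    link[pos] is the previous position with the same key (or -1)."""
    head = {}
    link = [-1] * len(history)
    for pos in range(len(history) - width + 1):
        key = history[pos:pos + width]
        link[pos] = head.get(key, -1)
        head[key] = pos
    return head, link


def find_longest_match(history, head, link, key, target,
                       max_searches, max_length):
    """Return (position, length) of the best match, or (-1, 0)."""
    best_pos, best_len = -1, 0
    pos = head.get(key, -1)
    while pos >= 0 and max_searches > 0:
        max_searches -= 1  # bounded effort, like MaximumASearches
        limit = min(max_length, len(target), len(history) - pos)
        # A candidate is only useful if it matches one byte further than
        # the best match so far, so that longer prefix is tested first.
        if limit > best_len and \
                history[pos:pos + best_len + 1] == target[:best_len + 1]:
            length = best_len + 1
            while length < limit and history[pos + length] == target[length]:
                length += 1
            best_pos, best_len = pos, length
            if best_len >= max_length:
                break  # string length at maximum
        pos = link[pos]  # older entry with the same key
    return best_pos, best_len
```

With `history = b"abcabcdabx"` and `target = b"abcdz"`, the chain for `b"ab"` visits positions 7, 3 and 0 in that order, and the sketch settles on position 3 with length 4.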

************* A - S T R I N G S E A R C H *************

SkipAStrings:
CMP #(MinimumAUpdate+1)
ifCS
JMP NoAFound
fi
LDY ECNextOut
INC ECNextOutSave
DEC ECAvailable
JMP WriteAEncodings ; Y = ECNextOut
StringATime:
LDY ECNextOut
STY ECNextOutSave
IJMP StringASearch1st
StringASearch:
LDY ECNextOut
StringASearch1st:
LDA ECAvailable
ifEQ
LDA #0FFh
els
CMP #007h
BCC SkipAStrings
fi
ADD #(0-003h)
STA ECMaxLength ; 255 - 3 is MaxLength
STI #0FFh,ECStringOrigin
STI #HIGH(ECChar),ECByte2
IF Test
INC FSEntries+0
ifEQ
INC FSEntries+1
ifEQ
INC FSEntries+2
fi
fi
ENDIF

FindA43:
LDA ECNextOut
ADD #003h
FindAString
LDY ECByte3 ; Y = string length
CPY #004h
BCC FindA42
LDA ECNextOut
STA ECByte1
LDX #002h ; X = ECOrigin - 1
FindA43Loop:
LDA ECWord3+0
ADD #0FFh
STA ECWord3+0
ifCC
LDA ECWord3+1
ADD #0FFh
CMP #HIGH(ECRRBuffer)
ifCC
ADD #HIGH(BufferSize)
fi
STA ECWord3+1
fi
LDA (ECWord3)
CMP (ECByte1),X
BNE FindA43Adjust
INY
DEX
BPL FindA43Loop
IJMP FindA43Done
FindA43Adjust:
INC ECWord3+0
ifEQ
INC ECWord3+1
fi
FindA43Done:
STY ECStringLength
INX
STX ECStringOrigin
LDA ECWord3+0
STA ECFound+0
LDA ECWord3+1
STA ECFound+1
SEC
IF ZoneTestA
LDA ECRRPtr+0
SBC ECWord3+0
ENDIF
LDA ECRRPtr+1
SBC ECWord3+1
AND #HIGH(BufferSize-1)
STA ECZone

FindA42:

tring length

a little more compression if string length governs ?????

X = ECOrigin - 1


INY
DEX
BPL FindA42Loop
IJMP FindA42Done
FindA42Adjust:
INC ECWord3+0
ifEQ
INC ECWord3+1
fi
FindA42Done:
INX
STX ECByte1 ; ECOrigin
SEC
IF ZoneTestA
LDA ECRRPtr+0
SBC ECWord3+0
ENDIF
LDA ECRRPtr+1
SBC ECWord3+1
AND #HIGH(BufferSize-1)
TAX
LDA ECStringOrigin
ifPL ;{
CPY ECStringLength
BCC FindA4Exit
ifEQ ;{
CPX ECZone
BCS FindA4Exit
fi ;}
fi ;}
STY ECStringLength
STX ECZone
LDA ECByte1
STA ECStringOrigin
LDA ECWord3+0
STA ECFound+0
LDA ECWord3+1
STA ECFound+1
FindA4Exit:
LDA ECStringOrigin
BMI NoAFound
LDA ECStringLength
CMP #MinimumAString
BCC NoAFound
StringAOverlap:
LDX ECRRPtr+1
LDA ECStringOrigin
ADD ECRRPtr+0
ifCS
INX
fi
SEC
SBC ECFound+0
STA ECWord1+0 ; Word1+0 = LOW(Diff)
TAW
TXA
SBC ECFound+1
AND #HIGH(BufferSize-1)
STA ECWord1+1
ifEQ ;{ ; Delta(ECWord1) < 256
LDA ECStringOrigin
ifNE ;{
ADD ECStringLength
STA ECByte4
TWA ; W = ECWord1+0 (saved later)
ifNE ;{
CMP ECByte4 ; UL = StringLength+StringOrigin
BCC NoAFound
fi ;}
fi
fi ;}
JMP ProcessAString
NoAFound:

IF AHashX2 XOR 1
LDA ECNextOut
ADD #MinimumAUpdate
STA ECNextOut
JMP ResetACharCounts
ELSE
LDA #MinimumAUpdate
NoAFoundHashX2:
STA ECByte4
JSR HashAX2 ; A = -1 or -2
ADD ECByte4
BMI NoAFoundHashX2Negative
BNE NoAFoundHashX2
NoAFoundHashX2Negative:
JMP ResetACharCounts
HashAX2: ; Y = index to reach
; ECNextOut+1 data items

2(11 length) + 10

3(001 length) + 10

ype 2


INX
DEC ECByte3
BNE HashAX2SumBits
HashAX2Null:
INC ECNextOut
LDA #(0-001h)
RTS
HashAX2OK:
LDX ECNextOut
LDY ECNextOut
INY
IF Test
INC AHashX2s+0
ifEQ
INC AHashX2s+1
fi
ENDIF
LDA #006h ; Type 6
STA ECType,X
LDA ECWord2+1
ASR A
ROR ECWord2+0
AND #003h
CLC
ROR A
ROR A
ROR A
ORA #020h
STA ECHashX21,Y ; ECHashX21 of 2nd character =
LDA ECWord2+0 ; 2 high-order hash bits
STA ECHashX20,Y
LDA ECNextOut ; ECHashX20 of 2nd character =
ADD #002h ; 8 low-order hash bits
STA ECNextOut
LDA #(0-002h)
RTS
ENDIF
ProcessAString:

IF AHashX2 XOR 1
LDA ECNextOut
ADD ECStringOrigin
STA ECNextOut
ELSE
LDA ECStringOrigin
IJMP ProcessABestX1st
ProcessABestXLoop:
ADD ECStringOrigin
STA ECStringOrigin
ProcessABestX1st:
CMP #002h
BCC ProcessABestX
JSR HashAX2 ; A = -1 or -2
IJMP ProcessABestXLoop
ProcessABestX:
AND #0FFh
ifNE
INC ECNextOut
fi
ENDIF
IF Test
INC AStringsFound+0
ifEQ
INC AStringsFound+1
fi
ENDIF

DirectAString:
LDY ECNextOut
LDA ECStringLength
STA ECByte3 ; Byte3 = StringLength
LDX ECNewIndex,Y
ifNE
IF AHashX2
ADD #(0-(MinimumAString-4))
ELSE
ADD #(0-MinimumAString)
ENDIF
STA ECByte4 ; Byte4 = Global length index
INX
LDA FontBits,X
els
ADD #(0-MinimumAString)
STA ECByte4 ; Byte4 = Global length index

011

01

A = total string

Type 2

; Type 4


SBC FontBits,X
LDX ECFontIndex,Y
SBC GlobalBits,X
els ;{}
LDX ECFontIndex,Y
SBC GlobalBits,X
TAW
LDA GlobalCodeHigh,X
ifPL ;{
TWA
SBC #001h
els ;{}
TWA
fi ;}
fi ;}
fi ;}
BMI DirectAUse
INY
DEC ECByte3
BNE DirectASumBits
DirectAReject:
LDA ECNextOut
ADD #MinimumAUpdate
STA ECNextOut ; try HashX2'S ?????
JMP ResetACharCounts

DirectAUse:
LDY ECNextOut
IF Test
INC AStringsUsed+0
ifEQ
INC AStringsUsed+1
fi
ENDIF
LDA #008h ; Type 8
STA ECType,Y
LDA ECByte4 ; Global or LengthB index
STA ECHashX20,Y ; saved for ECWrite
INY
LDA ECWord1+0 ; save Zone codes for ECWrite
STA ECHashX20,Y ; in 2nd character's ECHashX2
LDA ECWord1+1
STA ECHashX21,Y
LDA ECStringLength
ADD ECNextOut
STA ECNextOut
ResetACharCounts:
LDA ECAvailable
ADD ECNextOutSave ; update ECAvailable
SEC
SBC ECNextOut
STA ECAvailable
LDY ECNextOutSave ; interchange ECNextOut and
LDA ECNextOut ; ECNextOutSave
STA ECNextOutSave
STY ECNextOut ; Y = ECNextOut

************* A - S T R I N G O U T P U T *************

ENCODER WRITE ROUTINE

WriteAEncodings: ; Y = ECNextOut
LDA ECType,Y
ORA ECRepeatSW,Y ; bit 3 on if repeats
TAX
JMP (WriteAJumps),X
WriteAJumps:
IF AHashX2
DW WriteANull ; 0 - HashX2(2) - no repeats or repeats
ELSE
DW 0 ; 0 - HashX2 inactive
ENDIF
DW WriteA0Font ; 2 - Font char - no repeats
DW WriteA0NewChar ; 4 - New char - no repeats
IF AHashX2
DW WriteAHashX2 ; 6 - HashX2(1) - no repeats
ELSE
DW 0 ; 6 - HashX2 inactive
ENDIF
DW WriteAString ; 8 - String(1) - no repeats or repeats
IF Repeats
DW WriteA1Font ; 10 - Font char - repeats
DW WriteA1NewChar ; 12 - New char - repeats
IF AHashX2
DW WriteAHashX2 ; 14 - HashX2(1) - repeats
ELSE
DW 0 ; 14 - HashX2 inactive
ENDIF
ENDIF

IF AHashX2
WriteANull:
IF Repeats
LDA ECRepeats,Y
ifNE
JMP WriteA0Repeats
els
JMP UpdateA0Buffer
fi
ELSE
JMP UpdateA0Buffer
ENDIF
ENDIF
WriteA0Font:
LDX ECFontIndex,Y ; char encoding index - font
LDA FontCode,X
Write17
JMP UpdateA0Buffer
IF Repeats
WriteA1Font:
LDX ECFontIndex,Y ; char encoding index - font
LDA FontCode,X
Write17
JMP WriteA0Repeats
ENDIF
WriteA0NewChar:
LDX ECNewIndex,Y ; NC encoding index
BEQ WriteA0NCNF
LDA FontCode,X
Write17
WriteA0NCOF:
LDX ECFontIndex,Y ; char encoding index - global
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X
BEQ WriteA0NCOFHigh
TAX
TWA
Write817
IJMP WriteA0Command
WriteA0NCOFHigh:
TWA
; char encoding index
; leading 0 bit
Write17
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X
BEQ WriteA0NCNFHigh1
TAX
TWA
Write817
IJMP WriteA0Command
WriteA0NCNFHigh1:
TWA
WriteA0NCNFHigh2:
Write17
WriteA0Command:
LDA ECFontIndex,Y ; char encoding index - global
CMP #0FEh
ifCC
JMP UpdateA0Buffer
fi
LDA #040h
Write17
JMP UpdateA0Buffer
IF Repeats
WriteA1NewChar:

LDX ECNewIndex,Y ; NC encoding index
BEQ WriteA1NCNF
LDA FontCode,X
Write17
WriteA1NCOF:
LDX ECFontIndex,Y ; char encoding index - global
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X
BEQ WriteA1NCOFHigh
TAX
TWA
Write817
IJMP WriteA1Command
WriteA1NCOFHigh:
TWA
Write17
JMP WriteA1Command
WriteA1NCNF:
LDX ECFontIndex,Y ; char encoding index - global
LDA GlobalCodeHigh,X
BMI WriteA1NCNFHigh2
LDA #040h ; leading 0 bit
Write17
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X
BEQ WriteA1NCNFHigh1
TAX
TWA
Write817
IJMP WriteA1Command
WriteA1NCNFHigh1:
TWA
WriteA1NCNFHigh2:
Write17
WriteA1Command:
LDA ECFontIndex,Y ; char encoding index - global
CMP #0FEh
ifCC
JMP WriteA0Repeats
fi
LDA #040h
Write17
JMP WriteA0Repeats
ENDIF

IF AHashX2
WriteAHashX2:
LDX ECNewIndex,Y
BEQ WriteAHNF
WriteAHOF:
INX
LDA FontCode,X
Write17
LDA #0E0h ; 11 length
Write17
IJMP WriteAHMain
WriteAHNF:
LDA #050h ; 010
Write17
WriteAHMain:
INY
LDA ECHashX21,Y
Write17
LDA ECHashX20,Y
Write8
LDA #000h
STA ECType,Y ; Type 0 (2nd byte)
STA ECRepeatSW,Y
DEY
IF Repeats
LDA ECRepeats,Y
ifNE
JMP WriteA0Repeats
els
JMP UpdateA0Buffer
fi
ELSE
JMP UpdateA0Buffer
ENDIF
ENDIF
WriteAString:

LDX ECNewIndex,Y
BEQ WriteASNF
WriteASOF:
INX
LDA FontCode,X
Write17
IJMP WriteASLength
WriteASNF:
IF AHashX2
LDA #070h ; 011
ELSE
LDA #060h ; 01
ENDIF
Write17
WriteASLength:
LDX ECHashX20,Y
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X
BEQ WriteASLHigh
TAX
TWA
Write817
IJMP WriteASMain
WriteASLHigh:
TWA
Write17
WriteASMain:
INY
LDX ECHashX21,Y
LDA ZoneCode,X
Write17
LDA ECHashX20,Y
Write8
DEY
IF Repeats
LDA ECRepeats,Y
ifNE
JMP WriteA1Repeats
els
JMP UpdateA1Buffer
fi
ELSE
JMP UpdateA1Buffer
ENDIF

****** A - S T R I N G B U F F E R U P D A T E *****

IF Repeats
WriteA0Repeats:
LDA ECRepeats,Y
CMP #001h
BNE WriteA0AreRepeats
WriteA0NoRepeats:
LDA #040h
Write17
IJMP WriteA0RClear
WriteA0AreRepeats:
LDA #0C0h
Write17
LDA ECRepeats,Y
ADD #0FEh
TAX
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X
BEQ WriteA0RHigh
TAX
TWA
Write817
IJMP WriteA0RClear
WriteA0RHigh:
TWA
Write17
WriteA0RClear:
LDA #000h
STA ECRepeats,Y
STA ECRepeatSW,Y
ENDIF

UpdateA0Buffer: ; Y = ECNextOut
LDA ECChar,Y
STA (ECRRPtr)
IF BufferSuffix
LDX ECRRPtr+1
CPX #HIGH(ECRRBuffer)
ifEQ
STI #(HIGH(ECRRBuffer)+HIGH(BufferSize)),ECRRPtr+1
STA (ECRRPtr)
STI #HIGH(ECRRBuffer),ECRRPtr+1
fi
ENDIF
BBS 0,ECRRPtr+0,UpdateA0Head
JMP UpdateA0BufferPtr
UpdateA0Head:
LDX #001h
LDA ECRRPtr+0 ; Word3 = ptr to RRHashLink
ADD #(0-003h) ; at location RRPtr-3
STA ECWord3+0
LDA ECRRPtr+1
ifCC
ADD #0FFh
CMP #HIGH(ECRRBuffer)
ifCC
LDA #(HIGH(ECRRBuffer)+HIGH(BufferSize)-1)
fi
fi
ADD #(HIGH(ECRRHashLink)-HIGH(ECRRBuffer))
STA ECWord3+1
LDA ECHashRaw0,Y ; Word4 = ptr to
EOR ECPriorHash0 ; RRHashHead
STA ECWord4+0
DecBankSelect
STA (ECWord3) ; store LOW(Hash) in
EncBankSelect ; RRHashTest table
LDA ECHashRaw1,Y
EOR ECPriorHash1
AND #HIGH(BufferHashes-1)
ASL ECWord4+0
ROL A
ADD #HIGH(ECRRHashHead)
STA ECWord4+1
LDA ECHashRaw0,Y
NEG A
STA ECPriorHash0
LDA ECHashRaw1,Y
NEG A
STA ECPriorHash1
UpdateA0Link:
LDA (ECWord4) ; transfer RRHashHead to
STA (ECWord3) ; RRHashLink table
LDA (ECWord4),X
STA (ECWord3),X
UpdateA0BufferPtr:
INC ECRRPtr+0
ifEQ
INC ECRRPtr+1
LDA ECRRPtr+1
CMP #(HIGH(ECRRBuffer)+HIGH(BufferSize))
ifEQ
STI #HIGH(ECRRBuffer),ECRRPtr+1
fi
fi
IF Failsafe
DEC ECFailSafe+0
ifEQ
DEC ECFailSafe+1
ifEQ
STI #FailSafeSets,ECFailSafe+1
LDA #008h
Write17
fi
fi
ENDIF

OutputA0Control:
INY
STY ECNextOut
CPY ECNextOutSave
ifNE
JMP WriteAEncodings ; Y = ECNextOut
fi
LDA ECFlush
BNE OutputA0Flush
LDA #SetLength
CMP ECAvailable
ifCS
RTS
fi
JMP StringASearch
OutputA0Flush:
LDA ECAvailable
ifEQ
RTS
fi
JMP StringASearch

IF Repeats
WriteA1Repeats:
CMP #001h
BNE WriteA1AreRepeats
WriteA1NoRepeats:
LDA #040h
Write17
IJMP WriteA1RClear
WriteA1AreRepeats:
LDA #0C0h
Write17
LDA ECRepeats,Y
ADD #0FEh
TAX
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X
BEQ WriteA1RHigh
TAX
TWA
Write817
IJMP WriteA1RClear
WriteA1RHigh:
TWA
Write17
WriteA1RClear:
LDA #000h
STA ECRepeats,Y
STA ECRepeatSW,Y
ENDIF

UpdateA1Buffer: ; Y = ECNextOut
LDA ECChar,Y
STA (ECRRPtr)
IF BufferSuffix
LDX ECRRPtr+1
CPX #HIGH(ECRRBuffer)
ifEQ
STI #(HIGH(ECRRBuffer)+HIGH(BufferSize)),ECRRPtr+1
STA (ECRRPtr)
STI #HIGH(ECRRBuffer),ECRRPtr+1
fi
ENDIF
BBS 0,ECRRPtr+0,UpdateA1Head
JMP UpdateA1BufferPtr
UpdateA1Head:
LDX #001h
LDA ECRRPtr+0 ; Word3 = ptr to RRHashLink
ADD #(0-003h) ; at location RRPtr-3
STA ECWord3+0
LDA ECRRPtr+1
ifCC
ADD #0FFh
CMP #HIGH(ECRRBuffer)
ifCC
LDA #(HIGH(ECRRBuffer)+HIGH(BufferSize)-1)
fi
fi
ADD #(HIGH(ECRRHashLink)-HIGH(ECRRBuffer))


STI #HIGH(ECRRBuffer),ECRRPtr+1
fi
fi
IF Failsafe
DEC ECFailSafe+0
ifEQ
DEC ECFailSafe+1
ifEQ
STI #FailSafeSets,ECFailSafe+1
LDA #008h
Write17
fi
fi
ENDIF
OutputA1Control:
INY
STY ECNextOut
CPY ECNextOutSave
ifNE
LDA ECRepeats,Y
ifEQ
JMP UpdateA1Buffer
fi
JMP WriteA1Repeats ; Y = ECNextOut
fi
LDA ECFlush
BNE OutputA1Flush
LDA #SetLength
CMP ECAvailable
ifCS
RTS
fi
JMP StringASearch
OutputA1Flush:
LDA ECAvailable
ifEQ
RTS
fi
JMP StringASearch

************ B - S T R I N G M A C R O S ************

B-STRING SEARCH MACRO - BYTE 2

FindB2String MACRO
STI #002h,ECByte3
INX ; Word3 = ptr to
LDA ECHashRaw0,X ; FTHashHead
STA ECFindHash
STA ECWord3+0
LDA ECHashRaw1,X ; Word2 = ptr to 1st
AND #HIGH(BufferHashes-1) ; (RRBuffer+1) location
ASL ECWord3+0 ; for this hash
ROL A
ADD #HIGH(ECRRHashHead)
STA ECWord3+1 ; Word4 = ptr to 1st
LDX #001h ; (FTHashLink+1) location
LDA (ECWord3),X ; for this hash
ifNE
STA ECWord4+1
LDA (ECWord3)
STA ECWord4+0
TAY
STI #MaximumBSearches,ECWord1+1
IJMP FindB2Skip1st
fi
IJMP FindB2Return
FindB2Skip:
DEC ECWord1+1


ifEQ
INC FSSkips+2
fi
fi
IJMP FindB2Skip
ELSE
BNE FindB2Skip
ENDIF
FindB2More:
IF Test ; Word1 = emergency max hashes
INC FSSearches+0 ; Word2 = RRBuffer ptr
ifEQ ; Word3 = RRBuffer best string
INC FSSearches+1 ; Word4 = FTHashLink ptr
ifEQ ; Byte1 = 1st unmatched ECChar
INC FSSearches+2 ; Byte3 = best string length
fi
fi
ENDIF
LDX #000h
FindB2MoreLoop:
INX
ifNE
LDA (ECWord2),X
CMP (ECByte1),X
BEQ FindB2MoreLoop
els
LDX #0FFh
fi
FindB2Update:
CPX ECByte3
BCC FindB2Skip ; reset ECChar offset used
; in FindnnStart routine
; Word3 = RRBuffer offset of
; best string
; Byte3 = best string length
IJMP FindB2Skip
fi
LDA ECMaxLength ; string length at maximum
STA ECByte3
FindB2Return:
ENDM

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

B-STRING SEARCH MACRO - BYTE 1

3 = ptr to FTHashHead

Word2 = ptr to 1st

(RRBuffer+1)

this hash

4 = ptr to 1st (FTHashLink+1)


LDA (ECWord3),X ; for this hash
ifNE
STA ECWord4+1
LDA (ECWord3)
STA ECWord4+0
TAY
STI #MaximumBSearches,ECWord1+1
IJMP FindB1Skip1st
fi
IJMP FindB1Return
FindB1Skip:
DEC ECWord1+1
BEQ FindB1Return
LDX #001h
LDA (ECWord4) ; use FTHashLink to find
TAY ; next FTHashLink and
LDA (ECWord4),X ; next RRBuffer offset
BEQ FindB1Return
STY ECWord4+0
STA ECWord4+1
DecBankSelect
LDA (ECWord4)
EncBankSelect
CMP ECFindHash
BNE FindB1Return
FindB1Skip1st:
STY ECWord2+0 ; Y = ECWord4+0
LDA ECWord4+1
ADD #(HIGH(ECRRBuffer)-HIGH(ECRRHashLink))
STA ECWord2+1
LDX #002h
LDA (ECWord2),X
CMP (ECByte1),X
IF Test
BEQ FindB1More
INC FSSkips+0
ifEQ
INC FSSkips+1
ifEQ
INC FSSkips+2
fi
fi
IJMP FindB1Skip
ELSE
BNE FindB1Skip
ENDIF
FindB1More:
IF Test ; Word1 = emergency max hashes
INC FSSearches+0 ; Word2 = RRBuffer ptr
ifEQ ; Word3 = RRBuffer best string
INC FSSearches+1 ; Word4 = FTHashLink ptr
ifEQ ; Byte1 = 1st unmatched ECChar
INC FSSearches+2 ; Byte3 = best string length
fi
fi
ENDIF
LDX #000h
IJMP FindB1More1st
FindB1MoreLoop:
INX
ifNE
FindB1More1st:
LDA (ECWord2),X
CMP (ECByte1),X
BEQ FindB1MoreLoop
els
LDX #0FFh
fi
FindB1Update:
CPX ECByte3
BCC FindB1Skip ; reset ECChar offset used
BEQ FindB1Skip ; in FindnnStart routine
LDA ECWord2+0
STA ECWord3+0 ; Word3 = RRBuffer offset of
LDA ECWord2+1 ; best string
STA ECWord3+1
CPX ECMaxLength
ifCC
STX ECByte3 ; Byte3 = best string length
IJMP FindB1Skip
fi
LDA ECMaxLength ; string length at maximum
STA ECByte3
FindB1Return:
ENDM

************ B - S T R I N G S E A R C H *************

SkipBStrings:
CMP #(MinimumBUpdate+1)
ifCS
JMP NoBFound
fi
LDY ECNextOut
INC ECNextOutSave
DEC ECAvailable
IF AntiEx
JMP TotalBBits
ELSE
JMP UpdateBBuffer ; Y = ECNextOut
ENDIF
STY ECNextOutStart
STI #000h,ECExcessBits
ENDIF
IJMP StringBSearch1st
StringBSearch:
LDY ECNextOut
StringBSearch1st:
LDA ECAvailable
ifEQ
LDA #0FFh
els
CMP #003h
BCC SkipBStrings
fi
STA ECMaxLength ; 255 is MaxLength
STY ECByte1
STI #HIGH(ECChar),ECByte2
STI #000h,ECStringLength
IF Test
INC FSEntries+0
ifEQ
INC FSEntries+1
ifEQ
INC FSEntries+2
fi
fi
ENDIF
FindB2:
LDX ECNextOut
INX
FindB2String
LDX ECByte3
CPX #(MinimumBString+1)
BCC FindB1
SEC
IF ZoneTestB
LDA ECRRPtr+0
SBC ECWord3+0
ENDIF
LDA ECRRPtr+1
SBC ECWord3+1
AND #HIGH(BufferSize-1)
STA ECZone
STX ECStringLength
LDA ECWord3+0
STA ECFound+0
LDA ECWord3+1
STA ECFound+1
LDA ECWord3+0

FindB1:
LDX ECNextOut
FindB1String
LDX ECByte3
CPX #(MinimumBString+1)
BCC FindB1Exit
CPX ECStringLength
BCC StringBOverlap
ifEQ
SEC
IF ZoneTestB
LDA ECRRPtr+0
SBC ECWord3+0
ENDIF
LDA ECRRPtr+1
SBC ECWord3+1
AND #HIGH(BufferSize-1)
CMP ECZone
BCS StringBOverlap
fi
STX ECStringLength
LDA ECWord3+0
STA ECFound+0
LDA ECWord3+1
STA ECFound+1
IJMP StringBOverlap
FindB1Exit:
LDA ECStringLength
BEQ NoBFound
StringBOverlap:
LDA ECRRPtr+0
SEC
SBC ECFound+0
STA ECWord1+0 ; Word1+0 = LOW(Diff)
LDA ECRRPtr+1
SBC ECFound+1
AND #HIGH(BufferSize-1)
STA ECWord1+1
JMP ProcessBString
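The StringBOverlap code above subtracts the position of the found string from the current buffer pointer and masks the result so that it wraps around the circular history buffer. A minimal Python sketch of that arithmetic follows. It assumes the buffer size is a power of two (which the `AND #HIGH(BufferSize-1)` mask suggests but the listing does not state outright), and the function name `ring_distance` is hypothetical.

```python
def ring_distance(write_ptr: int, found: int, buffer_size: int) -> int:
    """Distance from `found` forward to `write_ptr` in a circular buffer.

    Mirrors the 16-bit subtract-and-mask in the listing, on the
    assumption that buffer_size is a power of two.
    """
    if buffer_size <= 0 or buffer_size & (buffer_size - 1):
        raise ValueError("buffer_size must be a power of two")
    return (write_ptr - found) & (buffer_size - 1)
```

For example, with a 4096-byte buffer, a write pointer at offset 0x010 and a match at offset 0xFF0 are 0x20 bytes apart, even though the plain difference is negative.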

NoBFound:
JSR HashBX2
JMP ResetBCharCounts
HashBX2:
LDY ECNextOut ; Y = index to reach
INY ; ECNextOut+1 data items
LDA ECHashX21,Y
BMI HashBX2Null
LDX ECNextOut
STA ECWord2+1
ORA #080h
CMP ECHashX21,X
ifEQ
LDA ECHashX20,Y
CMP ECHashX20,X
BEQ HashBX2Null
els
LDA ECHashX20,Y
fi
STA ECWord2+0
HashBX2Bits:
LDY ECNewIndex,X ; NC encoding index
ifNE
INY
LDA FontBits,Y ; ST bits + Hash(=10)
ADD #00Ah
els
LDA #00Dh ; 3(110 length) + 10
fi
IF AntiEx
TAW
ENDIF
STI #002h,ECByte3
SEC
HashBX2SumBits:

pe 2

pe 4

; NC encoding index


; total bits for 2-byte

less cost of 8-bit

; save bit length for

; in 2nd character

Type 6

Type 0 (skip)


; ECHashX21 of 2nd

high-order hash bits

ashX20 of 2nd character = ; δ low-order hash bits

; Byte3 = StringLength

ADD #(0-(MinimumBString+1))
STA ECByte4 ; Byte4 = LengthB index
CMP #009h
ifCC
TAX
LDA LengthBBits,X ; length bits from table
els
ADD #(0-009h)
TAX
LDA GlobalBits,X


SBC #009h
fi ;}
fi ;}
fi
BMI DirectBUse
INY
DEC ECByte3
BNE DirectBSumBits
DirectBReject:
JSR HashBX2
JMP ResetBCharCounts
DirectBUse:
LDY ECNextOut
IF AntiEx
TWA
STA ECHashX21,Y ; save bit length for AntiEx
; Type 8
; Global or LengthB index
; saved for ECWrite
; save Zone codes for ECWrite
; in 2nd character's ECHashX2


DEX
LDA #000h
UseBStringLoop:
STA ECType,Y ; set Type to 0 in the
INY ; (ECStringLength-1) chars
DEX ; which generate no output
BNE UseBStringLoop
ENDIF
LDA ECStringLength
ADD ECNextOut
STA ECNextOut
ResetBCharCounts:
LDA ECAvailable
ADD ECNextOutSave ; update ECAvailable
SEC
SBC ECNextOut
STA ECAvailable
LDY ECNextOutSave ; interchange ECNextOut and
LDA ECNextOut ; ECNextOutSave
STA ECNextOutSave
STY ECNextOut ; Y = ECNextOut

************ B - S T R I N G O U T P U T *************


TOTAL B BITS FOR AntiExpansion

IF AntiEx
TotalBBits: ; Y = ECNextOut
LDA ECExcessBits
ADD #(0-008h)
LDX ECType,Y
JMP (TotalBJumps),X
TotalBJumps:
DW TotalBDone
DW TotalBFont
; char encoding index - font
ADC FontBits,X
IJMP TotalBDone
TotalBNewChar:
CLC
LDX ECNewIndex,Y ; NC encoding index
BEQ TotalBNCMainNF
ADC FontBits,X
TotalBNCMainOF:
ADD #008h ; char encoding index - 8-bit
IJMP TotalBDone
TotalBNCMainNF:
LDX ECFrequency,Y ; char encoding index - 8-bit
ifPL
ADD #008h
els
ADD #009h
fi
IJMP TotalBDone
TotalBHashX2:
TotalBString:
CLC
ADC ECHashX21,Y
IF AntiEx
STA ECExcessBits
JMP UpdateB1Buffer
ENDIF

TotalBDone:

***** B - S T R I N G B U F F E R U P D A T E *****

UPDATE BUFFER

UpdateBBuffer: ; Y = ECNextOut
LDA ECChar,Y
STA (ECRRPtr)
IF BufferSuffix
LDX ECRRPtr+1
CPX #HIGH(ECRRBuffer)
ifEQ
STI #(HIGH(ECRRBuffer)+HIGH(BufferSize)),ECRRPtr+1
STA (ECRRPtr)
STI #HIGH(ECRRBuffer),ECRRPtr+1
fi
ENDIF
BBS 0,ECRRPtr+0,UpdateBHead
JMP UpdateBBufferPtr
DecBankSelect
STA (ECWord3) ; store LOW(Hash) in
EncBankSelect ; RRHashTest table
LDA ECHashRaw1,Y
AND #HIGH(BufferHashes-1)
ASL ECWord4+0
ROL A
ADD #HIGH(ECRRHashHead)
STA ECWord4+1
UpdateBLink:
LDA (ECWord4) ; transfer RRHashHead to
STA (ECWord3) ; RRHashLink table
LDA (ECWord4),X
STA (ECWord3),X
LDA ECWord3+0 ; reset RRHashHead to new
STA (ECWord4) ; RRHashLink ptr
LDA ECWord3+1
STA (ECWord4),X
UpdateBBufferPtr:
INC ECRRPtr+0
ifEQ
INC ECRRPtr+1
LDA ECRRPtr+1
CMP #(HIGH(ECRRBuffer)+HIGH(BufferSize))
ifEQ
STI #HIGH(ECRRBuffer),ECRRPtr+1
fi
fi

OutputBControl:
INY
STY ECNextOut
CPY ECNextOutSave
ifNE ; Y = ECNextOut
IF AntiEX
JMP TotalBBits
ELSE
JMP WriteBEncodings
ENDIF
fi
IF AntiEx XOR 1 ;{
LDA ECFlush
BNE OutputBFlush
LDA #SetLength
CMP ECAvailable
ifCS
RTS
fi
JMP StringBSearch
OutputBFlush:
LDA ECAvailable
ifEQ
RTS
fi
JMP StringBSearch
ELSE ;{}
LDA ECAntiEStatus ; saved at start of StringTime
BMI OutputBSTOff
OutputBSTOn:
LDY ECExcessBits
IF Macros
ifMI
JMP OutputBCurrent ; if minus, write current
fi
ELSE
BMI OutputBCurrent ; if minus, write current
ENDIF
CPY #014h
IF Macros
ifCC
JMP OutputBDefer ; if a bit plus, defer writing
fi
ELSE
BCC OutputBDefer ; if a bit plus, defer writing
ENDIF
ORA #080h
STA ECAntiEStatus ; if too plus, turn off
LDX ECNextOutStart
LDY ECNewIndex,X ; NC encoding index
ifNE ;{
LDA FontCode,Y
Write17
LDA #0FEh ; 0FEh = 11111110
LDX #0C0h ; + 1
Write817
els ;{}
LDA #0BFh ; 0FEh = (10)111111
LDX #060h ; + 01
Write817
fi ;}
IF Test ;{
INC AntiExOn+0
ifEQ ;{
INC AntiExOn+1
fi ;}
ENDIF ;}
IJMP OutputBCurrent
OutputBSTOff:
LDY ECExcessBits
BPL OutputBDefer ; if losing, no change
CPY #(0-013h) ; should be 9 ?????
BCS OutputBDefer ; if a bit minus, no change
AND #07Fh
STA ECAntiEStatus ; if too minus, turn on
LDA #0FEh ; strings and write current
LDX #0C0h
Write817 ; 0FEh, 1 to turn on
IF Test ;{
INC AntiExOff+0
ifEQ ;{
INC AntiExOff+1
fi ;}
ENDIF ;}
OutputBCurrent:
JSR OutputBWrite
LDA ECFlush
ifEQ ;{
RTS
els ;{}
LDA ECAvailable
ifEQ ;{
RTS
fi ;}
fi ;}
JMP StringBSearch
OutputBDefer:
LDA ECAvailable
BEQ OutputBWrite
LDY ECFlush
ifEQ ;{
CMP #SetLength
BCC OutputBWrite
fi ;}
JMP StringBSearch
OutputBWrite:
LDA ECNextOut
TAX
SEC
SBC ECNextOutStart
STA ECByte1
LDY ECNextOutStart
STX ECNextOutStart
STI #000h,ECExcessBits
ENDIF ;}
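The OutputBSTOn and OutputBSTOff branches above appear to implement a hysteresis switch: ECExcessBits tracks how many bits the string encoding costs relative to 8 bits per character, and string encoding is turned off or back on only when that running total moves far enough in one direction. The Python sketch below is one reading of that control flow, not a transcription. The threshold of 20 is taken from `CPY #014h` and `CPY #(0-013h)`; the function name, the boolean flag, and the action strings are hypothetical labels for the three outcomes (write now, defer, switch mode and write).

```python
THRESHOLD = 0x14  # 20 bits, from CPY #014h and CPY #(0-013h)


def anti_expansion_step(strings_on: bool, excess_bits: int):
    """Return (new_strings_on, action) for one output decision.

    excess_bits > 0 means the string encoding is costing more than
    plain 8-bit characters; excess_bits < 0 means it is saving bits.
    """
    if strings_on:
        if excess_bits < 0:
            return True, "write"             # winning: write current set
        if excess_bits < THRESHOLD:
            return True, "defer"             # a bit plus: defer writing
        return False, "switch_and_write"     # too plus: turn strings off
    if excess_bits > -THRESHOLD:
        return False, "defer"                # losing or a bit minus: no change
    return True, "switch_and_write"          # too minus: turn strings on
```

The dead band between the two thresholds keeps the encoder from toggling modes on every small fluctuation, since each toggle itself costs an escape code in the output.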

WriteBEncodings: ; Y = ECNextOut
IF AntiEx
LDA ECAntiEStatus
BPL WriteBStringsOn
WriteB8Bit:
LDA ECFrequency,Y ; character frequency
Write8
LDA ECFrequency,Y
CMP #0FEh
ifCC
JMP WriteB0TestRepeats
fi
LDA #040h
Write17
JMP WriteB0TestRepeats
WriteBStringsOn: ; Y = ECNextOut
LDA ECType,Y
ORA ECRepeatSW,Y ; bit 3 on if repeats
String(n) -(no

0 - HashX2(2) - or

2 - Font char - no

4 - New char - no

6 - HashX2(l) - no

8 - String(1) - no

or repeats

10 - Font char -

12 - New char -

14 - HashX2(l) -

; char encoding index - font

LDA FontCode,X
Write17
IF AntiEx
JMP WriteB0Done
ELSE
IF Repeats
JMP WriteB0TestRepeats
ELSE
JMP WriteB0Done
ENDIF
ENDIF
IF AntiEx
IF Repeats
WriteB1Font:
LDX ECFontIndex,Y ; char encoding index - font
LDA FontCode,X
Write17
LDA ECRepeats,Y
JMP WriteB0Repeats
ENDIF
ENDIF
WriteB0NewChar:
LDX ECNewIndex,Y ; NC encoding index
BEQ WriteB0NCMainNF
LDA FontCode,X
Write17
WriteB0NCMainOF:
LDA ECFrequency,Y ; char encoding index - 8-bit
Write8
IJMP WriteB0Command
WriteB0NCMainNF:
LDA ECFrequency,Y ; char encoding index - 8-bit
ifPL
Write8
els
STI #080h,ECByte4
ASR A
ROR ECByte4
AND #0BFh
LDX ECByte4
Write817
fi
WriteB0Command:
LDA ECFrequency,Y ; character frequency
CMP #0FEh
ifCC
WriteB0CommandX:
IF AntiEx
JMP WriteB0Done
ELSE
IF Repeats
JMP WriteB0TestRepeats
ELSE
JMP WriteB0Done
ENDIF
ENDIF
fi
LDA #040h
Write17
BRA WriteB0CommandX
IF AntiEx
IF Repeats

WriteB1NewChar:
LDX ECNewIndex,Y ; NC encoding index
BEQ WriteB1NCMainNF
LDA FontCode,X
Write17
WriteB1NCMainOF:
LDA ECFrequency,Y ; char encoding index - 8-bit
Write8
IJMP WriteB1Command
WriteB1NCMainNF:
LDA ECFrequency,Y ; char encoding index - 8-bit
ifPL
Write8
els
STI #080h,ECByte4
ASR A
ROR ECByte4
AND #0BFh
LDX ECByte4
Write817
fi
WriteB1Command:
LDA ECFrequency,Y ; character frequency
CMP #0FEh
ifCC
WriteB1CommandX:
LDA ECRepeats,Y
JMP WriteB0Repeats
fi
LDA #040h
Write17
BRA WriteB1CommandX
ENDIF
ENDIF
WriteBHashX2:

LDX ECNewIndex,Y ; NC encoding index
BEQ WriteBX2NF
WriteBX2OF:
INX
LDA FontCode,X
Write17
IJMP WriteBX2Main
WriteBX2NF:
LDA #0D0h ; 110
Write17
WriteBX2Main:
INY
LDA ECHashX21,Y
Write17
LDA ECHashX20,Y
Write8
IF AntiEx
LDA #000h
STA ECType,Y
IF Repeats
STA ECRepeatSW,Y
ENDIF
ENDIF
DEY
IF Repeats
JMP WriteB0TestRepeats
ELSE
JMP WriteB0Done
ENDIF

WriteBSXtraLength:
LDA LengthBCode+9
Write17
LDA ECHashX20,Y ; length index
ADD #(0-009h)
TAX
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X
BEQ WriteBSXLHigh
TAX
TWA
Write817
JMP WriteBSMain
WriteBSXLHigh:
TWA
Write17
JMP WriteBSMain
WriteBString:

LDX ECNewIndex,Y ; NC encoding index
BEQ WriteBSNF
WriteBSOF:
INX
INX
LDA FontCode,X
Write17
IJMP WriteBSLength
WriteBSNF:
LDA #0F0h ; 111
Write17
WriteBSLength:
IF AntiEX
LDA ECHashX20,Y ; length index
TAX
ADD #(MinimumBString+1)
STA ECByte4
NEG A
ADD ECByte1
STA ECByte1
ELSE
LDX ECHashX20,Y ; length index
ENDIF
CPX #009h
ifCS
JMP WriteBSXtraLength
fi
LDA LengthBCode,X
Write17
WriteBSMain:
INY
LDX ECHashX21,Y
LDA ZoneCode,X
Write17
LDA ECHashX20,Y
Write8
DEY
IF AntiEX
IF Repeats
LDA ECRepeats,Y
ifEQ
JMP WriteB1Done
els
JMP WriteB1Repeats
fi
ELSE
JMP WriteB1Done
ENDIF
ELSE
JMP WriteB0TestRepeats
ENDIF

IF Repeats
WriteB0AreRepeats:
LDA #0C0h
Write17
LDA ECRepeats,Y
ADD #0FEh
TAX
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X
BEQ WriteB0RHigh
TAX
TWA
Write817
JMP WriteB0RClear
WriteB0RHigh:
TWA
Write17
JMP WriteB0RClear
WriteB0TestRepeats:
LDA ECRepeats,Y
BEQ WriteB0Done
WriteB0Repeats:
CMP #001h
BNE WriteB0AreRepeats
WriteB0NoRepeats:
LDA #040h
Write17
WriteB0RClear:
LDA #000h
STA ECRepeats,Y
STA ECRepeatSW,Y
ENDIF
WriteB0Done:
IF Failsafe
DEC ECFailSafe+0
ifEQ
DEC ECFailSafe+1
ifEQ
STI #FailSafeSets,ECFailSafe+1
LDA #008h
Write17
fi
fi
ENDIF
IF AntiEX
DEC ECByte1
ifEQ
RTS
fi
INY
JMP WriteBEncodings
ELSE
JMP UpdateBBuffer
ENDIF

IF AntiEX
IF Repeats
WriteB1AreRepeats:
LDA #0C0h
Write17
LDA ECRepeats,Y
ADD #0FEh
TAX
LDA GlobalCodeHigh,X
TAW
LDA GlobalCodeLow,X
BEQ WriteB1RHigh
TAX
TWA
Write817
JMP WriteB1RClear
WriteB1RHigh:
TWA
Write17
JMP WriteB1RClear
WriteB1Repeats:
CMP #001h
BNE WriteB1AreRepeats
WriteB1NoRepeats:
LDA #040h
Write17
WriteB1RClear:
LDA #000h
STA ECRepeats,Y
STA ECRepeatSW,Y
ENDIF
WriteB1Done:
IF Failsafe
DEC ECFailSafe+0
ifEQ
DEC ECFailSafe+1
ifEQ
STI #FailSafeSets,ECFailSafe+1
LDA #008h
Write17
fi
fi
ENDIF
DEC ECByte4
ifNE
INY
IF Repeats
LDA ECRepeats,Y
BEQ WriteB1Done
JMP WriteB1Repeats
ELSE
JMP WriteB1Done
ENDIF
fi
LDA ECByte1
ifEQ
RTS
fi
INY
JMP WriteBEncodings
ENDIF

IF AntiEX
UpdateB1Buffer: ; Y = ECNextOut
LDA ECChar,Y
STA (ECRRPtr)
IF BufferSuffix
LDX ECRRPtr+1
CPX #HIGH(ECRRBuffer)
ifEQ
STI #(HIGH(ECRRBuffer)+HIGH(BufferSize)),ECRRPtr+1
STA (ECRRPtr)
STI #HIGH(ECRRBuffer),ECRRPtr+1
fi
ENDIF
BBS 0,ECRRPtr+0,UpdateB1Head
JMP UpdateB1BufferPtr
UpdateB1Head:
LDX #001h
LDA ECRRPtr+0 ; Word3 = ptr to RRHashLink
ADD #(0-001h) ; at location RRPtr-1
STA ECWord3+0
LDA ECRRPtr+1
ADD #(HIGH(ECRRHashLink)-HIGH(ECRRBuffer))
STA ECWord3+1
LDA ECHashRaw0,Y ; Word4 = ptr to
STA ECWord4+0 ; RRHashHead
DecBankSelect
STA (ECWord3) ; store LOW(Hash) in
EncBankSelect ; RRHashTest table
LDA ECHashRaw1,Y
AND #HIGH(BufferHashes-1)
ASL ECWord4+0
ROL A
ADD #HIGH(ECRRHashHead)
STA ECWord4+1
UpdateB1Link:
LDA (ECWord4) ; transfer RRHashHead to
STA (ECWord3) ; RRHashLink table
LDA (ECWord4),X
STA (ECWord3),X
LDA ECWord3+0 ; reset RRHashHead to new
STA (ECWord4) ; RRHashLink ptr
LDA ECWord3+1
STA (ECWord4),X
UpdateB1BufferPtr:
INC ECRRPtr+0
ifEQ
INC ECRRPtr+1
LDA ECRRPtr+1
CMP #(HIGH(ECRRBuffer)+HIGH(BufferSize))
ifEQ
STI #HIGH(ECRRBuffer),ECRRPtr+1
fi
fi
OutputB1Control:
DEC ECStringLength
ifNE
INY
LDA ECExcessBits
ADD #(0-008h)
ifPL ;{
CMP #040h
ifCS ;{
LDA #040h
fi ;}
els ;{}
CMP #(0-040h)
ifCC ;{
LDA #(0-040h)
fi ;}
fi ;}
STA ECExcessBits
JMP UpdateB1Buffer
fi
JMP OutputBControl
ENDIF

************* D E C O D E R M A C R O S **************

DCGlobalShort MACRO
IF Macros
MSDCGlobalShort
ELSE
JSR MSDCGlobalShort
ENDIF
ENDM

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

Enter: A = guard bit at proper shift point (i.e. 80h for 1 bit)

Exit: A = fetched bits (right justified)

Z flag properly set(reset) for [A]

DecodeNBits MACRO
LOCAL DecodeNB1
DecodeNB1:
ASL DCBuffer
ifEQ
JSR DCReadCharacter
SEC
ROL DCBuffer
fi
ROL A
BCC DecodeNB1
ENDM
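The DecodeNBits macro relies on two marker bits: a guard bit preloaded into A, which falls out of the top after exactly N shifts and ends the loop, and a sentinel bit kept at the bottom of DCBuffer, which makes the buffer read as zero once all of its data bits have been consumed. The Python sketch below models that behavior on 8-bit values as an illustration. The class name `BitReader` and its methods are hypothetical, and reading the next input byte from an iterator stands in for DCReadCharacter.

```python
class BitReader:
    """Models DCBuffer: data bits followed by a single sentinel 1 bit."""

    def __init__(self, data: bytes):
        self._data = iter(data)
        self.buffer = 0x80  # sentinel only, so the first read refills

    def _next_bit(self) -> int:
        carry = (self.buffer >> 7) & 1            # ASL DCBuffer
        self.buffer = (self.buffer << 1) & 0xFF
        if self.buffer == 0:                      # sentinel was shifted out
            byte = next(self._data)               # stands in for DCReadCharacter
            carry = (byte >> 7) & 1               # SEC / ROL DCBuffer
            self.buffer = ((byte << 1) | 1) & 0xFF
        return carry

    def decode_n_bits(self, guard: int) -> int:
        """guard = 0x80 for 1 bit, 0x40 for 2 bits, ..., 0x01 for 8 bits."""
        a = guard
        while True:
            bit = self._next_bit()
            guard_out = (a >> 7) & 1              # carry after ROL A
            a = ((a << 1) | bit) & 0xFF
            if guard_out:                         # BCC loops until guard exits
                return a
```

Because the guard value encodes the bit count, one loop serves every field width from 1 to 8 bits without a separate counter register.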

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
DCFreqToChar MACRO
LOCAL DCFToCExit
STA DCWord1+0
CMP #0FEh
ifCS
LDA #080h ; get 1 bit
DecodeNBits
ifNE ; set DCCommand = 1
STA DCCommand ; NOTE: this is the valid EOF
IJMP DCFToCExit
fi
fi
SetCharFreq DC,W1 ; sets HIGH(NCFreq) in DCWord1+1
LDA (DCWord1)
DCFToCExit:
ENDM

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

Enter: A = # of font table indices (2-16) - 2
Exit: A = font index (0-15)
DCReadFontFreq MACRO
LOCAL DCRNextBit
TAY
LDX DCFontTblIndex,Y
DCRNextBit:
ASL DCBuffer
ifEQ
JSR DCReadCharacter
SEC
ROL DCBuffer
fi
ifCS
INX
fi
LDA DCFontNext,X
ifNE
TXA
CLC
ADC DCFontNext,X
TAX
IJMP DCRNextBit
fi
LDA DCFontValue,X ; A = font index
ENDM

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

DCGlobalLong MACRO
LDA #040h ; get 2 bits
DecodeNBits ; DCGlobalShort expects that
DCGlobalShort ; 1st 2 bits of the Global
ENDM ; code are in A and that
; the Z flag is based on [A]
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

IF Macros
MSDCGlobalShort MACRO
LOCAL DGS00Zones,DGS0001,DGS00001,DGS000001,DGS000000,DGSExit
ELSE
MSDCGlobalShort:
ENDIF
BEQ DGS00Zones ; Z flag set for [A]
CMP #002h
ifCC ; zone = 01
LDA #020h ; get 3 bits
STI #008h,DCByte1 ; DCByte1 = base (DL)
els
ifNE ; zone = 11
STI #000h,DCByte1
els ; zone = 10
STI #004h,DCByte1
fi

; get 2 bits

get 4 bits

base 16

append 1 bit

base 24

append 4 bits

base 32

IJMP DGSExit
DGS000001:
LDA #004h ; get 6 bits
DecodeNBits
ADD #040h ; base 64
IJMP DGSExit
DGS000000:
LDA #002h ; get 7 bits
DecodeNBits
ADD #080h ; base 128
DGSExit:
IF Macros
ENDM
ELSE
RTS
ENDIF

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

DCLengthLong MACRO
LOCAL DCLNextBit,DCLExit
LDX #000h
DCLNextBit:
ASL DCBuffer
ifEQ
JSR DCReadCharacter
SEC
ROL DCBuffer
fi
ifCS
INX
fi
LDA LengthBNext,X
ifNE
TXA
CLC
ADC LengthBNext,X
TAX
IJMP DCLNextBit
fi
LDA LengthBValue,X
CMP #009h
ifNE
JMP DCLExit
fi
STA DCByte2
DCGlobalLong
ADD DCByte2
DCLExit:
ENDM

;* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *


DCZoneLong MACRO
LOCAL DCZNextBit
LDX #000h
DCZNextBit:
ASL DCBuffer
ifEQ
JSR DCReadCharacter
SEC
ROL DCBuffer
fi
ifCS
INX
fi
LDA ZoneNext,X
ifNE
TXA
CLC
ADC ZoneNext,X
TAX
IJMP DCZNextBit
fi
LDA ZoneValue,X
STA DCWord1+1
ENDM
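DCReadFontFreq, DCLengthLong and DCZoneLong share one decoding loop: each input bit selects one of two adjacent table entries, a non-zero Next entry is a relative offset to the next pair, and a zero Next entry marks a leaf whose Value entry is the decoded symbol. The Python sketch below illustrates that loop. The function name `decode_symbol` and the tiny three-symbol tables are made up for the example; the actual contents of ZoneNext, ZoneValue, LengthBNext and the font tables are not shown in this part of the listing. The sketch starts at index 0, as DCLengthLong and DCZoneLong do, whereas DCReadFontFreq starts from an index looked up in DCFontTblIndex.

```python
def decode_symbol(bits, next_table, value_table):
    """Walk a prefix-code table: one input bit per step."""
    x = 0
    while True:
        if next(bits):              # ifCS / INX: a 1 bit selects the odd entry
            x += 1
        if next_table[x] == 0:      # leaf: value_table holds the symbol
            return value_table[x]
        x += next_table[x]          # TXA / CLC / ADC Next,X / TAX


# Hypothetical tables for the code 0 -> symbol 0, 10 -> symbol 1, 11 -> symbol 2.
EXAMPLE_NEXT = [0, 1, 0, 0]
EXAMPLE_VALUE = [0, 0, 1, 2]
```

Storing relative offsets rather than absolute indices keeps every entry within one byte, which suits an 8-bit index register.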

************* D E C O D E R R E F I L L **************

DCFontParams: LDY DCABStatus

BPL DCFontsActive

LDA #001h get 8 bits

DecodeNBits

JMP DCNewAChar DCFontsActive:

LDA DCCurrentHash+1

BPL DCOldFont

TYA i Y = DCABStatus ifEQ JMP DCNewAFont fi

JMP DCNewBFont DCOldFont:

LDA DCCharacters ADD DCABStatus

; ADD DCSTIndex ; always 1 if strings on

ADD #0FFh ; A is # of font indices

(0-16)

DCReadFontFreq ; [A] is returned as index

LDY DCABStatus ifNE CMP DCNCIndex ifEQ LDA #001h ; get 8 bits DecodeNBits APPENDIX 1

JMP DCNewAChar fi

BCC DCReadOldChar ADD #0FFh CMP DCNCIndex ifEQ JMP DC2ByteHash fi

ADD #0FFh CMP DCNCIndex ifEQ

JMP DCReadBString fi

ADD #0FFh els

CMP DCNCIndex BCC DCReadOldChar ifEQ

JMP DCNewACharLong fi

ADD #0FFh CMP DCNCIndex ifEQ

JMP DCReadAString fi

ADD #0FFh fi DCReadOldChar:

STA DCFontIndex ; used in FontUpdate
ADD #(TwoBytes+1)
ADD DCFontBase+0
STA DCWord1+0
LDA DCFontBase+1
STA DCWord1+1
LDA (DCWord1)
STA (DCRRPtr)
STI #001h,DCCharCount
JMP DCResetFont
DCNewAFont:
LDA #040h ; get 2 bits
DecodeNBits
CMP #001h
ifCC ; 00
LDA #080h ; get 1 bit
DecodeNBits
IJMP DCNewACharShort
fi
BNE DCNewACharShort ; 10,11
IF AHashX2 XOR 1 ; 01
JMP DCReadAString
ELSE
LDA #080h ; get 1 bit
DecodeNBits
ifEQ
JMP DC2ByteHash
fi
DCGlobalLong
ADD #MinimumAString
JMP DCDirectString
ENDIF
DCNewACharShort:
AND #0FFh ; reset Z flag for [A]
DCGlobalShort
JMP DCNewAChar
DCNewACharLong:
DCGlobalLong

DCNewAChar:

DCFreqToChar ; [A] is input
LDY DCCommand
ifEQ
STA (DCRRPtr) ; character is output
STI #001h,DCCharCount
fi
JMP DCResetFont
DCNewBFont:

LDA #040h ; get 2 bits DecodeNBits CMP #002h ifCC ; 00,01

ORA #004h ; append 6 bits DecodeNBits JMP DCNewAChar els ifEQ 10 LDA #003h ; append 7 bits to 1 DecodeNBits JMP DCNewAChar fi fi 11

LDA #080h ; get 1 bit DecodeNBits ifEQ JMP DC2ByteHash fi DCReadBString:

DCLengthLong ADD #(MinimumBString+1)

JMP DCDirectString DCReadAString:

IF AHashX2 LDA #040h ; get 2 bits DecodeNBits

CMP #003h ifEQ

JMP DC2ByteHash fi AND #0FFh ; reset Z flag for [A]

DCGlobalShort

ADD #(MinimumAString-4) ELSE DCGlobalLong ADD #MinimumAString ENDIF

DCDirectString:

STA DCCharCount
DCZoneLong ; DCWord1 = offset from RRPtr
LDA #001h ; get 8 bits
DecodeNBits
STA DCByte1

LDA DCRRPtr+0

STA DCWord2+0 SEC

SBC DCBytel

STA DCWord3+0 ; DCWord3 = source string offset
TAX ; X = save DCWord3+0 for LAN
LDA DCRRPtr+1
STA DCWord2+1 ; DCWord2 = object string offset
SBC DCWord1+1
CMP #HIGH(DCRRBuffer)
ifCC
ADD #HIGH(BufferSize) ; A = save DCWord3+1 for LAN
fi
LDY DCWord1+1 ; DCWord1+1: 0 - Right to Left
BEQ DCDirectBackward ; <> 0 - Left to Right
DCDirectForward:

LDY DCCharCount DCDirectForwardl:

CMP #(HIGH(DCRRBuffer)+HIGH(BufferSize)-1)

BEQ DCDirectForward2 DCDirectForwardOK: PHA PHX PLI

DCDirectFLoopl: LAN

STA (DCWord2) INC DCWord2+0 ifEQ

LDA DCWord2+l ADD #001h

CMP #(HIGH(DCRRBuffer)+HIGH(BufferSize) ) ifEQ ADD #(0-HIGH(BufferSize)) fi

STA DCWord2+l fi DEY BNE DCDirectFLoopl

JMP DCResetFont DCDirectForward2: TAW

TYA ; A = Y = DCCharCount ADD #(0-001h)

ADD DCWord3+0 TWA

BCC DCDirectForwardOK STA DCWord3+l DCDirectFLoop2:

LDA (DCWord3) STA (DCWord2) INC DCWord3+0 ifEQ LDA DCWord3+l

ADD #001h

CMP #(HIGH(DCRRBuffer)+HIGH(BufferSize)) ifEQ ADD #(0-HIGH(BufferSize)) fi STA DCWord3+l fi

INC DCWord2+0 ifEQ LDA DCWord2+l ADD #001h

CMP #(HIGH(DCRRBuffer)+HIGH(BufferSize)) ifEQ

ADD #(0-HIGH(BufferSize)) fi STA DCWord2+1 fi DEY

BNE DCDirectFLoop2 JMP DCResetFont DCDirectBackward:

LDY DCCharCount CPY DCBytel BCC DCDirectForwardl BEQ DCDirectForwardl STA DCWord3+l

DEY TYA

ADD DCWord3+0 STA DCWord3+0 ifCS

LDA DCWord3+l ADD #001h

CMP #(HIGH(DCRRBuffer)+HIGH(BufferSize)) ifEQ ADD #(0-HIGH(BufferSize)) fi

STA DCWord3+l fi TYA

ADD DCWord2+0 STA DCWord2+0 ifCS LDA DCWord2+l ADD #001h

CMP #(HIGH(DCRRBuffer)+HIGH(BufferSize)) ifEQ

ADD #(0-HIGH(BufferSize)) fi

STA DCWord2+l fi INY

DCDirectBLoop:

LDA (DCWord3) STA (DCWord2) LDA DCWord3+0 ADD #0FFh

STA DCWord3+0 ifCC LDA DCWord3+l CMP #HIGH(DCRRBuffer) ifEQ

ADD #HIGH(BufferSize) fi

ADD #0FFh STA DCWord3+l fi

LDA DCWord2+0 ADD #0FFh STA DCWord2+0 ifCC LDA DCWord2+l

CMP #HIGH(DCRRBuffer)


; get 1 bit

; 0 = prod, 1 = command

; Prod resets DCBuffer,

; DCCommand and returns

; DCFontParams

; get next 2 bits and

; DCCommand (right

rocessCommand does as named ; and then JMP's back to ; DCFontParams

rod/command encountered


LDA DCABStatus ifPL

ORA #080h els AND #07Fh fi

STA DCABStatus
STI #000h,DCCommand
JMP DCFontParams
ELSE

JMP DCProdCommand ENDIF fi

LDA (DCRRPtr) STA DCCurrentChar

JSR DCWriteCharacter IF Repeats CMP DCChar2Prior ifNE JMP DCUpdateFont fi

CMP DCCharlPrior ifNE JMP DCUpdateFont fi

LDA #080h ; get 1 bit

DecodeNBits ifEQ JMP DCUpdateFont fi

DCGlobalLong TAY INY

LDA DCCurrentChar DCWriteRepeatsLoop:

JSR DCWriteCharacter
DEY
BNE DCWriteRepeatsLoop
ENDIF

DCUpdateFont: FontUpdate DC,FU

IF Failsafe DEC DCFailSafe+0 ifEQ DEC DCFailSafe+1 ifEQ

STI #FailSafeSets,DCFailSafe+1 LDA #010h ; get 4 bits

DecodeNBits ifNE IF EOFControl

FailSafeTrap:

STI #0FFh,ECCommand
BBR 6,HostLCR,FailSafeNLB
JSR ECReadCharacter
STI #000h,ECCommand

JMP DCOrECEOF FailSafeNLB:


LDA DCRRPtr+1

ADD #001h

CMP #(HIGH(DCRRBuffer)+HIGH(BufferSize))
ifEQ
ADD #(0-HIGH(BufferSize))
fi

STA DCRRPtr+1 fi

DEC DCCharCount ifEQ

JMP DCFontParams fi JMP DCResetFont

printstat Code,size,is,%$-cb

************* I N C L U D E T A B L E S **************

tb equ $ ; include TCtabOll

************** E N C O D E R T A B L E S **************

EncodingTable: ; plantl macro q,r q&r: endm

IF FontSize EQ 8
sepno defl 0
irp y,<e0,e1,e2,e3,e4,e5,e6,e7,e8>
db y-FontCode
endm
ENDIF
IF FontSize EQ 16
sepno defl 0
irp y,<e0,e1,e2,e3,e4,e5,e6,e7,e8,e9,e10,e11,e12,e13,e14>
db y-FontCode
endm
ENDIF

. etbase macro plantl e,%sepno sepno defl sepno+1 endm

.

FontCode: etbase ; 2 db 11000000b,01000000b etbase ; 3 db 11000000b,00100000b,01100000b etbase ; 4 db 11000000b,00100000b,01010000b,01110000b etbase ; 5 db 11000000b,00100000b,01010000b,01101000b,01111000b etbase ; 6 db

10100000b,11100000b,00010000b,00110000b,01010000b db 01110000b etbase ; 7 db 10100000b,11100000b,00010000b,00110000b,01010000b db 01101000b,01111000b etbase ; 8 db 10100000b,11100000b,00010000b,00110000b,01001000b db 01011000b,01101000b,01111000b etbase ; 9 APPENDIX 1

db 10100000b,11100000b,00010000b,00110000b,01001000b db 01011000b,01101000b,01110100b,01111100b etbase ; 10 db

10100000b,11100000b,00010000b,00110000b,01001000b db 01011000b,01100100b,01101100b,01110100b,01111100b IF FontSize EQ 16 etbase ; 11 db 10100000b,11100000b,00010000b,00110000b,01001000b db 01011000b,01100100b,01101100b,01110100b,01111010b db 01111110b etbase ; 12 db 10100000b,11100000b,00010000b,00110000b,01001000b db 01011000b,01100100b,01101100b,01110010b,01110110b db 01111010b,01111110b etbase ; 13 ERP db 10100000b,11100000b,00010000b,00101000b,00111000b db

01001000b,01011000b,01100100b,01101100b,01110010b db 01110110b,01111010b,01111110b ; etbase ; 13 FLB db 10100000b,11100000b,00010000b,00101000b,00111000b db 01001000b,01010100b,01011100b,01100100b,01101100b db 01110100b,01111010b,01111110b etbase ; 14 db

10100000b,11100000b,00010000b,00101000b,00111000b APPENDIX 1

db 01001000b,01010100b,01011100b,01100100b,01101100b db 01110010b,01110110b,01111010b,01111110b etbase ; 15 db

10100000b,11100000b,00010000b,00101000b,00111000b db 01000100b,01001100b,01010100b,01011100b,01100100b db 01101100b,01110010b,01110110b,01111010b,01111110b etbase ; 16 db 10100000b,11100000b,00010000b,00101000b,00111000b db 01000100b,01001100b,01010100b,01011100b,01100100b db 01101010b,01101110b,01110010b,01110110b,01111010b db 01111110b ENDIF ; fontesz equ $-FontCode

************** D E C O D E R T A B L E S ************** ;

DCFontTblIndex:

DB 000,002,006,012 DB 020,030,042,056 DB 072 IF FontSize EQ 16

DB 090,110,132 DB 156,182,210 ENDIF

DCFontNext:

DB 0, 0 ; 2 APPENDIX 1

DB 2 0, 0, 0 ; 3

DB 2 0, 0, 1, 0, 0 ; 4

DB 2 0, 0, 1, 0, 1 ; 5

DB 4 1, o, 0, 2, 3 ; 6

DB 0 0

DB 4 1, 0, 0, 2, 3 ; 7

DB 0 1, 0, 0

DB 4 1, o, 0, 2, 3 ; 8

DB 2 3, 0, 0, 0, 0

DB 4 1, 0, 0, 2, 3 0 ; 9

DB 2 3, 0, 0, 0, 1 0

DB 4 1, o, 0, 2, 3 0 ; ιo

DB 2 3, 0, 0, 2, 3 0

DB 0 0

IF Fontsize EQ 16

DB 2 0 0 2 0 DB 0 0 DB 4 0 2 0 ; 12 DB 2 0 2 0 DB 2 0 0 DB 4 0 2 1 ; 13 ERP DB 0 3 0 3 DB 0 3 0 0 DB 4 0 2 0 ; 13 FLB DB 4 0 0 5 DB 0 0 0 0 DB 4 0 2 3 14 DB 4 0 0 5 DB 0 0 2 0 DB 0 DB 4 0 2 3 ; 15 DB 4 0 4 7 DB 0 0 0 3 DB 0 0 DB 4 o. 2, 3, 0, 3 ; 16 DB DB DB

ENDIF ;

DCFontValue:

DB 0 ; 2

DB 0, 1, 2 ; 3

DB 0, 1, 0, 2, 3 ; 4 DB 0, 1, 0, 2, 0, 3, 4 ; 5

DB 0, 0, 1, 0, 0, 2, 3 ; 6

DB 5

DB 0, 0, 1, 0, 0, 2, 3 ; 7

DB 0, 5, 6 DB 0, 0, 1, 0, 0, 2, 3 ; δ

DB 0, 4, 5, 6, 7

DB 0, 0, 1, 0, 0, 2, 3 ; 9

DB 0, 4, 5, 6, 0, 7, 8

DB 0, 0, 1, 0, 0, 2, 3 ; ιo DB 0, 4, 5, 0, 0, 6, 7

DB 9

IF FontSize EQ 16

DB 1, 0, 0, 2, 3 ; il

DB 5, 0, 0, 6, 7 DB A

DB 1, 0, 0, 2, 3 ; 12

DB 5, 0, 0, 6, 7

DB 9, A, B

DB 1, 0, 0, 2, 0 ; 13 ERP DB 0, 5, 6, 0, 0

DB 0, 9, A, B, C

DB 1, 0, 0, 2, 0 ; 13 FLB

DB 4, 5, 0, 0, 0

DB 9, A, 0, B, C DB 1, 0, 0, 2, 0 ; 14

DB 4, 5, 0, 0, 0 APPENDIX 1

DB 6, 7, 8, 9, 0, 0, A, B

DB C, D

DB 0, 0, 0, 1, 0, 0, 2, 0 ; 15

DB 0, 0, 3, 4, 0, 0, 0, 0 DB 5, 6, 7, 8, 9, A, 0, 0

DB B, C, D, E

DB 0, 0, 0, 1, 0, 0, 2, 0 ; 16

DB 0, 0, 3, 4, 0, 0, 0, 0

DB 5, 6, 7, 8, 9, 0, 0, 0 DB A, B, C, D, E, F

ENDIF

************* S H A R E D T A B L E S **************

Best128:
DB 20h,30h,45h,65h,0Ah,0Dh,31h,54h,74h,52h,32h,61h,49h,53h,41h,4Fh
DB 72h,43h,4Eh,6Eh,4Ch,6Fh,69h,73h,09h,2Ch,44h,4Dh,35h,2Dh,33h,64h
DB 46h,2Eh,68h,50h,6Ch,38h,34h,29h,28h,39h,63h,55h,2Fh,3Dh,48h,36h
DB 75h,66h,6Dh,42h,37h,70h,47h,57h,67h,58h,56h,62h,59h,77h,22h,79h
DB 2Ah,2Bh,5Fh,76h,27h,4Bh,25h,3Eh,21h,3Bh,5Ah,3Ch,24h,40h,3Ah,6Bh
DB 4Ah,78h,26h,51h,5Bh,5Dh,23h,71h,7Ah,1Ah,6Ah,19h,3Fh,5Ch,00h,01h
DB 02h,03h,04h,05h,06h,07h,08h,0Bh,0Ch,0Eh,0Fh,10h,11h,12h,13h,14h
DB 15h,16h,17h,18h,1Bh,1Ch,1Dh,1Eh,1Fh,5Eh,60h,7Bh,7Ch,7Dh,7Eh,7Fh

FontBits:
db 1,1 ; 2
db 1,2,2 ; 3
db 1,2,3,3 ; 4
db 1,2,3,4,4 ; 5
db 2,2,3,3,3,3 ; 6
db 2,2,3,3,3,4,4 ; 7
db 2,2,3,3,4,4,4,4 ; 8
db 2,2,3,3,4,4,4,5,5 ; 9
db 2,2,3,3,4,4,5,5,5,5 ; 10
IF FontSize EQ 16
db 2,2,3,3,4,4,5,5,5,6,6 ; 11
db 2,2,3,3,4,4,5,5,6,6,6,6 ; 12
db 2,2,3,4,4,4,4,5,5,6,6,6,6 ; 13 ERP
db 2,2,3,4,4,4,5,5,5,5,5,6,6 ; 13 FLB
db 2,2,3,4,4,4,5,5,5,5,6,6,6,6 ; 14
db 2,2,3,4,4,5,5,5,5,5,5,6,6,6,6 ; 15
db 2,2,3,4,4,5,5,5,5,5,6,6,6,6,6,6 ; 16
ENDIF

GlobalBits: DB 04,04,04,04,04,04,04,04,05,05,05,05,05,05,05,05

DB 06,06,06,06,06,06,06,06,07,07,07,07,07,07,07,07 DB

10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10

DB 10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10 DB 12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12 DB APPENDIX 1

12.12.12,12,12,12,12,12,12,12,12,12,12,12,12,12

DB

12.12.12.12.12.12.12.12.12.12.12.12.12.12.12.12 DB 12,12,12,12,12,12,12,12,12,12,12,12,12,12,12,12 DB

13.13.13.13.13.13.13.13.13.13.13.13.13.13.13.13 DB

13.13.13,13,13,13,13,13,13,13,13,13,13,13,13,13 DB

13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13

DB 13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13

DB 13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13 DB

13,13,13,13,13,13,13,13,13,13,13,13,13,13,13,13

.

GlobalCodeHigh:

DB 11001000B,11011000B ; 0- 7 DB 11101000B,11111000B

DB 10001000B,10011000B

DB 10101000B,10111000B

DB 01000100B,01001100B ; 8- 15

DB 01010100B,01011100B DB 01100100B,01101100B

DB 01110100B,01111100B

DB 00100010B,00100110B ; 16- 23

DB 00101010B,00101110B

DB 00110010B,00110110B DB 00111010B,00111110B

DB 00010001B,00010011B ; 24- 31

DB 00010101B,00010111B

DB 00011001B,00011011B

DB 00011101B,00011111B
DB 00001000B,00001000B ; 32- 47
DB 00001000B,00001000B
DB 00001001B,00001001B
DB 00001010B,00001010B
DB 00001010B,00001010B
DB 00001011B,00001011B
DB 00001011B,00001011B

DB 00001100B,00001100B ; 48- 63

DB 00001100B,00001100B

DB 00001101B,00001101B DB 00001101B,00001101B

DB 00001110B,00001110B

DB 00001111B,00001111B

DB 00001111B,00001111B DB 00000100B,00000100B ; 64- 79

DB 00000100B,00000100B

DB 00000100B,00000100B DB 00000100B,00000100B

DB 00000100B,00000100B

DB 00000101B,00000101B ; 80- 95

DB 00000101B,00000101B DB 00000101B,00000101B

DB 00000101B,00000101B

DB 00000101B,00000101B DB 00000101B,00000101B

DB 00000110B,00000110B ; 96-111
DB 00000110B,00000110B
DB 00000110B,00000110B
DB 00000110B,00000110B
DB 00000110B,00000110B

DB 00000111B,00000111B ; 112-127

DB 00000111B,00000111B DB 00000111B,00000111B

DB 00000111B,00000111B

DB 00000111B,00000111B DB 00000111B,00000111B

DB 00000000B,00000000B ; 128-143
DB 00000000B,00000000B
DB 00000000B,00000000B
DB 00000000B,00000000B
DB 00000000B,00000000B

DB 00000000B,00000000B

DB 00000000B,00000000B ; 144-159 DB 00000000B,00000000B

DB 00000000B,00000000B

DB 00000000B,00000000B DB 00000000B,00000000B

DB 00000000B,00000000B

DB 00000001B,00000001B ; 160-175

DB 00000001B,00000001B

DB 00000001B,00000001B DB 00000001B,00000001B

DB 00000001B,00000001B
DB 00000001B,00000001B
DB 00000001B,00000001B ; 176-191
DB 00000001B,00000001B
DB 00000001B,00000001B
DB 00000001B,00000001B
DB 00000001B,00000001B
DB 00000010B,00000010B ; 192-207
DB 00000010B,00000010B
DB 00000010B,00000010B
DB 00000010B,00000010B
DB 00000010B,00000010B ; 208-223
DB 00000010B,00000010B
DB 00000010B,00000010B
DB 00000010B,00000010B
DB 00000011B,00000011B ; 224-239
DB 00000011B,00000011B
DB 00000011B,00000011B
DB 00000011B,00000011B
DB 00000011B,00000011B
DB 00000011B,00000011B ; 240-255
DB 00000011B,00000011B
DB 00000011B,00000011B
DB 00000011B,00000011B
DB 00000011B,00000011B

.

GlobalCodeLow:
DB 00000000B,00000000B ; 0- 7
DB 00000000B,00000000B
DB 00000000B,00000000B ; 8- 15
DB 00000000B,00000000B
DB 00000000B,00000000B
DB 00000000B,00000000B ; 16- 23
DB 00000000B,00000000B
DB 00000000B,00000000B
DB 00000000B,00000000B
DB 00000000B,00000000B ; 24- 31
DB 00000000B,00000000B
DB 00000000B,00000000B
DB 00000000B,00000000B

DB 00100000B,01100000B ; 32- 47

DB 10100000B,11100000B

DB 00100000B,01100000B

DB 10100000B,11100000B DB 00100000B,01100000B

DB 10100000B,11100000B

DB 00100000B,01100000B

DB 10100000B,11100000B

DB 00100000B,01100000B ; 48- 63 DB 10100000B,11100000B

DB 00100000B,01100000B
DB 10100000B,11100000B
DB 00100000B,01100000B
DB 10100000B,11100000B
DB 00100000B,01100000B
DB 10100000B,11100000B
DB 00001000B,00011000B ; 64- 79

DB 00101000B,00111000B

DB 01001000B,01011000B

DB 01101000B,01111000B DB 10001000B,10011000B

DB 10101000B,10111000B

DB 11001000B,11011000B

DB 11101000B,11111000B

DB OOOOIOOOB,OOOllOOOB ; 80- 95 DB 00101000B,00111000B

DB 01001000B,01011000B

DB 01101000B,01111000B

DB 10001000B,10011000B

DB 10101000B,10111000B DB 11001000B,11011000B

DB 11101000B,11111000B

DB OOOOIOOOB,OOOllOOOB ; 96-111

DB 00101000B,00111000B

DB 01001000B,01011000B DB 01101000B,01111000B

DB 10001000B,10011000B

DB 101010O0B,10111000B

DB 11001000B,11011000B

DB 11101000B,11111000B DB OOOOIOOOB,OOOllOOOB ; 112-127

DB 00101000B,00111000B

DB 01001000B,01011000B

DB 01101000B,01111000B

DB 10001000B,10011000B DB 10101000B,10111000B

DB 11001000B,11011000B


DB 00110100B,00111100B

DB 01000100B,01001100B

DB 01010100B,OlOlllOOB

DB OllOOlOOB,OllOllOOB

DB 01110100B,01111100B

DB 10000100B,10001100B ; 208-223

DB 10010100B,10011100B

DB 10100100B,10101100B

DB 10110100B,10111100B

DB 11000100B,11001100B

DB 11010100B,11011100B

DB 11100100B,11101100B

DB 11110100B,11111100B

DB 00000100B,00001100B ; 224-239

DB OOOIOIOOB,OOOlllOOB

DB 00100100B,00101100B

DB 00110100B,00111100B

DB 01000100B,01001100B

DB 01010100B,OlOlllOOB

DB OllOOlOOB,OllOllOOB

DB 01110100B,01111100B

DB 10000100B,10001100B ; 240-255

DB 100101006,100111006

DB 10100100B,10101100B

DB 10110100B,10111100B

DB 11000100B,11001100B

DB 11010100B,11011100B

DB 11100100B,11101100B

DB 11110100B,11111100B

IF 1 EQ 0 LengthABits: DB 02,03,03,03,04,04,04,05,05,06,06,07,07,07,07,08 DB

08,08,08,09,09,09,09,09,09,09,10,10,10,10,10,10 APPENDIX 1

DB 10 , 10 , 10 , 11, 11, 11 , 11 , 11 , 11 , 11 , 11 , 11 , 11, 12 , 12 , 12

DB 12,12,12,12,12,12,12,12,12,12,12,13,13,13,13,06 ;

LengthACodeHigh:

DB lllOOOOOB,10110000B,10010000B,01110000B

DB 01011000B,01001000B,00111000B,00101100B

DB 00100100B,00011110B,00011010B,00010011B DB 00010001B,00001111B,00001101B,OOOOIOIIB

DB OOOOIOIOB,00001001B,OOOOIOOOB,00000111B

DB 00000111B,OOOOOllOB,OOOOOllOB,00000101B

DB 00000101B,00000100B,00000100B,00000100B

DB OOOOOOllB,OOOOOOllB,OOOOOOllB,OOOOOOllB DB 00000010B,00000010B,00000010B,00000010B

DB 00000010B,OOOOOOOIB,OOOOOOOIB,OOOOOOOIB

DB OOOOOOOIB,OOOOOOOIB,OOOOOOOIB,OOOOOOOIB

DB OOOOOOOIB,OOOOOOOOB,OOOOOOOOB,OOOOOOOOB

DB OOOOOOOOB,OOOOOOOOB,OOOOOOOOB,OOOOOOOOB DB OOOOOOOOB,OOOOOOOOB,OOOOOOOOB,OOOOOOOOB

DB OOOOOOOOB,OOOOOOOOB,OOOOOOOOB,OOOOOOOOB

DB OOOOOOOOB,OOOOOOOOB,OOOOOOOOB,OOOIOIOOB ; no guard bit ; ; on index 63 LengthACodeLow:

DB OOOOOOOOB,OOOOOOOOB,OOOOOOOOB,OOOOOOOOB

DB OOOOOOOOB,OOOOOOOOB,OOOOOOOOB,10Θ-00000B DB 10000000B,10000000B,10000000B,11000000B •

DB 01000000B,11000000B,01000000B,11000000B

DB 01000000B,11000000B,OllOOOOOB,OOIOOOOOB

DB lllOOOOOB,lOlOOOOOB,OllOOOOOB,OOIOOOOOB

DB lllOOOOOB,lOlOOOOOB,OllOOOOOB,00110000B DB 00010000B,11110000B,11010000B,10110000B

DB 10010000B,01110000B,01010000B,00110000B APPENDIX 1

DB OOOIOOOOB,lllllOOOB,IIIOIOOOB,IIOIIOOOB

DB IIOOIOOOB,10111000B,10101000B,10011000B

DB 10001000B,OllllOOOB,OllOlOOOB,OlOllOOOB

DB OIOOIOOOB,OOlllOOOB,OOIOIOOOB,OOOlllOOB

DB OOOIOIOOB,OOOOllOOB,OOOOOIOOB,OOOOOOOOB

o, o, 0, 6, 1

0, 6, 1, 2, 0

1, 4, 1, 0, 0 0,12, 1, 4, 1 1, 0, 0, 0, 0 1, 4, 1 , 0, 0 1, 0, 0, 2, 0 1, 8, 1, 4, 1 0, 4, 1, 0, 0 0,16, 1, 8, 1 0, 0, 0, 4, 1 0,16, 1, 8, 1 0, 0, 0, 4, 1 0, 8, 1, 4, 1 0, 4, 1, 0, 0

0, 0, 0

LengthAValue:

DB 0, 0, 0, 0, 2, 1, 0, 0

DB 0, 3, 5, 4, 0, 0, 0, 6

DB 8, 7, 0, 0, 0, 0,10, 9

DB 0,63, 12,11, 0, 0, 0, 0 DB 14,13, 0, 0,16,15,18,17

DB 0, 0, 0, 0, 0, 0,20,19

DB 22,21, 0, 0,24,23, 0,25

DB 27,26, 0, 0, 0, 0, 0, 0

DB 29,28, 31,30, 0, 0,33,32 DB 0,34, 36,35, 0, 0, 0, 0

DB 0, 0, 38,37,40,39, 0, 0 APPENDIX 1

DB 42,41,44,43, 0, 0, 0, 0

DB 0, 0,46,45,48,47, 0, 0

DB 50,49,52,51, 0, 0, 0, 0

DB 54,53,56,55, 0, 0,58,57 DB 0, 0,60,59,62,61

ENDIF

.

LengthBBits:

DB 01,03,03,04,05,05,05,06,06,04 ;

LengthBCode: DB llOOOOOOB,OlllOOOOB,OIOIOOOOB,OOlllOOOB,OOOlllOOB DB OOOIOIOOB,OOOOllOOB,OOOOOllOB,OOOOOOIOB,OOIOIOOOB

LengthBNext:

DB 2, 0, 4, 1, 0, 0, 4, 1, 0, 0 DB 4, 1, 0, 0, 2, 0, 0, 0 ;

LengthBValue:

DB 0, 0, 0, 0, 2, 1, 0, 0, 9, 3 DB 0, 0, 5, 4, 0, 6, 8, 7

IF BufferSize EQ 8192

ZoneBits:

DB 02,03,03,04,05,05,05,05,05,05,06,06,06,06,06,06

DB 06,06,06,06,07,07,07,07,07,07,07,07,07,07,07,07

.

ZoneCode:

DB lllOOOOOB,10110000B,10010000B,OllllOOOB

DB OllOllOOB,OllOOlOOB,OlOlllOOB,01010100B DB 01001100B,01000100B,00111110B,00111010B

DB 00110110B,00110010B,00101110B,00101010B APPENDIX 1

DB 00100110B,00100010B,00011110B,00011010B

DB 00010111B,00010101B,00010011B,00010001B

DB OOOOllllB,OOOOIIOIB,OOOOIOIIB,OOOOIOOIB

DB OOOOOlllB,OOOOOIOIB,OOOOOOllB,OOOOOOOIB

ZoneNext:

DB 6, 1, 2, 0, 0, 0,14, 1

DB 6, 1, 2, 0, 0, 0, 4, 1

DB 0, 0, 0, 0,16, 1, 8, 1

DB 4, 1, 0, 0, 0, 0, 4, 1

DB 0, 0, 0, 0,12, 1, 4, 1

DB 0, 0, 4, 1, 0, 0, 0, 0

DB 8, 1, 4, 1, 0, 0, 0, 0

DB 4, 1, 0, 0, 0, 0

ZoneValue:

DB 0, 0, 0, 0, 2, 1, 0, 0

DB 0, 0, 0, 3, 5, 4, 0, 0

DB 7, 6, 9, 8, 0, 0, 0, 0 DB 0, 0,11,10,13,12, 0, 0

DB 15,14,17,16, 0, 0, 0, 0

DB 19,18, 0, 0,21,20,23,22

DB 0, 0, 0, 0,25,24,27,26

DB 0, 0,29,28,31,30

ELSE ZoneBits:

DB

02,02,03,04,04,05,05,05,05,05,06,06,06,06,06,06

.

ZoneCode:

DB lllOOOOOB,10100000B,OlllOOOOB,OlOllOOOB

DB OIOOIOOOB,00111100B,00110100B,00101100B

DB 00100100B,OOOlllOOB,00010110B,00010010B DB 00001110B,OOOOIOIOB,OOOOOllOB,OOOOOOIOB APPENDIX 1

ZoneNext:

DB 4, 1, 0, 0, 6, 1, 2, 0

DB 0, 0, 8, 1, 4, 1, 0, 0

DB 0, 0, 6, 1, 2, 0, 0, 0

DB 4, 1, 0, 0, 0, 0

ZoneValue:

DB 0, 0, 1, 0, 0, 0, 0, 2

DB 4, 3, 0, 0, 0, 0, 6, 5 DB 8, 7, 0, 0, 0, 9,11,10

DB 0, 0,13,12,15,14 ENDIF

CRC_TH:

DB OOOH,011H,023H,032H,046H,057H,065H,074H 000

DB 08CH,09DH,OAFH,OBEH,OCAH,ODBH,0E9H,OF8H 008

DB 010H,001H,033H,022H,056H,047H,075H,064H 010

DB 09CH,08DH,OBFH,OAEH,ODAH,OCBH,0F9H,0E8H 018 DB 021H,030H,002H,013H,067H,076H,044H,055H 020

DB OADH,OBCH,08EH,09FH,OEBH,OFAH,0C8H,0D9H 028

DB 031H,020H,012H,003H,077H,066H,054H,045H 030

DB OBDH,OACH,09EH,08FH,OFBH,OEAH,0D8H,0C9H 038

DB 042H,053H,061H,070H,004H,015H,027H,036H 040 DB 0CEH,0DFH,0EDH,0FCH,088H,099H,0ABH,0BAH 046

DB 052H,043H,071H,060H,014H,005H,037H,026H 050

DB ODEH,OCFH,OFDH,OECH,098H,089H,OBBH,OAAH 058

DB 063H,072H,040H,051H,025H,034H,006H,017H 060

DB OEFH,OFEH,OCCH,ODDH,0A9H,0B8H,08AH,09BH 068 DB 073H,062H,050H,041H,035H,024H,016H,007H 070

DB OFFH,OEEH,ODCH,OCDH,0B9H,0A8H,09AH,08BH 078

DB 084H,095H,0A7H,0B6H,0C2H,0D3H,0E1H,OFOH 080

DB 008H,019H,02BH,03AH,04EH,05FH,06DH,07CH 088

DB 094H,085H,0B7H,0A6H,0D2H,0C3H,0F1H,OEOH 090 DB 018H,009H,03BH,02AH,05EH,04FH,07DH,06CH 098

DB 0A5H,0B4H,086H,097H,0E3H,0F2H,OCOH,0D1H OAO APPENDIX 1

DB 029H,038H,OOAH,01BH,06FH,07EH,04CH,05DH 0A8

DB 0B5H,0A4H,096H,087H,0F3H,0E2H,ODOH,OCIH OBO

DB 039H,028H,01AH,OOBH,07FH,06EH,05CH,04DH 0B8

DB 0C6H,0D7H,0E5H,0F4H,080H,091H,0A3H,0B2H OCO

DB 04AH,05BH,069H,078H,OOCH,01DH,02FH,03EH OCδ

DB 0D6H,0C7H,0F5H,0E4H,090H,OδlH,0B3H,0A2H ODO

DB 05AH,04BH,079H,06δH,01CH,OODH,03FH,02EH 0D8

DB 0E7H,0F6H,0C4H,0D5H,0A1H,OBOH,082H,093H OEO

DB 06BH,07AH,048H,059H,02DH,03CH,OOEH,01FH OEδ

DB 0F7H,0E6H,0D4H,0C5H,0B1H,OAOH,092H,063H OFO

DB 07BH,06AH,058H,049H,03DH,02CH,01EH,OOFH 0F8

CRC_TL:

DB OOOH,089H,012H,09BH,024H,OADH,036H,OBFH 000

DB 04δH,OCIH,05AH,0D3H,06CH,0E5H,07EH,0F7H 008

DB 081H,008H,093H,01AH,0A5H,02CH,0B7H,03EH 010

DB 0C9H,040H,ODBH,052H,OEDH,064H,OFFH,076H 018

DB 002H,06BH,010H,099H,026H,OAFH,034H,OBDH 020

DB 04AH,0C3H,058H,0D1H,06EH,0E7H,07CH,0F5H 028

DB 083H,OOAH,091H,OlβH,0A7H,02EH,0B5H,03CH 030

DB OCBH,042H,0D9H,050H,OEFH,066H,OFDH,074H 038

DB 004H,08DH,OlβH,09FH,020H,0A9H,032H,OBBH 040

DB 04CH,0C5H,05EH,0D7H,068H,0E1H,07AH,0F3H 048

DB 085H,OOCH,097H,01EH,0A1H,028H,0B3H,03AH 050

DB OCDH,044H,ODFH,056H,0E9H,060H,OFBH,072H 058

DB 006H,08FH,014H,09DH,022H,OABH,030H,0B9H 060

DB 04EH,0C7H,05CH,0D5H,06AH,0E3H,078H,0F1H 068

DB 087H,OOEH,095H,01CH,0A3H,02AH,0B1H,038H 070

DB OCFH,046H,ODDH,054H,OEBH,062H,0F9H,070H 078

DB 008H,081H,01AH,093H,02CH,0A5H,03EH,0B7H OδO

DB 040H,0C9H,052H,ODBH,064H,OEDH,076H,OFFH Oδδ

DB 0δ9H,OOOH,09BH,012H,OADH,024H,OBFH,036H 090

DB OCIH,048H,0D3H,05AH,0E5H,06CH,0F7H,07EH 098

DB OOAH,083H,016H,091H,02EH,0A7H,03CH,0B5H OAO

DB 042H,OCBH,050H,0D9H,066H,OEFH,074H,OFDH 0A8

DB 08BH,002H,099H,010H,OAFH,026H,OBDH,034H OBO APPENDIX 1

DB 0C3H,04AH,OD1H,058H,0E7H,06EH,0F5H,07CH 0B8

DB OOCH,085H,01EH,097H,028H,0A1H,03AH,0B3H OCO

DB 044H,OCDH,056H,ODFH,060H,0E9H,072H,OFBH 0C8

DB 08DH,004H,09FH,016H,0A9H,020H,OBBH,032H ODO

DB 0C5H,04CH,0D7H,05EH,OEIH,068H,0F3H,07AH 0D8

DB OOEH,087H,01CH,095H,02AH,0A3H,038H,0B1H OEO

DB 046H,OCFH,054H,ODDH,062H,OEBH,070H,0F9H OEδ

DB 08FH,006H,09DH,014H,OABH,022H,0B9H,030H OFO

DB 0C7H,04EH,0D5H,05CH,0E3H,06AH,0F1H,078H OFδ

printstat Data, size, is, %$-tb

******************** D E B U G G E R ********************

DEBUGGER or DUMMY INCLUSION

include TCdbg001

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

unplanned_int: brk ;break: ;break_out: nop nop nop rti

*************** V E C T O R T A B L E ****************

VECTOR TABLE

ds 0-progaddr-($-cb)-32,0

Jsb 0

dw unplanned_int

Jsb 1 dw unplanned_int

Jsb 3 dw unplanned_int

Jsb 4 dw unplanned_int ; Jsb 5 dw unplanned_int

Jsb 6 dw unplanned_int ; Jsb 7 dw unplanned_int

.

; Irq6,brea ,PTGA,PTGb,bE dw break ; Irq5,Serin Stat, TimerA dw break_out

; Irq4,PA3,Edge/bF dw unplanned_int ; Irq3,Host/Timerb dw Hostlnt ; Irq2,Pb§ Edge dw unplanned_int ; Irql,Pd7 Edge dw unplanned_int ; NMI dw unplanned_int

; unplanned_int
reset: dw dbginit ; start in debugger
printstat <C000-FFFFh Block Free =>,%16384-($-cb)
; end

APPENDIX 2

SOURCE LISTING GUIDE

Page 1, lines 1 through 11

Define the assembly environment.

Page 1, line 12

The "include ITEC19" statement copies a source file which uses the MACRO facility in the assembler to provide some higher level language type constructs.

Page 1, line 13

The "include TCDFM001" statement copies a source file which defines the internal register and I/O structure in the C19.

Page 1, lines 14 through 28

Assembly macros used to manage/display the assembly environment and status.

Page 1, lines 33 through 40

Setting of symbols which control some assembly time features of the algorithm. These are used to enable/disable various structures and code to evaluate compression effectiveness.

Page 1, line 46 through Page 2, line 32

More assembly time controls which affect compression mechanisms and establish sizes of certain memory structures.

Page 2, line 38 through Page 3, line 8

Assembly time controls for diagnostics and speed of execution having little or no effect on compression effectivity.

Page 3, lines 12 through 42

Definition (mapping) of some structures used by the algorithm.

Page 3, line 47 through Page 6, line 36

Declaration of byte (8 bit) and word (16 bit) variables used by the encoding and decoding processes. Variables beginning with "EC" are used by the encoder (compression process) and those beginning with "DC" are used by the decoder (decompression process). Other prefixes are general use.

Page 6, line 42 through Page 7, line 36

Declaration of encoder/decoder structures which are a multiple of 256 bytes in length.

Page 7, line 40 through Page 8, line 27

Declaration of more encoder structures that are a multiple of 256 bytes in length. This block is from absolute address 4000h to 0C000h (size = 32768 bytes) and is bank switched alternating with the next described block.

Page 8, line 33 through Page 9, line 5

The decoder bankswitched block (size = 32768).

Page 9, line 10 through Page 11, line 48

Interface points to the operating system code for the production implementation of the algorithm. In production, these hooks replace development environment code on pages 22 through 25 inclusive.

Page 11, line 11 through Page 14, line 14

Table initialize code. All compression/decompression tables and variables are set to initial conditions.

Page 14, line 18 through Page 16, line 46

Program startup code which sets environment and initializes stacks for alternate execution of encoder/decoder.

Page 17, lines 1 through 19

Context switch subroutines.

Page 17, line 25 through Page 26, line 3

Development environment routines for Memory Dump to PC and character transfer to/from PC bus. Characters transferred are to be compressed or decompressed. The PC interface is an emulation of a standard PC asynchronous communications IC, an INS16450.

Page 26, line 11 through Page 28, line 24

Macro declarations which facilitate the generation of certain microcode routines for bit stream output as either in-line code or as subroutines.

Page 28, line 30 through Page 29, line 11

A macro which embodies the microcode to select the appropriate one of four NCToFrequency tables based on the prior character of the input stream, leaving the base address of the table ECNCChar in ECWord1 and the base address of the table ECNCFreq in ECWord2.

Page 29, line 17 through Page 41, line 21

The body of the FontUpdate Macro. This generates all of the microcode to perform the processes of CRC Hash generation, Font access, Font creation, etc. In general, all of the processes (steps) 3 through 9 as described with Figure 2B.

Page 41, line 27 through Page 45, line 44

The Encode main loop first phase. This is the Refill process which accepts characters from the input stream, stores them in the process buffer (ECChar, ECCharCopy), and invokes the FontUpdate Macro (process). As required by flush operations and ProcessBuffer full conditions, this process invokes the second phase of execution.

Page 46, line 1 through Page 48, line 37

The Mode A string search macro. This embodies the code to locate the longest string in the history buffer matching the string beginning at the position of the ECChar buffer at position A (the value in the C19 accumulator register).
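The longest-match search described for the Mode A string search macro can be illustrated with a short sketch. This is a simplified Python illustration and not a transcription of the C19 macro: the function name `longest_match`, the explicit list of candidate start positions, and the `min_len` threshold are assumptions introduced here. The listing locates candidates through hash-indexed access tables and also rejects strings overlapping the next store location, both of which this sketch leaves out.

```python
def longest_match(history, candidates, target, min_len=2):
    """Return (position, length) of the longest prefix of `target`
    found at any candidate position in `history`.

    `candidates` stands in for whatever positions the access tables
    would supply; `min_len` is a hypothetical minimum string length.
    """
    best_pos, best_len = None, 0
    for pos in candidates:
        n = 0
        # Extend the match one character at a time.
        while (n < len(target)
               and pos + n < len(history)
               and history[pos + n] == target[n]):
            n += 1
        # Keep the first candidate that gives a strictly longer match.
        if n > best_len:
            best_pos, best_len = pos, n
    if best_len < min_len:
        return None, 0
    return best_pos, best_len
```

For example, searching the history `b"abcabcabd"` at candidate positions 0, 3 and 6 for the target `b"abd"` selects position 6, where all three characters match.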

Page 48, line 41 through Page 52, line 43

The Mode A string find routines. These routines perform two iterations of the above macro, reject strings overlapping the next history buffer stream location, and select the longer of the two if two were found.

Page 52, line 44 through Page 54, line 44

The Mode A Pair Encoding and bit cost comparison routines.

Page 54, line 45 through Page 57, line 19

Mode A String bit cost computation and comparison with Font encoding.

Page 57, line 25 through Page 62, line 8

Mode A bit stream format and output routines.

Page 62, line 12 through Page 67, line 34

Mode A repeats output, history buffer and access table update routine, and phase 2 iteration (flush mode or normal) control.

Page 67, line 40 through Page 72, line 3

Mode B String search macro routines.

Page 72, line 6 through Page 74, line 16

Mode B String find routines. Performs Mode B string search macros and rejects strings which overlap next history buffer store location.

Page 74, line 17 through Page 76, line 13

Mode B Pair encoding and bit cost comparison subroutines.

Page 76, line 14 through Page 78, line 27

Mode B string bit cost comparison routines.

Page 78, line 32 through Page 79, line 40

Mode B antiexpansion summing routines.

Page 79, line 46 through Page 80, line 49

Mode B history buffer and access table update.

Page 81, line 1 through Page 92, line 18

Mode B bit stream format and output.

Page 92, line 17 through Page 96, line 29

Decoder macros for character input/output and bit stream fetch.

Page 96, line 33 through Page 105, line 17

Decoder main body.

Page 105, line 26 through Page 107, line 9

Encoding Table and FontCode (Huffman Font codes) tables. Used to emit Font encoding bit patterns.

Page 107, line 14 through Page 109, line 9

Decoder Huffman Font decoding trees.

Page 109, line 13 through Page 109, line 21

New Character to Frequency preload tables.
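The guide mentions four NCToFrequency tables selected by the prior character of the input stream, and claim 5 identifies the selecting bits as bits 5 and 6 of that character. A minimal Python sketch of such a selection follows. It assumes that bits are numbered from 0 at the least significant bit, which the text does not state; the function name `nc_table_index` is hypothetical.

```python
def nc_table_index(prior_char: int) -> int:
    """Pick one of four tables from bits 5 and 6 of the prior character.

    Assumption: bit 0 is the least significant bit, so the result is
    the two-bit field (prior_char >> 5) & 0b11.
    """
    return (prior_char >> 5) & 0b11
```

Under that numbering assumption, 7-bit ASCII falls into four groups of 32 codes: control characters map to 0, space through `?` to 1, `@` through `_` to 2, and the backquote through DEL to 3.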

Page 109, line 23 through Page 109, line 41

Font Bits table. Used for computing bit cost of Font encodings.

Page 109, line 43 through Page 110, line 10

Global bits Table. Used to compute the bit cost of NewChar and any other encodings which use the GlobalBits tables.

Page 110, line 12 through Page 115, line 25

The GlobalBits High and Low Huffman tables, used as a pair to encode any items such as NewChar, Mode A String length, and Repeat count.

Page 115, line 27 through Page 117, line 7

The LengthA encoding tables. Not in use by the preferred embodiment.

Page 117, line 9 through Page 117, line 22

The LengthBBits, LengthBCode, LengthBValue and LengthBNext tables. LengthBBits and LengthB value are used for encoding the Mode B string length. LengthBNext and LengthBValue are used for decoding Mode B string lengths.
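The interplay of the LengthB tables can be traced with a short Python sketch. The table contents below are transcribed from the Appendix 1 listing as read through its OCR damage, so they should be treated as a best reading. The bit-extraction rule (code bits left-justified in a byte and terminated by a single 1 "guard bit") is inferred from the listing's comment about a guard bit and from the agreement with LengthBBits; the Python function names are hypothetical. The decode loop follows the DCLengthLong macro: add the incoming bit to the index, then either jump by the LengthBNext entry or, when that entry is zero, return the LengthBValue entry.

```python
LENGTH_B_BITS = [1, 3, 3, 4, 5, 5, 5, 6, 6, 4]
LENGTH_B_CODE = [0b11000000, 0b01110000, 0b01010000, 0b00111000, 0b00011100,
                 0b00010100, 0b00001100, 0b00000110, 0b00000010, 0b00101000]
LENGTH_B_NEXT = [2, 0, 4, 1, 0, 0, 4, 1, 0, 0, 4, 1, 0, 0, 2, 0, 0, 0]
LENGTH_B_VALUE = [0, 0, 0, 0, 2, 1, 0, 0, 9, 3, 0, 0, 5, 4, 0, 6, 8, 7]


def code_bits(code_byte):
    """Strip the trailing guard bit and return the code bits, MSB first."""
    guard = (code_byte & -code_byte).bit_length() - 1  # lowest set bit
    length = 7 - guard
    return [(code_byte >> (7 - i)) & 1 for i in range(length)]


def decode_length_b(bits):
    """Walk the Next/Value tables the way DCLengthLong does."""
    x = 0
    for bit in bits:
        x += bit                    # INX when the bit is 1
        step = LENGTH_B_NEXT[x]
        if step == 0:               # leaf reached
            return LENGTH_B_VALUE[x]
        x += step                   # interior node: jump to its children
    raise ValueError("bit stream ended inside a code")


for index, code in enumerate(LENGTH_B_CODE):
    bits = code_bits(code)
    assert len(bits) == LENGTH_B_BITS[index]
    assert decode_length_b(iter(bits)) == index
```

The closing loop checks that every code has the length given in LengthBBits and decodes back to its own index, which supports this reading of the four tables.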

Page 117, line 24 through Page 118, line 32

The ZoneBits, ZoneCode, ZoneNext and ZoneValue tables. ZoneBits and ZoneCode are used for encoding the Zone portion of Mode A and Mode B string location offsets. ZoneNext and ZoneValue are used for decoding the Zone portion of Mode A and Mode B string location offsets.

Page 118, line 34 through Page 120, line 3

The precalculated CRC table. Used for rapid CRC hash calculations in the FontAccess Routines.

Page 120, line 11

Inclusion of the C19 debugger (soft monitor) file.

Page 120, line 15 through Page 121, line 17

Vector jump tables for the C19 hardware vectoring system.
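The precalculated CRC table mentioned in this guide appears in Appendix 1 as separate high-byte and low-byte tables (CRC_TH and CRC_TL). The leading entries that survive the OCR (011H/089H, 023H/012H, 032H/09BH, ...) agree with the widely used reflected CCITT polynomial 8408h, so the Python sketch below regenerates tables of that form. The polynomial is inferred from those values and is not stated in the text. The way a character pair is fed through the table, and the names `crc_update` and `pair_hash`, are assumptions made for illustration only.

```python
POLY = 0x8408  # reflected CCITT polynomial (assumed; inferred from the table values)


def build_crc_tables():
    """Generate 256-entry high-byte and low-byte CRC lookup tables."""
    high, low = [], []
    for index in range(256):
        crc = index
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
        high.append(crc >> 8)
        low.append(crc & 0xFF)
    return high, low


CRC_TH, CRC_TL = build_crc_tables()


def crc_update(crc, byte):
    """Fold one byte into a 16-bit CRC using the two tables."""
    index = (crc ^ byte) & 0xFF
    return ((CRC_TH[index] << 8) | CRC_TL[index]) ^ (crc >> 8)


def pair_hash(first, second):
    """Hypothetical hash of an ordered character pair (start value 0 assumed)."""
    return crc_update(crc_update(0, first), second)
```

Splitting the table into high and low bytes suits an 8-bit processor, since each lookup is then a single indexed byte load.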

Claims

What is claimed is:

1. A system for the dynamic encoding of a character stream, the system comprising:
an input for receiving the character stream;
an output for providing encoded data;
single character encoding means, connected to the input, for providing, for a given character, an encoded signal indicative of the given character, including
a) means, hereinafter referred to as "font means," connected to the input, associated with a character pair, hereinafter referred to as "the given character pair", for storing, accessing and updating for each given character of a plurality of characters, a table listing the set of candidates for the character that may follow the given character pair in the stream, such table hereinafter referred to as a "font"; wherein all the candidates in such font are stored in approximate order of their local frequency of occurrence after the given character pair with which the font is associated;
b) font identification means, connected to the input, for identifying the font, hereinafter referred to as the "given font", for that character in the stream at the input; and
c) position encoding means for providing, for one given character, a signal indicative of the position, occupied by the given character, in the given font;
string encoding means, connected to the input, for providing, for a given string of characters, an encoded signal indicative of the given string of characters, including:
a) a history buffer;
b) history buffer access means for finding a candidate string in the history buffer; and
c) longest match search means for searching for longest match by comparing an object string in the character stream with a candidate string in the history buffer; and
output selection means for accepting encoded signals from the single character encoding means and encoded signals from the string encoding means and selectively sending these encoded signals to the output;
wherein the font identification means further includes hash encoding means for producing hash codes and hash code storage means for storing hash codes and the history buffer access means further includes means for retrieving hash codes from the hash code storage means, such that a common hash code is used by both the font encoding means and the string encoding means.

2. A system according to claim 1, wherein the hash encoding means includes means for applying a CRC algorithm to an ordered character pair to produce a hash code.

3. A system according to claim 1, further including: means for maintaining a value for the position of any character that is not otherwise listed in the font, such character hereinafter referred to as "new character" or "NC", in relation to other candidates in a given font, in approximate order of such new character's local frequency of occurrence after the given character pair; such that new character is assigned a "virtual position" in the font, as distinct from a position that is associated with a location in the font capable of storing a specific candidate character; and such that the address of the position of each candidate character below the new character position in the table is incremented by 1.

4. A system according to claim 3, further including: a plurality of NC fonts, each font listing the candidates for the new character which may follow a given set of characters in the character stream wherein all the candidates in such font are stored in approximate order of their local frequency of occurrence after the given set of characters with which the font is associated; and NC font selection means for selecting the NC font to be used to encode a given new character based on predefined bits from the set of characters preceding the given character in the character stream.

5. A system according to claim 4, wherein the number of NC fonts is four and the predefined bits are bits 5 and 6 from the character prior to the given character.
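
By way of illustration only, the NC font selection of claim 5 might be sketched as follows. The function name `select_nc_font` is hypothetical, and the sketch assumes that bits are numbered from 0 at the least significant bit, which the claim does not state.

```python
NUM_NC_FONTS = 4  # claim 5: the number of NC fonts is four


def select_nc_font(prior_char: int) -> int:
    """Return the index (0..3) of the NC font picked by bits 5 and 6.

    Assumption: bit 0 is the least significant bit of the prior character.
    """
    return (prior_char >> 5) & 0b11
```

Under that numbering and an ASCII character set, the two bits separate control characters (index 0), digits and most punctuation (index 1), upper-case letters (index 2) and lower-case letters (index 3).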

6. A system according to claim 1, further including: means for maintaining a value for the position of a string in a given font.

7. A system according to claim 1, further including: means for maintaining a value for the position of a new character, i.e., any character that is not otherwise listed in the font, in relation to other candidates in a given font, in approximate order of such new character's local frequency of occurrence after the given character pair; and means for maintaining a value for the position of a string; such that the value for the position of a string is one greater than the value for the position of a new character; such that the string is assigned a "virtual position" in the font as distinct from a position associated with a location in the font capable of storing a specific candidate character; and such that the address of the position of each candidate character below the new character position in the font is incremented by 2.
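
The virtual positions of claim 7 can be sketched as below. The font is modelled as a plain Python list of candidate characters in rank order, and `NEW_CHAR`, `STRING` and `coded_position` are hypothetical names introduced only for this illustration.

```python
NEW_CHAR = "NC"      # hypothetical marker for the new character symbol
STRING = "STRING"    # hypothetical marker for the string symbol


def coded_position(font, nc_pos, symbol):
    """Map a symbol to its coded position in a font.

    font   : list of candidate characters, most frequent first
    nc_pos : virtual position currently held by the new character symbol
    symbol : a character code, NEW_CHAR, or STRING
    """
    if symbol == NEW_CHAR:
        return nc_pos
    if symbol == STRING:
        return nc_pos + 1          # one greater than the new character
    if symbol not in font:
        return nc_pos              # unlisted characters code as new character
    slot = font.index(symbol)
    # Candidates stored below the two virtual positions are shifted by 2.
    return slot if slot < nc_pos else slot + 2
```

For a font holding `e`, `t`, `a` with the new character at virtual position 1, the coded positions run `e` = 0, new character = 1, string = 2, `t` = 3, `a` = 4, while the font itself still stores only three candidates.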

8. A system according to claim 1, further including: repeat character encoding means for encoding repeat character sequences, i.e. characters all alike, found in the character stream; wherein the history buffer stores characters found in the character stream; and wherein repeat character sequences having three or more characters are represented in the history buffer by three characters only.
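
One possible way to hold repeat character sequences in the history buffer as three characters only, per claim 8, is sketched below. The function name `append_to_history` is hypothetical, and the sketch assumes the repeat count itself is handled elsewhere by the repeat character encoding means.

```python
def append_to_history(history: bytearray, data: bytes) -> None:
    """Append data to the history buffer, capping any run at three characters."""
    for byte in data:
        run_of_three = (
            len(history) >= 3
            and history[-1] == byte
            and history[-2] == byte
            and history[-3] == byte
        )
        if run_of_three:
            continue  # fourth and later repeats are not stored
        history.append(byte)
```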

9. A system according to claim 3, wherein the string encoding means has a plurality of modes of operation, the system further including: means for summing, over a predetermined number of new character occurrences, the bit-count of the code for each new character encoded; means for comparing the sum with a predetermined value; and switch means for switching modes whenever the bit-count exceeds the predetermined value.

10. A system according to claim 9, wherein the predetermined value has a value between seven bits per character and eight bits per character.

11. A system according to claim 9, wherein the predetermined value is 7.5 bits per character.
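
The mode switch test of claims 9 to 11 can be sketched as follows. The class name `ModeSwitchMonitor` and the window of 16 new character occurrences are assumptions made for illustration; the claims fix only the threshold range and the 7.5 bits per character value.

```python
class ModeSwitchMonitor:
    """Sum new-character code lengths over a window and test a threshold."""

    def __init__(self, window: int = 16, bits_per_char: float = 7.5):
        self.window = window                    # assumed number of occurrences
        self.limit = bits_per_char * window     # predetermined value, in bits
        self.total = 0
        self.count = 0

    def record(self, bit_count: int) -> bool:
        """Record one new-character code; return True when modes should switch."""
        self.total += bit_count
        self.count += 1
        if self.count < self.window:
            return False
        exceeded = self.total > self.limit
        self.total = 0                          # start the next window
        self.count = 0
        return exceeded
```

With a window of four, four 8-bit codes sum to 32 bits, which exceeds the 30-bit limit and triggers a switch; codes of 7, 7, 8 and 8 bits sum to exactly 30 and do not.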

12. A system for providing, for a given string of characters, an encoded signal indicative of the given string of characters, comprising: a) a history buffer tagged at regular intervals; b) history buffer access means for finding a candidate string in the history buffer; and c) longest match search means for searching for longest match by comparing an object string in the character stream with a candidate string in the history buffer; d) a hash head table, which may be entered by a hash code derived from consecutive characters; e) a hash link/test table, having a number of records equal to the number of tagged entries in the history buffer, each record having a link field and a test field and an address related to the address of the corresponding tagged entry in the history buffer; wherein the hash head table contains pointers, consisting of part of the hash code, each pointing to the first candidate match in a linked list of candidates in the Hash Link field and the Hash Test field contains a match value consisting of another part of the hash code.

13. A system according to claim 12, wherein the longest match search means includes means for testing for a match beginning at a character in the candidate string at least one character ahead of the first character in such string.

14. A system according to claim 13, wherein the longest match search means includes means for testing for a match beginning at character "n" ahead of the first character of the candidate string in the history buffer, where "n" is the length of the longest match found so far, and searching forward to identify the longest match.

15. A system according to claim 14, wherein the longest match search means further includes means for searching back for the longest match.
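
The search order of claims 13 to 15 might be sketched as below: each candidate is first tested at character "n", the length of the longest match so far, then extended forward, then checked backward to the start. The function name `longest_match` and the list of candidate start offsets are hypothetical.

```python
from typing import List, Tuple


def longest_match(buffer: bytes, candidates: List[int], obj: bytes) -> Tuple[int, int]:
    """Return (start, length) of the longest match, or (-1, 0) if none."""
    best_start, best_len = -1, 0
    for start in candidates:
        n = best_len
        # A candidate can only beat the best so far if it matches at offset n.
        if n >= len(obj) or start + n >= len(buffer):
            continue
        if buffer[start + n] != obj[n]:
            continue
        # Search forward from offset n.
        end = n
        while (end < len(obj) and start + end < len(buffer)
               and buffer[start + end] == obj[end]):
            end += 1
        # Search back from offset n - 1 to confirm the leading characters.
        k = n - 1
        while k >= 0 and buffer[start + k] == obj[k]:
            k -= 1
        if k < 0 and end > best_len:
            best_start, best_len = start, end
    return best_start, best_len
```

Testing at offset "n" first lets most candidates that cannot improve on the current best be rejected after a single comparison.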

16. A system according to claim 12, further including means for discarding string matches having less than a predetermined number of characters.

17. A system according to claim 16, wherein the predetermined number is 3.

18. A system according to claim 12, wherein the linked list is terminated by non-match of the contents of the Hash Test field with its corresponding part of the hash code.

19. A system according to claim 1, further including pair encoding means for encoding two characters by presenting the two characters in sequence to a CRC algorithm.

20. A system according to claim 19, wherein pair encoding processes and string encoding processes may be active at the same time.

21. An improved data compression modem of the type having terminal interface control means for controlling an interface with a terminal, data compression means for compressing data from the terminal, line control means for controlling data flow over a data line, line interface means for interfacing with a data line, wherein the improvement comprises: (a) first processor means for controlling both flow of data over the interface with the terminal and for compressing data from the terminal, and (b) second processor means for controlling flow of data over the data line.

22. An improved data compression modem of the type having terminal interface control means for controlling an interface with a terminal, data compression and decompression means for compressing data received from the terminal and for decompressing data going to the terminal, line control means for controlling data flow over a data line, line interface means for interfacing with a data line, wherein the improvement comprises: (a) first processor means for controlling both flow of data over the interface with the terminal and for compressing data received from the terminal, and for decompressing data going to the terminal, and (b) second processor means for controlling flow of data over the data line.

23. An improved data compression modem according to claim 21, wherein the first processor and the second processor access a common memory.

24. A method for dynamically encoding a character stream, in an encoder having a history buffer and fonts, comprising the following steps: a) receiving the character stream; b) creating, from a two-character string, having a first character and a second character, a hash code; c) associating each font with a pair of characters; d) maintaining the position of a candidate character in a font in approximate order of the local frequency of occurrence of the candidate character in the character stream after the pair of characters with which the font is associated; e) encoding a given character using the hash code to access the font associated with the pair of characters immediately preceding the given character in the character stream; and f) encoding a given string of characters using the hash code to access a matching string in the history buffer.

25. A method for dynamically encoding a character stream, in an encoder having a history buffer and having fonts that are dynamically created and updated, comprising: a) receiving the character stream; b) creating, from a two-character string, having a first character and a second character, a hash code having desirable statistical properties and a match code; c) associating each font with a pair of characters; d) maintaining the position of a candidate character in a font in approximate order of the local frequency of occurrence of the candidate character in the character stream after the pair of characters with which the font is associated; e) encoding a given character using the hash code and the match code to access the font associated with the pair of characters immediately preceding the given character in the character stream; and f) encoding a given string of characters using the hash code and the match code to access a matching string in the history buffer.

26. A method for creating, from a two-byte string having a first byte and a second byte, a hash code having desirable statistical properties, comprising: encoding the two-byte string by presenting the two bytes in sequence to a CRC algorithm to produce a CRC hash; and designating selected bits from the CRC hash for use as a hash code.

27. A method for creating, from a two-byte string having a first byte and a second byte, a hash code having desirable statistical properties and a match code for resolving ambiguity, comprising: encoding the two-byte string by presenting the two bytes in sequence to a CRC algorithm to produce a CRC hash; designating selected bits from the CRC hash for use as a hash code; and designating the remaining bits from the CRC hash for use as a match code.
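
A minimal sketch of the hash code and match code of claim 27 follows. It assumes a 16-bit CRC with the CCITT polynomial as provided by Python's `binascii.crc_hqx`, a zero initial value, and the low ten bits as the hash code; the claims name neither the polynomial nor which bits are selected, and the function names are hypothetical.

```python
import binascii

HASH_BITS = 10  # ten bits for the hash code, as in claim 28


def crc_pair(first: int, second: int) -> int:
    """Present the two bytes in sequence to a 16-bit CRC (assumed CCITT)."""
    crc = binascii.crc_hqx(bytes([first]), 0)      # intermediate CRC hash
    return binascii.crc_hqx(bytes([second]), crc)  # final CRC hash


def hash_and_match(first: int, second: int):
    """Split the CRC hash into (hash code, match code)."""
    crc = crc_pair(first, second)
    hash_code = crc & ((1 << HASH_BITS) - 1)   # assumed: low ten bits
    match_code = crc >> HASH_BITS              # remaining six bits
    return hash_code, match_code
```

Because the two bytes are fed in sequence, the ordered pair ("t", "h") yields a different CRC hash from ("h", "t").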

28. A method according to claim 27, wherein ten bits are selected for use as a hash code.

29. A method for accessing a specific font within a data processing system, the system having a link table and a plurality of fonts, each font being uniquely associated with a specific character pair, comprising: accepting a pair of characters, having a first character and a second character, each character represented by a single byte; encoding the pair of characters using a CRC algorithm to produce a CRC hash; selecting a first part of the CRC hash as a look-up code; linking, in the link table, those fonts that are associated with pairs of characters whose encoding produces the same first part of the CRC hash; entering the hash table with the look-up code to access a linked list of fonts; and identifying, from among the fonts in the linked list, the specific font corresponding to the pair of characters, by matching the remainder of the CRC hash.
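
The font access of claim 29 can be illustrated with the sketch below, in which a head table indexed by the look-up code starts a linked list of fonts and the remainder of the CRC hash picks the specific font. The class name `FontIndex`, the parallel-list layout, and the use of `binascii.crc_hqx` as the CRC algorithm are assumptions, not details taken from the claims.

```python
import binascii

HASH_BITS = 10
NIL = -1  # end-of-list marker (hypothetical)


class FontIndex:
    def __init__(self):
        self.head = [NIL] * (1 << HASH_BITS)  # look-up code -> first font slot
        self.link = []    # per slot: next slot with the same look-up code
        self.test = []    # per slot: remainder of the CRC hash
        self.fonts = []   # per slot: the font (list of candidate characters)

    @staticmethod
    def _codes(first, second):
        crc = binascii.crc_hqx(bytes([first, second]), 0)
        return crc & ((1 << HASH_BITS) - 1), crc >> HASH_BITS

    def find(self, first, second):
        """Return the font for the character pair, or None if absent."""
        lookup, match = self._codes(first, second)
        slot = self.head[lookup]
        while slot != NIL:
            if self.test[slot] == match:
                return self.fonts[slot]
            slot = self.link[slot]
        return None

    def add(self, first, second):
        """Create an empty font for the pair and link it at the list head."""
        lookup, match = self._codes(first, second)
        self.link.append(self.head[lookup])
        self.test.append(match)
        self.fonts.append([])
        self.head[lookup] = len(self.fonts) - 1
        return self.fonts[-1]
```

In this sketch the look-up code and the match code together make up the whole 16-bit CRC of a two-byte input, so a matching remainder identifies one character pair without storing the pair itself.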

30. A method according to claim 29, wherein the method of encoding the pair of characters includes: encoding the two-byte string by presenting the two bytes in sequence to a CRC algorithm to produce a CRC hash; and designating selected bits from the CRC hash for use as a hash code.

31. A method according to claim 29, wherein the first part of the CRC hash consists of ten bits.

32. A method, for accessing a specific pair of characters in a history buffer within a system for the dynamic encoding of a character stream, the system having a history buffer containing characters from the character stream, and a link table, comprising: accepting a pair of characters from the character stream, hereinbelow referred to as "the given pair of characters", each pair having a first character and a second character, each character represented by a single byte; encoding the given pair of characters using a CRC algorithm to produce a CRC hash; selecting a first part of the CRC hash as a look-up code; linking, in the link table, history buffer entry points that have pairs of characters in the history buffer whose encoding produces the same first part of the CRC hash; entering the hash table with the look-up code to access a linked list of history buffer entry points; and identifying, from among the history buffer entry points in the linked list, points corresponding to the given pair of characters, by matching the remainder of the CRC hash.

33. A method according to claim 32, wherein the method of encoding the given pair of characters using the CRC algorithm to produce the CRC hash comprises: encoding the byte representing the first character using the CRC algorithm to produce an intermediate CRC hash; encoding the second character using the CRC algorithm and the intermediate CRC hash to produce the CRC hash.

34. A method according to claim 32, wherein the first part of the CRC hash consists of ten bits.

35. A method, for accessing a specific sequence of four characters in a history buffer within a system for the dynamic encoding of a character stream, the system having a history buffer containing characters from the character stream, and a link table, comprising: accepting four consecutive characters, hereinbelow referred to as "the given four characters", comprising a first pair of consecutive characters and a second pair of consecutive characters, each pair having a first character and a second character, each character represented by a single byte, from the character stream; encoding the given four characters using a CRC algorithm to produce a hash code; selecting a first part of the hash code as a look-up code; linking, in the link table, history buffer entry points that have four sequential characters in the history buffer whose encoding produces the same first part of the hash code; entering the link table with the look-up code to access a linked list of history buffer entry points; and identifying, from among the history buffer entry points in the linked list, points corresponding to the given four characters, by matching the remainder of the hash code.

36. A method according to claim 35, wherein the method of encoding the given four characters using a CRC algorithm to produce a hash code comprises: encoding the first pair of characters by presenting the characters in sequence to a CRC algorithm to produce a first pair CRC hash; encoding the second pair of characters by presenting the characters in sequence to a CRC algorithm to produce a second pair CRC hash; subtracting the second pair CRC hash from zero to produce a negated second pair CRC hash; and performing an Exclusive OR operation on the first pair CRC hash and the negated second pair CRC hash to produce a hash code.
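
The four-character hash of claim 36 might be computed as in the sketch below, again assuming a 16-bit CCITT CRC via `binascii.crc_hqx`, 16-bit two's-complement negation, and the low ten bits as the look-up code. The function names are hypothetical.

```python
import binascii

HASH_BITS = 10
MASK16 = 0xFFFF


def pair_crc(first: int, second: int) -> int:
    """CRC hash of two bytes presented in sequence (assumed CCITT, init 0)."""
    return binascii.crc_hqx(bytes([first, second]), 0)


def four_char_hash(chars: bytes):
    """Return (look-up code, remainder) for four consecutive characters."""
    if len(chars) != 4:
        raise ValueError("exactly four characters are required")
    first = pair_crc(chars[0], chars[1])
    second = pair_crc(chars[2], chars[3])
    negated = (0 - second) & MASK16      # subtract from zero, keep 16 bits
    code = first ^ negated               # Exclusive OR of the two values
    return code & ((1 << HASH_BITS) - 1), code >> HASH_BITS
```

A plain Exclusive OR of two identical pair hashes would always be zero, so sequences such as "abab" would all collide; negating the second operand avoids this for nearly all values, which may be the purpose of that step.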

37. A method according to claim 35, wherein the first part of the hash code consists of ten bits.

38. A method for controlling the selection of alternative string encoding modes in a system for the dynamic encoding of a character stream, comprising: maintaining a set of fonts, each font being associated with a pair of characters, wherein all the candidates in such font are stored in approximate order of their local frequency of occurrence after the given character pair with which the font is associated, the fonts further including means for maintaining the position of a symbol for a new character, i.e., any character that is not otherwise listed in the font, in relation to other candidates in a given font in approximate order of such symbol's local frequency of occurrence after the given character pair; maintaining a new character encoding table; encoding new characters from the character stream according to the position of the new character in the new character encoding table; summing the bit cost of encoding each new character over a predetermined plurality of new character occurrences; comparing the sum with a predetermined value; and switching modes whenever the bit-count exceeds the predetermined value.

39. A method for encoding a character pair, within a system for the dynamic encoding of a character stream, the system having a link table and a plurality of fonts, each font being uniquely associated with a specific character pair, comprising: accepting a pair of characters, having a first character and a second character, each character represented by a single byte; encoding the pair of characters using a CRC algorithm to produce a CRC hash; selecting ten bits of the CRC hash as a look-up code; linking, in the link table, those fonts that are associated with pairs of characters whose encoding produces the same first part of the CRC hash; entering the hash table with the look-up code to access a linked list of fonts; identifying, from among the fonts in the linked list, the specific font corresponding to the pair of characters, by matching the remaining six bits of the CRC hash; and encoding the pair of characters as the relative address of the identified font.

40. In a system for dynamic encoding of a character stream having a link table and a plurality of fonts, each font being associated with a unique, ordered character pair, the system encoding a given character by means of the font associated with the pair of characters immediately preceding a given character in the character stream, a method for maintaining fonts that are most recently used, comprising: accepting a pair of characters, having a first character and a second character, each character represented by a single byte; encoding the pair of characters using a CRC algorithm to produce a CRC hash; selecting a first part of the CRC hash as a look-up code; linking, in the link table, fonts that are associated with pairs of characters whose encoding produces the same first part of the CRC hash; entering the hash table with the look-up code to access a linked list of fonts; identifying, from among the fonts in the linked list, the specific font corresponding to the pair of characters, by matching the remainder of the CRC hash; and discarding a font, when a font must be discarded, whose associated character pair was least recently encountered in the character stream.
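
The discard rule of claim 40 amounts to a least-recently-used policy. The sketch below shows that policy only; it substitutes Python's `collections.OrderedDict` for the claimed link table and CRC look-up, and the class name `FontCache` and the fixed capacity are assumptions.

```python
from collections import OrderedDict


class FontCache:
    """Keep at most `capacity` fonts, discarding the least recently used."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.fonts = OrderedDict()   # character pair -> font (list)

    def get(self, pair: bytes) -> list:
        """Return the font for a pair, creating it and evicting if needed."""
        if pair in self.fonts:
            self.fonts.move_to_end(pair)      # mark as most recently encountered
            return self.fonts[pair]
        if len(self.fonts) >= self.capacity:
            self.fonts.popitem(last=False)    # discard least recently encountered
        self.fonts[pair] = []
        return self.fonts[pair]
```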

41. A method for use in a data processing system for finding an object string within a data string comprising: presenting the object string to a CRC algorithm to produce an object string hash code; presenting each of a plurality of candidate strings within the data string to a CRC algorithm to produce a candidate string hash code for each candidate string; identifying candidate strings whose hash code matches the object string hash code; testing a candidate string, whose hash code matches the object string hash code, for a match with the object string.
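
The search of claim 41 can be sketched as follows: the hash codes are compared first, and only candidates whose hash code matches are tested character by character. The function name `find_object_string` and the use of `binascii.crc_hqx` as the CRC algorithm are assumptions; a practical system would reuse stored hash codes rather than recompute one per candidate as this sketch does.

```python
import binascii
from typing import List


def find_object_string(data: bytes, obj: bytes) -> List[int]:
    """Return the start offsets in data at which obj occurs."""
    n = len(obj)
    if n == 0 or n > len(data):
        return []
    object_hash = binascii.crc_hqx(obj, 0)
    hits = []
    for start in range(len(data) - n + 1):
        candidate = data[start:start + n]
        if binascii.crc_hqx(candidate, 0) != object_hash:
            continue                 # hash codes differ: cannot be a match
        if candidate == obj:         # hash codes agree: test the strings
            hits.append(start)
    return hits
```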

42. A method for finding the longest match between an object string in a stream of characters and candidate strings in a buffer, comprising: comparing a character in the object string with a character in a first candidate string; comparing, if the prior comparison yields a match, each next character in the object string with each next character in the first candidate string until the comparison fails to yield a match; storing the number of characters so matched as the length of the longest match; comparing a character in the object string with a character in a second candidate string, starting at a character ahead of the origin of each string by a number of characters substantially equal to the length of the longest match.