US20010025315A1 - Term addressable memory of an accelerator system and method - Google Patents

Publication number
US20010025315A1
Authority
US
United States
Prior art keywords
memory
tcp
network
data
packet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US09/756,667
Other versions
US6768992B1 (en
Inventor
Lynne Jolitz
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US09/147,856 (US6173333B1)
Application filed by Individual
Priority to US09/756,667 (US6768992B1)
Publication of US20010025315A1
Application granted
Publication of US6768992B1
Adjusted expiration
Expired - Lifetime

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16 Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/16 Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L69/161 Implementation details of TCP/IP or UDP/IP stack architecture; Specification of modified or new header fields
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22 Parsing or analysis of headers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40 Network security protocols
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y10S707/99935 Query augmenting and refining, e.g. inexact access
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00 Data processing: database and file management or data structures
    • Y10S707/99931 Database or file accessing
    • Y10S707/99933 Query processing, i.e. searching
    • Y10S707/99936 Pattern matching access

Definitions

  • a network accelerator and method for TCP/IP that includes programmable logic for performing network protocol processing at network signaling rates.
  • the programmable logic is configured in a parallel pipelined architecture controlled by state machines and implements processing for predictable patterns of the majority of transmissions.
  • incoming packets are compared with patterns corresponding to classes of transmissions which are stored in a content addressable memory and are simultaneously stored in a dual port, dual bank application memory.
  • the patterns are used to determine sessions to which an incoming IP datagram belongs, and data packets stored in the application memory are processed by the programmable logic. Processing of packet headers is performed in parallel and during memory transfer without the necessity of conventional store and forward techniques resulting in a substantial reduction in latency. Packets which constitute exceptions or which have checksum or other errors are processed in software.
  • VxCAM VIRTUAL EXTENSIBLE CONTENT ADDRESSABLE MEMORY
  • the present invention relates to Internet communications in general, and to a method and system in particular for substantially increasing the data throughput of TCP/IP protocol based data transmissions by selectively implementing in hardware certain portions of the TCP/IP protocol set (such as a majority of actually called and executed routines), and implementing in software routines the exceptions and remaining portions.
  • ATM cell-based transmission technology incurs a cost because of segmentation and reassembly of large data payload messages into much smaller cells. Devices which attempt to minimize this cost perform this function at the signaling rate. However, this function is specific to cell-based technologies, and is not particularly useful for technologies such as Ethernet and HiPPI.
  • the payload size of such technologies' packets does not require an adaptation layer below that of the network or IP (Internet Protocol) layer. In order to process TCP/IP protocols, traditional store and forward methods must be used.
  • Protocol engines have also been used to optimize traditional methods of protocol handling to reduce certain steps. These include hardware checksum units, hardware buffer management, and RISC processing to improve protocol handling rate. However, this approach still does not scale with signaling rate.
  • the present invention provides a solution to the above-mentioned protocol processing problems using a cross disciplinary combination of hardware elements, techniques and results based, inter alia, on network traffic analysis, high speed programmable logic array technology, and integration with low level operating system software design.
  • the invention solves a problem that has been long unsolved of how to process TCP/IP data packets at a speed equal to that made possible by the latest generation physical layer hardware transmission components.
  • As microprocessors increase in speed, the same technology advances also increase the speed at which data can be transmitted over networks. If this protocol handling must be done in software, then there are fundamental issues in logic and software design that will always make the ability of a processor to process the packets slower than the physical ability of the network to transmit packets. This speed differential can penalize maximum possible network performance by a factor of almost one hundred at present.
  • the main insights that enable the invention to provide a practical and implementable solution to the above-mentioned protocol processing problems are the recognition that the transmission patterns of the vast majority of packets over current TCP/IP mediated networks are predictable and involve only a very small subset of the entire TCP/IP protocol set. It is possible through logic design to implement this small set of actually used protocols in hardware, such as programmable logic gate arrays, to allow processing of TCP/IP data packets at speeds equal to that of the ability of the fastest physical network layer. The rare packets that cannot be handled in this manner can be defaulted to conventional software processing. An operating system also can be low-level interfaced to this processing system through appropriate memory management in such a way that the packet's data coming off the network data transmission medium can be processed and put into application memory at the speed equivalent to a single gate-mediated operation.
  • the invention allows practical processing of TCP/IP data packets in gate array hardware at a data throughput equal to that of the physical transmission media. It accomplishes this task by recognizing that TCP/IP packets on current networks fall into predictable transmission patterns that actually utilize only a small fraction of the entire protocol for the vast majority of transmissions. By implementing this small subset in gate array hardware and defaulting the exceptions into software, a very large increase in TCP/IP packet throughput can be obtained.
  • TCP/IP transmissions handled by the invention can be made faster than that possible with the best current software implementations and multiprocessor TCP/IP processing engines.
  • Using mask programmable logic affords approaches which are both faster and less expensive to construct than the current RISC CPU assisted TCP/IP processing boards. The invention is intrinsically scalable upwards in speed, with little or no redesign needed as advances in IC processing technology make the network physical layers faster.
  • a form of software embedded in hardware which can be physically implemented at any point where TCP/IP packet processing is used such as in network interface cards, and within microprocessor CPUs, affording significant potential technological and economic benefits.
  • a difference between the invention and prior approaches is that the invention constructs a path into memory for a specific class of packets that exists for the likely time interval when such a packet will be present.
  • the path into and out of memory is handled entirely in the hardware of the invention with only random logic up to where it interacts with the application, and is triggered entirely by the arrival of the packet itself.
  • In this hardware, all details are present for handling the packet payload state to where it will be delivered.
  • With accelerators on both ends of a network transfer, no software overhead need be present for bulk data transfer in burst mode. This differs markedly from prior software and hardware approaches, which employed minimized protocol implementations, buffer management, or spreading of the protocol implementation across a specially designed network fabric.
  • the invention implements continuous flow (streamed) information delivery via a standard protocol such as TCP/IP by means of a pattern match via associative memory. It has several benefits in processing standard protocols, as opposed to non-standard protocols. These include absolute minimum latency between application and network medium (fiber), absolute maximum bandwidth between communicating network applications, a low complexity design for the network protocol processing mechanism, and a protocol rate that scales linearly with network signaling rate.
  • FIG. 1 is a diagram of a network accelerator in accordance with the invention.
  • FIGS. 2 a and 2 b are diagrammatic views which contrast, respectively, a traditional link data stream store and forward approach with a network accelerated continuous flow link data stream approach in accordance with the invention;
  • FIG. 3 is a more detailed diagram of the network accelerator of FIG. 1;
  • FIG. 4 is a diagram of a control unit of the accelerator of FIG. 3;
  • FIG. 5 is a diagram of a transmit engine of the network accelerator;
  • FIG. 6 is a diagram of a receive engine of the invention.
  • FIGS. 7 and 8 are diagrams contrasting a traditional memory process with that of the invention.
  • FIG. 9 is a diagram of the term addressable memory of the invention.
  • the invention is particularly adaptable to TCP/IP protocols and will be described in that context. It will be appreciated, however, that the invention has greater utility and is applicable to other streamed protocols.
  • Computer networks use network software protocols to communicate information reliably between computers over multiple successive physical signaling mediums. These protocols are implemented in software on computer processors. While hardware signaling rates have steadily increased, software protocol processing has not kept pace. With the advent of gigabit networking technology, costly processors must be dedicated to providing at most 40-50 percent of the theoretical bandwidth of the network, while software implementations used with earlier signaling technologies were capable of 60-80 percent of theoretical bandwidth. Clearly bandwidth demands will continue to increase, and since the disparity between software protocol processing and signaling rates will also increase, this defines a “bottleneck” in the effectiveness of networking technology.
  • the invention affords a minimum time mechanism for handling TCP “burst” transfers.
  • a burst transfer is a series of bulk data transfers with no options between nodes (usually it is unidirectional). It consists of the sending node passing data payload packets with successive sequence numbers, and the receiver committing them to memory and sending acknowledgments back to the sender to trigger more data to be sent.
  • the invention efficiently handles the burst so as to minimize latency.
  • the software triggers the burst mode by a traditional send operation with a full-payload, and the receiver/transmitter fall into an asynchronous feedback loop of “send-next”/“acknowledgment” packets that continues until either the burst completes or an error occurs.
  • the invention provides a mechanism to process the costly portions of standard protocols in hardware entirely, and to do so at the same clock rate of the signaling. In this way, as the signaling rate rises, so does the protocol processing rate increase in lockstep. This approach is based upon several observations, including traffic pattern analysis of packets, experience with software protocol implementations, and experience with other nonstandard hardware-implemented protocols.
  • the network accelerator of the invention uses a deterministic state machine to implement the transport protocol bulk receive and transmit functions, leaving to the software all other features of the protocol (including error recovery).
  • FIG. 1 is a functional diagram of a network protocol accelerator 10 in accordance with the invention.
  • the network accelerator can provide a 100BASE-TX full-duplex interface to an Ethernet network, with media access controller (MAC) functions, IP (Internet Protocol) processing and decoding, and TCP (Transmission Control Protocol) processing. It may be a PCI interface board, designed to be used in an NT workstation, for example, utilizing a standard PCI bus slot.
  • the accelerator will preferably have a physical link layer processor, an IP processor, a TCP processor, segment buffer memory, multiple FPGAs for logic, and a PCI interface to the host system.
  • the network accelerator includes a network interface 12 that includes a physical (PHY) media framing unit which obtains physical signals from the physical media, decodes the signals and the link layer framing as a byte stream, and supplies the stream to an accelerator engine 14 . Simultaneously, a copy of the signals may be recorded in a receive/transmit (Rx/Tx) FIFO bypass unit 16 .
  • a system bus interface unit 18 may signal a system interface unit 20 to handle the packet which is stored in the Rx FIFO bypass portion of 16 .
  • the system interface can send packets via the bus interface to the Rx/Tx FIFO bypass portion of unit 16 , which hands it on to the physical media framing unit 12 , effectively bypassing the accelerator engine for non-TCP data transfers.
  • the accelerator engine 14 is connected to a variable content addressable memory 22 , and consults the memory as octets of a packet are received to find a match with predetermined patterns. When a match is found, a state machine associated with the pattern is loaded from the content addressable memory into the accelerator engine to operate on the packet. Operations may include packet table delivery from the physical media framing unit to a dual port application transfer buffer 24 ; sending a packet with payload from the application transfer buffer to the physical media framing unit; and sending a packet acknowledgment to the physical media framing unit.
  • the accelerator engine 14 may signal its status to the system interface unit 20 via the bus interface 18 indicating the event.
  • If the accelerator engine fails to recognize a packet or encounters an error, such as a CRC (cyclic redundancy check) or checksum error, it suspends operation on that segment, causing all packet traffic to be handled via the Rx/Tx FIFO bypass buffer until it is re-enabled by the system interface unit 20 via the bus interface 18 to return to normal operation.
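The suspend-and-bypass behaviour can be illustrated with a small software model. This is only a sketch under stated assumptions: the class and method names (`AcceleratorEngine`, `handle`, `reenable`) and the dictionary packet representation are hypothetical and do not come from the patent, which describes hardware.

```python
from collections import deque


class AcceleratorEngine:
    """Toy model: route packets to the fast path or the Rx bypass FIFO."""

    def __init__(self):
        self.enabled = True        # cleared on an unrecognized packet or an error
        self.rx_bypass = deque()   # stands in for the Rx FIFO bypass memory
        self.fast_path = []        # packets handled by the accelerator engine

    def handle(self, packet):
        """Return 'fast' or 'bypass' for one received packet (a dict, by assumption)."""
        ok = packet.get("recognized", False) and packet.get("checksum_ok", False)
        if self.enabled and ok:
            self.fast_path.append(packet)
            return "fast"
        # An unrecognized packet or an error suspends the engine; while it is
        # suspended, all traffic is handled through the bypass FIFO.
        self.enabled = False
        self.rx_bypass.append(packet)
        return "bypass"

    def reenable(self):
        """Models the system interface unit restoring normal operation."""
        self.enabled = True
```

In this model a single bad packet diverts every later packet to the bypass queue until `reenable()` is called, which mirrors the role the text gives to the system interface unit.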
  • CRC cyclic redundancy check
  • the network accelerator of the invention is implemented as a set of programmable logic chips intermediate between the network physical layer interface chip set and an interface chip set for the PCI or other bus.
  • programmable logic chips may be SRAM field programmable gate arrays (FPGAs) which can be reprogrammed via the network to allow the hardware protocols to be modified after installation to correct errors, to optimize performance as the network changes, or to implement changes to existing protocol sets.
  • FPGAs field programmable gate arrays
  • Low cost implementations can also use mask programmable chip sets.
  • the speed advantages of mask programmable ASICs may make their use preferable in high speed point to point data transfer applications where the end nodes and routers are well defined by the user.
  • the network accelerator also may be implemented within the silicon of the main microprocessor or within on-board multi-chip-modules analogous to the way MMX incorporates digital signal processor functionality, or by which AGP provides on-chip integration of graphics accelerator functions. This can bring the data directly into the microprocessor, bypassing the external data bus interface which otherwise limits performance.
  • the high speed data transmission capability of the invention is advantageous in providing for direct data storage, display, or data processing device interconnect both between and within individual computers.
  • FIGS. 2 a and 2 b are diagrammatic views which contrast the significant improvement in latency between a network accelerated continuous flow link data stream approach of the invention (FIG. 2 b ) and a traditional link data stream store and forward approach (FIG. 2 a ).
  • FIG. 2 a illustrates how a traditional protocol stack accumulates data in a store and forward buffer 30 , and then performs the necessary protocol processing operations. As indicated in the figure, data packets from an Ethernet are delivered to a link data delivery unit which may perform error checking prior to storing in buffer 30 . The time required for this operation is of the order of tens of microseconds. Subsequently, the various segments of the data packet are processed in a protocol processor 32 .
  • This processing is sequential, and results may also be placed in the store and forward buffer as application payload data is delivered to the protocol processor. This typically may take hundreds of microseconds, even with very high speed devices performing the operations.
  • the critical weakness is that data must be in some kind of buffer before it can be processed, and processing must be completed before the data can be forwarded.
  • the network accelerator of the invention uses the protocol's data stream itself as a way of instructing a uniquely constructed data flow processing machine 34 that is clocked by the protocol data and which performs processing operations as the information appears.
  • processing occurs in a series of parallel functional units 35 - 38 having a pipelined architecture so that packets are processed in real time with the processed data flowing between the network's wire link and the processing application's data origin.
  • a protocol's packet would appear as a single, fat instruction that would run on a data flow processor in lock step with the link's data rate. This allows complete processing in times of the order of tens of microseconds in contrast to the traditional store and forward approach illustrated in FIG. 2 a. The manner in which this is accomplished will be described in more detail below.
  • FIG. 3 is a functional diagram which illustrates in more detail a preferred embodiment of the network accelerator of FIG. 1.
  • the physical media framing network interface 12 may comprise a physical device interface (PHY) 40 connected to a CRC/MAC unit 42 .
  • This physical device interface and CRC/MAC unit provide physical and link layer access, respectively, to an Ethernet network.
  • the CRC/MAC unit provides parallel-to-serial conversion, CRC (cyclic redundancy check) generation and checking, MAC address recognition, FIFO buffering, and interface to the remainder of the network accelerator, which includes TCP/IP processors and the dual port application transfer memory 24 , which preferably comprises a dual port/double banked RAM.
  • outgoing Ethernet packets will be read from the buffer memory 24 and transferred to an internal FIFO in preparation for transmission to the network.
  • the CRC will be calculated on the fly and appended to the end of the Ethernet packet.
  • An incoming Ethernet packet is stored in the incoming FIFO while the destination address is checked against the MAC address register. If the MAC address is correct, the Ethernet packet is sent to an Rx engine. The Ethernet packet is also run through the CRC checker, simultaneously. Once the Ethernet packet is completely received and the CRC is good, the CRC good signal will be asserted.
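The receive-side checks and the transmit-side CRC append can be sketched in software as follows. The function names are hypothetical, and the sketch runs the two checks one after the other, whereas the text describes them happening simultaneously in hardware. It relies on the fact that the Ethernet frame check sequence is a CRC-32 with the same polynomial as `zlib.crc32`, sent least-significant byte first.

```python
import zlib


def check_frame(frame: bytes, my_mac: bytes):
    """Return (mac_match, crc_good) for a frame that ends with a 4-byte FCS."""
    body, fcs = frame[:-4], frame[-4:]
    # The destination MAC address is the first six octets of the frame.
    mac_match = body[:6] == my_mac
    # Compare the computed CRC-32 with the received FCS (little-endian on the wire).
    crc_good = zlib.crc32(body) == int.from_bytes(fcs, "little")
    return mac_match, crc_good


def append_fcs(body: bytes) -> bytes:
    """Transmit side: append the CRC to the end of the Ethernet packet."""
    return body + zlib.crc32(body).to_bytes(4, "little")
```

A frame would be passed to the Rx engine only when the first flag is true, and the CRC good signal corresponds to the second flag.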
  • the accelerator engine 14 includes a control unit 44 , a Tx engine 46 , and a Rx engine 48 .
  • a first prototype memory 50 connected to the Tx engine 46
  • a second prototype memory 52 connected to the Rx engine 48
  • variable content addressable memory (VxCAM) 22 includes a content addressable memory (CAM) 54 which is also connected to the Rx engine 48 .
  • the variable content addressable memory matches a variety of packet formats and is used to quickly determine to which session an incoming IP datagram belongs.
  • the variable content addressable memory 22 also includes an ADE memory 56 which is connected to control unit 44 .
  • the Rx/Tx FIFO bypass memory 16 may be implemented as a Tx bypass memory 60 and a Rx bypass memory 62 .
  • bypass memory 62 may be connected to the Rx engine 48 and to a bus 64 connecting the control unit 44 and bus interface unit 18 .
  • the Tx bypass memory 60 may be similarly connected to bus 64 and to the CRC/MAC unit 42 .
  • the network accelerator handles the various layers of an Ethernet packet as it is sent or received from a network.
  • the IP address, IP checksums, ID field, flags, IP datagram length, etc. are either pre-calculated and sent to the network via the Tx engine 46 , or used to verify the destination of an incoming Ethernet packet via the Rx engine 48 .
  • TCP ports, TCP checksums, sequence numbers, ACK number, flags, window size, urgent pointer, options, etc. are either pre-calculated and sent to the network via the Tx engine, or used to verify the destination of an incoming IP datagram via the Rx engine.
  • the Tx engine 46 obtains the TCP payload directly from the memory 24 .
  • the Rx engine 48 delivers the TCP payload directly to the memory 24 .
  • the memory 24 will contain the host system view of network memory, and a shadowed copy for the network accelerator to use for TCP segment transmission and reception.
  • the host system software driver will swap application memory (system RAM) for memory 24 . This will allow the host system direct access to the network data stored in the dual-port/double banked memory, effectively replacing the role of host system RAM.
  • the system interface controls the relationship between the system and the network accelerator. It contains configuration and status registers, and allows the host system to access the network accelerator.
  • Data for the packets are buffered for transfer using the memory 24 .
  • This memory maintains an up-to-date copy of the network data for the host system/application, and a local copy of the network data for the network accelerator. This allows the application/host system to access memory as it would system RAM before, during and after a TCP segment is sent to the network by the network accelerator. Also, the memory allows access to a stable copy of the network data for transmission or reception to/from the network.
  • the network acceleration control unit maintains the proper relationship between the memory banks, with the banks synchronized in the case of Idle state (the network accelerator is neither transmitting nor receiving TCP segments), or logically separated during network accelerator TCP segment transmission or reception. The double banked nature of the memory allows a “zero-copy” or “zero-latency” method of network data delivery to the network accelerator.
  • a status memory is used to maintain the relationship between the memory bank and the host system memory bank. This status memory works as a table indicating which bank of memory has the most current byte of network data for each address in the memory.
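One way to picture the status table is a per-address flag that names the bank holding the newest byte. The model below is a sketch under assumptions: the class name, the method names, and the choice of "application" and "shadow" as the two bank labels are illustrative, and the real design is dual-port hardware rather than Python lists.

```python
class DoubleBankedMemory:
    """Toy model of two memory banks plus a per-address status table."""

    APP, SHADOW = 0, 1

    def __init__(self, size: int):
        self.banks = [bytearray(size), bytearray(size)]
        # status[addr] names the bank holding the most current byte at addr.
        self.status = [self.APP] * size

    def host_write(self, addr: int, value: int):
        """Host system write lands in the application bank."""
        self.banks[self.APP][addr] = value
        self.status[addr] = self.APP

    def accel_write(self, addr: int, value: int):
        """Network accelerator write lands in the shadow bank."""
        self.banks[self.SHADOW][addr] = value
        self.status[addr] = self.SHADOW

    def read_current(self, addr: int) -> int:
        """Read the most current byte, whichever bank holds it."""
        return self.banks[self.status[addr]][addr]

    def synchronize(self):
        """Idle state: copy the most current byte at each address into both banks."""
        for addr, bank in enumerate(self.status):
            value = self.banks[bank][addr]
            self.banks[self.APP][addr] = value
            self.banks[self.SHADOW][addr] = value
```

Because the table is consulted per byte, neither side has to copy a whole buffer before the other can use it, which is the property the text calls "zero-copy".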
  • the content addressable memory (CAM) 54 is used to quickly determine to which session an incoming IP datagram belongs. It cooperates with ADE memory 56 , prototype memories 50 , 52 , and is part of the variable content addressable memory (VxCAM) 22 .
  • In ADE memory 56 there will be one or more address descriptor entries (ADEs) which describe the segment details such as memory base address, TCP payload length, TCP payload checksum, next TCP sequence number and the next TCP segment's ADE. This information is used by the Tx engine when the segment is constructed, prior to transmission. The Rx engine uses the ADE fields to determine the sequence numbers, payload destination, and out-of-order segments.
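The ADE fields listed above suggest a linked chain of per-segment records. The sketch below is an assumption-laden illustration: the field names, the use of list indices as links, and the `find_ade` helper are hypothetical, and 32-bit sequence number wrap-around is ignored for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ADE:
    base_address: int        # where the payload lives in the transfer memory
    payload_length: int      # TCP payload length in bytes
    payload_checksum: int    # checksum over that payload
    next_sequence: int       # TCP sequence number expected after this segment
    next_ade: Optional[int]  # index of the next segment's ADE, or None at the end


def find_ade(ades, first, base_sequence, seq):
    """Walk the chain and return the index of the ADE whose segment starts at seq.

    Returns None when seq does not start any segment in the chain.
    """
    index, start = first, base_sequence
    while index is not None:
        ade = ades[index]
        if start == seq:
            return index
        start = ade.next_sequence
        index = ade.next_ade
    return None
```

With such a chain, a receiver could place a segment that arrives out of order at the base address of its own ADE instead of waiting for the earlier segments.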
  • ADEs address descriptor entries
  • session prototype description entries describe the session fields that do not change, as well as the initial values for the session, such as IP address, TCP ports, protocol fields, base sequence number, first ADE, etc.
  • the Tx engine uses this information to generate the static fields within a session for an outgoing TCP segment.
  • the Rx engine uses this information to determine what TCP session an incoming TCP segment is destined for, and to verify the validity of specific fields in the TCP/IP header.
  • the content addressable memory 22 stores the address of the potential TCP session prototype entry that describes the session to which an incoming segment belongs. Certain fields in the TCP/IP header are hashed to obtain a value which is used as an address to “look up” which prototype describes this segment. The memory stores at the “hashed” address another address which points to the prototype data in the prototype memory. If the memory returns a value of zero, the incoming TCP segment does not belong to any accelerated sessions and is routed to the bypass FIFO. In this manner, a “one shot” lookup of the TCP session prototype can be done, rather than searching potentially thousands of TCP session prototypes.
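The "one shot" lookup can be sketched as a table indexed by a hash of header fields, where each slot holds the address of a prototype and zero means no accelerated session. Everything specific here is an assumption for illustration: the table size, the choice of CRC-32 as the hash, the four header fields used as the key, and the class and function names.

```python
import zlib

TABLE_SIZE = 1024  # assumed number of lookup slots


def slot_for(src_ip: bytes, dst_ip: bytes, src_port: int, dst_port: int) -> int:
    """Hash selected TCP/IP header fields into a lookup-table address."""
    key = src_ip + dst_ip + src_port.to_bytes(2, "big") + dst_port.to_bytes(2, "big")
    return zlib.crc32(key) % TABLE_SIZE


class SessionLookup:
    def __init__(self):
        self.table = [0] * TABLE_SIZE  # 0 means "no accelerated session"
        self.prototypes = [None]       # index 0 is reserved so 0 can mean a miss

    def add_session(self, fields, prototype) -> int:
        """Store a prototype and record its address at the hashed slot."""
        self.prototypes.append(prototype)
        address = len(self.prototypes) - 1
        self.table[slot_for(*fields)] = address
        return address

    def lookup(self, fields):
        """One-shot lookup: return the candidate prototype, or None to bypass."""
        address = self.table[slot_for(*fields)]
        return self.prototypes[address] if address else None
```

The result is only a candidate, since two sessions can hash to the same slot in this sketch; that fits the text's description of a "potential" prototype whose fields the Rx engine then verifies against the incoming header.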
  • FIG. 4 illustrates the network accelerator control unit 44 in more detail.
  • the control unit provides the overall state machines and control registers which control the network accelerator.
  • Logic for controlling the dual port application transfer memory 24 and the Tx and Rx session state machines (to be described) for the Tx engine and the Rx engine, respectively, may be contained in a dual port memory controller 61 .
  • Logic for generating a checksum may be contained in a checksum unit 62 which interfaces with ADE memory 56 via an address bus 57 and a data bus 58 . After initialization of a current checksum, the ADE memory 56 may be created and used for bounds checking on the host address to obtain the checksum for the desired payload of a TCP segment.
  • This checksum may be loaded into the checksum unit 62 .
  • the current value may be stored in memory 24 , and a new calculated value may be added to the checksum.
  • the checksum may then be saved either through a write back to the ADE memory 56 and the dual port application transfer memory 24 , or if multiple locations require modifications, by iteration.
  • the calculated checksum is then ready for use as the data checksum of the TCP segment.
  • a first FIFO buffer 70 may interface the dual port application transfer memory 24 to Tx data from the Tx engine state machine, and a second FIFO buffer 72 may interface memory 24 to Rx data from the Rx engine state machine.
  • Logic for controlling the FIFOs may be contained in the FIFO buffers themselves and used to minimize bus arbitration read/write by an arbitration unit 74 .
  • logic for controlling the Rx engine and the Tx engine, as well as access to their status and control registers may be contained in a registers and configuration unit 76 which is interfaced to memory 24 and memory controller 61 by an application data bus 77 and an application address bus 78 .
  • Arbitration unit 74 also may include logic to control memory access arbitration between the host system and the network accelerator.
  • the network accelerator control unit also maintains global control of the state machines for each session.
  • Tx idle is the state prior to sending a Tx buffer to the network. This is the default state and is set up by the software driver. The software driver will also generate the necessary values for the VxCAM for a given buffer space. The host system fills the memory until the session is ready to be transmitted. At this point, the state machine transitions to the Tx pending state.
  • the dual port memory controller 61 maintains two copies of the data: one for the host system, and one for the Tx engine. Proper data relationship between the host system memory and the network accelerator memory must be maintained to prevent old data from overwriting new host system data, and new host system data from overwriting the data in use by the Tx engine.
  • In the Tx complete state, if the transmission fails, the state machine goes to the Tx re-transmit state. If the Tx was a success, the network accelerator will set a success bit and go to the Tx idle state. The network accelerator is now waiting to send out the next segment. In either case, the network accelerator control unit must continue to maintain the proper relationship between the two copies of data.
  • the network accelerator may either attempt to re-transmit the segment, or move to the next session queued for transmission and attempt this segment later.
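  • The Tx session states described above can be sketched as a small transition table. This is only an illustrative model: the event names, the pending-to-complete transition, and the two exits from the re-transmit state ("retry" and "defer") are assumptions introduced here and are not taken from the specification.

```python
from enum import Enum, auto


class TxState(Enum):
    IDLE = auto()
    PENDING = auto()
    COMPLETE = auto()
    RETRANSMIT = auto()


# Hypothetical event names; only the idle->pending, complete->idle and
# complete->retransmit transitions are stated in the text.
TX_TRANSITIONS = {
    (TxState.IDLE, "session_ready"): TxState.PENDING,
    (TxState.PENDING, "segment_sent"): TxState.COMPLETE,   # assumed
    (TxState.COMPLETE, "success"): TxState.IDLE,
    (TxState.COMPLETE, "failure"): TxState.RETRANSMIT,
    (TxState.RETRANSMIT, "retry"): TxState.PENDING,        # assumed
    (TxState.RETRANSMIT, "defer"): TxState.IDLE,           # assumed
}


def next_tx_state(state, event):
    """Return the next Tx state; unknown (state, event) pairs leave the state unchanged."""
    return TX_TRANSITIONS.get((state, event), state)
```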
  • the Rx idle state is the initial state. In this state, the two copies of network data are reconciled. Depending on the outcome of the previous received segments, the host system reads data from either the shadow bank of dual port application transfer memory or the application bank of same memory. If a packet was successfully received, the net payload data stored in the shadow bank of the dual port application transfer memory must be presented to the host system. This is performed on a byte by byte level.
  • In the Rx pending state, the Rx engine is receiving one or more segments in the current session. Receive data is placed in the proper bank of the memory by the network accelerator control unit.
  • In the Rx complete state, there may be two different scenarios: Rx success or Rx time-out. In the case of success, the success bit for the ADE is set, then the Rx idle state is entered. In the case of failure, the state machine goes to idle and no changes occur to the memory.
  • Checksums for the payload of the TCP packet are calculated by the checksum logic 62 as follows. Upon initial setup of the session, the section of memory used by the session is cleared to all zeros. This allows the initial checksum to be initialized to zero for each segment. ADEs are setup for each segment within the session; ADEs contain the starting address, ending address, and checksum for each segment of the session. There may be one or more segments in any session.
  • During host system writes, the host system presents an address to be accessed. Bounds checking is performed on this address to determine which ADE contains the checksum for this address. The checksum is loaded into the checksum logic and the current (old) value in the memory is subtracted from the checksum. Next, the new data value is added to the checksum.
  • the Tx engine uses the checksum stored in the ADE as the checksum for the data portion of the TCP segment.
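  • The subtract-old, add-new update described above may be illustrated as follows. The sketch assumes the checksum is the standard 16-bit ones' complement Internet sum over 16-bit words; the specification does not state the word width, and the function names are hypothetical.

```python
def ones_add(a, b):
    """Add two 16-bit values with end-around carry (ones' complement addition)."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)


def ones_sum(words):
    """Ones' complement sum of a sequence of 16-bit words, computed from scratch."""
    total = 0
    for w in words:
        total = ones_add(total, w)
    return total


def update_checksum(checksum, old_word, new_word):
    """Incrementally update a running sum when old_word is overwritten by new_word."""
    # Subtracting in ones' complement is adding the bitwise complement.
    checksum = ones_add(checksum, (~old_word) & 0xFFFF)
    return ones_add(checksum, new_word)
```

  • Because the memory region starts cleared to zero and the running sum starts at zero, each host write needs only the old and new word, never a pass over the whole segment. Note that ones' complement arithmetic has two representations of zero (0x0000 and 0xFFFF), so comparisons against a recomputed sum should treat them as equal.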
  • the Rx engine and the Tx engine use FIFOs 70 , 72 for interfacing the engines to the dual port application transfer memory 24 .
  • the FIFOs minimize the bus arbitration necessary to read and write data into the dual port application transfer memory 24 from the engines.
  • the control of the FIFOs involves filling and draining the FIFOs in a cycle-steal mode between host system accesses to the memory.
  • the control unit 44 has an address and data bus connection to the Tx and Rx engines 46 , 48 . This bus allows the control unit to set and read configuration and status registers within the two engines.
  • the control unit controls access to the Tx and Rx engines and all memory 24 , 50 , 52 , 56 , and CAM memory 54 through arbitration, using arbitration unit 74 . Host system accesses and network accelerator accesses compete for access through the control unit. Any known arbitration method may be used to control these accesses.
  • Tx engine 46 may be controlled by a state machine 100 , which may be based on a send counter (not shown). This counter is started at initial transmission time, and generates signals which are used to control all the events in the send process.
  • a multiplexer 102 combines Tx data and the outputs of several registers, and provides these to an output register 104 .
  • the registers muxed to the output register 104 may be Tx prototype register 106 , a Tx application data output register 108 , the outputs of checksum registers 110 and 112 , an ACK register, and all the individually calculated fields in overlay registers 114 and 116 .
  • the send counter and the Tx engine control state machine 100 govern the timeslots for outputting the various fields to the output register.
  • the state machine determines the proper time to calculate the various IP and TCP fields and when to send next segment.
  • Sliding window calculation logic provides the information via register 120 to the Tx engine state machine for next segment transmission.
  • the Tx engine is responsible for sending Ethernet packets containing IP datagrams of TCP segments to the network.
  • There are two primary types of TCP segments. These are user data (ADE) segments, and automatically generated acknowledgment segments for received data.
  • the network accelerator creates packets from scratch, generates the Ethernet header, the IP header, the TCP header, and the TCP data payload.
  • the Tx engine state machine 100 which may be contained in the dual port memory controller 61 , asserts the Tx pending state through the Tx engine control state machine 100 , making available data contained in the Tx FIFO 70 (FIG. 4).
  • the Tx engine loads a prototype register 106 with static portions of the TCP/IP headers from the proto memory 50 of the Tx engine (FIG. 3).
  • the logic for the calculation of the dynamic portion of TCP header is contained in a TCP header Tx overlay register 116 .
  • the logic for the final checksum calculation for the dynamic portion of the IP header may be contained in an IP header checksum register 10 of FIG.
  • TCP segment post-checksum register 112 The logic which provides sequential accesses to the register contents to the output register 104 may be contained in a transfer register 105 .
  • the logic for calculating sequence numbers from the TCP header Tx overlay register by adding the length of the packet data contents to a current sequence number may be contained in an arithmetic logic unit (ALU) 118 .
  • the logic to obtain values of a prior received datagram's sequence number and length to generate an acknowledgment number using the ALU may be obtained from the Rx engine TCP header Rx register 103 .
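  • The two ALU calculations described above amount to 32-bit modular additions, since TCP sequence and acknowledgment numbers are 32-bit values that wrap around. A minimal sketch, with hypothetical function names and ignoring the sequence space consumed by SYN and FIN flags:

```python
SEQ_MOD = 1 << 32  # TCP sequence numbers are 32 bits wide and wrap around


def next_sequence_number(current_seq, payload_len):
    """Sequence number of the next segment: current number plus payload length."""
    return (current_seq + payload_len) % SEQ_MOD


def acknowledgment_number(received_seq, received_len):
    """Acknowledgment for a received segment: its sequence number plus its length."""
    return (received_seq + received_len) % SEQ_MOD
```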
  • the final results may be output via the output register 104 to the CRC/MAC unit 42 of FIG. 3.
  • the logic for determining whether sending of a datagram was successful and acknowledged may be contained in the engine control state machine 100 .
  • the logic for determining if one can send additional datagrams is determined by the engine control state machine 100 and Tx window register 120 .
  • the data used to generate the dynamic calculated portion of the TCP/IP headers reside in the ADE memory, the proto memory, and the memory 24 , and the data to generate the static precalculated portion of the TCP/IP headers resides in the Tx engine proto memory 50 .
  • the data used to generate the TCP/IP payload resides in the dual port application memory 24 .
  • the base address of the segment prototype is loaded into the Tx engine proto memory address register, and the base ADE address for the segment is loaded into the Tx engine ADE memory address register.
  • the Tx engine reads the ADE and prototype data out of the ADE memory and the Tx engine proto memory, respectively, then calculates the various fields and inserts the fields into the outgoing network stream. Certain fields of the stream, such as sequence numbers, ACK numbers, ID fields, etc., may be calculated as the stream progresses. Once the headers have been calculated, the TCP payload is output from the dual port application transfer memory.
  • the CRC/MAC unit 42 appends a CRC 32 value to the Ethernet packet, and completes delivery of the packet to the PHY device 40 .
  • the network accelerator generates a complete Ethernet packet comprising an IP datagram containing a TCP segment.
  • the Rx engine 48 is controlled by an Rx engine control state machine 140 , and receives Ethernet packets from the network interface comprising the PHY device 40 and the CRC/MAC unit 42 via the input register 142 .
  • Upon receipt, the state machine sequences data to other elements of the Rx engine.
  • the receive packet is sent to the Rx bypass memory 62 , which serves as a buffer and is used for any packet that is not a bulk data transfer TCP segment.
  • the Rx engine processes the IP and TCP headers and determines the type of TCP segment.
  • the Vx CAM memory 22 is used by the Rx engine to determine to which session an incoming IP datagram belongs.
  • the Rx engine under the control of state machine 140 , compares a number of fields of the IP header with expected values stored in a plurality of registers. Certain fields in the TCP/IP header are static and can be compared against static values. Other fields are variable, and define, for example, the length, or checksum or other session-related details. The variable fields are compared against values stored in registers and pre-determined values stored in the ADE memory.
  • Upon receiving an incoming packet, the header is decoded by a decoder 144 to determine the location of the source and destination addresses and ports contained in the TCP/IP header.
  • the logic for locating the associated prototype packet header and address descripted entry is contained in a Vx CAM proto-ADE locator 146 .
  • the Rx engine block address decoded entry may be held in an ADE register 156 .
  • the Rx engine block prototype entry may be held in proto memory 52 , which loads the entry into a prototype register 148 .
  • a TCP/IP header matcher 150 contains logic for comparing session fields of the packet obtained from the prototype register with variable fields held in a TCP header Rx register 152 and an IP header Rx register 154 .
  • Logic for validating the checksum for the IP portion of the TCP/IP header matcher 150 may be contained in an IP header checksum unit 162 , and the logic which validates the checksum for the TCP segment portion of the TCP/IP header matcher and the data stream from the input register 142 may be contained in a TCP segment header checksum unit 160 .
  • Data from valid packets may be passed to the receive data FIFO 72 (FIG. 4).
  • Logic for updating TCP header Rx register 152 for transmitted data acknowledgments or buffer window size adjustment may be contained in the arithmetic logic unit (ALU) 164 .
  • the Rx engine control state machine 140 reduces the Rx window register 170 as data is received, and increases it as buffer space becomes available in the dual port application transfer memory 24 (FIG. 3) by the application. Under the control of the state machine, the contents of the Rx window register 170 may also be passed to the ALU 164 to synthesize a window update, which may be passed to the Tx engine via the Rx engine transfer unit 172 .
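  • The behavior of the Rx window register may be modeled as a counter that shrinks as data arrives and grows as the application frees buffer space. The class and method names below are hypothetical, and the clamp to the buffer capacity is an assumption of this sketch.

```python
class RxWindow:
    """Toy model of a receive window tracked against a fixed buffer capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.window = capacity  # bytes the sender may still transmit

    def on_data_received(self, nbytes):
        """Reduce the window as payload is written into the buffer."""
        if nbytes > self.window:
            raise ValueError("segment exceeds advertised window")
        self.window -= nbytes

    def on_buffer_freed(self, nbytes):
        """Grow the window as the application drains the buffer.

        Returns the value that a window update would advertise.
        """
        self.window = min(self.capacity, self.window + nbytes)
        return self.window
```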
  • When processing a packet header, if any of the fields of the header do not match expected values, the segment may be routed to the Rx bypass memory 62 , and the Rx engine may go into an idle state.
  • the IP source and destination addresses, plus the TCP source and destination ports, may be hashed together to form a value which is used as an address to look up in the content addressable memory the address for the Rx prototype. If the memory returns a non-zero value, it is used as an address to fetch the Rx prototype. If the value is zero, the packet is routed to the bypass buffer.
  • the value returned by the content addressable memory is used as the base address for the Rx prototype for the segment.
  • the prototype is read and the IP address and the TCP ports are compared against prototype values. If they match, the segment is accepted for further processing, and the ADE base address is read from the prototype memory array.
  • the ADE contains the base sequence number of the memory region. If the sequence number of the segment falls within the range described by the ADE, the segment is accepted and the base TCP payload address is read from the ADE.
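  • The receive lookup path described above (hash, content addressable memory lookup, prototype comparison, ADE sequence check) may be sketched in software as follows. All names and data layouts are hypothetical; Python's built-in hash stands in for the hardware hash, dictionaries stand in for the memories, and routing an out-of-range sequence number to the bypass buffer is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    seq: int
    length: int


@dataclass(frozen=True)
class Prototype:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    ade_base: int  # address of the ADE for this session


@dataclass(frozen=True)
class Ade:
    base_seq: int      # sequence number of the first byte of the region
    length: int        # size of the memory region in bytes (assumed field)
    payload_addr: int  # base address of the payload in application memory


def session_key(src_ip, dst_ip, src_port, dst_port):
    """Stand-in for the hardware hash of addresses and ports."""
    return hash((src_ip, dst_ip, src_port, dst_port)) & 0xFFFF


def classify(seg, cam, prototypes, ades):
    """Return the memory address for the segment payload, or None for bypass."""
    key = session_key(seg.src_ip, seg.dst_ip, seg.src_port, seg.dst_port)
    proto_addr = cam.get(key, 0)
    if proto_addr == 0:          # zero means no session: bypass
        return None
    proto = prototypes[proto_addr]
    if (seg.src_ip, seg.dst_ip, seg.src_port, seg.dst_port) != (
        proto.src_ip, proto.dst_ip, proto.src_port, proto.dst_port
    ):
        return None              # hash collision or mismatch: bypass
    ade = ades[proto.ade_base]
    offset = (seg.seq - ade.base_seq) % (1 << 32)
    if offset + seg.length > ade.length:
        return None              # outside the region (assumed bypass)
    return ade.payload_addr + offset
```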
  • Data from the segment is read into the dual port application transfer memory 24 until the segment is completely received, which can be determined by a length counter. Once a segment is received, a CRC 32 signal may be asserted, indicating that the packet has been verified and notifying the host system of receipt of data.
  • the Rx engine 48 remains in a pending state until a finished bit is received for the segment. At that time, the system is interrupted and the network accelerator control unit goes into the Rx complete state.
  • the network accelerator of the invention affords significant advantages, and may be used in diverse applications. It is also applicable to continuous flow, streamed protocols other than TCP/IP. Some of these applications include high speed links for network backbones, protocol processing for gigabit physical Ethernet layers, data transport between computers within a system, high speed transport for real time high resolution video, increasing the speed of Internet data burst communications, permitting telephony packets to be transmitted over the Internet, and affording enhanced transaction processing and robotics control feedback.
  • the implementation of the network accelerator is not limited to FPGAs. It may be implemented in other forms of hardware, and even integrated with microprocessors.
  • the invention may be installed in various components, such as disk drives, graphics cards, video transmission devices, wireless links, TCP/IP hubs, and the like. The substantial increase in speed and corresponding reduction in latency afforded by the invention is a significant advantage.
  • Internet communications require heavy use of packet traffic that is directed between endpoints by large (32 to 97 bit) identifiers within the packet.
  • packet traffic is directed by partially decoding the full 97 bit identifier.
  • the smallest identifier (32-bit) directs traffic to a computer
  • medium sized identifiers direct traffic to a specific computer's application program instantiation (65-bit)
  • the largest identifier (97-bit) identifies the communication session between two application programs.
  • Level-3 switches use the smallest identifier, but for level-4 switches and processor adapters, medium and large identifiers are required. Hundreds to thousands of these identifiers may be used in a fraction of a second. If more identifiers can be matched more quickly, the same hardware can handle more network bandwidth, improving performance.
  • Content-addressable memory is a hardware concept commonly used in switches to direct packets.
  • the VxCAM, or Virtual context dependant content addressable memory is an improvement of the CAM concept that takes into account characteristics of the usage on the Internet to perform more effectively.
  • the VxCAM outperforms a CAM in a Level-4 device by requiring (a) fewer memory elements to switch the same amount of traffic, (b) less wide data paths, and (c) shorter connection establishment—all the result of the fewer terms to check and setup.
  • This invention reduces the complexity of the process of associating packets with specific information sessions or groups as explained above such that higher-level protocol functions can be performed
  • TCP is the primary protocol used to communicate over the Internet
  • over 95% of all communications is TCP.
  • a single web page on the average has 10 connection sessions that each send on the average 10K bytes of payload for a total of 100-300 packets.
  • the 97 bit endpoint to endpoint Internet identifier can be broken into separate components as follows:
  • two kinds of identifiers are present: IP addresses and ports.
  • Small “term” CAMs of IP addresses and ports match terms regardless of use as source or destination, or of another session. Another benefit is to “compress” an address or port into fewer bits, since the index of each of the term CAMs is smaller than the term width.
  • the combination of source/destination address/port matches is matched against yet another small CAM of sessions to in turn locate the index of the session descriptor. As a result, these three small CAMs reduce a 97 bit by 1024 session CAM of 99,328 bits to a fully allocated VxCAM of 61,440 bits.
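  • The storage figures quoted above can be checked with simple arithmetic. The sketch below only reproduces the two totals given in the text; it does not attempt to break the 61,440-bit VxCAM figure down per CAM, since that breakdown is not stated here.

```python
def cam_bits(width_bits, entries):
    """Storage of a flat CAM: one full-width identifier per entry."""
    return width_bits * entries


SESSIONS = 1024
conventional_bits = cam_bits(97, SESSIONS)  # full 97-bit identifier per session
vxcam_bits = 61_440                         # figure quoted in the text
saving = 1 - vxcam_bits / conventional_bits

print(conventional_bits)                    # 99328
print(vxcam_bits // SESSIONS)               # 60 bits per session on average
print(f"{saving:.1%}")                      # 38.1%
```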
  • CAM content addressable memory
  • Internet communications devices to partially decode large (32 to 97 bit) identifiers to direct the packet to its destination.
  • the CAM required to compare and match patterns increases linearly (see FIG. 7). Where off-chip memory is used to process these patterns, inefficiency results.
  • VxCAM Virtual Extensible Content Addressable Memory
  • the Session Accumulator ( 206 ) is cleared.
  • the destination address is gated onto the IP address bus ( 201 ) to an Address Term CAM ( 203 ) which locates the destination address term. If not found, the packet is signaled as not recognized and the VxCAM ignores all further action. However, if found, the resultant index of the IP address term is passed through the Adder/Mux ( 205 ) to the Session Accumulator register ( 206 ).
  • the source address is gated onto the IP address bus to an Address Term CAM ( 203 ) which locates the source address term.
  • If the source address term is not found, the packet is signaled as not recognized and the VxCAM ignores all further action. If found, the resultant index of the IP address term is accumulated using the Adder/Mux ( 205 ) to the Session Accumulator register ( 206 ).
  • the destination port address is gated onto the TCP/UDP port bus ( 202 ) to an Address Term CAM ( 204 ) which locates the destination port address term. If not found, the packet is signaled as not recognized and the VxCAM ignores all further action. If found, the resultant index of the port destination address term is accumulated using the Adder/Mux ( 205 ) to the Session Accumulator register ( 206 ).
  • the source port address, if present, is gated onto the TCP/UDP port bus ( 202 ) to an Address Term CAM ( 204 ) which locates the source port address term. If found, the resultant index of the port address term is accumulated using the Adder/Mux ( 205 ) to the Session Accumulator register ( 206 ).
  • the contents of the Session Accumulator, consisting of the term index of the IP destination address, the term index of the IP source address, the term index of the TCP/UDP destination port address, and the term index of the TCP/UDP source port address if present, are passed to the Session CAM ( 207 ) which locates the index of the session descriptor.
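  • The lookup sequence above may be modeled in software as two term tables and one session table. This is a sketch under stated assumptions: dictionaries stand in for the CAMs, a tuple of term indices stands in for the Session Accumulator and Adder/Mux, and all class and method names are hypothetical.

```python
class TermCam:
    """Maps each distinct term (address or port) to a small index."""

    def __init__(self):
        self._index = {}

    def add(self, term):
        """Store the term once and return its index."""
        return self._index.setdefault(term, len(self._index))

    def lookup(self, term):
        """Return the term's index, or None if the term is not present."""
        return self._index.get(term)

    def __len__(self):
        return len(self._index)


class VxCam:
    def __init__(self):
        self.addresses = TermCam()  # shared by source and destination addresses
        self.ports = TermCam()      # shared by source and destination ports
        self.sessions = {}          # tuple of term indices -> session descriptor

    def _terms(self, dst_ip, src_ip, dst_port, src_port):
        terms = [(self.addresses, dst_ip), (self.addresses, src_ip),
                 (self.ports, dst_port)]
        if src_port is not None:    # the source port is optional
            terms.append((self.ports, src_port))
        return terms

    def add_session(self, dst_ip, src_ip, dst_port, src_port, descriptor):
        key = tuple(cam.add(t) for cam, t in
                    self._terms(dst_ip, src_ip, dst_port, src_port))
        self.sessions[key] = descriptor

    def lookup(self, dst_ip, src_ip, dst_port, src_port=None):
        accumulator = []            # the accumulator is cleared first
        for cam, term in self._terms(dst_ip, src_ip, dst_port, src_port):
            index = cam.lookup(term)
            if index is None:       # unknown term: packet not recognized
                return None
            accumulator.append(index)
        return self.sessions.get(tuple(accumulator))
```

  • In this model, two sessions to the same server address and port store that address and port only once, which is the redundancy the term CAMs are intended to factor out.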
  • TCP/IP communications involve highly redundant header fields that can be matched more efficiently by factoring out the redundant entries.

Abstract

An improved term addressable memory of an accelerator system and method includes a mechanism for performing a predetermined plurality of pattern matches of packets to classify them for use with stateful protocol processing units that can resolve session data spread across multiple data packets and process them for the ultimate destination. The invention replaces a conventional content addressable memory with a term addressable memory, whereby redundant terms are recorded with a single memory entry. Two classes of terms are used to match packet addresses and application ports, as well as a much smaller session CAM that matches the aggregate match of all terms to a specific session.

Description

    SCOPE OF THE INVENTION
  • In the above-identified application, there is described and claimed a network accelerator and method for TCP/IP that includes programmable logic for performing network protocol processing at network signaling rates. The programmable logic is configured in a parallel pipelined architecture controlled by state machines and implements processing for predictable patterns of the majority of transmissions. In more detail, incoming packets are compared with patterns corresponding to classes of transmissions which are stored in a content addressable memory and are simultaneously stored in a dual port, dual bank application memory. The patterns are used to determine sessions to which an incoming IP datagram belongs, and data packets stored in the application memory are processed by the programmable logic. Processing of packet headers is performed in parallel and during memory transfer without the necessity of conventional store and forward techniques resulting in a substantial reduction in latency. Packets which constitute exceptions or which have checksum or other errors are processed in software. [0001]
  • It has now been discovered that the above-described and claimed accelerator and method is surprisingly improved by using an improved content or term addressable memory called “VxCAM or VIRTUAL EXTENSIBLE CONTENT ADDRESSABLE MEMORY”. In accordance with the invention, VxCAM matches the minimum number of predetermined plurality of patterns, resulting in fewer memory elements so that the invention can be easily implemented on-chip, narrower path widths, and reduced connection establishment overhead. [0002]
  • The present invention relates to Internet communications in general, and to a method and system in particular for substantially increasing the data throughput of TCP/IP protocol based data transmissions by selectively implementing in hardware certain portions of the TCP/IP protocol set (such as a majority of actually called and executed routines), and implementing in software routines the exceptions and remaining portions. [0003]
  • Since the implementation of FDDI fiber network links, the transmission speed of the physical layer to transmit data has exceeded the ability of the end node computers to process the data packets. If the processing of the data packets is done by von Neumann architecture end node computers, capacity is always exceeded since the switching speed of the fastest computer's gates will be approximately equal to that of the physical layer comprising the internal components of Application Specific Integrated Circuit (ASIC) chips. The computer CPU (which must process the data packets with multiple operations and copies to memory) intrinsically requires orders of magnitude more device operations than that of the analog/state machine mediated physical layer of the ASIC chips normalized to a common amount of data. While the problem of scaling current computer networks to gigabit speeds has been recognized, the complexity of the TCP/IP protocols has presented both practical and conceptual barriers to attempts to implement them in any manner other than various forms of software executed processes. However, even the fastest of CPUs for any given technological generation cannot match the physical bandwidth of their internal components. [0004]
  • There have been a number of attempts to accelerate TCP/IP protocol handling, but none has effectively solved the latency problems. One approach to accelerate TCP/IP protocol handling was to process the headers of the protocols independently of the data payload. While the implementation of the protocols themselves was virtually identical to existing methods (TCP/IP software stack), the data was indirectly manipulated by separate buffering to avoid multiple copies of the payload data through the use of hardware buffer management using a multi-port memory. This approach demonstrated that hardware buffer management could improve handling of large payload packets, but it did not reduce packet latency to memory, did not improve the control bandwidth of the protocol or the ability to send small packets efficiently, and did not decouple protocol processing speed from transmission speed. The approach also was not applicable to local clusters, or to small record applications like web-serving or transaction processing. Moreover, the approach did not eliminate the store/forward processing of protocols, but merely attempted to optimize the methods by which the store and forward were mediated. [0005]
  • ATM cell-based transmission technology incurs a cost because of segmentation and reassembly of large data payload messages into much smaller cells. Devices which attempt to minimize this cost perform this function at the signaling rate. However, this function is specific to cell-based technologies, and is not particularly useful for technologies such as Ethernet and HiPPI. The payload size of such technologies' packets does not require an adaptation layer below that of the network or IP (Internet Protocol) layer. In order to process TCP/IP protocols, traditional store and forward methods must be used. [0006]
  • Protocol engines have also been used to optimize traditional methods of protocol handling to reduce certain steps. These include hardware checksum units, hardware buffer management, and RISC processing to improve protocol handling rate. However, this approach still does not scale with signaling rate. [0007]
  • Other approaches have implemented in hardware proprietary non-TCP/IP protocols having a continuous flow and routing that is specific to the particular network fabric. Variable context matching is not performed, and cells propagate in strict format and order to a priori known memory addresses instead of to a transport protocol's abstract port destination. Therefore, such approaches are not readily adaptable to wide area networks which must handle a variable and relatively unstructured traffic flow, and which must be scaleable, expandable and readily adaptable to network changes. [0008]
  • It is desirable to provide a network accelerator system and method for handling standard TCP/IP protocol which solves the latency and other problems of known systems and methods, and it is to these ends that the present invention is directed. [0009]
  • SUMMARY OF THE INVENTION
  • The present invention provides a solution to the above-mentioned protocol processing problems using a cross disciplinary combination of hardware elements, techniques and results based, inter alia, on network traffic analysis, high speed programmable logic array technology, and integration with low level operating system software design. [0010]
  • The invention solves a problem that has been long unsolved of how to process TCP/IP data packets at a speed equal to that made possible by the latest generation physical layer hardware transmission components. As microprocessors increase in speed, the same technology advances also increase the speed at which data can be transmitted over networks. If this data protocol handling must be handled in software, then there are fundamental issues in logic and software design that will always make the ability of a processor to process the packets slower than the physical ability of the network to transmit packets. This speed differential can penalize maximum possible network performance by a factor of almost one hundred at present. [0011]
  • The main insights that enable the invention to provide a practical and implementable solution to the above-mentioned protocol processing problems are the recognition that the transmission patterns of the vast majority of packets over current TCP/IP mediated networks are predictable and involve only a very small subset of the entire TCP/IP protocol set. It is possible through logic design to implement this small set of actually used protocols in hardware, such as programmable logic gate arrays, to allow processing of TCP/IP data packets at speeds equal to that of the ability of the fastest physical network layer. The rare packets that cannot be handled in this manner can be defaulted to conventional software processing. An operating system also can be low-level interfaced to this processing system through appropriate memory management in such a way that the packet's data coming off the network data transmission medium can be processed and put into application memory at the speed equivalent to a single gate-mediated operation. [0012]
  • The invention allows practical processing of TCP/IP data packets in gate array hardware at a data throughput equal to that of the physical transmission media. It accomplishes this task by recognizing that TCP/IP packets on current networks fall into predictable transmission patterns that actually utilize only a small fraction of the entire protocol for the vast majority of transmissions. By implementing this small subset in gate array hardware and defaulting the exceptions into software, a very large increase in TCP/IP packet throughput can be obtained. [0013]
  • TCP/IP transmissions handled by the invention can be made faster than is possible with the best current software implementations and multiprocessor TCP/IP processing engines. Using mask programmable logic affords approaches which are both faster and less expensive to construct than the current RISC CPU assisted TCP/IP processing boards. The invention is intrinsically scaleable upwards in speed with little or no redesign needed as advances in IC processing technology make the network physical layers faster. It is a form of software embedded in hardware which can be physically implemented at any point where TCP/IP packet processing is used, such as in network interface cards and within microprocessor CPUs, affording significant potential technological and economic benefits. [0014]
  • A difference between the invention and prior approaches is that the invention constructs a path into memory for a specific class of packets that exists for the likely time interval when such a packet will be present. The path into and out of memory is handled entirely in the hardware of the invention with only random logic up to where it interacts with the application, and is triggered entirely by the arrival of the packet itself. In this hardware, all details are present for handling the packet payload state to where it will be delivered. With accelerators on both ends of a network transfer, no software overhead need be present for bulk data transfer in burst mode. This differs markedly from prior software and hardware approaches which employed techniques of minimized protocol implementations, buffer management, or by spreading the protocol implementation across a specially designed network fabric. [0015]
  • The invention implements continuous flow (streamed) information delivery via a standard protocol such as TCP/IP by means of a pattern match via associative memory. It has several benefits in processing standard protocols, as opposed to non-standard protocols. These include absolute minimum latency between application and network medium (fiber), absolute maximum bandwidth between communicating network applications, a low complexity design for the network protocol processing mechanism, and a protocol rate that scales linearly with network signaling rate. [0016]
  • These and other benefits are obtained, in one aspect, by avoiding software and hardware processing steps via an isochronous “stimulus/response” architecture using a variable content addressable memory that has preprogrammed state logic that effects protocol processing as a minimum time series of operations. A substantial, e.g., ten-fold, improvement in interapplication bandwidth with same complexity hardware results which makes practical low-cost gigabit network transport communications. While standard protocol processing is not unique as a process, this inventive method of processing is unique in that the software of a protocol implementation processes protocol information indirectly via hardware which has been a priori instructed on how to handle a predicted flow of packets autonomously. This methodology is superior to prior attempts in that the transmission speed of the network transport layer is scaled with the network physical layer.[0017]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a diagram of a network accelerator in accordance with the invention; [0018]
  • FIGS. 2a and 2b are diagrammatic views which contrast, respectively, a traditional link data stream store and forward approach with a network accelerated continuous flow link data stream approach in accordance with the invention; [0019]
  • FIG. 3 is a more detailed diagram of the network accelerator of FIG. 1; [0020]
  • FIG. 4 is a diagram of a control unit of the accelerator of FIG. 3; [0021]
  • FIG. 5 is a diagram of a transmit engine of the network accelerator; [0022]
  • FIG. 6 is a diagram of a receive engine of the invention; [0023]
  • FIGS. 7 and 8 are diagrams contrasting a traditional memory process with that of the invention; and [0024]
  • FIG. 9 is a diagram of the term addressable memory of the invention. [0025]
  • DESCRIPTION OF PREFERRED EMBODIMENTS
  • The invention is particularly adaptable to TCP/IP protocols and will be described in that context. It will be appreciated, however, that the invention has greater utility and is applicable to other streamed protocols. [0026]
  • Computer networks use network software protocols to communicate information reliably between computers over multiple successive physical signaling mediums. These protocols are implemented in software on computer processors. While hardware signaling rates have steadily increased, software protocol processing has not kept pace. With the advent of gigabit networking technology, costly processors must be dedicated to providing at most 40-50 percent of the theoretical bandwidth of the network, while software implementations used with earlier signaling technologies were capable of 60-80 percent of theoretical bandwidth. Clearly bandwidth demands will continue to increase, and since the disparity between software protocol processing and signaling rates will also increase, this defines a “bottleneck” in the effectiveness of networking technology. [0027]
  • The invention affords a minimum time mechanism for handling TCP “burst” transfers. A burst transfer is a series of bulk data transfers with no options between nodes (usually it is unidirectional). It consists of the sending node passing data payload packets with successive sequence numbers, and the receiver committing them to memory and sending acknowledgments back to the sender to trigger more data to be sent. The invention efficiently handles the burst so as to minimize latency. The software triggers the burst mode by a traditional send operation with a full-payload, and the receiver/transmitter fall into an asynchronous feedback loop of “send-next”/“acknowledgment” packets that continues until either the burst completes or an error occurs. [0028]
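  • As an illustration only, the “send-next”/“acknowledgment” feedback loop of a burst transfer can be modeled in a few lines of Python. The function name `run_burst`, the default payload size of 1460 bytes, and the `fail_at` error-injection parameter are assumptions made for this sketch and do not come from the specification.

```python
from typing import Optional, Tuple

MOD32 = 2 ** 32  # TCP sequence numbers are 32 bits wide


def run_burst(data: bytes, mss: int = 1460, start_seq: int = 0,
              fail_at: Optional[int] = None) -> Tuple[int, bool]:
    """Model a unidirectional burst; returns (bytes_acknowledged, completed).

    Hypothetical helper: each loop iteration stands for one data segment
    and the acknowledgment that triggers the next send.
    """
    seq = start_seq
    acked = 0
    received = bytearray()          # stands in for the receiver's memory
    segment_index = 0
    while acked < len(data):
        payload = data[acked:acked + mss]
        if fail_at is not None and segment_index == fail_at:
            # An error ends the burst; recovery is left to software.
            return acked, False
        received += payload         # receiver commits the payload to memory
        seq = (seq + len(payload)) % MOD32   # value carried by the ACK
        acked += len(payload)       # the ACK triggers the next send
        segment_index += 1
    return acked, True
```

  • In this model the loop ends in exactly the two ways named above: the data is exhausted (the burst completes) or an error is injected, at which point control would return to the software path.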
  • The invention provides a mechanism to process the costly portions of standard protocols in hardware entirely, and to do so at the same clock rate of the signaling. In this way, as the signaling rate rises, so does the protocol processing rate increase in lockstep. This approach is based upon several observations, including traffic pattern analysis of packets, experience with software protocol implementations, and experience with other nonstandard hardware-implemented protocols. [0029]
  • Traffic observation of TCP/IP packets shows that the majority of the packets simply pass bulk data without event, while the minority packets require more elaborate handling. Even more significant is that the delays on the Internet are for these very bulk data packets, so that it is critical to have timely delivery or low-latency of these packets for the performance to be maximized. Loss of this low-latency also impacts the reliability of a network, since it becomes impossible to tell if a failure has occurred, or if an assemblage of worst-case delays has masked an otherwise successful transfer. The ability to handle protocol packets with deterministic response time (as well as with a broad range of arrival distributions) is a requirement to maintain the “real time” characteristics that telecommunications services like telephone systems use to provide high-valued services en masse globally. [0030]
  • Experiences with software protocol implementations have shown that the necessary operations for a TCP/IP “burst” mode are constrained enough to be performed by hardware as clocked by the data stream. Unfortunately the difficulty in synchronizing the software with the data stream renders this observation useless. However, significant performance advantage can be gained by relying on hardware logic gate delays instead of program instructions for substantially reducing the latency between the network and application, thus allowing protocol handling at sustained rates without the need for additional buffering. This allows for continuous protocol processing at the data rate of the signaling technology. As will be described, the network accelerator of the invention uses a deterministic state machine to implement the transport protocol bulk receive and transmit functions, leaving to the software all other features of the protocol (including error recovery). [0031]
  • FIG. 1 is a functional diagram of a network protocol accelerator 10 in accordance with the invention. As will be described, the network accelerator can provide a 100BASE-TX full-duplex interface to an Ethernet network, with media access controller (MAC) functions, IP (Internet Protocol) processing and decoding, and TCP (Transmission Control Protocol) processing. It may be a PCI interface board, designed to be used in an NT workstation, for example, utilizing a standard PCI bus slot. The accelerator will preferably have a physical link layer processor, an IP processor, a TCP processor, segment buffer memory, multiple FPGAs for logic, and a PCI interface to the host system. [0032]
  • As shown in the figure, the network accelerator includes a network interface 12 that includes a physical (PHY) media framing unit which obtains physical signals from the physical media, decodes the signals and the link layer framing as a byte stream, and supplies the stream to an accelerator engine 14. Simultaneously, a copy of the signals may be recorded in a receive/transmit (Rx/Tx) FIFO bypass unit 16. In the event of a failure of the accelerator engine 14 to accept the packet, a system bus interface unit 18 may signal a system interface unit 20 to handle the packet which is stored in the Rx FIFO bypass portion of 16. Similarly, the system interface can send packets via the bus interface to the Rx/Tx FIFO bypass portion of unit 16, which hands them on to the physical media framing unit 12, effectively bypassing the accelerator engine for non-TCP data transfers. The accelerator engine 14 is connected to a variable content addressable memory 22, and consults the memory as octets of a packet are received to find a match with predetermined patterns. When a match is found, a state machine associated with the pattern is loaded from the content addressable memory into the accelerator engine to operate on the packet. Operations may include packet table delivery from the physical media framing unit to a dual port application transfer buffer 24; sending a packet with payload from the application transfer buffer to the physical media framing unit; and sending a packet acknowledgment to the physical media framing unit. [0033]
  • Upon completion of an operation, the accelerator engine 14 may signal its status to the system interface unit 20 via the bus interface 18 indicating the event. In the event the accelerator engine fails to recognize a packet or encounters an error, such as a CRC (cyclic redundancy check) or checksum error, the accelerator engine suspends operation on that segment, causing all packet traffic to be handled via the Rx/Tx FIFO bypass buffer until it is re-enabled by the system interface unit 20 via the bus interface 18 to return to normal operation. [0034]
  • Preferably, the network accelerator of the invention is implemented as a set of programmable logic chips intermediate between the network physical layer interface chip set and an interface chip set for the PCI or other bus. These programmable logic chips may be SRAM field programmable gate arrays (FPGAs) which can be reprogrammed via the network to allow the hardware protocols to be modified after installation to correct errors, to optimize performance as the network changes, or to implement changes to existing protocol sets. Low cost implementations can also use mask programmable chip sets. The speed advantages of mask programmable ASICs may make their use preferable in high speed point to point data transfer applications where the end nodes and routers are well defined by the user. [0035]
  • The network accelerator also may be implemented within the silicon of the main microprocessor or within on-board multi-chip-modules analogous to the way MMX incorporates digital signal processor functionality, or by which AGP provides on-chip integration of graphics accelerator functions. This can bring the data directly into the microprocessor, bypassing the external data bus interface which otherwise limits performance. [0036]
  • The high speed data transmission capability of the invention is advantageous in providing for direct data storage, display, or data processing device interconnect both between and within individual computers. [0037]
  • FIGS. 2a and 2b are diagrammatic views which contrast the significant improvement in latency between a network accelerated continuous flow link data stream approach of the invention (FIG. 2b) and a traditional link data stream store and forward approach (FIG. 2a). FIG. 2a illustrates how a traditional protocol stack accumulates data in a store and forward buffer 30, and then performs the necessary protocol processing operations. As indicated in the figure, data packets from an Ethernet are delivered to a link data delivery unit which may perform error checking prior to storing in buffer 30. The time required for this operation is of the order of tens of microseconds. Subsequently, the various segments of the data packet are processed in a protocol processor 32. This processing is sequential, and results may also be placed in the store and forward buffer as application payload data is delivered to the protocol processor. This typically may take hundreds of microseconds, even with very high speed devices performing the operations. The critical weakness is that data must be in some kind of buffer before it can be processed, and processing must be completed before the data can be forwarded. [0038]
  • In contrast, as shown in FIG. 2b, the network accelerator of the invention uses the protocol's data stream itself as a way of instructing a uniquely constructed data flow processing machine 34 that is clocked by the protocol data and which performs processing operations as the information appears. As indicated in the figure, processing occurs in a series of parallel functional units 35-38 having a pipelined architecture so that packets are processed in real time with the processed data flowing between the network's wire link and the processing application's data origin. In effect, a protocol's packet would appear as a single, fat instruction that would run on a data flow processor in lock step with the link's data rate. This allows complete processing in times of the order of tens of microseconds in contrast to the traditional store and forward approach illustrated in FIG. 2a. The manner in which this is accomplished will be described in more detail below. [0039]
  • FIG. 3 is a functional diagram which illustrates in more detail a preferred embodiment of the network accelerator of FIG. 1. As shown, the physical media framing network interface 12 may comprise a physical device interface (PHY) 40 connected to a CRC/MAC unit 42. This physical device interface and CRC/MAC unit provide physical and link layer access, respectively, to an Ethernet network. The CRC/MAC unit provides parallel-to-serial conversion, CRC (cyclic redundancy check) generation and checking, MAC address recognition, FIFO buffering, and interface to the remainder of the network accelerator, which includes TCP/IP processors and the dual port application transfer memory 24, which preferably comprises a dual port/double banked RAM. [0040]
  • As will be described more fully, outgoing Ethernet packets will be read from the buffer memory 24 and transferred to an internal FIFO in preparation for transmission to the network. As the Ethernet packet is constructed and output to the network from the FIFO by the Tx engine, the CRC will be calculated on the fly and appended to the end of the Ethernet packet. An incoming Ethernet packet is stored in the incoming FIFO while the destination address is checked against the MAC address register. If the MAC address is correct, the Ethernet packet is sent to an Rx engine. The Ethernet packet is also run through the CRC checker, simultaneously. Once the Ethernet packet is completely received and the CRC is good, the CRC good signal will be asserted. [0041]
  • As shown in FIG. 3, the accelerator engine 14 includes a control unit 44, a Tx engine 46, and a Rx engine 48. Included within the variable content addressable memory 22 is a first prototype memory 50 connected to the Tx engine 46, and a second prototype memory 52 connected to the Rx engine 48. In addition, variable content addressable memory (VxCAM) 22 includes a content addressable memory (CAM) 54 which is also connected to the Rx engine 48. The variable content addressable memory matches a variety of packet formats and is used to quickly determine to which session an incoming IP datagram belongs. The variable content addressable memory 22 also includes an ADE memory 56 which is connected to control unit 44. The Rx/Tx FIFO bypass memory 16 may be implemented as a Tx bypass memory 60 and a Rx bypass memory 62. As shown, bypass memory 62 may be connected to the Rx engine 48 and to a bus 64 connecting the control unit 44 and bus interface unit 18. The Tx bypass memory 60 may be similarly connected to bus 64 and to the CRC/MAC unit 42. [0042]
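  • The “on the fly” CRC calculation described above can be sketched in Python with the standard `zlib.crc32` function, which accepts a running value so that the check value is updated as each chunk of the packet streams past. The helper names `frame_with_fcs` and `crc_good`, and the choice to append the value as four little-endian bytes, are assumptions of this sketch; the sketch does not model the bit-level wire ordering handled by the CRC/MAC unit.

```python
import struct
import zlib


def frame_with_fcs(chunks):
    """Stream the chunks of a packet, updating the CRC as each one passes.

    Hypothetical helper: no second pass over the packet is needed because
    the running CRC is carried from chunk to chunk.
    """
    crc = 0
    out = bytearray()
    for chunk in chunks:
        crc = zlib.crc32(chunk, crc)    # running CRC-32 update
        out += chunk
    out += struct.pack("<I", crc)       # append the 4-byte check value
    return bytes(out)


def crc_good(frame: bytes) -> bool:
    """Recompute the CRC over the body and compare with the trailing value."""
    body, fcs = frame[:-4], frame[-4:]
    return struct.pack("<I", zlib.crc32(body)) == fcs
```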
  • The network accelerator handles the various layers of an Ethernet packet as it is sent or received from a network. When processing the IP layer, the IP address, IP checksums, ID field, flags, IP datagram length, etc. are either pre-calculated and sent to the network via the Tx engine 46, or used to verify the destination of an incoming Ethernet packet via the Rx engine 48. [0043]
  • When processing the TCP layer, TCP ports, TCP checksums, sequence numbers, ACK number, flags, window size, urgent pointer, options, etc. are either pre-calculated and sent to the network via the Tx engine, or used to verify the destination of an incoming IP datagram via the Rx engine. The Tx engine 46 obtains the TCP payload directly from the memory 24. The Rx engine 48 delivers the TCP payload directly to the memory 24. [0044]
  • The memory 24 will contain the host system view of network memory, and a shadowed copy for the network accelerator to use for TCP segment transmission and reception. The host system software driver will swap application memory (system RAM) for memory 24. This will allow the host system direct access to the network data stored in the dual-port/double banked memory, effectively replacing the role of host system RAM. Finally, the system interface controls the relationship between the system and the network accelerator. It contains configuration and status registers, and allows the host system to access the network accelerator. [0045]
  • Data for the packets are buffered for transfer using the memory 24. This memory maintains an up-to-date copy of the network data for the host system/application, and a local copy of the network data for the network accelerator. This allows the application/host system to access memory as it would system RAM before, during and after a TCP segment is sent to the network by the network accelerator. Also, the memory allows access to a stable copy of the network data for transmission or reception to/from the network. The network accelerator control unit maintains the proper relationship between the memory banks, with the banks synchronized in the case of Idle state (the network accelerator is neither transmitting nor receiving TCP segments), or logically separated during network accelerator TCP segment transmission or reception. The double banked nature of the memory allows a “zero-copy” or “zero-latency” method of network data delivery to the network accelerator. [0046]
  • Along with the bulk memory, there is status memory used to maintain the relationship between the network accelerator memory bank and the host system memory bank. This status memory works as a table indicating which bank of memory has the most current byte of network data for each address in the memory. [0047]
  • The content addressable memory (CAM) 54 is used to quickly determine to which session an incoming IP datagram belongs. It cooperates with ADE memory 56 and prototype memories 50, 52, and is part of the variable content addressable memory (VxCAM) 22. [0048]
  • Within the ADE memory 56, there will be one or more address descriptor entries (ADEs) which describe the segment details such as memory base address, TCP payload length, TCP payload checksum, next TCP sequence number and the next TCP segment's ADE. This information is used by the Tx engine when the segment is constructed, prior to transmission. The Rx engine uses the ADE fields to determine the sequence numbers, payload destination, and out-of-order segments. [0049]
  • Within the prototype memories 50, 52, there will be one or more session prototype description entries. These entries describe the session fields that do not change, as well as the initial values for the session, such as IP address, TCP ports, protocol fields, base sequence number, first ADE, etc. The Tx engine uses this information to generate the static fields within a session for an outgoing TCP segment. The Rx engine uses this information to determine what TCP session an incoming TCP segment is destined for, and to verify the validity of specific fields in the TCP/IP header. [0050]
  • The content addressable memory 22 stores the address of the potential TCP session prototype entry that describes the session to which an incoming segment belongs. Certain fields in the TCP/IP header are hashed to obtain a value which is used as an address to “look up” which prototype describes this segment. The memory stores at the “hashed” address another address which points to the prototype data in the prototype memory. If the memory returns a value of zero, the incoming TCP segment does not belong to any accelerated sessions and is routed to the bypass FIFO. In this manner, a “one shot” lookup of the TCP session prototype can be done, rather than searching potentially thousands of TCP session prototypes. [0051]
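  • A minimal Python sketch of the double-banked memory and its status table follows. The class name `DualBankMemory`, the method names, and the use of one status byte per address are assumptions made for illustration; the specification only states that the table records which bank holds the most current byte for each address.

```python
class DualBankMemory:
    """Hypothetical model of two banks plus a per-address status table."""

    HOST, SHADOW = 0, 1

    def __init__(self, size: int):
        self.banks = [bytearray(size), bytearray(size)]
        # Status memory: for each address, the bank holding the newest byte.
        self.newest = bytearray(size)

    def host_write(self, addr: int, value: int) -> None:
        self.banks[self.HOST][addr] = value
        self.newest[addr] = self.HOST

    def accel_write(self, addr: int, value: int) -> None:
        self.banks[self.SHADOW][addr] = value
        self.newest[addr] = self.SHADOW

    def read_current(self, addr: int) -> int:
        """Return the most current byte, whichever bank it lives in."""
        return self.banks[self.newest[addr]][addr]

    def synchronize(self) -> None:
        """Reconcile the banks byte by byte, as in the Idle state."""
        for addr, bank in enumerate(self.newest):
            value = self.banks[bank][addr]
            self.banks[self.HOST][addr] = value
            self.banks[self.SHADOW][addr] = value
```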
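  • The “one shot” lookup can be sketched as a hash-indexed table in which the value zero means that no accelerated session matches. The table size, the particular hash function, and the names `session_hash` and `SessionLookup` are assumptions of this sketch; the specification does not say which header bits are combined or how.

```python
TABLE_SIZE = 1024  # hypothetical number of lookup slots


def session_hash(src_ip: int, dst_ip: int, src_port: int, dst_port: int) -> int:
    """Illustrative hash of the address and port fields (not the patent's)."""
    key = src_ip ^ dst_ip ^ ((src_port << 16) | dst_port)
    return (key ^ (key >> 16)) % TABLE_SIZE


class SessionLookup:
    def __init__(self):
        # Each slot stores a prototype address; 0 means "no accelerated session".
        self.table = [0] * TABLE_SIZE

    def register(self, four_tuple, prototype_addr: int) -> None:
        if prototype_addr == 0:
            raise ValueError("address 0 is reserved for 'no match'")
        self.table[session_hash(*four_tuple)] = prototype_addr

    def route(self, four_tuple):
        """Single table read: either a prototype address or the bypass path."""
        addr = self.table[session_hash(*four_tuple)]
        return ("bypass", None) if addr == 0 else ("accelerate", addr)
```

  • Because different sessions can hash to the same slot, a non-zero result only identifies a potential prototype; the returned prototype's address and port fields still have to be compared against the incoming header before the segment is accepted.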
  • FIG. 4 illustrates the network accelerator control unit 44 in more detail. As indicated above, the control unit provides the overall state machines and control registers which control the network accelerator. Logic for controlling the dual port application transfer memory 24 and the Tx and Rx session state machines (to be described) for the Tx engine and the Rx engine, respectively, may be contained in a dual port memory controller 61. Logic for generating a checksum may be contained in a checksum unit 62 which interfaces with ADE memory 56 via an address bus 57 and a data bus 58. After initialization of a current checksum, the ADE memory 56 may be created and used for bounds checking on the host address to obtain the checksum for the desired payload of a TCP segment. This checksum may be loaded into the checksum unit 62. The current value may be stored in memory 24, and a new calculated value may be added to the checksum. The checksum may then be saved either through a write back to the ADE memory 56 and the dual port application transfer memory 24, or if multiple locations require modifications, by iteration. The calculated checksum is then ready for use as the data checksum of the TCP segment. [0052]
  • As shown in FIG. 4, a first FIFO buffer 70 may interface the dual port application transfer memory 24 to Tx data from the Tx engine state machine, and a second FIFO buffer 72 may interface memory 24 to Rx data from the Rx engine state machine. Logic for controlling the FIFOs may be contained in the FIFO buffers themselves and used to minimize bus arbitration read/write by an arbitration unit 74. In addition, logic for controlling the Rx engine and the Tx engine, as well as access to their status and control registers, may be contained in a registers and configuration unit 76 which is interfaced to memory 24 and memory controller 61 by an application data bus 77 and an application address bus 78. Arbitration unit 74 also may include logic to control memory access arbitration between the host system and the network accelerator. The network accelerator control unit also maintains global control of the state machines for each session. [0053]
  • The Tx state machine and the Rx state machine may have the following states. Tx idle is the state prior to sending a Tx buffer to the network. This is the default state and is set up by the software driver. The software driver will also generate the necessary values for the VxCAM for a given buffer space. The host system fills the memory until the session is ready to be transmitted. At this point, the state machine transitions to the Tx pending state. [0054]
  • During the Tx pending state, the dual port memory controller 61 maintains two copies of the data: one for the host system, and one for the Tx engine. Proper data relationship between the host system memory and the network accelerator memory must be maintained to prevent old data from overwriting new host system data, and new host system data from overwriting the data in use by the Tx engine. [0055]
  • In the Tx complete state, if the transmission fails, the state machine goes to the Tx re-transmit state. If the Tx was a success, the network accelerator will set a success bit and go to the Tx idle state. The network accelerator is now waiting to send out the next segment. In either case, the network accelerator control unit must continue to maintain the proper relationship between the two copies of data. [0056]
  • If the Tx transaction fails, in the Tx re-transmit state, the network accelerator may either attempt to re-transmit the segment, or move to the next session queued for transmission and attempt this segment later. [0057]
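  • The Tx session states described above can be summarized as a small transition table. The event names (`buffer_ready`, `segment_sent`, `success`, `failure`, `retry`, `defer`) are labels invented for this sketch, and modeling the “attempt this segment later” choice as a return to the idle state is likewise an assumption.

```python
from enum import Enum, auto


class TxState(Enum):
    IDLE = auto()
    PENDING = auto()
    COMPLETE = auto()
    RETRANSMIT = auto()


# (current state, hypothetical event name) -> next state
TX_TRANSITIONS = {
    (TxState.IDLE, "buffer_ready"): TxState.PENDING,
    (TxState.PENDING, "segment_sent"): TxState.COMPLETE,
    (TxState.COMPLETE, "success"): TxState.IDLE,         # success bit is set
    (TxState.COMPLETE, "failure"): TxState.RETRANSMIT,
    (TxState.RETRANSMIT, "retry"): TxState.PENDING,      # re-send the segment now
    (TxState.RETRANSMIT, "defer"): TxState.IDLE,         # assumed: try again later
}


def tx_step(state: TxState, event: str) -> TxState:
    """Return the next state, rejecting events the current state does not allow."""
    try:
        return TX_TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {state.name}")
```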
  • The Rx idle state is the initial state. In this state, the two copies of network data are reconciled. Depending on the outcome of the previous received segments, the host system reads data from either the shadow bank of dual port application transfer memory or the application bank of same memory. If a packet was successfully received, the net payload data stored in the shadow bank of the dual port application transfer memory must be presented to the host system. This is performed on a byte by byte level. [0058]
  • In the Rx pending state, the Rx engine is receiving one or more segments in the current session. Receive data is placed in the proper bank of the memory by the network accelerator control unit. [0059]
  • In the Rx complete state, there may be two different scenarios: Rx success or Rx time-out. In the case of success, the success bit for the ADE is set, then the Rx idle state is entered. In the case of failure, the state machine goes to idle and no changes occur to the memory. [0060]
  • Checksums for the payload of the TCP packet are calculated by the checksum logic 62 as follows. Upon initial setup of the session, the section of memory used by the session is cleared to all zeros. This allows the initial checksum to be initialized to zero for each segment. ADEs are set up for each segment within the session; ADEs contain the starting address, ending address, and checksum for each segment of the session. There may be one or more segments in any session. [0061]
  • During host system writes, the host system presents an address to be accessed. Bounds checking is performed on this address to determine which ADE contains the checksum for this address. The checksum is loaded into the checksum logic and the current (old) value in the memory is subtracted from the checksum. Next, the new data value is added to the checksum. [0062]
  • Upon saving a new checksum, if it is a single location write, the new checksum is written back into the ADE and the new data value is written into the memory. If multiple locations are to be modified, the checksum stays in the checksum generator and each new data value is added to the checksum while each old value is subtracted. The new data is written into the dual port application transfer memory 24 during this operation. [0063]
  • Once a segment is ready to be sent to the network, the Tx engine uses the checksum stored in the ADE as the checksum for the data portion of the TCP segment. [0064]
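  • The subtract-old, add-new update can be sketched in Python as follows. The TCP checksum is a 16-bit ones' complement sum, so the sketch uses end-around-carry arithmetic; treating each host write as one aligned 16-bit word, and the names `oc_add`, `oc_sub`, `ChecksummedRegion`, and `full_sum`, are assumptions made for illustration.

```python
def oc_add(a: int, b: int) -> int:
    """16-bit ones' complement addition with end-around carry."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)


def oc_sub(a: int, b: int) -> int:
    """Subtract b from a by adding the ones' complement of b."""
    return oc_add(a, ~b & 0xFFFF)


class ChecksummedRegion:
    """Hypothetical model of one segment's memory and its stored checksum."""

    def __init__(self, n_words: int):
        self.words = [0] * n_words  # region is cleared to zero at session setup
        self.checksum = 0           # so the running sum also starts at zero

    def write(self, index: int, value: int) -> None:
        old = self.words[index]
        # Remove the old word's contribution, then add the new word's.
        self.checksum = oc_add(oc_sub(self.checksum, old), value)
        self.words[index] = value


def full_sum(words) -> int:
    """Reference: recompute the sum over the whole region from scratch."""
    total = 0
    for w in words:
        total = oc_add(total, w)
    return total
```

  • Ones' complement arithmetic has two encodings of zero (0x0000 and 0xFFFF), so an incrementally maintained sum and a freshly computed one should be compared modulo 0xFFFF rather than bit for bit.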
  • The Rx engine and the Tx engine use FIFOs 70, 72 for interfacing the engines to the dual port application transfer memory 24. The FIFOs minimize the bus arbitration necessary to read and write data into the dual port application transfer memory 24 from the engines. The control of the FIFOs involves filling and draining the FIFOs in a cycle-steal mode between host system accesses to the memory. [0065]
  • The control unit 44 has an address and data bus connection to the Tx and Rx engines 46, 48. This bus allows the control unit to set and read configuration and status registers within the two engines. [0066]
  • The control unit controls access to the Tx and Rx engines and all memory 24, 50, 52, 56, and CAM memory 54 through arbitration, using arbitration unit 74. Host system accesses and network accelerator accesses compete for access through the control unit. Any known arbitration method may be used to control these accesses. [0067]
  • FIGS. 5 and 6, respectively, illustrate in more detail preferred embodiments of the Tx (transmit) engine 46 and Rx (receive) engine 48. Referring to FIG. 5, Tx engine 46 may be controlled by a state machine 100, which generates the signals that control all the events in the send process. It may be based on a send counter (not shown), which is started at initial transmission time. A multiplexer 102 combines Tx data and the outputs of several registers, and provides these to an output register 104. [0068]
  • The registers muxed to the output register 104 may be Tx prototype register 106, a Tx application data output register 108, the outputs of checksum registers 110 and 112, an ACK register, and all the individually calculated fields in overlay registers 114 and 116. [0069]
  • The send counter and the Tx engine control state machine 100 govern the timeslots for outputting the various fields to the output register. The state machine determines the proper time to calculate the various IP and TCP fields and when to send the next segment. Sliding window calculation logic provides the information via register 120 to the Tx engine state machine for next segment transmission. [0070]
  • The Tx engine is responsible for sending Ethernet packets containing IP datagrams of TCP segments to the network. There are two primary types of TCP segments. These are user data (ADE) segments, and automatically generated acknowledgment segments for received data. The network accelerator creates packets from scratch, generates the Ethernet header, the IP header, the TCP header, and the TCP data payload. [0071]
  • The Tx engine state machine [0072] 100, which may be contained in the dual port memory controller 61, asserts the Tx pending state through the Tx engine control state machine 100, making available data contained in the Tx FIFO 70 (FIG. 4). The Tx engine loads a prototype register 106 with static portions of the TCP/IP headers from the proto memory 50 of the Tx engine (FIG. 3). The logic for the calculation of the dynamic portion of TCP header is contained in a TCP header Tx overlay register 116. The logic for the final checksum calculation for the dynamic portion of the IP header may be contained in an IP header checksum register 10 of FIG. 5, and the logic for the post-checksum calculation for the TCP segment may be contained in a TCP segment post-checksum register 112. The logic which provides sequential accesses to the register contents to the output register 104 may be contained in a transfer register 105. The logic for calculating sequence numbers from the TCP header Tx overlay register by adding the length of the packet data contents to a current sequence number may be contained in an arithmetic logic unit (ALU) 1 18. The logic to obtain values of a prior received datagram's sequence number and length to generate an acknowledgment number using the ALU may be obtained from the Rx engine TCP header Rx register 103. The final results may be output via the output register 104 to the CRC/MAC unit 42 of FIG. 3. The logic for determining whether sending of a datagram was successful and acknowledged may be contained in the engine control state machine 100. The logic for determining if one can send additional datagrams is determined by the engine control state machine 100 and Tx window register 120.
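  • The sequence and acknowledgment arithmetic attributed to the ALU, and the decision on whether another datagram may be sent, can be sketched as follows. The function names are hypothetical, and the `may_send` test is the conventional TCP sliding-window comparison, which is assumed here rather than taken from the specification.

```python
MOD32 = 2 ** 32  # TCP sequence numbers are 32-bit and wrap around


def next_sequence(seq: int, payload_len: int) -> int:
    """Sequence number of the next segment: current number plus data length."""
    return (seq + payload_len) % MOD32


def ack_number(rx_seq: int, rx_len: int) -> int:
    """Acknowledgment for a received datagram: its sequence number plus length."""
    return (rx_seq + rx_len) % MOD32


def may_send(next_seq: int, last_acked: int, payload_len: int, tx_window: int) -> bool:
    """True if the unacknowledged bytes plus this payload fit in the Tx window."""
    in_flight = (next_seq - last_acked) % MOD32
    return in_flight + payload_len <= tx_window
```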
  • The data used to generate the dynamic calculated portion of the TCP/IP headers reside in the ADE memory, the proto memory, and the [0073] memory 24, and the data to generate the static precalculated portion of the TCP/IP headers resides in the Tx engine proto memory 50. The data used to generate the TCP/IP payload resides in the dual port application memory 24.
  • When the host system asserts a signal indicating that a segment should be sent to the network, the base address of the segment prototype is loaded into the Tx engine proto memory address register, and the base ADE address for the segment is loaded into the Tx engine ADE memory address register. The Tx engine reads the ADE and prototype data out of the ADE memory and the Tx engine proto memory, respectively, then calculates the various fields and inserts the fields into the outgoing network stream. Certain fields of the stream, such as sequence numbers, ACK numbers, ID fields, etc., may be calculated as the stream progresses. Once the headers have been calculated, the TCP payload is output from the dual port application transfer memory. Finally, the CRC/[0074] MAC unit 42 appends a CRC 32 value to the Ethernet packet, and completes delivery of the packet to the PHY device 40. In this manner, the network accelerator generates a complete Ethernet packet comprising an IP datagram containing a TCP segment.
  • Referring to FIG. 6, the [0075] Rx engine 48 is controlled by an Rx engine control state machine 140, and receives Ethernet packets from the network interface comprising the PHY device 40 and the CRC/MAC unit 42 via the input register 142. Upon receipt, the state machine sequences data to other elements of the Rx engine. The receive packet is sent to the Rx bypass memory 62, which serves as a buffer and used for any packet that is not a bulk data transfer TCP segment. The Rx engine processes the IP and TCP headers and determines the type of TCP segment. The Vx CAM memory 22 is used by the Rx engine to determine to which session an incoming IP datagram belongs.
  • The Rx engine, under the control of [0076] state machine 140, compares a number of fields of the IP header with expected values stored in a plurality of registers. Certain fields in the TCP/IP header are static and can be compared against static values. Other fields are variable, and define, for example, the length, or checksum or other session-related details. The variable fields are compared against values stored in registers and pre-determined values stored in the ADE memory.
  • Upon receiving an incoming packet, the header is decoded by a [0077] decoder 144 to determine the location of the source and destination addresses and ports contained in the TCP/IP header. The logic for locating the associated prototype packet header and address descripted entry is contained a Vx CAM proto-ADE locator 146. The Rx engine block address decoded entry may be held in an ADE register 156. The Rx engine block prototype entry may be held in proto memory 52, which loads the entry into a prototype register 148. A TCP/IP header matcher 150 which contains logic for comparing of session fields of the packet obtained from the prototype register and variable fields held in a TCP header Rx register 152 and IP header Rx register 154. Logic for validating the checksum for the IP portion of the TCP/IP header matcher 150 may be contained in an IP header checksum unit 162, and the logic which validates the checksum for the TCP segment portion of the TCP/IP header matcher and the data stream from the input register 142 may be contained in a TCP segment header checksum unit 160. Data from valid packets may be passed to the receive data FIFO 72 (FIG. 4). Logic for updating TCP header Rx register 152 for transmitted data acknowledgments or buffer window size adjustment may be contained in the arithmetic logic unit (ALU) 164.
  • [0078] The Rx engine control state machine 140 reduces the Rx window register 170 as data is received, and increases it as the application makes buffer space available in the dual port application transfer memory 24 (FIG. 3). Under the control of the state machine, the contents of the Rx window register 170 may also be passed to the ALU 164 to synthesize a window update, which may be passed to the Tx engine via the Rx engine transfer unit 172.
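The window bookkeeping described above can be modeled in a few lines. This is only a sketch: the class name `RxWindow`, its methods, and the capping behavior are assumptions made for illustration, and the 16-bit limit reflects the width of the TCP window field rather than anything stated in the text.

```python
class RxWindow:
    """Software model of a receive window register (hypothetical names)."""

    def __init__(self, buffer_size: int):
        self.limit = buffer_size   # total buffer space, assumed fixed
        self.value = buffer_size   # currently advertised free space

    def on_data_received(self, nbytes: int) -> None:
        # Received data consumes buffer space, so the window shrinks.
        if nbytes > self.value:
            raise ValueError("segment exceeds the advertised window")
        self.value -= nbytes

    def on_buffer_freed(self, nbytes: int) -> None:
        # The application releasing space grows the window again.
        self.value = min(self.limit, self.value + nbytes)

    def window_update(self) -> int:
        # Value to place in an outgoing TCP header (16-bit field).
        return min(self.value, 0xFFFF)
```

In this model, `window_update()` plays the role of the value synthesized for the Tx engine.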
  • [0079] When processing a packet header, if any of the fields of the header do not match expected values, the segment may be routed to the Rx bypass memory 62, and the Rx engine may go into an idle state. The IP source and destination addresses, plus the TCP source and destination ports, may be hashed together to form a value which is used as an address into the content addressable memory, which in turn holds the address of the Rx prototype. If the memory returns a non-zero value, it is used as an address to fetch the Rx prototype. If the value is zero, the packet is routed to the bypass buffer.
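The hashed lookup can be sketched as below. The text does not specify the hash function, so the XOR folding used here is purely an assumption, as are the names `CAM_SLOTS`, `session_hash`, and `route_segment` and the table size of 1024.

```python
import ipaddress

CAM_SLOTS = 1024  # hypothetical table size

def session_hash(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> int:
    """Fold the addresses and ports into a table index (assumed XOR hash)."""
    s = int(ipaddress.IPv4Address(src_ip))
    d = int(ipaddress.IPv4Address(dst_ip))
    mixed = s ^ d ^ (src_port << 16) ^ dst_port
    mixed ^= mixed >> 16  # mix the high half into the low half
    return mixed % CAM_SLOTS

def route_segment(cam, src_ip, dst_ip, src_port, dst_port):
    """Return ('prototype', address) on a hit, ('bypass', None) on a zero entry."""
    proto_addr = cam[session_hash(src_ip, dst_ip, src_port, dst_port)]
    if proto_addr == 0:
        return ("bypass", None)
    return ("prototype", proto_addr)
```

Because distinct sessions can hash to the same slot, a hit only nominates a prototype; the addresses and ports still have to be compared against the prototype values before the segment is accepted.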
  • [0080] The value returned by the content addressable memory is used as the base address for the Rx prototype for the segment. The prototype is read and the IP addresses and the TCP ports are compared against prototype values. If they match, the segment is accepted for further processing, and the ADE base address is read from the prototype memory array. The ADE contains the base sequence number of the memory region. If the sequence number of the segment falls within the region described by the ADE, it is accepted and the base TCP payload address is read from the ADE.
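The acceptance test against the ADE can be illustrated as follows. The fields `base_seq` and `payload_base` correspond to items named in the text; `region_len`, the offset arithmetic, and the function name are assumptions added for the sketch. The modulo reflects the fact that TCP sequence numbers wrap at 2^32.

```python
from dataclasses import dataclass
from typing import Optional

SEQ_MOD = 1 << 32  # TCP sequence numbers wrap modulo 2**32

@dataclass
class Ade:
    base_seq: int      # base sequence number of the memory region
    region_len: int    # hypothetical: size of the region in bytes
    payload_base: int  # base TCP payload address

def payload_address(ade: Ade, seq: int, seg_len: int) -> Optional[int]:
    """Return the write address for a segment, or None if it falls outside."""
    offset = (seq - ade.base_seq) % SEQ_MOD  # distance past the region base
    if offset + seg_len > ade.region_len:
        return None  # not within the region: do not accept
    return ade.payload_base + offset
```

A sequence number slightly below the base produces a very large offset after the modulo, so it is rejected without a separate comparison.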
  • [0081] Data from the segment is read into the dual port application transfer memory 24 until the segment is completely received, which can be determined by a length counter. Once a segment is received, a CRC 32 signal may be asserted, indicating that the packet has been verified and notifying the host system of receipt of data. The Rx engine 48 remains in a pending state until a finished bit is received for the segment. At that time, the system is interrupted and the network accelerator control unit goes into the Rx complete state.
  • [0082] From the foregoing, it may be seen that the network accelerator of the invention affords significant advantages, and may be used in diverse applications. It is also applicable to continuous flow, streamed protocols other than TCP/IP. Some of these applications include high speed links for network backbones, protocol processing for gigabit physical Ethernet layers, data transport between computers within a system, high speed transport for real time high resolution video, increasing the speed of Internet data burst communications, permitting telephony packets to be transmitted over the Internet, and affording enhanced transaction processing and robotics control feedback.
  • [0083] As will also be appreciated from the foregoing, the implementation of the network accelerator is not limited to FPGAs. It may be implemented in other forms of hardware, and even integrated with microprocessors. The invention may be installed in various components, such as disk drives, graphics cards, video transmission devices, wireless links, TCP/IP hubs, and the like. The substantial increase in speed and corresponding reduction in latency afforded by the invention is a significant advantage.
  • Overview
  • [0084] Internet communications require heavy use of packet traffic that is directed between endpoints by large (32 to 97 bit) identifiers within the packet. Packet traffic is directed by partially decoding the full 97 bit identifier. The smallest identifier (32-bit) directs traffic to a computer, medium sized identifiers (65-bit) direct traffic to a specific computer's application program instantiation, while the largest identifier (97-bit) identifies the communication session between two application programs. Level-3 switches use the smallest identifier, but for level-4 switches and processor adapters, medium and large identifiers are required. Hundreds to thousands of these identifiers may be used in a fraction of a second. If more identifiers can be matched more quickly, the same hardware can handle more network bandwidth for an improvement in performance.
  • [0085] Content-addressable memory (CAM) is a hardware concept commonly used in switches to direct packets. The VxCAM, or Virtual context dependent content addressable memory, is an improvement of the CAM concept that takes into account characteristics of the usage on the Internet to perform more effectively. The VxCAM outperforms a CAM in a Level-4 device by requiring (a) fewer memory elements to switch the same amount of traffic, (b) narrower data paths, and (c) shorter connection establishment, all as a result of having fewer terms to check and set up.
  • [0086] This invention reduces the complexity of the process of associating packets with specific information sessions or groups, as explained above, such that higher-level protocol functions can be performed. In accordance with the invention, it is recognized that TCP is the primary protocol used to communicate over the Internet, accounting for over 95% of all communications. A single web page has on average 10 connection sessions, each sending on average 10K bytes of payload, for a total of 100-300 packets. The 97 bit endpoint-to-endpoint Internet identifier can be broken into separate components as follows:
  • [0087] (1) 1 bit for UDP/TCP protocol selection, (2) two 32 bit IP source and destination terms that determine selection of the communicating computers, and (3) two 16 bit source and destination UDP/TCP port terms for application/session determination within each computer.
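The component widths listed above add up to 97 bits, which the following sketch makes concrete. The field order within the packed value and the name `pack_identifier` are assumptions for illustration; the text specifies only the widths.

```python
import ipaddress

# 1 protocol bit + two 32-bit addresses + two 16-bit ports
IDENTIFIER_BITS = 1 + 2 * 32 + 2 * 16

def pack_identifier(is_tcp: bool, src_ip: str, dst_ip: str,
                    src_port: int, dst_port: int) -> int:
    """Pack the five terms into one integer (assumed field order)."""
    ident = 1 if is_tcp else 0
    ident = (ident << 32) | int(ipaddress.IPv4Address(src_ip))
    ident = (ident << 32) | int(ipaddress.IPv4Address(dst_ip))
    ident = (ident << 16) | src_port
    ident = (ident << 16) | dst_port
    return ident
```

A conventional CAM would store and compare this entire value for every session, which is the cost the term-based approach is intended to reduce.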
  • [0088] I have discovered that 90% of the terms in all of the packet identifiers in all of these packets are the same. Based on that discovery, the technique of term sharing has surprising application: storing each redundant term in a single memory cell, where a conventional CAM would consume a memory cell for every occurrence, has surprising efficiency.
  • [0089] In the invention, two kinds of identifiers are present: IP addresses and ports. Small “term” CAMs of IP addresses and ports match terms regardless of use as source or destination, or of use in another session. Another benefit is to “compress” an address or port into fewer bits, since the index of each of the term CAMs is smaller than the term width. Furthermore, in the invention, the combination of source/destination address/port matches is matched against yet another small CAM of sessions to in turn locate the index of the session descriptor. As a result, these three small CAMs reduce a 97 bit by 1024 session CAM of 99,328 bits to a fully allocated VxCAM of 61,440 bits.
  • [0090] Limits of the process in accordance with the invention: at least one port per session is required, as is at least one address per set of shared sessions. Hence, in average use the size of the VxCAM of 1024 sessions is greatly reduced, from 61,440 to 23,552 bits. As a result, small memories can be implemented on chip instead of requiring larger off-chip memories with their attendant drawbacks, viz., delays, costs, etc.
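The memory figures quoted above can be checked with simple arithmetic. The sketch below uses only the numbers given in the text; it does not attempt to reconstruct how the VxCAM bits are divided among the three small CAMs, which the text does not state, and the variable names are illustrative.

```python
SESSIONS = 1024

conventional_bits = 97 * SESSIONS  # full 97-bit identifier per session
vxcam_full_bits = 61_440           # fully allocated VxCAM (from the text)
vxcam_average_bits = 23_552        # average use (from the text)

def saving(bits: int) -> float:
    """Fraction of memory saved relative to the conventional CAM."""
    return 1 - bits / conventional_bits

# Equivalent storage per session, in bits
per_session = [b // SESSIONS for b in
               (conventional_bits, vxcam_full_bits, vxcam_average_bits)]
```

Expressed per session, the three figures correspond to 97, 60, and 23 bits, that is, savings of roughly 38% when fully allocated and roughly 76% in average use.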
  • DESCRIPTION OF EMBODIMENTS
  • [0091] As previously mentioned, content addressable memory (CAM) is a hardware method in which incoming data is compared with a set of predetermined patterns to identify a matching pattern, and has been heretofore used by Internet communications devices to partially decode large (32 to 97 bit) identifiers to direct the packet to its destination. As the number of sessions increases, the CAM required to compare and match patterns increases linearly (see FIG. 7). Where off-chip memory is used to process these patterns, inefficiency results.
  • [0092] The VxCAM (Virtual Extensible Content Addressable Memory) of the invention matches a minimum number of a predetermined plurality of patterns, so that fewer memory elements are required (FIG. 8). As a result, the invention is easily implemented on-chip, narrows memory path width, and reduces connection establishment overhead.
  • [0093] The method and system of the invention are shown in detail in FIG. 9.
  • [0094] As shown, as the destination address is latched in the TCP/IP header register (200), the Session Accumulator (206) is cleared. The destination address is gated onto the IP address bus (201) to an Address Term CAM (203) which locates the destination address term. If not found, the packet is signaled as not recognized and the VxCAM ignores all further action. However, if found, the resultant index of the IP address term is passed through the Adder/Mux (205) to the Session Accumulator register (206). Similarly, the source address is gated onto the IP address bus (201) to an Address Term CAM (203) which locates the source address term. If not found, the packet is signaled as not recognized and the VxCAM ignores all further action. If found, the resultant index of the IP address term is accumulated using the Adder/Mux (205) to the Session Accumulator register (206). The destination port address is gated onto the TCP/UDP port bus (202) to an Address Term CAM (204) which locates the destination port address term. If not found, the packet is signaled as not recognized and the VxCAM ignores all further action. If found, the resultant index of the port destination address term is accumulated using the Adder/Mux (205) to the Session Accumulator register (206). The source port address, if present, is gated onto the TCP/UDP port bus (202) to an Address Term CAM (204) which locates the source port address term. If found, the resultant index of the port address term is accumulated using the Adder/Mux (205) to the Session Accumulator register (206). The contents of the Session Accumulator, consisting of the term index of the IP destination address, the term index of the IP source address, the term index of the TCP/UDP destination port address, and the term index of the TCP/UDP source port address if present, are passed to the Session CAM (207) which locates the index of the session descriptor.
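The lookup sequence just described can be modeled in software as a pair of term tables feeding a session table. This is a behavioral sketch only: the class and method names are hypothetical, dictionaries stand in for the CAM hardware, the accumulator is modeled as a tuple of indices rather than an adder/mux result, and for simplicity all four terms are treated as required even though the text allows the source port to be absent.

```python
class TermCam:
    """Maps a term (address or port) to a small index, one entry per distinct term."""

    def __init__(self):
        self._index = {}

    def add(self, term):
        # Reuse the existing index when the term is already stored (term sharing).
        return self._index.setdefault(term, len(self._index))

    def lookup(self, term):
        return self._index.get(term)  # None when the term is unknown

    def __len__(self):
        return len(self._index)


class VxCam:
    def __init__(self):
        self.addresses = TermCam()  # shared by source and destination addresses
        self.ports = TermCam()      # shared by source and destination ports
        self.sessions = {}          # tuple of term indices -> session descriptor

    def add_session(self, dst_ip, src_ip, dst_port, src_port, descriptor):
        key = (self.addresses.add(dst_ip), self.addresses.add(src_ip),
               self.ports.add(dst_port), self.ports.add(src_port))
        self.sessions[key] = descriptor

    def lookup(self, dst_ip, src_ip, dst_port, src_port):
        accumulator = []  # cleared at the start of each lookup
        for cam, term in ((self.addresses, dst_ip), (self.addresses, src_ip),
                          (self.ports, dst_port), (self.ports, src_port)):
            index = cam.lookup(term)
            if index is None:
                return None  # packet not recognized; no further action
            accumulator.append(index)
        return self.sessions.get(tuple(accumulator))
```

Two sessions between the same pair of hosts and the same destination port add only one new port term in this model, which is the sharing effect the description relies on.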
  • [0095] I have discovered that TCP/IP communications involve highly redundant header fields that can be matched more efficiently by factoring out the redundant entries.
  • [0096] While the foregoing description has been with reference to particular embodiments, it will be appreciated by those skilled in the art that changes in these embodiments may be made without departing from the spirit of the invention, the scope of which is defined in the appended claims.

Claims (5)

What is claimed is:
1. A method of matching a predetermined plurality of patterns of a stream-oriented protocol involving a network communications device, for data packets having a formatted header containing information about the packet, the method comprising analyzing packet traffic on the network to identify classes of predictable protocols which characterize a majority of such packets and implementing programmable hardware logic to process such classes of protocols whereby individual data packets are meaningfully associated with stateful stream protocols so that significantly less memory is required, reducing power consumption and increasing connection establishment efficiency.
2. The method of claim 1 wherein said identifying comprises storing in a memory a plurality of predetermined patterns which correspond to said plurality of classes; analyzing the header of a packet to identify a match with a stored pattern; simultaneously with said analyzing, processing the header to determine whether the packet is valid; controlling the programmable logic to process the packet in accordance with the class corresponding to the matched pattern; and processing in said software non-matching and invalid packets.
3. The method of claim 1 wherein said network protocol comprises TCP/IP.
4. In a network communications device, for data packets having a formatted header containing information about the packet, said network communications device comprising a header decoder, term addressable memory, and stateful protocol processing units, the improvement comprising means for matching a predetermined plurality of patterns of a stream-oriented network protocol with fewer memory elements than streams, thereby being more efficient with memory, which results in the ability to support more streams with the same memory resource.
5. The improvement of claim 4 wherein said stream-oriented protocol comprises TCP/IP, and wherein said processing units comprise programmable logic controlled by a state machine, whereby TCP segments can be identified statefully for processing into application memory, thereby allowing higher level stream processing of session data spread across multiple data packets.
US09/756,667 1999-05-17 2001-01-10 Term addressable memory of an accelerator system and method Expired - Lifetime US6768992B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US09/756,667 US6768992B1 (en) 1999-05-17 2001-01-10 Term addressable memory of an accelerator system and method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/147,856 US6173333B1 (en) 1997-07-18 1998-07-17 TCP/IP network accelerator system and method which identifies classes of packet traffic for predictable protocols
US09/756,667 US6768992B1 (en) 1999-05-17 2001-01-10 Term addressable memory of an accelerator system and method

Related Parent Applications (2)

Application Number Title Priority Date Filing Date
US09/147,856 Continuation-In-Part US6173333B1 (en) 1997-07-18 1998-07-17 TCP/IP network accelerator system and method which identifies classes of packet traffic for predictable protocols
US09147856 Continuation-In-Part 1998-07-17

Publications (2)

Publication Number Publication Date
US20010025315A1 (en) 2001-09-27
US6768992B1 (en) 2004-07-27

Family

ID=22523194

Family Applications (1)

Application Number Title Priority Date Filing Date
US09/756,667 Expired - Lifetime US6768992B1 (en) 1999-05-17 2001-01-10 Term addressable memory of an accelerator system and method

Country Status (1)

Country Link
US (1) US6768992B1 (en)

Cited By (57)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20010037406A1 (en) * 1997-10-14 2001-11-01 Philbrick Clive M. Intelligent network storage interface system
US20020059438A1 (en) * 2000-01-21 2002-05-16 Drew Sarkisian Wireless communications invisible proxy and hooking systems and methods
US20020091844A1 (en) * 1997-10-14 2002-07-11 Alacritech, Inc. Network interface device that fast-path processes solicited session layer read commands
US20020095519A1 (en) * 1997-10-14 2002-07-18 Alacritech, Inc. TCP/IP offload device with fast-path TCP ACK generating and transmitting mechanism
US20020120888A1 (en) * 2001-02-14 2002-08-29 Jorg Franke Network co-processor for vehicles
US20020161919A1 (en) * 1997-10-14 2002-10-31 Boucher Laurence B. Fast-path processing for receiving data on TCP connection offload devices
US20030121835A1 (en) * 2001-12-31 2003-07-03 Peter Quartararo Apparatus for and method of sieving biocompatible adsorbent beaded polymers
US20030129405A1 (en) * 2000-10-26 2003-07-10 Yide Zhang Insulator coated magnetic nanoparticulate composites with reduced core loss and method of manufacture thereof
US6658480B2 (en) 1997-10-14 2003-12-02 Alacritech, Inc. Intelligent network interface system and method for accelerated protocol processing
US6687758B2 (en) 2001-03-07 2004-02-03 Alacritech, Inc. Port aggregation for network connections that are offloaded to network interface devices
US6697868B2 (en) 2000-02-28 2004-02-24 Alacritech, Inc. Protocol processing stack for use with intelligent network interface device
US20040054813A1 (en) * 1997-10-14 2004-03-18 Alacritech, Inc. TCP offload network interface device
US20040073703A1 (en) * 1997-10-14 2004-04-15 Alacritech, Inc. Fast-path apparatus for receiving data corresponding a TCP connection
US6751665B2 (en) 2002-10-18 2004-06-15 Alacritech, Inc. Providing window updates from a computer to a network interface device
US6757746B2 (en) 1997-10-14 2004-06-29 Alacritech, Inc. Obtaining a destination address so that a network interface device can write network data without headers directly into host memory
US6807581B1 (en) 2000-09-29 2004-10-19 Alacritech, Inc. Intelligent network storage interface system
US20050097242A1 (en) * 2003-10-30 2005-05-05 International Business Machines Corporation Method and system for internet transport acceleration without protocol offload
US20050122986A1 (en) * 2003-12-05 2005-06-09 Alacritech, Inc. TCP/IP offload device with reduced sequential processing
US20050138655A1 (en) * 2003-12-22 2005-06-23 Randy Zimler Methods, systems and storage medium for managing digital rights of segmented content
US20050177618A1 (en) * 2003-12-22 2005-08-11 Randy Zimler Methods, systems and storage medium for managing bandwidth of segmented content
US20050226242A1 (en) * 2004-03-30 2005-10-13 Parker David K Pipelined packet processor
US7042898B2 (en) 1997-10-14 2006-05-09 Alacritech, Inc. Reducing delays associated with inserting a checksum into a network message
US20070153808A1 (en) * 2005-12-30 2007-07-05 Parker David K Method of providing virtual router functionality
US20080008099A1 (en) * 2004-03-30 2008-01-10 Parker David K Packet processing system architecture and method
US7502374B1 (en) 2004-03-30 2009-03-10 Extreme Networks, Inc. System for deriving hash values for packets in a packet processing system
US7664868B2 (en) 1998-04-27 2010-02-16 Alacritech, Inc. TCP/IP offload network interface device
US7664883B2 (en) 1998-08-28 2010-02-16 Alacritech, Inc. Network interface device that fast-path processes solicited session layer read commands
US7675915B2 (en) 2004-03-30 2010-03-09 Extreme Networks, Inc. Packet processing system architecture and method
US7738500B1 (en) 2005-12-14 2010-06-15 Alacritech, Inc. TCP timestamp synchronization for network connections that are offloaded to network interface devices
US20100211859A1 (en) * 2002-05-31 2010-08-19 Jds Uniphase Corporation Systems and methods for data alignment
US7817633B1 (en) 2005-12-30 2010-10-19 Extreme Networks, Inc. Method of providing virtual router functionality through abstracted virtual identifiers
US7822033B1 (en) 2005-12-30 2010-10-26 Extreme Networks, Inc. MAC address detection device for virtual routers
US7853723B2 (en) 1997-10-14 2010-12-14 Alacritech, Inc. TCP/IP offload network interface device
US7889750B1 (en) 2004-04-28 2011-02-15 Extreme Networks, Inc. Method of extending default fixed number of processing cycles in pipelined packet processor architecture
US20110208871A1 (en) * 2002-01-15 2011-08-25 Intel Corporation Queuing based on packet classification
US8019901B2 (en) 2000-09-29 2011-09-13 Alacritech, Inc. Intelligent network storage interface system
US8161270B1 (en) 2004-03-30 2012-04-17 Extreme Networks, Inc. Packet data modification processor
US8248939B1 (en) 2004-10-08 2012-08-21 Alacritech, Inc. Transferring control of TCP connections between hierarchy of processing mechanisms
US8341286B1 (en) 2008-07-31 2012-12-25 Alacritech, Inc. TCP offload send optimization
US8539513B1 (en) 2008-04-01 2013-09-17 Alacritech, Inc. Accelerating data transfer in a virtual computer system with tightly coupled TCP connections
US8539112B2 (en) 1997-10-14 2013-09-17 Alacritech, Inc. TCP/IP offload device
US8605732B2 (en) 2011-02-15 2013-12-10 Extreme Networks, Inc. Method of providing virtual router functionality
US8621101B1 (en) 2000-09-29 2013-12-31 Alacritech, Inc. Intelligent network storage interface device
US20150110114A1 (en) * 2013-10-17 2015-04-23 Marvell Israel (M.I.S.L) Ltd. Processing Concurrency in a Network Device
US9047417B2 (en) 2012-10-29 2015-06-02 Intel Corporation NUMA aware network interface
US9055104B2 (en) 2002-04-22 2015-06-09 Alacritech, Inc. Freeing transmit memory on a network interface device prior to receiving an acknowledgment that transmit data has been received by a remote device
US20150234841A1 (en) * 2014-02-20 2015-08-20 Futurewei Technologies, Inc. System and Method for an Efficient Database Storage Model Based on Sparse Files
US9306793B1 (en) 2008-10-22 2016-04-05 Alacritech, Inc. TCP offload device that batches session layer headers to reduce interrupts as well as CPU copies
US9455907B1 (en) 2012-11-29 2016-09-27 Marvell Israel (M.I.S.L) Ltd. Multithreaded parallel packet processing in network devices
US10684973B2 (en) 2013-08-30 2020-06-16 Intel Corporation NUMA node peripheral switch
US10929152B2 (en) * 2003-05-23 2021-02-23 Ip Reservoir, Llc Intelligent data storage and processing using FPGA devices
US10957423B2 (en) 2005-03-03 2021-03-23 Washington University Method and apparatus for performing similarity searching
US10963962B2 (en) 2012-03-27 2021-03-30 Ip Reservoir, Llc Offload processing of data packets containing financial market data
CN113645256A (en) * 2021-10-13 2021-11-12 成都数默科技有限公司 Aggregation method without reducing TCP session data value density
US11397985B2 (en) 2010-12-09 2022-07-26 Exegy Incorporated Method and apparatus for managing orders in financial markets
US11436672B2 (en) 2012-03-27 2022-09-06 Exegy Incorporated Intelligent switch for processing financial market data
US11449538B2 (en) 2006-11-13 2022-09-20 Ip Reservoir, Llc Method and system for high performance integration, processing and searching of structured and unstructured data

Families Citing this family (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9444785B2 (en) 2000-06-23 2016-09-13 Cloudshield Technologies, Inc. Transparent provisioning of network access to an application
US7003555B1 (en) 2000-06-23 2006-02-21 Cloudshield Technologies, Inc. Apparatus and method for domain name resolution
US7032031B2 (en) * 2000-06-23 2006-04-18 Cloudshield Technologies, Inc. Edge adapter apparatus and method
US8204082B2 (en) 2000-06-23 2012-06-19 Cloudshield Technologies, Inc. Transparent provisioning of services over a network
US7069375B2 (en) * 2001-05-17 2006-06-27 Decru, Inc. Stream-oriented interconnect for networked computer storage
US7225188B1 (en) * 2002-02-13 2007-05-29 Cisco Technology, Inc. System and method for performing regular expression matching with high parallelism
US20050232303A1 (en) * 2002-04-26 2005-10-20 Koen Deforche Efficient packet processing pipeline device and method
JP2003324464A (en) * 2002-04-30 2003-11-14 Fujitsu Ltd Data search apparatus and data search method
US8015303B2 (en) * 2002-08-02 2011-09-06 Astute Networks Inc. High data rate stateful protocol processing
US7313667B1 (en) * 2002-08-05 2007-12-25 Cisco Technology, Inc. Methods and apparatus for mapping fields of entries into new values and combining these mapped values into mapped entries for use in lookup operations such as for packet processing
US8151278B1 (en) 2002-10-17 2012-04-03 Astute Networks, Inc. System and method for timer management in a stateful protocol processing system
US7814218B1 (en) 2002-10-17 2010-10-12 Astute Networks, Inc. Multi-protocol and multi-format stateful processing
GB0408870D0 (en) * 2004-04-21 2004-05-26 Level 5 Networks Ltd Processsing packet headers
GB0408868D0 (en) 2004-04-21 2004-05-26 Level 5 Networks Ltd Checking data integrity
CN100486211C (en) * 2005-01-31 2009-05-06 国际商业机器公司 Group classifying method based on regular collection division for use in internet
GB0506403D0 (en) 2005-03-30 2005-05-04 Level 5 Networks Ltd Routing tables
EP1861778B1 (en) 2005-03-10 2017-06-21 Solarflare Communications Inc Data processing system
GB0505300D0 (en) 2005-03-15 2005-04-20 Level 5 Networks Ltd Transmitting data
GB0600417D0 (en) 2006-01-10 2006-02-15 Level 5 Networks Inc Virtualisation support
US7953895B1 (en) * 2007-03-07 2011-05-31 Juniper Networks, Inc. Application identification
US20090028150A1 (en) * 2007-07-26 2009-01-29 Telefonaktiebolaget L M Ericsson (Publ) Protocol-Independent Packet Header Analysis
CN101739298B (en) * 2008-11-27 2013-07-31 国际商业机器公司 Shared cache management method and system
US20100182970A1 (en) * 2009-01-21 2010-07-22 Qualcomm Incorporated Multiple Subscriptions Using a Single Air-Interface Resource
US9038073B2 (en) * 2009-08-13 2015-05-19 Qualcomm Incorporated Data mover moving data to accelerator for processing and returning result data based on instruction received from a processor utilizing software and hardware interrupts
US8762532B2 (en) * 2009-08-13 2014-06-24 Qualcomm Incorporated Apparatus and method for efficient memory allocation
US20110041128A1 (en) * 2009-08-13 2011-02-17 Mathias Kohlenz Apparatus and Method for Distributed Data Processing
US8788782B2 (en) 2009-08-13 2014-07-22 Qualcomm Incorporated Apparatus and method for memory management and efficient data processing
US8887284B2 (en) * 2011-02-10 2014-11-11 Circumventive, LLC Exfiltration testing and extrusion assessment
TWI459763B (en) * 2011-03-23 2014-11-01 Mediatek Inc Method for packet segmentation offload and the apparatus using the same
US8977704B2 (en) * 2011-12-29 2015-03-10 Nokia Corporation Method and apparatus for flexible caching of delivered media
US9401968B2 (en) 2012-01-20 2016-07-26 Nokia Techologies Oy Method and apparatus for enabling pre-fetching of media
US11170294B2 (en) 2016-01-07 2021-11-09 Intel Corporation Hardware accelerated machine learning
US10817802B2 (en) 2016-05-07 2020-10-27 Intel Corporation Apparatus for hardware accelerated machine learning
US11120329B2 (en) 2016-05-07 2021-09-14 Intel Corporation Multicast network and memory transfer optimizations for neural network hardware acceleration

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5022046A (en) * 1989-04-14 1991-06-04 The United States Of America As Represented By The Secretary Of The Air Force Narrowband/wideband packet data communication system
US5421030A (en) * 1991-09-17 1995-05-30 Com21, Inc. Communications system and method for bi-directional communications between an upstream control facility and downstream user terminals
US5787115A (en) * 1995-12-28 1998-07-28 Northern Telecom Limited Key telephone system without common control
US5916305A (en) * 1996-11-05 1999-06-29 Shomiti Systems, Inc. Pattern recognition in data communications using predictive parsers
US6088356A (en) * 1997-06-30 2000-07-11 Sun Microsystems, Inc. System and method for a multi-layer network element
AU8490898A (en) * 1997-07-18 1999-02-10 Interprophet Corporation Tcp/ip network accelerator system and method
US6304903B1 (en) * 1997-08-01 2001-10-16 Agilent Technologies, Inc. State machine for collecting information on use of a packet network
US6223172B1 (en) * 1997-10-31 2001-04-24 Nortel Networks Limited Address routing using address-sensitive mask decimation scheme
US6549519B1 (en) * 1998-01-23 2003-04-15 Alcatel Internetworking (Pe), Inc. Network switching device with pipelined search engines
US6430527B1 (en) * 1998-05-06 2002-08-06 Avici Systems Prefix search circuitry and method
US6526066B1 (en) * 1998-07-16 2003-02-25 Nortel Networks Limited Apparatus for classifying a packet within a data stream in a computer network
US6321269B1 (en) * 1998-12-29 2001-11-20 Apple Computer, Inc. Optimized performance for transaction-oriented communications using stream-based network protocols
US6510509B1 (en) * 1999-03-29 2003-01-21 Pmc-Sierra Us, Inc. Method and apparatus for high-speed network rule processing
CA2291310C (en) * 1999-11-30 2007-04-10 Mosaid Technologies Inc. Generating searchable data entries and applications therefore
US6278289B1 (en) * 2000-05-01 2001-08-21 Xilinx, Inc. Content-addressable memory implemented using programmable logic
US6564214B1 (en) * 2000-06-28 2003-05-13 Visual Networks Technologies, Inc. Method of searching a data record for a valid identifier

Cited By (101)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8805948B2 (en) 1997-10-14 2014-08-12 A-Tech Llc Intelligent network interface system and method for protocol processing
US8856379B2 (en) 1997-10-14 2014-10-07 A-Tech Llc Intelligent network interface system and method for protocol processing
US20020091844A1 (en) * 1997-10-14 2002-07-11 Alacritech, Inc. Network interface device that fast-path processes solicited session layer read commands
US20020095519A1 (en) * 1997-10-14 2002-07-18 Alacritech, Inc. TCP/IP offload device with fast-path TCP ACK generating and transmitting mechanism
US7844743B2 (en) 1997-10-14 2010-11-30 Alacritech, Inc. Protocol stack that offloads a TCP connection from a host computer to a network interface device
US20020161919A1 (en) * 1997-10-14 2002-10-31 Boucher Laurence B. Fast-path processing for receiving data on TCP connection offload devices
US9009223B2 (en) 1997-10-14 2015-04-14 Alacritech, Inc. Method and apparatus for processing received network packets on a network interface for a computer
US20010037406A1 (en) * 1997-10-14 2001-11-01 Philbrick Clive M. Intelligent network storage interface system
US6658480B2 (en) 1997-10-14 2003-12-02 Alacritech, Inc. Intelligent network interface system and method for accelerated protocol processing
US8782199B2 (en) 1997-10-14 2014-07-15 A-Tech Llc Parsing a packet header
US8631140B2 (en) 1997-10-14 2014-01-14 Alacritech, Inc. Intelligent network interface system and method for accelerated protocol processing
US8539112B2 (en) 1997-10-14 2013-09-17 Alacritech, Inc. TCP/IP offload device
US20040073703A1 (en) * 1997-10-14 2004-04-15 Alacritech, Inc. Fast-path apparatus for receiving data corresponding a TCP connection
US8447803B2 (en) 1997-10-14 2013-05-21 Alacritech, Inc. Method and apparatus for distributing network traffic processing on a multiprocessor computer
US20040117509A1 (en) * 1997-10-14 2004-06-17 Alacritech, Inc. Protocol processing stack for use with intelligent network interface device
US6757746B2 (en) 1997-10-14 2004-06-29 Alacritech, Inc. Obtaining a destination address so that a network interface device can write network data without headers directly into host memory
US8131880B2 (en) 1997-10-14 2012-03-06 Alacritech, Inc. Intelligent network interface device and system for accelerated communication
US7945699B2 (en) 1997-10-14 2011-05-17 Alacritech, Inc. Obtaining a destination address so that a network interface device can write network data without headers directly into host memory
US7853723B2 (en) 1997-10-14 2010-12-14 Alacritech, Inc. TCP/IP offload network interface device
US20040054813A1 (en) * 1997-10-14 2004-03-18 Alacritech, Inc. TCP offload network interface device
US7809847B2 (en) 1997-10-14 2010-10-05 Alacritech, Inc. Network interface device that can transfer control of a TCP connection to a host CPU
US7694024B2 (en) 1997-10-14 2010-04-06 Alacritech, Inc. TCP/IP offload device with fast-path TCP ACK generating and transmitting mechanism
US7673072B2 (en) 1997-10-14 2010-03-02 Alacritech, Inc. Fast-path apparatus for transmitting data corresponding to a TCP connection
US7284070B2 (en) 1997-10-14 2007-10-16 Alacritech, Inc. TCP offload network interface device
US7237036B2 (en) 1997-10-14 2007-06-26 Alacritech, Inc. Fast-path apparatus for receiving data corresponding a TCP connection
US7042898B2 (en) 1997-10-14 2006-05-09 Alacritech, Inc. Reducing delays associated with inserting a checksum into a network message
US7076568B2 (en) 1997-10-14 2006-07-11 Alacritech, Inc. Data communication apparatus for computer intelligent network interface card which transfers data between a network and a storage device according designated uniform datagram protocol socket
US7664868B2 (en) 1998-04-27 2010-02-16 Alacritech, Inc. TCP/IP offload network interface device
US7664883B2 (en) 1998-08-28 2010-02-16 Alacritech, Inc. Network interface device that fast-path processes solicited session layer read commands
US20020059438A1 (en) * 2000-01-21 2002-05-16 Drew Sarkisian Wireless communications invisible proxy and hooking systems and methods
US7512694B2 (en) * 2000-01-21 2009-03-31 Bytemobile, Inc. Wireless communications invisible proxy and hooking systems and methods
US6697868B2 (en) 2000-02-28 2004-02-24 Alacritech, Inc. Protocol processing stack for use with intelligent network interface device
US8019901B2 (en) 2000-09-29 2011-09-13 Alacritech, Inc. Intelligent network storage interface system
US8621101B1 (en) 2000-09-29 2013-12-31 Alacritech, Inc. Intelligent network storage interface device
US6807581B1 (en) 2000-09-29 2004-10-19 Alacritech, Inc. Intelligent network storage interface system
US20030129405A1 (en) * 2000-10-26 2003-07-10 Yide Zhang Insulator coated magnetic nanoparticulate composites with reduced core loss and method of manufacture thereof
US20020120888A1 (en) * 2001-02-14 2002-08-29 Jorg Franke Network co-processor for vehicles
US7260668B2 (en) * 2001-02-14 2007-08-21 Micronas Gmbh Network co-processor for vehicles
US6687758B2 (en) 2001-03-07 2004-02-03 Alacritech, Inc. Port aggregation for network connections that are offloaded to network interface devices
US6938092B2 (en) 2001-03-07 2005-08-30 Alacritech, Inc. TCP offload device that load balances and fails-over between aggregated ports having different MAC addresses
US20030121835A1 (en) * 2001-12-31 2003-07-03 Peter Quartararo Apparatus for and method of sieving biocompatible adsorbent beaded polymers
US8493852B2 (en) * 2002-01-15 2013-07-23 Intel Corporation Packet aggregation
US20110208874A1 (en) * 2002-01-15 2011-08-25 Intel Corporation Packet aggregation
US20110208871A1 (en) * 2002-01-15 2011-08-25 Intel Corporation Queuing based on packet classification
US8730984B2 (en) 2002-01-15 2014-05-20 Intel Corporation Queuing based on packet classification
US9055104B2 (en) 2002-04-22 2015-06-09 Alacritech, Inc. Freeing transmit memory on a network interface device prior to receiving an acknowledgment that transmit data has been received by a remote device
US8326988B2 (en) 2002-05-31 2012-12-04 Jds Uniphase Corporation Systems and methods for data alignment
US20100211859A1 (en) * 2002-05-31 2010-08-19 Jds Uniphase Corporation Systems and methods for data alignment
US6751665B2 (en) 2002-10-18 2004-06-15 Alacritech, Inc. Providing window updates from a computer to a network interface device
US10929152B2 (en) * 2003-05-23 2021-02-23 Ip Reservoir, Llc Intelligent data storage and processing using FPGA devices
US11275594B2 (en) 2003-05-23 2022-03-15 Ip Reservoir, Llc Intelligent data storage and processing using FPGA devices
US20080165784A1 (en) * 2003-10-30 2008-07-10 International Business Machines Corporation Method And System For Internet Transport Acceleration Without Protocol Offload
US20050097242A1 (en) * 2003-10-30 2005-05-05 International Business Machines Corporation Method and system for internet transport acceleration without protocol offload
US20050122986A1 (en) * 2003-12-05 2005-06-09 Alacritech, Inc. TCP/IP offload device with reduced sequential processing
US6996070B2 (en) * 2003-12-05 2006-02-07 Alacritech, Inc. TCP/IP offload device with reduced sequential processing
WO2005057945A3 (en) * 2003-12-05 2005-09-09 Alacritech Inc Tcp/ip offload device with reduced sequential processing
US20050138655A1 (en) * 2003-12-22 2005-06-23 Randy Zimler Methods, systems and storage medium for managing digital rights of segmented content
US20050177618A1 (en) * 2003-12-22 2005-08-11 Randy Zimler Methods, systems and storage medium for managing bandwidth of segmented content
US7675915B2 (en) 2004-03-30 2010-03-09 Extreme Networks, Inc. Packet processing system architecture and method
US8161270B1 (en) 2004-03-30 2012-04-17 Extreme Networks, Inc. Packet data modification processor
US7554978B1 (en) * 2004-03-30 2009-06-30 Extreme Networks, Inc. System for accessing content-addressable memory in packet processor
US7580350B1 (en) 2004-03-30 2009-08-25 Extreme Networks, Inc. System for deriving packet quality of service indicator
US7606263B1 (en) 2004-03-30 2009-10-20 Extreme Networks, Inc. Packet parser
US7522516B1 (en) 2004-03-30 2009-04-21 Extreme Networks, Inc. Exception handling system for packet processing system
US20050226242A1 (en) * 2004-03-30 2005-10-13 Parker David K Pipelined packet processor
US20080008099A1 (en) * 2004-03-30 2008-01-10 Parker David K Packet processing system architecture and method
US7646770B1 (en) 2004-03-30 2010-01-12 Extreme Networks, Inc. Systems for supporting packet processing operations
US7822038B2 (en) 2004-03-30 2010-10-26 Extreme Networks, Inc. Packet processing system architecture and method
US7502374B1 (en) 2004-03-30 2009-03-10 Extreme Networks, Inc. System for deriving hash values for packets in a packet processing system
US7936687B1 (en) 2004-03-30 2011-05-03 Extreme Networks, Inc. Systems for statistics gathering and sampling in a packet processing system
US8924694B2 (en) 2004-03-30 2014-12-30 Extreme Networks, Inc. Packet data modification processor
US7649879B2 (en) 2004-03-30 2010-01-19 Extreme Networks, Inc. Pipelined packet processor
US7889750B1 (en) 2004-04-28 2011-02-15 Extreme Networks, Inc. Method of extending default fixed number of processing cycles in pipelined packet processor architecture
US8248939B1 (en) 2004-10-08 2012-08-21 Alacritech, Inc. Transferring control of TCP connections between hierarchy of processing mechanisms
US10957423B2 (en) 2005-03-03 2021-03-23 Washington University Method and apparatus for performing similarity searching
US7738500B1 (en) 2005-12-14 2010-06-15 Alacritech, Inc. TCP timestamp synchronization for network connections that are offloaded to network interface devices
US7822033B1 (en) 2005-12-30 2010-10-26 Extreme Networks, Inc. MAC address detection device for virtual routers
US7894451B2 (en) 2005-12-30 2011-02-22 Extreme Networks, Inc. Method of providing virtual router functionality
US20070153808A1 (en) * 2005-12-30 2007-07-05 Parker David K Method of providing virtual router functionality
US7817633B1 (en) 2005-12-30 2010-10-19 Extreme Networks, Inc. Method of providing virtual router functionality through abstracted virtual identifiers
US11449538B2 (en) 2006-11-13 2022-09-20 Ip Reservoir, Llc Method and system for high performance integration, processing and searching of structured and unstructured data
US8893159B1 (en) 2008-04-01 2014-11-18 Alacritech, Inc. Accelerating data transfer in a virtual computer system with tightly coupled TCP connections
US8539513B1 (en) 2008-04-01 2013-09-17 Alacritech, Inc. Accelerating data transfer in a virtual computer system with tightly coupled TCP connections
US9667729B1 (en) 2008-07-31 2017-05-30 Alacritech, Inc. TCP offload send optimization
US8341286B1 (en) 2008-07-31 2012-12-25 Alacritech, Inc. TCP offload send optimization
US9413788B1 (en) 2008-07-31 2016-08-09 Alacritech, Inc. TCP offload send optimization
US9306793B1 (en) 2008-10-22 2016-04-05 Alacritech, Inc. TCP offload device that batches session layer headers to reduce interrupts as well as CPU copies
US11803912B2 (en) 2010-12-09 2023-10-31 Exegy Incorporated Method and apparatus for managing orders in financial markets
US11397985B2 (en) 2010-12-09 2022-07-26 Exegy Incorporated Method and apparatus for managing orders in financial markets
US8605732B2 (en) 2011-02-15 2013-12-10 Extreme Networks, Inc. Method of providing virtual router functionality
US11436672B2 (en) 2012-03-27 2022-09-06 Exegy Incorporated Intelligent switch for processing financial market data
US10963962B2 (en) 2012-03-27 2021-03-30 Ip Reservoir, Llc Offload processing of data packets containing financial market data
US9047417B2 (en) 2012-10-29 2015-06-02 Intel Corporation NUMA aware network interface
US9455907B1 (en) 2012-11-29 2016-09-27 Marvell Israel (M.I.S.L) Ltd. Multithreaded parallel packet processing in network devices
US10684973B2 (en) 2013-08-30 2020-06-16 Intel Corporation NUMA node peripheral switch
US11593292B2 (en) 2013-08-30 2023-02-28 Intel Corporation Many-to-many PCIe switch
US9467399B2 (en) * 2013-10-17 2016-10-11 Marvell World Trade Ltd. Processing concurrency in a network device
US20150110114A1 (en) * 2013-10-17 2015-04-23 Marvell Israel (M.I.S.L) Ltd. Processing Concurrency in a Network Device
US9461939B2 (en) 2013-10-17 2016-10-04 Marvell World Trade Ltd. Processing concurrency in a network device
US20150234841A1 (en) * 2014-02-20 2015-08-20 Futurewei Technologies, Inc. System and Method for an Efficient Database Storage Model Based on Sparse Files
CN113645256A (en) * 2021-10-13 2021-11-12 成都数默科技有限公司 Aggregation method without reducing TCP session data value density

Also Published As

Publication number Publication date
US6768992B1 (en) 2004-07-27

Similar Documents

Publication Publication Date Title
US6768992B1 (en) Term addressable memory of an accelerator system and method
US6173333B1 (en) TCP/IP network accelerator system and method which identifies classes of packet traffic for predictable protocols
US6952409B2 (en) Accelerator system and method
US7773599B1 (en) Packet fragment handling
US7916632B1 (en) Systems and methods for handling packet fragmentation
JP3832816B2 (en) Network processor, memory configuration and method
JP3872342B2 (en) Device for network and scalable network processor
US6430184B1 (en) System and process for high-speed pattern matching for application-level switching of data packets
US5838904A (en) Random number generating apparatus for an interface unit of a carrier sense with multiple access and collision detect (CSMA/CD) ethernet data network
US5963543A (en) Error detection and correction apparatus for an asynchronous transfer mode (ATM) network device
JP4066382B2 (en) Network switch and component and method of operation
US6226267B1 (en) System and process for application-level flow connection of data processing networks
US6629125B2 (en) Storing a frame header
US6715002B2 (en) Watermark for additional data burst into buffer memory
JP3807980B2 (en) Network processor processing complex and method
Davie The architecture and implementation of a high-speed host interface
US7936758B2 (en) Logical separation and accessing of descriptor memories
US6988235B2 (en) Checksum engine and a method of operation thereof
US7680116B1 (en) Optimized buffer loading for packet header processing
US20040125751A1 (en) Network protocol off-load engines
JPH10505977A (en) Asynchronous transfer mode adapter for desktop
WO2000001121A1 (en) Two-dimensional queuing/de-queuing methods and systems for implementing the same
WO1999053406A2 (en) High-speed data bus for network switching
US20040025105A1 (en) CRC calculation system for a packet arriving on an n-byte wide bus and a method of calculation thereof
US7239630B1 (en) Dedicated processing resources for packet header generation

Legal Events

Date Code Title Description
STCF Information on status: patent grant

Free format text: PATENTED CASE

FPAY Fee payment

Year of fee payment: 4

FEPP Fee payment procedure

Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: SMALL ENTITY

REMI Maintenance fee reminder mailed

FPAY Fee payment

Year of fee payment: 8

SULP Surcharge for late payment

Year of fee payment: 7

FPAY Fee payment

Year of fee payment: 12