US20040062245A1 - TCP/IP offload device - Google Patents

TCP/IP offload device

Info

Publication number
US20040062245A1
Authority
US
United States
Prior art keywords
tcp
processor
sram
packet
hash
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US10/420,364
Other versions
US7496689B2
Inventor
Colin Sharp
Clive Philbrick
Daryl Starr
Stephen Blightman
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alacritech Inc
Original Assignee
Alacritech Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alacritech Inc
Priority to US10/420,364
Assigned to ALACRITECH, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: BLIGHTMAN, STEPHEN E.J., PHILBRICK, CLIVE M., SHARP, COLIN C., STARR, DARYL D.
Publication of US20040062245A1
Application granted
Publication of US7496689B2
Assigned to A-TECH LLC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: ALACRITECH INC.
Assigned to ALACRITECH, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: A-TECH LLC
Expired - Fee Related

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 69/00: Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L 69/16: Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 69/00: Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L 69/10: Streamlined, light-weight or high-speed protocols, e.g. express transfer protocol [XTP] or byte stream
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 69/00: Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L 69/16: Implementation or adaptation of Internet protocol [IP], of transmission control protocol [TCP] or of user datagram protocol [UDP]
    • H04L 69/163: In-band adaptation of TCP data exchange; In-band control procedures

Definitions

  • The Compact Disc Appendix, which is a part of the present disclosure, includes a recordable Compact Disc (CD-R) containing information that is part of the disclosure of the present patent document. All the material on the Compact Disc is hereby expressly incorporated by reference into the present application.
  • A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner of that material has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights.
  • FIG. 1 is a diagram of a system 1 in accordance with one embodiment of the present invention.
  • FIG. 2 is a simplified diagram of various structures and steps involved in the processing of an incoming packet in accordance with an embodiment of the present invention.
  • FIG. 3 is a flowchart of a method in accordance with an embodiment of the present invention.
  • FIGS. 4, 5, 6, 7, 8 and 9 are diagrams that illustrate various system configurations involving a network interface device in accordance with the present invention.
  • FIG. 1 is a simplified diagram of a system 1 in accordance with a first embodiment.
  • System 1 is coupled to a packet-switched network 2 .
  • Network 2 can, for example, be a local area network (LAN) and/or a collection of networks.
  • Network 2 can, for example, be the Internet.
  • Network 2 can, for example, be an IP-based SAN that runs iSCSI.
  • Network 2 may, for example, be coupled to system 1 via media that communicates electrical signals, via fiber optic cables, and/or via a wireless communication channel.
  • System 1 includes a network interface device (NID) 3 as well as a central processing unit (CPU) 4 .
  • CPU 4 executes software stored in storage 5 .
  • NID 3 is coupled to CPU 4 and storage 5 via host bus 6 , a bridge 7 , and local bus 8 .
  • Host bus 6 may, for example, be a PCI bus or another computer expansion bus.
  • NID 3 includes an application specific integrated circuit (ASIC) 9 , an amount of dynamic random access memory (DRAM) 10 , and Physical Layer Interface (PHY) circuitry 11 .
  • NID 3 includes specialized protocol accelerating hardware for implementing “fast-path” processing whereby certain types of network communications are accelerated in comparison to “slow-path” processing whereby the remaining types of network communications are handled at least in part by a software protocol processing stack.
  • the certain types of network communications accelerated are TCP/IP communications.
  • the embodiment of NID 3 illustrated in FIG. 1 is therefore sometimes called a TCP/IP Offload Engine (TOE).
  • System 1 of FIG. 1 employs techniques set forth in these documents for transferring control of TCP/IP connections between a protocol processing stack and a network interface device.
  • NID 3 includes Media Access Control circuitry 12 , three processors 13 - 15 , a pair of Content Addressable Memories (CAMs) 16 and 17 , an amount of Static Random Access Memory (SRAM) 18 , queue manager circuitry 19 , a receive processor 20 , and a transmit sequencer 21 .
  • Receive processor 20 executes code stored in its own control store 22 .
  • NID 3 includes a processor 23 .
  • Processor 23 may, for example, be a general purpose microprocessor.
  • Processor 23 performs slow-path processing such as TCP error condition handling and exception condition handling.
  • processor 23 also performs higher layer protocol processing such as, for example, iSCSI layer protocol processing such that NID 3 offloads CPU 4 of all iSCSI protocol processing tasks.
  • CPU 4 executes code that implements a file system
  • processor 23 executes code that implements a protocol processing stack that includes an iSCSI protocol processing layer.
  • DRAM 10 is initially partitioned to include a plurality of buffers.
  • Receive processor 20 uses the buffers in DRAM 10 to store incoming network packet data as well as status information for the packet. For each buffer, a 32-bit buffer descriptor is created. Each 32-bit buffer descriptor indicates the size of the associated buffer and the location in DRAM of the associated buffer. The location is indicated by a 19-bit pointer.
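The 32-bit buffer descriptor lends itself to simple bit packing. The C sketch below shows one plausible layout; the text specifies only that a descriptor carries a buffer size and a 19-bit DRAM pointer, so the exact bit positions and the size encoding are assumptions.

    /* Hypothetical packing of the 32-bit buffer descriptor described above.
     * Only the 19-bit DRAM pointer and the presence of a size field are given
     * in the text; the layout below is an assumption for illustration. */
    #include <stdint.h>

    #define BUFDESC_PTR_BITS 19u
    #define BUFDESC_PTR_MASK ((1u << BUFDESC_PTR_BITS) - 1u)

    /* Pack a size code and a 19-bit DRAM pointer into one 32-bit descriptor. */
    static inline uint32_t bufdesc_pack(uint32_t size_code, uint32_t dram_ptr_19)
    {
        return (size_code << BUFDESC_PTR_BITS) | (dram_ptr_19 & BUFDESC_PTR_MASK);
    }

    static inline uint32_t bufdesc_dram_ptr(uint32_t desc)  { return desc & BUFDESC_PTR_MASK; }
    static inline uint32_t bufdesc_size_code(uint32_t desc) { return desc >> BUFDESC_PTR_BITS; }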
  • the buffer descriptors for the free buffers are pushed onto a “free-buffer queue” 24 . This is accomplished by writing the buffer descriptors to queue manager 19 . Queue manager 19 maintains multiple queues including the “free-buffer queue” 24 . In this implementation, the heads and tails of the various queues are located in SRAM 18 , whereas the middle portions of the queues are located in DRAM 10 .
  • the TCP/IP packet is received from the network 2 via Physical Layer Interface (PHY) circuitry 11 and MAC circuitry 12 .
  • the MAC circuitry 12 verifies checksums in the packet and generates “status” information (also called “protocol analyzer status”).
  • After all the packet data has been received, the MAC circuitry 12 generates “final packet status” (MAC packet status).
  • the MAC packet status information is then transferred to a free one of the DRAM buffers obtained from the free-buffer queue 24 .
  • the status information and MAC packet status information is stored prepended to the associated data in the buffer.
  • receive processor 20 pushes a “receive packet descriptor” (also called a “summary”) onto a “receive packet descriptor” queue 25 .
  • the “receive packet descriptor” includes a 14-bit hash value, the buffer descriptor, a buffer load-count, the MAC ID, and a status bit (also called an “attention bit”).
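A C struct gives a compact picture of the summary pushed onto the receive packet descriptor queue. Only the 14-bit hash, the 32-bit buffer descriptor, and the single attention bit have widths stated in the text; the load-count and MAC ID widths below are placeholders.

    /* Sketch of the "receive packet descriptor" (summary); field widths not
     * stated in the text are assumptions. */
    #include <stdint.h>

    struct rcv_pkt_desc {
        uint32_t buf_desc;        /* 32-bit buffer descriptor (size + 19-bit DRAM pointer) */
        uint16_t hash14;          /* 14-bit hash of the TCP/IP source/destination fields   */
        uint8_t  buf_load_count;  /* width assumed */
        uint8_t  mac_id;          /* width assumed */
        uint8_t  attention;       /* 1 = not a fast-path candidate, 0 = fast-path candidate */
    };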
  • the 14-bit hash value was previously generated by the receive processor 20 (from the TCP and IP source and destination addresses) as the packet was received.
  • the attention bit of the receive packet descriptor is a one, then the packet is not a “fast-path candidate”; whereas if the attention bit is a zero, then the packet is a “fast-path candidate”.
  • the attention bit being a zero indicates that the packet employs both the TCP protocol and the IP protocol.
  • the processor determines that the packet is not a “fast-path candidate” and the packet is handled in “slow-path”.
  • the packet is a TCP/IP packet
  • the attention bit indicates the packet is not a “fast-path candidate”
  • NID 3 performs full offload TCP/IP functions
  • general purpose processor 23 performs further protocol processing on the packet (headers and data).
  • the entire packet (headers and data) are transferred from the DRAM buffer and across host bus 6 such that CPU 4 performs further protocol processing on the packet.
  • If the processor determines that the packet is a “fast-path candidate”, then the processor uses the buffer descriptor from the “receive packet descriptor” to initiate a DMA transfer of the first approximately 96 bytes of information from the pointed-to buffer in DRAM 10 into a portion of SRAM 18 so that the processor can examine it.
  • This first approximately 96 bytes contains the IP source address of the IP header, the IP destination address of the IP header, the TCP source address (“TCP source port”) of the TCP header, and the TCP destination address (“TCP destination port”) of the TCP header.
  • the IP source address of the IP header, the IP destination address of the IP header, the TCP source address of the TCP header, and the TCP destination address of the TCP header together uniquely define a single “connection context” with which the packet is associated.
  • the processor uses the 14-bit hash from the “receive packet descriptor” to identify the connection context of the packet and to determine whether the connection context is one of a plurality of connection contexts that are under the control of NID 3 .
  • the hash points to one hash bucket in a hash table 104 in SRAM 18 .
  • each row of the hash table 104 is a hash bucket.
  • Each hash bucket contains one or more hash table entries.
  • the processor attempts to match the IP source address, IP destination address, TCP source address (port), and TCP destination address (port) retrieved from DRAM with the same fields, i.e., the IP source address, IP destination address, TCP source port, and TCP destination port of each hash table entry.
  • the hash table entries in the hash bucket are searched one by one in this manner until the processor finds a match.
  • a number stored in the hash table entry (called a “transmit control block number” or “TCB number”) identifies a block of information (called a TCB) related to the connection context of the packet.
  • connection context is determined not to be one of the contexts under the control of NID 3 , then the “fast-path candidate” packet is determined not to be an actual “fast-path packet.”
  • NID 3 includes general purpose processor 23 and where NID 3 performs full TCP/IP offload functions, processor 23 performs further TCP/IP protocol processing on the packet.
  • NID 3 performs partial TCP/IP offload functions, the entire packet (headers and data) is transferred across host bus 6 for further TCP/IP protocol processing by the sequential protocol processing stack of CPU 4 .
  • connection context is one of the connection contexts under control of NID 3
  • software executed by the processor ( 13 or 14 ) checks for one of numerous exception conditions and determines whether the packet is a “fast-path packet” or is not a “fast-path packet”.
  • the processor determines that the “fast-path candidate” is not a “fast-path packet.” In such a case, the connection context for the packet is “flushed” (control of the connection context is passed back to the stack) so that the connection context is no longer present in the list of connection contexts under control of NID 3 . If NID 3 is a full TCP/IP offload device including general purpose processor 23 , then general purpose processor 23 performs further TCP/IP processing on the packet. In other embodiments where NID 3 performs partial TCP/IP offload functions and NID 3 includes no general purpose processor 23 , the entire packet (headers and data) is transferred across host bus 6 to CPU 4 for further “slow-path” protocol processing.
  • the processor finds no such exception condition, then the “fast-path candidate” packet is determined to be an actual “fast-path packet”.
  • the processor executes a software state machine such that the packet is processed in accordance with the IP and TCP protocols.
  • the data portion of the packet is then DMA transferred to a destination identified by another device or processor.
  • the destination is located in storage 5 and the destination is identified by a file system controlled by CPU 4 .
  • CPU 4 does no or very little analysis of the TCP and IP headers on this “fast-path packet”. All or substantially all analysis of the TCP and IP headers of the “fast-path packet” is done on NID 3.
Description Of A TCB Lookup Method:
  • An incoming packet is analyzed to determine whether it is associated with a connection context that is under the control of NID 3 . If the packet is associated with a connection context under the control of NID 3 , then a TCB lookup method is employed to find the TCB for the connection context. This lookup method is described in further detail in connection with FIGS. 2 and 3.
  • NID 3 is a multi-receive processor network interface device. In NID 3 , up to sixteen different incoming packets can be in process at the same time by two processors 13 and 14 .
  • processor 15 is a utility processor, but each of processors 13 and 14 can perform receive processing or transmit processing.
  • a processor executes a software state machine to process the packet. As the packet is processed, the state machine transitions from state to state. One of the processors, for example processor 13 , can work on one of the packets being received until it reaches a stopping point. Processor 13 then stops work and stores the state of the software state machine. This stored state is called a “processor context”.
  • processor 14 retrieves the prior state of the state machine from the previous “processor context”, loads this state information into its software state machine, and then continues processing the packet through the state machine from that point. In this way, up to sixteen different flows can be processed by the two processors 13 and 14 working in concert.
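Conceptually, the hand-off between the two processors works like saving and restoring a small state record. The hedged C sketch below models a processor context as such a record; the state names and fields are illustrative only and are not taken from the microcode.

    /* Minimal model of the processor-context hand-off described above: one
     * processor saves the software state machine's state, the other resumes
     * from it.  All names and fields here are assumptions. */
    #include <stdint.h>

    #define NUM_PROCESSOR_CONTEXTS 16

    enum rcv_state { RCV_IDLE, RCV_HDR_DMA_PENDING, RCV_TCB_LOOKUP, RCV_FASTPATH, RCV_DONE };

    struct processor_context {       /* saved state of the software state machine */
        enum rcv_state state;
        uint32_t       rcv_pkt_desc; /* descriptor of the packet being worked on  */
        uint16_t       tcb_number;
        uint8_t        in_use;
    };

    static struct processor_context ctx_pool[NUM_PROCESSOR_CONTEXTS];

    /* One processor stops work: persist the state machine into the context. */
    static void ctx_suspend(struct processor_context *c, enum rcv_state s)
    {
        c->state = s;
    }

    /* The other processor picks the context up and continues from that state. */
    static enum rcv_state ctx_resume(const struct processor_context *c)
    {
        return c->state;
    }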
  • the TCB lookup method starts after the TCP packet has been received, after the 14-bit hash and the attention bit have been generated, and after the hash and attention bit have been pushed in the form of a “receive packet descriptor” onto the “receive packet descriptor queue”.
  • In a first step, one of processors 13 or 14 obtains an available “processor context”.
  • the processor pops (step 201 ) the “receive packet descriptor” queue 25 to obtain the “receive packet descriptor”.
  • the “receive packet descriptor” contains the previously-described 14-bit hash value 101 (see FIG. 2) and the previously-described attention bit.
  • the processor checks the attention bit.
  • processing proceeds to slow-path processing.
  • NID 3 is a TCP/IP full-offload device and if the packet is a TCP/IP packet, then further TCP/IP processing is performed by general purpose processor 23 .
  • NID 3 is a TCP/IP partial offload device, then the packet is sent across host bus 6 for further protocol processing by CPU 4 .
  • the processor initiates a DMA transfer of the beginning part of the packet (including the header) from the identified buffer in DRAM 10 to SRAM 18 .
  • 14-bit hash value 101 (see FIG. 2) actually comprises a 12-bit hash value 102 and another two bits 103 .
  • the 12-bit hash value (bits [13:2]) identifies an associated one of 4096 possible 64-byte hash buckets. In this embodiment, up to 48 of these hash buckets can be cached in SRAM in a hash table 104 , whereas any additional used hash buckets 105 are stored in DRAM 10 .
  • the hash bucket identified by the 12-bit hash value is in DRAM 10 , then the hash bucket is copied (or moved) from DRAM 10 to an available row in hash table 104 .
  • a six-bit pointer field in the hash byte (SRAM_hashbt) indicates whether the associated hash bucket is located in SRAM or not. If the pointer field contains a number between 1 and 48, then the pointer indicates the row of hash table 104 where the hash bucket is found. If the pointer field contains the number zero, then the hash bucket is not in hash table 104 but rather is in DRAM.
  • the processor uses the 12-bit hash value 102 to check the associated hash byte to see if the pointed to hash bucket is in the SRAM hash table 104 (step 204 ).
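The hash-byte check can be pictured as a one-byte lookup per bucket. The following C sketch assumes the six-bit pointer field occupies the low bits of the byte (the text does not give its position): a value of zero means the bucket is still in DRAM, while 1 through 48 names the SRAM hash-table row.

    /* Sketch of the hash-byte (SRAM_hashbt) check described above. */
    #include <stdint.h>

    #define HASH_BUCKETS      4096   /* addressed by the 12-bit hash            */
    #define HASHBYTE_PTR_MASK 0x3Fu  /* six-bit pointer field (assumed low bits) */

    static uint8_t hash_byte[HASH_BUCKETS];   /* one hash byte per bucket */

    /* Returns the SRAM hash-table row (1..48) holding the bucket, or 0 if the
     * bucket must first be copied in from DRAM. */
    static unsigned hash_bucket_sram_row(uint16_t hash12)
    {
        return hash_byte[hash12 & (HASH_BUCKETS - 1)] & HASHBYTE_PTR_MASK;
    }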
  • Processing is suspended (step 205 ) until the DMA transfer of the header from DRAM to SRAM is complete.
  • a queue (Q_FREEHASHSLOTS) identifying free rows in hash table 104 is accessed (the queue is maintained by queue manager 19 ) and a free hash bucket row (sometimes called a “slot”) is obtained.
  • the processor then causes the hash bucket to be copied or moved from DRAM and into the free hash bucket row.
  • the processor updates the pointer field in the associated hash byte to indicate that the hash bucket is now in SRAM and is located at the row now containing the hash bucket.
  • the up to four possible hash bucket entries in the hash bucket are searched one by one (step 207 ) to identify if the TCP and IP fields of an entry match the TCP and IP fields of the packet header 106 (the TCP and IP fields from the packet header were obtained from the receive descriptor).
  • the pointed to hash bucket contains two hash entries.
  • the hash entries are checked one by one.
  • the two bits 103 (bits [1:0] of the 14-bit hash) are used to determine which of the four possible hash table entry rows (i.e., slots) to check first.
  • the second hash entry 107 (shown in exploded view) is representative of the other hash table entries. It includes a 16-bit TCB# 108, a 32-bit IP destination address, a 32-bit IP source address, a 16-bit TCP destination port, and a 16-bit TCP source port.
  • If all of the entries in the hash bucket are searched and a match is not found (step 208 ), then processing proceeds by the slow-path. If, on the other hand, a match is found (step 209 ), then the TCB# portion 108 of the matching entry identifies the TCB of the connection context.
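Steps 207 through 209 amount to a small linear search over at most four entries. The C sketch below models that search; the entry layout and the use of a zero TCB number to mark an empty slot are assumptions for illustration.

    /* Sketch of the hash-bucket search: compare the packet's IP/TCP fields
     * against up to four entries, starting at the slot selected by bits [1:0]
     * of the 14-bit hash, and return the 16-bit TCB number on a match. */
    #include <stdint.h>

    struct hash_entry {            /* one of up to four entries per 64-byte bucket */
        uint16_t tcb_number;       /* 16-bit TCB# (0 marks an empty slot here)     */
        uint32_t ip_dst, ip_src;
        uint16_t tcp_dst_port, tcp_src_port;
    };

    struct four_tuple { uint32_t ip_src, ip_dst; uint16_t tcp_src_port, tcp_dst_port; };

    /* Returns the TCB number, or 0 if no entry matches (slow path). */
    static uint16_t bucket_lookup(const struct hash_entry bucket[4],
                                  const struct four_tuple *pkt, unsigned hash_low2)
    {
        for (unsigned i = 0; i < 4; i++) {
            const struct hash_entry *e = &bucket[(hash_low2 + i) & 3]; /* start at bits [1:0] */
            if (e->tcb_number != 0 &&
                e->ip_src == pkt->ip_src && e->ip_dst == pkt->ip_dst &&
                e->tcp_src_port == pkt->tcp_src_port && e->tcp_dst_port == pkt->tcp_dst_port)
                return e->tcb_number;
        }
        return 0;
    }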
  • NID 3 supports both fast-path receive processing as well as fast-path transmit processing.
  • a TCP/IP connection can involve bidirectional communications in that packets might be transmitted out of NID 3 on the same TCP/IP connection that other packets flow into NID 3 .
  • a mechanism is provided so that the context for a connection can be “locked” by one processor (for example, a processor receiving a packet on the TCP/IP connection) so that another processor (for example, a processor transmitting a packet on the same TCP/IP connection) will not interfere with the connection context.
  • This mechanism includes two bits for each of the up to 8192 connections that can be controlled by NID 3 : 1) a “TCB lock bit” (SRAM_tcblock), and 2) a “TCB in-use bit” (SRAM_tcbinuse).
  • the “TCB lock bits” 109 and the “TCB in-use bits” 110 are maintained in SRAM 18 .
  • the processor attempts to lock the designated TCB (step 210 ) by attempting to set the TCB's lock bit. If the lock bit indicates that the TCB is already locked, then the processor context number (a 4-bit number) is pushed onto a linked list of waiting processor contexts for that TCB. Because there are sixteen possible processor contexts, a lock table 112 is maintained in SRAM 18 . There is one row in lock table 112 for each of the sixteen possible processor contexts. Each row has sixteen four-bit fields. Each field can contain the 4-bit processor context number for a waiting processor context. Each row of the lock table 112 is sixteen entries wide because all sixteen processor contexts may be working on or waiting for the same TCB.
  • lock bit indicates that the TCB is already locked (step 211 )
  • the processor context number (a four-bit number because there can be up to sixteen processor contexts) is pushed onto the row of the lock table 112 associated with the TCB.
  • a lock table content addressable memory (CAM) 111 is used to translate the TCB number (from TCB field 108 ) into the row number in lock table 112 where the linked list for that TCB number is found. Accordingly, lock table CAM 111 receives a sixteen-bit TCB number and outputs a four-bit row number.
  • When the processor context that has the TCB locked is ready to suspend itself, it consults the lock table CAM 111 and the associated lock table 112 to determine if there is another processor context waiting for the TCB. If there is another processor context waiting (there is an entry in the associated row of lock table 112 ), then it restarts the first (oldest) of the waiting processor contexts in the linked list. The restarted processor context is then free to lock the TCB and continue processing.
  • the processor context locks the TCB by setting the associated TCB lock bit 109 .
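The locking scheme can be summarized as a per-TCB lock bit plus a per-TCB list of waiting processor contexts reached through the lock table CAM. The C sketch below models that behavior; the CAM is reduced to a simple row lookup, row allocation is omitted, and all names are illustrative.

    #include <stdint.h>
    #include <stdbool.h>

    #define MAX_CONTEXTS 16            /* sixteen possible processor contexts */
    #define MAX_TCBS     8192          /* connections controllable by NID 3   */

    static uint8_t  tcb_lock_bit[MAX_TCBS];                  /* SRAM_tcblock bits       */
    static uint16_t lock_cam_key[MAX_CONTEXTS];              /* models lock table CAM   */
    static uint8_t  lock_table[MAX_CONTEXTS][MAX_CONTEXTS];  /* waiting context numbers */
    static uint8_t  lock_table_len[MAX_CONTEXTS];

    /* Stand-in for lock table CAM 111: 16-bit TCB number -> 4-bit row number. */
    static unsigned lock_table_cam_lookup(uint16_t tcb)
    {
        for (unsigned r = 0; r < MAX_CONTEXTS; r++)
            if (lock_cam_key[r] == tcb)
                return r;
        return 0;
    }

    /* Try to lock the TCB; if already locked, queue this context's 4-bit number
     * on the TCB's lock-table row and report failure. */
    static bool tcb_try_lock(uint16_t tcb, uint8_t ctx_number)
    {
        if (tcb_lock_bit[tcb]) {
            unsigned row = lock_table_cam_lookup(tcb);
            lock_table[row][lock_table_len[row]++] = ctx_number;
            return false;
        }
        tcb_lock_bit[tcb] = 1;
        return true;
    }

    /* Release: restart the oldest waiter if one exists (it will re-lock the TCB),
     * otherwise clear the lock bit.  Returns the restarted context or -1. */
    static int tcb_unlock(uint16_t tcb)
    {
        unsigned row = lock_table_cam_lookup(tcb);
        if (lock_table_len[row] == 0) {
            tcb_lock_bit[tcb] = 0;
            return -1;
        }
        int oldest = lock_table[row][0];
        for (unsigned i = 1; i < lock_table_len[row]; i++)
            lock_table[row][i - 1] = lock_table[row][i];
        lock_table_len[row]--;
        return oldest;
    }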
  • the processor context then supplies the TCB number (sixteen bits) to an IN SRAM CAM 113 (step 212 ) to determine if the TCB is in one of thirty-two TCB slots 114 in SRAM 18 . (Up to thirty-two TCBs are cached in SRAM, whereas a copy of all “in-use” TCBs is kept in DRAM).
  • the IN SRAM CAM 113 outputs a sixteen-bit value, five bits of which point to one of the thirty-two possible TCB slots 114 in SRAM 18 . One of the bits is a “found” bit.
  • the “found” bit indicates that the TCB is “found”, then the five bits are a number from one to thirty-two that points to a TCB slot in SRAM 18 where the TCB is cached. The TCB has therefore been identified in SRAM 18 , and fast-path receive processing continues (step 213 ).
  • the TCB is not cached in SRAM 18 . All TCBs 115 under control of NID 3 are, however, maintained in DRAM 10 . The information in the appropriate TCB slot in DRAM 10 is then written over one of the thirty-two TCB slots 114 in SRAM 18 . In the event that one of the SRAM TCB slots is empty, then the TCB information from DRAM 10 is DMA transferred into that free SRAM slot. If there is no free SRAM TCB slot, then the least-recently-used TCB slot in SRAM 18 is overwritten.
  • the IN SRAM CAM 113 is updated to indicate that the TCB is now located in SRAM at a particular slot. The slot number is therefore written into the IN SRAM CAM 113 .
  • Fast-path receive processing then continues (step 216 ).
  • the processor context releasing control of a TCB does not update the DRAM version of the TCB, but rather the processor context assuming control of the TCB has that potential responsibility.
  • a “dirty bit” 116 is provided in each TCB. If the releasing processor context changed the contents of the TCB (i.e., the TCB is dirty), then the releasing processor context sets this “dirty bit” 116 .
  • next processor context needs to put another TCB into the SRAM TCB slot held by the dirty TCB
  • the next processor first writes the dirty TCB information (i.e., updated TCB information) to overwrite the corresponding TCB information in DRAM (i.e., to update the DRAM version of the TCB). If, on the other hand, the next processor does not need to move a TCB into an SRAM slot held by a dirty TCB, then the next processor does not need to write the dirty TCB information to DRAM.
  • next processor can either just update a TCB whose dirty bit is not set, or the next processor can simply overwrite the TCB whose dirty bit is not set (for example, to move another TCB into the slot occupied by the TCB whose dirty bit is not set).
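Taken together, the IN SRAM CAM, the thirty-two SRAM TCB slots, the least-recently-used replacement, and the dirty bit behave like a small write-back cache over the DRAM copies of the TCBs. The C sketch below models that behavior; the TCB size, the LRU bookkeeping, and the use of memcpy in place of DMA are assumptions.

    #include <stdint.h>
    #include <string.h>

    #define TCB_SLOTS 32               /* TCB slots 114 cached in SRAM      */
    #define MAX_TCBS  8192             /* all TCBs 115 are kept in DRAM     */
    #define TCB_SIZE  256              /* TCB size in bytes: an assumption  */

    struct tcb_slot {
        uint16_t tcb_number;           /* key held in the IN SRAM CAM       */
        uint8_t  valid, dirty;         /* dirty bit 116                     */
        uint32_t last_use;             /* for least-recently-used selection */
        uint8_t  data[TCB_SIZE];
    };

    static struct tcb_slot sram_tcb[TCB_SLOTS];
    static uint8_t  dram_tcb[MAX_TCBS][TCB_SIZE];
    static uint32_t use_clock;

    /* Return the SRAM slot holding the TCB, loading it from DRAM on a miss. */
    static struct tcb_slot *tcb_cache_get(uint16_t tcb)
    {
        struct tcb_slot *victim = &sram_tcb[0];
        for (int i = 0; i < TCB_SLOTS; i++) {
            struct tcb_slot *s = &sram_tcb[i];
            if (s->valid && s->tcb_number == tcb) {    /* IN SRAM CAM hit */
                s->last_use = ++use_clock;
                return s;
            }
            if (!s->valid)
                victim = s;                            /* prefer an empty slot     */
            else if (victim->valid && s->last_use < victim->last_use)
                victim = s;                            /* else least recently used */
        }
        if (victim->valid && victim->dirty)            /* write a dirty TCB back only now */
            memcpy(dram_tcb[victim->tcb_number], victim->data, TCB_SIZE);
        memcpy(victim->data, dram_tcb[tcb], TCB_SIZE); /* bring the requested TCB in */
        victim->tcb_number = tcb;
        victim->valid = 1;
        victim->dirty = 0;
        victim->last_use = ++use_clock;
        return victim;
    }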
  • CAM A is a thirty-two entry CAM when CAMs A and B are used together as a single CAM. If CAM A is used separately, then CAM A is a sixteen-entry CAM. 0b011000001 CamContentsA Read/Write.
  • CamValid[CamAddrA] = ~AluOut[16].
  • CamContents[CamAddrA] = AluOut[15:0]. Accordingly, writing bit sixteen “invalidates” the CAM entry. The tilde symbol here indicates the logical NOT.
  • Bit 16 = ~CamValid[CamAddrA].
  • Bits 15-0 = CamContents[CamAddrA]. 0b011000010
  • 0b011000110 CamMatchB Read/Write These registers (CamAddrB, CamContentsB and CamMatchB) are identical in use to those for CAM A (see above), except that they are for the second half of the first CAM (CAM B).
  • 0b011001000 CamAddrC Write Only This register for CAM C is identical in function to the corresponding register for CAM A.
  • This register for CAM C is identical in function to the corresponding register for CAM A.
  • 0b011001010 CamMatchC Read/Write This register for CAM C is identical in function to the corresponding register for CAM A.
  • CAM C can be split into two sixteen-entry CAMs: CAM C and CAM D. 0b011001100 CamAddrD Write Only.
  • This register for CAM D is identical in function to the corresponding register for CAM A. 0b011001101 CamContentsD Read/Write.
  • This register for CAM D is identical in function to the corresponding register for CAM A. 0b011001110 CamMatchD Read/Write.
  • This register for CAM D is identical in function to the corresponding register for CAM A.
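The CamContentsA semantics described above (writing bit 16 clears the valid bit; reads return the inverted valid bit in bit 16) can be modeled in a few lines of C. The 32-entry depth corresponds to CAMs A and B being used together as a single CAM; everything else is illustrative.

    #include <stdint.h>
    #include <stdbool.h>

    #define CAM_A_ENTRIES 32

    static bool     cam_valid[CAM_A_ENTRIES];
    static uint16_t cam_contents[CAM_A_ENTRIES];

    /* Writing the register: CamValid = ~AluOut[16], CamContents = AluOut[15:0]. */
    static void cam_contents_a_write(unsigned cam_addr_a, uint32_t alu_out)
    {
        cam_valid[cam_addr_a]    = !((alu_out >> 16) & 1u);
        cam_contents[cam_addr_a] = (uint16_t)alu_out;
    }

    /* Reading the register: bit 16 = ~CamValid, bits 15-0 = CamContents. */
    static uint32_t cam_contents_a_read(unsigned cam_addr_a)
    {
        uint32_t bit16 = cam_valid[cam_addr_a] ? 0u : 1u;
        return (bit16 << 16) | cam_contents[cam_addr_a];
    }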
  • One embodiment of the code executed by processors 13 - 15 is written using functions. These functions are in turn made up of instructions including those instructions set forth in Table 1 above. The functions are set forth in the file SUBR.MAL of the CD Appendix (the files on the CD Appendix are incorporated by reference into the present patent document). These functions include:
  • the INSRAM_CAM_INSERT function: Executing this function causes the TCB number present in a register (register cr 11 ) to be written into the IN SRAM CAM (CAM A of the processor). The particular CAM slot written to is identified by the lower sixteen bits of the value present in another register (register TbuffL 18 ).
  • the INSRAM_CAM_REMOVE function: Executing this function causes the CAM entry in the IN SRAM CAM slot identified by a register (register cr 11 ) to be invalidated (i.e., removed). The entry is invalidated by setting bit 16 of a register (register CAM_CONTENTS_A).
  • the INSRAM_CAM_SEARCH function: Executing this function causes a search of the IN SRAM CAM for the TCB number identified by the TCB number present in a register (register cr 11 ). The result of the search is a five-bit slot number that is returned in five bits of another register (register TbuffL 18 ). The value returned in a sixth bit of the register TbuffL 18 indicates whether or not the TCB number was found in the INSRAM_CAM.
  • the LOCKBL_CAM_REMOVE function: Executing this function causes the CAM entry in the LOCK TABLE CAM slot identified by a register (register cr 10 ) to be invalidated (i.e., removed). The entry is invalidated by setting bit 16 of another register (register CAM_CONTENTS_C).
  • the LOCK_TABLE_SEARCH function: Executing this function causes a search of the LOCK TABLE CAM for the TCB number identified by the TCB number present in a register (register cr 11 ). The result of the search is a four-bit number of a row in the lock table. The four-bit number is returned in four bits of another register (register cr 10 ). The value returned in a fifth bit of the register cr 10 indicates whether or not the TCB number was found in the LOCK TABLE CAM.
  • the Compact Disc Appendix includes a folder “CD Appendix A”, a folder “CD Appendix B”, a folder “CD Appendix C”, and a file “title page.txt”.
  • CD Appendix A includes a description of an integrated circuit (the same as ASIC 9 of FIG. 1 except that the integrated circuit of CD Appendix A does not include processor 23 ) of one embodiment of a TCP/IP offload network interface device (NID).
  • CD Appendix B includes software that executes on a host computer CPU, where the host computer is coupled to a NID incorporating the integrated circuit set forth in CD Appendix A and wherein the host computer includes a CPU that executes a protocol stack.
  • CD Appendix C includes a listing of the program executed by the receive processor of the integrated circuit set forth in Appendix A as well as a description of the instruction set executed by the receive processor.
  • the CD Appendix A includes the following: 1) a folder “Mojave verilog code” that contains a hardware description of an embodiment of the integrated circuit, and 2) a folder “Mojave microcode” that contains code that executes on the processors (for example, processors 13 and 14 of FIG. 1) of the integrated circuit.
  • the file “MAINLOOP.MAL” is commented to indicate instructions corresponding to various steps of the method of FIG. 3.
  • the file “SEQ.H” is a definition file for the “MAINLOOP.MAL” code.
  • Page 9 sets forth the steps of a twenty-step method in accordance with some embodiments of the present invention.
  • Page 10 sets forth the structure of a TCB in accordance with some embodiments.
  • Page 17 sets forth the structure of a hash byte (called a “TCB Hash Bucket Status Byte”).
  • the CD Appendix B includes the following: 1) a folder entitled “simba (device driver software for Mojave)” that contains device driver software executable on the host computer; 2) a folder entitled “atcp (free BSD stack and code added to it)” that contains a TCP/IP stack [the folder “atcp” contains: a) a TCP/IP stack derived from the “free BSD” TCP/IP stack (available from the University of California, Berkeley) so as to make it run on a Windows operating system, and b) code added to the free BSD stack between the session layer above and the device driver below that enables the BSD stack to carry out “fast-path” processing in conjunction with the NID]; and 3) a folder entitled “include (set of files shared by ATCP and device driver)” that contains a set of files that are used by the ATCP stack and are used by the device driver.
  • the CD Appendix C includes the following: 1) a file called “mojave_rcv_seq (instruction set description).mdl” that contains a description of the instruction set of the receive processor, and 2) a file called “mojave_rcv_seq (program executed by receive processor).mal” that contains a program executed by the receive processor.
  • FIGS. 4 - 9 illustrate various system configurations involving a network interface device in accordance with the present invention. These configurations are but some system configurations. The present invention is not limited to these configurations, but rather these configurations are illustrated here only as examples of some of the many configurations that are taught in this patent document.
  • FIG. 4 shows a computer 300 wherein a network interface device (NID) 301 is coupled via a connector 302 and a host bus 303 to a CPU 304 and storage 305 .
  • CPU 304 and storage 305 are together referred to as a “host” 306 .
  • network interface device (NID) 301 can be considered part of a host as shown in FIG. 5.
  • a host computer 400 includes NID 301 as well as CPU 304 and storage 305 .
  • the CPU executes instructions that implement a sequential protocol processing stack.
  • the network interface device 301 performs fast-path hardware accelerated protocol processing on some types of packets such that CPU 304 performs no or substantially no protocol processing on these types of packets. Control of a connection can be passed from the NID to the stack and from the stack to the NID.
  • FIG. 6 shows a computer 500 wherein NID 301 is coupled to CPU 304 and storage 305 by a bridge 501 .
  • FIG. 7 shows a computer 500 wherein a network interface device (NID) 501 is integrated into a bridge integrated circuit 502 .
  • Bridge 502 couples computer 500 to a network 503 .
  • Bridge 502 is coupled to CPU 504 and storage 505 by local bus 506 .
  • CPU 504 executes instructions that implement a software sequential protocol processing stack.
  • Bridge 502 is coupled to multiple expansion cards 507 , 508 and 509 via a host bus 510 .
  • Network interface device 501 performs TCP and IP protocol processing on certain types of packets, thereby offloading CPU and its sequential protocol processing stack of these tasks. Control of a connection can be passed from the NID to the stack and from the stack to the NID.
  • NID 501 is a full TCP/IP offload device.
  • NID is a partial TCP/IP offload device.
  • the term “partial TCP/IP offload” is used here to indicate that all or substantially all TCP and IP protocol processing on certain types of packets is performed by the offload device, whereas substantial TCP and IP protocol processing for other types of packets is performed by the stack.
  • FIG. 8 shows a computer 700 wherein a network interface device (NID) 701 couples CPU 702 and storage 703 to network 704 .
  • NID 701 includes a processor that implements a sequential protocol processing stack 705 , a plurality of sequencers 706 (such as, for example, a receive sequencer and a transmit sequencer), and a plurality of processors 707 .
  • This embodiment may be a full-offload embodiment in that processor 705 fully offloads CPU 702 and its stack of all or substantially all TCP and IP protocol processing duties.
  • FIG. 9 shows a computer 800 wherein a network interface device (NID) 801 couples CPU 802 and storage 803 to network 804 .
  • NID 801 includes a plurality of sequencers 806 (for example, a receive sequencer and a transmit sequencer), and a plurality of processors 807 .
  • CPU 802 implements a software sequential protocol processing stack
  • NID 801 does not include a general purpose processor that implements a sequential software protocol processing stack.
  • This embodiment may be a partial-offload embodiment in that NID 801 performs all or substantially all TCP and IP protocol processing tasks on some types of packets, whereas CPU 802 and its stack perform TCP and IP protocol processing on other types of packets.
  • NID 3 can be part of a memory controller integrated circuit or an input/output (I/O) integrated circuit or a bridge integrated circuit of a microprocessor chip-set. In some embodiments, NID 3 is part of an I/O integrated circuit chip such as, for example, the Intel 82801 integrated circuit of the Intel 820 chip set. NID 3 may be integrated into the Broadcom ServerWorks Grand Champion HE chipset, the Intel 82815 Graphics and Memory Controller Hub, the Intel 440BX chipset, or the Apollo VT8501 MVP4 North Bridge chip.
  • the instructions executed by receive processor 20 and/or processors 13 - 15 are, in some embodiments, downloaded upon power-up of NID 3 into a memory on NID 3 , thereby facilitating the periodic updating of NID functionality.
  • High and low priority transmit queues may be implemented using queue manager 19 .
  • Hardcoded transmit sequencer 21 , in some embodiments, is replaced with a transmit processor that executes instructions.
  • Processors 13 , 14 and 15 can be identical processors, each of which can perform receive processing and/or transmit processing and/or utility functions. Accordingly, various modifications, adaptations, and combinations of various features of the described embodiments can be practiced without departing from the scope of the invention as set forth in the following claims that follow the “Mojave Hardware Specification” section below.
  • PCI Peripheral Component Interconnect
  • GMII Gigabit Media Independent Interface
  • TBI Ten Bit Interface
  • Writable control store allows field updates and feature enhancements.
  • Mojave (See FIG. 10) is a 32-bit, full-duplex, single channel, 10/100/1000-Megabit per second (Mbps), Session Layer Interface Controller (SLIC), designed to provide high-speed protocol processing for server and desktop applications. It combines the functions of a standard network interface controller and a protocol processor within a single chip.
  • When combined with the 802.3/GMII compliant Phy and Synchronous Dram (SDRAM), Mojave comprises one complete ethernet node. It contains one 802.3/ethernet compliant Mac, a PCI Bus Interface Unit (BIU), a memory controller, transmit fifo, receive fifo and a custom TCP/IP protocol processor. Mojave supports 10 Base-T, 100 Base-TX and 1000 Base-TX via the GMII interface attachment of appropriate Phys. Mojave also supports 100 Base-FX, and 1000 Base-FX via the TBI interface attachment of external Serdes.
  • the Mojave Mac provides statistical information that may be used for SNMP.
  • the Mac can operate in promiscuous mode allowing Mojave to function as a network monitor, receive broadcast and multicast packets and implement multiple Mac addresses for each node.
  • Any 802.3/GMII/TBI compliant PHY/SERDES can be utilized, allowing Mojave to support 10 BASE-T, 10 BASE-T2, 100 BASE-TX, 100 Base-FX, 100 BASE-T4, 1000 BASE-TX or 1000 BASE-FX as well as future interface standards.
  • PHY identification and initialization is accomplished through host driver initialization routines.
  • PHY status registers can be polled continuously by Mojave to detect PHY status changes which are then reported to the host driver.
  • the Mac can be configured to support a maximum frame size of 1518 bytes or 9018 bytes.
  • the 64-bit, multiplexed BIU provides a direct interface to the PCI bus for both slave and master functions.
  • Mojave is capable of operating in either a 64-bit or 32-bit PCI environment, while supporting 64-bit addressing in either configuration.
  • PCI bus frequencies up to 33 MHz are supported yielding instantaneous bus transfer rates of 266 MB/s.
  • Both 5.0V and 3.3V signaling environments can be utilized by Mojave.
  • Configurable cache-line size up to 256B will accommodate future architectures, and Expansion ROM/Flash support will allow for diskless system booting.
  • Non-PC applications are supported via programmable big and little endian modes. Host based communication has been utilized to provide the best system performance possible.
  • Mojave supports Plug-N-Play auto-configuration through the PCI configuration space.
  • Support of an external eeprom allows for local storage of configuration information such as Mac addresses.
  • External SDRAM provides frame buffering, which is configurable as 1 MB, 2 MB, 4 MB or 8 MB using the appropriate technology and width selections. Use of -10 speed grades yields an external buffer bandwidth of 88 MB/s.
  • the buffer provides temporary storage of both incoming and outgoing frames.
  • the protocol processor accesses the frames within the buffer in order to implement TCP/IP and NETBIOS. Incoming frames are processed, assembled then transferred to host memory under the control of the protocol processor. For transmit, data is moved from host memory to buffers where various headers are created before being transmitted out via the Mac.
  • the processor (See FIG. 14) is a convenient means to provide a programmable state-machine capable of processing incoming frames and host commands, directing network traffic and directing PCI bus traffic.
  • Three processors are implemented using shared hardware in a three-level pipelined architecture which launches and completes a single instruction for every clock cycle. The instructions are executed in three distinct phases corresponding to each of the pipeline stages where each phase is responsible for a different function.
  • the first instruction phase writes the instruction results of the last instruction to the destination operand, modifies the program counter (Pc), selects the address source for the instruction to fetch, then fetches the instruction from the control store.
  • the fetched instruction is then stored in the instruction register at the end of the clock cycle.
  • the processor instructions reside in the on-chip control-store, which is implemented as a mixture of ROM and Sram.
  • the ROM contains 4K instructions starting at address 0x0000 and aliases every 0x1000 locations throughout the first 0x8000 locations of instruction space.
  • the Sram (WCS) will hold up to 0x1000 instructions starting at address 0x8000 and aliasing each 0x1000 locations throughout the last 0x8000 of instruction space.
  • the ROM and Sram are both 49-bits wide accounting for bits [48:0] of the instruction microword.
  • a separate mapping ram provides bits [55:49] of the microword (MapAddr) to allow replacement of faulty ROM based instructions.
  • the mapping ram has a configuration of 512×7 which is insufficient to allow a separate map address for each of the 4K ROM locations.
  • the map ram address lines are connected to the address bits Fetch [9:3]. The result is that the ROM is re-mapped in blocks of 8 contiguous locations.
  • the second instruction phase decodes the instruction which was stored in the instruction register. It is at this point that the map address is checked for a non-zero value which will cause the decoder to force a Jmp instruction to the map address. If a non-zero value is detected then the decoder selects the source operands for the Alu operation based on the values of the OpdASel, OpdBSel and AluOp fields. These operands are then stored in the decode register at the end of the clock cycle. Operands may originate from File, Sram, or flip-flop based registers. The second instruction phase is also where the results of the previous instruction are written to the Sram.
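The control-store patching mechanism amounts to a small side table consulted on every fetch. The C sketch below assumes one 7-bit map entry per block of eight contiguous control-store locations, with a non-zero entry forcing a jump during decode; the indexing details are simplified and the names are illustrative.

    #include <stdint.h>

    #define MAP_RAM_ENTRIES 512               /* 512 x 7 mapping ram */

    static uint8_t map_ram[MAP_RAM_ENTRIES];  /* 7-bit MapAddr per entry, 0 = no remap */

    /* One map entry covers a block of 8 contiguous control-store locations. */
    static uint8_t map_addr_for_fetch(uint16_t fetch_addr)
    {
        return map_ram[(fetch_addr >> 3) % MAP_RAM_ENTRIES] & 0x7F;
    }

    /* During decode, a non-zero MapAddr forces a jump to the replacement code. */
    static uint16_t decode_next_pc(uint16_t fetch_addr, uint16_t sequential_pc)
    {
        uint8_t map = map_addr_for_fetch(fetch_addr);
        return map ? (uint16_t)map : sequential_pc;
    }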
  • the third instruction phase is when the actual Alu operation is performed, the test condition is selected and the Stack push and pop are implemented. Results of the Alu operation are stored in the results register at the end of the clock cycle.
  • FIG. 14 is a block diagram of the CPU.
  • FIG. 14 shows the hardware functions associated with each of the instruction phases. Note that various functions have been distributed across the three phases of the instruction execution in order to minimize the combinatorial delays within any given phase.
  • micro-instructions are divided into nine types according to the program control directive.
  • the micro-instruction is further divided into sub-fields for which the definitions are dependent upon the instruction type.
  • the six instruction types are listed in FIG. 15.
  • All instructions include the Alu operation (AluOp), operand “A” select (OpdASel), operand “B” select (OpdBSel) and Literal fields. Other field usage depends upon the instruction type.
  • conditional jump (Jct/Jcf) instruction causes the program counter to be altered if the condition selected by the “test select” (TstSel) field is true/false.
  • the new program counter (Pc) value is loaded from either the Literal field or the AluOut as described in the following section and the Literal field may be used as a source for the Alu or the ram address if the new Pc value is sourced by the Alu.
  • the “jump” (Jmp) instruction causes the program counter to be altered unconditionally.
  • the new program counter (Pc) value is loaded from either the Literal field or the AluOut as described in the following section.
  • the format allows instruction bits 22:16 to be used to perform a flag operation and the Literal field may be used as a source for the Alu or the ram address if the new Pc value is sourced by the Alu.
  • the “jump subroutine” (Jsr) instruction causes the program counter to be altered unconditionally.
  • the new program counter (Pc) value is loaded from either the Literal field or the AluOut as described in the following section.
  • the old program counter value is stored on the top location of the Pc-Stack which is implemented as a LIFO memory.
  • the format allows instruction bits 22:16 to be used to perform a flag operation and the Literal field may be used as a source for the Alu or the ram address if the new Pc value is sourced by the Alu.
  • the “Cont” (Cont) instruction causes the program counter to increment.
  • the format allows instruction bits 22:16 to be used to perform a flag operation and the Literal field may be used as a source for the Alu or the ram address.
  • Rts return from subroutine
  • Rtt/Rtf conditional Rts
  • MapEn maps to the “map enable”
  • MapAddr maps to the “map address”
  • the instruction decoder forces a jump instruction with the Alu operation and destination fields set to pass the MapAddr field to the program control block.
  • FIGS. 16 - 20 show sequencer behavior, ALU operations, ALU operands, selected tests, and flag operations.
  • Hardware will detect certain program errors. Any sequencer generating a program error will be forced to continue executing from location 0004. The program errors detected are:
  • Sram is the nexus for data movement within Mojave.
  • the data flow block diagram of FIG. 21 shows all of the master and slave sequencers of the Mojave product.
  • Request information such as r/w, address, size, endian and alignment is represented by each request line.
  • Acknowledge information to master sequencers includes only the size of the transfer being acknowledged.
  • FIG. 22 The block diagram of FIG. 22 illustrates how data movement is accomplished for a Pci slave write to Dram.
  • Psi sends a write request to the SramCtrl module.
  • Psi requests Dwr to move data from Sram to dram.
  • Dwr subsequently sends a read request to the SramCtrl module then writes the data to the dram via the Mctrl module.
  • As each piece of data is moved from the Sram to Dwr, Dwr sends an acknowledge to the Psi module.
  • the Sram control sequencer (See FIG. 23) services requests to store to, or retrieve data from an Sram organized as 2048 locations by 128 bits (32 KB).
  • the sequencer operates at a frequency of 200 Mhz, allowing both a Cpu access and a dma access to occur during a standard 100 Mhz Cpu cycle.
  • One 200 Mhz cycle is reserved for Cpu accesses during each 100 Mhz cycle while the remaining 200 Mhz cycle is reserved for dma accesses on a prioritized basis.
  • FIG. 23 shows the major functions of the Sram control sequencer.
  • a slave sequencer begins by asserting a request along with r/w, ram address, data path alignment and request size.
  • SramCtrl prioritizes the requests.
  • the request parameters are then selected by a multiplexer which feeds the parameters to the Sram via alignment logic.
  • the requestor provides the Sram address which when combined with the other parameters controls the input and output alignment.
  • Sram outputs are fed to the output aligner. Requests are acknowledged in parallel with the returned data.
  • Memctrl (See FIG. 24) implements the memory controller function and registers for access to SDRAM, Flash memory, and external configuration jumpers. It also implements the register interface for the serial EEPROM and GPIO access. Memctrl functional module summaries:
  • memregs: The memregs module provides the configuration and control registers for all the functions of memctrl. memregs also implements the GPIO interface registers for reading, writing and directional control, the FLASH control registers for configuring and accessing FLASH, and registers associated with configuring the SDRAM controller. memregs is accessed through the CPU data path with all of its registers mapped to a CPU register address.
  • dramcfg_seq: The dramcfg_seq module contains the refresh logic, timers, and sequencer for the various configuration accesses that are performed. This also includes operations which take place during initialization.
  • flash_seq: The flash_seq module performs the various FLASH memory access sequences. This module also implements the programmable nature of the access time delays between the control signals and data accesses.
  • dramif: The dramif module arbitrates between the memctrl modules requesting access to the memory interface. This includes the dramcfg_seq, flash_seq, memregs, dramwrt and dramrd modules. The dramif module also muxes the row and column address for the SDRAM accesses, muxes the read and write control signals between dramrd, dramwrt, etc., and also controls the direction of the data bus interface. dramif attempts to ping-pong between reads and writes to maximize the overlap between read and write buffers and for fairness.
  • This fairness can be overridden if a requester asserts its urgent request signal for high priority conditions like impending buffer overflow or underflow.
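The arbitration policy sketched below in C reflects the description above together with the priority order repeated later for the dramrd and dramwrt controllers: refresh first, then a read/write ping-pong that an urgent request may override, then configuration and flash cycles. The enum and structure names are illustrative.

    #include <stdbool.h>

    enum dram_client { DRAM_NONE, DRAM_REFRESH, DRAM_WRITE, DRAM_READ, DRAM_CFG, DRAM_FLASH };

    struct dramif_req {
        bool refresh, write, read, cfg, flash;
        bool write_urgent, read_urgent;   /* impending buffer overflow / underflow */
    };

    static enum dram_client dramif_arbitrate(const struct dramif_req *r, bool *last_was_write)
    {
        if (r->refresh) return DRAM_REFRESH;                     /* highest priority */
        if (r->write_urgent && r->write) { *last_was_write = true;  return DRAM_WRITE; }
        if (r->read_urgent  && r->read)  { *last_was_write = false; return DRAM_READ;  }
        if (r->write && (!r->read || !*last_was_write)) {        /* ping-pong for fairness */
            *last_was_write = true;
            return DRAM_WRITE;
        }
        if (r->read)  { *last_was_write = false; return DRAM_READ; }
        if (r->cfg)   return DRAM_CFG;
        if (r->flash) return DRAM_FLASH;
        return DRAM_NONE;
    }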
  • the checkbits become address and control signals and the FSH_CS_L signal is asserted.
  • dramwrt: The dramwrt module implements the data and control path for all masters requesting write access to SDRAM.
  • the dramwrt submodule dramwrt_mux arbitrates across all six dma requesters giving the following priorities from highest to lowest: RcvA, Q2d, Psi, S2d, P2d and D2d.
  • dramwrt_mux will then mux the selected requester's data and address.
  • the dramwrt_ldctrl will buffer the granted requester's data and ack the appropriate requester while the dramwrt_seq will proceed to initiate an SDRAM write operation.
  • the buffered data will be selected from dramwrt_data data buffers and written to memory. If ECC is enabled, the dramwrt_data block will also compute the checkbits as the data passes through. This block can also force ECC errors at any bit in any location. Also, as the data is being written, the dramwrt_cksum block will checksum the data and indicate to the DMA requester when the checksum is complete. P2d and D2d are the only two requesters which have checksums calculated for their transactions.
  • dramrd: The dramrd module implements the data and control path for all masters requesting read access from SDRAM.
  • the dramrd submodule dramrd_mux arbitrates across all six dma requesters, giving the following priorities from highest to lowest: XmtA, Pso, D2s, D2q, D2p and D2d.
  • dramrd_mux also implements a state machine to overlap multiple read operations. So when a requester's read operation is being satisfied from SDRAM, another operation can be in progress with respect to bank activation and addressing.
  • the dramrd_seq initiates the request for the interface via dramif and starts the actual read sequence.
  • the dramrd_data block will check it for ECC errors, if ECC correction and detection is enabled. The data is then stored in a 64 byte read buffer. Once there is enough data to write to the sram, the dramrd_unld sequencer will select data from the read buffer and request access to sram. The acks coming back from these sram writes are directed by the dramrd_mux to the original DMA requestor. Once all the requested data is delivered to the requestor, this operation is then complete.
  • the dramrd controller (See FIG. 24) acts only as a slave sequencer to the rest of the Mojave chip. Servicing requests issued by master sequencers, the dramrd controller moves data from external SDRAM or flash to the Sram, via the dramif module, in blocks of 64 bytes or less.
  • the nature of the SDRAM requires fixed burst sizes for each of its internal banks with ras precharge intervals between each access.
  • the Memory Controller Block Diagram depicts the major functional blocks of the dramrd controller.
  • the first step in servicing a request to move data from SDRAM to Sram is the prioritization of the master sequencer requests. This is done by the dramrd_mux.
  • the dramrd_mux selects the DMA requester's dram read address and sram write address and applies configuration information to determine the correct bank, row and column address to apply.
  • the dramrd_seq will control the operations of applying the row and column addresses and sequencing the appropriate control signals.
  • the dramrd_data block While reading the data from the SDRAM interface the dramrd_data block will perform error detection and/or correction on the data if this feature is enabled.
  • the dramrd_unld sequencer issues a write request to the SramCtrl sequencer which in turn sends an acknowledge to the dramrd sequencer.
  • the dramrd sequencer passes this acknowledge along to the level two master with a size code indicating how much data was written during the Sram cycle allowing the update of pointers and counters.
  • the dram read and Sram write cycles repeat until the original burst request has been completed at which point the dramrd sequencer prioritizes any remaining requests in preparation for the next burst cycle.
  • Contiguous dram burst cycles are not guaranteed to the dramrd controller as an algorithm is implemented in the dramif which ensures highest priority to refresh cycles followed by ping-pong access between dram writes and dram reads and then configuration and flash cycles.
  • FIG. 25 is a timing diagram illustrating how data is read from SDRAM.
  • the dram has been configured for a burst of 4 with a latency of 2 clock cycles.
  • Bank A is first selected/activated followed by a read command 2 clock cycles later.
  • the bank select/activate for bank B is next issued as read data begins returning 2 clocks after the read command was issued to bank A.
  • Two clock cycles before we need to receive data from bank B we issue the read command. Once all 4 words have been received from bank A we begin receiving data from bank B.
  • the dramwrt controller (See FIG. 24) is a slave sequencer to the rest of Mojave. Servicing requests issued by master DMA sequencers, the dramwrt controller moves data from Sram to the external SDRAM or flash, via the dramif module, in blocks of 64 bytes or less while accumulating a checksum of the data moved.
  • the nature of the SDRAM requires fixed burst sizes for each of its internal banks with ras precharge intervals between each access.
  • the memctrl block diagram (See FIG. 24) contains the major functional blocks of the dramwrt controller.
  • the first step in servicing a request to move data from Sram to SDRAM is the prioritization of the level two master requests. This is done by the dramwrt_mux.
  • the dramwrt_mux takes a Snapshot of the dram write address and applies configuration information to determine the correct dram, bank, row and column address to apply.
  • the dramwrt_ldctrl sequencer immediately issues a read command to the Sram to which the Sram responds with both data and an acknowledge.
  • the read data is stored within the dramwrt_data buffers by the dramwrt_ldctrl sequencer.
  • the dramwrt_ldctrl sequencer passes the acknowledge to the level two master along with a size code indicating how much data was read during the Sram cycle allowing the update of pointers and counters.
  • the dramwrt_seq has initiated a bank activate command at this point. Once sufficient data has been read from Sram, the dramwrt_seq sequencer issues a write command to the dram starting the burst cycle and computing a checksum as the data passes by.
  • ECC checkbits are also computed by the dramwrt_data block as the data moves out to the SDRAM interface. It is also possible to force ECC errors to any bit position within the data byte or checkbits.
  • the Sram read cycle repeats until the original burst request has been completed at which point the dramwrt_mux prioritizes any remaining requests in preparation for the next burst cycle.
  • the ECC is an 8 bit ECC for a 64 bit word.
  • writes not aligned to a 64 bit boundary will necessitate a read/modify/write cycle.
  • when the dramwrt_ldctrl sequencer detects that a non-aligned write is required, it will generate a read request to the dramrd controller.
  • the dramrd controller then returns the read data which is loaded into the write buffers.
  • the dramwrt_ldctrl sequencer can then request the new data from the Sram, proceeding from this point in the same way as for an aligned operation.
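  • As an illustration only (not taken from the specification or the CD Appendix), the following C sketch models why a write that does not cover a whole 64 bit ECC word forces the read/modify/write described above: the old word is read, the new bytes are merged in, and the full word is written back so that checkbits can be computed over all eight bytes. The function names and the byte-array model of the SDRAM word are assumptions.
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Hypothetical stand-in for one 64-bit ECC-protected SDRAM word. */
    static uint64_t dram_word;

    static uint64_t dram_read64(void)        { return dram_word; }  /* dramrd path  */
    static void     dram_write64(uint64_t w) { dram_word = w; }     /* dramwrt path */

    /* Write 'len' bytes (1..8) at byte offset 'off' within the word.  A write
     * that does not cover the whole 64-bit word becomes a read/modify/write so
     * the untouched bytes survive and checkbits can be regenerated over all
     * eight bytes. */
    static void ecc_write(unsigned off, unsigned len, const uint8_t *src)
    {
        uint8_t bytes[8];

        if (off == 0 && len == 8) {
            memcpy(bytes, src, 8);              /* aligned: plain burst write */
        } else {
            uint64_t old = dram_read64();       /* read ...                   */
            memcpy(bytes, &old, 8);
            memcpy(bytes + off, src, len);      /* ... modify ...             */
        }
        uint64_t merged;
        memcpy(&merged, bytes, 8);
        dram_write64(merged);                   /* ... write with fresh ECC   */
    }

    int main(void)
    {
        const uint8_t patch[3] = { 0xAA, 0xBB, 0xCC };
        ecc_write(2, 3, patch);                 /* non-aligned: needs the read */
        printf("word = 0x%016llx\n", (unsigned long long)dram_word);
        return 0;
    }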
  • Contiguous dram burst cycles are not guaranteed to the dramwrt controller as an algorithm is implemented in the dramif which ensures highest priority to refresh cycles followed by ping-pong access between dram writes and dram reads and then configuration and flash cycles.
  • FIG. 26 is a timing diagram illustrating how data is written to SDRAM.
  • the dram has been configured for a burst of four with a latency of two clock cycles.
  • Bank A is first selected/activated followed by a write command two clock cycles later.
  • the bank select/activate for bank B is next issued in preparation for issuing the second write command.
  • Banks C and D follow if necessary.
  • the Pmo sequencer (See FIG. 27) acts only as a slave sequencer. Servicing requests issued by master sequencers, the Pmo sequencer moves data from an Sram based fifo to a Pci target, via the PciMstrIO module, in bursts of up to 256 bytes.
  • the nature of the PCI bus dictates the use of the write line command to ensure optimal system performance.
  • the write line command requires that the Pmo sequencer be capable of transferring a whole multiple (1×, 2×, 3×, . . . ) of cache lines of which the size is set through the Pci configuration registers.
  • Pmo will automatically perform partial bursts until it has aligned the transfers on a cache line boundary at which time it will begin usage of the write line command.
  • the Sram fifo depth of 256 bytes has been chosen in order to allow Pmo to accommodate cache line sizes up to 128 bytes. Provided the cache line size is less than 128 bytes, Pmo will perform multiple, contiguous cache line bursts until it has exhausted the supply of data.
  • Pmo receives requests from two separate sources: the dram to Pci (D2p) module and the Sram to Pci (S2p) module.
  • An operation (See FIG. 27) first begins with prioritization of the requests where the S2p module is given highest priority.
  • the Pmo module takes a Snapshot of the Sram fifo address and uses this to generate read requests for the SramCtrl sequencer.
  • the Pmo module then proceeds to arbitrate for ownership of the Pci bus via the PciMstrIO module. Once the Pmo holding registers have sufficient data and Pci bus mastership has been granted, the Pmo module begins transferring data to the Pci target.
  • For each successful transfer, Pmo sends an acknowledge and encoded size to the master sequencer, allowing it to update its internal pointers, counters and status. Once the Pci burst transaction has terminated, Pmo parks on the Pci bus unless another initiator has requested ownership. Pmo again prioritizes the incoming requests and repeats the process.
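  • A hedged C sketch of the alignment behavior described above follows: partial bursts are issued until the Pci address reaches a cache line boundary, whole cache lines are then sent with the write line command, and any remainder is sent as a plain memory write. The function name, the printed command names and the example cache line size are illustrative assumptions.
    #include <stdio.h>

    /* Hypothetical burst planner: partial bursts until the address is
     * cache-line aligned, then whole cache lines with the write line
     * command, then any tail as a plain memory write. */
    static void plan_bursts(unsigned pci_addr, unsigned bytes, unsigned line)
    {
        unsigned mis = pci_addr % line;
        if (mis) {                               /* leading partial burst      */
            unsigned lead = line - mis;
            if (lead > bytes) lead = bytes;
            printf("memory write  addr=0x%08x len=%u\n", pci_addr, lead);
            pci_addr += lead; bytes -= lead;
        }
        while (bytes >= line) {                  /* aligned whole cache lines  */
            printf("write line    addr=0x%08x len=%u\n", pci_addr, line);
            pci_addr += line; bytes -= line;
        }
        if (bytes)                               /* trailing partial burst     */
            printf("memory write  addr=0x%08x len=%u\n", pci_addr, bytes);
    }

    int main(void)
    {
        plan_bursts(0x1234u, 256u, 64u);         /* 64-byte cache line example */
        return 0;
    }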
  • the Pmi sequencer (See FIG. 28) acts only as a slave sequencer. Servicing requests issued by master sequencers, the Pmi sequencer moves data from a Pci target to an Sram based fifo, via the PciMstrIO module, in bursts of up to 256 bytes.
  • the nature of the PCI bus dictates the use of the read multiple command to ensure optimal system performance.
  • the read multiple command requires that the Pmi sequencer be capable of transferring a cache line or more of data. To accomplish this end, Pmi will automatically perform partial cache line bursts until it has aligned the transfers on a cache line boundary at which time it will begin usage of the read multiple command.
  • the Sram fifo depth of 256 bytes has been chosen in order to allow Pmi to accommodate cache line sizes up to 128 bytes. Provided the cache line size is less than 128 bytes, Pmi will perform multiple, contiguous cache line bursts until it has filled the fifo.
  • Pmi receives requests from two separate sources: the Pci to dram (P2d) module and the Pci to Sram (P2s) module.
  • An operation (See FIG. 28) first begins with prioritization of the requests where the P2s module is given highest priority.
  • the Pmi module then proceeds to arbitrate for ownership of the Pci bus via the PciMstrIO module. Once the Pci bus mastership has been granted and the Pmi holding registers have sufficient data, the Pmi module begins transferring data to the Sram fifo. For each successful transfer, Pmi sends an acknowledge and encoded size to the master sequencer, allowing it to update its internal pointers, counters and status. Once the Pci burst transaction has terminated, Pmi parks on the Pci bus unless another initiator has requested ownership. Pmi again prioritizes the incoming requests and repeats the process.
  • the D2p sequencer acts as a master sequencer. Servicing channel requests issued by the Cpu, the D2p sequencer manages movement of data from dram to the Pci bus by issuing requests to both the Drd sequencer and the Pmo sequencer. Data transfer is accomplished using an Sram based fifo through which data is staged.
  • D2p can receive requests from any of the processor's thirty-two dma channels. Once a command request has been detected, D2p fetches a dma descriptor from an Sram location dedicated to the requesting channel which includes the dram address, Pci address, Pci endian and request size. D2p then issues a request to the Drd sequencer causing the Sram based fifo to fill with dram data. Once the fifo contains sufficient data for a Pci transaction, D2p issues a request to Pmo which in turn moves data from the fifo to a Pci target.
  • FIG. 29 is an illustration showing the major blocks involved in the movement of data from dram to Pci target.
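  • The following C sketch is a hypothetical rendering of the per-channel dma descriptor named above (dram address, Pci address, Pci endian and request size) and of the way D2p stages a request through the 256 byte Sram fifo in pieces. The struct layout, field widths and helper names are assumptions and are not taken from the microcode.
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical layout of the per-channel descriptor fetched from Sram. */
    struct dma_descriptor {
        uint32_t dram_addr;       /* source address in SDRAM            */
        uint64_t pci_addr;        /* destination address on the PCI bus */
        bool     pci_big_endian;
        uint32_t size;            /* total request size in bytes        */
    };

    #define FIFO_BYTES 256u       /* Sram staging fifo depth */

    /* Stand-ins for requests to the Drd and Pmo sequencers. */
    static void drd_fill_fifo(uint32_t dram_addr, uint32_t len)
    { printf("Drd: dram 0x%08x -> fifo, %u bytes\n",
             (unsigned)dram_addr, (unsigned)len); }
    static void pmo_drain_fifo(uint64_t pci_addr, uint32_t len)
    { printf("Pmo: fifo -> pci 0x%llx, %u bytes\n",
             (unsigned long long)pci_addr, (unsigned)len); }

    /* Move the whole request in fifo-sized pieces, as D2p stages it. */
    static void d2p_service(const struct dma_descriptor *d)
    {
        uint32_t done = 0;
        while (done < d->size) {
            uint32_t chunk = d->size - done;
            if (chunk > FIFO_BYTES) chunk = FIFO_BYTES;
            drd_fill_fifo(d->dram_addr + done, chunk);
            pmo_drain_fifo(d->pci_addr + done, chunk);
            done += chunk;
        }
    }

    int main(void)
    {
        struct dma_descriptor d = { 0x00010000u, 0x80000000ull, false, 600u };
        d2p_service(&d);
        return 0;
    }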
  • the P2d sequencer acts as both a slave sequencer and a master sequencer. Servicing channel requests issued by the Cpu, the P2d sequencer manages movement of data from Pci bus to dram by issuing requests to both the Dwr sequencer and the Pmi sequencer. Data transfer is accomplished using an Sram based fifo through which data is staged.
  • P2d can receive requests from any of the processor's thirty-two dma channels. Once a command request has been detected, P2d, operating as a slave sequencer, fetches a dma descriptor from an Sram location dedicated to the requesting channel which includes the dram address, Pci address, Pci endian and request size. P2d then issues a request to Pmi which in turn moves data from the Pci target to the Sram fifo. Next, P2d issues a request to the Dwr sequencer causing the Sram based fifo contents to be written to the dram.
  • FIG. 30 is an illustration showing the major blocks involved in the movement of data from a Pci target to dram.
  • the S2p sequencer (See FIG. 31) acts as both a slave sequencer and a master sequencer. Servicing channel requests issued by the Cpu, the S2p sequencer manages movement of data from Sram to the Pci bus by issuing requests to the Pmo sequencer.
  • S2p can receive requests from any of the processor's thirty-two dma channels. Once a command request has been detected, S2p, operating as a slave sequencer, fetches a dma descriptor from an Sram location dedicated to the requesting channel which includes the Sram address, Pci address, Pci endian and request size. S2p then issues a request to Pmo which in turn moves data from the Sram to a Pci target. The process repeats until the entire request has been satisfied at which time S2p writes ending status into the Sram dma descriptor area and sets the channel done bit associated with that channel. S2p then monitors the dma channels for additional requests.
  • FIG. 31 is an illustration showing the major blocks involved in the movement of data from Sram to Pci target.
  • the P2s sequencer acts as both a slave sequencer and a master sequencer. Servicing channel requests issued by the Cpu, the P2s sequencer manages movement of data from Pci bus to Sram by issuing requests to the Pmi sequencer.
  • P2s can receive requests from any of the processor's thirty-two dma channels. Once a command request has been detected, P2s, operating as a slave sequencer, fetches a dma descriptor from an Sram location dedicated to the requesting channel which includes the Sram address, Pci address, Pci endian and request size. P2s then issues a request to Pmi which in turn moves data from the Pci target to the Sram. The process repeats until the entire request has been satisfied at which time P2s writes ending status into the dma descriptor area of Sram and sets the channel done bit associated with that channel. P2s then monitors the dma channels for additional requests.
  • FIG. 32 is an illustration showing the major blocks involved in the movement of data from a Pci target to Sram.
  • the D2s sequencer (See FIG. 33) acts as both a slave sequencer and a master sequencer. Servicing channel requests issued by the Cpu, the D2s sequencer manages movement of data from dram to Sram by issuing requests to the Drd sequencer.
  • D2s can receive requests from any of the processor's thirty-two dma channels. Once a command request has been detected, D2s, operating as a slave sequencer, fetches a dma descriptor from an Sram location dedicated to the requesting channel which includes the dram address, Sram address and request size. D2s then issues a request to the Drd sequencer causing the transfer of data to the Sram. The process repeats until the entire request has been satisfied at which time D2s writes ending status into the Sram dma descriptor area and sets the channel done bit associated with that channel. D2s then monitors the dma channels for additional requests.
  • FIG. 33 is an illustration showing the major blocks involved in the movement of data from dram to Sram.
  • the S2d sequencer (See FIG. 34) acts as both a slave sequencer and a master sequencer. Servicing channel requests issued by the Cpu, the S2d sequencer manages movement of data from Sram to dram by issuing requests to the Dwr sequencer.
  • S2d can receive requests from any of the processor's thirty-two dma channels. Once a command request has been detected, S2d, operating as a slave sequencer, fetches a dma descriptor from an Sram location dedicated to the requesting channel which includes the dram address, Sram address, checksum reset and request size. S2d then issues a request to the Dwr sequencer causing the transfer of data to the dram. The process repeats until the entire request has been satisfied at which time S2d writes ending status into the Sram dma descriptor area and sets the channel done bit associated with that channel. S2d then monitors the dma channels for additional requests.
  • FIG. 34 is an illustration showing the major blocks involved in the movement of data from Sram to dram.
  • the Psi sequencer acts as both a slave sequencer and a master sequencer. Servicing requests issued by a Pci master, the Psi sequencer manages movement of data from Pci bus to Sram and Pci bus to dram via Sram by issuing requests to the SramCtrl and Dwr sequencers.
  • Psi manages write requests to configuration space, expansion rom, dram, Sram and memory mapped registers. Psi separates these Pci bus operations into two categories with different action taken for each. Dram accesses result in Psi generating a write request to an Sram buffer followed by a write request to the Dwr sequencer. Subsequent write or read dram operations are retry terminated until the buffer has been emptied. An event notification is set for the processor allowing message passing to occur through dram space.
  • All other Pci write transactions result in Psi posting the write information including Pci address, Pci byte marks and Pci data to a reserved location in Sram, then setting an event flag which the event processor monitors. Subsequent writes or reads of configuration, expansion rom, Sram or registers are terminated with retry until the processor clears the event flag. This allows Mojave to keep pipelining levels to a minimum for the posted write and give the processor ample time to modify data for subsequent Pci read operations.
  • FIG. 35 depicts the sequence of events when Psi is the target of a Pci write operation. Note that events 4 through 7 occur only when the write operation targets the dram.
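  • As a hedged model of the posted-write handshake described above, the C sketch below keeps one posting slot and an event flag: a slave write is accepted only when the flag is clear, and subsequent accesses are retry terminated until the processor services the event and clears the flag. All names and the single-slot simplification are assumptions.
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical posted-write slot in Sram plus the event flag the
     * processor must clear before another access is accepted. */
    struct posted_write {
        uint32_t pci_addr;
        uint8_t  byte_marks;
        uint32_t data;
    };

    static struct posted_write slot;
    static bool event_flag;               /* set by Psi, cleared by processor */

    /* Returns true if accepted, false if the PCI access must be retried. */
    static bool psi_slave_write(uint32_t addr, uint8_t be, uint32_t data)
    {
        if (event_flag)
            return false;                 /* retry-terminate: slot still busy */
        slot = (struct posted_write){ addr, be, data };
        event_flag = true;                /* notify the event processor       */
        return true;
    }

    static void processor_service_event(void)
    {
        printf("processor handles write to 0x%08x\n", (unsigned)slot.pci_addr);
        event_flag = false;               /* re-open the slot                 */
    }

    int main(void)
    {
        psi_slave_write(0x100u, 0xF, 0xdeadbeefu);
        printf("second write accepted? %d\n", psi_slave_write(0x104u, 0xF, 1u));
        processor_service_event();
        printf("after clear, accepted? %d\n", psi_slave_write(0x104u, 0xF, 1u));
        return 0;
    }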
  • the Pso sequencer acts as both a slave sequencer and a master sequencer. Servicing requests issued by a Pci master, the Pso sequencer manages movement of data to Pci bus from Sram and to Pci bus from dram via Sram by issuing requests to the SramCtrl and Drd sequencers.
  • Pso manages read requests to configuration space, expansion rom, dram, Sram and memory mapped registers. Pso separates these Pci bus operations into two categories with different action taken for each. Dram accesses result in Pso generating a read request to the Drd sequencer followed by a read request to the Sram buffer. Subsequent write or read dram operations are retry terminated until the buffer has been emptied.
  • All other Pci read transactions result in Pso posting the read request information including Pci address and Pci byte marks to a reserved location in Sram, then setting an event flag which the event processor monitors. Subsequent writes or reads of configuration, expansion rom, Sram or registers are terminated with retry until the processor clears the event flag. This allows Mojave to use a microcoded response mechanism to return data for the request. The processor decodes the request information, formulates or fetches the requested data and stores it in Sram then clears the event flag allowing Pso to fetch the data and return it on the Pci bus.
  • FIG. 36 depicts the sequence of events when Pso is the target of a Pci read operation.
  • the receive sequencer (RcvSeq)(See FIG. 37) analyzes and manages incoming packets, stores the result in dram buffers or sram buffers, then notifies the processor through the receive queue (RcvQ) mechanism.
  • the process begins when a buffer descriptor is available at the output of the FreeQ ( 1 ).
  • RcvSeq issues a request to the Qmg ( 2 ) which responds by supplying the buffer descriptor to RcvSeq ( 3 ).
  • RcvSeq then waits for a receive packet ( 4 ).
  • the Mac, network, transport and session information is analyzed as each byte is received ( 4 ) and stored in the assembly register (AssyReg).
  • When sixteen bytes of information are available, RcvSeq requests a write of the data to the Sram ( 5 ). In normal mode, when sufficient data has been stored in the Sram based receive fifo, a Dram write request is issued to Dwr ( 8 ). The process continues until the entire packet has been received at which point RcvSeq stores the results of the packet analysis in the beginning of the receive buffer. Once the buffer and status have both been stored, RcvSeq issues a write-queue request to Qmg ( 12 ) using a QId based on the priority level of the incoming packet detected by RcvSeq.
  • Qmg responds by storing a buffer descriptor ( 15 ) and, in normal mode, a status vector provided by RcvSeq ( 13 ).
  • When QHashEn is set, RcvSeq will merge the CtxHash with the receive descriptor. The process then repeats. If RcvSeq detects the arrival of a packet before a free buffer is available, it ignores the packet and sets the PktMissed status bit for the next received packet.
  • FIG. 37 depicts the sequence of events for successful reception of a packet.
  • FIG. 39 is a definition of the receive buffer.
  • FIG. 40 is a definition of the receive buffer descriptor as stored on the RcvQ.
  • FIG. 41 is a diagram that illustrates a receive vector.
  • the receive sequencer (See FIG. 37) analyzes the vlan priorities of the incoming packets, and stores the receive descriptor in one of its receive queues according to the value written to the PriLevels bits of the RcvCfg register as represented in FIG. 38.
  • Rev. A of Mojave has a bug which limits receive queues to 0 and 1.
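  • The C sketch below is a simplified, hypothetical model of the receive flow of FIG. 37: a buffer descriptor is popped from the free queue, the packet and its analysis results are stored, and a receive descriptor is pushed onto the queue selected by the packet's priority level, with the PktMissed case handled when no free buffer is available. The helper functions and the descriptor packing are illustrative assumptions only.
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Stand-ins for the Qmg requests used by RcvSeq. */
    static bool free_q_pop(uint32_t *buf_dsc);
    static void rcv_q_push(unsigned qid, uint32_t dsc);

    static bool pkt_missed;                 /* reported with the next packet   */

    static void rcvseq_packet(unsigned prio_level, uint32_t ctx_hash, bool qhash_en)
    {
        uint32_t buf;
        if (!free_q_pop(&buf)) {            /* no free buffer: drop the frame  */
            pkt_missed = true;
            return;
        }
        /* ... Mac/network/transport analysis, Sram fifo and Dram writes, and
         * the status stored at the start of the receive buffer go here ...   */
        uint32_t dsc = buf;                 /* descriptor carries the buffer   */
        if (qhash_en)
            dsc |= (ctx_hash & 0xffffu) << 16;  /* illustrative hash merge     */
        if (pkt_missed)
            dsc |= 1u;                      /* illustrative PktMissed packing  */
        pkt_missed = false;
        rcv_q_push(prio_level, dsc);        /* QId chosen from priority level  */
    }

    /* Trivial stand-ins so the sketch links and runs. */
    static bool free_q_pop(uint32_t *buf_dsc) { *buf_dsc = 0x1000u; return true; }
    static void rcv_q_push(unsigned qid, uint32_t dsc)
    { printf("RcvQ[%u] <- 0x%08x\n", qid, (unsigned)dsc); }

    int main(void) { rcvseq_packet(2, 0x3abcu, true); return 0; }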
  • the transmit sequencer (XmtSeq)(See FIG. 42) manages outgoing packets, using buffer descriptors retrieved from, in order of priority, the urgent descriptor register (XmtUrgDscr) followed by the transmit queues (XmtQ) priority 3 down to priority 0 , then storing the descriptor for the freed buffer in the free buffer queue (FreeQ).
  • the process begins when a buffer descriptor is available at, for example, the output of XmtQ2 ( 1 ).
  • XmtSeq issues a request to the Qmg ( 2 ) which responds by supplying the buffer descriptor to XmtSeq ( 4 ).
  • XmtSeq then issues a read request to the Drd ( 5 ) sequencer.
  • XmtSeq issues a read request to SramCtrl ( 6 ) then instructs the Mac ( 10 ) to begin frame transmission.
  • XmtSeq stores the buffer descriptor on the FreeQ ( 12 ) thereby recycling the buffer. If XmtSeq detects a data-late condition or a collision, the packet is retransmitted automatically.
  • FIG. 42 depicts the sequence of events for successful transmission of a packet.
  • FIG. 43 is a diagram of the transmit descriptor.
  • FIG. 44 is a diagram of the merge descriptor.
  • FIG. 45 is a diagram of the transmit buffer format.
  • FIG. 46 is a diagram of the transmit vector.
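  • A minimal C sketch of the transmit descriptor selection order described above (the urgent descriptor register first, then the transmit queues priority 3 down to priority 0) follows. The register and queue helpers are stand-ins, not the actual hardware interfaces.
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    static bool     xmt_urg_valid;              /* urgent descriptor register */
    static uint32_t xmt_urg_dscr;
    static bool     xmt_q_pop(int prio, uint32_t *dsc);  /* stand-in for Qmg */

    /* Pick the next buffer descriptor in priority order. */
    static bool xmtseq_next(uint32_t *dsc)
    {
        if (xmt_urg_valid) {                    /* urgent register wins       */
            *dsc = xmt_urg_dscr;
            xmt_urg_valid = false;
            return true;
        }
        for (int prio = 3; prio >= 0; prio--)   /* XmtQ3 down to XmtQ0        */
            if (xmt_q_pop(prio, dsc))
                return true;
        return false;                           /* nothing to transmit        */
    }

    /* Trivial stand-in: only priority 2 has a descriptor queued. */
    static bool xmt_q_pop(int prio, uint32_t *dsc)
    { if (prio != 2) return false; *dsc = 0x2000u; return true; }

    int main(void)
    {
        uint32_t d;
        if (xmtseq_next(&d))
            printf("transmit descriptor 0x%08x\n", (unsigned)d);
        return 0;
    }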
  • Mojave includes special hardware assist for the implementation of message and pointer queues.
  • the hardware assist is called the queue manager (Qmg) (See FIG. 47) and manages the movement of queue entries between Cpu and Sram, between Xcv sequencers and Sram as well as between Sram and Dram.
  • Queues comprise three distinct entities: the queue head (QHd), the queue tail (QTl) and the queue body (QBdy).
  • QHd resides in 64 bytes of scratch ram and provides the area to which entries will be written (pushed).
  • QTl resides in 64 bytes of scratch ram and contains queue locations from which entries will be read (popped).
  • QBdy resides in dram and contains locations for expansion of the queue in order to minimize the Sram space requirements.
  • the QBdy size depends upon the queue being accessed and the initialization parameters presented during queue initialization.
  • Qmg (See FIG. 47) accepts operation requests from the Cpu, the XcvSeqs and the DmaSeqs, executing these operations at a frequency of 100 MHz.
  • Valid Cpu operations include initialize queue (InitQ), write queue (WrQ) and read queue (RdQ).
  • Valid dma requests include read queue (RdQ), read body (RdBdy) and write body (WrBdy).
  • Qmg, working in unison with Q2d and D2q, generates requests to the Dwr and Drd sequencers to control the movement of data between the QHd, QTl and QBdy.
  • the first 8 queues are dedicated to a specific function as shown in FIG. 48.
  • FIG. 47 shows the major functions of Qmg.
  • the arbiter selects the next operation to be performed.
  • the dual-ported Sram holds the queue variables HdWrAddr, HdRdAddr, TlWrAddr, TlRdAddr, BdyWrAddr, BdyRdAddr and QSz.
  • Qmg accepts an operation request, fetches the queue variables from the queue ram (Qram), modifies the variables based on the current state and the requested operation then updates the variables and issues a read or write request to the Sram controller.
  • the Sram controller services the requests by writing the tail or reading the head and returning an acknowledge.
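  • The following C sketch is a hypothetical model of a Qmg operation cycle: the queue variables are read, modified according to the requested operation (WrQ at the head, RdQ at the tail) and written back, with the body spill to dram only indicated by a message. Entry widths, the head and tail sizes in entries, and the meaning given to QSz here are assumptions.
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical per-queue variable block, mirroring the names held in
     * the dual-ported queue ram (Qram). Widths are assumptions. */
    struct q_vars {
        uint32_t HdWrAddr, HdRdAddr;   /* head region in scratch ram */
        uint32_t TlWrAddr, TlRdAddr;   /* tail region in scratch ram */
        uint32_t BdyWrAddr, BdyRdAddr; /* body region in dram        */
        uint32_t QSz;                  /* entry count (assumption)   */
    };

    #define HEAD_ENTRIES 16u           /* 64 bytes of 32-bit entries */

    /* WrQ: push an entry at the head.  The spill of entries to the dram
     * body (WrBdy) is only indicated; Q2d/D2q and Dwr/Drd do the move. */
    static void qmg_write(struct q_vars *q, uint32_t entry)
    {
        printf("Sram[head 0x%x] <- 0x%08x\n",
               (unsigned)q->HdWrAddr, (unsigned)entry);
        q->HdWrAddr = (q->HdWrAddr + 1) % HEAD_ENTRIES;
        q->QSz++;
        if (q->HdWrAddr == q->HdRdAddr)
            printf("head full: request WrBdy to move entries to dram\n");
    }

    /* RdQ: pop an entry from the tail.  The RdBdy refill of the tail
     * from the dram body is not modeled here. */
    static bool qmg_read(struct q_vars *q)
    {
        if (q->QSz == 0)
            return false;              /* queue empty */
        printf("Sram[tail 0x%x] -> entry\n", (unsigned)q->TlRdAddr);
        q->TlRdAddr = (q->TlRdAddr + 1) % HEAD_ENTRIES;
        q->QSz--;
        return true;
    }

    int main(void)
    {
        struct q_vars q = {0};
        qmg_write(&q, 0xabcdu);
        qmg_read(&q);
        return 0;
    }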
  • DMA operations are accomplished by seven dma sequencers (DmaSeq). Commands are sent to these sequencers via hardware queues.
  • the queue Ids are fixed in hardware and are as shown in FIG. 49.
  • Microcode will initiate a DMA by writing a command to the appropriate queue.
  • the DMA sequencer will read a command from the queue, and fetch the descriptor block from Sram. It will then do the DMA.
  • the DMA sequencer will terminate the DMA.
  • the DMA Context byte (bits 31:24 of the command) will be written to the termination queue indicated by bits 20:16 of the command.
  • Each entry in the termination queue is 32 bits, but only the least significant 8 bits (7:0) are used and written with the DMA Context.
  • if the DMA completes with an error, the termination queue will not be written. Instead a bit in the DMA Error Register will be set. This is a 32 bit register and the least significant 5 bits of the DMA context will be used to decide which bit should be written in the following manner:
  • if the DMA chain bit is set, the DMA descriptor block is updated, but no other termination information is written. If the DMA chain bit is set and the DMA completes with an error, the DMA descriptor block is updated, and the error is propagated to subsequent DMA commands until the sequencer finds one that does not have the chain bit, when the DMA Error Register will be written as above, without writing to the termination queue.
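  • As an illustration of the termination rules above, the hedged C sketch below writes the 8 bit DMA context to the termination queue selected by bits 20:16 of the command on success, and on error sets the DMA Error Register bit selected by the least significant 5 bits of the context. The printed queue model is a stand-in.
    #include <stdint.h>
    #include <stdio.h>

    static uint32_t dma_error_reg;               /* 32-bit DMA Error Register  */

    static void dma_terminate(uint32_t command, int error)
    {
        uint8_t  context  = (command >> 24) & 0xff;   /* bits 31:24 */
        unsigned term_qid = (command >> 16) & 0x1f;   /* bits 20:16 */

        if (!error)
            printf("TermQ[%u] <- 0x%02x\n", term_qid, context);
        else
            dma_error_reg |= 1u << (context & 0x1f);  /* low 5 bits pick the bit */
    }

    int main(void)
    {
        dma_terminate(0xA5050000u, 0);   /* success: context 0xA5 to queue 5     */
        dma_terminate(0xA5050000u, 1);   /* error: sets bit 5 of error register  */
        printf("DMA Error Register = 0x%08x\n", (unsigned)dma_error_reg);
        return 0;
    }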
  • FIG. 54 shows the major blocks of PCI logic and their relationships. The blocks of FIG. 54 are as follows:
  • Slave Dram Interface This block controls the interface to Dram when Dram is being accessed directly by the host or by another PCI master.
  • Slave Sram Interface This block controls the access to Sram for PCI slave accesses to read Sram, or to read or write Dram.
  • Pci Configuration Registers This block contains the configuration registers that control the PCI space.
  • DMA Master In This block does PCI master transfers on behalf of the P2D and P2S DMA sequencers. There is synchronization logic to synchronize between the PCI bus and the SRAM which are being clocked by different clocks. It has 256 bytes of buffering to minimize latencies caused by this synchronization.
  • DMA Master Out This block does PCI master transfers on behalf of the D2P and S2P DMA sequencers. There is synchronization logic to synchronize between the PCI bus and the SRAM which are being clocked by different clocks. It has 256 bytes of buffering to minimize latencies caused by this synchronization.
  • PCI Slave Interface This block has the state machine for PCI slave accesses to Mojave, from the host or from another PCI master.
  • PCI Parity This block generates and checks parity on the PCI bus.
  • PCI Master Interface This block has the state machine for PCI master accesses to host memory or to another PCI slave, done on behalf of the DMA sequencers.

Abstract

A TCP/IP offload network interface device (NID) is integrated with a processing device that executes a stack. The TCP/IP offload NID can either be a full TCP/IP offload device or a partial TCP/IP offload device. Common types of packets are processed by the NID in a fast-path such that the stack is offloaded of TCP and IP protocol processing tasks. A hash is made from the packet header and is pushed onto a queue. The hash is later popped off the queue and is used to identify an associated TCB number from a hash table. A mechanism caches hash buckets in SRAM and stores other hash buckets in DRAM. An “IN SRAM CAM” is used to determine whether the TCB associated with the identified TCB number is cached in SRAM or whether it must be moved from DRAM into the SRAM cache. A lock table and a “lock table CAM” mechanism are disclosed that facilitate multiple processors working on the protocol processing of a single packet.

Description

    CROSS-REFERENCE TO RELATED APPLICATIONS
  • This application claims the benefit under 35 U.S.C. § 119(e) of Provisional Application Serial No. 60/374,788, filed Apr. 22, 2002. The complete disclosure of Provisional Application Serial No. 60/374,788 is incorporated herein by reference.[0001]
  • CROSS-REFERENCE TO COMPACT DISC APPENDIX
  • Compact Disc Appendix, which is a part of the present disclosure, includes a recordable Compact Disc (CD-R) containing information that is part of the disclosure of the present patent document. A portion of the disclosure of this patent document contains material that is subject to copyright protection. All the material on the Compact Disc is hereby expressly incorporated by reference into the present application. The copyright owner of that material has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights.[0002]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings, in which: [0003]
  • FIG. 1 is a diagram of a [0004] system 1 in accordance with one embodiment of the present invention.
  • FIG. 2 is a simplified diagram of various structures and steps involved in the processing of an incoming packet in accordance with an embodiment of the present invention. [0005]
  • FIG. 3 is a flowchart of a method in accordance with an embodiment of the present invention. [0006]
  • FIGS. 4, 5, [0007] 6, 7, 8 and 9 are diagrams that illustrate various system configurations involving a network interface device in accordance with the present invention.
  • DETAILED DESCRIPTION
  • FIG. 1 is a simplified diagram of a [0008] system 1 in accordance with a first embodiment. System 1 is coupled to a packet-switched network 2. Network 2 can, for example, be a local area network (LAN) and/or a collection of networks. Network 2 can, for example, be the Internet. Network 2 can, for example, be an IP-based SAN that runs iSCSI. Network 2 may, for example, be coupled to system 1 via media that communicates electrical signals, via fiber optic cables, and/or via a wireless communication channel. System 1 includes a network interface device (NID) 3 as well as a central processing unit (CPU) 4. CPU 4 executes software stored in storage 5. NID 3 is coupled to CPU 4 and storage 5 via host bus 6, a bridge 7, and local bus 8. Host bus 6 may, for example, be a PCI bus or another computer expansion bus.
  • In the illustrated particular embodiment, [0009] NID 3 includes an application specific integrated circuit (ASIC) 9, an amount of dynamic random access memory (DRAM) 10, and Physical Layer Interface (PHY) circuitry 11. NID 3 includes specialized protocol accelerating hardware for implementing “fast-path” processing whereby certain types of network communications are accelerated in comparison to “slow-path” processing whereby the remaining types of network communications are handled at least in part by a software protocol processing stack. In one embodiment, the certain types of network communications accelerated are TCP/IP communications. The embodiment of NID 3 illustrated in FIG. 1 is therefore sometimes called a TCP/IP Offload Engine (TOE).
  • For additional information on examples of a network interface device (sometimes called an Intelligent Network Interface Card or “INIC”), see: U.S. Pat. No. 6,247,060; U.S. Pat. No. 6,226,680; Published U.S. Patent Application No. 20010021949; Published U.S. Patent Application No. 20010027496; and Published U.S. Patent Application No. 20010047433 (the contents of each of the above-identified patents and published patent applications is incorporated herein by reference). [0010] System 1 of FIG. 1 employs techniques set forth in these documents for transferring control of TCP/IP connections between a protocol processing stack and a network interface device.
  • NID [0011] 3 includes Media Access Control circuitry 12, three processors 13-15, a pair of Content Addressable Memories (CAMs) 16 and 17, an amount of Static Random Access Memory (SRAM) 18, queue manager circuitry 19, a receive processor 20, and a transmit sequencer 21. Receive processor 20 executes code stored in its own control store 22.
  • In some embodiments where [0012] NID 3 fully offloads or substantially fully offloads CPU 4 of the task of performing TCP/IP protocol processing, NID 3 includes a processor 23. Processor 23 may, for example, be a general purpose microprocessor. Processor 23 performs slow-path processing such as TCP error condition handling and exception condition handling. In some embodiments, processor 23 also performs higher layer protocol processing such as, for example, iSCSI layer protocol processing such that NID 3 offloads CPU 4 of all iSCSI protocol processing tasks. In the example of FIG. 1, CPU 4 executes code that implements a file system, and processor 23 executes code that implements a protocol processing stack that includes an iSCSI protocol processing layer.
  • Overview of One Embodiment Of A Fast-Path Receive Path: [0013]
  • Operation of NID [0014] 3 is now described in connection with the receipt onto NID 3 of a TCP/IP packet from network 2. DRAM 10 is initially partitioned to include a plurality of buffers. Receive processor 20 uses the buffers in DRAM 10 to store incoming network packet data as well as status information for the packet. For each buffer, a 32-bit buffer descriptor is created. Each 32-bit buffer descriptor indicates the size of the associated buffer and the location in DRAM of the associated buffer. The location is indicated by a 19-bit pointer.
  • At start time, the buffer descriptors for the free buffers are pushed onto a “free-buffer queue” [0015] 24. This is accomplished by writing the buffer descriptors to queue manager 19. Queue manager 19 maintains multiple queues including the “free-buffer queue” 24. In this implementation, the heads and tails of the various queues are located in SRAM 18, whereas the middle portions of the queues are located in DRAM 10.
  • The TCP/IP packet is received from the [0016] network 2 via Physical Layer Interface (PHY) circuitry 11 and MAC circuitry 12. As the MAC circuitry 12 processes the packet, the MAC circuitry 12 verifies checksums in the packet and generates “status” information. After all the packet data has been received, the MAC circuitry 12 generates “final packet status” (MAC packet status). The status information (also called “protocol analyzer status”) and the MAC packet status information is then transferred to a free one of the DRAM buffers obtained from the free-buffer queue 24. The status information and MAC packet status information is stored prepended to the associated data in the buffer.
  • After all packet data has been transferred to the free DRAM buffer, receive [0017] processor 20 pushes a “receive packet descriptor” (also called a “summary”) onto a “receive packet descriptor” queue 25. The “receive packet descriptor” includes a 14-bit hash value, the buffer descriptor, a buffer load-count, the MAC ID, and a status bit (also called an “attention bit”). The 14-bit hash value was previously generated by the receive processor 20 (from the TCP and IP source and destination addresses) as the packet was received. If the “attention bit” of the receive packet descriptor is a one, then the packet is not a “fast-path candidate”; whereas if the attention bit is a zero, then the packet is a “fast-path candidate”. In the present example of a TCP/IP offload engine, the attention bit being a zero indicates that the packet employs both the TCP protocol and the IP protocol.
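  • For illustration only, the C struct below is a hypothetical, unpacked rendering of the fields carried by the “receive packet descriptor” (hash, buffer descriptor, buffer load-count, MAC ID and attention bit); the actual hardware packs these into a queue entry whose exact bit layout is not reproduced here.
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical, unpacked view of a receive packet descriptor ("summary"). */
    struct rcv_pkt_descriptor {
        uint16_t hash;          /* 14-bit hash of TCP/IP source/destination */
        uint32_t buf_dsc;       /* buffer size + 19-bit DRAM buffer pointer */
        uint8_t  load_count;
        uint8_t  mac_id;
        bool     attention;     /* 0 = fast-path candidate (TCP and IP)     */
    };

    int main(void)
    {
        struct rcv_pkt_descriptor d = { 0x1a2b, 0x0004f000u, 1, 0, false };
        printf("fast-path candidate: %s\n", d.attention ? "no" : "yes");
        return 0;
    }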
  • Once the “receive packet descriptor” (including the buffer descriptor that points to the DRAM buffer where the data is stored) has been placed in the “receive packet descriptor” [0018] queue 25 and the packet data has been placed in the associated DRAM buffer, one of the processors 13 and 14 can retrieve the “receive packet descriptor” from the “receive packet descriptor” queue 25 and examine the “attention bit”.
  • If the attention bit is a digital one, then the processor determines that the packet is not a “fast-path candidate” and the packet is handled in “slow-path”. In one embodiment where the packet is a TCP/IP packet, wherein the attention bit indicates the packet is not a “fast-path candidate”, and where [0019] NID 3 performs full offload TCP/IP functions, general purpose processor 23 performs further protocol processing on the packet (headers and data). In another embodiment where there is no general purpose processor 23 and where NID 3 performs partial TCP/IP functions, the entire packet (headers and data) are transferred from the DRAM buffer and across host bus 6 such that CPU 4 performs further protocol processing on the packet.
  • If, on the other hand, the attention bit is a zero, then the processor determines that the packet is a “fast-path candidate”. If the processor determines that the packet is a “fast-path candidate”, then the processor uses the buffer descriptor from the “receive packet descriptor” to initiate a DMA transfer of the first approximately 96 bytes of information from the pointed to buffer in [0020] DRAM 10 into a portion of SRAM 18 so that the processor can examine it. This first approximately 96 bytes contains the IP source address of the IP header, the IP destination address of the IP header, the TCP source address (“TCP source port”) of the TCP header, and the TCP destination address (“TCP destination port”) of the TCP header. The IP source address of the IP header, the IP destination address of the IP header, the TCP source address of the TCP header, and the TCP destination address of the TCP header together uniquely define a single “connection context” with which the packet is associated.
  • While this DMA transfer from DRAM to SRAM is occurring, the processor uses the 14-bit hash from the “receive packet descriptor” to identify the connection context of the packet and to determine whether the connection context is one of a plurality of connection contexts that are under the control of [0021] NID 3. The hash points to one hash bucket in a hash table 104 in SRAM 18. In the diagram of FIG. 1, each row of the hash table 104 is a hash bucket. Each hash bucket contains one or more hash table entries. If the hash identifies a hash bucket having more than one hash table entry (as set forth below in further detail), then the processor attempts to match the IP source address, IP destination address, TCP source address (port), and TCP destination address (port) retrieved from DRAM with the same fields, i.e., the IP source address, IP destination address, TCP source port, and TCP destination port of each hash table entry. The hash table entries in the hash bucket are searched one by one in this manner until the processor finds a match. When the processor finds a matching hash table entry, a number stored in the hash table entry (called a “transmit control block number” or “TCB number”) identifies a block of information (called a TCB) related to the connection context of the packet. There is one TCB maintained on NID 3 for each connection context under the control of NID 3.
  • If the connection context is determined not to be one of the contexts under the control of [0022] NID 3, then the “fast-path candidate” packet is determined not to be an actual “fast-path packet.” In one embodiment where NID 3 includes general purpose processor 23 and where NID 3 performs full TCP/IP offload functions, processor 23 performs further TCP/IP protocol processing on the packet. In another embodiment where NID 3 performs partial TCP/IP offload functions, the entire packet (headers and data) is transferred across host bus 6 for further TCP/IP protocol processing by the sequential protocol processing stack of CPU 4.
  • If, on the other hand, the connection context is one of the connection contexts under control of [0023] NID 3, then software executed by the processor (13 or 14) checks for one of numerous exception conditions and determines whether the packet is a “fast-path packet” or is not a “fast-path packet”. These exception conditions include: 1) IP fragmentation is detected; 2) an IP option is detected; 3) an unexpected TCP flag (urgent bit set, reset bit set, SYN bit set or FIN bit set) is detected; 4) the ACK field in the TCP header shrinks the TCP window; 5) the ACK field in the TCP header is a duplicate ACK and the ACK field exceeds the duplicate ACK count (the duplicate ACK count is a user settable value); and 6) the sequence number of the TCP header is out of order (packet is received out of sequence).
  • If the software executed by the processor ([0024] 13 or 14) detects an exception condition, then the processor determines that the “fast-path candidate” is not a “fast-path packet.” In such a case, the connection context for the packet is “flushed” (control of the connection context is passed back to the stack) so that the connection context is no longer present in the list of connection contexts under control of NID 3. If NID 3 is a full TCP/IP offload device including general purpose processor 23, then general purpose processor 23 performs further TCP/IP processing on the packet. In other embodiments where NID 3 performs partial TCP/IP offload functions and NID 3 includes no general purpose processor 23, the entire packet (headers and data) is transferred across host bus 6 to CPU 4 for further “slow-path” protocol processing.
  • If, on the other hand, the processor ([0025] 13 or 14) finds no such exception condition, then the “fast-path candidate” packet is determined to be an actual “fast-path packet”. The processor executes a software state machine such that the packet is processed in accordance with the IP and TCP protocols. The data portion of the packet is then DMA transferred to a destination identified by another device or processor. In the present example, the destination is located in storage 5 and the destination is identified by a file system controlled by CPU 4. CPU 4 does no or very little analysis of the TCP and IP headers on this “fast-path packet”. All or substantially all analysis of the TCP and IP headers of the “fast-path packet” is done on NID 3.
  • Description Of A TCB Lookup Method:
  • As set forth above, information for each connection context under the control of [0026] NID 3 is stored in a block called a “Transmit Control Block” (TCB). An incoming packet is analyzed to determine whether it is associated with a connection context that is under the control of NID 3. If the packet is associated with a connection context under the control of NID 3, then a TCB lookup method is employed to find the TCB for the connection context. This lookup method is described in further detail in connection with FIGS. 2 and 3.
  • [0027] NID 3 is a multi-receive processor network interface device. In NID 3, up to sixteen different incoming packets can be in process at the same time by two processors 13 and 14. (Processor 15 is a utility processor, but each of processors 13 and 14 can perform receive processing or transmit processing.) A processor executes a software state machine to process the packet. As the packet is processed, the state machine transitions from state to state. One of the processors, for example processor 13, can work on one of the packets being received until it reaches a stopping point. Processor 13 then stops work and stores the state of the software state machine. This stored state is called a “processor context”. Then, at some later time, either the same processor 13 or the other processor 14 may resume processing on the packet. In the case where the other processor 14 resumes processing, processor 14 retrieves the prior state of the state machine from the previous “processor context”, loads this state information into its software state machine, and then continues processing the packet through the state machine from that point. In this way, up to sixteen different flows can be processed by the two processors 13 and 14 working in concert.
  • In this example, the TCB lookup method starts after the TCP packet has been received, after the 14-bit hash and the attention bit has been generated, and after the hash and attention bit have been pushed in the form of a “receive packet descriptor” onto the “receive packet descriptor queue”. [0028]
  • In a first step (step [0029] 200), one of processors 13 or 14 obtains an available “processor context”. The processor pops (step 201) the “receive packet descriptor” queue 25 to obtain the “receive packet descriptor”. The “receive packet descriptor” contains the previously-described 14-bit hash value 101 (see FIG. 2) and the previously-described attention bit. The processor checks the attention bit.
  • If the attention bit is set (step [0030] 202), then processing proceeds to slow-path processing. As set forth above, if NID 3 is a TCP/IP full-offload device and if the packet is a TCP/IP packet, then further TCP/IP processing is performed by general purpose processor 23. As set forth above, if NID 3 is a TCP/IP partial offload device, then the packet is sent across host bus 6 for further protocol processing by CPU 4.
  • If, on the other hand, the attention bit is not set (step [0031] 203), then the processor initiates a DMA transfer of the beginning part of the packet (including the header) from the identified buffer in DRAM 10 to SRAM 18. 14-bit hash value 101 (see FIG. 2) actually comprises a 12-bit hash value 102 and another two bits 103. The 12-bit hash value (bits [13:2]) identifies an associated one of 4096 possible 64-byte hash buckets. In this embodiment, up to 48 of these hash buckets can be cached in SRAM in a hash table 104, whereas any additional used hash buckets 105 are stored in DRAM 10. Accordingly, if the hash bucket identified by the 12-bit hash value is in DRAM 10, then the hash bucket is copied (or moved) from DRAM 10 to an available row in hash table 104. To facilitate this, there is a hash byte (SRAM_hashbt) provided in SRAM for each of the possible 4096 hash buckets. A six-bit pointer field in the hash byte indicates whether the associated hash bucket is located in SRAM or not. If the pointer field contains a number between 1 and 48, then the pointer indicates the row of hash table 104 where the hash bucket is found. If the pointer field contains the number zero, then the hash bucket is not in hash table 104 but rather is in DRAM. The processor uses the 12-bit hash value 102 to check the associated hash byte to see if the pointed to hash bucket is in the SRAM hash table 104 (step 204).
  • If the hash bucket is in the SRAM hash table [0032] 104 (step 205), then processing is suspended until the DMA transfer of the header from DRAM to SRAM is complete.
  • If, on the other hand, the hash bucket is not in the SRAM hash table [0033] 104 (step 206), then a queue (Q_FREEHASHSLOTS) identifying free rows in hash table 104 is accessed (the queue is maintained by queue manager 19) and a free hash bucket row (sometimes called a “slot”) is obtained. The processor then causes the hash bucket to be copied or moved from DRAM and into the free hash bucket row. Once the hash bucket is present in SRAM hash table 104, the processor updates the pointer field in the associated hash byte to indicate that the hash bucket is now in SRAM and is located at the row now containing the hash bucket.
  • Once the pointed to hash bucket is in SRAM hash table [0034] 104, the up to four possible hash bucket entries in the hash bucket are searched one by one (step 207) to identify if the TCP and IP fields of an entry match the TCP and IP fields of the packet header 106 (the TCP and IP fields from the packet header were obtained from the receive descriptor).
  • In the example of FIG. 2, the pointed to hash bucket contains two hash entries. The hash entries are checked one by one. The two [0035] bits 103 (bits [1:0] of the 14-bit hash) are used to determine which of the four possible hash table entry rows (i.e., slots) to check first. In FIG. 2, the second hash entry 107 (shown in exploded view) is representative of the other hash table entries. It includes a 16-bit TCB# 108, a 32-bit IP destination address, a 32-bit IP source address, a 16-bit TCP destination port, and a 16-bit TCP source port.
  • If all of the entries in the hash bucket are searched and a match is not found (step [0036] 208), then processing proceeds by the slow-path. If, on the other hand, a match is found (step 209), then the TCB# portion 108 of the matching entry identifies the TCB of the connection context.
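  • The C sketch below is a hedged model of the bucket search just described: the 12 bit hash (bits [13:2]) selects a bucket of up to four entries, bits [1:0] pick the slot checked first, and the packet's TCP and IP addresses are compared against each valid entry until a TCB number is found. The hash byte indirection and the 48-bucket SRAM cache are not modeled; the structure names and the flat table are assumptions.
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    struct hash_entry {
        bool     valid;
        uint16_t tcb_num;
        uint32_t ip_dst, ip_src;
        uint16_t tcp_dst, tcp_src;
    };

    struct hash_bucket { struct hash_entry entry[4]; };

    /* 4096 buckets; the partial caching of buckets in Sram is glossed over. */
    static struct hash_bucket hash_table[4096];

    static bool tcb_lookup(uint16_t hash14,
                           uint32_t ip_src, uint32_t ip_dst,
                           uint16_t tcp_src, uint16_t tcp_dst,
                           uint16_t *tcb_num)
    {
        struct hash_bucket *b = &hash_table[(hash14 >> 2) & 0xfff];
        unsigned first = hash14 & 0x3;            /* bits [1:0] pick start slot */
        for (unsigned i = 0; i < 4; i++) {
            struct hash_entry *e = &b->entry[(first + i) & 0x3];
            if (e->valid && e->ip_src == ip_src && e->ip_dst == ip_dst &&
                e->tcp_src == tcp_src && e->tcp_dst == tcp_dst) {
                *tcb_num = e->tcb_num;
                return true;                      /* match: connection under NID */
            }
        }
        return false;                             /* no match: slow-path         */
    }

    int main(void)
    {
        hash_table[0x123].entry[1] =
            (struct hash_entry){ true, 42, 0x0a000001u, 0x0a000002u, 80, 1234 };
        uint16_t tcb;
        uint16_t hash14 = (uint16_t)((0x123 << 2) | 1);
        if (tcb_lookup(hash14, 0x0a000002u, 0x0a000001u, 1234, 80, &tcb))
            printf("TCB number %u\n", (unsigned)tcb);
        return 0;
    }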
  • [0037] NID 3 supports both fast-path receive processing as well as fast-path transmit processing. A TCP/IP connection can involve bidirectional communications in that packets might be transmitted out of NID 3 on the same TCP/IP connection that other packets flow into NID 3. A mechanism is provided so that the context for a connection can be “locked” by one processor (for example, a processor receiving a packet on the TCP/IP connection) so that another processor (for example, a processor transmitting a packet on the same TCP/IP connection) will not interfere with the connection context. This mechanism includes two bits for each of the up to 8192 connections that can be controlled by NID 3: 1) a “TCB lock bit” (SRAM_tcblock), and 2) a “TCB in-use bit” (SRAM_tcbinuse). The “TCB lock bits” 109 and the “TCB in-use bits” 110 are maintained in SRAM 18.
  • The processor attempts to lock the designated TCB (step [0038] 210) by attempting to set the TCB's lock bit. If the lock bit indicates that the TCB is already locked, then the processor context number (a 4-bit number) is pushed onto a linked list of waiting processor contexts for that TCB. Because there are sixteen possible processor contexts, a lock table 112 is maintained in SRAM 18. There is one row in lock table 112 for each of the sixteen possible processor contexts. Each row has sixteen four-bit fields. Each field can contain the 4-bit processor context number for a waiting processor context. Each row of the lock table 112 is sixteen entries wide because all sixteen processor contexts may be working on or waiting for the same TCB.
  • If the lock bit indicates that the TCB is already locked (step [0039] 211), then the processor context number (a four-bit number because there can be up to sixteen processor contexts) is pushed onto the row of the lock table 112 associated with the TCB. A lock table content addressable memory (CAM) 111 is used to translate the TCB number (from TCB field 108) into the row number in lock table 112 where the linked list for that TCB number is found. Accordingly, lock table CAM 111 receives a sixteen-bit TCB number and outputs a four-bit row number. When the processor context that has the TCB locked is ready to suspend itself, it consults the lock table CAM 111 and the associated lock table 112 to determine if there is another processor context waiting for the TCB. If there is another processor context waiting (there is an entry in the associated row of lock table 112), then it restarts the first (oldest) of the waiting processor contexts in the linked list. The restarted processor context is then free to lock the TCB and continue processing.
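  • As a hypothetical model of the locking scheme above, the C sketch below keeps one lock bit per TCB and, for each locked TCB, a row of waiting 4 bit processor context numbers; the lock table CAM that maps a 16 bit TCB number to a row is modeled as a small linear search. Sizes and names are assumptions.
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    #define MAX_TCBS  8192
    #define ROWS      16        /* one lock table row per processor context */
    #define ROW_WIDTH 16

    static bool     tcb_lock[MAX_TCBS];
    static uint16_t row_tcb[ROWS];          /* lock table CAM contents      */
    static bool     row_used[ROWS];
    static uint8_t  waiters[ROWS][ROW_WIDTH];
    static unsigned n_wait[ROWS];

    static int cam_find(uint16_t tcb, bool alloc)
    {
        for (int r = 0; r < ROWS; r++)
            if (row_used[r] && row_tcb[r] == tcb) return r;
        if (!alloc) return -1;
        for (int r = 0; r < ROWS; r++)
            if (!row_used[r]) { row_used[r] = true; row_tcb[r] = tcb; return r; }
        return -1;
    }

    /* Returns true if the context now owns the TCB, false if it must wait. */
    static bool tcb_try_lock(uint16_t tcb, uint8_t ctx)
    {
        if (!tcb_lock[tcb]) { tcb_lock[tcb] = true; return true; }
        int r = cam_find(tcb, true);
        waiters[r][n_wait[r]++] = ctx;      /* queue behind the current owner */
        return false;
    }

    static void tcb_unlock(uint16_t tcb)
    {
        int r = cam_find(tcb, false);
        if (r >= 0 && n_wait[r] > 0) {      /* restart the oldest waiter      */
            printf("restart processor context %u\n", (unsigned)waiters[r][0]);
            for (unsigned i = 1; i < n_wait[r]; i++)
                waiters[r][i - 1] = waiters[r][i];
            n_wait[r]--;                    /* new owner will re-lock         */
            if (n_wait[r] == 0) row_used[r] = false;
        }
        tcb_lock[tcb] = false;
    }

    int main(void)
    {
        tcb_try_lock(42, 3);                /* context 3 gets the TCB         */
        tcb_try_lock(42, 7);                /* context 7 must wait            */
        tcb_unlock(42);                     /* restarts context 7             */
        return 0;
    }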
  • If, on the other hand, the TCB is not already locked, then the processor context locks the TCB by setting the associated [0040] TCB lock bit 109. The processor context then supplies the TCB number (sixteen bits) to an IN SRAM CAM 113 (step 212) to determine if the TCB is in one of thirty-two TCB slots 114 in SRAM 18. (Up to thirty-two TCBs are cached in SRAM, whereas a copy of all “in-use” TCBs is kept in DRAM). The IN SRAM CAM 113 outputs a sixteen-bit value, five bits of which point to one of the thirty-two possible TCB slots 114 in SRAM 18. One of the bits is a “found” bit.
  • If the “found” bit indicates that the TCB is “found”, then the five bits are a number from one to thirty-two that points to a TCB slot in [0041] SRAM 18 where the TCB is cached. The TCB has therefore been identified in SRAM 18, and fast-path receive processing continues (step 213).
  • If, on the other hand, the “found” bit indicates that the TCB is not found, then the TCB is not cached in [0042] SRAM 18. All TCBs 115 under control of NID 3 are, however, maintained in DRAM 10. The information in the appropriate TCB slot in DRAM 10 is then written over one of the thirty-two TCB slots 114 in SRAM 18. In the event that one of the SRAM TCB slots is empty, then the TCB information from DRAM 10 is DMA transferred into that free SRAM slot. If there is no free SRAM TCB slot, then the least-recently-used TCB slot in SRAM 18 is overwritten.
  • Once the TCB is located in [0043] SRAM cache 114, the IN SRAM CAM 113 is updated to indicate that the TCB is now located in SRAM at a particular slot. The slot number is therefore written into the IN SRAM CAM 113. Fast-path receive processing then continues (step 216).
  • When a processor context releases control of a TCB, it is not always necessary for the TCB information in [0044] SRAM 18 to be written to DRAM to update the version of the TCB in DRAM. If, for example, the TCB is a commonly used TCB and the TCB will be used again in the near future by the next processor context, then the next processor context can use the updated TCB in SRAM without the updated TCB having to have been written to DRAM and then having to be transferred back from DRAM to SRAM. Avoiding this unnecessary transferring of the TCB is advantageous. In accordance with one embodiment of the present invention, the processor context releasing control of a TCB does not update the DRAM version of the TCB, but rather the processor context assuming control of the TCB has that potential responsibility. A “dirty bit” 116 is provided in each TCB. If the releasing processor context changed the contents of the TCB (i.e., the TCB is dirty), then the releasing processor context sets this “dirty bit” 116. If the next processor context needs to put another TCB into the SRAM TCB slot held by the dirty TCB, then the next processor first writes the dirty TCB information (i.e., updated TCB information) to overwrite the corresponding TCB information in DRAM (i.e., to update the DRAM version of the TCB). If, on the other hand, the next processor does not need to move a TCB into an SRAM slot held by a dirty TCB, then the next processor does not need to write the dirty TCB information to DRAM. If need be, the next processor can either just update a TCB whose dirty bit is not set, or the next processor can simply overwrite the TCB whose dirty bit is not set (for example, to move another TCB into the slot occupied by the TCB whose dirty bit is not set).
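  • The following C sketch is a hedged model of the thirty-two slot SRAM TCB cache and the dirty-bit behavior described above: a dirty TCB is written back to DRAM only when its slot is needed for a different TCB. The TCB size, the least-recently-used choice and the linear stand-in for the “IN SRAM CAM” search are assumptions.
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define SLOTS    32
    #define TCB_SIZE 256                       /* assumed TCB size in bytes    */
    #define NUM_TCBS 8192

    struct slot {
        bool     valid, dirty;
        uint16_t tcb_num;
        unsigned last_use;                     /* for least-recently-used pick */
        uint8_t  tcb[TCB_SIZE];                /* cached copy of the TCB       */
    };

    static struct slot cache[SLOTS];           /* models the SRAM TCB slots    */
    static uint8_t     dram_tcb[NUM_TCBS][TCB_SIZE];
    static unsigned    now;

    static struct slot *tcb_cache_get(uint16_t tcb_num)
    {
        struct slot *victim = &cache[0];
        for (int i = 0; i < SLOTS; i++) {      /* the "IN SRAM CAM" search     */
            if (cache[i].valid && cache[i].tcb_num == tcb_num) {
                cache[i].last_use = ++now;
                printf("hit: TCB %u in slot %d\n", (unsigned)tcb_num, i);
                return &cache[i];
            }
            if (!cache[i].valid)
                victim = &cache[i];
            else if (victim->valid && cache[i].last_use < victim->last_use)
                victim = &cache[i];
        }
        if (victim->valid && victim->dirty)    /* write back only if dirty     */
            memcpy(dram_tcb[victim->tcb_num], victim->tcb, TCB_SIZE);
        memcpy(victim->tcb, dram_tcb[tcb_num], TCB_SIZE);
        victim->valid = true;
        victim->dirty = false;                 /* fresh copy matches DRAM      */
        victim->tcb_num = tcb_num;
        victim->last_use = ++now;
        printf("miss: TCB %u loaded from DRAM\n", (unsigned)tcb_num);
        return victim;
    }

    int main(void)
    {
        struct slot *s = tcb_cache_get(7);
        s->dirty = true;                       /* releasing context changed it */
        tcb_cache_get(7);                      /* hit: no DRAM traffic needed  */
        return 0;
    }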
  • In one specific embodiment, the instruction set of processors [0045] 13-15 includes the instructions in Table 1 below.
    TABLE 1
    OpdSel        Name           Description
    0b011000000   CamAddrA       Write Only. CamAddr=AluOut[4:0]. This register is written to define which one of the entries of the multi-entry CAM A will be read from or written to. The entry is either read from the CamContentsA register on a read, or the entry is written into the CamContentsA register on a write. CAM A is a thirty-two entry CAM when CAMs A and B are used together as a single CAM. If CAM A is used separately, then CAM A is a sixteen-entry CAM.
    0b011000001   CamContentsA   Read/Write. When writing: CamValid[CamAddrA]=˜AluOut[16]. CamContents[CamAddrA]=AluOut[15:0]. Accordingly, writing bit sixteen “invalidates” the CAM entry. The tilde symbol here indicates the logical NOT. When reading: Bit 16=˜CamValid[CamAddrA]. Bits 15-0=CamContents[CamAddrA].
    0b011000010   CamMatchA      Read/Write. Writing a sixteen-bit value into this register causes CAM A to search its entries for a match with valid CAM A entries. A subsequent read of this register returns the result of the search as follows: Bit 5=contents not found. Bits 4-0=If the contents were found and the matched entry is valid, then bits 4-0 are the number of the CAM entry which matched.
    0b011000011   CamConfigAB    Write Only. CamSplit=AluOut[0]. If CamSplit is set, then CAM A is split into two sixteen-entry CAMs: CAM A and CAM B. The following addresses (CamAddrB, CamContentsB and CamMatchB) are then available to use the second part of the CAM (CAM B).
    0b011000100   CamAddrB       Write Only. See the description of CamAddrA above.
    0b011000101   CamContentsB   Read/Write. See the description of CamContentsA above.
    0b011000110   CamMatchB      Read/Write. These registers (CamAddrB, CamContentsB and CamMatchB) are identical in use to those for CAM A (see above), except that they are for the second half of the first CAM (CAM B).
    0b011001000   CamAddrC       Write Only. This register for CAM C is identical in function to the corresponding register for CAM A.
    0b011001001   CamContentsC   Read/Write. This register for CAM C is identical in function to the corresponding register for CAM A.
    0b011001010   CamMatchC      Read/Write. This register for CAM C is identical in function to the corresponding register for CAM A.
    0b011001011   CamConfigCD    Write Only. As in the case of CAM A above, CAM C can be split into two sixteen-entry CAMs: CAM C and CAM D.
    0b011001100   CamAddrD       Write Only. This register for CAM D is identical in function to the corresponding register for CAM C.
    0b011001101   CamContentsD   Read/Write. This register for CAM D is identical in function to the corresponding register for CAM C.
    0b011001110   CamMatchD      Read/Write. This register for CAM D is identical in function to the corresponding register for CAM C.
  • One embodiment of the code executed by processors [0046] 13-15 is written using functions. These functions are in turn made up of instructions including those instructions set forth in Table 1 above. The functions are set forth in the file SUBR.MAL of the CD Appendix (the files on the CD Appendix are incorporated by reference into the present patent document). These functions include:
  • 1) The INSRAM_CAM_INSERT function: Executing this function causes the TCB number present in a register (register cr[0047] 11) to be written into the IN SRAM CAM (CAM A of the processor). The particular CAM slot written to is identified by the lower sixteen bits of the value present in another register (register TbuffL 18).
  • 2) The INSRAM_CAM_REMOVE function: Executing this function causes the CAM entry in the IN SRAM CAM slot identified by a register (register cr[0048] 11) to be invalidated (i.e., removed). The entry is invalidated by setting bit 16 of a register (register CAM_CONTENTS_A).
  • 3) The INSRAM_CAM_SEARCH function: Executing this function causes a search of the IN SRAM CAM for the TCB number identified by the TCB number present in a register (register cr[0049] 11). The result of the search is a five-bit slot number that is returned in five bits of another register (register TbuffL 18). The value returned in a sixth bit of the register TbuffL 18 indicates whether or not the TCB number was found in the INSRAM_CAM.
  • 4) The LOCKBL_CAM_INSERT function: Executing this function causes the sixteen-bit TCB number present in a register (register cr[0050] 11) to be written into the LOCK TABLE CAM (CAM C of the processor). The particular CAM slot written to is identified by the value present in a register (register cr10).
  • 5) The LOCKBL_CAM_REMOVE function: Executing this function causes the CAM entry in the LOCK TABLE CAM slot identified by a register (register cr[0051] 10) to be invalidated (i.e., removed). The entry is invalidated by setting bit 16 of another register (register CAM_CONTENTS_C).
  • 6) The LOCK_TABLE_SEARCH function: Executing this function causes a search of the LOCK TABLE CAM for the TCB number identified by the TCB number present in a register (register cr[0052] 11). The result of the search is a four-bit number of a row in the lock table. The four-bit number is four bits of another register (register cr10). The value returned in a fifth bit of the register cr10 indicates whether or not the TCB number was found in the LOCK TABLE CAM.
  • Compact Disc Appendix: [0053]
  • The Compact Disc Appendix includes a folder “CD Appendix A”, a folder “CD Appendix B”, a folder “CD Appendix C”, and a file “title page.txt”. CD Appendix A includes a description of an integrated circuit (the same as [0054] ASIC 9 of FIG. 1 except that the integrated circuit of CD Appendix A does not include processor 23) of one embodiment of a TCP/IP offload network interface device (NID). CD Appendix B includes software that executes on a host computer CPU, where the host computer is coupled to a NID incorporating the integrated circuit set forth in CD Appendix A and wherein the host computer includes a CPU that executes a protocol stack. CD Appendix C includes a listing of the program executed by the receive processor of the integrated circuit set forth in Appendix A as well as a description of the instruction set executed by the receive processor.
  • The CD Appendix A includes the following: 1) a folder “Mojave verilog code” that contains a hardware description of an embodiment of the integrated circuit, and 2) a folder “Mojave microcode” that contains code that executes on the processors (for example, processors 13 and 14 of FIG. 1) of the integrated circuit. In the folder “Mojave microcode”, the file “MAINLOOP.MAL” is commented to indicate instructions corresponding to various steps of the method of FIG. 3. In the folder “Mojave microcode”, the file “SEQ.H” is a definition file for the “MAINLOOP.MAL” code. Page 9 sets forth steps of a twenty-step method in accordance with some embodiments of the present invention. Page 10 sets forth the structure of a TCB in accordance with some embodiments. Page 17 sets forth the structure of a hash byte (called a “TCB Hash Bucket Status Byte”). [0055]
  • A description of the instruction set executed by processors 13-15 of FIG. 1 is set forth in the section of this document entitled “Mojave Hardware Specification.” [0056]
  • The CD Appendix B includes the following: 1) a folder entitled “simba (device driver software for Mojave)” that contains device driver software executable on the host computer; 2) a folder entitled “atcp (free BSD stack and code added to it)” that contains a TCP/IP stack [the folder “atcp” contains: a) a TCP/IP stack derived from the “free BSD” TCP/IP stack (available from the University of California, Berkeley) so as to make it run on a Windows operating system, and b) code added to the free BSD stack between the session layer above and the device driver below that enables the BSD stack to carry out “fast-path” processing in conjunction with the NID]; and 3) a folder entitled “include (set of files shared by ATCP and device driver)” that contains a set of files that are used by the ATCP stack and are used by the device driver. [0057]
  • The CD Appendix C includes the following: 1) a file called “mojave_rcv_seq (instruction set description).mdl” that contains a description of the instruction set of the receive processor, and 2) a file called “mojave_rcv_seq (program executed by receive processor).mal” that contains a program executed by the receive processor. [0058]
  • System Configurations: [0059]
  • FIGS. 4-9 illustrate various system configurations involving a network interface device in accordance with the present invention. These configurations are but some system configurations. The present invention is not limited to these configurations, but rather these configurations are illustrated here only as examples of some of the many configurations that are taught in this patent document. [0061]
  • FIG. 4 shows a computer 300 wherein a network interface device (NID) 301 is coupled via a connector 302 and a host bus 303 to a CPU 304 and storage 305. CPU 304 and storage 305 are together referred to as a “host” 306. [0062]
  • Rather than being considered coupled to a host, network interface device (NID) 301 can be considered part of a host as shown in FIG. 5. In FIG. 5, what is called a host computer 400 includes NID 301 as well as CPU 304 and storage 305. In both the examples of FIGS. 4 and 5, the CPU executes instructions that implement a sequential protocol processing stack. The network interface device 301 performs fast-path hardware accelerated protocol processing on some types of packets such that CPU 304 performs no or substantially no protocol processing on these types of packets. Control of a connection can be passed from the NID to the stack and from the stack to the NID. [0063]
  • FIG. 6 shows a computer 500 wherein NID 301 is coupled to CPU 304 and storage 305 by a bridge 501. [0064]
  • FIG. 7 shows a computer 500 wherein a network interface device (NID) 501 is integrated into a bridge integrated circuit 502. Bridge 502 couples computer 500 to a network 503. Bridge 502 is coupled to CPU 504 and storage 505 by local bus 506. CPU 504 executes instructions that implement a software sequential protocol processing stack. Bridge 502 is coupled to multiple expansion cards 507, 508 and 509 via a host bus 510. Network interface device 501 performs TCP and IP protocol processing on certain types of packets, thereby relieving the CPU and its sequential protocol processing stack of these tasks. Control of a connection can be passed from the NID to the stack and from the stack to the NID. [0065]
  • In one version, NID 501 is a full TCP/IP offload device. In another version, NID 501 is a partial TCP/IP offload device. The terms “partial TCP/IP” are used here to indicate that all or substantially all TCP and IP protocol processing on certain types of packets is performed by the offload device, whereas substantial TCP and IP protocol processing for other types of packets is performed by the stack. [0066]
  • FIG. 8 shows a computer 700 wherein a network interface device (NID) 701 couples CPU 702 and storage 703 to network 704. NID 701 includes a processor that implements a sequential protocol processing stack 705, a plurality of sequencers 706 (such as, for example, a receive sequencer and a transmit sequencer), and a plurality of processors 707. This embodiment may be a full-offload embodiment in that processor 705 fully offloads CPU 702 and its stack of all or substantially all TCP and IP protocol processing duties. [0067]
  • FIG. 9 shows a computer 800 wherein a network interface device (NID) 801 couples CPU 802 and storage 803 to network 804. NID 801 includes a plurality of sequencers 806 (for example, a receive sequencer and a transmit sequencer), and a plurality of processors 807. In this example, CPU 802 implements a software sequential protocol processing stack, and NID 801 does not include a general purpose processor that implements a sequential software protocol processing stack. This embodiment may be a partial-offload embodiment in that NID 801 performs all or substantially all TCP and IP protocol processing tasks on some types of packets, whereas CPU 802 and its stack perform TCP and IP protocol processing on other types of packets. [0068]
  • In the realization of different embodiments, the techniques, methods, and structures set forth in the documents listed below are applied to the system, and/or to the network interface device (NID), and/or to the application specific integrated circuit (ASIC) set forth in present patent document: U.S. Pat. No. 6,389,479; U.S. Pat. No. 6,470,415; U.S. Pat. No. 6,434,620; U.S. Pat. No. 6,247,060; U.S. Pat. No. 6,226,680; Published U.S. Patent Application 20020095519; Published U.S. Patent Application No. 20020091844; Published U.S. Patent Application No. 20010021949; Published U.S. Patent Application No. 20010047433; and U.S. patent application Ser. No. 09/801,488, entitled “Port Aggregation For Network Connections That Are Offloaded To Network Interface Devices”, filed Mar. 7, 2001. The content of each of the above-identified patents, published patent applications, and patent application is incorporated herein by reference. [0069]
  • Although certain specific exemplary embodiments are described above in order to illustrate the invention, the invention is not limited to the specific embodiments. [0070] NID 3 can be part of a memory controller integrated circuit or an input/output (I/O) integrated circuit or a bridge integrated circuit of a microprocessor chip-set. In some embodiments, NID 3 is part of an I/O integrated circuit chip such as, for example, the Intel 82801 integrated circuit of the Intel 820 chip set. NID 3 may be integrated into the Broadcom ServerWorks Grand Champion HE chipset, the Intel 82815 Graphics and Memory Controller Hub, the Intel 440BX chipset, or the Apollo VT8501 MVP4 North Bridge chip. The instructions executed by receive processor 20 and/or processors 13-15 are, in some embodiments, downloaded upon power-up of NID 3 into a memory on NID 3, thereby facilitating the periodic updating of NID functionality. High and low priority transmit queues may be implemented using queue manager 19. Hardcoded transmit sequencer 21, in some embodiments, is replaced with a transmit processor that executes instructions. Processors 13, 14 and 15 can be identical processors, each of which can perform receive processing and/or transmit processing and/or utility functions. Accordingly, various modifications, adaptations, and combinations of various features of the described embodiments can be practiced without departing from the scope of the invention as set forth in the following claims that follow the “Mojave Hardware Specification” section below.
  • Mojave Hardware Specification
  • Features [0071]
  • 1) Peripheral Component Interconnect (PCI) Interface. [0072]
  • a) Universal PCI interface supports both 5.0V and 3.3V signaling environments; [0073]
  • b) Supports both 32-bit and 64 bit PCI interface; [0074]
  • c) Supports PCI clock frequencies from 0 MHz to 66 MHz; [0075]
  • d) High performance bus mastering architecture; [0076]
  • e) Host memory based communications reduce register accesses; [0077]
  • f) Host memory based interrupt status word reduces register reads; [0078]
  • g) Plug and Play compatible; [0079]
  • h) PCI specification revision 2.2 compliant; [0080]
  • i) PCI bursts up to 4 K bytes; [0081]
  • j) Supports cache line operations up to 128 bytes; [0082]
  • k) Both big-endian and little-endian byte alignments supported; and [0083]
  • l) Supports Expansion ROM. [0084]
  • 2) Network Interface. [0085]
  • a) One internal 802.3 and ethernet compliant Mac; [0086]
  • b) Gigabit Media Independent Interface (GMII) supports external PHYs; [0087]
  • c) Ten Bit Interface (TBI) supports external SERDES; [0088]
  • d) 10 BASE-T, 100 BASE-TX/FX and 1000 BASE-TX/FX supported; [0089]
  • e) Full and half-duplex modes supported at 10/100 speeds; [0090]
  • f) Automatic PHY status polling notifies system of status change; [0091]
  • g) Provides SNMP statistics counters; [0092]
  • h) Supports broadcast and multicast packets; [0093]
  • i) Provides promiscuous mode for network monitoring or multiple unicast address detection; [0094]
  • j) Supports “Huge Packets” up to 9018B; and [0095]
  • k) Supports auto-negotiating Phys. [0096]
  • 3) Memory Interface. [0097]
  • a) External Dram buffering of transmit and receive packets; [0098]
  • b) ECC error correction and detection; [0099]
  • c) 64-bit data interface supports maximum throughput of 600 MB/s at 100 Mhz; [0100]
  • d) Supports external FLASH ROM up to 1 MB, for diskless boot applications; and [0101]
  • e) Supports external serial EEPROM for custom configuration and Mac addresses. [0102]
  • 4) Protocol Processor. [0103]
  • a) High speed, custom, 32-bit processor executes 100 million instructions per second; [0104]
  • b) Processes IP, TCP, IPX and UDP protocols; [0105]
  • c) Supports up to 32K resident TCP/IP contexts; and [0106]
  • d) Writable control store (WCS) allows field updates and feature enhancements. [0107]
  • 5) Power. [0108]
  • a) 1.8/3.3V chip operation; and [0109]
  • b) PCI controlled 5.0V/3.3V I/O cell operation. [0110]
  • 6) Packaging. [0111]
  • a) 272-pin plastic ball grid array. [0112]
  • General Description [0113]
  • Mojave (See FIG. 10) is a 32-bit, full-duplex, single channel, 10/100/1000-Megabit per second (Mbps), Session Layer Interface Controller (SLIC), designed to provide high-speed protocol processing for server and desktop applications. It combines the functions of a standard network interface controller and a protocol processor within a single chip. [0114]
  • When combined with the 802.3/GMII compliant Phy and Synchronous Dram (SDRAM), Mojave comprises one complete ethernet node. It contains one 802.3/ethernet compliant Mac, a PCI Bus Interface Unit (BIU), a memory controller, transmit fifo, receive fifo and a custom TCP/IP protocol processor. Mojave supports 10 Base-T, 100 Base-TX and 1000 Base-TX via the GMII interface attachment of appropriate Phys. Mojave also supports 100 Base-FX, and 1000 Base-FX via the TBI interface attachment of external Serdes. [0115]
  • The Mojave Mac provides statistical information that may be used for SNMP. The Mac can operate in promiscuous mode allowing Mojave to function as a network monitor, receive broadcast and multicast packets and implement multiple Mac addresses for each node. [0116]
  • Any 802.3/GMII/TBI compliant PHY/SERDES can be utilized, allowing Mojave to support 10 BASE-T, 10 BASE-T2, 100 BASE-TX, 100 Base-FX, 100 BASE-T4, 1000 BASE-TX or 1000 BASE-FX as well as future interface standards. PHY identification and initialization is accomplished through host driver initialization routines. PHY status registers can be polled continuously by Mojave to detect PHY status changes which are then reported to the host driver. The Mac can be configured to support a maximum frame size of 1518 bytes or 9018 bytes. [0117]
  • The 64-bit, multiplexed BIU provides a direct interface to the PCI bus for both slave and master functions. Mojave is capable of operating in either a 64-bit or 32-bit PCI environment, while supporting 64-bit addressing in either configuration. PCI bus frequencies up to 33 MHz are supported yielding instantaneous bus transfer rates of 266 MB/s. Both 5.0V and 3.3V signaling environments can be utilized by Mojave. Configurable cache-line size up to 256B will accommodate future architectures, and Expansion ROM/Flash support will allow for diskless system booting. Non-PC applications are supported via programmable big and little endian modes. Host based communication has been utilized to provide the best system performance possible. [0118]
  • Mojave supports Plug-N-Play auto-configuration through the PCI configuration space. Support of an external eeprom allows for local storage of configuration information such as Mac addresses. [0119]
  • External SDRAM provides frame buffering, which is configurable as 1 MB, 2 MB, 4 MB or 8 MB using the appropriate technology and width selections. Use of -10 speed grades yields an external buffer bandwidth of 88 MB/s. The buffer provides temporary storage of both incoming and outgoing frames. The protocol processor accesses the frames within the buffer in order to implement TCP/IP and NETBIOS. Incoming frames are processed, assembled then transferred to host memory under the control of the protocol processor. For transmit, data is moved from host memory to buffers where various headers are created before being transmitted out via the Mac. [0120]
  • 1) Datapath Bandwidth (See FIG. 11). [0121]
  • 2) Cpu Bandwidth (See FIG. 12). [0122]
  • 3) Performance Features. [0123]
  • a) 896 registers improve performance through reduced scratch ram accesses and reduced instructions; [0124]
  • b) Register windowing eliminates context-switching overhead; [0125]
  • c) Separate instruction and data paths eliminate memory contention; [0126]
  • d) Totally resident control store eliminates stalling during instruction fetch; [0127]
  • e) Multiple logical processors reduce context switching and improve real-time response; [0128]
  • f) Pipelined architecture increases operating frequency; [0129]
  • g) Shared registers and scratch ram improve inter-processor communication; [0130]
  • h) Fly-by receive sequencer assists address compare and checksum calculation; [0131]
  • i) TCP/IP-context caching reduces latency; [0132]
  • j) Hardware-implemented queues reduce Cpu overhead and latency; [0133]
  • k) Horizontal microcode greatly improves instruction efficiency; [0134]
  • l) Automatic frame DMA and status between Mac and dram buffer; and [0135]
  • m) Deterministic architecture coupled with context switching eliminates processor stalls. [0136]
  • 4) Pin Assignments (See FIG. 13). [0137]
  • Processor. [0138]
  • The processor (See FIG. 14) is a convenient means to provide a programmable state-machine capable of processing incoming frames and host commands, directing network traffic and directing PCI bus traffic. Three processors are implemented using shared hardware in a three-level pipelined architecture which launches and completes a single instruction for every clock cycle. The instructions are executed in three distinct phases corresponding to each of the pipeline stages where each phase is responsible for a different function. [0139]
  • The first instruction phase writes the instruction results of the last instruction to the destination operand, modifies the program counter (Pc), selects the address source for the instruction to fetch, then fetches the instruction from the control store. The fetched instruction is then stored in the instruction register at the end of the clock cycle. [0140]
  • The processor instructions reside in the on-chip control-store, which is implemented as a mixture of ROM and Sram. The ROM contains 4K instructions starting at address 0x0000 and aliases every 0x1000 locations throughout the first 0x8000 locations of instruction space. The Sram (WCS) will hold up to 0x1000 instructions starting at address 0x8000 and aliasing each 0x1000 locations throughout the last 0x8000 of instruction space. The ROM and Sram are both 49-bits wide accounting for bits [48:0] of the instruction microword. A separate mapping ram provides bits [55:49] of the microword (MapAddr) to allow replacement of faulty ROM based instructions. The mapping ram has a configuration of 512×7 which is insufficient to allow a separate map address for each of the 4K ROM locations. To allow re-mapping of the entire 4K ROM space, the map ram address lines are connected to the address bits Fetch [9:3]. The result is that the ROM is re-mapped in blocks of 8 contiguous locations. [0141]
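  • The address arithmetic described in the preceding paragraph can be summarized in a few lines of C. The sketch below is illustrative only: it shows how a 16-bit fetch address aliases into the 4K ROM or the 4K writable control store, and which of the 512 map ram entries (addressed by Fetch[9:3], hence blocks of 8 instructions) supplies the MapAddr bits for a ROM fetch.

      #include <stdint.h>
      #include <stdio.h>

      typedef struct {
          int      from_wcs;      /* 0 = ROM, 1 = writable control store (Sram) */
          uint16_t cs_index;      /* index within the 4K ROM or 4K WCS          */
          uint16_t map_index;     /* 512-entry map ram index = Fetch[9:3]       */
      } fetch_decode_t;

      static fetch_decode_t decode_fetch(uint16_t fetch)
      {
          fetch_decode_t d;
          d.from_wcs  = (fetch & 0x8000) != 0;   /* upper half of the space is WCS   */
          d.cs_index  = fetch & 0x0FFF;          /* aliases every 0x1000 locations   */
          d.map_index = (fetch >> 3) & 0x1FF;    /* one map entry per 8 instructions */
          return d;                              /* map entry applies to ROM fetches */
      }

      int main(void)
      {
          /* 0x0123 and 0x1123 alias to the same ROM word and share a map entry;
           * 0x8042 falls in the writable control store. */
          uint16_t probes[] = { 0x0123, 0x1123, 0x8042 };
          for (unsigned i = 0; i < 3; i++) {
              fetch_decode_t d = decode_fetch(probes[i]);
              printf("fetch 0x%04X -> %s index 0x%03X, map entry %u\n",
                     (unsigned)probes[i], d.from_wcs ? "WCS" : "ROM",
                     (unsigned)d.cs_index, (unsigned)d.map_index);
          }
          return 0;
      }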
  • The second instruction phase decodes the instruction which was stored in the instruction register. It is at this point that the map address is checked for a non-zero value which will cause the decoder to force a Jmp instruction to the map address. If a non-zero value is detected then the decoder selects the source operands for the Alu operation based on the values of the OpdASel, OpdBSel and AluOp fields. These operands are then stored in the decode register at the end of the clock cycle. Operands may originate from File, Sram, or flip-flop based registers. The second instruction phase is also where the results of the previous instruction are written to the Sram. [0142]
  • The third instruction phase is when the actual Alu operation is performed, the test condition is selected and the Stack push and pop are implemented. Results of the Alu operation are stored in the results register at the end of the clock cycle. [0143]
  • FIG. 14 is a block diagram of the CPU. FIG. 14 shows the hardware functions associated with each of the instruction phases. Note that various functions have been distributed across the three phases of the instruction execution in order to minimize the combinatorial delays within any given phase. [0144]
  • Instruction Set. [0145]
  • The micro-instructions are divided into nine types according to the program control directive. The micro-instruction is further divided into sub-fields for which the definitions are dependent upon the instruction type. The six instruction types are listed in FIG. 15. [0146]
  • All instructions (See FIG. 15) include the Alu operation (AluOp), operand “A” select (OpdASel), operand “B” select (OpdBSel) and Literal fields. Other field usage depends upon the instruction type. [0147]
  • The conditional jump (Jct/Jcf) instruction causes the program counter to be altered if the condition selected by the “test select” (TstSel) field is true/false. The new program counter (Pc) value is loaded from either the Literal field or the AluOut as described in the following section and the Literal field may be used as a source for the Alu or the ram address if the new Pc value is sourced by the Alu. [0148]
  • The “jump” (Jmp) instruction causes the program counter to be altered unconditionally. The new program counter (Pc) value is loaded from either the Literal field or the AluOut as described in the following section. The format allows instruction bits 22:16 to be used to perform a flag operation and the Literal field may be used as a source for the Alu or the ram address if the new Pc value is sourced by the Alu. [0149]
  • The “jump subroutine” (Jsr) instruction causes the program counter to be altered unconditionally. The new program counter (Pc) value is loaded from either the Literal field or the AluOut as described in the following section. The old program counter value is stored on the top location of the Pc-Stack which is implemented as a LIFO memory. The format allows instruction bits 22:16 to be used to perform a flag operation and the Literal field may be used as a source for the Alu or the ram address if the new Pc value is sourced by the Alu. [0150]
  • The “Cont” (Cont) instruction causes the program counter to increment. The format allows instruction bits 22:16 to be used to perform a flag operation and the Literal field may be used as a source for the Alu or the ram address. [0151]
  • The “return from subroutine” (Rts) instruction, or the conditional Rts (Rtt/Rtf) if the selected condition is true/false, causes the current Pc value to be replaced with the last value stored in the stack. The Literal field may be used as a source for the Alu or the ram address. The unconditional return (Rts) allows instruction bits 22:16 to be used to perform a flag operation. [0152]
  • The Map instruction is provided to allow replacement of instructions which have been stored in ROM and is implemented any time the “map enable” (MapEn) bit has been set and the content of the “map address” (MapAddr) field is non-zero. The instruction decoder forces a jump instruction with the Alu operation and destination fields set to pass the MapAddr field to the program control block. [0153]
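  • As an informal model of the program-control behavior described above, the C fragment below shows how each instruction type updates the program counter and the Pc-Stack. The structure layout, the stack depth and the assumption that Jsr pushes the address of the following instruction are illustrative choices; stack overflow and underflow checks are omitted here and are covered by the program errors described below.

      #include <stdint.h>
      #include <stdbool.h>

      typedef enum { I_JCT, I_JCF, I_JMP, I_JSR, I_CONT, I_RTS, I_RTT, I_RTF } itype_t;

      #define STACK_DEPTH 8                 /* depth chosen for illustration only */

      typedef struct {
          uint16_t pc;                      /* program counter                    */
          uint16_t stack[STACK_DEPTH];      /* the Pc-Stack (LIFO)                */
          int      sp;
      } seq_t;

      /* Advance the sequencer by one instruction.  'target' is the new Pc value
       * (Literal or AluOut), 'cond' is the selected test result, and map_en /
       * map_addr model the mapping-ram override. */
      static void next_pc(seq_t *s, itype_t t, uint16_t target, bool cond,
                          bool map_en, uint8_t map_addr)
      {
          if (map_en && map_addr != 0) {    /* Map: non-zero MapAddr forces a jump */
              s->pc = map_addr;
              return;
          }
          switch (t) {
          case I_JMP:  s->pc = target;                                break;
          case I_JCT:  s->pc = cond  ? target : s->pc + 1;            break;
          case I_JCF:  s->pc = !cond ? target : s->pc + 1;            break;
          case I_JSR:  s->stack[s->sp++] = s->pc + 1;                 /* push return */
                       s->pc = target;                                break;
          case I_CONT: s->pc = s->pc + 1;                             break;
          case I_RTS:  s->pc = s->stack[--s->sp];                     break;
          case I_RTT:  s->pc = cond  ? s->stack[--s->sp] : s->pc + 1; break;
          case I_RTF:  s->pc = !cond ? s->stack[--s->sp] : s->pc + 1; break;
          }
      }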
  • FIGS. [0154] 16-20 show sequencer behavior, ALU operations, ALU operands, selected tests, and flag operations.
  • Program Errors: [0155]
  • Hardware will detect certain program errors. Any sequencer generating a program error will be forced to continue executing from location 0004. The program errors detected are: [0156]
  • 1. Stack Overflow: A JSR is attempted and the stack registers are full. [0157]
  • 2. Stack Underflow: An RTS is attempted and the stack registers are empty. [0158]
  • 3. Incompatible Sram Size & Sram Alignment: An Sram Operation is attempted where the size and the Sram address would cause the operation to extend beyond the size of the word, e.g. Size=4 Address=401 or Size=2 Address=563. [0159]
  • 4. An Sram read is attempted immediately following an Sram write. Because an Sram write is actually done in the clock cycle of the following instruction, the sram interface will be busy during that phase, and an Sram read is illegal at this time. [0160]
  • 5. An attempt was made to access a non-existent register. [0161]
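  • A minimal C sketch of the Sram-related checks in the list above is given here. The restart at location 0004 and the alignment rule (a Size=4 access at Address=401, or a Size=2 access at Address=563, extends beyond a size-aligned word) follow the text; the error codes and function boundaries are illustrative.

      #include <stdint.h>
      #include <stdbool.h>
      #include <stdio.h>

      #define ERROR_VECTOR 0x0004u          /* sequencers continue from 0004 on error */

      typedef enum {
          ERR_NONE = 0,
          ERR_STACK_OVERFLOW,               /* Jsr attempted with stack full     */
          ERR_STACK_UNDERFLOW,              /* Rts attempted with stack empty    */
          ERR_SRAM_ALIGNMENT,               /* size/address cross a word         */
          ERR_SRAM_READ_AFTER_WRITE,        /* read in the cycle after a write   */
          ERR_BAD_REGISTER                  /* non-existent register accessed    */
      } prog_err_t;

      /* An Sram operation is illegal when the access would extend beyond the
       * size-aligned word, i.e. the address is not naturally aligned for the
       * size (size assumed to be a power of two). */
      static bool sram_misaligned(uint32_t addr, uint32_t size)
      {
          return (addr & (size - 1u)) != 0;
      }

      /* Check one instruction's Sram access; prev_was_write models the fact that
       * a write actually occupies the Sram during the following instruction. */
      static prog_err_t check_sram_op(bool is_read, bool prev_was_write,
                                      uint32_t addr, uint32_t size, uint16_t *pc)
      {
          prog_err_t e = ERR_NONE;
          if (sram_misaligned(addr, size))
              e = ERR_SRAM_ALIGNMENT;
          else if (is_read && prev_was_write)
              e = ERR_SRAM_READ_AFTER_WRITE;
          if (e != ERR_NONE)
              *pc = ERROR_VECTOR;
          return e;
      }

      int main(void)
      {
          uint16_t pc = 0x0123;
          prog_err_t e = check_sram_op(false, false, 401, 4, &pc);
          printf("Size=4 Address=401 -> error %d, pc forced to 0x%04X\n", e, (unsigned)pc);
          return 0;
      }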
  • Sram Control Sequencer (SramCtrl). [0162]
  • Sram is the nexus for data movement within Mojave. A hierarchy of sequencers, working in concert, accomplish the movement of data between dram, Sram, Cpu, ethernet and the Pci bus. Slave sequencers, provided with stimulus from master sequencers, request data movement operations by way of the Sram, Pci bus and Dram. The slave sequencers prioritize, service and acknowledge the requests. [0163]
  • The data flow block diagram of FIG. 21 shows all of the master and slave sequencers of the Mojave product. Request information such as r/w, address, size, endian and alignment is represented by each request line. Acknowledge information to master sequencers includes only the size of the transfer being acknowledged. [0164]
  • The block diagram of FIG. 22 illustrates how data movement is accomplished for a Pci slave write to Dram. Psi sends a write request to the SramCtrl module. Psi requests Dwr to move data from Sram to dram. Dwr subsequently sends a read request to the SramCtrl module then writes the data to the dram via the Mctrl module. As each piece of data is moved from the Sram to Dwr, Dwr sends an acknowledge to the Psi module. [0165]
  • Sram Control Sequencer (SramCtrl). [0166]
  • The Sram control sequencer (See FIG. 23) services requests to store to, or retrieve data from an Sram organized as 2048 locations by 128 bits (32 KB). The sequencer operates at a frequency of 200 Mhz, allowing both a Cpu access and a dma access to occur during a standard 100 Mhz Cpu cycle. One 200 Mhz cycle is reserved for Cpu accesses during each 100 Mhz cycle while the remaining 200 Mhz cycle is reserved for dma accesses on a prioritized basis. [0167]
  • The block diagram of FIG. 23 shows the major functions of the Sram control sequencer. A slave sequencer begins by asserting a request along with r/w, ram address, data path alignment and request size. SramCtrl prioritizes the requests. The request parameters are then selected by a multiplexer which feeds the parameters to the Sram via alignment logic. The requestor provides the Sram address which when combined with the other parameters controls the input and output alignment. Sram outputs are fed to the output aligner. Requests are acknowledged in parallel with the returned data. [0168]
  • External Memory Control (memctrl). [0169]
  • Memctrl (See FIG. 24) implements the memory controller function and registers for access to SDRAM, Flash memory, and external configuration jumpers. It also implements the register interface for the serial EEPROM and GPIO access. Memctrl functional module summaries: [0170]
  • memregs: The memregs module provides the configuration and control registers for all the functions of memctrl. memregs also implements the GPIO interface registers for reading, writing and directional control, the FLASH control registers for configuring and accessing FLASH, and registers associated with configuring the SDRAM controller. memregs is accessed through the CPU data path with all of its registers mapped to a CPU register address. [0171]
  • dramcfg_seq: The dramcfg_seq module contains the refresh logic, timers, and sequencer for the various configuration accesses that are performed. This also includes operations which take place during initialization. [0172]
  • flash_seq: The flash_seq module performs the various FLASH memory access sequences. This module also implements the programmable nature of the access time delays between the control signals and data accesses. [0173]
  • dramif: The dramif module arbitrates between the memctrl modules requesting access to the memory interface. This includes the dramcfg_seq, flash_seq, memregs, dramwrt and dramrd modules. The dramif module also muxes the row and column address for the SDRAM accesses, muxes the read and write control signals between dramrd, dramwrt, etc., and also controls the direction of the data bus interface. dramif attempts to ping-pong between reads and writes to maximize the overlap between read and write buffers and for fairness. This fairness can be overridden if a requester asserts its urgent request signal for high priority conditions like impending buffer overflow or underflow. When the flash_seq has access to the interface the checkbits become address and control signals and the FSH_CS_L signal is asserted. [0174]
  • dramwrt: The dramwrt module implements the data and control path for all masters requesting write access to SDRAM. The dramwrt submodule dramwrt_mux arbitrates across all six dma requesters giving the following priorities from highest to lowest: RcvA, Q2d, Psi, S2d, P2d and D2d. dramwrt_mux will then mux the selected requester's data and address. The dramwrt_ldctrl will buffer the granted requester's data and ack the appropriate requester while the dramwrt_seq will proceed to initiate an SDRAM write operation. After dramwrt_seq gains control of the SDRAM interface via dramif, the buffered data will be selected from dramwrt_data data buffers and written to memory. If ECC is enabled, the dramwrt_data block will also compute the checkbits as the data passes through. This block can also force ECC errors at any bit in any location. Also, as the data is being written, the dramwrt_cksum block will checksum the data and indicate to the DMA requester when the checksum is complete. P2d and D2d are the only two requesters which have checksums calculated for their transactions. [0175]
  • dramrd: The dramrd module implements the data and control path for all masters requesting read access from SDRAM. The dramrd submodule dramrd_mux arbitrates across all six dma requesters, giving the following priorities from highest to lowest: XmtA, Pso, D2s, D2q, D2p and D2d. dramrd_mux also implements a state machine to overlap multiple read operations. So when a requester's read operation is being satisfied from SDRAM, another operation can be in progress with respect to bank activation and addressing. Once the dramrd_mux starts a transaction the dramrd_seq initiates the request for the interface via dramif and starts the actual read sequence. Once data starts to come back from the SDRAM the dramrd_data block will check it for ECC errors, if ECC correction and detection is enabled. The data is then stored in a 64 byte read buffer. Once there is enough data to write to the sram, the dramrd_unld sequencer will select data from the read buffer and request access to sram. The acks coming back from these sram writes are directed by the dramrd_mux to the original DMA requestor. Once all the requested data is delivered to the requestor, this operation is then complete. [0176]
  • External Memory Read Operations (dramrd). [0177]
  • The dramrd controller (See FIG. 24) acts only as a slave sequencer to the rest of the Mojave chip. Servicing requests issued by master sequencers, the dramrd controller moves data from external SDRAM or flash to the Sram, via the dramif module, in blocks of 64 bytes or less. The nature of the SDRAM requires fixed burst sizes for each of its internal banks with ras precharge intervals between each access. By selecting a burst size of 4 words for SDRAM bank reads and interleaving bank accesses on a 4 word boundary, we can ensure that the ras precharge interval for the first bank is satisfied before burst completion for the second bank, allowing us to re-instruct the first bank and continue with uninterrupted dram access. SDRAMs require a consistent burst size be utilized each and every time the SDRAM is accessed. For this reason, if an SDRAM access does not begin or end on a 16 word boundary, SDRAM bandwidth will be reduced due to less than 64 bytes of data being transferred to sram during the burst cycle. [0178]
  • The Memory Controller Block Diagram (See FIG. 24) depicts the major functional blocks of the dramrd controller. The first step in servicing a request to move data from SDRAM to Sram is the prioritization of the master sequencer requests. This is done by dramrd_mux. Next the dramrd_mux selects the DMA requester's dram read address and sram write address and applies configuration information to determine the correct bank, row and column address to apply. The dramrd_seq will control the operations of applying the row and column addresses and sequencing the appropriate control signals. While reading the data from the SDRAM interface the dramrd_data block will perform error detection and/or correction on the data if this feature is enabled. Once sufficient data has been read, the dramrd_unld sequencer issues a write request to the SramCtrl sequencer which in turn sends an acknowledge to the dramrd sequencer. The dramrd sequencer passes this acknowledge along to the level two master with a size code indicating how much data was written during the Sram cycle allowing the update of pointers and counters. The dram read and Sram write cycles repeat until the original burst request has been completed at which point the dramrd sequencer prioritizes any remaining requests in preparation for the next burst cycle. [0179]
  • Contiguous dram burst cycles are not guaranteed to the dramrd controller as an algorithm is implemented in the dramif which ensures highest priority to refresh cycles followed by ping-pong access between dram writes and dram reads and then configuration and flash cycles. [0180]
  • FIG. 25 is a timing diagram illustrating how data is read from SDRAM. The dram has been configured for a burst of 4 with a latency of 2 clock cycles. Bank A is first selected/activated followed by a read command 2 clock cycles later. The bank select/activate for bank B is next issued as read data begins returning 2 clocks after the read command was issued to bank A. Two clock cycles before we need to receive data from bank B we issue the read command. Once all 4 words have been received from bank A we begin receiving data from bank B. [0181]
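  • The bandwidth observation above, that a burst cycle moves at most 64 bytes and that transfers which do not begin or end on a 64-byte boundary waste part of a burst, can be illustrated with the short C sketch below. The planning function and its constants are illustrative and are not part of the dramrd hardware.

      #include <stdint.h>
      #include <stdio.h>

      #define BURST_BYTES 64u    /* one full burst cycle moves at most 64 bytes */

      /* Split a dram transfer into burst cycles.  Bursts are aligned to 64-byte
       * boundaries, so a request that does not begin or end on such a boundary
       * spends whole burst cycles moving fewer than 64 bytes. */
      static void plan_bursts(uint32_t addr, uint32_t len)
      {
          unsigned cycle = 0;
          while (len != 0) {
              uint32_t room  = BURST_BYTES - (addr % BURST_BYTES);  /* to boundary */
              uint32_t chunk = (len < room) ? len : room;
              printf("burst %u: addr 0x%08X, %u of %u bytes used\n",
                     ++cycle, (unsigned)addr, (unsigned)chunk, (unsigned)BURST_BYTES);
              addr += chunk;
              len  -= chunk;
          }
      }

      int main(void)
      {
          plan_bursts(0x00010000, 128);   /* aligned: two full 64-byte bursts           */
          plan_bursts(0x00010010, 128);   /* unaligned: three bursts, two of them short */
          return 0;
      }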
  • External Memory Write Operations (dramwrt). [0182]
  • The dramwrt controller (See FIG. 24) is a slave sequencer to the rest of Mojave. Servicing requests issued by master DMA sequencers, the dramwrt controller moves data from Sram to the external SDRAM or flash, via the dramif module, in blocks of 64 bytes or less while accumulating a checksum of the data moved. The nature of the SDRAM requires fixed burst sizes for each of its internal banks with ras precharge intervals between each access. By selecting a burst size of 4 words for SDRAM writes and interleaving bank accesses on a 4 word boundary, we can ensure that the ras precharge interval for the first bank is satisfied before burst completion for the second bank, allowing us to re-instruct the first bank and continue with uninterrupted dram access. SDRAMs require a consistent burst size be utilized each and every time the SDRAM is accessed. For this reason, if an SDRAM access does not begin or end on a 64 byte boundary, SDRAM bandwidth will be reduced due to less than 16 words of data being transferred during the burst cycle. [0183]
  • The memctrl block diagram (See FIG. 24) contains the major functional blocks of the dramwrt controller. The first step in servicing a request to move data from Sram to SDRAM is the prioritization of the level two master requests. This is done by the dramwrt_mux. Next the dramwrt_mux takes a Snapshot of the dram write address and applies configuration information to determine the correct dram, bank, row and column address to apply. The dramwrt_ldctrl sequencer immediately issues a read command to the Sram to which the Sram responds with both data and an acknowledge. The read data is stored within the dramwrt_data buffers by the dramwrt_ldctrl sequencer. The dramwrt_ldctrl sequencer passes the acknowledge to the level two master along with a size code indicating how much data was read during the Sram cycle allowing the update of pointers and counters. The dramwrt_seq has initiated a bank activate command at this point. Once sufficient data has been read from Sram, the dramwrt_seq sequencer issues a write command to the dram starting the burst cycle and computing a checksum as the data passes by. In the typical case, all the required data will be read from Sram before the data is ready to be written to SDRAM, thus ensuring infrequent wait states on the SDRAM interface. ECC checkbits are also computed by the dramwrt_data block as the data moves out to the SDRAM interface. It is also possible to force ECC errors to any bit position within the data byte or checkbits. The Sram read cycle repeats until the original burst request has been completed at which point the dramwrt_mux prioritizes any remaining requests in preparation for the next burst cycle. [0184]
  • Since the ECC is an 8 bit ECC for a 64 bit word, writes not aligned to a 64 bit boundary will necessitate a read/modify/write cycle. When the dramwrt_ldctrl sequencer detects that a non-aligned write is required, it will generate a request for the read to the dramrd controller. The dramrd controller then returns the read data which is loaded into the write buffers. The dramwrt_ldctrl sequencer can then request the new data from the Sram, proceeding from this point in the same way as for an aligned operation. [0185]
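  • Because the ECC covers each 64-bit dram word with its own checkbits, the decision made by the dramwrt_ldctrl sequencer reduces to the alignment test sketched below in C (the function name and the demonstration values are illustrative).

      #include <stdint.h>
      #include <stdbool.h>
      #include <stdio.h>

      /* ECC covers each 64-bit (8-byte) dram word with its own checkbits, so any
       * write that does not cover whole 8-byte words must first read the affected
       * words, merge in the new bytes, and write back with recomputed checkbits. */
      static bool needs_read_modify_write(uint32_t addr, uint32_t len)
      {
          return (addr % 8u) != 0 || ((addr + len) % 8u) != 0;
      }

      int main(void)
      {
          printf("%d\n", needs_read_modify_write(0x1000, 64));  /* 0: whole words  */
          printf("%d\n", needs_read_modify_write(0x1003, 64));  /* 1: partial word */
          return 0;
      }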
  • Contiguous dram burst cycles are not guaranteed to the dramwrt controller as an algorithm is implemented in the dramif which ensures highest priority to refresh cycles followed by ping-pong access between dram writes and dram reads and then configuration and flash cycles. [0186]
  • FIG. 26 is a timing diagram illustrating how data is written to SDRAM. The dram has been configured for a burst of four with a latency of two clock cycles. Bank A is first selected/activated followed by a write command two clock cycles later. The bank select/activate for bank B is next issued in preparation for issuing the second write command. As soon as the first 4 word burst to bank A completes we issue the write command for bank B and begin supplying data. Banks C and D follow if necessary. [0187]
  • Pci Master-Out Sequencer (Pmo). [0188]
  • The Pmo sequencer (See FIG. 27) acts only as a slave sequencer. Servicing requests issued by master sequencers, the Pmo sequencer moves data from an Sram based fifo to a Pci target, via the PciMstrIO module, in bursts of up to 256 bytes. The nature of the PCI bus dictates the use of the write line command to ensure optimal system performance. The write line command requires that the Pmo sequencer be capable of transferring a whole multiple (1×, 2×, 3×, . . . ) of cache lines of which the size is set through the Pci configuration registers. To accomplish this end, Pmo will automatically perform partial bursts until it has aligned the transfers on a cache line boundary at which time it will begin usage of the write line command. The Sram fifo depth, of 256 bytes, has been chosen in order to allow Pmo to accommodate cache line sizes up to 128 bytes. Provided the cache line size is less than 128 bytes, Pmo will perform multiple, contiguous cache line bursts until it has exhausted the supply of data. [0189]
  • Pmo receives requests from two separate sources; the dram to Pci (D2p) module and the Sram to Pci (S2p) module. An operation (See FIG. 27) first begins with prioritization of the requests where the S2p module is given highest priority. Next, the Pmo module takes a Snapshot of the Sram fifo address and uses this to generate read requests for the SramCtrl sequencer. The Pmo module then proceeds to arbitrate for ownership of the Pci bus via the PciMstrIO module. Once the Pmo holding registers have sufficient data and Pci bus mastership has been granted, the Pmo module begins transferring data to the Pci target. For each successful transfer, Pmo sends an acknowledge and encoded size to the master sequencer, allowing it to update its internal pointers, counters and status. Once the Pci burst transaction has terminated, Pmo parks on the Pci bus unless another initiator has requested ownership. Pmo again prioritizes the incoming requests and repeats the process. [0190]
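  • The alignment strategy described above, partial bursts until the Pci address reaches a cache-line boundary and then whole cache lines using the write-line command, is illustrated by the following C sketch. The planning function is a stand-in for, not a description of, the Pmo state machine.

      #include <stdint.h>
      #include <stdio.h>

      /* Plan a Pci master write the way the text describes: a partial burst up to
       * the next cache-line boundary, then whole cache lines that can use the
       * write-line command, then any trailing partial line. */
      static void plan_pci_write(uint32_t pci_addr, uint32_t avail, uint32_t line)
      {
          uint32_t lead = (line - (pci_addr % line)) % line;   /* bytes to align   */
          if (lead > avail)
              lead = avail;
          uint32_t lines = (avail - lead) / line;              /* write-line bursts */
          uint32_t tail  = (avail - lead) % line;              /* trailing bytes    */
          printf("addr 0x%08X: %u-byte lead, %u full lines, %u-byte tail\n",
                 (unsigned)pci_addr, (unsigned)lead, (unsigned)lines, (unsigned)tail);
      }

      int main(void)
      {
          plan_pci_write(0x12340040, 256, 64);   /* aligned: 4 full 64-byte lines  */
          plan_pci_write(0x12340030, 256, 64);   /* 16-byte lead, 3 lines, 48 tail */
          return 0;
      }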
  • Pci Master-In Sequencer (Pmi). [0191]
  • The Pmi sequencer (See FIG. 28) acts only as a slave sequencer. Servicing requests issued by master sequencers, the Pmi sequencer moves data from a Pci target to an Sram based fifo, via the PciMstrIO module, in bursts of up to 256 bytes. The nature of the PCI bus dictates the use of the read multiple command to ensure optimal system performance. The read multiple command requires that the Pmi sequencer be capable of transferring a cache line or more of data. To accomplish this end, Pmi will automatically perform partial cache line bursts until it has aligned the transfers on a cache line boundary at which time it will begin usage of the read multiple command. The Sram fifo depth, of 256 bytes, has been chosen in order to allow Pmi to accommodate cache line sizes up to 128 bytes. Provided the cache line size is less than 128 bytes, Pmi will perform multiple, contiguous cache line bursts until it has filled the fifo. [0192]
  • Pmi receives requests from two separate sources; the Pci to dram (P2d) module and the Pci to Sram (P2s) module. An operation (See FIG. 28) first begins with prioritization of the requests where the P2s module is given highest priority. The Pmi module then proceeds to arbitrate for ownership of the Pci bus via the PciMstrIO module. Once the Pci bus mastership has been granted and the Pmi holding registers have sufficient data, the Pmi module begins transferring data to the Sram fifo. For each successful transfer, Pmi sends an acknowledge and encoded size to the master sequencer, allowing it to update its internal pointers, counters and status. Once the Pci burst transaction has terminated, Pmi parks on the Pci bus unless another initiator has requested ownership. Pmi again prioritizes the incoming requests and repeats the process. [0193]
  • Dram To Pci Sequencer (D2p). [0194]
  • The D2p sequencer (See FIG. 29) acts as a master sequencer. Servicing channel requests issued by the Cpu, the D2p sequencer manages movement of data from dram to the Pci bus by issuing requests to both the Drd sequencer and the Pmo sequencer. Data transfer is accomplished using an Sram based fifo through which data is staged. [0195]
  • D2p can receive requests from any of the processor's thirty-two dma channels. Once a command request has been detected, D2p fetches a dma descriptor from an Sram location dedicated to the requesting channel which includes the dram address, Pci address, Pci endian and request size. D2p then issues a request to the Drd sequencer causing the Sram based fifo to fill with dram data. Once the fifo contains sufficient data for a Pci transaction, D2p issues a request to Pmo which in turn moves data from the fifo to a Pci target. The process repeats until the entire request has been satisfied at which time D2p writes ending status in to the Sram dma descriptor area and sets the channel done bit associated with that channel. D2p then monitors the dma channels for additional requests. FIG. 29 is an illustration showing the major blocks involved in the movement of data from dram to Pci target. [0196]
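  • The per-channel descriptor and the request loop described above are sketched below in C. Only the listed descriptor contents (dram address, Pci address, Pci endian and request size) and the channel done bit come from the text; the field names, the 256-byte staging granularity and the meaning of the status value are assumptions made for the sketch.

      #include <stdint.h>
      #include <stdbool.h>

      /* Shape of the per-channel dma descriptor fetched from Sram (field names
       * are illustrative; the contents listed follow the text). */
      typedef struct {
          uint32_t dram_addr;        /* source address in external dram          */
          uint64_t pci_addr;         /* destination address on the Pci bus       */
          bool     pci_big_endian;   /* Pci endian selection                     */
          uint32_t size;             /* request size in bytes                    */
          uint32_t status;           /* ending status written back on completion */
      } dma_desc_t;

      #define NUM_CHANNELS 32
      #define FIFO_CHUNK   256u      /* staging granularity; Pmo bursts up to 256 bytes */

      static dma_desc_t desc_area[NUM_CHANNELS];   /* dedicated Sram locations */
      static uint32_t   channel_done;              /* one done bit per channel */

      /* Service one channel request: stage dram data through the Sram fifo and
       * hand it to Pmo until the whole request has been satisfied. */
      static void d2p_service(unsigned ch)
      {
          dma_desc_t *d = &desc_area[ch];
          uint32_t moved = 0;
          while (moved < d->size) {
              uint32_t chunk = d->size - moved;
              if (chunk > FIFO_CHUNK)
                  chunk = FIFO_CHUNK;
              /* 1) request that the dram read sequencer fill the Sram fifo;
               * 2) request that Pmo burst the fifo contents to the Pci target. */
              moved += chunk;
          }
          d->status = 0;                     /* ending status (0 = ok, assumed) */
          channel_done |= 1u << ch;          /* set the channel done bit        */
      }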
  • Pci To Dram Sequencer (P2d). [0197]
  • The P2d sequencer (See FIG. 30) acts as both a slave sequencer and a master sequencer. Servicing channel requests issued by the Cpu, the P2d sequencer manages movement of data from Pci bus to dram by issuing requests to both the Dwr sequencer and the Pmi sequencer. Data transfer is accomplished using an Sram based fifo through which data is staged. [0198]
  • P2d can receive requests from any of the processor's thirty-two dma channels. Once a command request has been detected, P2d, operating as a slave sequencer, fetches a dma descriptor from an Sram location dedicated to the requesting channel which includes the dram address, Pci address, Pci endian and request size. P2d then issues a request to Pmi which in turn moves data from the Pci target to the Sram fifo. Next, P2d issues a request to the Dwr sequencer causing the Sram based fifo contents to be written to the dram. The process repeats until the entire request has been satisfied at which time P2d writes ending status in to the Sram dma descriptor area and sets the channel done bit associated with that channel. P2d then monitors the dma channels for additional requests. FIG. 30 is an illustration showing the major blocks involved in the movement of data from a Pci target to dram. [0199]
  • Sram to Pci Sequencer (S2p). [0200]
  • The S2p sequencer (See FIG. 31) acts as both a slave sequencer and a master sequencer. Servicing channel requests issued by the Cpu, the S2p sequencer manages movement of data from Sram to the Pci bus by issuing requests to the Pmo sequencer. [0201]
  • S2p can receive requests from any of the processor's thirty-two dma channels. Once a command request has been detected, S2p, operating as a slave sequencer, fetches a dma descriptor from an Sram location dedicated to the requesting channel which includes the Sram address, Pci address, Pci endian and request size. S2p then issues a request to Pmo which in turn moves data from the Sram to a Pci target. The process repeats until the entire request has been satisfied at which time S2p writes ending status in to the Sram dma descriptor area and sets the channel done bit associated with that channel. S2p then monitors the dma channels for additional requests. FIG. 31 is an illustration showing the major blocks involved in the movement of data from Sram to Pci target. [0202]
  • Pci To Sram Sequencer (P2s). [0203]
  • The P2s sequencer (See FIG. 32) acts as both a slave sequencer and a master sequencer. Servicing channel requests issued by the Cpu, the P2s sequencer manages movement of data from Pci bus to Sram by issuing requests to the Pmi sequencer. [0204]
  • P2s can receive requests from any of the processor's thirty-two dma channels. Once a command request has been detected, P2s, operating as a slave sequencer, fetches a dma descriptor from an Sram location dedicated to the requesting channel which includes the Sram address, Pci address, Pci endian and request size. P2s then issues a request to Pmi which in turn moves data from the Pci target to the Sram. The process repeats until the entire request has been satisfied at which time P2s writes ending status in to the dma descriptor area of Sram and sets the channel done bit associated with that channel. P2s then monitors the dma channels for additional requests. FIG. 32 is an illustration showing the major blocks involved in the movement of data from a Pci target to Sram. [0205]
  • Dram To Sram Sequencer (D2s). [0206]
  • The D2s sequencer (See FIG. 33) acts as both a slave sequencer and a master sequencer. Servicing channel requests issued by the Cpu, the D2s sequencer manages movement of data from dram to Sram by issuing requests to the Drd sequencer. [0207]
  • D2s can receive requests from any of the processor's thirty-two dma channels. Once a command request has been detected, D2s, operating as a slave sequencer, fetches a dma descriptor from an Sram location dedicated to the requesting channel which includes the dram address, Sram address and request size. D2s then issues a request to the Drd sequencer causing the transfer of data to the Sram. The process repeats until the entire request has been satisfied at which time D2s writes ending status in to the Sram dma descriptor area and sets the channel done bit associated with that channel. D2s then monitors the dma channels for additional requests. FIG. 33 is an illustration showing the major blocks involved in the movement of data from dram to Sram. [0208]
  • Sram to Dram Sequencer (S2d). [0209]
  • The S2d sequencer (See FIG. 34) acts as both a slave sequencer and a master sequencer. Servicing channel requests issued by the Cpu, the S2d sequencer manages movement of data from Sram to dram by issuing requests to the Dwr sequencer. [0210]
  • S2d can receive requests from any of the processor's thirty-two dma channels. Once a command request has been detected, S2d, operating as a slave sequencer, fetches a dma descriptor from an Sram location dedicated to the requesting channel which includes the dram address, Sram address, checksum reset and request size. S2d then issues a request to the Dwr sequencer causing the transfer of data to the dram. The process repeats until the entire request has been satisfied at which time S2d writes ending status in to the Sram dma descriptor area and sets the channel done bit associated with that channel. S2d then monitors the dma channels for additional requests. FIG. 34 is an illustration showing the major blocks involved in the movement of data from Sram to dram. [0211]
  • Pci Slave Input Sequencer (Psi). [0212]
  • The Psi sequencer (See FIG. 35) acts as both a slave sequencer and a master sequencer. Servicing requests issued by a Pci master, the Psi sequencer manages movement of data from Pci bus to Sram and Pci bus to dram via Sram by issuing requests to the SramCtrl and Dwr sequencers. [0213]
  • Psi manages write requests to configuration space, expansion rom, dram, Sram and memory mapped registers. Psi separates these Pci bus operations in to two categories with different action taken for each. Dram accesses result in Psi generating a write request to an Sram buffer followed with a write request to the Dwr sequencer. Subsequent write or read dram operations are retry terminated until the buffer has been emptied. An event notification is set for the processor allowing message passing to occur through dram space. [0214]
  • All other Pci write transactions result in Psi posting the write information including Pci address, Pci byte marks and Pci data to a reserved location in Sram, then setting an event flag which the event processor monitors. Subsequent writes or reads of configuration, expansion rom, Sram or registers are terminated with retry until the processor clears the event flag. This allows Mojave to keep pipelining levels to a minimum for the posted write and give the processor ample time to modify data for subsequent Pci read operations. [0215]
  • FIG. 35 depicts the sequence of events when Psi is the target of a Pci write operation. Note that events 4 through 7 occur only when the write operation targets the dram. [0216]
  • Pci Slave Output Sequencer (Pso). [0217]
  • The Pso sequencer (See FIG. 36) acts as both a slave sequencer and a master sequencer. Servicing requests issued by a Pci master, the Pso sequencer manages movement of data to Pci bus from Sram and to Pci bus from dram via Sram by issuing requests to the SramCtrl and Drd sequencers. [0218]
  • Pso manages read requests to configuration space, expansion rom, dram, Sram and memory mapped registers. Pso separates these Pci bus operations in to two categories with different action taken for each. Dram accesses result in Pso generating a read request to the Drd sequencer followed with a read request to the Sram buffer. Subsequent write or read dram operations are retry terminated until the buffer has been emptied. [0219]
  • All other Pci read transactions result in Pso posting the read request information including Pci address and Pci byte marks to a reserved location in Sram, then setting an event flag which the event processor monitors. Subsequent writes or reads of configuration, expansion rom, Sram or registers are terminated with retry until the processor clears the event flag. This allows Mojave to use a microcoded response mechanism to return data for the request. The processor decodes the request information, formulates or fetches the requested data and stores it in Sram then clears the event flag allowing Pso to fetch the data and return it on the Pci bus. [0220]
  • FIG. 36 depicts the sequence of events when Pso is the target of a Pci read operation. [0221]
  • Frame Receive Sequencer (RcvX).
  • The receive sequencer (RcvSeq)(See FIG. 37) analyzes and manages incoming packets, stores the result in dram buffers or sram buffers, then notifies the processor through the receive queue (RcvQ) mechanism. The process begins when a buffer descriptor is available at the output of the FreeQ (1). RcvSeq issues a request to the Qmg (2) which responds by supplying the buffer descriptor to RcvSeq (3). RcvSeq then waits for a receive packet (4). The Mac, network, transport and session information is analyzed as each byte is received (4) and stored in the assembly register (AssyReg). When sixteen bytes of information is available, RcvSeq requests a write of the data to the Sram (5). In normal mode, when sufficient data has been stored in the Sram based receive fifo, a Dram write request is issued to Dwr (8). The process continues until the entire packet has been received at which point RcvSeq stores the results of the packet analysis in the beginning of the receive buffer. Once the buffer and status have both been stored, RcvSeq issues a write-queue request to Qmg (12) using a QId based on the priority level of the incoming packet detected by RcvSeq. Qmg responds by storing a buffer descriptor (15) and, in normal mode, a status vector provided by RcvSeq (13). When QHashEn is set, RcvSeq will merge the CtxHash with the receive descriptor. The process then repeats. If RcvSeq detects the arrival of a packet before a free buffer is available, it ignores the packet and sets the PktMissed status bit for the next received packet. [0222]
  • FIG. 37 depicts the sequence of events for successful reception of a packet. FIG. 39 is a definition of the receive buffer. FIG. 40 is a definition of the receive buffer descriptor as stored on the RcvQ. FIG. 41 is a diagram that illustrates a receive vector. [0223]
  • Receive Priorities. [0224]
  • The receive sequencer (See FIG. 37) analyzes the vlan priorities of the incoming packets, and stores the receive descriptor in one of its receive queues according to the value written to the PriLevels bits of the RcvCfg register as represented in FIG. 38. Rev. A of Mojave has a bug which limits receive queues to 0 and 1. [0225]
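  • The receive-path decisions described above, dropping the packet and setting PktMissed when no free buffer is available, and otherwise queuing a descriptor on a receive queue chosen from the packet priority with the context hash merged in when QHashEn is set, are outlined in the C sketch below. The helper functions, the priority-to-queue mapping and the descriptor layout are illustrative stand-ins; the actual mapping is the one given in FIG. 38.

      #include <stdint.h>
      #include <stdbool.h>

      static bool pkt_missed;                     /* reported with the next packet */

      /* Stand-ins for the FreeQ check and the Qmg write-queue request. */
      static bool free_buffer_available(void) { return true; }
      static void write_rcv_queue(unsigned qid, uint32_t desc) { (void)qid; (void)desc; }

      /* Handle one arriving frame.  vlan_pri is the 3-bit vlan priority and
       * pri_levels the RcvCfg PriLevels setting (assumed to be 0..3 here). */
      static void rcv_frame(unsigned vlan_pri, unsigned pri_levels,
                            uint32_t buf_desc, uint32_t ctx_hash, bool qhash_en)
      {
          if (!free_buffer_available()) {         /* nothing at the FreeQ output  */
              pkt_missed = true;                  /* ignore packet; flag the next */
              return;
          }
          /* ...frame bytes are analyzed as they arrive, staged to Sram sixteen
           * bytes at a time, moved to the dram buffer, and the analysis results
           * are stored at the start of the receive buffer... */
          unsigned qid  = vlan_pri >> (3u - pri_levels);           /* assumed mapping */
          uint32_t desc = qhash_en ? (buf_desc | (ctx_hash << 16)) /* merge CtxHash   */
                                   : buf_desc;
          write_rcv_queue(qid, desc);             /* descriptor and status queued */
          pkt_missed = false;                     /* miss bit cleared once reported */
      }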
  • Frame Transmit Sequencer (XmtX). [0226]
  • The transmit sequencer (XmtSeq)(See FIG. 42) manages outgoing packets, using buffer descriptors retrieved from, in order of priority, the urgent descriptor register (XmtUrgDscr) followed by the transmit queues (XmtQ) priority 3 down to priority 0, then storing the descriptor for the freed buffer in the free buffer queue (FreeQ). The process begins when a buffer descriptor is available at, for example, the output of XmtQ2 (1). XmtSeq issues a request to the Qmg (2) which responds by supplying the buffer descriptor to XmtSeq (4). XmtSeq then issues a read request to the Drd (5) sequencer. Next, XmtSeq issues a read request to SramCtrl (6) then instructs the Mac (10) to begin frame transmission. Once the frame transmission has completed, XmtSeq stores the buffer descriptor on the FreeQ (12) thereby recycling the buffer. If XmtSeq detects a data-late condition or a collision, the packet is retransmitted automatically. [0227]
  • FIG. 42 depicts the sequence of events for successful transmission of a packet. FIG. 43 is a diagram of the transmit descriptor. FIG. 44 is a diagram of the merge descriptor. FIG. 45 is a diagram of the transmit buffer format. FIG. 46 is a diagram of the transmit vector. [0228]
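  • The descriptor-selection order described above, the urgent descriptor register first, then XmtQ priority 3 down to priority 0, with the freed buffer recycled on the FreeQ, is sketched below in C. The queue plumbing is an illustrative stand-in for the Qmg interface.

      #include <stdint.h>
      #include <stdbool.h>

      /* Minimal ring-queue stand-in for the Qmg-managed queues. */
      typedef struct { uint32_t entries[64]; unsigned head, tail; } xq_t;

      static uint32_t xmt_urg_dscr;                /* 0 taken to mean "empty"    */
      static xq_t     xmt_q[4];                    /* XmtQ0..XmtQ3               */
      static xq_t     free_q;                      /* FreeQ for recycled buffers */

      static bool q_pop(xq_t *q, uint32_t *out)
      {
          if (q->head == q->tail) return false;
          *out = q->entries[q->head++ % 64];
          return true;
      }

      static void q_push(xq_t *q, uint32_t v) { q->entries[q->tail++ % 64] = v; }

      /* Descriptor selection: the urgent descriptor register first, then the
       * transmit queues from priority 3 down to priority 0. */
      static bool next_transmit_descriptor(uint32_t *desc)
      {
          if (xmt_urg_dscr != 0) {
              *desc = xmt_urg_dscr;
              xmt_urg_dscr = 0;
              return true;
          }
          for (int pri = 3; pri >= 0; pri--)
              if (q_pop(&xmt_q[pri], desc))
                  return true;
          return false;
      }

      /* After the frame goes out, the buffer descriptor is recycled on the FreeQ. */
      static void transmit_complete(uint32_t desc) { q_push(&free_q, desc); }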
  • Queue Manager (Qmg). [0229]
  • Mojave includes special hardware assist for the implementation of message and pointer queues. The hardware assist is called the queue manager (Qmg) (See FIG. 47) and manages the movement of queue entries between Cpu and Sram, between Xcv sequencers and Sram as well as between Sram and Dram. Queues comprise three distinct entities; the queue head (QHd), the queue tail (QTl) and the queue body (QBdy). QHd resides in 64 bytes of scratch ram and provides the area to which entries will be written (pushed). QTl resides in 64 bytes of scratch ram and contains queue locations from which entries will be read (popped). QBdy resides in dram and contains locations for expansion of the queue in order to minimize the Sram space requirements. The QBdy size depends upon the queue being accessed and the initialization parameters presented during queue initialization. [0230]
  • Qmg (See FIG. 47) accepts operation requests from the Cpu, the XcvSeqs and the DmaSeqs, executing these operations at a frequency of 100 Mhz. Valid Cpu operations include initialize queue (InitQ), write queue (WrQ) and read queue (RdQ). Valid dma requests include read queue (RdQ), read body (RdBdy) and write body (WrBdy). Qmg, working in unison with Q2d and D2q, generates requests to the Dwr and Drd sequencers to control the movement of data between the QHd, QTl and QBdy. [0231]
  • There are a total of 32 queues. The first 8 are dedicated to a specific function as shown in FIG. 48. [0232]
  • FIG. 47 shows the major functions of Qmg. The arbiter selects the next operation to be performed. The dual-ported Sram holds the queue variables HdWrAddr, HdRdAddr, TlWrAddr, TlRdAddr, BdyWrAddr, BdyRdAddr and QSz. Qmg accepts an operation request, fetches the queue variables from the queue ram (Qram), modifies the variables based on the current state and the requested operation then updates the variables and issues a read or write request to the Sram controller. The Sram controller services the requests by writing the tail or reading the head and returning an acknowledge. [0233]
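  • To make the head/tail/body arrangement concrete, the following C sketch models a single queue whose 64-byte head and tail live in scratch ram while overflow spills to a dram-resident body. The spill and refill routines stand in for what Qmg accomplishes with Q2d, D2q and the Dwr and Drd sequencers; sizes other than the 64-byte head and tail are illustrative.

      #include <stdint.h>
      #include <stdbool.h>
      #include <string.h>

      #define HEAD_ENTRIES 16            /* 64-byte head of 32-bit entries       */
      #define TAIL_ENTRIES 16            /* 64-byte tail of 32-bit entries       */
      #define BODY_ENTRIES 1024          /* stands in for the dram-resident QBdy */

      typedef struct {
          uint32_t head[HEAD_ENTRIES];   unsigned head_cnt;   /* written (pushed) */
          uint32_t tail[TAIL_ENTRIES];   unsigned tail_cnt;   /* read (popped)    */
          uint32_t body[BODY_ENTRIES];   unsigned body_rd, body_wr;
      } queue_t;

      static void spill_head_to_body(queue_t *q)      /* Q2d plus Dwr in hardware */
      {
          for (unsigned i = 0; i < q->head_cnt; i++)
              q->body[q->body_wr++ % BODY_ENTRIES] = q->head[i];
          q->head_cnt = 0;
      }

      static void fill_tail_from_body(queue_t *q)     /* D2q plus Drd in hardware */
      {
          while (q->tail_cnt < TAIL_ENTRIES && q->body_rd != q->body_wr)
              q->tail[q->tail_cnt++] = q->body[q->body_rd++ % BODY_ENTRIES];
      }

      static void q_write(queue_t *q, uint32_t entry)  /* WrQ: push at the head */
      {
          if (q->head_cnt == HEAD_ENTRIES)
              spill_head_to_body(q);
          q->head[q->head_cnt++] = entry;
      }

      static bool q_read(queue_t *q, uint32_t *entry)  /* RdQ: pop from the tail */
      {
          if (q->tail_cnt == 0) {
              spill_head_to_body(q);                   /* expose pushed entries  */
              fill_tail_from_body(q);
          }
          if (q->tail_cnt == 0)
              return false;
          *entry = q->tail[0];
          q->tail_cnt--;
          memmove(q->tail, q->tail + 1, q->tail_cnt * sizeof(uint32_t));
          return true;
      }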
  • Dma Operations. [0234]
  • DMA operations are accomplished by seven dma sequencers (DmaSeq). Commands are sent to these sequencers via hardware queues. The queue Ids are fixed in hardware and are as shown in FIG. 49. [0235]
  • Microcode will initiate a DMA by writing a command to the appropriate queue. The DMA sequencer will read a command from the queue, and fetch the descriptor block from Sram. It will then do the DMA. At the end of the DMA, if the DMA chain bit is not set, the DMA sequencer will terminate the DMA. [0236]
  • For DMAs that complete without error, the DMA Context byte (bits 31:24 of the command) will be written to the termination queue indicated by bits 20:16 of the command. Each entry in the termination queue is 32 bits, but only the least significant 8 bits (7:0) are used and written with the DMA Context. [0237]
  • For DMAs that complete with error, the termination queue will not be written. Instead, a bit in the DMA Error Register will be set. This is a 32-bit register, and the least significant 5 bits of the DMA context are used to decide which bit is set, in the following manner: [0238]
  • DMA Error Register |= (1 << DMA command [28:24]); [0239]
  • If the Dummy DMA bit is set, no DMA is performed but the DMA context is written directly to the DMA termination queue. [0240]
  • If the DMA chain bit is set and the DMA completes without error, the DMA descriptor block is updated, but no other termination information is written. If the DMA chain bit is set and the DMA completes with an error, the DMA descriptor block is updated and the error is propagated to subsequent DMA commands until the sequencer finds one that does not have the chain bit set, at which point the DMA Error Register is written as above, without writing to the termination queue. [0241]
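The termination behavior described in the preceding paragraphs can be condensed into a short C sketch: a successful, unchained DMA writes its context byte to the termination queue selected by bits 20:16 of the command; an error sets the bit of the 32-bit DMA Error Register selected by the low 5 bits of the context; dummy and chained DMAs are handled specially. The helper names below are hypothetical.

    /* Hypothetical sketch of DMA termination handling as described above. */
    #include <stdint.h>
    #include <stdbool.h>

    extern void     term_q_write(unsigned q_id, uint32_t entry); /* 32-bit entries */
    extern uint32_t dma_error_register;

    void dma_complete(uint32_t cmd, bool error, bool chain_bit, bool dummy_bit)
    {
        uint32_t ctx    = (cmd >> 24) & 0xff;   /* DMA Context byte, bits 31:24  */
        unsigned term_q = (cmd >> 16) & 0x1f;   /* termination queue, bits 20:16 */

        if (dummy_bit) {
            /* Dummy DMA: no transfer, context goes straight to the termination queue. */
            term_q_write(term_q, ctx);
            return;
        }

        if (chain_bit) {
            /* Chained DMA: descriptor block is updated, no termination written here;
             * an error is propagated to the command that ends the chain. */
            return;
        }

        if (!error) {
            /* Only the least significant 8 bits of the 32-bit entry carry the context. */
            term_q_write(term_q, ctx);
        } else {
            /* Low 5 bits of the context select the bit to set:
             * DMA Error Register |= 1 << cmd[28:24]. No termination-queue write. */
            dma_error_register |= 1u << ((cmd >> 24) & 0x1f);
        }
    }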
  • The format of the P2d or P2s descriptor is shown in FIG. 50. [0242]
  • The format of the S2p or D2p descriptor is shown in FIG. 51. [0243]
  • The format of the S2d, D2d or D2s descriptor is shown in FIG. 52. [0244]
  • The format of the ending status of any dma is as shown in FIG. 53. [0245]
  • FIG. 54 shows the major blocks of PCI logic and their relationships. The blocks of FIG. 54 are as follows: [0246]
  • Slave Dram Interface: This block controls the interface to Dram when Dram is being accessed directly by the host or by another PCI master. [0247]
  • Slave Sram Interface: This block controls the access to Sram for PCI slave accesses to read Sram, or to read or write Dram. [0248]
  • Pci Configuration Registers: This block contains the configuration registers that control the PCI space. [0249]
  • DMA Master In: This block does PCI master transfers on behalf of the P2D and P2S DMA sequencers. There is synchronization logic to synchronize between the PCI bus and the SRAM, which are clocked by different clocks. It has 256 bytes of buffering to minimize latencies caused by this synchronization. [0250]
  • DMA Master Out: This block does PCI master transfers on behalf of the D2P and S2P DMA sequencers. There is synchronization logic to synchronize between the PCI bus and the SRAM, which are clocked by different clocks. It has 256 bytes of buffering to minimize latencies caused by this synchronization. [0251]
  • PCI Slave Interface: This block has the state machine for PCI slave accesses to Mojave, from the host or from another PCI master. [0252]
  • PCI Parity: This block generates and checks parity on the PCI bus. [0253]
  • PCI Master Interface: This block has the state machine for PCI master accesses to host memory or to another PCI slave, done on behalf of the DMA sequencers. [0254]
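As a rough software illustration of why the DMA Master In and DMA Master Out blocks carry 256 bytes of buffering, the sketch below models a simple ring buffer that lets the PCI-clock side and the SRAM-clock side make progress independently. The real hardware uses dedicated synchronization logic between the two clock domains; the names below are invented for illustration.

    /* Rough software model of the 256-byte decoupling buffer in the DMA Master
     * blocks. Real hardware synchronizes the two clock domains with dedicated
     * logic; this sketch only shows the buffering idea. Names are invented. */
    #include <stdint.h>
    #include <stdbool.h>

    #define BUF_SIZE 256u

    struct master_buf {
        uint8_t  data[BUF_SIZE];
        uint32_t wr;     /* advanced by the PCI-clock side  */
        uint32_t rd;     /* advanced by the SRAM-clock side */
    };

    static bool buf_put(struct master_buf *b, uint8_t byte)   /* PCI side */
    {
        if (b->wr - b->rd == BUF_SIZE)
            return false;                 /* full: PCI transfer must stall  */
        b->data[b->wr++ % BUF_SIZE] = byte;
        return true;
    }

    static bool buf_get(struct master_buf *b, uint8_t *byte)  /* SRAM side */
    {
        if (b->wr == b->rd)
            return false;                 /* empty: nothing to move to Sram */
        *byte = b->data[b->rd++ % BUF_SIZE];
        return true;
    }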

Claims (22)

What is claimed is:
1. A system, comprising:
(a) fast-path receive circuitry that is in control of a first plurality of TCP/IP connections, a first TCP/IP packet associated with one of the first plurality of TCP/IP connections being received onto the fast-path circuitry from a network, the fast-path receive circuitry comprising:
an SRAM that stores a control block (CB) for each TCP/IP connection of a first set of the first plurality of TCP/IP connections;
a DRAM that stores a CB for each TCP/IP connection of a second set of the first plurality of TCP/IP connections, the DRAM storing a CB associated with the first TCP/IP packet received onto the fast-path receive circuitry;
a content addressable memory (CAM); and
a first processor that executes a receive state machine, the first processor obtaining from the CAM information indicative of whether the CB associated with the first TCP/IP packet is stored in the SRAM or is stored in the DRAM, the first processor using the information obtained from the CAM to access the CB; and
(b) a processor that executes a protocol processing stack, the protocol processing stack being in control of a second plurality of TCP/IP connections, wherein TCP/IP packets associated with the second plurality of TCP/IP connections are received onto the fast-path circuitry from the network, the protocol processing stack performing TCP protocol processing on the TCP/IP packets associated with the second plurality of TCP/IP connections, and wherein other TCP/IP packets associated with the first plurality of TCP/IP connections are received onto the fast-path circuitry from the network, the protocol stack performing substantially no TCP protocol processing on the other TCP/IP packets associated with the first plurality of TCP/IP connections.
2. The system of claim 1, wherein the first processor accesses the CB associated with the first TCP/IP packet by causing the CB to be copied from the DRAM into the SRAM.
3. The system of claim 1, wherein the fast-path receive circuitry generates a hash for the first TCP/IP packet, and wherein the fast-path receive circuitry pushes the hash onto a queue, the first processor popping the queue and thereby obtaining the hash, the first processor then using the hash to identify the control block (CB) associated with the first TCP/IP packet.
4. The system of claim 1, wherein the SRAM includes a plurality of control block (CB) slots, and wherein the CAM contains a CAM entry for each of the CB slots in the SRAM.
5. The system of claim 1, wherein the control block (CB) associated with the first TCP/IP packet contains TCP state information.
6. The system of claim 1, wherein the control block (CB) associated with the first TCP/IP packet is a communication control block (CCB).
7. The system of claim 1, wherein the control block (CB) associated with the first TCP/IP packet is a transmit control block (TCB), the TCB comprising: TCP state information, a TCP source port address, a TCP destination port address, an IP source address, and an IP destination address.
8. The system of claim 1, wherein the fast-path receive circuitry further comprises:
a second processor that executes the receive state machine, the first processor and the second processor together performing TCP protocol processing and IP protocol processing on the first TCP/IP packet.
9. The system of claim 8, wherein one of the first and second processors performs initial processing on the first TCP/IP packet using the receive state machine and then stops processing the first TCP/IP packet and stores state information relating to a state of the receive state machine, and wherein the other of the first and second processors retrieves the state information and uses the retrieved state information to perform subsequent processing on the first TCP/IP packet using the receive state machine.
10. The system of claim 1, wherein the fast-path receive circuitry uses a plurality of hash buckets to identify control blocks (CBs) associated with incoming TCP/IP packets, some of the plurality of hash buckets being cached in the SRAM, others of the hash buckets being stored in DRAM.
11. The system of claim 10, wherein the first TCP/IP packet has an associated hash bucket, wherein if the associated hash bucket is stored in DRAM, then the associated hash bucket is copied into the SRAM, the associated hash bucket having a hash bucket entry, the first processor checking the hash bucket entry to determine whether TCP and IP fields of the hash bucket entry match TCP and IP fields of the first TCP/IP packet.
12. The system of claim 1, wherein the fast-path receive circuitry further comprises:
a plurality of lock bits, there being one lock bit for each of the first plurality of TCP/IP connections controlled by the fast-path receive circuitry, a lock bit indicating whether a control block (CB) associated with the lock bit has been locked by a processor context;
a lock table CAM; and
a lock table, wherein the lock table and the lock table CAM are used to identify a processor context that is waiting to gain control of the control block (CB).
13. The system of claim 12, wherein the lock table contains a plurality of entries, each entry identifying one of a plurality of processor contexts.
14. The system of claim 1, wherein each CB of the second set of the first plurality of TCP/IP connections is also stored in the SRAM.
15. The system of claim 1, further comprising a host CPU, the fast-path receive circuitry (a) and the processor (b) being part of a network interface device, the network interface device being coupled to the host CPU.
16. The system of claim 1, wherein the fast-path receive circuitry (a) is part of a network interface device, and wherein the processor (b) is a host CPU, the network interface device being coupled to the host CPU.
17. A system, comprising:
a first processor that executes a protocol processing stack; and
fast-path receive circuitry that receives an incoming TCP/IP packet and performs substantially all TCP and IP protocol processing on the TCP/IP packet, the TCP/IP packet containing a header portion and a data portion, the data portion being transferred into a destination identified by the first processor, the data portion being transferred without the header portion being transferred into the destination and without the protocol processing stack doing any TCP protocol processing on the TCP/IP packet, the fast-path receive circuitry comprising:
an SRAM that stores a first plurality of control blocks (CB);
a DRAM that stores a second plurality of control blocks (CB);
a content addressable memory (CAM); and
a second processor that executes a receive state machine, the second processor using the CAM to determine whether a control block (CB) associated with the incoming TCP/IP packet is stored in the SRAM, wherein if the control block is not stored in the SRAM but rather is stored in the DRAM, then the second processor causes the control block (CB) associated with the incoming TCP/IP packet to be copied into the SRAM.
18. The system of claim 17, wherein the incoming TCP/IP packet is associated with a TCP/IP connection, wherein control of the TCP/IP connection is passed from the first processor to the fast-path receive circuitry.
19. The system of claim 18, wherein control of the TCP/IP connection is passed to the fast-path receive circuitry by passing control of an associated control block (CB) to the fast-path receive circuitry.
20. A method, comprising:
receiving a TCP/IP packet onto a network interface device;
generating a hash from the TCP/IP packet and pushing the hash onto a queue, the queue being located on the network interface device;
popping the queue to retrieve the hash and using the hash to identify a hash bucket;
determining that the hash bucket identified by the hash is stored in a DRAM and copying the hash bucket from the DRAM and into an SRAM, the DRAM and the SRAM both being part of the network interface device;
searching a plurality of hash entries in the identified hash bucket and determining from one of the hash entries a control block number;
using a content addressable memory (CAM) to determine that a control block (CB) associated with the control block number is located in the DRAM, the CAM being part of the network interface device;
copying the control block (CB) from the DRAM and into the SRAM; and
using the control block (CB) to fast-path process the TCP/IP packet on the network interface device, the network interface device transferring a data portion of the TCP/IP packet into a destination, the destination having been identified by a processor, the processor executing a protocol processing stack, the network interface device transferring the data portion into the destination identified by the processor without the protocol processing stack of the processor performing any TCP protocol processing on the TCP/IP packet.
21. The method of claim 20, wherein the processor is a CPU of a host computer, the network interface device being coupled to the host computer, the destination being located in a memory of the host computer.
22. The method of claim 20, wherein the processor is a part of the network interface device.
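Although the claims describe hardware and firmware rather than software, the receive-path lookup recited in claim 20 can be summarized in C roughly as follows: hash the packet's TCP and IP fields, fetch the hash bucket (copying it from DRAM into SRAM if necessary), match the bucket entries against the packet to obtain a control block number, and consult the CAM to decide whether the control block itself must be copied from DRAM into SRAM before fast-path processing. All types and helper names below are hypothetical.

    /* Illustrative C rendering of the lookup steps recited in claim 20.
     * Every name here is hypothetical. */
    #include <stdint.h>
    #include <stdbool.h>
    #include <stddef.h>

    struct pkt4 {                       /* fields compared per claim 11 */
        uint32_t ip_src, ip_dst;
        uint16_t tcp_src, tcp_dst;
    };

    struct hash_entry  { struct pkt4 key; uint16_t cb_num; bool valid; };
    struct hash_bucket { struct hash_entry entry[4]; };

    extern uint16_t hash_of(const struct pkt4 *p);              /* hardware hash   */
    extern struct hash_bucket *bucket_in_sram(uint16_t h);      /* NULL if in DRAM */
    extern struct hash_bucket *copy_bucket_to_sram(uint16_t h);
    extern bool cam_lookup_cb(uint16_t cb_num, void **sram_cb); /* CB cached?      */
    extern void *copy_cb_to_sram(uint16_t cb_num);
    extern void fast_path_process(void *cb, const void *pkt);

    void rcv_lookup(const struct pkt4 *hdr, const void *pkt)
    {
        uint16_t h = hash_of(hdr);                   /* hash pushed onto a queue,    */
        struct hash_bucket *b = bucket_in_sram(h);   /* then popped by the processor */
        if (b == NULL)
            b = copy_bucket_to_sram(h);              /* bucket was in DRAM           */

        for (size_t i = 0; i < 4; i++) {
            const struct hash_entry *e = &b->entry[i];
            if (e->valid &&
                e->key.ip_src == hdr->ip_src && e->key.ip_dst == hdr->ip_dst &&
                e->key.tcp_src == hdr->tcp_src && e->key.tcp_dst == hdr->tcp_dst) {
                void *cb;
                if (!cam_lookup_cb(e->cb_num, &cb))  /* CAM says CB not in SRAM      */
                    cb = copy_cb_to_sram(e->cb_num); /* copy CB from DRAM into SRAM  */
                fast_path_process(cb, pkt);          /* data placed in host destination */
                return;
            }
        }
        /* No match: packet is handed to the protocol processing stack (slow path). */
    }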
US10/420,364 2002-04-22 2003-04-22 TCP/IP offload device Expired - Fee Related US7496689B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/420,364 US7496689B2 (en) 2002-04-22 2003-04-22 TCP/IP offload device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US37478802P 2002-04-22 2002-04-22
US10/420,364 US7496689B2 (en) 2002-04-22 2003-04-22 TCP/IP offload device

Publications (2)

Publication Number Publication Date
US20040062245A1 true US20040062245A1 (en) 2004-04-01
US7496689B2 US7496689B2 (en) 2009-02-24

Family

ID=32033362

Family Applications (1)

Application Number Title Priority Date Filing Date
US10/420,364 Expired - Fee Related US7496689B2 (en) 2002-04-22 2003-04-22 TCP/IP offload device

Country Status (1)

Country Link
US (1) US7496689B2 (en)

Cited By (127)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030165160A1 (en) * 2001-04-24 2003-09-04 Minami John Shigeto Gigabit Ethernet adapter
US20040042464A1 (en) * 2002-08-30 2004-03-04 Uri Elzur System and method for TCP/IP offload independent of bandwidth delay product
US20040042412A1 (en) * 2002-09-04 2004-03-04 Fan Kan Frankie System and method for fault tolerant TCP offload
US20040049580A1 (en) * 2002-09-05 2004-03-11 International Business Machines Corporation Receive queue device with efficient queue flow control, segment placement and virtualization mechanisms
US20040062267A1 (en) * 2002-03-06 2004-04-01 Minami John Shigeto Gigabit Ethernet adapter supporting the iSCSI and IPSEC protocols
US20040081202A1 (en) * 2002-01-25 2004-04-29 Minami John S Communications processor
US20040100907A1 (en) * 2002-11-25 2004-05-27 Illikkal Rameshkumar G. Managing a protocol control block cache in a network device
US20040117496A1 (en) * 2002-12-12 2004-06-17 Nexsil Communications, Inc. Networked application request servicing offloaded from host
WO2004021150A3 (en) * 2002-08-30 2004-08-12 Broadcom Corp System and method for tpc/ip offload independent of bandwidth delay product
US20040172485A1 (en) * 2001-04-11 2004-09-02 Kianoosh Naghshineh Multi-purpose switching network interface controller
US20040177307A1 (en) * 2002-06-28 2004-09-09 Interdigital Technology Corporation System and method for transmitting a sequence of data blocks
US20040213290A1 (en) * 1998-06-11 2004-10-28 Johnson Michael Ward TCP/IP/PPP modem
US20040221050A1 (en) * 2003-05-02 2004-11-04 Graham Smith Direct TCP/IP communication method and system for coupling to a CPU/Memory complex
US20040246974A1 (en) * 2003-06-05 2004-12-09 Gyugyi Paul J. Storing and accessing TCP connection information
US20050025502A1 (en) * 2003-06-12 2005-02-03 Finisar Modular optical device that interfaces with an external controller
US20050050250A1 (en) * 2003-08-29 2005-03-03 Finisar Computer system with modular optical devices
US20050138180A1 (en) * 2003-12-19 2005-06-23 Iredy Corporation Connection management system and method for a transport offload engine
US20050149632A1 (en) * 2003-12-19 2005-07-07 Iready Corporation Retransmission system and method for a transport offload engine
US20050195851A1 (en) * 2004-02-12 2005-09-08 International Business Machines Corporation System, apparatus and method of aggregating TCP-offloaded adapters
US20050246450A1 (en) * 2004-04-28 2005-11-03 Yutaka Enko Network protocol processing device
US20050271042A1 (en) * 2000-11-10 2005-12-08 Michael Johnson Internet modem streaming socket method
US20050281262A1 (en) * 2004-06-17 2005-12-22 Zur Uri E Method and system for supporting read operations for iSCSI and iSCSI chimney
US20060010238A1 (en) * 2001-03-07 2006-01-12 Alacritech, Inc. Port aggregation for network connections that are offloaded to network interface devices
US20060031818A1 (en) * 1997-05-08 2006-02-09 Poff Thomas C Hardware accelerator for an object-oriented programming language
US20060059273A1 (en) * 2004-09-16 2006-03-16 Carnevale Michael J Envelope packet architecture for broadband engine
US20060104303A1 (en) * 2004-11-16 2006-05-18 Srihari Makineni Packet coalescing
US20060114939A1 (en) * 2000-07-25 2006-06-01 Juniper Networks, Inc. Network architecture and methods for transparent on-line cross-sessional encoding and transport of network communications data
US20060259291A1 (en) * 2005-05-12 2006-11-16 International Business Machines Corporation Internet SCSI communication via UNDI services
US20070140297A1 (en) * 2005-12-16 2007-06-21 Shen-Ming Chung Extensible protocol processing system
US20070153818A1 (en) * 2005-12-29 2007-07-05 Sridhar Lakshmanamurthy On-device packet descriptor cache
US20070283041A1 (en) * 2006-06-01 2007-12-06 Shen-Ming Chung System and method for recognizing offloaded packets
US7324547B1 (en) 2002-12-13 2008-01-29 Nvidia Corporation Internet protocol (IP) router residing in a processor chipset
US20080040519A1 (en) * 2006-05-02 2008-02-14 Alacritech, Inc. Network interface device with 10 Gb/s full-duplex transfer rate
US20080056124A1 (en) * 2003-06-05 2008-03-06 Sameer Nanda Using TCP/IP offload to accelerate packet filtering
US7362772B1 (en) 2002-12-13 2008-04-22 Nvidia Corporation Network processing pipeline chipset for routing and host packet processing
US20080109562A1 (en) * 2006-11-08 2008-05-08 Hariramanathan Ramakrishnan Network Traffic Controller (NTC)
US20080117911A1 (en) * 2006-11-21 2008-05-22 Yasantha Rajakarunanayake System and method for a software-based TCP/IP offload engine for digital media renderers
US7386619B1 (en) * 2003-01-06 2008-06-10 Slt Logic, Llc System and method for allocating communications to processors in a multiprocessor system
US20080198781A1 (en) * 2007-02-20 2008-08-21 Yasantha Rajakarunanayake System and method for a software-based TCP/IP offload engine for implementing efficient digital media streaming over Internet protocol networks
US20080205407A1 (en) * 2000-11-17 2008-08-28 Andrew Chang Network switch cross point
US20080313687A1 (en) * 2007-06-18 2008-12-18 Yasantha Nirmal Rajakarunanayake System and method for just in time streaming of digital programs for network recording and relaying over internet protocol network
US20090022413A1 (en) * 2000-07-25 2009-01-22 Juniper Networks, Inc. System and method for incremental and continuous data compression
US7594002B1 (en) * 2003-02-14 2009-09-22 Istor Networks, Inc. Hardware-accelerated high availability integrated networked storage system
US7616563B1 (en) 2005-08-31 2009-11-10 Chelsio Communications, Inc. Method to implement an L4-L7 switch using split connections and an offloading NIC
US20090279549A1 (en) * 2005-12-28 2009-11-12 Foundry Networks, Inc. Hitless software upgrades
US20090288013A1 (en) * 2008-05-16 2009-11-19 Honeywell International Inc. Scalable User Interface System
US20090323691A1 (en) * 2008-06-30 2009-12-31 Sun Microsystems, Inc. Method and apparatus to provide virtual toe interface with fail-over
US7647436B1 (en) * 2005-04-29 2010-01-12 Sun Microsystems, Inc. Method and apparatus to interface an offload engine network interface with a host machine
US7660264B1 (en) 2005-12-19 2010-02-09 Chelsio Communications, Inc. Method for traffic scheduling in intelligent network interface circuitry
US7660306B1 (en) 2006-01-12 2010-02-09 Chelsio Communications, Inc. Virtualizing the operation of intelligent network interface circuitry
US7673074B1 (en) * 2002-10-24 2010-03-02 Emulex Design & Manufacturing Corporation Avoiding port collisions in hardware-accelerated network protocol
US20100061393A1 (en) * 2003-05-15 2010-03-11 Foundry Networks, Inc. System and Method for High Speed Packet Transmission
US7689702B1 (en) * 2003-10-31 2010-03-30 Sun Microsystems, Inc. Methods and apparatus for coordinating processing of network connections between two network protocol stacks
US7698413B1 (en) 2004-04-12 2010-04-13 Nvidia Corporation Method and apparatus for accessing and maintaining socket control information for high speed network connections
US7715436B1 (en) 2005-11-18 2010-05-11 Chelsio Communications, Inc. Method for UDP transmit protocol offload processing with traffic management
US7724658B1 (en) 2005-08-31 2010-05-25 Chelsio Communications, Inc. Protocol offload transmit traffic management
US20100135313A1 (en) * 2002-05-06 2010-06-03 Foundry Networks, Inc. Network routing system for enhanced efficiency and monitoring capability
US20100161894A1 (en) * 2004-10-29 2010-06-24 Foundry Networks, Inc. Double density content addressable memory (cam) lookup scheme
US7760733B1 (en) 2005-10-13 2010-07-20 Chelsio Communications, Inc. Filtering ingress packets in network interface circuitry
US7826350B1 (en) 2007-05-11 2010-11-02 Chelsio Communications, Inc. Intelligent network adaptor with adaptive direct data placement scheme
US7831745B1 (en) 2004-05-25 2010-11-09 Chelsio Communications, Inc. Scalable direct memory access using validation of host and scatter gather engine (SGE) generation indications
US7831720B1 (en) 2007-05-17 2010-11-09 Chelsio Communications, Inc. Full offload of stateful connections, with partial connection offload
US7849208B2 (en) 2002-08-30 2010-12-07 Broadcom Corporation System and method for TCP offload
US7912064B2 (en) 2002-08-30 2011-03-22 Broadcom Corporation System and method for handling out-of-order frames
US7934021B2 (en) 2002-08-29 2011-04-26 Broadcom Corporation System and method for network interfacing
US7978614B2 (en) 2007-01-11 2011-07-12 Foundry Network, LLC Techniques for detecting non-receipt of fault detection protocol packets
US20110208874A1 (en) * 2002-01-15 2011-08-25 Intel Corporation Packet aggregation
US8037399B2 (en) 2007-07-18 2011-10-11 Foundry Networks, Llc Techniques for segmented CRC design in high speed networks
US20110270976A1 (en) * 2008-09-19 2011-11-03 Masama Yasuda Network protocol processing system and network protocol processing method
US8060644B1 (en) 2007-05-11 2011-11-15 Chelsio Communications, Inc. Intelligent network adaptor with end-to-end flow control
US8065439B1 (en) * 2003-12-19 2011-11-22 Nvidia Corporation System and method for using metadata in the context of a transport offload engine
US8090901B2 (en) 2009-05-14 2012-01-03 Brocade Communications Systems, Inc. TCAM management approach that minimize movements
US8116203B2 (en) 2001-07-23 2012-02-14 Broadcom Corporation Multiple virtual channels for use in network devices
US8135016B2 (en) 2002-03-08 2012-03-13 Broadcom Corporation System and method for identifying upper layer protocol message boundaries
US8149839B1 (en) 2007-09-26 2012-04-03 Foundry Networks, Llc Selection of trunk ports and paths using rotation
US8170044B2 (en) 2002-05-06 2012-05-01 Foundry Networks, Llc Pipeline method and system for switching packets
US8180928B2 (en) 2002-08-30 2012-05-15 Broadcom Corporation Method and system for supporting read operations with CRC for iSCSI and iSCSI chimney
US8194666B2 (en) 2002-05-06 2012-06-05 Foundry Networks, Llc Flexible method for processing data packets in a network routing system for enhanced efficiency and monitoring capability
US8238255B2 (en) 2006-11-22 2012-08-07 Foundry Networks, Llc Recovering from failures without impact on data traffic in a shared bus architecture
US8271859B2 (en) 2007-07-18 2012-09-18 Foundry Networks Llc Segmented CRC design in high speed networks
WO2012135442A1 (en) * 2011-03-30 2012-10-04 Amazon Technologies, Inc. Frameworks and interfaces for offload device-based packet processing
US20130185378A1 (en) * 2012-01-18 2013-07-18 LineRate Systems, Inc. Cached hash table for networking
US8493988B2 (en) 2004-03-26 2013-07-23 Foundry Networks, Llc Method and apparatus for aggregating input data streams
US8549345B1 (en) * 2003-10-31 2013-10-01 Oracle America, Inc. Methods and apparatus for recovering from a failed network interface card
US8589587B1 (en) 2007-05-11 2013-11-19 Chelsio Communications, Inc. Protocol offload in intelligent network adaptor, including application level signalling
US8599850B2 (en) 2009-09-21 2013-12-03 Brocade Communications Systems, Inc. Provisioning single or multistage networks using ethernet service instances (ESIs)
US20140056315A1 (en) * 2005-12-02 2014-02-27 Broadcom Corporation Method and system for speed negotiation for twisted pair links in fibre channel systems
US8671219B2 (en) 2002-05-06 2014-03-11 Foundry Networks, Llc Method and apparatus for efficiently processing data packets in a computer network
US8730961B1 (en) 2004-04-26 2014-05-20 Foundry Networks, Llc System and method for optimizing router lookup
US8750320B2 (en) 1997-01-23 2014-06-10 Broadcom Corporation Fibre channel arbitrated loop bufferless switch circuitry to increase bandwidth without significant increase in cost
US20140180903A1 (en) * 2012-03-27 2014-06-26 Ip Reservoir, Llc Offload Processing of Data Packets
US8774213B2 (en) 2011-03-30 2014-07-08 Amazon Technologies, Inc. Frameworks and interfaces for offload device-based packet processing
US8798091B2 (en) 1998-11-19 2014-08-05 Broadcom Corporation Fibre channel arbitrated loop bufferless switch circuitry to increase bandwidth without significant increase in cost
US8891970B2 (en) 2003-08-29 2014-11-18 Finisar Corporation Modular optical device with mixed signal interface
US8935406B1 (en) 2007-04-16 2015-01-13 Chelsio Communications, Inc. Network adaptor configured for connection establishment offload
US8964754B2 (en) 2000-11-17 2015-02-24 Foundry Networks, Llc Backplane interface adapter with error control and redundant fabric
US9042403B1 (en) 2011-03-30 2015-05-26 Amazon Technologies, Inc. Offload device for stateless packet processing
US9047417B2 (en) 2012-10-29 2015-06-02 Intel Corporation NUMA aware network interface
US9065571B2 (en) 2003-08-29 2015-06-23 Finisar Corporation Modular controller that interfaces with modular optical device
US9154453B2 (en) 2009-01-16 2015-10-06 F5 Networks, Inc. Methods and systems for providing direct DMA
US9152483B2 (en) 2009-01-16 2015-10-06 F5 Networks, Inc. Network devices with multiple fully isolated and independently resettable direct memory access channels and methods thereof
US9270602B1 (en) * 2012-12-31 2016-02-23 F5 Networks, Inc. Transmit rate pacing of large network traffic bursts to reduce jitter, buffer overrun, wasted bandwidth, and retransmissions
US9313047B2 (en) 2009-11-06 2016-04-12 F5 Networks, Inc. Handling high throughput and low latency network data packets in a traffic management device
US9547680B2 (en) 2005-03-03 2017-01-17 Washington University Method and apparatus for performing similarity searching
US9582831B2 (en) 2006-06-19 2017-02-28 Ip Reservoir, Llc High speed processing of financial information using FPGA devices
GB2542373A (en) * 2015-09-16 2017-03-22 Nanospeed Tech Ltd TCP/IP offload system
US9864606B2 (en) 2013-09-05 2018-01-09 F5 Networks, Inc. Methods for configurable hardware logic device reloading and devices thereof
US9990393B2 (en) 2012-03-27 2018-06-05 Ip Reservoir, Llc Intelligent feed switch
US10037568B2 (en) 2010-12-09 2018-07-31 Ip Reservoir, Llc Method and apparatus for managing orders in financial markets
US10062115B2 (en) 2008-12-15 2018-08-28 Ip Reservoir, Llc Method and apparatus for high-speed processing of financial market depth data
US10121196B2 (en) 2012-03-27 2018-11-06 Ip Reservoir, Llc Offload processing of data packets containing financial market data
US10229453B2 (en) 2008-01-11 2019-03-12 Ip Reservoir, Llc Method and system for low latency basket calculation
US10375155B1 (en) 2013-02-19 2019-08-06 F5 Networks, Inc. System and method for achieving hardware acceleration for asymmetric flow connections
US10684973B2 (en) 2013-08-30 2020-06-16 Intel Corporation NUMA node peripheral switch
US11436672B2 (en) 2012-03-27 2022-09-06 Exegy Incorporated Intelligent switch for processing financial market data
US20220350543A1 (en) * 2021-04-29 2022-11-03 EMC IP Holding Company LLC Methods and systems for storing data in a distributed system using offload components
US11537716B1 (en) 2018-11-13 2022-12-27 F5, Inc. Methods for detecting changes to a firmware and devices thereof
US11567704B2 (en) 2021-04-29 2023-01-31 EMC IP Holding Company LLC Method and systems for storing data in a storage pool using memory semantics with applications interacting with emulated block devices
US11579976B2 (en) 2021-04-29 2023-02-14 EMC IP Holding Company LLC Methods and systems parallel raid rebuild in a distributed storage system
US11669259B2 (en) 2021-04-29 2023-06-06 EMC IP Holding Company LLC Methods and systems for methods and systems for in-line deduplication in a distributed storage system
US11677633B2 (en) 2021-10-27 2023-06-13 EMC IP Holding Company LLC Methods and systems for distributing topology information to client nodes
US11741056B2 (en) 2019-11-01 2023-08-29 EMC IP Holding Company LLC Methods and systems for allocating free space in a sparse file system
US11740822B2 (en) 2021-04-29 2023-08-29 EMC IP Holding Company LLC Methods and systems for error detection and correction in a distributed storage system
US11762682B2 (en) 2021-10-27 2023-09-19 EMC IP Holding Company LLC Methods and systems for storing data in a distributed system using offload components with advanced data services
US11855898B1 (en) 2018-03-14 2023-12-26 F5, Inc. Methods for traffic dependent direct memory access optimization and devices thereof
US11892983B2 (en) 2021-04-29 2024-02-06 EMC IP Holding Company LLC Methods and systems for seamless tiering in a distributed storage system
US11922071B2 (en) 2021-10-27 2024-03-05 EMC IP Holding Company LLC Methods and systems for storing data in a distributed system using offload components and a GPU module

Families Citing this family (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8621101B1 (en) 2000-09-29 2013-12-31 Alacritech, Inc. Intelligent network storage interface device
US6434620B1 (en) 1998-08-27 2002-08-13 Alacritech, Inc. TCP/IP offload network interface device
US7167927B2 (en) * 1997-10-14 2007-01-23 Alacritech, Inc. TCP/IP offload device with fast-path TCP ACK generating and transmitting mechanism
US6757746B2 (en) * 1997-10-14 2004-06-29 Alacritech, Inc. Obtaining a destination address so that a network interface device can write network data without headers directly into host memory
US6226680B1 (en) 1997-10-14 2001-05-01 Alacritech, Inc. Intelligent network interface system method for protocol processing
US8782199B2 (en) * 1997-10-14 2014-07-15 A-Tech Llc Parsing a packet header
US8539112B2 (en) 1997-10-14 2013-09-17 Alacritech, Inc. TCP/IP offload device
US6697868B2 (en) * 2000-02-28 2004-02-24 Alacritech, Inc. Protocol processing stack for use with intelligent network interface device
US8019901B2 (en) 2000-09-29 2011-09-13 Alacritech, Inc. Intelligent network storage interface system
US7543087B2 (en) * 2002-04-22 2009-06-02 Alacritech, Inc. Freeing transmit memory on a network interface device prior to receiving an acknowledgement that transmit data has been received by a remote device
US7403542B1 (en) * 2002-07-19 2008-07-22 Qlogic, Corporation Method and system for processing network data packets
US6996070B2 (en) * 2003-12-05 2006-02-07 Alacritech, Inc. TCP/IP offload device with reduced sequential processing
US7636372B2 (en) * 2003-12-19 2009-12-22 Broadcom Corporation Method and system for providing smart offload and upload
US8248939B1 (en) 2004-10-08 2012-08-21 Alacritech, Inc. Transferring control of TCP connections between hierarchy of processing mechanisms
US7797460B2 (en) * 2005-03-17 2010-09-14 Microsoft Corporation Enhanced network system through the combination of network objects
US20080263171A1 (en) * 2007-04-19 2008-10-23 Alacritech, Inc. Peripheral device that DMAS the same data to different locations in a computer
US8539513B1 (en) 2008-04-01 2013-09-17 Alacritech, Inc. Accelerating data transfer in a virtual computer system with tightly coupled TCP connections
US9106592B1 (en) * 2008-05-18 2015-08-11 Western Digital Technologies, Inc. Controller and method for controlling a buffered data transfer device
US8341286B1 (en) 2008-07-31 2012-12-25 Alacritech, Inc. TCP offload send optimization
US9306793B1 (en) 2008-10-22 2016-04-05 Alacritech, Inc. TCP offload device that batches session layer headers to reduce interrupts as well as CPU copies
US8880696B1 (en) 2009-01-16 2014-11-04 F5 Networks, Inc. Methods for sharing bandwidth across a packetized bus and systems thereof
US8402183B2 (en) * 2010-10-06 2013-03-19 Lsi Corporation System and method for coordinating control settings for hardware-automated I/O processors
US10135831B2 (en) 2011-01-28 2018-11-20 F5 Networks, Inc. System and method for combining an access control system with a traffic management system
WO2015095000A1 (en) 2013-12-16 2015-06-25 F5 Networks, Inc. Methods for facilitating improved user authentication using persistent data and devices thereof
US10015143B1 (en) 2014-06-05 2018-07-03 F5 Networks, Inc. Methods for securing one or more license entitlement grants and devices thereof
US10972453B1 (en) 2017-05-03 2021-04-06 F5 Networks, Inc. Methods for token refreshment based on single sign-on (SSO) for federated identity environments and devices thereof

Citations (96)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4589063A (en) * 1983-08-04 1986-05-13 Fortune Systems Corporation Data processing system having automatic configuration
US4991133A (en) * 1988-10-07 1991-02-05 International Business Machines Corp. Specialized communications processor for layered protocols
US5021446A (en) * 1988-05-06 1991-06-04 Sumitomo Chemical Company Pyrazole compounds, insecticidal and acaricidal compositions and use
US5097442A (en) * 1985-06-20 1992-03-17 Texas Instruments Incorporated Programmable depth first-in, first-out memory
US5212778A (en) * 1988-05-27 1993-05-18 Massachusetts Institute Of Technology Message-driven processor in a concurrent computer
US5223242A (en) * 1985-11-05 1993-06-29 The General Hospital Corporation Negatively charged specific affinity reagents
US5280477A (en) * 1992-08-17 1994-01-18 E-Systems, Inc. Network synchronous data distribution system
US5289580A (en) * 1991-05-10 1994-02-22 Unisys Corporation Programmable multiple I/O interface controller
US5289023A (en) * 1991-02-19 1994-02-22 Synaptics, Incorporated High-density photosensor and contactless imaging array having wide dynamic range
US5303344A (en) * 1989-03-13 1994-04-12 Hitachi, Ltd. Protocol processing apparatus for use in interfacing network connected computer systems utilizing separate paths for control information and data transfer
US5412782A (en) * 1992-07-02 1995-05-02 3Com Corporation Programmed I/O ethernet adapter with early interrupts for accelerating data transfer
US5418912A (en) * 1990-08-20 1995-05-23 International Business Machines Corporation System and method for controlling buffer transmission of data packets by limiting buffered data packets in a communication session
US5485579A (en) * 1989-09-08 1996-01-16 Auspex Systems, Inc. Multiple facility operating system architecture
US5485455A (en) * 1994-01-28 1996-01-16 Cabletron Systems, Inc. Network having secure fast packet switching and guaranteed quality of service
US5485460A (en) * 1994-08-19 1996-01-16 Microsoft Corporation System and method for running multiple incompatible network protocol stacks
US5506966A (en) * 1991-12-17 1996-04-09 Nec Corporation System for message traffic control utilizing prioritized message chaining for queueing control ensuring transmission/reception of high priority messages
US5511169A (en) * 1992-03-02 1996-04-23 Mitsubishi Denki Kabushiki Kaisha Data transmission apparatus and a communication path management method therefor
US5517668A (en) * 1994-01-10 1996-05-14 Amdahl Corporation Distributed protocol framework
US5524250A (en) * 1991-08-23 1996-06-04 Silicon Graphics, Inc. Central processing unit for processing a plurality of threads using dedicated general purpose registers and masque register for providing access to the registers
US5592622A (en) * 1995-05-10 1997-01-07 3Com Corporation Network intermediate system with message passing architecture
US5598410A (en) * 1994-12-29 1997-01-28 Storage Technology Corporation Method and apparatus for accelerated packet processing
US5619650A (en) * 1992-12-31 1997-04-08 International Business Machines Corporation Network processor for transforming a message transported from an I/O channel to a network by adding a message identifier and then converting the message
US5629933A (en) * 1995-06-07 1997-05-13 International Business Machines Corporation Method and system for enhanced communication in a multisession packet based communication system
US5634127A (en) * 1994-11-30 1997-05-27 International Business Machines Corporation Methods and apparatus for implementing a message driven processor in a client-server environment
US5634099A (en) * 1994-12-09 1997-05-27 International Business Machines Corporation Direct memory access unit for transferring data between processor memories in multiprocessing systems
US5633780A (en) * 1994-12-21 1997-05-27 Polaroid Corporation Electrostatic discharge protection device
US5642482A (en) * 1992-12-22 1997-06-24 Bull, S.A. System for network transmission using a communication co-processor comprising a microprocessor to implement protocol layer and a microprocessor to manage DMA
US5727142A (en) * 1996-05-03 1998-03-10 International Business Machines Corporation Method for a non-disruptive host connection switch after detection of an error condition or during a host outage or failure
US5742765A (en) * 1996-06-19 1998-04-21 Pmc-Sierra, Inc. Combination local ATM segmentation and reassembly and physical layer device
US5749095A (en) * 1996-07-01 1998-05-05 Sun Microsystems, Inc. Multiprocessing system configured to perform efficient write operations
US5752078A (en) * 1995-07-10 1998-05-12 International Business Machines Corporation System for minimizing latency data reception and handling data packet error if detected while transferring data packet from adapter memory to host memory
US5754715A (en) * 1996-11-12 1998-05-19 Melling; Peter J. Mid-infrared fiber-optic spectroscopic probe
US5758089A (en) * 1995-11-02 1998-05-26 Sun Microsystems, Inc. Method and apparatus for burst transferring ATM packet header and data to a host computer system
US5758186A (en) * 1995-10-06 1998-05-26 Sun Microsystems, Inc. Method and apparatus for generically handling diverse protocol method calls in a client/server computer system
US5758194A (en) * 1993-11-30 1998-05-26 Intel Corporation Communication apparatus for handling networks with different transmission protocols by stripping or adding data to the data stream in the application layer
US5758084A (en) * 1995-02-27 1998-05-26 Hewlett-Packard Company Apparatus for parallel client/server communication having data structures which stored values indicative of connection state and advancing the connection state of established connections
US5768618A (en) * 1995-12-21 1998-06-16 Ncr Corporation Method for performing sequence of actions in device connected to computer in response to specified values being written into snooped sub portions of address space
US5771349A (en) * 1992-05-12 1998-06-23 Compaq Computer Corp. Network packet switch using shared memory for repeating and bridging packets at media rate
US5774660A (en) * 1996-08-05 1998-06-30 Resonate, Inc. World-wide-web server with delayed resource-binding for resource-based load balancing on a distributed resource multi-node network
US5872919A (en) * 1997-05-07 1999-02-16 Advanced Micro Devices, Inc. Computer communication network having a packet processor with an execution unit which is variably configured from a programmable state machine and logic
US5878225A (en) * 1996-06-03 1999-03-02 International Business Machines Corporation Dual communication services interface for distributed transaction processing
US5892903A (en) * 1996-09-12 1999-04-06 Internet Security Systems, Inc. Method and apparatus for detecting and identifying security vulnerabilities in an open network computer communication system
US5898713A (en) * 1997-08-29 1999-04-27 Cisco Technology, Inc. IP checksum offload
US5913028A (en) * 1995-10-06 1999-06-15 Xpoint Technologies, Inc. Client/server data traffic delivery system and method
US6016513A (en) * 1998-02-19 2000-01-18 3Com Corporation Method of preventing packet loss during transfers of data packets between a network interface card and an operating system of a computer
US6026452A (en) * 1997-02-26 2000-02-15 Pitts; William Michael Network distributed site cache RAM claimed as up/down stream request/reply channel for storing anticipated data and meta data
US6034963A (en) * 1996-10-31 2000-03-07 Iready Corporation Multiple network protocol encoder/decoder and data processor
US6038562A (en) * 1996-09-05 2000-03-14 International Business Machines Corporation Interface to support state-dependent web applications accessing a relational database
US6041381A (en) * 1998-02-05 2000-03-21 Crossroads Systems, Inc. Fibre channel to SCSI addressing method and system
US6041058A (en) * 1997-09-11 2000-03-21 3Com Corporation Hardware filtering method and apparatus
US6044438A (en) * 1997-07-10 2000-03-28 International Business Machines Corporation Memory controller for controlling memory accesses across networks in distributed shared memory processing systems
US6047323A (en) * 1995-10-19 2000-04-04 Hewlett-Packard Company Creation and migration of distributed streams in clusters of networked computers
US6047356A (en) * 1994-04-18 2000-04-04 Sonic Solutions Method of dynamically allocating network node memory's partitions for caching distributed files
US6049528A (en) * 1997-06-30 2000-04-11 Sun Microsystems, Inc. Trunking ethernet-compatible networks
US6057863A (en) * 1997-10-31 2000-05-02 Compaq Computer Corporation Dual purpose apparatus, method and system for accelerated graphics port and fibre channel arbitrated loop interfaces
US6061368A (en) * 1997-11-05 2000-05-09 Xylan Corporation Custom circuitry for adaptive hardware routing engine
US6065096A (en) * 1997-09-30 2000-05-16 Lsi Logic Corporation Integrated single chip dual mode raid controller
US6067569A (en) * 1997-07-10 2000-05-23 Microsoft Corporation Fast-forwarding and filtering of network packets in a computer system
US6070200A (en) * 1998-06-02 2000-05-30 Adaptec, Inc. Host adapter having paged data buffers for continuously transferring data between a system bus and a peripheral bus
US6078733A (en) * 1996-03-08 2000-06-20 Mitsubishi Electric Information Technolgy Center America, Inc. (Ita) Network interface having support for message processing and an interface to a message coprocessor
US6078564A (en) * 1996-08-30 2000-06-20 Lucent Technologies, Inc. System for improving data throughput of a TCP/IP network connection with slow return channel
US6172980B1 (en) * 1997-09-11 2001-01-09 3Com Corporation Multiple protocol support
US6173333B1 (en) * 1997-07-18 2001-01-09 Interprophet Corporation TCP/IP network accelerator system and method which identifies classes of packet traffic for predictable protocols
US6181705B1 (en) * 1993-12-21 2001-01-30 International Business Machines Corporation System and method for management a communications buffer
US6202105B1 (en) * 1998-06-02 2001-03-13 Adaptec, Inc. Host adapter capable of simultaneously transmitting and receiving data of multiple contexts between a computer bus and peripheral bus
US6226680B1 (en) * 1997-10-14 2001-05-01 Alacritech, Inc. Intelligent network interface system method for protocol processing
US6246683B1 (en) * 1998-05-01 2001-06-12 3Com Corporation Receive processing with network protocol bypass
US20010004354A1 (en) * 1999-05-17 2001-06-21 Jolitz Lynne G. Accelerator system and method
US6343360B1 (en) * 1999-05-13 2002-01-29 Microsoft Corporation Automated configuration of computing system using zip code data
US6345302B1 (en) * 1997-10-30 2002-02-05 Tsi Telsys, Inc. System for transmitting and receiving data within a reliable communications protocol by concurrently processing portions of the protocol suite
US6345301B1 (en) * 1999-03-30 2002-02-05 Unisys Corporation Split data path distributed network protocol
US6356951B1 (en) * 1999-03-01 2002-03-12 Sun Microsystems, Inc. System for parsing a packet for conformity with a predetermined protocol using mask and comparison values included in a parsing instruction
US6370599B1 (en) * 1998-06-12 2002-04-09 Microsoft Corporation System for ascertaining task off-load capabilities of a device and enabling selected capabilities and when needed selectively and dynamically requesting the device to perform the task
US6385647B1 (en) * 1997-08-18 2002-05-07 Mci Communications Corporations System for selectively routing data via either a network that supports Internet protocol or via satellite transmission network based on size of the data
US6389468B1 (en) * 1999-03-01 2002-05-14 Sun Microsystems, Inc. Method and apparatus for distributing network traffic processing on a multiprocessor computer
US6389479B1 (en) * 1997-10-14 2002-05-14 Alacritech, Inc. Intelligent network interface device and system for accelerated communication
US20020073223A1 (en) * 1998-09-28 2002-06-13 Raytheon Company, A Delaware Corporation Method and system for scheduling network communication
US6523119B2 (en) * 1996-12-04 2003-02-18 Rainbow Technologies, Inc. Software protection device and method
US6526446B1 (en) * 1999-04-27 2003-02-25 3Com Corporation Hardware only transmission control protocol segmentation for a high performance network interface card
US20030066011A1 (en) * 2001-04-12 2003-04-03 Siliquent Technologies Ltd. Out-of-order calculation of error detection codes
US6570884B1 (en) * 1999-11-05 2003-05-27 3Com Corporation Receive filtering for communication interface
US20030110344A1 (en) * 1996-09-18 2003-06-12 Andre Szczepanek Communications systems, apparatus and methods
US6678283B1 (en) * 1999-03-10 2004-01-13 Lucent Technologies Inc. System and method for distributing packet processing in an internetworking device
US6681364B1 (en) * 1999-09-24 2004-01-20 International Business Machines Corporation Cyclic redundancy check for partitioned frames
US6687758B2 (en) * 2001-03-07 2004-02-03 Alacritech, Inc. Port aggregation for network connections that are offloaded to network interface devices
US6697868B2 (en) * 2000-02-28 2004-02-24 Alacritech, Inc. Protocol processing stack for use with intelligent network interface device
US20040054814A1 (en) * 2002-09-17 2004-03-18 Mcdaniel Scott S. System and method for handling frames in multiple stack environments
US20040059926A1 (en) * 2002-09-20 2004-03-25 Compaq Information Technology Group, L.P. Network interface controller with firmware enabled licensing features
US6842896B1 (en) * 1999-09-03 2005-01-11 Rainbow Technologies, Inc. System and method for selecting a server in a multiple server license management system
US6996070B2 (en) * 2003-12-05 2006-02-07 Alacritech, Inc. TCP/IP offload device with reduced sequential processing
US7042898B2 (en) * 1997-10-14 2006-05-09 Alacritech, Inc. Reducing delays associated with inserting a checksum into a network message
US7167926B1 (en) * 1998-08-27 2007-01-23 Alacritech, Inc. TCP/IP offload network interface device
US7167927B2 (en) * 1997-10-14 2007-01-23 Alacritech, Inc. TCP/IP offload device with fast-path TCP ACK generating and transmitting mechanism
US7174393B2 (en) * 2000-12-26 2007-02-06 Alacritech, Inc. TCP/IP offload network interface device
US7191318B2 (en) * 2002-12-12 2007-03-13 Alacritech, Inc. Native copy instruction for file-access processor with copy-rule-based validation
US7191241B2 (en) * 2002-09-27 2007-03-13 Alacritech, Inc. Fast-path apparatus for receiving data corresponding to a TCP connection

Family Cites Families (48)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4366538A (en) 1980-10-31 1982-12-28 Honeywell Information Systems Inc. Memory controller with queue control apparatus
US4700185A (en) 1984-12-26 1987-10-13 Motorola Inc. Request with response mechanism and method for a local area network controller
JP2986802B2 (en) 1989-03-13 1999-12-06 株式会社日立製作所 Protocol high-speed processing method
US5058110A (en) 1989-05-03 1991-10-15 Ultra Network Technologies Protocol processor
US5163131A (en) 1989-09-08 1992-11-10 Auspex Systems, Inc. Parallel i/o network file server architecture
JP2791236B2 (en) 1991-07-25 1998-08-27 三菱電機株式会社 Protocol parallel processing unit
US5574919A (en) 1991-08-29 1996-11-12 Lucent Technologies Inc. Method for thinning a protocol
JPH05260045A (en) 1992-01-14 1993-10-08 Ricoh Co Ltd Communication method for data terminal equipment
DE69324508T2 (en) 1992-01-22 1999-12-23 Enhanced Memory Systems Inc DRAM with integrated registers
JPH07504527A (en) 1992-03-09 1995-05-18 オースペックス システムズ インコーポレイテッド High performance non-volatile RAM protected write cache accelerator system
JPH0619771A (en) 1992-04-20 1994-01-28 Internatl Business Mach Corp <Ibm> File management system of shared file by different kinds of clients
US5671355A (en) 1992-06-26 1997-09-23 Predacomm, Inc. Reconfigurable network interface apparatus and method
GB9300942D0 (en) 1993-01-19 1993-03-10 Int Computers Ltd Parallel computer system
EP0609595B1 (en) 1993-02-05 1998-08-12 Hewlett-Packard Company Method and apparatus for verifying CRC codes by combination of partial CRC codes
US5815646A (en) 1993-04-13 1998-09-29 C-Cube Microsystems Decompression processor for video applications
JP3358254B2 (en) 1993-10-28 2002-12-16 株式会社日立製作所 Communication control device and communication control circuit device
US5448566A (en) 1993-11-15 1995-09-05 International Business Machines Corporation Method and apparatus for facilitating communication in a multilayer communication architecture via a dynamic communication channel
US5809527A (en) 1993-12-23 1998-09-15 Unisys Corporation Outboard file cache system
JPH08180001A (en) 1994-04-12 1996-07-12 Mitsubishi Electric Corp Communication system, communication method and network interface
WO1996007139A1 (en) 1994-09-01 1996-03-07 Mcalpine Gary L A multi-port memory system including read and write buffer interfaces
US5548730A (en) 1994-09-20 1996-08-20 Intel Corporation Intelligent bus bridge for input/output subsystems in a computer system
US5566170A (en) 1994-12-29 1996-10-15 Storage Technology Corporation Method and apparatus for accelerated packet forwarding
US5701434A (en) 1995-03-16 1997-12-23 Hitachi, Ltd. Interleave memory controller with a common access queue
US5802278A (en) 1995-05-10 1998-09-01 3Com Corporation Bridge/router architecture for high performance scalable networking
US5664114A (en) 1995-05-16 1997-09-02 Hewlett-Packard Company Asynchronous FIFO queuing system operating with minimal queue status
JPH096706A (en) 1995-06-22 1997-01-10 Hitachi Ltd Loosely coupled computer system
US5812775A (en) 1995-07-12 1998-09-22 3Com Corporation Method and apparatus for internetworking buffer management
US5742840A (en) 1995-08-16 1998-04-21 Microunity Systems Engineering, Inc. General purpose, multiple precision parallel operation, programmable media processor
US5682534A (en) 1995-09-12 1997-10-28 International Business Machines Corporation Transparent local RPC optimization
US5699350A (en) 1995-10-06 1997-12-16 Canon Kabushiki Kaisha Reconfiguration of protocol stacks and/or frame type assignments in a network interface device
US5848293A (en) 1995-11-03 1998-12-08 Sun Microsystems, Inc. Method and apparatus for transmission and processing of virtual commands
US5809328A (en) 1995-12-21 1998-09-15 Unisys Corp. Apparatus for fibre channel transmission having interface logic, buffer memory, multiplexor/control device, fibre channel controller, gigabit link module, microprocessor, and bus control device
US5802258A (en) 1996-05-03 1998-09-01 International Business Machines Corporation Loosely coupled system environment designed to handle a non-disruptive host connection switch after detection of an error condition or during a host outage or failure
US5751715A (en) 1996-08-08 1998-05-12 Gadzoox Microsystems, Inc. Accelerator fiber channel hub and protocol
US6009504A (en) * 1996-09-27 1999-12-28 Intel Corporation Apparatus and method for storing data associated with multiple addresses in a storage element using a base address and a mask
US5987022A (en) 1996-12-27 1999-11-16 Motorola, Inc. Method for transmitting multiple-protocol packetized data
US5930830A (en) 1997-01-13 1999-07-27 International Business Machines Corporation System and method for concatenating discontiguous memory pages
US5996013A (en) 1997-04-30 1999-11-30 International Business Machines Corporation Method and apparatus for resource allocation with guarantees
US5920566A (en) 1997-06-30 1999-07-06 Sun Microsystems, Inc. Routing in a multi-layer distributed network element
US6021446A (en) 1997-07-11 2000-02-01 Sun Microsystems, Inc. Network device driver performing initial packet processing within high priority hardware interrupt service routine and then finishing processing within low priority software interrupt service routine
US5991299A (en) 1997-09-11 1999-11-23 3Com Corporation High speed header translation processing
US6005849A (en) 1997-09-24 1999-12-21 Emulex Corporation Full-duplex communication processor which can be used for fibre channel frames
US5941969A (en) 1997-10-22 1999-08-24 Auspex Systems, Inc. Bridge for direct data storage device access
US5937169A (en) 1997-10-29 1999-08-10 3Com Corporation Offload of TCP segmentation to a smart adapter
US6009478A (en) 1997-11-04 1999-12-28 Adaptec, Inc. File array communications interface for communicating between a host computer and an adapter
US5941972A (en) 1997-12-31 1999-08-24 Crossroads Systems, Inc. Storage router and method for providing virtual local storage
US5950203A (en) 1997-12-31 1999-09-07 Mercury Computer Systems, Inc. Method and apparatus for high-speed access to and sharing of storage devices on a networked digital data processing system
US5996024A (en) 1998-01-14 1999-11-30 Emc Corporation Method and apparatus for a SCSI applications server which extracts SCSI commands and data from message and encapsulates SCSI responses to provide transparent operation

Patent Citations (99)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4589063A (en) * 1983-08-04 1986-05-13 Fortune Systems Corporation Data processing system having automatic configuration
US5097442A (en) * 1985-06-20 1992-03-17 Texas Instruments Incorporated Programmable depth first-in, first-out memory
US5223242A (en) * 1985-11-05 1993-06-29 The General Hospital Corporation Negatively charged specific affinity reagents
US5021446A (en) * 1988-05-06 1991-06-04 Sumitomo Chemical Company Pyrazole compounds, insecticidal and acaricidal compositions and use
US5212778A (en) * 1988-05-27 1993-05-18 Massachusetts Institute Of Technology Message-driven processor in a concurrent computer
US4991133A (en) * 1988-10-07 1991-02-05 International Business Machines Corp. Specialized communications processor for layered protocols
US5303344A (en) * 1989-03-13 1994-04-12 Hitachi, Ltd. Protocol processing apparatus for use in interfacing network connected computer systems utilizing separate paths for control information and data transfer
US5485579A (en) * 1989-09-08 1996-01-16 Auspex Systems, Inc. Multiple facility operating system architecture
US5418912A (en) * 1990-08-20 1995-05-23 International Business Machines Corporation System and method for controlling buffer transmission of data packets by limiting buffered data packets in a communication session
US5289023A (en) * 1991-02-19 1994-02-22 Synaptics, Incorporated High-density photosensor and contactless imaging array having wide dynamic range
US5289580A (en) * 1991-05-10 1994-02-22 Unisys Corporation Programmable multiple I/O interface controller
US5524250A (en) * 1991-08-23 1996-06-04 Silicon Graphics, Inc. Central processing unit for processing a plurality of threads using dedicated general purpose registers and masque register for providing access to the registers
US5506966A (en) * 1991-12-17 1996-04-09 Nec Corporation System for message traffic control utilizing prioritized message chaining for queueing control ensuring transmission/reception of high priority messages
US5511169A (en) * 1992-03-02 1996-04-23 Mitsubishi Denki Kabushiki Kaisha Data transmission apparatus and a communication path management method therefor
US5771349A (en) * 1992-05-12 1998-06-23 Compaq Computer Corp. Network packet switch using shared memory for repeating and bridging packets at media rate
US5412782A (en) * 1992-07-02 1995-05-02 3Com Corporation Programmed I/O ethernet adapter with early interrupts for accelerating data transfer
US5280477A (en) * 1992-08-17 1994-01-18 E-Systems, Inc. Network synchronous data distribution system
US5642482A (en) * 1992-12-22 1997-06-24 Bull, S.A. System for network transmission using a communication co-processor comprising a microprocessor to implement protocol layer and a microprocessor to manage DMA
US5619650A (en) * 1992-12-31 1997-04-08 International Business Machines Corporation Network processor for transforming a message transported from an I/O channel to a network by adding a message identifier and then converting the message
US5758194A (en) * 1993-11-30 1998-05-26 Intel Corporation Communication apparatus for handling networks with different transmission protocols by stripping or adding data to the data stream in the application layer
US6181705B1 (en) * 1993-12-21 2001-01-30 International Business Machines Corporation System and method for management of a communications buffer
US5517668A (en) * 1994-01-10 1996-05-14 Amdahl Corporation Distributed protocol framework
US5485455A (en) * 1994-01-28 1996-01-16 Cabletron Systems, Inc. Network having secure fast packet switching and guaranteed quality of service
US6047356A (en) * 1994-04-18 2000-04-04 Sonic Solutions Method of dynamically allocating network node memory's partitions for caching distributed files
US5485460A (en) * 1994-08-19 1996-01-16 Microsoft Corporation System and method for running multiple incompatible network protocol stacks
US5634127A (en) * 1994-11-30 1997-05-27 International Business Machines Corporation Methods and apparatus for implementing a message driven processor in a client-server environment
US5634099A (en) * 1994-12-09 1997-05-27 International Business Machines Corporation Direct memory access unit for transferring data between processor memories in multiprocessing systems
US5633780A (en) * 1994-12-21 1997-05-27 Polaroid Corporation Electrostatic discharge protection device
US5598410A (en) * 1994-12-29 1997-01-28 Storage Technology Corporation Method and apparatus for accelerated packet processing
US5758084A (en) * 1995-02-27 1998-05-26 Hewlett-Packard Company Apparatus for parallel client/server communication having data structures which stored values indicative of connection state and advancing the connection state of established connections
US5592622A (en) * 1995-05-10 1997-01-07 3Com Corporation Network intermediate system with message passing architecture
US5629933A (en) * 1995-06-07 1997-05-13 International Business Machines Corporation Method and system for enhanced communication in a multisession packet based communication system
US5752078A (en) * 1995-07-10 1998-05-12 International Business Machines Corporation System for minimizing latency data reception and handling data packet error if detected while transferring data packet from adapter memory to host memory
US5913028A (en) * 1995-10-06 1999-06-15 Xpoint Technologies, Inc. Client/server data traffic delivery system and method
US5758186A (en) * 1995-10-06 1998-05-26 Sun Microsystems, Inc. Method and apparatus for generically handling diverse protocol method calls in a client/server computer system
US6047323A (en) * 1995-10-19 2000-04-04 Hewlett-Packard Company Creation and migration of distributed streams in clusters of networked computers
US5758089A (en) * 1995-11-02 1998-05-26 Sun Microsystems, Inc. Method and apparatus for burst transferring ATM packet header and data to a host computer system
US5768618A (en) * 1995-12-21 1998-06-16 Ncr Corporation Method for performing sequence of actions in device connected to computer in response to specified values being written into snooped sub portions of address space
US6078733A (en) * 1996-03-08 2000-06-20 Mitsubishi Electric Information Technolgy Center America, Inc. (Ita) Network interface having support for message processing and an interface to a message coprocessor
US5727142A (en) * 1996-05-03 1998-03-10 International Business Machines Corporation Method for a non-disruptive host connection switch after detection of an error condition or during a host outage or failure
US6021507A (en) * 1996-05-03 2000-02-01 International Business Machines Corporation Method for a non-disruptive host connection switch after detection of an error condition or during a host outage or failure
US5878225A (en) * 1996-06-03 1999-03-02 International Business Machines Corporation Dual communication services interface for distributed transaction processing
US5742765A (en) * 1996-06-19 1998-04-21 Pmc-Sierra, Inc. Combination local ATM segmentation and reassembly and physical layer device
US5749095A (en) * 1996-07-01 1998-05-05 Sun Microsystems, Inc. Multiprocessing system configured to perform efficient write operations
US5774660A (en) * 1996-08-05 1998-06-30 Resonate, Inc. World-wide-web server with delayed resource-binding for resource-based load balancing on a distributed resource multi-node network
US6078564A (en) * 1996-08-30 2000-06-20 Lucent Technologies, Inc. System for improving data throughput of a TCP/IP network connection with slow return channel
US6038562A (en) * 1996-09-05 2000-03-14 International Business Machines Corporation Interface to support state-dependent web applications accessing a relational database
US5892903A (en) * 1996-09-12 1999-04-06 Internet Security Systems, Inc. Method and apparatus for detecting and identifying security vulnerabilities in an open network computer communication system
US20030110344A1 (en) * 1996-09-18 2003-06-12 Andre Szczepanek Communications systems, apparatus and methods
US6034963A (en) * 1996-10-31 2000-03-07 Iready Corporation Multiple network protocol encoder/decoder and data processor
US5754715A (en) * 1996-11-12 1998-05-19 Melling; Peter J. Mid-infrared fiber-optic spectroscopic probe
US6523119B2 (en) * 1996-12-04 2003-02-18 Rainbow Technologies, Inc. Software protection device and method
US6026452A (en) * 1997-02-26 2000-02-15 Pitts; William Michael Network distributed site cache RAM claimed as up/down stream request/reply channel for storing anticipated data and meta data
US5872919A (en) * 1997-05-07 1999-02-16 Advanced Micro Devices, Inc. Computer communication network having a packet processor with an execution unit which is variably configured from a programmable state machine and logic
US6049528A (en) * 1997-06-30 2000-04-11 Sun Microsystems, Inc. Trunking ethernet-compatible networks
US6044438A (en) * 1997-07-10 2000-03-28 International Business Machines Corporation Memory controller for controlling memory accesses across networks in distributed shared memory processing systems
US6067569A (en) * 1997-07-10 2000-05-23 Microsoft Corporation Fast-forwarding and filtering of network packets in a computer system
US6173333B1 (en) * 1997-07-18 2001-01-09 Interprophet Corporation TCP/IP network accelerator system and method which identifies classes of packet traffic for predictable protocols
US6385647B1 (en) * 1997-08-18 2002-05-07 Mci Communications Corporation System for selectively routing data via either a network that supports Internet protocol or via satellite transmission network based on size of the data
US5898713A (en) * 1997-08-29 1999-04-27 Cisco Technology, Inc. IP checksum offload
US6041058A (en) * 1997-09-11 2000-03-21 3Com Corporation Hardware filtering method and apparatus
US6172980B1 (en) * 1997-09-11 2001-01-09 3Com Corporation Multiple protocol support
US6065096A (en) * 1997-09-30 2000-05-16 Lsi Logic Corporation Integrated single chip dual mode raid controller
US6389479B1 (en) * 1997-10-14 2002-05-14 Alacritech, Inc. Intelligent network interface device and system for accelerated communication
US6226680B1 (en) * 1997-10-14 2001-05-01 Alacritech, Inc. Intelligent network interface system method for protocol processing
US6247060B1 (en) * 1997-10-14 2001-06-12 Alacritech, Inc. Passing a communication control block from host to a local device such that a message is processed on the device
US6393487B2 (en) * 1997-10-14 2002-05-21 Alacritech, Inc. Passing a communication control block to a local device such that a message is processed on the device
US7167927B2 (en) * 1997-10-14 2007-01-23 Alacritech, Inc. TCP/IP offload device with fast-path TCP ACK generating and transmitting mechanism
US7042898B2 (en) * 1997-10-14 2006-05-09 Alacritech, Inc. Reducing delays associated with inserting a checksum into a network message
US6345302B1 (en) * 1997-10-30 2002-02-05 Tsi Telsys, Inc. System for transmitting and receiving data within a reliable communications protocol by concurrently processing portions of the protocol suite
US6057863A (en) * 1997-10-31 2000-05-02 Compaq Computer Corporation Dual purpose apparatus, method and system for accelerated graphics port and fibre channel arbitrated loop interfaces
US6061368A (en) * 1997-11-05 2000-05-09 Xylan Corporation Custom circuitry for adaptive hardware routing engine
US6041381A (en) * 1998-02-05 2000-03-21 Crossroads Systems, Inc. Fibre channel to SCSI addressing method and system
US6016513A (en) * 1998-02-19 2000-01-18 3Com Corporation Method of preventing packet loss during transfers of data packets between a network interface card and an operating system of a computer
US6246683B1 (en) * 1998-05-01 2001-06-12 3Com Corporation Receive processing with network protocol bypass
US6202105B1 (en) * 1998-06-02 2001-03-13 Adaptec, Inc. Host adapter capable of simultaneously transmitting and receiving data of multiple contexts between a computer bus and peripheral bus
US6070200A (en) * 1998-06-02 2000-05-30 Adaptec, Inc. Host adapter having paged data buffers for continuously transferring data between a system bus and a peripheral bus
US6370599B1 (en) * 1998-06-12 2002-04-09 Microsoft Corporation System for ascertaining task off-load capabilities of a device and enabling selected capabilities and when needed selectively and dynamically requesting the device to perform the task
US7167926B1 (en) * 1998-08-27 2007-01-23 Alacritech, Inc. TCP/IP offload network interface device
US20020073223A1 (en) * 1998-09-28 2002-06-13 Raytheon Company, A Delaware Corporation Method and system for scheduling network communication
US6356951B1 (en) * 1999-03-01 2002-03-12 Sun Microsystems, Inc. System for parsing a packet for conformity with a predetermined protocol using mask and comparison values included in a parsing instruction
US6389468B1 (en) * 1999-03-01 2002-05-14 Sun Microsystems, Inc. Method and apparatus for distributing network traffic processing on a multiprocessor computer
US6678283B1 (en) * 1999-03-10 2004-01-13 Lucent Technologies Inc. System and method for distributing packet processing in an internetworking device
US6345301B1 (en) * 1999-03-30 2002-02-05 Unisys Corporation Split data path distributed network protocol
US6526446B1 (en) * 1999-04-27 2003-02-25 3Com Corporation Hardware only transmission control protocol segmentation for a high performance network interface card
US6343360B1 (en) * 1999-05-13 2002-01-29 Microsoft Corporation Automated configuration of computing system using zip code data
US20010004354A1 (en) * 1999-05-17 2001-06-21 Jolitz Lynne G. Accelerator system and method
US6842896B1 (en) * 1999-09-03 2005-01-11 Rainbow Technologies, Inc. System and method for selecting a server in a multiple server license management system
US6681364B1 (en) * 1999-09-24 2004-01-20 International Business Machines Corporation Cyclic redundancy check for partitioned frames
US6570884B1 (en) * 1999-11-05 2003-05-27 3Com Corporation Receive filtering for communication interface
US6697868B2 (en) * 2000-02-28 2004-02-24 Alacritech, Inc. Protocol processing stack for use with intelligent network interface device
US7174393B2 (en) * 2000-12-26 2007-02-06 Alacritech, Inc. TCP/IP offload network interface device
US6687758B2 (en) * 2001-03-07 2004-02-03 Alacritech, Inc. Port aggregation for network connections that are offloaded to network interface devices
US20030066011A1 (en) * 2001-04-12 2003-04-03 Siliquent Technologies Ltd. Out-of-order calculation of error detection codes
US20040054814A1 (en) * 2002-09-17 2004-03-18 Mcdaniel Scott S. System and method for handling frames in multiple stack environments
US20040059926A1 (en) * 2002-09-20 2004-03-25 Compaq Information Technology Group, L.P. Network interface controller with firmware enabled licensing features
US7191241B2 (en) * 2002-09-27 2007-03-13 Alacritech, Inc. Fast-path apparatus for receiving data corresponding to a TCP connection
US7191318B2 (en) * 2002-12-12 2007-03-13 Alacritech, Inc. Native copy instruction for file-access processor with copy-rule-based validation
US6996070B2 (en) * 2003-12-05 2006-02-07 Alacritech, Inc. TCP/IP offload device with reduced sequential processing

Cited By (256)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8774199B2 (en) 1997-01-23 2014-07-08 Broadcom Corporation Fibre channel arbitrated loop bufferless switch circuitry to increase bandwidth without significant increase in cost
US8750320B2 (en) 1997-01-23 2014-06-10 Broadcom Corporation Fibre channel arbitrated loop bufferless switch circuitry to increase bandwidth without significant increase in cost
US8767756B2 (en) 1997-01-23 2014-07-01 Broadcom Corporation Fibre channel arbitrated loop bufferless switch circuitry to increase bandwidth without significant increase in cost
US20060031818A1 (en) * 1997-05-08 2006-02-09 Poff Thomas C Hardware accelerator for an object-oriented programming language
US9098297B2 (en) 1997-05-08 2015-08-04 Nvidia Corporation Hardware accelerator for an object-oriented programming language
US7483375B2 (en) 1998-06-11 2009-01-27 Nvidia Corporation TCP/IP/PPP modem
US20040213290A1 (en) * 1998-06-11 2004-10-28 Johnson Michael Ward TCP/IP/PPP modem
US8798091B2 (en) 1998-11-19 2014-08-05 Broadcom Corporation Fibre channel arbitrated loop bufferless switch circuitry to increase bandwidth without significant increase in cost
US7577164B2 (en) * 2000-07-25 2009-08-18 Juniper Networks, Inc. Network architecture and methods for transparent on-line cross-sessional encoding and transport of network communications data
US7760954B2 (en) 2000-07-25 2010-07-20 Juniper Networks, Inc. System and method for incremental and continuous data compression
US20060114939A1 (en) * 2000-07-25 2006-06-01 Juniper Networks, Inc. Network architecture and methods for transparent on-line cross-sessional encoding and transport of network communications data
US20090022413A1 (en) * 2000-07-25 2009-01-22 Juniper Networks, Inc. System and method for incremental and continuous data compression
US20050271042A1 (en) * 2000-11-10 2005-12-08 Michael Johnson Internet modem streaming socket method
US7302499B2 (en) 2000-11-10 2007-11-27 Nvidia Corporation Internet modem streaming socket method
US8964754B2 (en) 2000-11-17 2015-02-24 Foundry Networks, Llc Backplane interface adapter with error control and redundant fabric
US20080205407A1 (en) * 2000-11-17 2008-08-28 Andrew Chang Network switch cross point
US20060010238A1 (en) * 2001-03-07 2006-01-12 Alacritech, Inc. Port aggregation for network connections that are offloaded to network interface devices
US7640364B2 (en) * 2001-03-07 2009-12-29 Alacritech, Inc. Port aggregation for network connections that are offloaded to network interface devices
US20090097499A1 (en) * 2001-04-11 2009-04-16 Chelsio Communications, Inc. Multi-purpose switching network interface controller
US7447795B2 (en) 2001-04-11 2008-11-04 Chelsio Communications, Inc. Multi-purpose switching network interface controller
US20040172485A1 (en) * 2001-04-11 2004-09-02 Kianoosh Naghshineh Multi-purpose switching network interface controller
US8032655B2 (en) 2001-04-11 2011-10-04 Chelsio Communications, Inc. Configurable switching network interface controller using forwarding engine
US8218555B2 (en) 2001-04-24 2012-07-10 Nvidia Corporation Gigabit ethernet adapter
US20030165160A1 (en) * 2001-04-24 2003-09-04 Minami John Shigeto Gigabit Ethernet adapter
US8493857B2 (en) 2001-07-23 2013-07-23 Broadcom Corporation Multiple logical channels for use in network devices
US9036643B2 (en) 2001-07-23 2015-05-19 Broadcom Corporation Multiple logical channels for use in network devices
US8116203B2 (en) 2001-07-23 2012-02-14 Broadcom Corporation Multiple virtual channels for use in network devices
US8493852B2 (en) 2002-01-15 2013-07-23 Intel Corporation Packet aggregation
US8730984B2 (en) 2002-01-15 2014-05-20 Intel Corporation Queuing based on packet classification
US20110208871A1 (en) * 2002-01-15 2011-08-25 Intel Corporation Queuing based on packet classification
US20110208874A1 (en) * 2002-01-15 2011-08-25 Intel Corporation Packet aggregation
US20040081202A1 (en) * 2002-01-25 2004-04-29 Minami John S Communications processor
US7535913B2 (en) * 2002-03-06 2009-05-19 Nvidia Corporation Gigabit ethernet adapter supporting the iSCSI and IPSEC protocols
US20040062267A1 (en) * 2002-03-06 2004-04-01 Minami John Shigeto Gigabit Ethernet adapter supporting the iSCSI and IPSEC protocols
US8958440B2 (en) 2002-03-08 2015-02-17 Broadcom Corporation System and method for identifying upper layer protocol message boundaries
US8135016B2 (en) 2002-03-08 2012-03-13 Broadcom Corporation System and method for identifying upper layer protocol message boundaries
US8345689B2 (en) 2002-03-08 2013-01-01 Broadcom Corporation System and method for identifying upper layer protocol message boundaries
US8451863B2 (en) 2002-03-08 2013-05-28 Broadcom Corporation System and method for identifying upper layer protocol message boundaries
US8170044B2 (en) 2002-05-06 2012-05-01 Foundry Networks, Llc Pipeline method and system for switching packets
US8671219B2 (en) 2002-05-06 2014-03-11 Foundry Networks, Llc Method and apparatus for efficiently processing data packets in a computer network
US8194666B2 (en) 2002-05-06 2012-06-05 Foundry Networks, Llc Flexible method for processing data packets in a network routing system for enhanced efficiency and monitoring capability
US8989202B2 (en) 2002-05-06 2015-03-24 Foundry Networks, Llc Pipeline method and system for switching packets
US20100135313A1 (en) * 2002-05-06 2010-06-03 Foundry Networks, Inc. Network routing system for enhanced efficiency and monitoring capability
US20040177307A1 (en) * 2002-06-28 2004-09-09 Interdigital Technology Corporation System and method for transmitting a sequence of data blocks
US7934021B2 (en) 2002-08-29 2011-04-26 Broadcom Corporation System and method for network interfacing
US8180928B2 (en) 2002-08-30 2012-05-15 Broadcom Corporation Method and system for supporting read operations with CRC for iSCSI and iSCSI chimney
US7929540B2 (en) 2002-08-30 2011-04-19 Broadcom Corporation System and method for handling out-of-order frames
WO2004021150A3 (en) * 2002-08-30 2004-08-12 Broadcom Corp System and method for TCP/IP offload independent of bandwidth delay product
US20040042464A1 (en) * 2002-08-30 2004-03-04 Uri Elzur System and method for TCP/IP offload independent of bandwidth delay product
US8677010B2 (en) 2002-08-30 2014-03-18 Broadcom Corporation System and method for TCP offload
US8402142B2 (en) 2002-08-30 2013-03-19 Broadcom Corporation System and method for TCP/IP offload independent of bandwidth delay product
US7849208B2 (en) 2002-08-30 2010-12-07 Broadcom Corporation System and method for TCP offload
US7912064B2 (en) 2002-08-30 2011-03-22 Broadcom Corporation System and method for handling out-of-order frames
US8549152B2 (en) 2002-08-30 2013-10-01 Broadcom Corporation System and method for TCP/IP offload independent of bandwidth delay product
US7313623B2 (en) 2002-08-30 2007-12-25 Broadcom Corporation System and method for TCP/IP offload independent of bandwidth delay product
US20070263630A1 (en) * 2002-09-04 2007-11-15 Broadcom Corporation System and method for fault tolerant tcp offload
US7746867B2 (en) 2002-09-04 2010-06-29 Broadcom Corporation System and method for fault tolerant TCP offload
US7224692B2 (en) * 2002-09-04 2007-05-29 Broadcom Corporation System and method for fault tolerant TCP offload
US20040042412A1 (en) * 2002-09-04 2004-03-04 Fan Kan Frankie System and method for fault tolerant TCP offload
US20100262859A1 (en) * 2002-09-04 2010-10-14 Broadcom Corporation System and method for fault tolerant tcp offload
US7876761B2 (en) 2002-09-04 2011-01-25 Broadcom Corporation System and method for fault tolerant TCP offload
US20040049580A1 (en) * 2002-09-05 2004-03-11 International Business Machines Corporation Receive queue device with efficient queue flow control, segment placement and virtualization mechanisms
US7912988B2 (en) 2002-09-05 2011-03-22 International Business Machines Corporation Receive queue device with efficient queue flow control, segment placement and virtualization mechanisms
US20060259644A1 (en) * 2002-09-05 2006-11-16 Boyd William T Receive queue device with efficient queue flow control, segment placement and virtualization mechanisms
US7673074B1 (en) * 2002-10-24 2010-03-02 Emulex Design & Manufacturing Corporation Avoiding port collisions in hardware-accelerated network protocol
US20040100907A1 (en) * 2002-11-25 2004-05-27 Illikkal Rameshkumar G. Managing a protocol control block cache in a network device
US7289445B2 (en) * 2002-11-25 2007-10-30 Intel Corporation Managing a protocol control block cache in a network device
US20040117496A1 (en) * 2002-12-12 2004-06-17 Nexsil Communications, Inc. Networked application request servicing offloaded from host
US7596634B2 (en) * 2002-12-12 2009-09-29 Millind Mittal Networked application request servicing offloaded from host
US7362772B1 (en) 2002-12-13 2008-04-22 Nvidia Corporation Network processing pipeline chipset for routing and host packet processing
US7324547B1 (en) 2002-12-13 2008-01-29 Nvidia Corporation Internet protocol (IP) router residing in a processor chipset
US7924868B1 (en) 2002-12-13 2011-04-12 Nvidia Corporation Internet protocol (IP) router residing in a processor chipset
US20080301406A1 (en) * 2003-01-06 2008-12-04 Van Jacobson System and method for allocating communications to processors in a multiprocessor system
US7386619B1 (en) * 2003-01-06 2008-06-10 Slt Logic, Llc System and method for allocating communications to processors in a multiprocessor system
US7594002B1 (en) * 2003-02-14 2009-09-22 Istor Networks, Inc. Hardware-accelerated high availability integrated networked storage system
US20040221050A1 (en) * 2003-05-02 2004-11-04 Graham Smith Direct TCP/IP communication method and system for coupling to a CPU/Memory complex
US8718051B2 (en) 2003-05-15 2014-05-06 Foundry Networks, Llc System and method for high speed packet transmission
US8811390B2 (en) 2003-05-15 2014-08-19 Foundry Networks, Llc System and method for high speed packet transmission
US20100061393A1 (en) * 2003-05-15 2010-03-11 Foundry Networks, Inc. System and Method for High Speed Packet Transmission
US9461940B2 (en) 2003-05-15 2016-10-04 Foundry Networks, Llc System and method for high speed packet transmission
US7420931B2 (en) 2003-06-05 2008-09-02 Nvidia Corporation Using TCP/IP offload to accelerate packet filtering
US7991918B2 (en) 2003-06-05 2011-08-02 Nvidia Corporation Transmitting commands and information between a TCP/IP stack and an offload unit
US7412488B2 (en) 2003-06-05 2008-08-12 Nvidia Corporation Setting up a delegated TCP connection for hardware-optimized processing
US20040257986A1 (en) * 2003-06-05 2004-12-23 Jha Ashutosh K. Processing data for a TCP connection using an offload unit
US20040249998A1 (en) * 2003-06-05 2004-12-09 Anand Rajagopalan Uploading TCP frame data to user buffers and buffers in system memory
US8417852B2 (en) 2003-06-05 2013-04-09 Nvidia Corporation Uploading TCP frame data to user buffers and buffers in system memory
US7363572B2 (en) 2003-06-05 2008-04-22 Nvidia Corporation Editing outbound TCP frames and generating acknowledgements
US20080056124A1 (en) * 2003-06-05 2008-03-06 Sameer Nanda Using TCP/IP offload to accelerate packet filtering
US20040246974A1 (en) * 2003-06-05 2004-12-09 Gyugyi Paul J. Storing and accessing TCP connection information
US7613109B2 (en) 2003-06-05 2009-11-03 Nvidia Corporation Processing data for a TCP connection using an offload unit
US7609696B2 (en) 2003-06-05 2009-10-27 Nvidia Corporation Storing and accessing TCP connection information
US8068739B2 (en) 2003-06-12 2011-11-29 Finisar Corporation Modular optical device that interfaces with an external controller
US20050025502A1 (en) * 2003-06-12 2005-02-03 Finisar Modular optical device that interfaces with an external controller
US8923704B2 (en) * 2003-08-29 2014-12-30 Finisar Corporation Computer system with modular optical devices
US8891970B2 (en) 2003-08-29 2014-11-18 Finisar Corporation Modular optical device with mixed signal interface
US20050050250A1 (en) * 2003-08-29 2005-03-03 Finisar Computer system with modular optical devices
US9065571B2 (en) 2003-08-29 2015-06-23 Finisar Corporation Modular controller that interfaces with modular optical device
US7689702B1 (en) * 2003-10-31 2010-03-30 Sun Microsystems, Inc. Methods and apparatus for coordinating processing of network connections between two network protocol stacks
US8549345B1 (en) * 2003-10-31 2013-10-01 Oracle America, Inc. Methods and apparatus for recovering from a failed network interface card
US20050149632A1 (en) * 2003-12-19 2005-07-07 Iready Corporation Retransmission system and method for a transport offload engine
US20050138180A1 (en) * 2003-12-19 2005-06-23 Iready Corporation Connection management system and method for a transport offload engine
US8065439B1 (en) * 2003-12-19 2011-11-22 Nvidia Corporation System and method for using metadata in the context of a transport offload engine
US7899913B2 (en) * 2003-12-19 2011-03-01 Nvidia Corporation Connection management system and method for a transport offload engine
US8549170B2 (en) * 2003-12-19 2013-10-01 Nvidia Corporation Retransmission system and method for a transport offload engine
US20050195851A1 (en) * 2004-02-12 2005-09-08 International Business Machines Corporation System, apparatus and method of aggregating TCP-offloaded adapters
US8493988B2 (en) 2004-03-26 2013-07-23 Foundry Networks, Llc Method and apparatus for aggregating input data streams
US9338100B2 (en) 2004-03-26 2016-05-10 Foundry Networks, Llc Method and apparatus for aggregating input data streams
US7698413B1 (en) 2004-04-12 2010-04-13 Nvidia Corporation Method and apparatus for accessing and maintaining socket control information for high speed network connections
US8730961B1 (en) 2004-04-26 2014-05-20 Foundry Networks, Llc System and method for optimizing router lookup
US20050246450A1 (en) * 2004-04-28 2005-11-03 Yutaka Enko Network protocol processing device
US7206864B2 (en) * 2004-04-28 2007-04-17 Hitachi, Ltd. Network protocol processing device
US7831745B1 (en) 2004-05-25 2010-11-09 Chelsio Communications, Inc. Scalable direct memory access using validation of host and scatter gather engine (SGE) generation indications
US7945705B1 (en) 2004-05-25 2011-05-17 Chelsio Communications, Inc. Method for using a protocol language to avoid separate channels for control messages involving encapsulated payload data messages
US20050281262A1 (en) * 2004-06-17 2005-12-22 Zur Uri E Method and system for supporting read operations for iSCSI and iSCSI chimney
US7975064B2 (en) * 2004-09-16 2011-07-05 International Business Machines Corporation Envelope packet architecture for broadband engine
US20060059273A1 (en) * 2004-09-16 2006-03-16 Carnevale Michael J Envelope packet architecture for broadband engine
US7953922B2 (en) * 2004-10-29 2011-05-31 Foundry Networks, Llc Double density content addressable memory (CAM) lookup scheme
US20100161894A1 (en) * 2004-10-29 2010-06-24 Foundry Networks, Inc. Double density content addressable memory (CAM) lookup scheme
US7953923B2 (en) 2004-10-29 2011-05-31 Foundry Networks, Llc Double density content addressable memory (CAM) lookup scheme
US10652147B2 (en) 2004-11-16 2020-05-12 Intel Corporation Packet coalescing
US20060104303A1 (en) * 2004-11-16 2006-05-18 Srihari Makineni Packet coalescing
US7620071B2 (en) 2004-11-16 2009-11-17 Intel Corporation Packet coalescing
WO2006055494A1 (en) * 2004-11-16 2006-05-26 Intel Corporation Packet coalescing
US9485178B2 (en) 2004-11-16 2016-11-01 Intel Corporation Packet coalescing
US8036246B2 (en) 2004-11-16 2011-10-11 Intel Corporation Packet coalescing
CN102427446A (en) * 2004-11-16 2012-04-25 英特尔公司 Packet coalescing
US8718096B2 (en) 2004-11-16 2014-05-06 Intel Corporation Packet coalescing
US20110090920A1 (en) * 2004-11-16 2011-04-21 Srihari Makineni Packet coalescing
US20100020819A1 (en) * 2004-11-16 2010-01-28 Srihari Makineni Packet coalescing
US10957423B2 (en) 2005-03-03 2021-03-23 Washington University Method and apparatus for performing similarity searching
US10580518B2 (en) 2005-03-03 2020-03-03 Washington University Method and apparatus for performing similarity searching
US9547680B2 (en) 2005-03-03 2017-01-17 Washington University Method and apparatus for performing similarity searching
US7647436B1 (en) * 2005-04-29 2010-01-12 Sun Microsystems, Inc. Method and apparatus to interface an offload engine network interface with a host machine
US7562175B2 (en) 2005-05-12 2009-07-14 International Business Machines Corporation Internet SCSI communication via UNDI services
US20080082314A1 (en) * 2005-05-12 2008-04-03 Sumeet Kochar Internet scsi communication via undi services
US20080082313A1 (en) * 2005-05-12 2008-04-03 Dunham Scott N Internet scsi communication via undi services
US20070266195A1 (en) * 2005-05-12 2007-11-15 Dunham Scott N Internet SCSI Communication via UNDI Services
US20060259291A1 (en) * 2005-05-12 2006-11-16 International Business Machines Corporation Internet SCSI communication via UNDI services
US7509449B2 (en) 2005-05-12 2009-03-24 International Business Machines Corporation Internet SCSI communication via UNDI services
US7430629B2 (en) 2005-05-12 2008-09-30 International Business Machines Corporation Internet SCSI communication via UNDI services
US8155001B1 (en) 2005-08-31 2012-04-10 Chelsio Communications, Inc. Protocol offload transmit traffic management
US7724658B1 (en) 2005-08-31 2010-05-25 Chelsio Communications, Inc. Protocol offload transmit traffic management
US8339952B1 (en) 2005-08-31 2012-12-25 Chelsio Communications, Inc. Protocol offload transmit traffic management
US7616563B1 (en) 2005-08-31 2009-11-10 Chelsio Communications, Inc. Method to implement an L4-L7 switch using split connections and an offloading NIC
US8139482B1 (en) 2005-08-31 2012-03-20 Chelsio Communications, Inc. Method to implement an L4-L7 switch using split connections and an offloading NIC
US7760733B1 (en) 2005-10-13 2010-07-20 Chelsio Communications, Inc. Filtering ingress packets in network interface circuitry
US7715436B1 (en) 2005-11-18 2010-05-11 Chelsio Communications, Inc. Method for UDP transmit protocol offload processing with traffic management
US20140056315A1 (en) * 2005-12-02 2014-02-27 Broadcom Corporation Method and system for speed negotiation for twisted pair links in fibre channel systems
US9008127B2 (en) * 2005-12-02 2015-04-14 Broadcom Corporation Method and system for speed negotiation for twisted pair links in fibre channel systems
US7580410B2 (en) 2005-12-16 2009-08-25 Industrial Technology Research Institute Extensible protocol processing system
US20070140297A1 (en) * 2005-12-16 2007-06-21 Shen-Ming Chung Extensible protocol processing system
US8213427B1 (en) 2005-12-19 2012-07-03 Chelsio Communications, Inc. Method for traffic scheduling in intelligent network interface circuitry
US7660264B1 (en) 2005-12-19 2010-02-09 Chelsio Communications, Inc. Method for traffic scheduling in intelligent network interface circuitry
US8448162B2 (en) 2005-12-28 2013-05-21 Foundry Networks, Llc Hitless software upgrades
US20090279549A1 (en) * 2005-12-28 2009-11-12 Foundry Networks, Inc. Hitless software upgrades
US9378005B2 (en) 2005-12-28 2016-06-28 Foundry Networks, Llc Hitless software upgrades
US20070153818A1 (en) * 2005-12-29 2007-07-05 Sridhar Lakshmanamurthy On-device packet descriptor cache
US7426610B2 (en) * 2005-12-29 2008-09-16 Intel Corporation On-device packet descriptor cache
US7924840B1 (en) 2006-01-12 2011-04-12 Chelsio Communications, Inc. Virtualizing the operation of intelligent network interface circuitry
US8686838B1 (en) 2006-01-12 2014-04-01 Chelsio Communications, Inc. Virtualizing the operation of intelligent network interface circuitry
US7660306B1 (en) 2006-01-12 2010-02-09 Chelsio Communications, Inc. Virtualizing the operation of intelligent network interface circuitry
US20080040519A1 (en) * 2006-05-02 2008-02-14 Alacritech, Inc. Network interface device with 10 Gb/s full-duplex transfer rate
US20070283041A1 (en) * 2006-06-01 2007-12-06 Shen-Ming Chung System and method for recognizing offloaded packets
US7747766B2 (en) 2006-06-01 2010-06-29 Industrial Technology Research Institute System and method for recognizing offloaded packets
US10169814B2 (en) 2006-06-19 2019-01-01 Ip Reservoir, Llc High speed processing of financial information using FPGA devices
US10817945B2 (en) 2006-06-19 2020-10-27 Ip Reservoir, Llc System and method for routing of streaming data as between multiple compute resources
US11182856B2 (en) 2006-06-19 2021-11-23 Exegy Incorporated System and method for routing of streaming data as between multiple compute resources
US9916622B2 (en) 2006-06-19 2018-03-13 Ip Reservoir, Llc High speed processing of financial information using FPGA devices
US10360632B2 (en) 2006-06-19 2019-07-23 Ip Reservoir, Llc Fast track routing of streaming data using FPGA devices
US9672565B2 (en) 2006-06-19 2017-06-06 Ip Reservoir, Llc High speed processing of financial information using FPGA devices
US10504184B2 (en) 2006-06-19 2019-12-10 Ip Reservoir, Llc Fast track routing of streaming data as between multiple compute resources
US9582831B2 (en) 2006-06-19 2017-02-28 Ip Reservoir, Llc High speed processing of financial information using FPGA devices
US10467692B2 (en) 2006-06-19 2019-11-05 Ip Reservoir, Llc High speed processing of financial information using FPGA devices
US10749994B2 (en) 2006-11-08 2020-08-18 Standard Microsystems Corporation Network traffic controller (NTC)
US9794378B2 (en) 2006-11-08 2017-10-17 Standard Microsystems Corporation Network traffic controller (NTC)
US20080109562A1 (en) * 2006-11-08 2008-05-08 Hariramanathan Ramakrishnan Network Traffic Controller (NTC)
US20080117911A1 (en) * 2006-11-21 2008-05-22 Yasantha Rajakarunanayake System and method for a software-based TCP/IP offload engine for digital media renderers
US7773546B2 (en) 2006-11-21 2010-08-10 Broadcom Corporation System and method for a software-based TCP/IP offload engine for digital media renderers
US9030943B2 (en) 2006-11-22 2015-05-12 Foundry Networks, Llc Recovering from failures without impact on data traffic in a shared bus architecture
US8238255B2 (en) 2006-11-22 2012-08-07 Foundry Networks, Llc Recovering from failures without impact on data traffic in a shared bus architecture
US8155011B2 (en) 2007-01-11 2012-04-10 Foundry Networks, Llc Techniques for using dual memory structures for processing failure detection protocol packets
US8395996B2 (en) 2007-01-11 2013-03-12 Foundry Networks, Llc Techniques for processing incoming failure detection protocol packets
US7978614B2 (en) 2007-01-11 2011-07-12 Foundry Networks, LLC Techniques for detecting non-receipt of fault detection protocol packets
US9112780B2 (en) 2007-01-11 2015-08-18 Foundry Networks, Llc Techniques for processing incoming failure detection protocol packets
US8170023B2 (en) * 2007-02-20 2012-05-01 Broadcom Corporation System and method for a software-based TCP/IP offload engine for implementing efficient digital media streaming over internet protocol networks
US20080198781A1 (en) * 2007-02-20 2008-08-21 Yasantha Rajakarunanayake System and method for a software-based TCP/IP offload engine for implementing efficient digital media streaming over Internet protocol networks
US8935406B1 (en) 2007-04-16 2015-01-13 Chelsio Communications, Inc. Network adaptor configured for connection establishment offload
US9537878B1 (en) 2007-04-16 2017-01-03 Chelsio Communications, Inc. Network adaptor configured for connection establishment offload
US8060644B1 (en) 2007-05-11 2011-11-15 Chelsio Communications, Inc. Intelligent network adaptor with end-to-end flow control
US8589587B1 (en) 2007-05-11 2013-11-19 Chelsio Communications, Inc. Protocol offload in intelligent network adaptor, including application level signalling
US8356112B1 (en) 2007-05-11 2013-01-15 Chelsio Communications, Inc. Intelligent network adaptor with end-to-end flow control
US7826350B1 (en) 2007-05-11 2010-11-02 Chelsio Communications, Inc. Intelligent network adaptor with adaptive direct data placement scheme
US7831720B1 (en) 2007-05-17 2010-11-09 Chelsio Communications, Inc. Full offload of stateful connections, with partial connection offload
US7908624B2 (en) 2007-06-18 2011-03-15 Broadcom Corporation System and method for just in time streaming of digital programs for network recording and relaying over internet protocol network
US20080313687A1 (en) * 2007-06-18 2008-12-18 Yasantha Nirmal Rajakarunanayake System and method for just in time streaming of digital programs for network recording and relaying over internet protocol network
US8037399B2 (en) 2007-07-18 2011-10-11 Foundry Networks, Llc Techniques for segmented CRC design in high speed networks
US8271859B2 (en) 2007-07-18 2012-09-18 Foundry Networks Llc Segmented CRC design in high speed networks
US8509236B2 (en) 2007-09-26 2013-08-13 Foundry Networks, Llc Techniques for selecting paths and/or trunk ports for forwarding traffic flows
US8149839B1 (en) 2007-09-26 2012-04-03 Foundry Networks, Llc Selection of trunk ports and paths using rotation
US10229453B2 (en) 2008-01-11 2019-03-12 Ip Reservoir, Llc Method and system for low latency basket calculation
US20090288013A1 (en) * 2008-05-16 2009-11-19 Honeywell International Inc. Scalable User Interface System
US7930343B2 (en) * 2008-05-16 2011-04-19 Honeywell International Inc. Scalable user interface system
US7751401B2 (en) * 2008-06-30 2010-07-06 Oracle America, Inc. Method and apparatus to provide virtual toe interface with fail-over
US20090323691A1 (en) * 2008-06-30 2009-12-31 Sun Microsystems, Inc. Method and apparatus to provide virtual toe interface with fail-over
US20110270976A1 (en) * 2008-09-19 2011-11-03 Masama Yasuda Network protocol processing system and network protocol processing method
US8838782B2 (en) * 2008-09-19 2014-09-16 Nec Corporation Network protocol processing system and network protocol processing method
US11676206B2 (en) 2008-12-15 2023-06-13 Exegy Incorporated Method and apparatus for high-speed processing of financial market depth data
US10062115B2 (en) 2008-12-15 2018-08-28 Ip Reservoir, Llc Method and apparatus for high-speed processing of financial market depth data
US10929930B2 (en) 2008-12-15 2021-02-23 Ip Reservoir, Llc Method and apparatus for high-speed processing of financial market depth data
US9154453B2 (en) 2009-01-16 2015-10-06 F5 Networks, Inc. Methods and systems for providing direct DMA
US9152483B2 (en) 2009-01-16 2015-10-06 F5 Networks, Inc. Network devices with multiple fully isolated and independently resettable direct memory access channels and methods thereof
US8090901B2 (en) 2009-05-14 2012-01-03 Brocade Communications Systems, Inc. TCAM management approach that minimize movements
US9166818B2 (en) 2009-09-21 2015-10-20 Brocade Communications Systems, Inc. Provisioning single or multistage networks using ethernet service instances (ESIs)
US8599850B2 (en) 2009-09-21 2013-12-03 Brocade Communications Systems, Inc. Provisioning single or multistage networks using ethernet service instances (ESIs)
US9313047B2 (en) 2009-11-06 2016-04-12 F5 Networks, Inc. Handling high throughput and low latency network data packets in a traffic management device
US11397985B2 (en) 2010-12-09 2022-07-26 Exegy Incorporated Method and apparatus for managing orders in financial markets
US10037568B2 (en) 2010-12-09 2018-07-31 Ip Reservoir, Llc Method and apparatus for managing orders in financial markets
US11803912B2 (en) 2010-12-09 2023-10-31 Exegy Incorporated Method and apparatus for managing orders in financial markets
US9172640B2 (en) 2011-03-30 2015-10-27 Amazon Technologies, Inc. Frameworks and interfaces for offload device-based packet processing
CN104054067A (en) * 2011-03-30 2014-09-17 亚马逊技术公司 Frameworks and interfaces for offload device-based packet processing
US9042403B1 (en) 2011-03-30 2015-05-26 Amazon Technologies, Inc. Offload device for stateless packet processing
US11099885B2 (en) 2011-03-30 2021-08-24 Amazon Technologies, Inc. Frameworks and interfaces for offload device-based packet processing
US9904568B2 (en) 2011-03-30 2018-02-27 Amazon Technologies, Inc. Frameworks and interfaces for offload device-based packet processing
AU2012236513B2 (en) * 2011-03-30 2015-02-05 Amazon Technologies, Inc. Frameworks and interfaces for offload device-based packet processing
US10565002B2 (en) 2011-03-30 2020-02-18 Amazon Technologies, Inc. Frameworks and interfaces for offload device-based packet processing
US8774213B2 (en) 2011-03-30 2014-07-08 Amazon Technologies, Inc. Frameworks and interfaces for offload device-based packet processing
US11656900B2 (en) 2011-03-30 2023-05-23 Amazon Technologies, Inc. Frameworks and interfaces for offload device-based packet processing
WO2012135442A1 (en) * 2011-03-30 2012-10-04 Amazon Technologies, Inc. Frameworks and interfaces for offload device-based packet processing
US20130185378A1 (en) * 2012-01-18 2013-07-18 LineRate Systems, Inc. Cached hash table for networking
US10963962B2 (en) 2012-03-27 2021-03-30 Ip Reservoir, Llc Offload processing of data packets containing financial market data
US10872078B2 (en) 2012-03-27 2020-12-22 Ip Reservoir, Llc Intelligent feed switch
US10121196B2 (en) 2012-03-27 2018-11-06 Ip Reservoir, Llc Offload processing of data packets containing financial market data
US10650452B2 (en) * 2012-03-27 2020-05-12 Ip Reservoir, Llc Offload processing of data packets
US9990393B2 (en) 2012-03-27 2018-06-05 Ip Reservoir, Llc Intelligent feed switch
US20140180903A1 (en) * 2012-03-27 2014-06-26 Ip Reservoir, Llc Offload Processing of Data Packets
US11436672B2 (en) 2012-03-27 2022-09-06 Exegy Incorporated Intelligent switch for processing financial market data
US9047417B2 (en) 2012-10-29 2015-06-02 Intel Corporation NUMA aware network interface
US9270602B1 (en) * 2012-12-31 2016-02-23 F5 Networks, Inc. Transmit rate pacing of large network traffic bursts to reduce jitter, buffer overrun, wasted bandwidth, and retransmissions
US10375155B1 (en) 2013-02-19 2019-08-06 F5 Networks, Inc. System and method for achieving hardware acceleration for asymmetric flow connections
US10684973B2 (en) 2013-08-30 2020-06-16 Intel Corporation NUMA node peripheral switch
US11593292B2 (en) 2013-08-30 2023-02-28 Intel Corporation Many-to-many PCIe switch
US9864606B2 (en) 2013-09-05 2018-01-09 F5 Networks, Inc. Methods for configurable hardware logic device reloading and devices thereof
GB2542373A (en) * 2015-09-16 2017-03-22 Nanospeed Tech Ltd TCP/IP offload system
US11855898B1 (en) 2018-03-14 2023-12-26 F5, Inc. Methods for traffic dependent direct memory access optimization and devices thereof
US11537716B1 (en) 2018-11-13 2022-12-27 F5, Inc. Methods for detecting changes to a firmware and devices thereof
US11741056B2 (en) 2019-11-01 2023-08-29 EMC IP Holding Company LLC Methods and systems for allocating free space in a sparse file system
US11669259B2 (en) 2021-04-29 2023-06-06 EMC IP Holding Company LLC Methods and systems for in-line deduplication in a distributed storage system
US11579976B2 (en) 2021-04-29 2023-02-14 EMC IP Holding Company LLC Methods and systems for parallel RAID rebuild in a distributed storage system
US20220350543A1 (en) * 2021-04-29 2022-11-03 EMC IP Holding Company LLC Methods and systems for storing data in a distributed system using offload components
US11740822B2 (en) 2021-04-29 2023-08-29 EMC IP Holding Company LLC Methods and systems for error detection and correction in a distributed storage system
US11567704B2 (en) 2021-04-29 2023-01-31 EMC IP Holding Company LLC Method and systems for storing data in a storage pool using memory semantics with applications interacting with emulated block devices
US11604610B2 (en) * 2021-04-29 2023-03-14 EMC IP Holding Company LLC Methods and systems for storing data in a distributed system using offload components
US11892983B2 (en) 2021-04-29 2024-02-06 EMC IP Holding Company LLC Methods and systems for seamless tiering in a distributed storage system
US11677633B2 (en) 2021-10-27 2023-06-13 EMC IP Holding Company LLC Methods and systems for distributing topology information to client nodes
US11762682B2 (en) 2021-10-27 2023-09-19 EMC IP Holding Company LLC Methods and systems for storing data in a distributed system using offload components with advanced data services
US11922071B2 (en) 2021-10-27 2024-03-05 EMC IP Holding Company LLC Methods and systems for storing data in a distributed system using offload components and a GPU module

Also Published As

Publication number Publication date
US7496689B2 (en) 2009-02-24

Similar Documents

Publication Publication Date Title
US7496689B2 (en) TCP/IP offload device
JP5066707B2 (en) TCP/IP offload device with reduced sequential processing
US6751665B2 (en) Providing window updates from a computer to a network interface device
US8447803B2 (en) Method and apparatus for distributing network traffic processing on a multiprocessor computer
US7472156B2 (en) Transferring control of a TCP connection between devices
US6658480B2 (en) Intelligent network interface system and method for accelerated protocol processing
USRE47756E1 (en) High performance memory based communications interface
US7058735B2 (en) Method and apparatus for local and distributed data memory access (“DMA”) control
US6389479B1 (en) Intelligent network interface device and system for accelerated communication
US5608892A (en) Active cache for a microprocessor
US6813653B2 (en) Method and apparatus for implementing PCI DMA speculative prefetching in a message passing queue oriented bus system
US6912610B2 (en) Hardware assisted firmware task scheduling and management
US6334153B2 (en) Passing a communication control block from host to a local device such that a message is processed on the device
US20080040519A1 (en) Network interface device with 10 Gb/s full-duplex transfer rate
US20040064589A1 (en) Fast-path apparatus for receiving data corresponding to a TCP connection
US20040054813A1 (en) TCP offload network interface device
US20040034718A1 (en) Prefetching of receive queue descriptors
US20020156927A1 (en) TCP/IP offload network interface device
US20150032835A1 (en) Iwarp send with immediate data operations
US20060092934A1 (en) System for protocol processing engine
US6880047B2 (en) Local emulation of data RAM utilizing write-through cache hardware within a CPU module

Legal Events

Date Code Title Description
AS Assignment

Owner name: ALACRITECH, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SHARP, COLIN C.;PHILBRICK, CLIVE M.;STARR, DARYL D.;AND OTHERS;REEL/FRAME:014432/0802

Effective date: 20030818

STCF Information on status: patent grant

Free format text: PATENTED CASE

CC Certificate of correction
FPAY Fee payment

Year of fee payment: 4

AS Assignment

Owner name: A-TECH LLC, DELAWARE

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ALACRITECH INC.;REEL/FRAME:031644/0783

Effective date: 20131017

AS Assignment

Owner name: ALACRITECH, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:A-TECH LLC;REEL/FRAME:039068/0884

Effective date: 20160617

FPAY Fee payment

Year of fee payment: 8

FEPP Fee payment procedure

Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

LAPS Lapse for failure to pay maintenance fees

Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20210224