US20020010793A1 - Method and apparatus for performing frame processing for a network - Google Patents
Method and apparatus for performing frame processing for a network
- Publication number
- US20020010793A1 (application US 08/916,487)
- Authority
- US
- United States
- Prior art keywords
- data
- instruction
- data frames
- frame
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L49/00—Packet switching elements
- H04L49/90—Buffering arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/12—Protocol engines
Definitions
- the present invention relates to data communications networks and, more particularly, to switching data frames through data communications networks.
- Frame processing is performed at nodes of networks, such as local area networks (LANs).
- the nodes are able to determine how to forward or switch frames to other nodes in the network.
- FIG. 1 is a block diagram of a conventional frame processing apparatus 100 .
- the conventional frame processing apparatus 100 is suitable for use in a LAN, namely a token-ring network.
- the conventional frame processing apparatus 100 receives data frames from a plurality of ports associated with the LAN.
- the data frames are processed by the conventional frame processing apparatus 100 to effectuate a switching operation.
- data frames received from each of the ports are processed such that they are either dropped or forwarded to other ports being serviced by the conventional frame processing apparatus 100 .
- the conventional frame processing apparatus 100 includes physical layer interfaces 102 , 104 , 106 and 108 .
- the physical layer interfaces 102 - 108 individually couple to a respective port of the token-ring network. Coupled to each of the physical layer interfaces 102 - 108 is a token-ring chip set.
- token-ring chip sets 110 , 112 , 114 and 116 respectively couple to the physical layer interfaces 102 , 104 , 106 and 108 .
- each of the token-ring chip sets 110 - 116 includes a TMS380C26 LAN communications processor token-ring chip as well as TMS380FPA PacketBlaster network accelerator and TMS44400 DRAM, all of which are available from Texas Instruments, Inc. of Dallas, Tex.
- Although the token-ring chip sets 110 - 116 could each couple to a data bus directly, the conventional frame processing apparatus 100 may include bus interface circuits 118 and 120 to improve performance.
- the bus interface circuits 118 and 120 couple the token-ring chip sets 110 - 116 to a data bus 122 .
- the bus interface circuits 118 - 120 transmit a burst of data over the data bus 122 for storage in a frame buffer 124 . By transmitting the data in bursts, the bandwidth of the data bus 122 is able to be better utilized.
- a frame buffer controller 126 controls the storage and retrieval of data to and from the frame buffer 124 by way of the bus interface circuits 118 and 120 using control lines 128 , 130 and 132 .
- the frame buffer 124 stores one or more data frames that are being processed by the conventional frame processing apparatus 100 .
- An isolation device 134 is used to couple a bus 136 for a microprocessor 138 to the data bus 122 .
- the microprocessor 138 is also coupled to a microprocessor memory 140 and a frame buffer controller 126 .
- the microprocessor 138 is typically a general purpose microprocessor programmed to perform frame processing using the general instruction set for the microprocessor 138 .
- the microprocessor 138 interacts with data frames stored in the frame buffer 124 to perform filtering to determine whether to drop data frames or provide a switching destination for the data frames.
- the microprocessor 138 is also responsible for low level buffer management, control and setup of hardware and network address management.
- the microprocessors used to perform the frame processing are primarily general purpose microprocessors. Recently, a few specialized microprocessors have been built to be better suited to frame processing tasks than are general purpose microprocessors.
- An example of such a microprocessor is the CXP microprocessor produced by Bay Networks, Inc.
- these specialized microprocessors are separate integrated circuit chips that process frames already stored into a frame buffer.
- One problem with conventional frame processing apparatuses, such as the conventional frame processing apparatus 100 illustrated in FIG. 1, is that the general purpose microprocessor is not able to process data frames at high speed. As a result, the number of ports that the conventional frame processing apparatus can support is limited by the speed at which the general purpose microprocessor can perform the filtering operations. The use of specialized microprocessors is an improvement but places additional burdens on the bandwidth requirements of the data paths.
- Another problem with the conventional frame processing apparatus is that the data path to and from the physical layer and the frame buffer during reception and transmission of data has various bottlenecks that render the conventional hardware design inefficient.
- Yet another disadvantage of the conventional frame processing apparatus is that it requires a large number of integrated circuit chips.
- the bus interface circuits 118 and 120 are individually provided as application specific integrated circuits (ASICs) for each pair of ports, the token-ring chip sets 110 - 116 include one or more integrated circuit chips for each port, and various other chips are also required.
- the invention is an improved frame processing apparatus for a network that supports high speed frame processing.
- the frame processing apparatus uses a combination of fixed hardware and programmable hardware to implement network processing, including frame processing and media access control (MAC) processing.
- the improved frame processing apparatus is particularly suited for token-ring networks and ethernet networks.
- the invention can be implemented in numerous ways, including as an apparatus, an integrated circuit and network equipment. Several embodiments of the invention are discussed below.
- an embodiment of the invention includes at least: a plurality of protocol handlers of the data communications network, each of the protocol handlers being associated with a port of the data communications network; and a pipelined processor to filter the data frames received by the protocol handlers as the data frames are being received.
- the pipelined processor provides a uniform latency by sequencing through the protocol handlers with each clock cycle.
- the apparatus is formed on a single integrated circuit chip.
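The uniform latency obtained by sequencing through the protocol handlers with each clock cycle can be illustrated with a small simulation. This is a minimal sketch, not the patented pipeline: the `PortContext` fields and the `run_round_robin` helper are hypothetical names, and the count of eight ports is taken from the later description of the receive FIFO.

```python
from dataclasses import dataclass, field

NUM_PORTS = 8  # the description later mentions eight ports


@dataclass
class PortContext:
    """Hypothetical per-port execution context (names are illustrative)."""
    pc: int = 0                                        # next instruction for this port
    issue_cycles: list = field(default_factory=list)   # clock cycles that served this port


def run_round_robin(num_cycles, num_ports=NUM_PORTS):
    """Serve one port per clock cycle, advancing to the next port every cycle."""
    contexts = [PortContext() for _ in range(num_ports)]
    for cycle in range(num_cycles):
        ctx = contexts[cycle % num_ports]
        ctx.issue_cycles.append(cycle)
        ctx.pc += 1  # one instruction issued on behalf of this port
    return contexts


contexts = run_round_robin(32)

# Collect the spacing between consecutive service slots of every port.
gaps = {later - earlier
        for ctx in contexts
        for earlier, later in zip(ctx.issue_cycles, ctx.issue_cycles[1:])}
```

In this model every port is served exactly once every `NUM_PORTS` cycles regardless of what the other ports are doing, which is the sense in which the latency is uniform.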
- an embodiment of the invention includes at least: a plurality of protocol handlers, each of the protocol handlers corresponding to a different communications port; a receive buffer for temporarily storing data received from the protocol handlers; framing logic that controls the reception and transmission of data frames via the protocol handlers; and a filter processor to filter the data frames received by the protocol handlers such that certain of the data frames are dropped and other data frames are provided with a switching destination.
- the integrated circuit further includes a transmit buffer for temporarily storing outgoing data to be supplied to said protocol handlers, and the filter processor further operates to filter the data frames being supplied to said protocol handlers for transmission.
- an embodiment of the invention includes: a network processing apparatus for processing data frames received and data frames to be transmitted, a frame buffer to store the data frames received that are to be switched to other destinations in the network, and switch circuitry to switch the data frames in said frame buffer to the appropriate one or more protocol handlers.
- the network processing apparatus includes at least a plurality of protocol handlers, each of said protocol handlers corresponding to a different communications port of the network; and a frame processing apparatus to process the data frames received from said protocol handlers and the data frames to be transmitted via said protocol handlers.
- the advantages of the invention are numerous.
- One advantage of the invention is that a frame processing apparatus is able to process frames faster, thus allowing the frame processing apparatus to service more ports than conventionally possible.
- Another advantage of the invention is that the frame processing apparatus according to the invention requires significantly fewer integrated circuit chips per port serviced.
- FIG. 1 is a block diagram of a conventional frame processing apparatus.
- FIG. 2 is a block diagram of a frame processing apparatus according to an embodiment of the invention.
- FIG. 3A is a block diagram of MAC circuitry according to an embodiment of the invention.
- FIG. 3B is a block diagram of a protocol handler according to an embodiment of the invention.
- FIG. 4 is a block diagram of a filter processor according to an embodiment of the invention.
- FIG. 5 is a block diagram of a filter processor according to another embodiment of the invention.
- FIG. 6A is a block diagram of an instruction selection circuit according to an embodiment of the invention.
- FIG. 6B is a diagram illustrating the context switching utilized by a filter processor according to the invention.
- FIG. 7 is a block diagram of an address calculation circuit according to an embodiment of the invention.
- FIG. 9 is a block diagram of an aligner according to an embodiment of the invention.
- FIG. 10 is a block diagram of a switching circuit.
- the invention relates to an improved frame processing apparatus for a network that supports high speed frame processing.
- the frame processing apparatus uses a combination of fixed hardware and programmable hardware to implement network related processing, including frame processing and media access control (MAC) processing.
- the improved frame processing apparatus is particularly suited for token-ring networks and ethernet networks.
- FIG. 2 is a block diagram of a frame processing apparatus 200 according to an embodiment of the invention.
- the frame processing apparatus 200 includes physical layer interfaces 202 - 206 .
- Each of the physical layer interfaces 202 - 206 is associated with a port of the frame processing apparatus 200 , and each port is in turn coupled to a node of a network.
- the network may be a local area network (LAN). Examples of LANs include token-ring networks and ethernet networks.
- Each of the physical layer interfaces 202 - 206 also couples to media access controller (MAC) circuitry 208 .
- the MAC circuitry 208 performs media access control operations and filtering operations on the data frames being processed by the frame processing apparatus 200 .
- the MAC circuitry 208 is itself an integrated circuit chip. The construction and operation of the MAC circuitry 208 are discussed in detail below with respect to FIGS. 3 A- 9 .
- the MAC circuitry 208 couples to forwarding tables 210 by way of a table bus 212 .
- the forwarding tables 210 store information such as destination addresses, IP addresses, VLAN or bridge group information which are used by the MAC circuitry 208 .
- the forwarding tables 210 are coupled to the MAC circuitry 208 through a bus 212 . Additional details on the forwarding tables 210 are provided in FIG. 8 below.
- the MAC circuitry 208 receives incoming data frames, and then filters and processes the incoming data frames. The processed data frames are then stored in a frame buffer 214 . During transmission, the MAC circuitry 208 also receives the processed data frames from the frame buffer 214 , filters and forwards them to the appropriate nodes of the network. Hence, the MAC circuitry 208 is capable of performing both receive side filtering and transmit side filtering.
- the frame buffer 214 is coupled to the MAC circuitry 208 through a data bus 216 .
- the data bus 216 also couples to switch circuitry 218 .
- the data frames stored in the frame buffer 214 by the MAC circuitry 208 have normally been filtered by the MAC circuitry 208 .
- the switch circuitry 218 is thus able to retrieve the data frames to be switched from the frame buffer 214 over the data bus 216 .
- the switch circuitry 218 performs conventional switching operations, such as level-2 and level-3 switching.
- the switch circuitry 218 and the MAC circuitry 208 send and receive control signals over a control bus 220 .
- a control bus 222 is also used to communicate control signals between the frame buffer 214 and the switch circuitry 218 .
- the switch circuitry 218 is further described with respect to FIG. 10 below.
- the frame processing apparatus 200 further includes output queues and buffer management information storage 224 .
- the output queues and buffer management information storage 224 is coupled to the switch circuitry 218 over a bus 226 .
- the switch circuitry 218 monitors the output queues and buffer management information storage 224 to determine how to manage its switching operations.
- the frame processing apparatus 200 may further include an ATM port 227 that is coupled to the switch circuitry 218 and thus coupled to the frame buffer 214 and the output queues and buffer management information storage 224 .
- a microprocessor 228 is also coupled to the switch circuitry over bus 230 to assist with operations not directly associated with the reception and transmission of data frames. For example, the microprocessor 228 configures the MAC circuitry 208 during initialization, gathers statistical information, etc.
- the microprocessor 228 is coupled to a processor random-access memory (RAM) 232 over a processor bus 234 .
- the processor RAM 232 stores data utilized by the microprocessor 228 .
- the MAC circuitry 208 is also operatively coupled to the processor bus 234 by an isolation device 236 and an interconnect bus 238 .
- FIG. 3A is a block diagram of MAC circuitry 300 according to an embodiment of the invention.
- the MAC circuitry 300 may be the MAC circuitry 208 illustrated in FIG. 2.
- the MAC circuitry 300 includes a plurality of protocol handlers 302 .
- the protocol handlers 302 couple to physical layer interfaces and individually receive and transmit data over the physical media of the network coupled to the physical layer interfaces.
- a received data bus 304 couples the protocol handlers 302 to an input multiplexer 306 .
- the input multiplexer 306 is in turn coupled to a receive FIFO 310 through receive bus 308 .
- data being received at one of the protocol handlers 302 is directed along a receive data path consisting of the received data bus 304 , the input multiplexer 306 , the receive bus 308 , and the receive FIFO 310 .
- the protocol handlers 302 preferably implement in hardware those features of the 802.5 specification for the MAC layer that need to be implemented in hardware; the remaining features of the MAC layer are left to software (i.e., hardware programmed with software).
- the protocol handlers 302 incorporate hardware to perform full repeat path, token generation and acquisition, frame reception and transmission, priority operation, latency buffer and elasticity buffer.
- various timers, counters and policy flags are provided in the protocol handlers 302 .
- the balance of the MAC layer functions are performed in software in other portions of the MAC circuitry 300 (i.e., by the filter processor) or by the microprocessor 228 .
- a filter processor 312 is coupled to the receive FIFO 310 through a processor bus 314 .
- the processor bus 314 is also coupled to an output multiplexer 316 .
- the output multiplexer 316 is also coupled to a filter variables RAM 318 over a filter variables bus 320 .
- the filter variables RAM 318 also couples to the filter processor 312 to provide filter variables to the filter processor 312 as needed.
- the filter variables RAM 318 includes a receive filter variables RAM 318 - 1 for use by the filter processor 312 during receiving of frames and a transmit filter variables RAM 318 - 2 for use by the filter processor 312 during transmission of frames.
- the filter processor 312 within the MAC circuitry 208 is a programmable solution to the problem.
- the filter processor 312 can be implemented by a small core of logic (e.g., less than 15K gates) that can be dynamically programmed.
- the filter processor 312 preferably forms an execution pipeline that executes instructions over a series of stages.
- the instruction set is preferably small and tailored to frame examination operations.
- a received frame being processed has an execution context where each frame contains its own set of operating variables.
- the filter processor 312 is specialized for performing frame processing operations in a rapid and efficient manner in accordance with directions provided by program instructions.
- the filter processor 312 performs filter processing and other processing associated with forwarding frames. Each frame must be processed extensively to determine frame destinations. This includes extracting the frame destination address (DA) and looking it up in the forwarding tables 210 . Additionally, other fields may be attached to the destination address (DA) for context specific lookups. As an example, this could include VLAN or bridge group information. For layer-3 functionality, IP addresses can be extracted and passed through the forwarding tables 210 . In general, the filter processor 312 allows up to two arbitrary fields in either the received frame or variable memory to be concatenated and sent through the forwarding tables 210 . Furthermore, many frame fields must be compared against specific values or decoded from a range of values.
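The concatenation of two fields into a single lookup key, as described above, can be sketched as follows. This is an illustrative model under stated assumptions: the helper names, the dictionary standing in for the forwarding tables 210, the destination address offset (two bytes, assuming the access control and frame control bytes precede it), and the two-byte group identifier held in variable memory are all hypothetical.

```python
DA_OFFSET = 2   # assumption: AC and FC bytes precede the destination address
DA_LENGTH = 6   # 48-bit MAC destination address


def extract_field(data, offset, length):
    """Return `length` bytes starting at `offset`, from a frame or variable memory."""
    if offset < 0 or length < 0 or offset + length > len(data):
        raise ValueError("field lies outside the available data")
    return bytes(data[offset:offset + length])


def build_lookup_key(first, second=b""):
    """Concatenate up to two arbitrary fields into one lookup key."""
    return bytes(first) + bytes(second)


# A dictionary stands in for the forwarding tables (hypothetical representation).
forwarding_table = {}

# Hypothetical received frame: AC, FC, DA, SA, then payload.
frame = (bytes([0x10, 0x40])
         + bytes.fromhex("0000f6123456")
         + bytes.fromhex("0000f6abcdef")
         + b"payload")

# Hypothetical per-port variable memory holding a 2-byte bridge group identifier.
variables = bytes([0x00, 0x2A]) + bytes(62)

da = extract_field(frame, DA_OFFSET, DA_LENGTH)
group = extract_field(variables, 0, 2)
key = build_lookup_key(da, group)

forwarding_table[key] = 0b00000100    # illustrative forwarding map: port 2 only
port_map = forwarding_table.get(key)  # None would mean "unknown destination"
```

Keying the table on the address plus a context field lets the same destination address resolve differently per VLAN or bridge group.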
- the filter processor 312 preferably allows single instruction methods of comparing and branching, comparing and storing (for building complex Boolean functions), and lastly range checking, branching or storing.
- Customer configured filters can also be performed through this processing logic.
- Custom configured filters are, for example, used for blocking traffic between particular stations, networks or protocols, for monitoring traffic, or for mirroring traffic.
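The single-instruction compare-and-branch, compare-and-store and range-check operations can be modelled with a toy interpreter. The text does not give the actual instruction encoding, so the opcode names (`CMP_BR`, `CMP_ST`, `RANGE_BR`, `DROP`, `FORWARD`), the tuple format and the one-byte operands below are purely hypothetical; the sketch only shows how such operations combine into a filter.

```python
def run_filter(program, frame, variables):
    """Execute a tiny filter program over a frame; return "drop" or "forward"."""
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        if op == "CMP_BR":        # compare a frame byte and branch if equal
            offset, value, target = args
            pc = target if frame[offset] == value else pc + 1
        elif op == "CMP_ST":      # compare a frame byte and store the Boolean result
            offset, value, name = args
            variables[name] = frame[offset] == value
            pc += 1
        elif op == "RANGE_BR":    # branch if low <= frame byte <= high
            offset, low, high, target = args
            pc = target if low <= frame[offset] <= high else pc + 1
        elif op == "DROP":
            return "drop"
        elif op == "FORWARD":
            return "forward"
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return "forward"              # falling off the end forwards the frame


# Hypothetical custom filter: record whether byte 0 is 0xAA, and drop any
# frame whose byte 1 falls in the range 0x10..0x1F.
program = [
    ("CMP_ST", 0, 0xAA, "from_station_a"),
    ("RANGE_BR", 1, 0x10, 0x1F, 3),
    ("FORWARD",),
    ("DROP",),
]

vars_blocked = {}
vars_allowed = {}
blocked = run_filter(program, bytes([0xAA, 0x15]), vars_blocked)
allowed = run_filter(program, bytes([0xBB, 0x20]), vars_allowed)
```

Stored Boolean results such as `from_station_a` are what would let later instructions build more complex Boolean functions from simple comparisons.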
- the filter variables RAM 318 is a 128 × 64 RAM that holds 64 bytes of variables for each port.
- the filter variables RAM 318 is preferably a dual port RAM where both the read and write ports are used by the filter processor 312 .
- the first 32 bytes of variables for a port are always written out to the frame buffer 214 with a status write for each frame processed by the filter processor 312 .
- the status write thus contains the control information that results from the frame processing.
- the control information includes beginning location and ending location within the frame buffer 214 , status information (e.g., CRC error, Rx overflow, Too long, Alignment error, Frame aborted, Priority), a forwarding map, and various destinations for the frame.
- the remaining 32 bytes can be written by request of the filter processor 312 .
- This allows software or external routing devices easy access to variables that can be used to store extracted data or Boolean results in a small collected area. Instructions should not depend on initialized values for any variable as the RAM entries are re-used on a frame basis and thus will start each frame initialized to the values written by the last frame. Note that many variables have a pre-defined function that is used by the switch circuitry 218 for forwarding frames.
- the microprocessor 228 is able to read or write any location in the filter variables RAM 318 .
- the microprocessor 228 reads information from the filter variables RAM 318 for diagnostic purposes. It can, however, be used by functional software in order to pass in parameters for a port that are fixed from frame to frame but programmable during the lifetime of a port. Examples of this include the spanning tree state (blocked or not blocked).
- the filter variables RAM 318 may also be double buffered. In one embodiment, there are two 64 byte areas per port, and alternate frames received for a port re-use a given 64 byte area. As a result, frame processing can begin on a subsequent frame while the buffer system is still waiting to unload the previous frame's variables. This is an important point for software since port control parameters must be written to both areas.
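The double-buffering rule, and its consequence that port control parameters must be written to both areas, can be sketched as below. The address layout (port-major, then area, then word) and the class and method names are assumptions made for illustration; only the sizes (a 128 × 64 RAM, two 64-byte areas per port, eight ports) come from the text.

```python
WORD_BYTES = 8        # 64-bit RAM words
RAM_WORDS = 128       # 128 x 64 RAM
NUM_PORTS = 8
AREA_BYTES = 64       # one variable area per frame in flight
AREAS_PER_PORT = 2    # double buffered
WORDS_PER_AREA = AREA_BYTES // WORD_BYTES


def word_address(port, area, word):
    """Hypothetical layout: port-major, then area, then word within the area."""
    if not (0 <= port < NUM_PORTS and 0 <= area < AREAS_PER_PORT
            and 0 <= word < WORDS_PER_AREA):
        raise ValueError("address component out of range")
    return (port * AREAS_PER_PORT + area) * WORDS_PER_AREA + word


class FilterVariablesRam:
    def __init__(self):
        self.words = [0] * RAM_WORDS

    def area_for_frame(self, frame_seq):
        """Alternate frames received on a port re-use a given area."""
        return frame_seq % AREAS_PER_PORT

    def write_port_parameter(self, port, word, value):
        """Port control parameters must be written to both areas."""
        for area in range(AREAS_PER_PORT):
            self.words[word_address(port, area, word)] = value

    def read(self, port, frame_seq, word):
        area = self.area_for_frame(frame_seq)
        return self.words[word_address(port, area, word)]


ram = FilterVariablesRam()
ram.write_port_parameter(port=3, word=5, value=1)  # e.g. a spanning tree state flag
```

Because consecutive frames on a port read from alternating areas, a parameter written to only one area would be visible to every other frame only.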
- the filter variables RAM 318 also contains status registers for each port.
- the status registers are updated with the progress of the processing of each frame. Status information in the status registers is primarily for the benefit of the filter processor 312 .
- the status registers are normally written by the protocol handlers 302 but can also be updated by the filter processor 312 .
- An instruction RAM 322 is also coupled to the filter processor 312 to supply the instructions to be executed by the filter processor 312 .
- the instruction RAM 322 stores the instructions executed by the filter processor 312 .
- the instructions are written to the instruction RAM 322 by the microprocessor 228 and read from the instruction RAM 322 by the filter processor 312 .
- the instruction RAM 322 can be a 512 × 64 RAM having a single port. All ports of the frame processing apparatus 200 share the same instruction set for the processing carried out by the filter processor 312 .
- the filter processor 312 is able to support execution specific to a port or group of ports. Grouping of ports is, for example, useful to form subnetworks within a network.
- a table interface 324 provides an interface between the forwarding tables 210 and the filter processor 312 .
- the forwarding tables 210 store destination addresses, IP addresses, VLAN or bridge group information which are used by the filter processor 312 in processing the frames. Additional details on the table interface are described below with reference to FIG. 8.
- a buffer 326 receives the output data from the output multiplexer 316 and couples the output data to the data bus 216 .
- the data bus 216 is coupled to a transmit FIFO 328 .
- the output of the transmit FIFO 328 is coupled to a transmit bus 330 which is coupled to the protocol handlers 302 and the filter processor 312 .
- the transmit data path through the MAC circuitry 300 consists of the data bus 216 , the transmit FIFO 328 , and the transmit bus 330 .
- the MAC circuitry 300 further includes a FIFO controller 332 for controlling the receive FIFO 310 and the transmit FIFO 328 .
- the FIFO controller 332 couples to the control lines 220 through a frame buffer interface 334 .
- the FIFO controller 332 additionally couples to framing logic 336 that manages reception and transmission of frames.
- the framing logic 336 is coupled to the filter processor 312 over control line 338 , and the FIFO controller 332 is coupled to the filter processor over control line 340 .
- the framing logic 336 further couples to a statistics controller 342 that controls the storage of statistics in a statistics RAM 344 . Exemplary statistics are provided in Table 1 below.
- the data is streamed to and from the frame buffer 214 through the FIFOs 310 , 328 for providing latency tolerance.
- the frame buffer interface 334 handles the unloading of data from the receive FIFO 310 and writing the unloaded data to the frame buffer 214 .
- the frame buffer interface 334 also handles the removal of data to be transmitted from the frame buffer 214 and the loading of the removed data into the transmit FIFO 328 .
- the output queues and buffer management information storage 224 is used to perform buffer address management.
- Whenever a block of data in the receive FIFO 310 is ready for any of the ports, the frame buffer interface 334 generates an RxDATA request to the switch circuitry 218 for each ready port. Likewise, whenever the transmit FIFO 328 has a block of space available for any port, the frame buffer interface 334 generates a TxDATA request to the switch circuitry 218 . Buffer memory commands generated by the switch circuitry 218 are received and decoded by the frame buffer interface 334 and used to control burst cycles into and out of the two FIFOs 310 , 328 .
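The request generation rule can be sketched as a simple scan over per-port fill levels. This is a minimal model: the function name and the representation of requests as tuples are hypothetical, and the 32-byte block size is the example value given elsewhere in the description.

```python
BLOCK_BYTES = 32  # example block size used in the description


def pending_requests(rx_fill, tx_free):
    """Return the requests the frame buffer interface would raise.

    rx_fill[p] is the number of received bytes waiting in port p's receive FIFO
    region; tx_free[p] is the number of free bytes in its transmit FIFO region.
    """
    requests = []
    for port, fill in enumerate(rx_fill):
        if fill >= BLOCK_BYTES:       # a full block is ready to unload
            requests.append(("RxDATA", port))
    for port, free in enumerate(tx_free):
        if free >= BLOCK_BYTES:       # room for one more block to be loaded
            requests.append(("TxDATA", port))
    return requests


requests = pending_requests(rx_fill=[0, 32, 31, 96], tx_free=[128, 0, 31, 64])
```

The sketch ignores end-of-frame handling, where a final partial block may also need to be unloaded.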
- the framing logic 336 tracks frame boundaries for both reception and transmission and controls the protocol handler side of the receive and transmit FIFOs 310 , 328 . On the receive side, each time a byte is ready from the protocol handler 302 it is written into the receive FIFO 310 , and the framing logic 336 keeps a count of valid bytes in the frame. In one embodiment, this count lags behind by four bytes in order to automatically strip the FCS from a received frame. In this case, an unload request for the receive FIFO 310 will not be generated until a block of data (e.g., 32 bytes) is known not to include the FCS.
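The lagging byte count used to strip the FCS can be expressed as a short calculation. This is a sketch under assumptions: the function name is hypothetical, and flushing a final partial block once the frame has ended is an assumed behaviour rather than something the text states.

```python
FCS_BYTES = 4     # the valid-byte count lags by four bytes to strip the FCS
BLOCK_BYTES = 32  # example block size from the description


def unloadable_blocks(bytes_received, frame_ended):
    """Number of blocks that may be unloaded from the receive FIFO for one frame.

    While the frame is still arriving, the last four bytes seen might be the FCS,
    so only blocks fully covered by the lagging count are safe to unload.
    """
    valid = max(bytes_received - FCS_BYTES, 0)
    if frame_ended:
        # Assumption: the final, possibly partial, block is flushed at end of frame.
        return -(-valid // BLOCK_BYTES)  # ceiling division
    return valid // BLOCK_BYTES
```

For example, after 35 bytes of a frame still in progress, no block can be unloaded, because any of the last four bytes might still turn out to be the FCS; the first block becomes safe once a 36th byte arrives.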
- Each entry in the receive FIFO 310 may also include termination flags that describe how much of a word (e.g., 8 bytes) is valid as well as mark the end of the frame. These termination flags can be used during unloading of the receive FIFO 310 to properly generate external bus flags used by the switch circuitry 218 . Subsequently received frames will be placed in the receive FIFO 310 starting on the next block boundary (e.g., next 32 byte boundary). This allows the switch circuitry 218 greater latency tolerance in processing frames.
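The termination flags and the block-aligned placement of the next frame can be sketched as follows. The function names and the `(valid_bytes, end_of_frame)` tuple are hypothetical representations; the 8-byte word and 32-byte block are the example sizes from the text, and wraparound within a port's FIFO region is ignored for simplicity.

```python
WORD_BYTES = 8    # example FIFO word size
BLOCK_BYTES = 32  # example block size


def termination_flags(frame_len, word_index):
    """Return (valid_bytes, end_of_frame) for one FIFO word of a frame."""
    remaining = frame_len - word_index * WORD_BYTES
    if word_index < 0 or remaining <= 0:
        raise ValueError("word lies outside the frame")
    return min(remaining, WORD_BYTES), remaining <= WORD_BYTES


def next_frame_start(end_offset):
    """Round the byte offset just past a frame up to the next block boundary."""
    return -(-end_offset // BLOCK_BYTES) * BLOCK_BYTES
```

A 20-byte frame therefore occupies three words, the last of which carries four valid bytes and the end-of-frame mark, and the following frame starts at byte offset 32.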
- On the transmit side, the protocol handler 302 is notified of a transmission request as soon as a block of data (e.g., 32 bytes) is ready in the transmit FIFO 328 . As with the receive side, each line may include termination flags that are used to control the end of frame. The protocol handler 302 will automatically add the proper FCS after transmitting the last byte. Multiple frames may be stored in the transmit FIFO 328 in order to minimize inter-frame gaps. In one embodiment, each port (channel) serviced by the frame processing apparatus 200 has 128 bytes of storage space in the FIFOs 310 , 328 . Up to two (2) frames (of 64 bytes) can be simultaneously stored in each of the FIFOs 310 , 328 .
- data is moved in bursts of four 64 bit wide cycles. This allows the reception of the data stream to have better tolerance to inter-packet allocation latencies and also to provide the ability to transmit on successive tokens at minimum Inter Frame Gaps (IFGs).
- Status information is sent from the framing logic 336 to external logic indicating availability of received data, or transmit data, as well as received status events.
- the transmit FIFO 328 may have a complication in that data can arrive from the frame buffer 214 unpacked. This can happen when software modifies frame headers and links fragments together.
- the frame buffer interface 334 may include a data aligner that will properly position incoming data based on where empty bytes start in the transmit FIFO 328 . Each byte is written on any boundary of the transmit FIFO 328 in a single clock.
- the receive FIFO 310 is implemented as two internal 128 × 32 RAMs. Each of the eight ports of the frame processing apparatus 200 is assigned a 16 × 64 region used to store up to four blocks. Frames start aligned with 32 byte blocks and fill consecutive memory bytes.
- the receive FIFO 310 is split into two RAMs in order to allow the filter processor 312 to fetch a word sized operand on any arbitrary boundary. To accommodate this, each RAM half uses an independent read address.
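Why two RAM halves with independent read addresses allow an operand fetch on any byte boundary can be shown with a small model. This is a sketch under assumptions: the operand is taken to be 4 bytes, consecutive 4-byte chunks are assumed to alternate between the two halves, and the `SplitFifo` class with its wraparound behaviour is hypothetical.

```python
CHUNK_BYTES = 4  # each RAM half is 32 bits wide


class SplitFifo:
    """Model of a FIFO region split across two 32-bit-wide RAM halves."""

    def __init__(self, data):
        if len(data) == 0 or len(data) % (2 * CHUNK_BYTES) != 0:
            raise ValueError("data must fill a whole number of 64-bit words")
        chunks = [bytes(data[i:i + CHUNK_BYTES])
                  for i in range(0, len(data), CHUNK_BYTES)]
        # Assumption: even-numbered chunks live in half 0, odd-numbered in half 1.
        self.halves = [chunks[0::2], chunks[1::2]]

    def _read_chunk(self, chunk_index):
        half = self.halves[chunk_index % 2]
        return half[(chunk_index // 2) % len(half)]  # wrap like a circular FIFO

    def fetch(self, offset):
        """Fetch a 4-byte operand starting at any byte offset."""
        first = offset // CHUNK_BYTES
        shift = offset % CHUNK_BYTES
        # Two adjacent chunks always sit in different halves, so both can be
        # read in the same cycle using a separate read address for each half.
        pair = self._read_chunk(first) + self._read_chunk(first + 1)
        return pair[shift:shift + CHUNK_BYTES]


fifo = SplitFifo(bytes(range(32)))
operand = fifo.fetch(5)  # unaligned: spans the second and third chunks
```

With a single 64-bit-wide RAM and one read address, an operand straddling a word boundary would need two sequential reads; the split avoids that.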
- the transmit FIFO 328 is slightly more complex. It is made of two 64 × 64 RAMs together with two 64 × 4 internal RAMs. The 64 × 64 RAMs hold the data words as received from the frame buffer 214 while the 64 × 4 RAMs are used to store the end of frame (EOF) flag together with a count of how many bytes are valid in the data word. Assuming data arrived aligned, each double-word of a burst would write to an alternate RAM. By using two RAMs split in this fashion, arbitrarily unaligned data can arrive with some portion being written into each RAM simultaneously.
- the statistics RAM 344 and the filter processor statistics RAM 323 are responsible for maintaining all per port statistics. A large number of counters are required or at least desired to provide Simple Network Management Protocol (SNMP) and Remote Monitor (RMON) operations. These particular counts are preferably maintained in the statistics RAM 344 . Also, the microprocessor 228 is able to read the statistics at any point in time through the CPU interface 346 .
- a single incrementer/adder per RAM is used together with a state machine to process all the counters stored in the statistics RAM 344 .
- Statistics generated by receive and transmit control logic are kept in the statistics RAM 344 .
- the statistics RAM 344 is a 128 × 16 RAM (16 statistics per port). The counters are all 16 bits wide except for the octet counters, which are 32 bits wide and thus occupy two successive memory locations. The microprocessor 228 is flagged each time any counter reaches 0xC00, at which point it must then read the counters.
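A 32-bit octet counter stored in two successive 16-bit locations can be updated with a carry between the two halves. The sketch below is illustrative only: the function name is hypothetical, and placing the low half at the lower address is an assumption, since the text does not state the ordering.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def add_octets(ram, index, count):
    """Add `count` to the 32-bit counter held in ram[index] and ram[index + 1].

    Assumption: ram[index] holds the low 16 bits and ram[index + 1] the high 16.
    """
    total = (((ram[index + 1] << 16) | ram[index]) + count) & MASK32
    ram[index] = total & MASK16
    ram[index + 1] = total >> 16
    return total


port_stats = [0] * 16        # 16 locations of 16 bits for one port
port_stats[0] = 0xFFFF       # low half about to overflow
new_total = add_octets(port_stats, 0, 1)
```

Here the increment carries out of the low location into the high one, leaving the pair holding 0x00010000.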
- Table 1 below illustrates representative statistics that can be stored in the statistics RAM 344 .
- frames will be classified first into groups and then only one counter per group will be affected for each frame. For example, a non-MAC broadcast frame properly received without source routing information will increment a counter storing a count for a DataBroadcastPkts statistic only.
- the microprocessor 228 has to add the DataBroadcastPkts, AllRoutesBroadcastPkts, SingleRoutesBroadcastPkts, InFrames, etc.
- the filter processor statistics RAM 323 is a 512 × 16 RAM for storage of 64 different 16 bit counts for each port. These statistics can be used for counting complex events or RMON functions.
- the microprocessor 228 is flagged each time a counter is half full, at which point it must then read the counters.
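The half-full notification can be modelled as a threshold check on each increment. This is a minimal sketch: the `CounterBank` class is hypothetical, and clearing a counter when it is read is an assumption, since the text only says the microprocessor must read the counters once flagged.

```python
COUNTER_BITS = 16
COUNTER_MASK = (1 << COUNTER_BITS) - 1
HALF_FULL = 1 << (COUNTER_BITS - 1)  # 0x8000 for a 16-bit counter


class CounterBank:
    def __init__(self, size):
        self.counts = [0] * size
        self.flagged = set()  # counters the microprocessor has been told to read

    def increment(self, index, amount=1):
        self.counts[index] = (self.counts[index] + amount) & COUNTER_MASK
        if self.counts[index] >= HALF_FULL:
            self.flagged.add(index)

    def read_and_clear(self, index):
        """Assumed software behaviour: read the count, then reset the counter."""
        value = self.counts[index]
        self.counts[index] = 0
        self.flagged.discard(index)
        return value


bank = CounterBank(64)          # 64 counts for one port
bank.increment(3, 0x7FFF)
flag_before = 3 in bank.flagged
bank.increment(3)
flag_after = 3 in bank.flagged
harvested = bank.read_and_clear(3)
```

Flagging at the halfway point leaves software as many further events as have already been counted before the counter could wrap.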
- the frame processing apparatus 200 also provides an interface to the microprocessor 228 so as to provide the microprocessor 228 with low-latency access to the internal resources of the MAC circuitry 208 .
- a CPU interface 346 interfaces the MAC circuitry 300 to the microprocessor 228 via the interconnect bus 238 so that the microprocessor 228 has access to the internal resources of the frame processing apparatus 200 .
- burst cycles are supported to allow software to use double-word transfers and block cycles.
- the microprocessor 228 is also used to read and write control registers in each of the protocol handlers 302 to provide control of ring access as well as assist with the processing of the MAC frames. Also, by providing the microprocessor 228 with access to the internal resources, the microprocessor 228 can perform diagnostic operations.
- the CPU interface 346 can also couple to the forwarding tables 210 so as to provide initialization and maintenance.
- the CPU interface 346 further couples to the protocol handlers 302 and a special transmit circuit 350 .
- the special transmit circuit 350 couples to the protocol handlers 302 over bus 352 .
- the protocol handlers 302 couple to the framing logic 336 over control lines 354 .
- the special transmit circuit 350 operates to transmit special data, namely high priority MAC frames.
- the special transmit circuit 350 is used within the MAC circuitry 300 to transmit high priority frames without having to put them through the switch circuitry 218 .
- certain MAC frames e.g., beacon, claim and purge
- certain high-priority MAC frames i.e., AMP and SMP
- AMP and SMP high-priority MAC frames
- the special transmit circuit 350 includes an internal buffer to store an incoming high priority frame.
- the internal buffer can store a block of 64 bytes of data within the special transmit circuit 350 .
- the MAC processing software (microprocessor 228 ) is notified when a frame is stored in the internal buffer and then instructs the internal buffer to de-queue the frame to the protocol handler 302 for transmission.
- the MAC processing software thereafter polls for completion of the transmission and may alternatively abort the transmission.
- the special transmit circuit 350 may also be written by the microprocessor 228 via the CPU interface 346 .
- FIG. 3B is a block diagram of a protocol handler 356 according to an embodiment of the invention.
- the protocol handler 356 is, for example, an implementation of the protocol handler 302 illustrated in FIG. 3.
- the protocol handler 356 implements the physical signaling components (PSC) section and certain parts of the MAC Facility section of the IEEE 802.5 specification.
- the protocol handler 356 converts the token ring network into receive and transmit byte-wide data streams and implements the token access protocol for access to the shared network media (i.e., line).
- Data being received from a line is received at a local loopback multiplexer 358 which forwards a selected output to a receive state machine 360 .
- the receive state machine 360 contains a de-serializer to convert the input stream into aligned octets.
- the primary output from the receive state machine 360 is a parallel byte stream that is forwarded to a receive FIFO 362 .
- the receive state machine 360 also detects errors (e.g., Manchester or CRC errors) for each frame, marks the start of the frame, and initializes a symbol decoder and the de-serializer. Further, the receive state machine 360 parses the input stream and generates the required flags and timing markers for subsequent processing. Additionally, the receive state machine 360 detects and validates token sequences, namely, the receive state machine 360 captures the priority field (P) and reservation field (R) of each token and frame and presents them to the remaining MAC circuitry 300 as current frame's priority field (Pr) and current frame's reservation field (Rr).
- the receive FIFO 362 is a FIFO device for the received data and also operates to re-synchronize the received data to a main system clock.
- the protocol handler 356 also has a transmit interface that includes two byte-wide transmit channels. One transmit channel is used for MAC frames and the other transmit channel is used for LLC frames (and some of the management style MAC frames).
- the LLC frames are supplied over the transmit bus 330 from the switch circuitry 218 .
- the MAC frames are fed from the special transmit circuitry 350 over the bus 352 .
- These two transmit channels supply two streams of frames to a transmit re-synchronizer 364 for synchronization with the main system clock.
- the re-synchronized transmit signals for the two streams are then forwarded from the transmit re-synchronizer 364 to a transmit state machine 366 .
- the transmit state machine 366 multiplexes the data from the two input streams by selecting the data from the bus 352 first and then the data from the bus 330 .
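For illustration only, the selection order between the two transmit channels might be sketched as a strict-priority choice in Python. The use of frame queues, the function name, and the frame-level granularity are assumptions of this sketch; the specification states only that data from the bus 352 is selected first and then data from the bus 330.

```python
from collections import deque


def next_frame(mac_channel, llc_channel):
    """Return the next frame to transmit, preferring the MAC channel
    (bus 352) over the LLC channel (bus 330). Returns None when idle."""
    if mac_channel:
        return mac_channel.popleft()
    if llc_channel:
        return llc_channel.popleft()
    return None


# Example: a pending MAC frame is sent before queued LLC frames.
mac = deque(["beacon"])
llc = deque(["data-1", "data-2"])
order = [next_frame(mac, llc) for _ in range(4)]
```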
- the transmit state machine 366 controls a multiplexer 368 to select either one of the input streams supplied by the transmit state machine 366 or repeat data supplied by a repeat path supplier 370 . While waiting for the detection of a token of the suitable priority, the transmit state machine 366 causes the multiplexer 368 to output the repeat data from the repeat path supplier 370 .
- when the transmit state machine 366 detects a token with the proper priority, it causes the multiplexer 368 to output the frame data to be transmitted and, at the end of each frame, inserts a frame check sequence (FCS) and ending frame sequence (EFS), and then transmits the inter frame gap (IFG) and a token.
- the transmit state machine 366 is also responsible for stripping any frame that it has put on the token-ring network. The stripping happens in parallel with transmission and follows a procedure defined in the 802.5 specification. As suggested in the 802.5 specification, under-stripping is avoided at the expense of over-stripping.
- the output of the multiplexer 368 is supplied to a priority state machine 372 .
- the priority state machine 372 implements the 802.5 specification priority stacking mechanism. For example, when priority stacking is in use, i.e., when the priority of the token is raised, the repeat path is delayed by up to eight (8) additional bits. Once the priority stacking is no longer in use, the priority delay is removed.
- the output of the priority state machine 372 is forwarded to a fixed latency buffer 374 that, for example, inserts a fixed latency of a predetermined number of bits (e.g., 24 bits) to ensure that a token can circulate around the token-ring.
- the output from the fixed latency buffer 374 is supplied to an elasticity buffer 376 as well as to the loopback multiplexer 358 for loopback purposes.
- the elasticity buffer 376 provides a variable delay for clock rate error tolerance.
- the output of the priority state machine 372 as well as the output of the elasticity buffer 376 are supplied to a multiplexer 378 .
- the data stream to be transmitted, from either the priority state machine 372 or the delayed version from the elasticity buffer 376 , is then provided to a wire-side loopback multiplexer 380 .
- the wire-side loopback multiplexer 380 also receives the input data stream when a loopback is desired.
- the wire-side loopback multiplexer 380 couples to one of the physical layer interfaces 202 - 206 and outputs either the output from the multiplexer 378 or the input data stream for loopback.
- the protocol handler 356 also includes a protocol handler register bank 382 that includes various control registers.
- because the frame processing apparatus 200 can support several connection modes (e.g., direct attachment, station, RI/RO expansion), functionality at power-up and during insertion has configurable deviations from the specification.
- direct attachment and RI/RO expansion require that the frame processing apparatus 200 repeat data at all times.
- the protocol handler 356 includes a wire-side loopback path implemented by the wire-side loopback multiplexer 380 for this purpose. This situation allows for accurate detection of idle rings (based on detecting lack of valid Manchester coding), instead of depending on the crude energy detect output from the physical layer interfaces 202 - 206 .
- the normal initialization process of sending lobe-media test frames is not applicable when connectivity has been ascertained prior to any insertion attempt. As such, this step of the initialization can be eliminated for all attachment modes besides station. For applications where the lobe testing is desirable or required, normal station attachment for RI/RO where phantom drive is generated can be utilized.
- Each frame of data that is received is processed through the filter processor 312 to determine whether or not the frame should be accepted by the port and forwarded.
- the filter processor 312 is preferably implemented by specialized general purpose hardware that processes programmed filtering instructions. Embodiments of the specialized general purpose hardware are described in detail below with reference to FIGS. 4 and 5.
- the filter processor 312 can execute a plurality of instructions (e.g., up to 512 instructions). Each instruction is capable of extracting fields from the frame of data and storing them in a storage device (i.e., the filter variables RAM 318 ). Likewise, frame fields can be compared against immediate values and the results of comparisons stored in the filter variables RAM 318 . Lastly, fields can be extracted, looked up in the forwarding tables 210 and the results stored in the filter variables RAM 318 . Each port also includes some number of control registers that are set by the microprocessor 228 and can be read by the filter processor 312 during execution of the filtering instructions. For example, these control registers are typically used to store virtual ring (VRING) membership numbers, source routing ring and bridge numbers, etc.
- the execution of filtering instructions by the filter processor 312 is generally responsible for two major functions. First, the filter processor 312 must determine a destination mask and BP DEST (backplane destination) fields used by the switch circuitry 218 for forwarding the frame. Second, the filter processor 312 must determine whether or not to accept the frame in order to properly set the AR (address recognized) and FC (frame copied) bits in the FS (frame status) field.
- While the filter processor 312 is processing a current frame, subsequent frames are placed in the receive FIFO 310 . The processing of the current frame thus should complete before the receive FIFO 310 is filled, because frames are dropped when the receive FIFO 310 overflows.
- all instructions that determine the acceptance of a frame must finish executing before the FS byte is copied off of the wire, else the previous settings will be used.
- execution is preferably scheduled as soon as the frame data that an instruction depends on arrives.
- the filter processor 312 can allow all required instructions to complete before or during the reception of the CRC. Also, it is sufficient to provide the filter processor 312 with a single execution unit to support all of the ports of the frame processing apparatus 200 , particularly when the ports are serviced in a round robin fashion as discussed below.
- the filter processor 312 also performs transmit side filtering.
- the same execution unit that performs the receive side filtering can perform the transmit side filtering while the reception side is idle.
- the use of the single execution unit should provide acceptable performance; however, for full duplex operation a second execution unit is provided to perform the transmit side filtering.
- the filter processor 312 operates to perform RIF scanning required to forward source routed frames. For each received frame of data that has a RIF, circuitry in the framing logic 336 operates to scan this field looking for a match between the source ring and bridge and an internal register. If a match is found the destination ring is extracted and placed in a register visible to the filter processor 312 . Thereafter, the destination ring stored in the register can be used to index a table within the forwarding tables 210 .
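As a hedged illustration, the RIF scan described above might be sketched in Python as follows. The sketch assumes the conventional source-routing route descriptor layout (a 12-bit ring number followed by a 4-bit bridge number), assumes the descriptors have already been extracted from the frame as 16-bit integers, and assumes the route is read in the forward direction; the function name is hypothetical and none of these details are stated in the text above.

```python
def scan_rif(route_descriptors, my_ring, my_bridge):
    """Scan route descriptors for (my_ring, my_bridge) and return the
    ring number of the following descriptor (the destination ring),
    or None when no match is found."""
    for i in range(len(route_descriptors) - 1):
        ring = route_descriptors[i] >> 4      # upper 12 bits (assumed layout)
        bridge = route_descriptors[i] & 0xF   # lower 4 bits (assumed layout)
        if ring == my_ring and bridge == my_bridge:
            return route_descriptors[i + 1] >> 4
    return None


# Example route: ring 0x001 -> bridge 1 -> ring 0x002 -> bridge 2 -> ring 0x003
rif = [(0x001 << 4) | 1, (0x002 << 4) | 2, (0x003 << 4) | 0]
destination_ring = scan_rif(rif, my_ring=0x002, my_bridge=2)
```

The returned ring number corresponds to the value that would be placed in the register visible to the filter processor and later used to index a forwarding table.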
- FIG. 4 is a block diagram of a filter processor 400 according to an embodiment of the invention. Even though the filter processor is a high speed pipelined processor, the circuitry implementing the filter processor 400 is minimal and compact so as to fit within the MAC circuitry 208 .
- the filter processor 400 is one embodiment of the filter processor 312 together with the RAM 322 illustrated in FIG. 3.
- the filter processor 400 has five (5) distinct pipeline stages. Generally, the stages are described as instruction fetch, operand fetch, decode, execute and write.
- the filter processor 400 retrieves an instruction to be next executed. More particularly, the instruction is retrieved from an instruction RAM 402 using a program counter obtained from a program counters storage 404 .
- the program counters storage 404 stores a program counter for each of the protocol handlers 302 being serviced by the MAC circuitry 300 .
- the instruction retrieved or fetched from the instruction RAM 402 is then latched in a fetched instruction word (I-word) register 406 . This completes the first stage of the filter processing pipeline.
- a cancel circuit 408 produces a cancel signal 410 to notify the program counters storage 404 to activate a wait counter for the particular protocol handler 302 being serviced.
- the wait counter provides a waiting period during which no processing is performed for the protocol handler 302 currently being serviced in this stage of the processing pipeline.
- This stage also includes an address calculation circuit 412 to calculate one or more addresses 414 used to access stored data in a memory storage device or devices.
- An operand fetch (op-fetch) output register 418 stores various data items that are determined in or carried-through 416 the operand fetch stage of the filter processing pipeline.
- a mask and function circuit 420 produces preferably a mask and a function. The mask will be used to protect data in a word outside the active field.
- a carry-through link 422 carries through the decode stage various data items from the operand fetch output register 418 .
- An aligner 424 receives the one or more operands from the data storage device or devices over a link 426 and possibly data from the operand fetch output register 418 . The aligner 424 then outputs one or more aligned operands.
- a branch target circuit 428 determines a branch target for certain instructions.
- a decode stage output register 430 stores the items produced by the decode stage, namely, the mask, function, carry through data, aligned operands, branch target, and miscellaneous other information.
- an arithmetic logic unit (ALU) 432 performs a logical operation on the aligned operands and possibly the function and produces an output result 434 .
- the ALU 432 also controls a selector 436 .
- the selector 436 selects one of the branch target from the decode stage output register 430 and a program counter after having been incremented by one via an adder 438 , to be output as a next program counter 440 .
- the next program counter 440 is supplied to the program counter storage 404 to update the appropriate program counter stored therein.
- the output result 434 and carry through data 442 are stored in an execute stage output register 444 together with other miscellaneous information.
- an aligner 446 aligns the output result 434 obtained from the execute stage output register 444 to produce an aligned output result 448 known as processed data.
- the processed data is then written to a determined location in the memory storage device or devices.
- the filter processor 400 services the protocol handlers 302 in a round robin fashion. In particular, with each clock cycle, the filter processor 400 begins execution of an instruction for a different one of the protocol handlers 302 .
- the processing resources of the filter processor 400 are distributed across the ports requiring service so that certain ports do not monopolize the processing resources.
- FIG. 5 is a block diagram of a filter processor 500 according to another embodiment of the invention.
- the filter processor 500 is a detailed embodiment of the filter processor 312 together with the instruction RAM 322 illustrated in FIG. 3.
- the filter processor 500 is also a more detailed embodiment of the filter processor 400 .
- the filter processor 500 is a pipelined processor having five (5) stages. Generally, the stages are described as instruction fetch, operand fetch, decode, execute and write.
- the filter processor 500 receives an instruction from an instruction RAM 501 .
- the instruction RAM 501 is an internal 512 × 64 RAM that holds instruction words. Since the port number can be read from the filter variables RAM 318 , execution specific to a port or group of ports can be supported. In one embodiment, protocol handlers share the same instruction set.
- the instruction RAM 501 is initialized by the microprocessor 228 at boot-up. While dynamic code changes are allowed, execution is preferably halted to prevent erroneous execution.
- a fetch controller 502 produces an instruction select signal 504 that is used to select the appropriate instruction from the instruction RAM 501 .
- the fetch controller 502 produces the instruction select signal 504 based on program counters 506 and wait counters 508 . Specifically, the fetch controller 502 selects the appropriate instruction in accordance with the program counter 506 for the particular protocol handler 302 being processed in any given clock cycle and its associated wait counter 508 . If the associated wait counter 508 is greater than zero, the pipeline executes transmit instructions retrieved from the instruction RAM 501 . Otherwise, when the associated wait counter 508 is not greater than zero, the processing continues using the program counter for the particular protocol handler 302 .
- the fetch controller 502 operates to switch its processing to a different one of the protocol handlers 302 with each clock cycle by selecting the program counter 506 for that protocol handler 302 .
- the protocol handlers 302 are serviced by the filter processor 500 in a round robin fashion. Stated another way, each frame that is received or transmitted resets the context of the filter processor 500 for that port. For example, in the case in which the MAC circuitry 300 supports eight protocol handlers, the fetch controller 502 will sequence through each of the program counters 506 (one for each of the protocol handlers 302 ) to effectively service each of the protocol handlers one clock cycle out of every eight clock cycles.
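The round robin servicing described above can be illustrated with the following minimal Python sketch, given by way of example only. The class name and the simplification of advancing each program counter at fetch time are assumptions of the sketch; in the described pipeline the program counter is updated in a later stage.

```python
class RoundRobinFetch:
    """Toy fetch controller that services one port per clock cycle,
    keeping an independent program counter for each port."""

    def __init__(self, num_ports=8):
        self.program_counters = [0] * num_ports
        self.port = 0  # port serviced on the next clock cycle

    def tick(self):
        """Advance one clock cycle; return (port, program counter)."""
        port = self.port
        pc = self.program_counters[port]
        # Simplification: assume sequential flow and bump the PC here.
        self.program_counters[port] = pc + 1
        self.port = (port + 1) % len(self.program_counters)
        return port, pc


fetcher = RoundRobinFetch(num_ports=8)
fetched = [fetcher.tick() for _ in range(17)]
```

With eight ports, each port appears once in every eight consecutive entries of `fetched`, so no port can monopolize the execution unit.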
- the first stage (fetch stage) of the filter processor 500 uses two clock cycles, and the remaining stages use a single clock cycle.
- the first stage requires two clocks to complete because the instruction RAM 501 contains an address register so that the first clock cycle selects one of eight (8) receive or transmit program counters and during the second clock cycle the appropriate instruction is read from the instruction RAM 501 .
- the appropriate instruction that is retrieved from the instruction RAM 501 is latched in a fetch instruction word (I-word) register 510 . Additionally, a port number is latched in a port register 512 , a valid indicator is latched in a valid register 514 , receive/transmit indicator is stored in a receive/transmit register (RX/TX) 516 , and a program counter is stored in a program counter register 518 .
- a destination address, source-one (S1) address, and source-two (S2) address calculations are performed by a first address calculation circuit 520 .
- Both S1 and S2 are obtained from an instruction, where S1 is an immediate value within the instruction format, and S2 includes a position in RX FIFO 310 , a variable for a variable in the variable RAM 320 and a relative address adjustment within the instruction format.
- the first address calculation circuit 520 produces a destination address 522 , a source-one address 524 , and a source-two address 526 , all of which are supplied to the next stage.
- the destination address 522 is also supplied to a stalling circuit 528 which produces a stall signal 530 that is supplied to the fetch controller 502 .
- the stall signal 530 causes the pipeline to hold its current state until the stall condition is resolved.
- a carry-through link 532 carries through this stage other portions of data from the instruction that are needed in subsequent stages.
- the operand fetch stage of the filter processor 500 also includes a second address calculation circuit 534 that calculates a filter variable address 554 , a FIFO address 552 , and a register address 548 .
- the filter variable address 554 is supplied to a variable storage device, the FIFO address is supplied to a FIFO device, and the register address is supplied to a control register.
- the variable storage device may be the filter variables RAM 318
- the FIFO device may be the transmit and receive FIFOs 328 , 310
- the control register may be within the framing logic 336 .
- the operand fetch stage generates write stage addresses.
- this stage requires two clock cycles to complete, since data from the FIFOs 310 , 328 and the filter variables RAM 318 is delayed by one clock cycle due to address registers in the implementing RAMs.
- because instruction decoding by the decode stage is performed in parallel with the second clock of this stage, this stage is treated as requiring only a single clock cycle.
- the operand fetch stage also includes logic 536 that combines the contents of the port register 512 , the valid register 514 and the receive/transmit register 516 , and produces a combined context indicator.
- an operand-fetch stage register 538 stores the carry-through data 532 and the addresses produced by the first address calculation circuit 520 .
- the context indicator from the logic 536 is stored in a register 540 and the associated program counter is stored in the program counter register 542 .
- a multiplexer 544 receives an immediate value 546 from the operand-fetch stage register 538 and possibly an operand 548 from the control register. Depending upon the type of instruction, the multiplexer 544 selects one of the immediate value 546 and the operand 548 as the output.
- a multiplexer 550 receives the possibly retrieved operands from the control register, the FIFO device, and the variable RAM over links 548 , 552 , and 554 . The multiplexer 550 selects one of these input operands as its output operand.
- the merge multiplexer 556 operates to merge the operands retrieved from the FIFO device and the variable RAM.
- An aligner 558 aligns the output operand from the multiplexer 550
- an aligner 560 aligns the output from the multiplexer 544 .
- An alignment controller 562 operates to control the merge multiplexer 556 , the aligner 558 , and the aligner 560 based on address signals from the operand-fetch stage register.
- a branch target circuit 564 operates to produce a branch target in certain cases.
- a decode stage register 566 stores the aligned values from the aligners 558 and 560 , any mask or function produced by a mask and function circuit 565 , the merged operand from the merge multiplexer 556 , the branch target, and carry through data from the operand-fetch stage register 538 .
- the accompanying context indicator is stored in the context register 568
- the accompanying program counter is stored in a program counter register 570 .
- an arithmetic logic unit (ALU) 572 receives input values 574 , 576 , and 578 .
- the input value 574 is provided (via the decode stage register 566 ) by the aligner 560
- the input value 576 is provided by the mask and function circuit 565
- the input value 578 is provided by the aligner 558 .
- the ALU 572 produces an output value 580 based on the input values 574 , 576 and 578 .
- the output value 580 and a merged operand 582 (supplied via the merge multiplexer 556 ) are supplied to a bit level multiplexer 584 which outputs a masked output value.
- the bit level multiplexer 584 is controlled in accordance with the mask via link 586 .
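For illustration, a bit level multiplexer of this kind can be expressed as a per-bit selection between two words. In the Python sketch below, the mask polarity (a set bit selects the ALU output, a clear bit preserves the merged operand), the function name, and the 64-bit default width are assumptions and are not stated in the description above.

```python
def bit_level_mux(alu_output, merged_operand, mask, width=64):
    """Select each bit from `alu_output` where `mask` is 1 and from
    `merged_operand` where `mask` is 0 (assumed polarity), so that
    bits outside the active field are preserved."""
    full = (1 << width) - 1
    return ((alu_output & mask) | (merged_operand & ~mask)) & full


# Example: only the low byte (the active field) takes the ALU result.
masked = bit_level_mux(alu_output=0xAAAA, merged_operand=0x5555, mask=0x00FF)
```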
- the execution stage includes a 64-bit ALU that can perform ADD, SUBTRACT, OR, XOR, and AND operations.
- the execution stage also generates Boolean outputs for comparison operations.
- the program counter is written in this stage. The program counter is either incremented (no branch or branch not taken) or loaded (branch taken).
- the execution stage also includes a multiplexer 588 that receives as inputs the branch target over a link 590 and the associated program counter after being incremented by one (1) by adder 592 .
- the multiplexer 588 selects one of its inputs in accordance with a control signal produced by a zero/carry flag logic 593 that is coupled to the ALU 572 and the multiplexer 588 .
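The program counter update performed in the execution stage might be sketched as follows, by way of example only. The function and parameter names are hypothetical, and wrapping the program counter to the 512-entry instruction space is an assumption of the sketch.

```python
def next_program_counter(pc, branch_target, is_branch, condition_met,
                         pc_mask=0x1FF):
    """Return the branch target when a branch is taken; otherwise
    return the program counter incremented by one."""
    if is_branch and condition_met:
        return branch_target & pc_mask
    return (pc + 1) & pc_mask
```

Here `condition_met` stands in for the control signal derived from the zero and carry flags of the ALU.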
- the mask (via the link 586 ) and the resulting value from the bit level multiplexer 584 are stored in an execute stage register 594 .
- the context indicator is carried through this stage and stored in a context latch 596 .
- an aligner 597 aligns the masked output value from the ALU 572 to produce write data.
- the aligner 597 is controlled by the mask via a link 598 .
- the link 598 also supplies the mask to a write address calculation circuit 599 that produces write addresses for the variable RAM, the FIFO devices, and the control register.
- the write stage then writes the write data 600 to one of the FIFOs 310 , 328 , filter variable RAM 318 , or control registers.
- the final result of receive frame processing is both the appropriate destination information for the frame as well as a copy/reject indication for the receiver layer of the protocol handler. In the case of token-ring, this information is used to set the AR and FC bits correctly. How quickly instructions execute affects both functions. On the system side, if instructions are still executing in order to forward the current frame, any following frame will accumulate in the receive FIFO 310 up to a limit of 32 bytes. If the 32nd byte is received before the previous frame finishes instruction execution, the frame will be dropped automatically. For token-ring applications, the copy/reject decision should be completed by the time the FS is received.
- the final result of transmit frame processing is deciding whether or not the frame should actually be transmitted on the wire or dropped. Additionally, for level-3 switching, transmit processing will replace the destination address (DA) with information from a translation table.
- Up to 512 instructions may be used to process a frame. Instruction execution begins at address 0 for receive frames, and begins at a programmable address for transmit frames. Each instruction is capable of extracting fields from the frame and storing them in a 64 byte variable space. Likewise, frame fields can be compared against immediate values and the results of comparisons stored in variables. Lastly, fields can be extracted, looked up in a CAM and the CAM results stored in a variable.
- the microprocessor 228 can set port specific configuration parameters (VRING membership numbers, source routing ring and bridge numbers, etc.) in the variable memory as well.
- Transmit side filtering will affect the minimum IPG the switch will be able to transmit with because the frame will have to be accumulated and held in the transmit FIFO 328 until processing has finished. Additionally, the transmit side filtering will be limited to the depth of the FIFO (128 bytes).
- transmit side filtering can be executed whenever receive instructions are not being executed. This should yield wire speed performance for any half-duplex medium. For more performance, a second execution pipeline together with another read port on the instruction RAM could be added.
- FIG. 6A is a block diagram of an instruction selection circuit 600 according to an embodiment of the invention.
- the instruction selection circuit 600 represents an implementation of the fetch controller 502 , the program counters 506 , and the wait counters 508 illustrated in FIG. 5.
- the instruction selection circuit 600 includes a port counter 602 that increments a counter to correspond to the port number currently serviced by the filter processor 500 . For example, if a frame processing apparatus is servicing eight (8) ports, then the port count repeatedly counts from zero (0) to seven (7). The port count produced by the port counter 602 is forwarded to port multiplexers 604 and 606 . The port multiplexer 604 selects one of a plurality of transmit program counters (Tx PC) 608 in accordance with the port count. The port multiplexer 606 selects one of a plurality of receive program counters (Rx PC) 610 . The instruction selection circuit 600 includes one transmit program counter (Tx PC) and one receive program counter (Rx PC) for each of the ports.
- a port multiplexer 606 selects one of the receive program counter (Rx PC) 610 in accordance with the port count supplied by the port counter 602 .
- the output of the port multiplexers 604 and 606 are supplied to a transmit/receive multiplexer (Tx/Rx MUX) 612 .
- the output of the transmit/receive multiplexer 612 is forwarded to the instruction RAM 501 to select the appropriate instruction for the particular port being serviced during a particular clock cycle.
- the transmit and receive program counters 608 and 610 also receive a new program count (NEW PC) from later stages of the filter processor 500 in the case in which the program counter for a particular port is altered due to a branch instruction or the like.
- the instruction selection circuit 600 includes one wait counter (WAIT) 616 for each of the receive ports, and a port multiplexer 614 that selects one of the plurality of wait counters (WAIT) 616 in accordance with the port count from the port counter 602 .
- the particular wait counter 616 that is selected by the port multiplexer 614 is supplied to a transmit/receive determining unit 618 .
- a transmit/receive determining unit 618 supplies a control signal to the transmit/receive multiplexer 612 such that the transmit/receive multiplexer 612 outputs the transmit program counter (Tx PC) when the selected wait counter is greater than zero (0), and otherwise outputs the receive program counter (Rx PC).
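The selection performed by the port multiplexers and the transmit/receive multiplexer 612 can be illustrated with a short Python sketch. The function name and the use of plain lists indexed by port number are assumptions made for the example.

```python
def select_program_counter(port, tx_pcs, rx_pcs, wait_counters):
    """Return ("tx", Tx PC) for the port when its wait counter is
    greater than zero; otherwise return ("rx", Rx PC)."""
    if wait_counters[port] > 0:
        return "tx", tx_pcs[port]
    return "rx", rx_pcs[port]


# Example with eight ports: port 2 is waiting for receive data.
tx_pcs = [0x100] * 8
rx_pcs = [0x004] * 8
waits = [0, 0, 3, 0, 0, 0, 0, 0]
selected_waiting = select_program_counter(2, tx_pcs, rx_pcs, waits)
selected_ready = select_program_counter(5, tx_pcs, rx_pcs, waits)
```

This reflects the idea that the clock cycles of a port whose receive instruction is waiting for frame data can be used for transmit side filtering instead of being wasted.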
- FIG. 6B is a diagram 622 illustrating the context switching utilized by a filter processor according to the invention.
- a five (5) stage pipeline operates to process instructions for each of the various ports.
- the allocation of the processing is performed on a round-robin basis for each port on each clock cycle. For example, as illustrated in the diagram 622 , the port number is incremented on each clock cycle (CK), and the initial port is eventually returned to, at which point the next instruction (whether for transmit or receive processing) for that port is processed.
- the pipeline of the filter processor 500 need not stall to wait for currently executing instructions to complete when there are dependencies with subsequent instructions for the same port. For example, in FIG. 6B, it is not until eight (8) clock cycles (CLK 9 ) later that the next instruction (I 1 ) is fetched by the filter processor for the port 0 which last processed an instruction (I 0 ) during clock 1 (CLK 1 ).
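The timing argument above can be checked with a small Python sketch, offered under simplifying assumptions: eight ports, a fetch for port 0 on clock 1, and a fixed pipeline latency of five clocks (one clock per stage). The constant and function names are hypothetical.

```python
NUM_PORTS = 8         # ports serviced in round-robin order
PIPELINE_LATENCY = 5  # clocks from fetch through write (assumed)


def fetch_clock(port, n):
    """Clock (1-based) on which instruction n (0, 1, ...) of a port
    is fetched."""
    return 1 + port + n * NUM_PORTS


def completion_clock(port, n):
    """Last clock occupied by instruction n of a port."""
    return fetch_clock(port, n) + PIPELINE_LATENCY - 1


def needs_stall(port, n):
    """True if instruction n + 1 of the same port would be fetched
    before instruction n has left the pipeline."""
    return fetch_clock(port, n + 1) <= completion_clock(port, n)
```

Under these assumptions, a same-port dependency never requires a stall so long as the pipeline latency does not exceed the number of ports being serviced.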
- FIG. 7 is a block diagram of an address calculation circuit 700 according to an embodiment of the invention.
- the address calculating circuit 700 performs most of the operations performed by the first address calculating circuit 520 and the second address calculating circuit 534 illustrated in FIG. 5.
- the address calculation circuit 700 calculates the address of the operands in the storage devices (FIFOs, control registers, filter variables RAM).
- the address specified in the instruction being processed can be relative to a field in the frame (RIF or VLAN) and thus requires arithmetic operations. Additionally, the determined address must be checked against the current receive count for that port. If the requested data at that determined address has not yet arrived, the instruction must be canceled.
- the address calculation circuit 700 includes a base multiplexer 702 for outputting a base address for each of the ports, a relative multiplexer 704 for outputting a relative address for each of the ports, and a length multiplexer 706 for outputting a length of the frame.
- An adder 708 adds the relative address to a position provided in the instruction word (I-WORD) to produce an address for the storage device.
- a subtractor 710 implements the comparison by taking the result from the adder 708 and subtracting it from the length obtained from the length multiplexer 706 . If the output of the subtractor 710 is greater than zero (0) then the instruction is canceled; otherwise, the appropriate wait counter is set.
- An adder 714 adds the base address from the base multiplexer 702 with the address produced (bits 5 and 6 ) from the adder 708 . The resulting sum from the adder 714 produces a high address for the FIFO.
- the output from a decrementer device 716 causes a decrement operation to occur if bit 2 is zero (0).
- the output of the decrementer device 716 , regardless of whether or not it decrements, is a low address value for the FIFO.
- the forwarding tables 210 preferably include an external table RAM and an external content-addressable memory (CAM).
- FIG. 8 is a block diagram of a CAM and a table RAM for implementing forwarding tables 210 and associated interface circuitry illustrated in FIG. 2.
- FIG. 8 illustrates forwarding tables 802 as including a CAM 804 and a table RAM 806 .
- the MAC circuitry 300 or a portion thereof (e.g., the table interface 324 ), is coupled to the forwarding tables 802 .
- the portion of the MAC circuitry 300 illustrated in FIG. 8 includes a CAM/table controller 800 that represents the table interface 324 illustrated in FIG. 3.
- the CAM/table controller 800 communicates with the CAM 804 and the table RAM 806 through a data bus (DATA) and an address bus (ADDR), and controls the CAM 804 and the table RAM 806 using control signals (CNTL).
- the MAC circuitry 300 preferably includes a write multiplexer 808 that outputs write data to be stored in one of the storage devices from either the data bus (DATA) coupling the CAM/table controller 800 with the CAM 804 and the table RAM 806 or the write data line of the write stage of the filter processor 500 illustrated in FIG. 5.
- the frame processing apparatus 200 uses the CAM 804 for MAC level DA and SA processing as well as for RIF ring numbers and IP addresses.
- the table RAM 806 is used for destination information tables. In the case of multiple instances of the MAC circuitry 208 , the CAM 804 and the table RAM 806 can be shared among the instances.
- the CAM 804 is used to translate large fields to small ones for later use as a table index into the table RAM 806 . In all cases, the address of the match is returned and used as a variable or table index. The benefit of using the CAM 804 is to preserve the associated data for performing wider matches.
- the table below summarizes typically occurring lookups:
  Match Word | Used For
  48 bit DA + 12 bit VRING/Bridge group | L2 frame destination determination
  48 bit SA | Address learning
  12 bit Destination Ring Number | Source route destination determination
  32 bit IP add. + 12 bit VRING/Bridge group | L3 frame destination determination
- Each lookup also includes a 2, 3, or 4 bit field that keys what type of data (e.g., MAC layer Addresses, IP Addresses) is being searched. This allows the CAM 804 to be used to store different types of information.
- the microprocessor 228 must carefully build destination tables cognizant of where data lands in the CAM 804 since match addresses are used as indexes as opposed to associated data.
- the size of a table entry is programmable but must be a power of 2 and at least 8 bytes (i.e., 8, 16, 32 bytes).
- the filter processor makes no assumptions on the contents of an entry. Rather, lookup instructions can specify that a given amount of data be transferred from the table to internal variables.
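The relationship between the CAM and the table RAM can be illustrated with a toy software model. This is a sketch under stated assumptions, not the hardware: the class name `ForwardingTables`, the dictionary standing in for the CAM, and the policy of assigning match addresses in insertion order are all hypothetical. What it does take from the text is that the CAM returns a match address, that this address indexes the table RAM, and that the entry size must be a power of 2 of at least 8 bytes.

```python
class ForwardingTables:
    """Toy model: a CAM returns a match address that indexes a table RAM."""

    def __init__(self, entry_size=16):
        # Entry size must be a power of 2 and at least 8 bytes.
        if entry_size < 8 or entry_size & (entry_size - 1):
            raise ValueError("entry size must be a power of 2 and >= 8 bytes")
        self.entry_size = entry_size
        self.cam = {}                  # match word -> match address
        self.table_ram = bytearray()   # entries stored back to back

    def add(self, match_word, entry):
        """Store a match word and its table entry; return the match address."""
        if len(entry) != self.entry_size:
            raise ValueError("entry has the wrong size")
        if match_word in self.cam:
            raise ValueError("match word already present")
        address = len(self.cam)        # hypothetical placement policy
        self.cam[match_word] = address
        self.table_ram += entry
        return address

    def lookup(self, match_word):
        """Return (match address, table entry) or None when there is no match."""
        address = self.cam.get(match_word)
        if address is None:
            return None
        start = address * self.entry_size
        return address, bytes(self.table_ram[start:start + self.entry_size])
```

The model shows why the tables must be built with care: the table entry is found purely from where the match word landed in the CAM, so the CAM position and the table RAM position have to be kept in step.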
- the table RAM 806 holds destination information for properly switching frames between ports. It also can include substitute VLAN information for transforming between tagged and untagged ports as well as MAC layer DA and RIF fields for layer-3 switching.
- each of the MAC circuitry 208 structures includes strapping options to specify master or slave operation.
- the master controls arbitration amongst all the MAC circuitry 208 structures for access to the CAM 804 and the table RAM 806 .
- the master supports access to the external memories (e.g., processor RAM 232 ) via the microprocessor 228 .
- the frame processing apparatus 200 could provide each of the MAC circuitry 208 structures its own CAM and table RAM, in which case the strapping options are not needed.
- the CAM/table controller 800 accepts lookup requests from the pipeline of the filter processor and generates the appropriate cycles to the CAM 804 . Multiple protocol handlers can share the single CAM 804 .
- the pipeline of the filter processor 312 continues to execute while the CAM search is in progress. When the CAM cycle finishes, the result is automatically written into the filter variables RAM 318 . No data dependencies are automatically checked.
- the filter processing software is responsible for proper synchronization (e.g., a status bit is available indicating lookup completion).
- FIG. 9 is a block diagram of an aligner 900 according to an embodiment of the invention.
- the aligner 900 represents an implementation of the aligners illustrated in FIG. 5, in particular the aligner 560 .
- the aligner 900 includes a 4-to-1 multiplexer 902 and a 2-to-1 multiplexer 904 .
- upon receiving an input signal of 64 bits (63:0), the 4-to-1 multiplexer 902 receives four different alignments of the four bytes of the input signal. The selected alignment is determined by a rotate signal (ROTATE).
- the 2-to-1 multiplexer 904 receives two different alignments.
- One alignment is directly from the output of the 4-to-1 multiplexer 902 , and the other alignment is rotated by two bytes.
- the 2-to-1 multiplexer 904 then produces an output signal (OUT) by selecting one of the two alignments in accordance with the rotate signal (ROTATE).
- FIG. 10 is a block diagram of a switching circuit 1000 .
- the switching circuit 1000 is a more detailed diagram of the switch circuitry 218 of FIG. 2.
- the switching circuit 1000 includes a frame controller and DMA unit 1002 , a MAC interface controller 1004 , a frame buffer controller 1006 , a queue manager 1008 , a buffer manager 1010 , an ATM interface 1012 , and a CPU interface 1014 .
- the frame controller and DMA unit 1002 controls the overall management of the switching operation.
- the queue manager 1008 and the buffer manager 1010 respectively manage the queues and buffers of the output queues and buffer management information storage 224 via the bus 226 .
- the CPU interface 1014 enables the microprocessor 228 to interact with the output queues and buffer management information storage 224 , the frame buffer 214 , and the ATM interface 1012 . Attached hereto as part of this document is Appendix A containing additional information on exemplary instruction formats and instructions that are suitable for use by a filter processor according to the invention.
- This instruction would subtract one from a byte wide field on a byte boundary (no .a specified) that is 8 bytes into the IP header in the RxFIFO, write the modified field back and jump if the result is zero to location 65 .
- the time-to-live counter of an IP frame could be decremented in this fashion and a branch taken at zero (reject frame).
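The effect of this decrement-and-jump instruction on a time-to-live field can be sketched as follows. The function name and the way the program counter is modeled are hypothetical; the byte offset of 8 into the IP header and the jump location 65 come from the example in the text.

```python
def decrement_and_jump_if_zero(fifo, offset, pc, target):
    """Subtract one from a byte-wide field, write it back, and pick the next PC.

    fifo is a mutable bytearray standing in for the RxFIFO contents, with
    offset counted from the start of the IP header (a simplification).
    """
    fifo[offset] = (fifo[offset] - 1) & 0xFF   # byte-wide subtract, written back
    # Jump to the target when the result is zero; otherwise fall through.
    return target if fifo[offset] == 0 else pc + 1
```

With a header whose byte 8 holds 1, the field becomes 0 and the function returns 65, which corresponds to the reject path; with a larger time-to-live it returns the next sequential address.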
- the source operand may be any length from 1 to 32 bits. Only one source operand may come from the variable ram. That is, vN s1 AND vN s2 is not supported. If the source 1 operand is a variable an extra length bit is included allowing 64 bit logical operations.
- N byte offset of LSB in FIFO
- variable ram or register number for argument 2 rel adjust N for headers automatically or select variables or register as source
- Z variable ram target address (if zero, destination address is same as source 2).
- the magnitude result of the comparison cascaded with the previous result is stored in the variable addressed by Z.
- the source operand may be any length from 1 to 32 bits.
- the destination operand is automatically two bits wide.
- Instruction Format: Instruction Fields: # immediate value right justified.
- L length of operands in bits from 1 to 32.
- off bit offset in FIFO for non byte aligned fields.
- Compare if Equal, Store boolean Operation: (vZ) ⁇ ( ( [(fN)
- This instruction is intended as a precursor for complex filters.
- a collection of booleans may be created and then operated on simultaneously.
- the source operand may be any length from 1 to 32 bits.
- (gN)] - #) > 0) Assembler Syntax: cgtes #,fN,vZ or cgtes #,gN,vZ Description: The immediate value specified in the instruction is subtracted from the source operand which can come from either the FIFO ram, the variable ram or the registers. If the result is positive, the boolean at address Z in the variable ram is set true.
- This instruction is intended as a precursor for complex filters.
- a collection of booleans may be created and then operated on simultaneously.
- the source operand may be any length from 1 to 32 bits.
- the source operand may be any length from 1 to 32 bits when using immediate compares. For variable based compares the source operand may be up to 64 bits long.
- L length of operands in bits from 1 to 32.
- off bit offset in FIFO for non byte aligned fields.
- N byte offset of LSB in FIFO or variable or register number.
- new_PC new PC execution address after branch. Compare, Jump if Greater Than or Equal Operation: If( [(fN s2 )
- the source operand may be any length from 1 to 32 bits when using immediate compares. For variable based compares the source operand may be up to 64 bits long.
- N byte offset of LSB in FIFO or variable or register number.
- M byte address into variable ram for source argument.
- N byte offset of LSB in FIFO or variable or register number
- the source operand may be any length from 1 to 32 bits when using immediate compares. For variable based compares the source operand may be up to 64 bits long.
- L length of operands in bits from 1 to 32.
- off bit offset in FIFO for non byte aligned fields.
- N byte offset of LSB in FIFO or variable or register number.
- new_PC new PC execution address after branch. Compare, Jump if Not Equal Operation: If ( [(fN s2 )
- the source operand may be any length from 1 to 32 bits when using immediate compares. For variable based compares the source operand may be up to 64 bits long.
- L length of operands in bits from 1 to 32.
- off bit offset in FIFO for non byte aligned fields.
- N byte offset of LSB in FIFO or variable or register number.
- M byte address into variable ram for source argument.
- (gN s )] ⁇ # high ) then PC ⁇ new_PC or If !
- (gN)] - #) != 0) Assembler Syntax: cnes #,fN,vZ or cnes #,gN,vZ Description: The immediate value specified in the instruction is subtracted from the source operand which can come from either the FIFO ram, the variable ram or the registers. If the result is not zero, the boolean at address Z in the variable ram is set true.
- This instruction is intended as a precursor for complex filters.
- a collection of booleans may be created and then operated on simultaneously.
- the source operand may be any length from 1 to 32 bits.
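The compare-and-store instructions described above (cgtes and cnes) can be modeled as small functions that write a boolean, which a later step then combines. This is a minimal sketch: the list used for the boolean store and the helper `all_set` are hypothetical, and the sketch assumes the boolean is cleared when the condition fails, which the text does not state.

```python
def cgtes(immediate, operand, booleans, z):
    """Set booleans[z] true when (operand - immediate) is positive."""
    booleans[z] = (operand - immediate) > 0   # assumption: cleared otherwise

def cnes(immediate, operand, booleans, z):
    """Set booleans[z] true when (operand - immediate) is not zero."""
    booleans[z] = (operand - immediate) != 0  # assumption: cleared otherwise

def all_set(booleans, addresses):
    """Hypothetical combining step: true when every listed boolean is true."""
    return all(booleans[a] for a in addresses)
```

A complex filter would then be expressed as several such comparisons followed by one combined test over the collected booleans.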
- temparg [(fN)
- the magnitude result of the comparison is stored in the variable addressed by Z.
- the source operand may be any length from 1 to 32 bits.
- the destination operand is automatically two bits wide.
- Instruction Format: Instruction Fields: # immediate value right justified.
- L length of operands in bits from 1 to 32.
- off bit offset in FIFO for non byte aligned fields.
- the external table ram is accessed at entry FBASE + TRA (FBASE is a configuration register while TRA comes from the instruction).
- the destination mask and backplane destinations for the frame (fixed locations in the variable ram) are added to from the table entry.
- the table ram address may come from two aligned bytes of variable memory (shown as vM).
- the source operand may be any length from 1 to 32 bits.
- the external table ram is accessed at entry FBASE + TRA (FBASE is a configuration register while TRA comes from the instruction).
- the destination mask and backplane destinations for the frame (fixed locations in the variable ram) are loaded from the table entry.
- the table ram address may come from two aligned bytes of variable memory (shown as vM).
- the source operand may be any length from 1 to 32 bits.
- tempvar byte offset of LSB of 16 bit table address in variable memory
- Filter Compare Range and Add Destination Operation: tempvar : [(fN)
- the external table ram is accessed at entry FBASE + TRA (FBASE is a configuration register while TRA comes from the instruction).
- the destination mask and backplane destinations for the frame (fixed locations in the variable ram) are added to from the table entry.
- the table ram address may come from two aligned bytes of variable memory (shown as vM).
- the source operand may be any length from 1 to 16 bits.
- tempvar [(fN)
- the external table ram is accessed at entry FBASE + TRA (FBASE is a configuration register while TRA comes from the instruction).
- the destination mask and backplane destinations for the frame (fixed locations in the variable ram) are added to from the table entry.
- the table ram address may come from two aligned bytes of variable memory (shown as vM).
- the source operand may be any length from 1 to 16 bits.
- +00 vZ s Jump location.
- Its length can be any number of bytes from 1 to 8.
- This field is concatenated with an optional B field pulled from the variable ram.
- the B field length is automatically calculated to pad the lookup value to 8 bytes.
- the top 2, 3 or 4 bits (63 downto 62,61 or 60) are replaced with the key value specified in the instruction.
- This value is passed to the CAM together with the mask select.
- the match address from the CAM is stored in the variable ram at the selected destination. If no length is specified for the A field it is assumed to be 64 bits.
- the CAM result is used to index the external table ram.
- the destination mask and the BPDEST field are fetched from ram and added into the variable ram at the predefined address for this information.
- D Destination address for the table index returned from the CAM. Also used as the base for any bytes moved from the extended information fields of a table entry.
- Its length can be any number of bytes from 1 to 8.
- This field is concatenated with an optional B field also pulled from either the variable ram or FIFO.
- the B field length is automatically calculated to pad the lookup value to 8 bytes.
- the top 2, 3 or 4 bits (63 downto 62,61 or 60) are replaced with the key value specified in the instruction.
- This value is passed to the CAM together with the mask select.
- the match address from the CAM is stored in the variable ram at the selected destination. If no length is specified for the A field it is assumed to be 64 bits. Next the CAM result is used to index the external table ram.
- the destination mask and BPDEST0 and BPDEST1 fields are fetched from ram and loaded into the variable ram at the predefined address for this information.
- the variable ram entries for BPDEST2 and BPDEST3 are written to 0.
- Lengths that are not multiples of 8 will be padded to 8 bits.
- the length of the B field is based upon the A field.
- (vD) table[index].offset Assembler Syntax: load # i ,# o ,vD or load (vI),# o ,vD Description: The external table ram is accessed at a given index and the entries starting with the programmed offset are fetched and copied into either the FIFO or variable ram at the specified destination. The index may be either specified directly in the instruction or indirectly through a variable. Instruction Formats: Indirect table index Immediate table index # o - Offset from index at which to begin loading data. Valid values are 0 . . . 31.
- D len - Move count Number of bytes of extended data to move into variable memory location D (specified as the length of D in bytes) # i - Index into the table (represents address/16) I - variable memory location containing a 16 bit index into the table (represents address/16). Valid values are 0 . . . 65535.
- Q Relative information for the D address.
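The addressing used by the load instruction can be worked through in a short sketch. The text states that the index represents address/16 and that the offset ranges from 0 to 31; the function name `table_load` and the flat byte array standing in for the external table ram are assumptions made for illustration.

```python
def table_load(table_ram, index, offset, count):
    """Fetch `count` bytes from the table entry selected by `index`.

    The index is a 16-bit value representing address/16, so the byte
    address of the first loaded byte is index * 16 + offset.
    """
    if not 0 <= index <= 65535:
        raise ValueError("index must fit in 16 bits")
    if not 0 <= offset <= 31:
        raise ValueError("offset must be in the range 0..31")
    start = index * 16 + offset
    return bytes(table_ram[start:start + count])
```

For example, index 2 with offset 4 starts at byte address 36, so a three-byte load returns the bytes at addresses 36, 37 and 38.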
- v(destination mask) v(destination mask) OR tableram(#I
- the destination mask for this entry is or'd into the current mask in variable ram.
- the backplane destinations are stored in the variable ram starting with the first empty one. If none of the backplane destinations are empty data from the table may be lost. Also see loadd.
- Instruction Formats Indirect table index Immediate table index location D (specified as the length of D in bytes) # i - Index into the table (represents address/16). Valid values are 0 . . . 65535. I - variable memory location containing a 16 bit index into the table (represents address/16).
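The "add destination" behavior described above, where the entry's mask is or'd into the current mask and backplane destinations fill the first empty slots, might be modeled as below. The sketch assumes four backplane destination slots and treats a value of 0 as "empty"; both are assumptions, as is the function name.

```python
def add_destination(dest_mask, bpdest, entry_mask, entry_bpdests):
    """Merge one table entry into the frame's destination information.

    Returns the new destination mask and the new list of backplane slots.
    """
    dest_mask |= entry_mask               # OR the entry mask into the current mask
    slots = list(bpdest)
    for value in entry_bpdests:
        if value == 0:                    # assumption: 0 marks an unused destination
            continue
        if 0 not in slots:                # no empty slot left: this data is lost
            break
        slots[slots.index(0)] = value     # store in the first empty slot
    return dest_mask, slots
```

When every slot is already occupied the entry's backplane destinations are silently dropped, which mirrors the warning in the text that data from the table may be lost.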
- This field is concatenated with an optional B field pulled from either the variable ram or FIFO.
- the B field length is automatically calculated to pad the lookup value to 8 bytes.
- the top 2, 3 or 4 bits (63 downto 62,61 or 60) are replaced with the key value specified in the instruction.
- This value is passed to the CAM together with the mask select.
- the match address from the CAM is stored in the variable ram at the selected destination. If no length is specified for the A field it is assumed to be 64 bits. It is always padded to at least 4 bytes.
- Instruction Format Instruction Fields: mask - Mask select for CAM lookups key - Key bits (left aligned for smaller than 4 bit keys) klen - Key length.
- L+ - 6th length bit for the A field length, allowing lengths up to 64 bits. B - B key field address.
- the address for the B field of the key if used.
- the length of the B field is based upon the A field.
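The construction of the 64-bit CAM lookup value can be sketched from the description: an A field of 1 to 8 bytes is padded to 8 bytes with a B field, and the top 2, 3 or 4 bits are then replaced by the key. The byte order (A field in the most significant bytes, big-endian) is an assumption of this sketch, as is the function name `build_cam_key`.

```python
def build_cam_key(a_field: bytes, b_source: bytes, key: int, klen: int) -> int:
    """Build a 64-bit CAM match word from an A field, a B field and a key."""
    if not 1 <= len(a_field) <= 8:
        raise ValueError("A field must be 1 to 8 bytes long")
    if klen not in (2, 3, 4):
        raise ValueError("key length must be 2, 3 or 4 bits")
    if not 0 <= key < (1 << klen):
        raise ValueError("key does not fit in klen bits")
    b_len = 8 - len(a_field)              # B length pads the value to 8 bytes
    if len(b_source) < b_len:
        raise ValueError("not enough B field bytes to pad to 8 bytes")
    word = int.from_bytes(a_field + b_source[:b_len], "big")  # assumed ordering
    keep_mask = (1 << (64 - klen)) - 1    # everything below the key bits
    return (key << (64 - klen)) | (word & keep_mask)
```

As a usage example, a 48-bit destination address followed by two bytes of VRING or bridge group data, with a 2-bit key of binary 10, yields a match word whose bits 63 and 62 carry the key while the remaining 62 bits carry the address data.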
- (gN s2 )] ⁇ [(fN s2 )
- (gN s2 )] OR (vM s1 ) or (vZ d ) ⁇ [(fN s2 )
- (gN s2 )] ⁇ [(fN s2 )
- (gN s2 )] OR # or (vZ d ) ⁇ [(fN s2
- the source operand may be any length from 1 to 32 bits. Only one source operand may come from the variable ram. That is, vN s1 OR vN s2 is not supported. If the source 1 operand is a variable an extra length bit is included allowing 64 bit logical operations.
- the source operand may be any length from 1 to 32 bits. Only one source operand may come from the variable ram. That is, vN s2 - vN s1 is not supported. If the source 1 operand is a variable an extra length bit is included allowing 64 bit logical operations.
- N byte offset of LSB in FIFO
- variable ram or register number for argument 2 rel adjust N for headers automatically or select variables or register as source
- Z variable ram target address (if zero, destination address is same as source 2).
- the result is stored back into operand 2. If the result is zero the PC is replaced with the new_PC field of the instruction. Otherwise execution continues with the next instruction. All jumps are relative, with a range of -128 to 127 instructions from the current PC.
- N byte offset of LSB in FIFO
- variable ram or register number for argument 2 rel adjust N for headers automatically or select variables or register as source
- Z variable ram target address (if zero, destination address is same as source 2).
- the result is stored back into operand 2. If the result is non-zero the PC is replaced with the new_PC field of the instruction. Otherwise execution continues with the next instruction. All jumps are relative, with a range of -128 to 127 instructions from the current PC.
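A relative jump range of 128 instructions backward to 127 forward is what an 8-bit two's-complement displacement provides. The sketch below assumes that encoding, which the text implies but does not state; the function name is hypothetical.

```python
def branch_target(pc, displacement_field):
    """Compute the jump target from an 8-bit two's-complement displacement."""
    if not 0 <= displacement_field <= 0xFF:
        raise ValueError("displacement field must be an 8-bit value")
    # Sign-extend: values with bit 7 set represent negative displacements.
    if displacement_field & 0x80:
        displacement = displacement_field - 0x100
    else:
        displacement = displacement_field
    return pc + displacement
```

Under this assumption a field of 0xFF jumps one instruction back, 0x7F jumps 127 instructions forward, and 0x80 jumps 128 instructions back.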
- (gN s2 )] ⁇ [(fN s2 )
- (gN s2 )] XOR (vM s1 ) or (vZ d ) ⁇ [(fN s2 )
- (gN s2 )] ⁇ [(fN s2 )
- (gN s2 )] XOR # or (vZ d ) ⁇ [(fN s2 )
- the source operand may be any length from 1 to 32 bits. Only one source operand may come from the variable ram. That is, vN s1 XOR vN s2 is not supported. If the source 1 operand is a variable an extra length bit is included allowing 64 bit logical operations.
- the mov instruction is intended to be used to open holes in a frame for inserting VLAN tags or RIF fields or to close holes in a frame for the reverse transformations. The instruction is executed as an OR with the value #0. The only difference is that the mov instruction allows 64 bit lengths and the destination may be either a different FIFO address (normal instructions only write to the FIFO at the same address as the source operand) or a variable address. The mov instruction is also limited to moving whole bytes.
- Example 1 Opening hole for VLAN insertion.
- Original FIFO contents Contents after move: Instruction Sequence:
  mov f3.L32,v60 ; move AC,FC,DA0 and DA1 into tail of variable ram
  mov f9.L48,f5 ; move DA2-DA5, SA0,SA1 to base of FIFO
  ; this move limited to 6 bytes because of address
  mov f13.L32,f9 ; move SA2-SA5
  or
  mov f3.L32,v60 ; move AC,FC,DA0 and DA1 into tail of variables.
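The instruction sequence in Example 1 can be traced with a small byte-level model. The sketch assumes that the address in each operand is the byte offset of the last (least significant) byte of the field, which is consistent with the comments in the listing (for example, f3.L32 covering AC, FC, DA0 and DA1); the Python `mov` helper and the sample frame bytes are hypothetical.

```python
def mov(src, src_lsb, length_bits, dst, dst_lsb):
    """Copy a whole-byte field between two byte arrays.

    src_lsb and dst_lsb are the offsets of the last byte of the field
    (an assumed reading of the operand addressing).
    """
    n = length_bits // 8
    if length_bits % 8 or not 1 <= n <= 8:
        raise ValueError("mov handles whole bytes, up to 64 bits")
    data = bytes(src[src_lsb - n + 1:src_lsb + 1])   # snapshot before writing
    dst[dst_lsb - n + 1:dst_lsb + 1] = data

# Sample frame: AC, FC, DA0-DA5, SA0-SA5, then payload.
fifo = bytearray(b"AF" + b"012345" + b"abcdef" + b"PAYLOAD")
variables = bytearray(64)

mov(fifo, 3, 32, variables, 60)   # AC, FC, DA0, DA1 -> tail of variable ram
mov(fifo, 9, 48, fifo, 5)         # DA2-DA5, SA0, SA1 -> base of FIFO
mov(fifo, 13, 32, fifo, 9)        # SA2-SA5 shifted down by four bytes
```

After the three moves, bytes 10 to 13 of the FIFO no longer hold live header data, leaving a four-byte hole directly after the source address where a VLAN tag could be written.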
- switchda ; if block bit is set or must have RIF, reject
  cjne 0, BLOCKorNORIF, reject
  docommon: sti halt, EOFREG; ; EOF now causes frame to go w/ last status
  ; $$SS$ insert user filters here
  ; as last check before halting, look if SA is unknown and CPU
  ; is not getting a copy of the frame. If so, send a copy to the
  ; unknown SA queue.
Abstract
Description
- A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
- 1. Field of the Invention
- The present invention relates to data communications networks and, more particularly, to switching data frames through data communications networks.
- 2. Description of the Related Art
- Frame processing is performed at nodes of networks, such as local area networks (LANs). By processing frames, the nodes are able to determine how to forward or switch frames to other nodes in the network.
- FIG. 1 is a block diagram of a conventional frame processing apparatus 100. The conventional frame processing apparatus 100 is suitable for use in a LAN, namely a token-ring network. The conventional frame processing apparatus 100 receives data frames from a plurality of ports associated with the LAN. The data frames are processed by the conventional frame processing apparatus 100 to effectuate a switching operation. In particular, data frames received from each of the ports are processed such that they are either dropped or forwarded to other ports being serviced by the conventional frame processing apparatus 100.
- The conventional frame processing apparatus 100 includes physical layer interfaces physical layer interfaces
- Although the token-ring chip sets 110-116 could each couple to a data bus directly, to improve performance the conventional frame processing apparatus 100 may include bus interface circuits bus interface circuits data bus 122. The bus interface circuits 118-120 transmit a burst of data over the data bus 122 for storage in a frame buffer 124. By transmitting the data in bursts, the bandwidth of the data bus 122 is able to be better utilized. A frame buffer controller 126 controls the storage and retrieval of data to and from the frame buffer 124 by way of the bus interface circuits control lines frame buffer 124 stores one or more data frames that are being processed by the conventional frame processing apparatus 100.
- An isolation device 134 is used to couple a bus 136 for a microprocessor 138 to the data bus 122. The microprocessor 138 is also coupled to a microprocessor memory 140 and a frame buffer controller 126. The microprocessor 138 is typically a general purpose microprocessor programmed to perform frame processing using the general instruction set for the microprocessor 138. In this regard, the microprocessor 138 interacts with data frames stored in the frame buffer 124 to perform filtering to determine whether to drop data frames or provide a switching destination for the data frames. In addition to being responsible for frame filtering, the microprocessor 138 is also responsible for low level buffer management, control and setup of hardware and network address management.
- Conventionally, as noted above, the microprocessors used to perform the frame processing are primarily general purpose microprocessors. Recently, a few specialized microprocessors have been built to be better suited to frame processing tasks than are general purpose microprocessors. An example of such a microprocessor is the CXP microprocessor produced by Bay Networks, Inc. In any event, these specialized microprocessors are separate integrated circuit chips that process frames already stored into a frame buffer.
- One problem with conventional frame processing apparatuses, such as the conventional frame processing apparatus 100 illustrated in FIG. 1, is that the general purpose microprocessor is not able to process data frames at high speed. As a result, the number of ports that the conventional frame processing apparatus can support is limited by the speed at which the general purpose microprocessor can perform the filtering operations. The use of specialized microprocessors is an improvement but places additional burdens on the bandwidth requirements of the data paths. Another problem with the conventional frame processing apparatus is that the data path to and from the physical layer and the frame buffer during reception and transmission of data has various bottlenecks that render the conventional hardware design inefficient. Yet another disadvantage of the conventional frame processing apparatus is that it requires a large number of integrated circuit chips. For example, with respect to FIG. 1, the bus interface circuits
- Thus, there is a need for improved designs for frame processing apparatuses so that frame processing for a local area network can be rapidly performed with fewer integrated circuit chips.
- Broadly speaking, the invention is an improved frame processing apparatus for a network that supports high speed frame processing. The frame processing apparatus uses a combination of fixed hardware and programmable hardware to implement network processing, including frame processing and media access control (MAC) processing. Although generally applicable to frame processing for networks, the improved frame processing apparatus is particularly suited for token-ring networks and ethernet networks.
- The invention can be implemented in numerous ways, including as an apparatus, an integrated circuit and network equipment. Several embodiments of the invention are discussed below.
- As an apparatus for filtering data frames of a data communications network, an embodiment of the invention includes at least: a plurality of protocol handlers of the data communications network, each of the protocol handlers being associated with a port of the data communications network; and a pipelined processor to filter the data frames received by the protocol handlers as the data frames are being received. In one embodiment, the pipelined processor provides a uniform latency by sequencing through the protocol handlers with each clock cycle. Preferably, the apparatus is formed on a single integrated circuit chip.
- As an integrated circuit, an embodiment of the invention includes at least a plurality of protocol handlers, each of the protocol handlers corresponding to a different communications port; a receive buffer for temporarily storing data received from the protocol handlers; framing logic that controls the reception and transmission of data frames via the protocol handlers; and a filter processor to filter the data frames received by the protocol handlers such that certain of the data frames are dropped and other data frames are provided with a switching destination. Optionally, the integrated circuit further includes a transmit buffer for temporarily storing outgoing data to be supplied to said protocol handlers, and the filter processor further operates to filter the data frames being supplied to said protocol handlers for transmission.
- As network equipment that couples to a network for processing data frames transmitted in the network, an embodiment of the invention includes: a network processing apparatus for processing data frames received and data frames to be transmitted, a frame buffer to store the data frames received that are to be switched to other destinations in the network, and switch circuitry to switch the data frames in said frame buffer to the appropriate one or more protocol handlers. The network processing apparatus includes at least a plurality of protocol handlers, each of said protocol handlers corresponding to a different communications port of the network; and a frame processing apparatus to process the data frames received from said protocol handlers and the data frames to be transmitted via said protocol handlers.
- The advantages of the invention are numerous. One advantage of the invention is that a frame processing apparatus is able to process frames faster, thus allowing the frame processing apparatus to service more ports than conventionally possible. Another advantage of the invention is that the frame processing apparatus according to the invention requires significantly fewer integrated circuit chips per port serviced.
- Other aspects and advantages of the invention will become apparent from the following detailed description, taken in conjunction with the accompanying drawings, illustrating by way of example the principles of the invention.
- The present invention will be readily understood by the following detailed description in conjunction with the accompanying drawings, wherein like reference numerals designate like structural elements, and in which:
- FIG. 1 is a block diagram of a conventional frame processing apparatus;
- FIG. 2 is a block diagram of a frame processing apparatus according to an embodiment of the invention;
- FIG. 3A is a block diagram of MAC circuitry according to an embodiment of the invention;
- FIG. 3B is a block diagram of a protocol handler according to an embodiment of the invention;
- FIG. 4 is a block diagram of a filter processor according to an embodiment of the invention;
- FIG. 5 is a block diagram of a filter processor according to another embodiment of the invention;
- FIG. 6A is a block diagram of an instruction selection circuit according to an embodiment of the invention;
- FIG. 6B is a diagram illustrating the context switching utilized by a filter processor according to the invention;
- FIG. 7 is a block diagram of an address calculation circuit according to an embodiment of the invention;
- FIG. 8 is a block diagram of a CAM and a table RAM for implementing forwarding tables and associated interface circuitry illustrated in FIG. 2;
- FIG. 9 is a block diagram of an aligner according to an embodiment of the invention; and
- FIG. 10 is a block diagram of a switching circuit.
- The invention relates to an improved frame processing apparatus for a network that supports high speed frame processing. The frame processing apparatus uses a combination of fixed hardware and programmable hardware to implement network related processing, including frame processing and media access control (MAC) processing. Although generally applicable to frame processing for networks, the improved frame processing apparatus is particularly suited for token-ring networks and ethernet networks.
- Embodiments of the invention are discussed below with reference to FIGS. 2-10. However, those skilled in the art will readily appreciate that the detailed description given herein with respect to these figures is for explanatory purposes as the invention extends beyond these limited embodiments.
- FIG. 2 is a block diagram of a
frame processing apparatus 200 according to an embodiment of the invention. The frame processing apparatus 200 includes physical layer interfaces 202-206. Each of the physical layer interfaces 202-206 is associated with a port of the frame processing apparatus 200, and each port is in turn coupled to a node of a network. The network may be a local area network (LAN). Examples of LANs include token-ring networks and ethernet networks. Each of the physical layer interfaces 202-206 also couples to media access controller (MAC) circuitry 208. The MAC circuitry 208 performs media access control operations and filtering operations on the data frames being processed by the frame processing apparatus 200. In one embodiment, the MAC circuitry 208 is itself an integrated circuit chip. The construction and operation of the MAC circuitry 208 are discussed in detail below with respect to FIGS. 3A-9. - The
MAC circuitry 208 couples to forwarding tables 210 by way of a table bus 212. The forwarding tables 210 store information, such as destination addresses, IP addresses, and VLAN or bridge group information, which is used by the MAC circuitry 208. Additional details on the forwarding tables 210 are provided below with reference to FIG. 8. - During reception, the
MAC circuitry 208 receives incoming data frames, and then filters and processes them. The processed data frames are then stored in a frame buffer 214. During transmission, the MAC circuitry 208 receives the processed data frames from the frame buffer 214, filters them, and forwards them to the appropriate nodes of the network. Hence, the MAC circuitry 208 is capable of performing both receive side filtering and transmit side filtering. - The
frame buffer 214 is coupled to the MAC circuitry 208 through a data bus 216. The data bus 216 also couples to switch circuitry 218. The data frames stored in the frame buffer 214 by the MAC circuitry 208 have normally been filtered by the MAC circuitry 208. The switch circuitry 218 is thus able to retrieve the data frames to be switched from the frame buffer 214 over the data bus 216. The switch circuitry 218 performs conventional switching operations, such as level-2 and level-3 switching. The switch circuitry 218 and the MAC circuitry 208 send and receive control signals over a control bus 220. A control bus 222 is also used to communicate control signals between the frame buffer 214 and the switch circuitry 218. The switch circuitry 218 is further described with respect to FIG. 10 below. - The
frame processing apparatus 200 further includes output queues and buffer management information storage 224. The output queues and buffer management information storage 224 is coupled to the switch circuitry 218 over a bus 226. The switch circuitry 218 monitors the output queues and buffer management information storage 224 to determine how to manage its switching operations. In addition, the frame processing apparatus 200 may further include an ATM port 227 that is coupled to the switch circuitry 218 and thus coupled to the frame buffer 214 and the output queues and buffer management information storage 224. - A
microprocessor 228 is also coupled to the switch circuitry 218 over bus 230 to assist with operations not directly associated with the reception and transmission of data frames. For example, the microprocessor 228 configures the MAC circuitry 208 during initialization, gathers statistical information, etc. The microprocessor 228 is coupled to a processor random-access memory (RAM) 232 over a processor bus 234. The processor RAM 232 stores data utilized by the microprocessor 228. The MAC circuitry 208 is also operatively coupled to the processor bus 234 by an isolation device 236 and an interconnect bus 238. - FIG. 3A is a block diagram of
MAC circuitry 300 according to an embodiment of the invention. The MAC circuitry 300, for example, may be the MAC circuitry 208 illustrated in FIG. 2. - The
MAC circuitry 300 includes a plurality of protocol handlers 302. The protocol handlers 302 couple to physical layer interfaces and individually receive and transmit data over the physical media of the network coupled to the physical layer interfaces. A received data bus 304 couples the protocol handlers 302 to an input multiplexer 306. The input multiplexer 306 is in turn coupled to a receive FIFO 310 through receive bus 308. Hence, data being received at one of the protocol handlers 302 is directed along a receive data path consisting of the received data bus 304, the input multiplexer 306, the receive bus 308, and the receive FIFO 310. - The
protocol handlers 302 preferably implement in hardware those features of the 802.5 specification for the MAC layer that need to be implemented in hardware; the remaining features of the MAC layer are left to software (i.e., hardware programmed with software). For example, the protocol handlers 302 incorporate hardware to perform full repeat path, token generation and acquisition, frame reception and transmission, priority operation, latency buffer and elasticity buffer. In addition, various timers, counters and policy flags are provided in the protocol handlers 302. The balance of the MAC layer functions are performed in software in other portions of the MAC circuitry 300 (i.e., by the filter processor) or by the microprocessor 228. - A
filter processor 312 is coupled to the receive FIFO 310 through a processor bus 314. The processor bus 314 is also coupled to an output multiplexer 316. The output multiplexer 316 is also coupled to a filter variables RAM 318 over a filter variables bus 320. The filter variables RAM 318 also couples to the filter processor 312 to provide filter variables to the filter processor 312 as needed. In one embodiment, the filter variables RAM 318 includes a receive filter variables RAM 318-1 for use by the filter processor 312 during receiving of frames and a transmit filter variables RAM 318-2 for use by the filter processor 312 during transmission of frames. - In order to accomplish sophisticated level-2 switching in hardware (i.e., with user level filters, bridge groups, VLANs, etc.) at wire speed as well as level-3 switching, significant amounts of frame processing must be performed by the
frame processing apparatus 200. Although frame processing could be implemented in hardwired logic, such an approach would be unreasonable given the complexities of the frame processing. The filter processor 312 within the MAC circuitry 208 is a programmable solution to the problem. The filter processor 312 can be implemented by a small core of logic (e.g., fewer than 15K gates) that can be dynamically programmed. The filter processor 312 preferably forms an execution pipeline that executes instructions over a series of stages. The instruction set is preferably small and tailored to frame examination operations. Each received frame being processed has its own execution context, including its own set of operating variables. In other words, the filter processor 312 is specialized for performing frame processing operations in a rapid and efficient manner in accordance with directions provided by program instructions. - In general, the
filter processor 312 performs filter processing and other processing associated with forwarding frames. Each frame must be processed extensively to determine frame destinations. This includes extracting the frame destination address (DA) and looking it up in the forwarding tables 210. Additionally, other fields may be attached to the destination address (DA) for context specific lookups. As an example, this could include VLAN or bridge group information. For layer-3 functionality, IP addresses can be extracted and passed through the forwarding tables 210. In general, the filter processor 312 allows up to two arbitrary fields in either the received frame or variable memory to be concatenated and sent through the forwarding tables 210. Furthermore, many frame fields must be compared against specific values or decoded from a range of values. The filter processor 312 preferably allows single instruction methods of comparing and branching, comparing and storing (for building complex Boolean functions), and lastly range checking, branching or storing. Customer configured filters can also be performed through this processing logic. Custom configured filters are, for example, used for blocking traffic between particular stations, networks or protocols, for monitoring traffic, or for mirroring traffic. - In one embodiment, the filter variables RAM 318 is a 128×64 RAM that holds 64 bytes of variables for each port. The filter variables RAM 318 is preferably a dual port RAM where both the read and write ports are used by the
filter processor 312. The first 64 bytes of variables for a port are always written out to the frame buffer 214 with a status write for each frame processed by the filter processor 312. The status write thus contains the control information that results from the frame processing. As an example, the control information includes beginning location and ending location within the frame buffer 214, status information (e.g., CRC error, Rx overflow, Too long, Alignment error, Frame aborted, Priority), a forwarding map, and various destinations for the frame. The remaining 32 bytes can be written by request of the filter processor 312. This allows software or external routing devices easy access to variables that can be used to store extracted data or Boolean results in a small collected area. Instructions should not depend on initialized values for any variable, because the RAM entries are re-used on a frame basis and thus each frame starts with the values written by the last frame. Note that many variables have a pre-defined function that is used by the switch circuitry 218 for forwarding frames. - The
microprocessor 228 is able to read or write any location in the filter variables RAM 318. Generally, the microprocessor 228 reads information from the filter variables RAM 318 for diagnostic purposes. The filter variables RAM 318 can, however, also be used by functional software to pass in parameters for a port that are fixed from frame to frame but programmable during the lifetime of a port. An example of this is the spanning tree state (blocked or not blocked). - The filter variables RAM 318 may also be double buffered. In one embodiment, there are two 64 byte areas per port, and alternate frames received for a port re-use a given 64 byte area. As a result, frame processing can begin on a subsequent frame while the buffer system is still waiting to unload the previous frame's variables. This is an important point for software since port control parameters must be written to both areas.
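- By way of illustration only, the double buffering of the filter variables RAM described above can be modeled in software. The following Python sketch is not part of the described hardware; the class name FilterVariablesRam, its methods, and the variable offset used for the spanning tree state are hypothetical. It assumes eight ports, as in the embodiment discussed elsewhere in this description, so that two 64 byte areas per port total 1024 bytes, which matches the capacity of a 128×64 RAM.

```python
# Illustrative software model only. The class, its methods and the
# STP_STATE_OFFSET constant are hypothetical names, not part of the hardware.
NUM_PORTS = 8        # assumed port count (eight-port embodiment)
AREA_BYTES = 64      # size of one variable area
AREAS_PER_PORT = 2   # double buffering: two areas per port


class FilterVariablesRam:
    def __init__(self):
        # 8 ports x 2 areas x 64 bytes = 1024 bytes (a 128 x 64-bit RAM)
        self.mem = bytearray(NUM_PORTS * AREAS_PER_PORT * AREA_BYTES)
        self.next_area = [0] * NUM_PORTS

    def _base(self, port, area):
        return (port * AREAS_PER_PORT + area) * AREA_BYTES

    def begin_frame(self, port):
        """Pick the area for a newly received frame; alternate per frame."""
        area = self.next_area[port]
        self.next_area[port] ^= 1
        return area

    def write_var(self, port, area, offset, value):
        self.mem[self._base(port, area) + offset] = value

    def read_var(self, port, area, offset):
        return self.mem[self._base(port, area) + offset]

    def write_port_param(self, port, offset, value):
        """Port control parameters must be written to both areas."""
        for area in range(AREAS_PER_PORT):
            self.write_var(port, area, offset, value)


if __name__ == "__main__":
    STP_STATE_OFFSET = 63  # hypothetical location of the spanning tree state
    ram = FilterVariablesRam()
    ram.write_port_param(3, STP_STATE_OFFSET, 1)  # 1 = blocked (assumed coding)
    first = ram.begin_frame(3)
    second = ram.begin_frame(3)
    # Consecutive frames on a port use different areas but see the same parameter.
    assert first != second
    assert ram.read_var(3, first, STP_STATE_OFFSET) == 1
    assert ram.read_var(3, second, STP_STATE_OFFSET) == 1
```

In this model, a parameter written through write_port_param is visible to a frame regardless of which of the two areas the frame is assigned, whereas a per-frame variable written through write_var affects only the area of the frame being processed.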
- In one embodiment, the filter variables RAM 318 also contains status registers for each port. The status registers are updated with the progress of the processing of each frame. Status information in the status registers is primarily for the benefit of the
filter processor 312. The status registers are normally written by the protocol handlers 302 but can also be updated by the filter processor 312. - An
instruction RAM 322 is also coupled to the filter processor 312 to supply the instructions to be executed by the filter processor 312. The instructions are written to the instruction RAM 322 by the microprocessor 228 and read from the instruction RAM 322 by the filter processor 312. For example, in one embodiment having 64-bit instruction words, the instruction RAM 322 can be a 512×64 RAM having a single port. All ports of the frame processing apparatus 200 share the same instruction set for the processing carried out by the filter processor 312. Also, with each port having a unique variable space within the filter variables RAM 318, the filter processor 312 is able to support execution specific to a port or group of ports. Grouping of ports is, for example, useful to form subnetworks within a network. - Further, a
table interface 324 provides an interface between the forwarding tables 210 and the filter processor 312. The forwarding tables 210 store destination addresses, IP addresses, and VLAN or bridge group information, which are used by the filter processor 312 in processing the frames. Additional details on the table interface 324 are described below with reference to FIG. 8. - A
buffer 326 receives the output data from the output multiplexer 316 and couples the output data to the data bus 216. In addition to being coupled to the buffer 326, the data bus 216 is coupled to a transmit FIFO 328. The output of the transmit FIFO 328 is coupled to a transmit bus 330 which is coupled to the protocol handlers 302 and the filter processor 312. The transmit data path through the MAC circuitry 300 consists of the data bus 216, the transmit FIFO 328, and the transmit bus 330. - The
MAC circuitry 300 further includes a FIFO controller 332 for controlling the receive FIFO 310 and the transmit FIFO 328. The FIFO controller 332 couples to the control lines 220 through a frame buffer interface 334. The FIFO controller 332 additionally couples to framing logic 336 that manages reception and transmission of frames. The framing logic 336 is coupled to the filter processor 312 over control line 338, and the FIFO controller 332 is coupled to the filter processor 312 over control line 340. The framing logic 336 further couples to a statistics controller 342 that controls the storage of statistics in a statistics RAM 344. Exemplary statistics are provided in Table 1 below. - The data is streamed to and from the
frame buffer 214 through the FIFOs 310 and 328. The frame buffer interface 334 handles the unloading of data from the receive FIFO 310 and the writing of the unloaded data to the frame buffer 214. The frame buffer interface 334 also handles the removal of data to be transmitted from the frame buffer 214 and the loading of the removed data into the transmit FIFO 328. The output queues and buffer management information storage 224 is used to perform buffer address management. - In one embodiment, whenever a block of data in the receive
FIFO 310 is ready for any of the ports, the frame buffer interface 334 generates a RxDATA request to the switch circuitry 218 for each ready port. Likewise, whenever the transmit FIFO 328 has a block of space available for any port, the frame buffer interface 334 generates a TxDATA request to the switch circuitry 218. Buffer memory commands generated by the switch circuitry 218 are received and decoded by the frame buffer interface 334 and used to control burst cycles into and out of the two FIFOs 310 and 328. - The
framing logic 336 tracks frame boundaries for both reception and transmission and controls the protocol handler side of the receive and transmit FIFOs 310 and 328. As data is received from a protocol handler 302, it is written into the receive FIFO 310, and the framing logic 336 keeps a count of valid bytes in the frame. In one embodiment, this count lags behind by four bytes in order to automatically strip the FCS from a received frame. In this case, an unload request for the receive FIFO 310 will not be generated until a block of data (e.g., 32 bytes) is known not to include the FCS. Each entry in the receive FIFO 310 may also include termination flags that describe how much of a word (e.g., 8 bytes) is valid and mark the end of frame. These termination flags can be used during unloading of the receive FIFO 310 to properly generate external bus flags used by the switch circuitry 218. Subsequently received frames will be placed in the receive FIFO 310 starting on the next block boundary (e.g., next 32 byte boundary). This allows the switch circuitry 218 greater latency tolerance in processing frames. - On the transmit side, the
protocol handler 302 is notified of a transmission request as soon as a block of data (e.g., 32 bytes) is ready in the transmit FIFO 328. As with the receive side, each line may include termination flags that are used to control the end of frame. The protocol handler 302 will automatically add the proper FCS after transmitting the last byte. Multiple frames may be stored in the transmit FIFO 328 in order to minimize inter-frame gaps. In one embodiment, each port (channel) serviced by the frame processing apparatus 200 has 128 bytes of storage space in the FIFOs 310 and 328. The framing logic 336 indicates to external logic the availability of received data or transmit data, as well as received status events. - The transmit
FIFO 328 may have a complication in that data can arrive from the frame buffer 214 unpacked. This can happen when software modifies frame headers and links fragments together. In order to accommodate this, the frame buffer interface 334 may include a data aligner that will properly position incoming data based on where empty bytes start in the transmit FIFO 328. Each byte is written on any boundary of the transmit FIFO 328 in a single clock. - In one embodiment, the receive
FIFO 310 is implemented as two internal 128×32 RAMs. Each of the eight ports of the frame processing apparatus 200 is assigned a 16×64 region used to store up to four blocks. Frames start aligned with 32 byte blocks and fill consecutive memory bytes. The receive FIFO 310 is split into two RAMs in order to allow the filter processor 312 to fetch a word sized operand on any arbitrary boundary. To accommodate this, each RAM half uses an independent read address. - Because of the unaligned write capability, the transmit
FIFO 328 is slightly more complex. It is made of two 64×64 RAMs together with two 64×4 internal RAMs. The 64×64 RAMs hold the data words as received from the frame buffer 214 while the 64×4 RAMs are used to store the end of frame (EOF) flag together with a count of how many bytes are valid in the data word. Assuming data arrived aligned, each double-word of a burst would write to an alternate RAM. By using two RAMs split in this fashion, arbitrarily unaligned data can arrive with some portion being written into each RAM simultaneously. - The statistics RAM 344 and the filter processor statistics RAM 323 are responsible for maintaining all per port statistics. A large number of counters are required or at least desired to provide Simple Network Management Protocol (SNMP) and Remote Monitoring (RMON) operations. These particular counts are preferably maintained in the statistics RAM 344. Also, the
microprocessor 228 is able to read the statistics at any point in time through the CPU interface 346. - In one embodiment, a single incrementer/adder per RAM is used together with a state machine to process all the counters stored in the statistics RAM 344. Statistics generated by receive and transmit control logic are kept in the statistics RAM 344. In one embodiment, the statistics RAM 344 is a 128×16 RAM (16 statistics per port); the counters are all 16 bits wide except for the octet counters, which are 32 bits wide and thus occupy two successive memory locations. The
microprocessor 228 is flagged each time any counter reaches 0xC00, at which point it must then read the counters. - Table 1 below illustrates representative statistics that can be stored in the statistics RAM 344. In order to limit the number of counters that must be affected per frame, frames will be classified first into groups and then only one counter per group will be affected for each frame. For example, a non-MAC broadcast frame properly received without source routing information will increment a counter storing a count for a DataBroadcastPkts statistic only. Hence, in this example, to count the total number of received frames, the
microprocessor 228 has to add the DataBroadcastPkts, AllRoutesBroadcastPkts, SingleRoutesBroadcastPkts, InFrames, etc. Normally, statistics are only incremented by one, except for the octet counters where the size is added to the least significant word and the overflow (if any) increments the most significant word. An additional configuration bit per port may be used to allow the receive statistics to be kept for all frames seen on the ring or only for frames accepted by the port.
TABLE 1
Grp  Statistic       Purpose
A    RxOctet hi      Received octets in non-error frames except through octets
A    RxOctet lo      Received octets in non-error frames except through octets
A    RxThruOctet hi  Received octets in non-error source routed frames where this ring is not terminal ring
A    RxThruOctet lo  Received octets in non-error source routed frames where this ring is not terminal ring
A    TxOctet hi      Transmitted octets
A    TxOctet lo      Transmitted octets
B    RxPktUnicast    Received unicast LLC frames wo/RIF or w/RIF and directed
B    RxPktGrpcast    Received groupcast LLC frames wo/RIF or w/RIF and directed
B    RxPktBroad      Received broadcast LLC frames wo/RIF or w/RIF and directed
B    RxPktThrough    Received LLC source routed directed frames passed through switch
B    TxPktUnicast    Transmitted unicast LLC frames
B    TxPktGrpcast    Transmitted groupcast LLC frames
B    TxPktBroad      Transmitted broadcast LLC frames
C    RxFPOver        Receive frame dropped, filter processor busy on previous frame
C    RxFIFOOver      Receive frame dropped, RxFIFO overflow
C    TxFIFOUnder     Transmit frame dropped, TxFIFO underflow
- Statistics generated by the
filter processor 312 are kept in the filter processor statistics RAM 323. In one embodiment, the filter processor statistics RAM 323 is a 512×16 RAM for storage of 64 different 16 bit counts for each port. These statistics can be used for counting complex events or RMON functions. The microprocessor 228 is flagged each time a counter is half full, at which point it must then read the counters. - The
frame processing apparatus 200 also provides an interface to the microprocessor 228 so as to provide the microprocessor 228 with low-latency access to the internal resources of the MAC circuitry 208. In one embodiment, a CPU interface 346 interfaces the MAC circuitry 300 to the microprocessor 228 via the interconnect bus 238 so that the microprocessor 228 has access to the internal resources of the frame processing apparatus 200. Preferably, burst cycles are supported to allow software to use double-word transfers and block cycles. The microprocessor 228 is also used to read and write control registers in each of the protocol handlers 302 to provide control of ring access as well as assist with the processing of the MAC frames. Also, by providing the microprocessor 228 with access to the internal resources, the microprocessor 228 can perform diagnostics operations. The CPU interface 346 can also couple to the forwarding tables 210 so as to provide initialization and maintenance. - The
CPU interface 346 further couples to the protocol handlers 302 and a special transmit circuit 350. The special transmit circuit 350 couples to the protocol handlers 302 over bus 352. Moreover, the protocol handlers 302 couple to the framing logic 336 over control lines 354. - The special transmit
circuit 350 operates to transmit special data, namely high priority MAC frames. The special transmit circuit 350 is used within the MAC circuitry 300 to transmit high priority frames without having to put them through the switch circuitry 218. As part of the ring recovery process, certain MAC frames (e.g., beacon, claim and purge) must be transmitted immediately, and thus bypass other frames that are queued in the switch circuitry 218. Also, for successful ring poll outcomes on large busy rings, certain high-priority MAC frames (i.e., AMP and SMP) are transmitted without being blocked by lower priority frames queued ahead of them in the output queues 224. - The special transmit
circuit 350 includes an internal buffer to store an incoming high priority frame. In one embodiment, the internal buffer can store a block of 64 bytes of data within the special transmit circuit 350. The MAC processing software (microprocessor 228) is notified when a frame is stored in the internal buffer and then instructs the internal buffer to de-queue the frame to the protocol handler 302 for transmission. The MAC processing software thereafter polls for completion of the transmission and may alternatively abort the transmission. The special transmit circuit 350 may also be written by the microprocessor 228 via the CPU interface 346. - FIG. 3B is a block diagram of a
protocol handler 356 according to an embodiment of the invention. The protocol handler 356 is, for example, an implementation of the protocol handler 302 illustrated in FIG. 3A. - The
protocol handler 356 implements the physical signaling components (PSC) section and certain parts of the MAC Facility section of the IEEE 802.5 specification. In the case of token ring, the protocol handler 356 converts the token ring network into receive and transmit byte-wide data streams and implements the token access protocol for access to the shared network media (i.e., line). Data being received from a line is received at a local loopback multiplexer 358 which forwards a selected output to a receive state machine 360. The receive state machine 360 contains a de-serializer to convert the input stream into aligned octets. The primary output from the receive state machine 360 is a parallel byte stream that is forwarded to a receive FIFO 362. The receive state machine 360 also detects errors (e.g., Manchester or CRC errors) for each frame, marks the start of the frame, and initializes a symbol decoder and the de-serializer. Further, the receive state machine 360 parses the input stream and generates the required flags and timing markers for subsequent processing. Additionally, the receive state machine 360 detects and validates token sequences; namely, the receive state machine 360 captures the priority field (P) and reservation field (R) of each token and frame and presents them to the remaining MAC circuitry 300 as the current frame's priority field (Pr) and the current frame's reservation field (Rr). The receive FIFO 362 is a FIFO device for the received data and also operates to re-synchronize the received data to a main system clock. - The
protocol handler 356 also has a transmit interface that includes two byte-wide transmit channels. One transmit channel is used for MAC frames and the other transmit channel is used for LLC frames (and some of the management style MAC frames). The LLC frames are supplied over the transmit bus 330 from the switch circuitry 218. The MAC frames are fed from the special transmit circuit 350 over the bus 352. These two transmit channels supply two streams of frames to a transmit re-synchronizer 364 for synchronization with the main system clock. The re-synchronized transmit signals for the two streams are then forwarded from the transmit re-synchronizer 364 to a transmit state machine 366. - The transmit
state machine 366 multiplexes the data from the two input streams by selecting the data from the bus 352 first and then the data from the bus 330. The transmit state machine 366 controls a multiplexer 368 to select either one of the input streams supplied by the transmit state machine 366 or repeat data supplied by a repeat path supplier 370. While waiting for the detection of a token of the suitable priority, the transmit state machine 366 causes the multiplexer 368 to output the repeat data from the repeat path supplier 370. Otherwise, when the transmit state machine 366 detects a token with the proper priority, the transmit state machine 366 causes the multiplexer 368 to output frame data to be transmitted, and at the end of each frame, inserts a frame check sequence (FCS) and ending frame sequence (EFS), and then transmits the inter frame gap (IFG) and a token. The transmit state machine 366 is also responsible for stripping any frame that it has put on the token-ring network. The stripping happens in parallel with transmission and follows a procedure defined in the 802.5 specification. As suggested in the 802.5 specification, under-stripping is avoided at the expense of over-stripping. - The output of the
multiplexer 368 is supplied to a priority state machine 372. The priority state machine 372 implements the 802.5 specification priority stacking mechanism. For example, when priority stacking is in use, i.e., when the priority of the token is raised, the repeat path is delayed by up to eight (8) additional bits. Once the priority stacking is no longer in use, the priority delay is removed. - The output of the priority state machine 372 is forwarded to a fixed
latency buffer 374 that, for example, inserts a fixed latency of a predetermined number of bits (e.g., 24 bits) to ensure that a token can circulate around the token-ring. The output from the fixed latency buffer 374 is supplied to an elasticity buffer 376 as well as to the loopback multiplexer 358 for loopback purposes. The elasticity buffer 376 provides a variable delay for clock rate error tolerance. - The output of the priority state machine 372 as well as the output of the
elasticity buffer 376 are supplied to a multiplexer 378. The data stream to be transmitted from either the priority state machine 372 or the delayed version from the elasticity buffer 376 is then provided to a wire-side loopback multiplexer 380. The wire-side loopback multiplexer 380 also receives the input data stream when a loopback is desired. The wire-side loopback multiplexer 380 couples to one of the physical layer interfaces 202-206 and outputs either the output from the multiplexer 378 or the input data stream for loopback. The protocol handler 356 also includes a protocol handler register bank 382 that includes various control registers. - Since the
frame processing apparatus 200 can support several connection modes (e.g., direct attachment, station, RI/RO expansion), functionality at power-up and during insertion has configurable deviations from the specification. First, direct attachment and RI/RO expansion require that the frame processing apparatus 200 repeat data at all times. The protocol handler 356 includes a wire-side loopback path implemented by the wire-side loopback multiplexer 380 for this purpose. This situation allows for accurate detection of idle rings (based on detecting lack of valid Manchester coding), instead of depending on the crude energy detect output from the physical layer interfaces 202-206. In addition, the normal initialization process of sending lobe-media test frames is not applicable when connectivity has been ascertained prior to any insertion attempt. As such, this step of the initialization can be eliminated for all attachment modes besides station. For applications where the lobe testing is desirable or required, normal station attachment for RI/RO where phantom drive is generated can be utilized. - Each frame of data that is received is processed through the
filter processor 312 to determine whether or not the frame should be accepted by the port and forwarded. The filter processor 312 is preferably implemented by specialized general purpose hardware that processes programmed filtering instructions. Embodiments of the specialized general purpose hardware are described in detail below with reference to FIGS. 4 and 5. - In processing a frame of data, the
filter processor 312 can execute a plurality of instructions (e.g., up to 512 instructions). Each instruction is capable of extracting fields from the frame of data and storing them in a storage device (i.e., the filter variables RAM 318). Likewise, frame fields can be compared against immediate values and the results of comparisons stored in the filter variables RAM 318. Lastly, fields can be extracted, looked up in the forwarding tables 210 and the results stored in the filter variables RAM 318. Each port also includes some number of control registers that are set by the microprocessor 228 and can be read by the filter processor 312 during execution of the filtering instructions. For example, these control registers are typically used to store virtual ring (VRING) membership numbers, source routing ring and bridge numbers, etc. - The execution of filtering instructions by the
filter processor 312 is generally responsible for two major functions. First, thefilter processor 312 must determine a destination mask and BP DEST (backplane destination) fields used by theswitch circuitry 218 for forwarding the frame. Second, thefilter processor 312 must determine whether or not to accept the frame in order to properly set the AR (address recognized) and FC (frame copied) bits in the FS (frame status) field. - While the
filter processor 312 is processing a current frame, subsequent frames are placed in the receive FIFO 310. The processing of the current frame thus should complete before the receive FIFO 310 is filled, because frames are dropped when the receive FIFO 310 overflows. For the AR/FC function, all instructions that determine the acceptance of a frame must finish executing before the FS byte is copied off of the wire; otherwise, the previous settings will be used. In order to help the instructions complete in time, execution is preferably scheduled as soon as the frame data on which an instruction depends arrives. As an example, the filter processor 312 can allow all required instructions to complete before or during the reception of the CRC. Also, it is sufficient to provide the filter processor 312 with a single execution unit to support all of the ports of the frame processing apparatus 200, particularly when the ports are serviced in a round robin fashion as discussed below.
- The filter processor 312 also performs transmit side filtering. To reduce circuitry, the same execution unit that performs the receive side filtering can perform the transmit side filtering while the reception side is idle. For half-duplex operation, the use of the single execution unit should prove acceptable; however, for full duplex operation, a second execution unit is provided to perform the transmit side filtering.
- Additionally, the filter processor 312 operates to perform the RIF scanning required to forward source routed frames. For each received frame of data that has a RIF, circuitry in the framing logic 336 operates to scan this field looking for a match between the source ring and bridge and an internal register. If a match is found, the destination ring is extracted and placed in a register visible to the filter processor 312. Thereafter, the destination ring stored in the register can be used to index a table within the forwarding tables 210.
- FIG. 4 is a block diagram of a filter processor 400 according to an embodiment of the invention. Even though the filter processor is a high speed pipelined processor, the circuitry implementing the filter processor 400 is minimal and compact so as to fit within the MAC circuitry 208. The filter processor 400 is one embodiment of the filter processor 312 together with the RAM 322 illustrated in FIG. 3. The filter processor 400 has five (5) distinct pipeline stages. Generally, the stages are described as instruction fetch, operand fetch, decode, execute and write.
- In the first (instruction fetch) stage of the filter processing pipeline, the
filter processor 400 retrieves an instruction to be executed next. More particularly, the instruction is retrieved from an instruction RAM 402 using a program counter obtained from a program counters storage 404. The program counters storage 404 stores a program counter for each of the protocol handlers 302 being serviced by the MAC circuitry 300. The instruction retrieved or fetched from the instruction RAM 402 is then latched in a fetched instruction word (I-word) register 406. This completes the first stage of the filter processing pipeline.
- In the next (operand fetch) stage of the filter processing pipeline, a cancel circuit 408 produces a cancel signal 410 to notify the program counters storage 404 to activate a wait counter for the particular protocol handler 302 being serviced. The wait counter provides a waiting period during which no processing is performed for the protocol handler 302 currently being serviced in this stage of the processing pipeline. This stage also includes an address calculation circuit 412 to calculate one or more addresses 414 used to access stored data in a memory storage device or devices. An operand fetch (op-fetch) output register 418 stores various data items that are determined in or carried through 416 the operand fetch stage of the filter processing pipeline.
- In the next (decode) stage of the processing pipeline, the instruction is decoded, a mask is produced, a function may be produced, the fetched operands may be aligned, and a branch target may be determined. In particular, a mask and function circuit 420 preferably produces a mask and a function. The mask will be used to protect data in a word outside the active field. A carry-through link 422 carries through the decode stage various data items from the operand fetch output register 418. An aligner 424 receives the one or more operands from the data storage device or devices over a link 426 and possibly data from the operand fetch output register 418. The aligner 424 then outputs one or more aligned operands. A branch target circuit 428 determines a branch target for certain instructions. A decode stage output register 430 stores the items produced by the decode stage, namely, the mask, function, carry through data, aligned operands, branch target, and miscellaneous other information.
- In the next (execute) stage, an arithmetic logic unit (ALU) 432 performs a logical operation on the aligned operands and possibly the function and produces an output result 434. The ALU 432 also controls a selector 436. The selector 436 selects one of the branch target from the decode stage output register 430 and a program counter after having been incremented by one via an adder 438, to be output as a next program counter 440. The next program counter 440 is supplied to the program counters storage 404 to update the appropriate program counter stored therein. The output result 434 and carry through data 442 are stored in an execute stage output register 444 together with other miscellaneous information.
- In the last (write) stage of the filter processing pipeline, an aligner 446 aligns the output result 434 obtained from the execute stage output register 444 to produce an aligned output result 448 known as processed data. The processed data is then written to a determined location in the memory storage device or devices.
- The
filter processor 400 services the protocol handlers 302 in a round robin fashion. In particular, with each clock cycle, the filter processor 400 begins execution of an instruction for a different one of the protocol handlers 302. By this approach, the processing resources of the filter processor 400 are distributed across the ports requiring service so that certain ports do not monopolize the processing resources.
- FIG. 5 is a block diagram of a filter processor 500 according to another embodiment of the invention. The filter processor 500 is a detailed embodiment of the filter processor 312 together with the instruction RAM 322 illustrated in FIG. 3. The filter processor 500 is also a more detailed embodiment of the filter processor 400. The filter processor 500 is a pipelined processor having five (5) stages. Generally, the stages are described as instruction fetch, operand fetch, decode, execute and write.
- The filter processor 500 receives an instruction from an instruction RAM 501. The instruction RAM 501 is an internal 512×64 RAM that holds instruction words. Since the port number can be read from the filter variables RAM 318, execution specific to a port or group of ports can be supported. In one embodiment, protocol handlers share the same instruction set. The instruction RAM 501 is initialized by the microprocessor 228 at boot-up. While dynamic code changes are allowed, execution is preferably halted to prevent erroneous execution.
- A fetch
controller 502 produces an instruction select signal 504 that is used to select the appropriate instruction from the instruction RAM 501. The fetch controller 502 produces the instruction select signal 504 based on program counters 506 and wait counters 508. Specifically, the fetch controller 502 selects the appropriate instruction in accordance with the program counter 506 for the particular protocol handler 302 being processed in any given clock cycle and its associated wait counter 508. If the associated wait counter 508 is greater than zero, the pipeline executes transmit instructions retrieved from the instruction RAM 501. Otherwise, when the associated wait counter 508 is not greater than zero, the processing continues using the program counter for the particular protocol handler 302.
- In any event, the fetch controller 502 switches its processing to a different one of the protocol handlers 302 with each clock cycle by selecting the program counter 506 for that protocol handler 302. In other words, the protocol handlers 302 are serviced by the filter processor 500 in a round robin fashion. Stated another way, each frame that is received or transmitted resets the context of the filter processor 500 for that port. For example, in the case in which the MAC circuitry 300 supports eight protocol handlers, the fetch controller 502 will sequence through each of the program counters 506 (one for each of the protocol handlers 302) to effectively service each of the protocol handlers one clock cycle out of every eight clock cycles.
- The first stage (fetch stage) of the filter processor 500 uses two clock cycles, and the remaining stages use a single clock cycle. The first stage requires two clock cycles to complete because the instruction RAM 501 contains an address register: during the first clock cycle one of eight (8) receive or transmit program counters is selected, and during the second clock cycle the appropriate instruction is read from the instruction RAM 501.
- The appropriate instruction that is retrieved from the instruction RAM 501 is latched in a fetch instruction word (I-word) register 510. Additionally, a port number is latched in a port register 512, a valid indicator is latched in a valid register 514, a receive/transmit indicator is stored in a receive/transmit register (RX/TX) 516, and a program counter is stored in a program counter register 518.
- In a next stage of the
filter processor 500, the operand fetch stage, destination address, source-one (S1) address, and source-two (S2) address calculations are performed by a first address calculation circuit 520. Both S1 and S2 are obtained from an instruction, where S1 is an immediate value within the instruction format, and S2 includes a position in the RX FIFO 310, a variable in the variable RAM 320 and a relative address adjustment within the instruction format. The first address calculation circuit 520 produces a destination address 522, a source-one address 524, and a source-two address 526, all of which are supplied to the next stage. The destination address 522 is also supplied to a stalling circuit 528 which produces a stall signal 530 that is supplied to the fetch controller 502. The stall signal 530 causes the pipeline to hold its current state until the stall condition is resolved. A carry-through link 532 carries through this stage other portions of data from the instruction that are needed in subsequent stages.
- The operand fetch stage of the filter processor 500 also includes a second address calculation circuit 534 that calculates a filter variable address 554, a FIFO address 552, and a register address 548. The filter variable address 554 is supplied to a variable storage device, the FIFO address is supplied to a FIFO device, and the register address is supplied to a control register. As an example, with respect to FIG. 3, the variable storage device may be the filter variables RAM 318, the FIFO device may be the transmit and receive FIFOs logic 336.
- The operand fetch stage generates write stage addresses. Technically, this stage requires two clock cycles to complete since data from the FIFOs
- The operand fetch stage also includes logic 536 that combines the contents of the port register 512, the valid register 514 and the receive/transmit register 516, and produces a combined context indicator. At the end of this stage, an operand-fetch stage register 538 stores the carry-through data 532 and the addresses produced by the first address calculation circuit 520. Also, the context indicator from the logic 536 is stored in a register 540 and the associated program counter is stored in the program counter register 542.
- In the next stage, the decode stage, a multiplexer 544 (A-MUX) receives an immediate value 546 from the operand-fetch
stage register 538 and possibly an operand 548 from the control register. Depending upon the type of instruction, the multiplexer 544 selects one of the immediate value 546 and the operand 548 as the output. A multiplexer 550 (B-MUX) receives the possibly retrieved operands from the control register, the FIFO device, and the variable RAM over links. A merge multiplexer 556 operates to merge the operands retrieved from the FIFO device and the variable RAM. Since the destination can be on any byte boundary, both operands are aligned to the destination to facilitate subsequent storage of processed data to a memory storage device. An aligner 558 (B-ALIGNER) aligns the output operand from the multiplexer 550, and an aligner 560 (A-ALIGNER) aligns the output from the multiplexer 544. An alignment controller 562 operates to control the merge multiplexer 556, the aligner 558, and the aligner 560 based on address signals from the operand-fetch stage register. A branch target circuit 564 operates to produce a branch target in certain cases. A decode stage register 566 stores the aligned values from the aligners 558 and 560, any mask or function produced by a mask and function circuit 565, the merged operand from the merge multiplexer 556, the branch target, and carry through data from the operand-fetch stage register 538. The accompanying context indicator is stored in the context register 568, and the accompanying program counter is stored in a program counter register 570.
- In the next stage, the execution stage, an arithmetic logic unit (ALU) 572 receives input values 574, 576, and 578. The input value 574 is provided (via the decode stage register 566) by the aligner 560, the input value 576 is provided by the mask and function circuit 565, and the input value 578 is provided by the aligner 558. The ALU 572 produces an output value 580 based on the input values 574, 576 and 578. The output value 580 and a merged operand 582 (supplied via the merge multiplexer 556) are supplied to a bit level multiplexer 584 which outputs a masked output value. The bit level multiplexer 584 is controlled in accordance with the mask via link 586.
- The execution stage includes a 64-bit ALU that can perform ADD, SUBTRACT, OR, XOR, and AND operations. The execution stage also generates Boolean outputs for comparison operations. In general, the program counter is written in this stage. The program counter is either incremented (no branch or branch not taken) or loaded (branch taken).
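By way of a non-limiting illustration, the behavior of the execution stage can be sketched in Python. The sketch models the five ALU operations on 64-bit words, a bit-level multiplexer that combines the ALU output with the merged operand under control of the mask, and the choice of the next program counter. The function names, the mask polarity (a set mask bit selects the ALU output) and the use of a zero result as the branch condition are assumptions made only for this example; the description does not specify them.

```python
WORD_MASK = (1 << 64) - 1  # the execution stage operates on 64-bit words


def alu(op, a, b):
    """Apply one of the five operations named in the text, wrapped to 64 bits."""
    if op == "ADD":
        return (a + b) & WORD_MASK
    if op == "SUBTRACT":
        return (a - b) & WORD_MASK
    if op == "OR":
        return a | b
    if op == "XOR":
        return a ^ b
    if op == "AND":
        return a & b
    raise ValueError(f"unsupported operation: {op}")


def bit_level_mux(alu_out, merged, mask):
    """Take ALU bits where the mask is 1 and merged-operand bits elsewhere.

    The polarity of the mask is an assumption of this sketch.
    """
    return ((alu_out & mask) | (merged & ~mask)) & WORD_MASK


def next_pc(pc, branch_target, branch_taken):
    """Load the branch target when the branch is taken, otherwise increment."""
    return branch_target if branch_taken else pc + 1


# Example: compare two equal values, keep the upper bits of the merged word,
# and branch because the comparison result is zero (assumed condition).
result = alu("SUBTRACT", 0x1234, 0x1234)
masked = bit_level_mux(result, WORD_MASK, 0xFFFF)
pc = next_pc(7, 42, branch_taken=(result == 0))
```

In this example only the low 16 bits of the destination word are replaced, which reflects the stated purpose of the mask: protecting data in a word outside the active field.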
- The execution stage also includes a multiplexer 588 that receives as inputs the branch target over a link 590 and the associated program counter after being incremented by one (1) by adder 592. The multiplexer 588 selects one of its inputs in accordance with a control signal produced by a zero/carry flag logic 593 that is coupled to the ALU 572 and the multiplexer 588. The mask (via the link 586) and the resulting value from the bit level multiplexer 584 are stored in an execute stage register 594. The context indicator is carried through this stage and stored in a context latch 596.
- In the final stage, the write stage, of the filter processor 500, an aligner 597 aligns the masked output value from the ALU 572 to produce write data. The aligner 597 is controlled by the mask via a link 598. The link 598 also supplies the mask to a write address calculation circuit 599 that produces write addresses for the variable RAM, the FIFO devices, and the control register. The write stage then writes the write data 600 to one of the FIFOs, variable RAM 318, or control registers.
- The final result of receive frame processing is both the appropriate destination information for the frame and a copy/reject indication for the receiver layer of the protocol handler. In the case of token-ring, this information is used to set the AR and FC bits correctly. How quickly instructions execute affects both functions. On the system side, if instructions are still executing in order to forward the current frame, any following frame will fill into the receive FIFO 328 up to 32 bytes. If the 32nd byte is received before the previous frame finishes instruction execution, the frame will be dropped automatically. For token-ring applications, the copy/reject decision should be completed by the time the FS is received.
- The final result of transmit frame processing is deciding whether the frame should actually be transmitted on the wire or dropped. Additionally, for level-3 switching, transmit processing will replace the destination address (DA) with information from a translation table.
- Up to 512 instructions may be used to process a frame. Instruction execution begins at address 0 for receive frames, and begins at a programmable address for transmit frames. Each instruction is capable of extracting fields from the frame and storing them in a 64 byte variable space. Likewise, frame fields can be compared against immediate values and the results of comparisons stored in variables. Lastly, fields can be extracted, looked up in a CAM and the CAM results stored in a variable. The microprocessor 228 can set port specific configuration parameters (VRING membership numbers, source routing ring and bridge numbers, etc.) in the variable memory as well.
- In order to help instructions complete in time, execution is preferably scheduled as soon as the frame data on which an instruction depends arrives. Conversely, if an instruction requiring a data byte that has not yet been received attempts to execute, that instruction will be canceled. In many cases, this allows all required instructions to complete before or during the reception of the CRC.
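The cancel behavior just described can be pictured with a short Python sketch. It assumes, purely for illustration, that each instruction declares the highest frame byte offset it reads (`last_byte_needed`) and that a canceled instruction is simply retried on the port's next turn; the names `Instruction`, `try_execute` and `run_port` are hypothetical and do not appear in the description.

```python
from dataclasses import dataclass


@dataclass
class Instruction:
    name: str
    last_byte_needed: int  # hypothetical: highest frame offset the instruction reads


def try_execute(instr, bytes_received):
    """Return True if the instruction may run, False if it must be canceled."""
    # Offsets are zero-based, so offset n is present once n + 1 bytes have arrived.
    return instr.last_byte_needed < bytes_received


def run_port(program, arrivals):
    """Step one port's program; `arrivals` is the receive count seen on each turn."""
    pc = 0  # receive processing starts at address 0
    trace = []
    for bytes_received in arrivals:
        if pc >= len(program):
            break
        instr = program[pc]
        if try_execute(instr, bytes_received):
            trace.append((instr.name, "executed"))
            pc += 1
        else:
            trace.append((instr.name, "canceled"))  # assumed to be retried later
    return trace


program = [Instruction("check_da", 5), Instruction("check_sa", 11)]
trace = run_port(program, [4, 8, 10, 12])
```

Under these assumptions, each instruction runs on the first turn after its data has arrived, so later instructions are not delayed by more than one turn of the port.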
- Transmit side filtering will affect the minimum inter-packet gap (IPG) with which the switch will be able to transmit, because the frame will have to be accumulated and held in the transmit FIFO 328 until processing has finished. Additionally, the transmit side filtering will be limited to the depth of the FIFO (128 bytes).
- For space conscious implementations, transmit side filtering can be executed whenever receive instructions are not being executed. This should yield wire speed performance for any half-duplex medium. For more performance, a second execution pipeline together with another read port on the instruction RAM could be added.
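One way to picture this sharing of a single pipeline is the following Python sketch. It services the ports in round robin order and, on each port's turn, selects the transmit program counter when that port's wait counter is greater than zero and the receive program counter otherwise, as described earlier for the fetch controller. The class and field names are hypothetical, and decrementing the wait counter once per turn is an assumption of the sketch; the description does not state how the wait counter counts down.

```python
class PortContext:
    """Per-port state; the names are hypothetical."""

    def __init__(self, rx_pc=0, tx_pc=0, wait=0):
        self.rx_pc = rx_pc  # receive program counter
        self.tx_pc = tx_pc  # transmit program counter
        self.wait = wait    # greater than zero while receive processing must wait


def select_instruction(ports, cycle):
    """Return (port, side, pc) for the instruction fetched on a given clock cycle."""
    port = cycle % len(ports)  # round robin port count
    ctx = ports[port]
    if ctx.wait > 0:
        ctx.wait -= 1          # assumed countdown, one step per turn
        return port, "tx", ctx.tx_pc
    return port, "rx", ctx.rx_pc


# Eight ports; port 0 is waiting on receive data for one turn.
ports = [PortContext(rx_pc=3, tx_pc=100, wait=1)] + [PortContext() for _ in range(7)]
first = select_instruction(ports, 0)
second = select_instruction(ports, 1)
ninth = select_instruction(ports, 8)
```

With eight ports, port 0 is visited again on the ninth clock cycle, by which time its wait counter has expired and receive processing resumes.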
- FIG. 6A is a block diagram of an instruction selection circuit 600 according to an embodiment of the invention. The instruction selection circuit 600 represents an implementation of the fetch controller 502, the program counters 506, and the wait counters 508 illustrated in FIG. 5.
- The instruction selection circuit 600 includes a port counter 602 that increments a count to correspond to the port number currently serviced by the filter processor 500. For example, if a frame processing apparatus is servicing eight (8) ports, then the port count repeatedly counts from zero (0) to seven (7). The port count produced by the port counter 602 is forwarded to port multiplexers. The port multiplexer 606 selects one of a plurality of transmit program counters (Tx PC) 608 in accordance with the port count. The instruction selection circuit 600 includes one transmit program counter (Tx PC) and one receive program counter (Rx PC) for each of the ports. The port multiplexer 606 selects one of the receive program counters (Rx PC) 610 in accordance with the port count supplied by the port counter 602. The outputs of the port multiplexers are supplied to a transmit/receive multiplexer 612, and the output of the transmit/receive multiplexer 612 is forwarded to the instruction RAM 501 to select the appropriate instruction for the particular port being serviced during a particular clock cycle. The transmit and receive program counters are updated by the filter processor 500 in the case in which the program counter for a particular port is altered due to a branch instruction or the like.
- The instruction selection circuit 600 includes one wait counter (WAIT) 616 for each of the receive ports, and a port multiplexer 614 that selects one of the plurality of wait counters (WAIT) 616 in accordance with the port count from the port counter 602. The particular wait counter 616 that is selected by the port multiplexer 614 is supplied to a transmit/receive determining unit 618. The transmit/receive determining unit 618 supplies a control signal to the transmit/receive multiplexer 612 such that the transmit/receive multiplexer 612 outputs the transmit program counter (Tx PC) when the selected wait counter is greater than zero (0), and otherwise outputs the receive program counter (Rx PC).
- Accordingly, the instruction selection circuit 600 causes the processing for each port to switch context at each clock cycle, and to perform transmit processing only when an associated wait counter indicates that the receive processing must wait or when no receive processing is active. FIG. 6B is a diagram 622 illustrating the context switching utilized by a filter processor according to the invention. In particular, in the case of the filter processor 500 illustrated in FIG. 5, a five (5) stage pipeline operates to process instructions for each of the various ports. The allocation of the processing is performed on a round-robin basis for each port on each clock cycle. For example, as illustrated in the diagram 622 provided in FIG. 6B, the port number is incremented on each clock cycle (CK), and then the initial port is eventually returned to and the next instruction (whether for transmit or receive processing) for that port is then processed. By utilizing such a processing allocation technique, the pipeline of the filter processor 500 need not stall to wait for currently executing instructions to complete when there are dependencies with subsequent instructions for the same port. For example, in FIG. 6B, it is not until eight (8) clock cycles later (CLK9) that the next instruction (I1) is fetched by the filter processor for the port 0, which last processed an instruction (I0) during clock 1 (CLK1).
- FIG. 7 is a block diagram of an
address calculation circuit 700 according to an embodiment of the invention. The address calculation circuit 700 performs most of the operations performed by the first address calculation circuit 520 and the second address calculation circuit 534 illustrated in FIG. 5.
- The address calculation circuit 700 calculates the address of the operands in the storage devices (FIFOs, control registers, filter variables RAM). The address specified in the instruction being processed can be relative to a field in the frame (RIF or VLAN) and thus requires arithmetic operations. Additionally, the determined address must be checked against the current receive count for that port. If the requested data at that determined address has not yet arrived, the instruction must be canceled. Accordingly, the address calculation circuit 700 includes a base multiplexer 702 for outputting a base address for each of the ports, a relative multiplexer 704 for outputting a relative address for each of the ports, and a length multiplexer 706 for outputting a length of the frame. An adder 708 adds the relative address to a position provided in the instruction word (I-WORD) to produce an address for the storage device.
- For FIFO locations, the address produced is compared against the write pointer for the port. A subtractor 710 implements the comparison by subtracting the result from the adder 708 from the length obtained from the length multiplexer 706. If the output of the subtractor 710 is greater than zero (0) then the instruction is canceled; otherwise, the appropriate wait counter is set. An adder 714 adds the base address from the base multiplexer 702 to the address produced (bits 5 and 6) by the adder 708. The resulting sum from the adder 714 produces a high address for the FIFO. The output from a decrementer device 716 causes a decrement operation to occur if bit 2 is zero (0). The output of the decrementer device 716, regardless of whether or not it decrements, is a low address value for the FIFO.
- The forwarding tables 210 preferably include an external table RAM and an external content-addressable memory (CAM). FIG. 8 is a block diagram of a CAM and a table RAM for implementing the forwarding tables 210 and associated interface circuitry illustrated in FIG. 2. In particular, FIG. 8 illustrates forwarding tables 802 as including a
CAM 804 and a table RAM 806. The MAC circuitry 300, or a portion thereof (e.g., the table interface 324), is coupled to the forwarding tables 802. The portion of the MAC circuitry 300 illustrated in FIG. 8 includes a CAM/table controller 800 that represents the table interface 324 illustrated in FIG. 3. The CAM/table controller 800 communicates with the CAM 804 and the table RAM 806 through a data bus (DATA) and an address bus (ADDR), and controls the CAM 804 and the table RAM 806 using control signals (CNTL). In addition, the MAC circuitry 300 preferably includes a write multiplexer 808 that outputs write data to be stored in one of the storage devices from either the data bus (DATA) coupling the CAM/table controller 800 with the CAM 804 and the table RAM 806 or the write data line of the write stage of the filter processor 500 illustrated in FIG. 5.
- The frame processing apparatus 200 uses the CAM 804 for MAC level DA and SA processing as well as for RIF ring numbers and IP addresses. In addition, the table RAM 806 is used for destination information tables. In the case of multiple instances of the MAC circuitry 208, the CAM 804 and the table RAM 806 can be shared among the instances.
- The CAM 804 is used to translate large fields to small ones for later use as a table index into the table RAM 806. In all cases, the address of the match is returned and used as a variable or table index. The benefit of using the CAM 804 is to preserve the associated data for performing wider matches. The table below summarizes typically occurring lookups:

Match Word                                    Used For
48 bit DA + 12 bit VRING/Bridge group         L2 frame destination determination
48 bit SA                                     Address learning
12 bit Destination Ring Number                Source route destination determination
32 bit IP add. + 12 bit VRING/Bridge group    L3 frame destination determination

- Each lookup also includes a 2, 3, or 4 bit field that keys what type of data (e.g., MAC layer Addresses, IP Addresses) is being searched. This allows the CAM 804 to be used to store different types of information.
- In all cases, the
microprocessor 228 must carefully build destination tables cognizant of where data lands in the CAM 804, since match addresses are used as indexes as opposed to associated data. The size of a table entry is programmable but must be a power of 2 and at least 8 bytes (e.g., 8, 16, or 32 bytes). The filter processor makes no assumptions on the contents of an entry. Rather, lookup instructions can specify that a given amount of data be transferred from the table to internal variables.
- The table RAM 806 holds destination information for properly switching frames between ports. It also can include substitute VLAN information for transforming between tagged and untagged ports as well as MAC layer DA and RIF fields for layer-3 switching.
- For the CAM 804 and the table RAM 806 to support multiple MAC circuitry 208 structures within the frame processing apparatus 200, each of the MAC circuitry 208 structures includes strapping options to specify master or slave operation. The master controls arbitration amongst all the MAC circuitry 208 structures for access to the CAM 804 and the table RAM 806. Additionally, the master supports access to the external memories (e.g., processor RAM 232) via the microprocessor 228. Alternatively, the frame processing apparatus 200 could provide each of the MAC circuitry 208 structures with its own CAM and table RAM, in which case the strapping options are not needed.
- The CAM/table controller 800 accepts lookup requests from the pipeline of the filter processor and generates the appropriate cycles to the CAM 804. Multiple protocol handlers can share the single CAM 804. The pipeline of the filter processor 312 continues to execute while the CAM search is in progress. When the CAM cycle finishes, the result is automatically written into the filter variables RAM 318. No data dependencies are automatically checked. The filter processing software is responsible for proper synchronization (e.g., a status bit is available indicating lookup completion).
- FIG. 9 is a block diagram of an aligner 900 according to an embodiment of the invention. The aligner 900 represents an implementation of the aligners illustrated in FIG. 5, in particular the aligner 560. The aligner 900 includes a 4-to-1
multiplexer 902 and a 2-to-1 multiplexer 904. For example, upon receiving an input signal of 64 bits (63:0), the 4-to-1 multiplexer 902 receives four different alignments of the four bytes of the input signal. The selected alignment is determined by a rotate signal (ROTATE). Using the output from the 4-to-1 multiplexer 902, the 2-to-1 multiplexer 904 receives two different alignments. One alignment is directly from the output of the 4-to-1 multiplexer 902, and the other alignment is rotated by two bytes. The 2-to-1 multiplexer 904 then produces an output signal (OUT) by selecting one of the two alignments in accordance with the rotate signal (ROTATE).
- FIG. 10 is a block diagram of a switching circuit 1000. The switching circuit 1000 is a more detailed diagram of the switch circuitry 218 of FIG. 2. The switching circuit 1000 includes a frame controller and DMA unit 1002, a MAC interface controller 1004, a frame buffer controller 1006, a queue manager 1008, a buffer manager 1010, an ATM interface 1012, and a CPU interface 1014. The frame controller and DMA unit 1002 controls the overall management of the switching operation. The queue manager 1008 and the buffer manager 1010 respectively manage the queues and buffers of the output queues and buffer management information storage 224 via the bus 226. The frame buffer controller 1006 couples to the data bus 216 for receiving incoming data frames as well as outgoing data frames. The frame buffer controller 1006 stores the data frames to and retrieves them from the frame buffer 214 via the bus 222. The MAC interface controller 1004 communicates with the MAC circuitry 208 via the control bus 220 to determine when frames are to be received into or removed from the frame buffer 214. The ATM interface 1012 couples to the ATM port 227 to receive data from or supply data to the ATM port 227. The data received from the ATM port is stored to the frame buffer 214 in the same manner as other frames, though the data bus 216 is not used. The CPU interface 1014 enables the microprocessor 228 to interact with the output queues and buffer management information storage 224, the frame buffer 214, and the ATM interface 1012. Attached hereto as part of this document is Appendix A containing additional information on exemplary instruction formats and instructions that are suitable for use by a filter processor according to the invention.
- The many features and advantages of the present invention are apparent from the written description, and thus, it is intended by the appended claims to cover all such features and advantages of the invention.
Further, since numerous modifications and changes will readily occur to those skilled in the art, it is not desired to limit the invention to the exact construction and operation as illustrated and described. Hence, all suitable modifications and equivalents may be resorted to as falling within the scope of the invention.
Opcode  Instruction        Effect
00      halt #             Stop processing until restart at next frame, optionally abort frame
01      jmp #              jump to immediate location
02      sti #,d1           store immediate to RxFIFO, variable ram or registers
03      or #,s<,d>         mem[d] = mem[s] OR immediate; if only s specified, d=s
04      xor #,s<,d>        mem[d] = mem[s] XOR immediate; if only s specified, d=s
05      and #,s<,d>        mem[d] = mem[s] AND immediate; if only s specified, d=s
06      sub #,s<,d>        mem[d] = mem[s] − immediate; if only s specified, d=s
07      add #,s<,d>        mem[d] = mem[s] + immediate; if only s specified, d=s
08      cje #,s,pc         compare mem[s] with immediate; jump to PC if result zero
09      cjne #,s,pc        compare mem[s] with immediate; jump to PC if result non-zero
0A      cjgte #,s,pc       compare mem[s] with immediate; jump to PC if greater or equal
0B      cjlt #,s,pc        compare mem[s] with immediate; jump to PC if less than
0C      subje #,s,pc       mem[s] = (mem[s] − immediate); jump to PC if result zero
0D      subjne #,s,pc      mem[s] = (mem[s] − immediate); jump to PC if result non-zero
0E      cjin #,#,s,pc      compare mem[s] with immediate; jump to PC if in range
0F      cjout #,#,s,pc     compare mem[s] with immediate; jump to PC if out of range
10-11   reserved
12      comps #,s,d        mem[d] = (mem[s] = immediate), stored w/ magnitude
13      ccomps #,s,d       mem[d] = (mem[s] = immediate) cascade mem[d], stored w/ magnitude
14      ces #,s,d          mem[d] = (mem[s] = immediate), stored as boolean
15      cnes #,s,d         mem[d] = !(mem[s] = immediate), stored as boolean
16      cgtes #,s,d        mem[d] = (mem[s] >= immediate), stored as boolean
17      clts #,s,d         mem[d] = !(mem[s] >= immediate), stored as boolean
18      fcld #,s,e         if (mem[s] = immediate), load destination from table entry
19      fcad #,s,e         if (mem[s] = immediate), add to destinations from table entry
1A      fcrld #,#,s,e      if (imm1 <= mem[s] <= imm2), load destination from table
1B      fcrad #,#,s,e      if (imm1 <= mem[s] <= imm2), add to destinations from table
1C-1D   reserved
1E      wait #             wait for byte to be received
1F      see below          lookups - see next section
20      reserved
21      jmp <s>            jump to mem[s]
22      mov s,d            mem[d] = mem[s]
23      or s1,s2<,d>       mem[d] = mem[s1] OR mem[s2]; if only s specified, d=s2
24      xor s1,s2<,d>      mem[d] = mem[s1] XOR mem[s2]; if only s specified, d=s2
25      and s1,s2<,d>      mem[d] = mem[s1] AND mem[s2]; if only s specified, d=s2
26      sub s1,s2<,d>      mem[d] = mem[s1] − mem[s2]; if only s specified, d=s2
27      add s1,s2<,d>      mem[d] = mem[s1] + mem[s2]; if only s specified, d=s2
28      cje s1,s2,pc       compare mem[s2] with mem[s1]; jump to PC if result zero
29      cjne s1,s2,pc      compare mem[s2] with mem[s1]; jump to PC if result non-zero
2A      cjgte s1,s2,pc     compare mem[s2] with mem[s1]; jump to PC if greater or equal
2B      cjlt s1,s2,pc      compare mem[s2] with mem[s1]; jump to PC if less than
2C      subje s1,s2,pc     mem[s2] = (mem[s2] − mem[s1]); jump to PC if result zero
2D      subjne s1,s2,pc    mem[s2] = (mem[s2] − mem[s1]); jump to PC if result non-zero
2E      cjin s1,s2,pc      compare mem[s2] with mem[s1]; jump to PC if in range
2F      cjout s1,s2,pc     compare mem[s2] with mem[s1]; jump to PC if out of range
30-37   reserved
38      fcld #,s,v(e)      if (mem[s] = immediate), load destination from table entry
39      fcad #,s,v(e)      if (mem[s] = immediate), add to destinations from table entry
3A      fcrld #,#,s,v(e)   if (imm1 <= mem[s] <= imm2), load destination from table
3B      fcrad #,#,s,v(e)   if (imm1 <= mem[s] <= imm2), add to destinations from table
3C-3F   reserved

- An example instruction might look like:
- subje f8.18.ri,1,65
- This instruction would subtract one from a byte-wide field on a byte boundary (no .a specified) that is 8 bytes into the IP header in the RxFIFO, write the modified field back, and jump to location 65 if the result is zero. The time-to-live counter of an IP frame could be decremented in this fashion and a branch taken at zero (reject frame).
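The effect of this example instruction can be sketched in Python. This is a minimal model under stated assumptions, not the filter processor's implementation: the function name `subje_byte`, the `header_start` parameter (standing in for header-relative addressing), the 14-byte link header in the usage lines, and the convention of returning the next program counter are all hypothetical and chosen only for illustration.

```python
def subje_byte(fifo: bytearray, header_start: int, offset: int,
               imm: int, pc: int, target: int) -> int:
    """Model `subje #,s,pc` for a byte-wide, byte-aligned FIFO field.

    Subtracts `imm` from the field, writes the result back, and returns
    the next program counter: `target` if the result is zero, else pc + 1.
    """
    addr = header_start + offset          # assumed header-relative addressing
    result = (fifo[addr] - imm) & 0xFF    # wrap to the 8-bit field width
    fifo[addr] = result                   # write the modified field back
    return target if result == 0 else pc + 1


# Hypothetical frame: a 14-byte link header followed by a 20-byte IP header.
frame = bytearray(34)
frame[14 + 8] = 1                         # time-to-live, 8 bytes into the IP header
next_pc = subje_byte(frame, header_start=14, offset=8, imm=1, pc=10, target=65)
```

With a time-to-live of 1, the field is written back as 0 and `next_pc` is 65, the reject branch. Any larger starting value falls through to `pc + 1`.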
- The basic instruction format is diagrammed below:
ADD

Operation:
[(fNs2) | (vNs2) | (gNs2)] <= [(fNs2) | (vNs2) | (gNs2)] + (vMs1)
or (vZd) <= [(fNs2) | (vNs2) | (gNs2)] + (vMs1)
or [(fNs2) | (vNs2) | (gNs2)] <= [(fNs2) | (vNs2) | (gNs2)] + #
or (vZd) <= [(fNs2) | (vNs2) | (gNs2)] + #

Assembler Syntax:
add vM,fN<,vZ>
or add vM,gN<,vZ>
or add #,fN<,vZ>
or add #,vN<,vZ>
or add #,gN<,vZ>

Description: Source operand 1 from the variable ram or an immediate is added to source operand 2 from the FIFO ram, variable ram, or the registers. If the Z field is zero, the result is stored back into source 2. Otherwise the result is stored in variable ram at the address specified in the Z field. The source operand may be any length from 1 to 32 bits. Only one source operand may come from the variable ram. That is, vNs1 + vNs2 is not supported. If the source 1 operand is a variable, an extra length bit is included, allowing 64 bit additions.

Instruction Format: Source 1 = variable; Source 1 = immediate data

Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
M = variable ram source address for argument 1
N = byte offset of LSB in FIFO, variable ram, or register number for argument 2
rel = adjust N for headers automatically or select variables or register as source
Z = variable ram target address (if zero, destination address is same as source 2).

AND

Operation:
[(fNs2) | (vNs2) | (gNs2)] <= [(fNs2) | (vNs2) | (gNs2)] AND (vMs1)
or (vZd) <= [(fNs2) | (vNs2) | (gNs2)] AND (vMs1)
or [(fNs2) | (vNs2) | (gNs2)] <= [(fNs2) | (vNs2) | (gNs2)] AND #
or (vZd) <= [(fNs2) | (vNs2) | (gNs2)] AND #

Assembler Syntax:
and vM,fN<,vZ>
or and vM,gN<,vZ>
or and #,fN<,vZ>
or and #,vN<,vZ>
or and #,gN<,vZ>

Description: Source operand 1 from the variable ram or an immediate is anded with source operand 2 from the FIFO ram, variable ram, or the registers. If the Z field is zero, the result is stored back into source 2.
Otherwise the result is stored in variable ram at the address specified in the Z field. The source operand may be any length from 1 to 32 bits. Only one source operand may come from the variable ram. That is, vNs1 AND vNs2 is not supported. If the source 1 operand is a variable, an extra length bit is included, allowing 64 bit logical operations.

Instruction Format: Source 1 = variable; Source 1 = immediate data

Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
M = variable ram source address for argument 1
N = byte offset of LSB in FIFO, variable ram or register number for argument 2
rel = adjust N for headers automatically or select variables or register as source
Z = variable ram target address (if zero, destination address is same as source 2).

Chained COMPare immediate and Store magnitude result

Operation:
temparg = [(fN) | (gN)] - #
if (vZ) = 11 {
  if temparg = 0, (vZ) <= 11
  elsif temparg < 0, (vZ) <= 00
  elsif temparg > 0, (vZ) <= 01
}

Assembler Syntax:
ccomps #,fN,vZ
or ccomps #,gN,vZ

Description: The source operand, which may come from either the FIFO ram, variable ram or the registers, is compared with the immediate value contained in the instruction. Simultaneously, the previous magnitude result in the variable addressed by Z is fetched. The magnitude result of the comparison cascaded with the previous result is stored in the variable addressed by Z. The source operand may be any length from 1 to 32 bits. The destination operand is automatically two bits wide.

Instruction Format:

Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
N = byte offset of LSB in FIFO or variable or register number
rel = adjust N for headers automatically or select variables or register as source
Z = variable ram target address
db = dibit address of result (.0-.3)

Compare if Equal, Store boolean

Operation:
(vZ) <= ( ( [(fN) | (gN)] - #) == 0)

Assembler Syntax:
ces #,fN,vZ
or ces #,gN,vZ

Description: The immediate value specified in the instruction is subtracted from the source operand which can come from either the FIFO ram, the variable ram or the registers. If the result is zero, the boolean at address Z in the variable ram is set true. Otherwise it is set false. This instruction is intended as a precursor for complex filters. A collection of booleans may be created and then operated on simultaneously. The source operand may be any length from 1 to 32 bits.

Instruction Format:

Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
N = byte offset of LSB in FIFO or variable or register number
rel = adjust N for headers automatically or select variables or register as source
Z = variable ram target address
db = dibit address of result (.0-.3)

Compare if Greater Than or Equal, Store boolean

Operation:
(vZ) <= ( ( [(fN) | (gN)] - #) >= 0)

Assembler Syntax:
cgtes #,fN,vZ
or cgtes #,gN,vZ

Description: The immediate value specified in the instruction is subtracted from the source operand which can come from either the FIFO ram, the variable ram or the registers. If the result is zero or positive, the boolean at address Z in the variable ram is set true. Otherwise it is set false. This instruction is intended as a precursor for complex filters. A collection of booleans may be created and then operated on simultaneously. The source operand may be any length from 1 to 32 bits.

Instruction Format:

Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
N = byte offset of LSB in FIFO or variable or register number
rel = adjust N for headers automatically or select variables or register as source
Z = variable ram target address
db = dibit address of result (.0-.3)

Compare, Jump if Equal

Operation:
If ( [(fNs2) | (vNs2) | (gNs2)] - #) == 0 then PC <= new_PC
or If ( [(fNs2) | (gNs2)] - (vMs1) ) == 0 then PC <= new_PC

Assembler Syntax:
cje #,fN,#new_PC
or cje #,vN,#new_PC
or cje #,gN,#new_PC
or cje vM,fN,#new_PC
or cje vM,gN,#new_PC

Description: The immediate value specified in the instruction is subtracted from the source operand which can come from either the FIFO ram, the variable ram or the registers. If the result is zero, the PC is replaced with the new_PC contained in the instruction. Otherwise execution continues with the next instruction. All jumps are relative, with a range of −128 to 127 instructions from the current PC. The source operand may be any length from 1 to 32 bits when using immediate compares. For variable based compares the source operand may be up to 64 bits long.

Instruction Format: Source 1 = variable; Source 1 = immediate

Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
N = byte offset of LSB in FIFO or variable or register number.
M = byte address into variable ram for source argument.
rel = adjust N for headers automatically or select variables or register as source.
new_PC = new PC execution address after branch.
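The compare-and-jump behaviour of cje with an immediate operand can be sketched in Python. The names `sign_extend8` and `cje` are hypothetical. The sketch also assumes that the jump displacement is encoded as an 8-bit two's-complement value and is added to the current PC rather than to PC + 1. The text states only that jumps are relative with a range of −128 to 127 instructions.

```python
def sign_extend8(raw: int) -> int:
    """Interpret the low 8 bits of `raw` as a two's-complement displacement."""
    raw &= 0xFF
    return raw - 0x100 if raw & 0x80 else raw


def cje(imm: int, operand: int, length_bits: int, pc: int, disp_raw: int) -> int:
    """Return the next PC for `cje #,s,pc` on an operand of 1 to 32 bits."""
    if not 1 <= length_bits <= 32:
        raise ValueError("immediate compares support operands of 1 to 32 bits")
    mask = (1 << length_bits) - 1
    if ((operand - imm) & mask) == 0:     # result zero: take the branch
        return pc + sign_extend8(disp_raw)
    return pc + 1                         # otherwise continue with next instruction


# Compare a 16-bit field against 0x0800 with an assumed displacement of -5.
taken = cje(0x0800, 0x0800, 16, pc=10, disp_raw=0xFB)
not_taken = cje(0x0800, 0x0806, 16, pc=10, disp_raw=0xFB)
```

Under these assumptions `taken` is 5 and `not_taken` is 11. The other compare-and-jump instructions differ only in the condition tested on the subtraction result.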
Compare, Jump if Greater Than or Equal

Operation:
If ( [(fNs2) | (vNs2) | (gNs2)] - #) >= 0 then PC <= new_PC
or If ( [(fNs2) | (gNs2)] - (vMs1) ) >= 0 then PC <= new_PC

Assembler Syntax:
cjgte #,fN,#new_PC
or cjgte #,vN,#new_PC
or cjgte #,gN,#new_PC
or cjgte vM,fN,#new_PC
or cjgte vM,gN,#new_PC

Description: The immediate value specified in the instruction is subtracted from the source operand which can come from either the FIFO ram, the variable ram or the registers. If the result is zero or positive, the PC is replaced with the new_PC contained in the instruction. Otherwise execution continues with the next instruction. All jumps are relative, with a range of −128 to 127 instructions from the current PC. The source operand may be any length from 1 to 32 bits when using immediate compares. For variable based compares the source operand may be up to 64 bits long.

Instruction Format: Source 1 = variable; Source 1 = immediate

Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
N = byte offset of LSB in FIFO or variable or register number.
M = byte address into variable ram for source argument.
rel = adjust N for headers automatically or select variables or register as source.
new_PC = new PC execution address after branch.

Compare, Jump if IN range

Operation:
If (#low <= [(fNs) | (vNs) | (gNs)] < #high) then PC <= new_PC
or If ((vM)low <= [(fNs) | (gNs)] < (vM)high) then PC <= new_PC

Assembler Syntax:
cjin #low,#high,fN,#new_PC
or cjin #low,#high,vN,#new_PC
or cjin #low,#high,gN,#new_PC
or cjin vM,fN,#new_PC
or cjin vM,gN,#new_PC

Description: The immediate value is logically broken into two 16 bit sections, one representing the low end and one the high end of a range comparison. The source argument, which can come from either the FIFO ram, the variable ram or the registers, is compared against both the high and low limits.
If the low comparison is positive AND the high comparison is negative then the PC is replaced with the new_PC contained in the instruction. Otherwise execution continues with the next instruction. All jumps are relative, with a range of −128 to 127 instructions from the current PC. If source 1 is a variable, it is assumed to be 32 bits wide and is broken into two 16 bit sections as above. The source operand may be any length from 1 to 16 bits.

Instruction Format: Source 1 = variable; Source 1 = immediate

Instruction Fields:
#high = high immediate value right justified.
#low = low immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
N = byte offset of LSB in FIFO or variable or register number
M = byte offset of LSB in variable memory for 32 bit source
rel = adjust N for headers automatically or select variables or register as source
new_PC = new PC execution address after branch.

Compare, Jump if Less Than

Operation:
If ( [(fNs2) | (vNs2) | (gNs2)] - #) < 0 then PC <= new_PC
or If ( [(fNs2) | (gNs2)] - (vMs1) ) < 0 then PC <= new_PC

Assembler Syntax:
cjlt #,fN,#new_PC
or cjlt #,vN,#new_PC
or cjlt #,gN,#new_PC
or cjlt vM,fN,#new_PC
or cjlt vM,gN,#new_PC

Description: The immediate value specified in the instruction is subtracted from the source operand which can come from either the FIFO ram, the variable ram or the registers. If the result is negative, the PC is replaced with the new_PC contained in the instruction. Otherwise execution continues with the next instruction. All jumps are relative, with a range of −128 to 127 instructions from the current PC. The source operand may be any length from 1 to 32 bits when using immediate compares. For variable based compares the source operand may be up to 64 bits long.

Instruction Format: Source 1 = variable; Source 1 = immediate

Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
N = byte offset of LSB in FIFO or variable or register number.
M = byte address into variable ram for source argument.
rel = adjust N for headers automatically or select variables or register as source.
new_PC = new PC execution address after branch.

Compare, Jump if Not Equal

Operation:
If ( [(fNs2) | (vNs2) | (gNs2)] - #) != 0 then PC <= new_PC
or If ( [(fNs2) | (gNs2)] - (vMs1) ) != 0 then PC <= new_PC

Assembler Syntax:
cjne #,fN,#new_PC
or cjne #,vN,#new_PC
or cjne #,gN,#new_PC
or cjne vM,fN,#new_PC
or cjne vM,gN,#new_PC

Description: The immediate value specified in the instruction is subtracted from the source operand which can come from either the FIFO ram, the variable ram or the registers. If the result is non-zero, the PC is replaced with the new_PC contained in the instruction. Otherwise execution continues with the next instruction. All jumps are relative, with a range of −128 to 127 instructions from the current PC. The source operand may be any length from 1 to 32 bits when using immediate compares. For variable based compares the source operand may be up to 64 bits long.

Instruction Format: Source 1 = variable; Source 1 = immediate

Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
N = byte offset of LSB in FIFO or variable or register number.
M = byte address into variable ram for source argument.
rel = adjust N for headers automatically or select variables or register as source.
new_PC = new PC execution address after branch.

Compare, Jump if OUT of range

Operation:
If ! (#low <= [(fNs) | (vNs) | (gNs)] < #high) then PC <= new_PC
or If !
((vM)low <= [(fNs) | (gNs)] < (vM)high) then PC <= new_PC

Assembler Syntax:
cjout #low,#high,fN,#new_PC
or cjout #low,#high,vN,#new_PC
or cjout #low,#high,gN,#new_PC
or cjout vM,fN,#new_PC
or cjout vM,gN,#new_PC

Description: The immediate value is logically broken into two 16 bit sections, one representing the low end and one the high end of a range comparison. The source argument, which can come from either the FIFO ram, the variable ram or the registers, is compared against both the high and low limits. If the low comparison is negative or the high comparison is positive then the PC is replaced with the new_PC contained in the instruction. Otherwise execution continues with the next instruction. All jumps are relative, with a range of −128 to 127 instructions from the current PC. If source 1 is a variable, it is assumed to be 32 bits wide and is broken into two 16 bit sections as above. The source operand may be any length from 1 to 16 bits.

Instruction Format: Source 1 = variable; Source 1 = immediate

Instruction Fields:
#high = high immediate value right justified.
#low = low immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
M = byte offset of LSB in variable memory for 32 bit source
N = byte offset of LSB in FIFO or variable or register number
rel = adjust N for headers automatically or select variables or register as source
new_PC = new PC execution address after branch.

Compare if Less Than, Store boolean

Operation:
(vZ) <= ( ( [(fN) | (gN)] - #) < 0)

Assembler Syntax:
clts #,fN,vZ
or clts #,gN,vZ

Description: The immediate value specified in the instruction is subtracted from the source operand which can come from either the FIFO ram, the variable ram or the registers. If the result is negative, the boolean at address Z in the variable ram is set true. Otherwise it is set false. This instruction is intended as a precursor for complex filters.
A collection of booleans may be created and then operated on simultaneously. The source operand may be any length from 1 to 32 bits.

Instruction Format:

Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
N = byte offset of LSB in FIFO or variable or register number
rel = adjust N for headers automatically or select variables or register as source
Z = variable ram target address
db = dibit address of result (.0-.3)

Compare if Not Equal, Store boolean

Operation:
(vZ) <= ( ( [(fN) | (gN)] - #) != 0)

Assembler Syntax:
cnes #,fN,vZ
or cnes #,gN,vZ

Description: The immediate value specified in the instruction is subtracted from the source operand which can come from either the FIFO ram, the variable ram or the registers. If the result is not zero, the boolean at address Z in the variable ram is set true. Otherwise it is set false. This instruction is intended as a precursor for complex filters. A collection of booleans may be created and then operated on simultaneously. The source operand may be any length from 1 to 32 bits.

Instruction Format:

Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
N = byte offset of LSB in FIFO or variable or register number
rel = adjust N for headers automatically or select variables or register as source
Z = variable ram target address
db = dibit address of result (.0-.3)

COMPare immediate and Store magnitude result

Operation:
temparg = [(fN) | (gN)] - #;
if temparg = 0, vZ <= 11
elsif temparg < 0, vZ <= 00
elsif temparg > 0, vZ <= 01

Assembler Syntax:
comps #,fN,vZ
or comps #,gN,vZ

Description: The source operand, which may come from either the receive FIFO, variable ram or the registers, is compared with the immediate value contained in the instruction. The magnitude result of the comparison is stored in the variable addressed by Z.
The source operand may be any length from 1 to 32 bits. The destination operand is automatically two bits wide.

Instruction Format:

Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
N = byte offset of LSB in FIFO or variable or register number
rel = adjust N for headers automatically or select variables or register as source
Z = variable ram target address
db = dibit address of result (.0-.3)

Filter Compare and Add Destination

Operation:
if (([(fN) | (vN) | (gN)] - #) == 0) {
  v(destination mask) = v(destination mask) OR tableram(TRA);
  if v(bpdest0) = 0 then tempvar = 0
  elsif v(bpdest1) = 0 then tempvar = 1
  elsif v(bpdest2) = 0 then tempvar = 2
  elsif v(bpdest3) = 0 then tempvar = 3
  else tempvar = 4
  if tempvar < 4 v(bpdest0+tempvar) <= high(tableram(TRA+1));
  if tempvar < 3 v(bpdest1+tempvar) <= low(tableram(TRA+1));
}

Assembler Syntax:
fcad #,fN,#TRA
or fcad #,vN,#TRA
or fcad #,gN,#TRA
or fcad #,fN,vM
or fcad #,gN,vM

Description: The immediate value specified in the instruction is subtracted from the source operand which can come from either the FIFO ram, the variable ram or the registers. If the result is zero, the external table ram is accessed at entry FBASE + TRA (FBASE is a configuration register while TRA comes from the instruction). The destination mask and backplane destinations for the frame (fixed locations in the variable ram) are added to from the table entry. In a variation of this, the table ram address may come from two aligned bytes of variable memory (shown as vM). The source operand may be any length from 1 to 32 bits.

Instruction Formats: Source 3 = variable; Source 3 = immediate

Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
M = byte offset of LSD of 16 bit table address in variable memory
N = byte offset of LSB in FIFO or variable or register number
rel = adjust N for headers automatically or select variables or register as source
TRA = external table ram address.

Filter Compare and Load Destination

Operation:
if (([(fN) | (vN) | (gN)] - #) == 0) {
  v(destination mask) = tableram(TRA);
  v(bpdest0) = high(tableram(TRA+1));
  v(bpdest1) = low(tableram(TRA+1));
  v(bpdest2) = 0;
  v(bpdest3) = 0;
}

Assembler Syntax:
fcld #,fN,#TRA
or fcld #,vN,#TRA
or fcld #,gN,#TRA
or fcld #,fN,vM
or fcld #,gN,vM

Description: The immediate value specified in the instruction is subtracted from the source operand which can come from either the FIFO ram, the variable ram or the registers. If the result is zero, the external table ram is accessed at entry FBASE + TRA (FBASE is a configuration register while TRA comes from the instruction). The destination mask and backplane destinations for the frame (fixed locations in the variable ram) are loaded from the table entry. In a variation of this, the table ram address may come from two aligned bytes of variable memory (shown as vM). The source operand may be any length from 1 to 32 bits.

Instruction Format: Source 3 = variable; Source 3 = immediate

Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
M = byte offset of LSD of 16 bit table address in variable memory
N = byte offset of LSB in FIFO or variable or register number
rel = adjust N for headers automatically or select variables or register as source
TRA = external table ram address.
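The load behaviour of fcld can be sketched in Python as follows. This is an illustrative model only. The `FrameVars` class and the `fcld` function signature are hypothetical, and the sketch assumes that each table ram entry is a 16-bit word whose upper and lower bytes correspond to high() and low() in the operation above. The text does not state the entry width.

```python
from dataclasses import dataclass, field


@dataclass
class FrameVars:
    """Hypothetical view of the fixed destination locations in variable ram."""
    dest_mask: int = 0
    bpdest: list = field(default_factory=lambda: [0, 0, 0, 0])


def fcld(imm: int, operand: int, tra: int, table_ram: list,
         fbase: int, fv: FrameVars) -> bool:
    """Model fcld: on an equal compare, load destinations from the table entry."""
    if operand - imm != 0:
        return False                      # no match: destinations are untouched
    entry = fbase + tra                   # table ram is accessed at FBASE + TRA
    fv.dest_mask = table_ram[entry]
    word = table_ram[entry + 1]           # assumed 16-bit word
    fv.bpdest = [(word >> 8) & 0xFF, word & 0xFF, 0, 0]
    return True


table_ram = [0] * 8
table_ram[5] = 0b1010                     # destination mask for the entry
table_ram[6] = 0x1234                     # two backplane destinations
fv = FrameVars(dest_mask=0xFF, bpdest=[9, 9, 9, 9])
matched = fcld(0x42, 0x42, tra=1, table_ram=table_ram, fbase=4, fv=fv)
```

After the matching compare, the earlier mask and all four backplane destinations have been overwritten, which is what distinguishes the load form from the add form.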
Filter Compare Range and Add Destination

Operation:
tempvar := [(fN) | (vN) | (gN)];
if (((tempvar - #low) >= 0) && ((tempvar - #high) < 0)) {
  v(destination mask) = v(destination mask) OR tableram(TRA);
  if v(bpdest0) = 0 then tempvar = 0
  elsif v(bpdest1) = 0 then tempvar = 1
  elsif v(bpdest2) = 0 then tempvar = 2
  elsif v(bpdest3) = 0 then tempvar = 3
  else tempvar = 4
  if tempvar < 4 v(bpdest0+tempvar) <= high(tableram(TRA+1));
  if tempvar < 3 v(bpdest1+tempvar) <= low(tableram(TRA+1));
}

Assembler Syntax:
fcrad #low,#high,fN,#TRA
or fcrad #low,#high,vN,#TRA
or fcrad #low,#high,gN,#TRA
or fcrad #low,#high,fN,vM
or fcrad #low,#high,gN,vM

Description: The dual immediate value specified in the instruction is range checked against the source operand which can come from either the FIFO ram, the variable ram or the registers. (Refer to CJIN for details of the range checking.) If the result is in range, the external table ram is accessed at entry FBASE + TRA (FBASE is a configuration register while TRA comes from the instruction). The destination mask and backplane destinations for the frame (fixed locations in the variable ram) are added to from the table entry. In a variation of this, the table ram address may come from two aligned bytes of variable memory (shown as vM). The source operand may be any length from 1 to 16 bits.

Instruction Formats: Source 4 = variable; Source 4 = immediate

Instruction Fields:
#high = high immediate value right justified.
#low = low immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
M = byte offset of LSD of 16 bit table address in variable memory
N = byte offset of LSB in FIFO or variable or register number
rel = adjust N for headers automatically or select variables or register as source
TRA = external table ram address.
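The range check and the "first empty slot" rule of fcrad can be sketched in Python. The function names and the return-value convention are hypothetical. As an assumption, table ram entries are treated as 16-bit words whose upper and lower bytes stand for high() and low(), and a backplane destination of 0 is treated as empty, following the pseudo-code above.

```python
def first_empty_slot(bpdest: list) -> int:
    """Return the index of the first zero backplane destination, or 4 if none."""
    for i, dest in enumerate(bpdest):
        if dest == 0:
            return i
    return 4


def fcrad(low: int, high: int, operand: int, tra: int, table_ram: list,
          fbase: int, dest_mask: int, bpdest: list):
    """Model fcrad: if low <= operand < high, OR in the mask and append destinations."""
    slots = list(bpdest)
    if not (low <= operand < high):
        return dest_mask, slots           # out of range: nothing changes
    entry = fbase + tra                   # table ram is accessed at FBASE + TRA
    dest_mask |= table_ram[entry]
    word = table_ram[entry + 1]           # assumed 16-bit word
    t = first_empty_slot(slots)
    if t < 4:
        slots[t] = (word >> 8) & 0xFF     # high byte goes to the first empty slot
    if t < 3:
        slots[t + 1] = word & 0xFF        # low byte goes to the following slot
    return dest_mask, slots


table_ram = [0x0F, 0x1234]
mask, slots = fcrad(10, 20, 15, 0, table_ram, 0, 0xF0, [7, 0, 0, 0])
```

Here `mask` becomes 0xFF and `slots` becomes [7, 0x12, 0x34, 0]. When only the last slot is empty the low byte is dropped, and when no slot is empty only the mask is updated.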
Filter Compare Range and Load Destination

Operation:
tempvar := [(fN) | (vN) | (gN)];
if (((tempvar - #low) >= 0) && ((tempvar - #high) < 0)) {
  v(destination mask) = tableram(TRA);
  v(bpdest0) <= high(tableram(TRA+1));
  v(bpdest1) <= low(tableram(TRA+1));
  v(bpdest2) <= 0;
  v(bpdest3) <= 0;
}

Assembler Syntax:
fcrld #low,#high,fN,#TRA
or fcrld #low,#high,vN,#TRA
or fcrld #low,#high,gN,#TRA
or fcrld #low,#high,fN,vM
or fcrld #low,#high,gN,vM

Description: The dual immediate value specified in the instruction is range checked against the source operand which can come from either the FIFO ram, the variable ram or the registers. (Refer to CJIN for details of the range checking.) If the result is in range, the external table ram is accessed at entry FBASE + TRA (FBASE is a configuration register while TRA comes from the instruction). The destination mask and backplane destinations for the frame (fixed locations in the variable ram) are loaded from the table entry. In a variation of this, the table ram address may come from two aligned bytes of variable memory (shown as vM). The source operand may be any length from 1 to 16 bits.

Instruction Format: Source 4 = variable; Source 4 = immediate

Instruction Fields:
#high = high immediate value right justified.
#low = low immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
M = byte offset of LSD of 16 bit table address in variable memory
N = byte offset of LSB in FIFO or variable or register number
rel = adjust N for headers automatically or select variables or register as source
TRA = external table ram address.

HALT

Operation:
Suspend instruction processing.

Assembler Syntax:
halt

Description: Causes instruction processing to stop for current frame. Processing will resume with instruction number 0 at the beginning of the next frame.
Instruction Format:

Jump TBD update

Operation:
PC <= [# | (var Zs)]; if vZd then (var vZd) <= (old PC + 1)
or PC <= [(var Zs)+#]; if vZd then (var vZd) <= (old PC + 1)

Assembler Syntax:
jmp #
or jmp vZs
or jmp #,vZd
or jmp vZs,vZd
or jmp vZs,#
or jmp vZs,#,vZd

Description: Program control is transferred to either a location specified in the instruction word, or to a location stored in a variable indexed by the instruction word, or to a location stored in a variable plus an offset. Optionally, if the vZd field is not zero, the old PC + 1 is stored there. This allows subroutines by storing the previous program counter in variable space. Variable number 0 may not be used as a link address. All jump addresses are direct, 9 bits in length. For both the source and destination, variable size is assumed to be 9 bits. This means 8 bits from the specified location, and the lsb from the preceding location.

Instruction Formats: Variable+offset; Variable or offset

Instruction Fields:
R = 0 jump to location in instruction word bits 0-8 (9 bits)
R = 1 jump to location in variable ram location vZs
#|+00 vZs = Jump location. Direct or indirect
vZd = store old PC+1 in location

cam LooKup with table LoaD and Add Destination

Operation:
tmp = key & <[(vB)]> & [(fA) | (vA)]
(vD) = cam lookup(tmp,mask)
v(destination mask) = v(destination mask) OR tableram(vD);
if v(bpdest0) = 0 then tempvar = 0
elsif v(bpdest1) = 0 then tempvar = 1
elsif v(bpdest2) = 0 then tempvar = 2
elsif v(bpdest3) = 0 then tempvar = 3
else tempvar = 4
if tempvar < 4 v(bpdest0+tempvar) <= high(tableram(TRA+1));

Assembler Syntax:
lklad #k,#m,vA,vD
or lklad #k,#m,fA,vD
or lklad #k,#m,vB,vA,vD
or lklad #k,#m,vB,fA,vD

Description: The A field is pulled from either the variable ram or FIFO. Its length can be any number of bytes from 1 to 8. This field is concatenated with an optional B field pulled from the variable ram. The B field length is automatically calculated to pad the lookup value to 8 bytes.
The top 2, 3 or 4 bits (63 downto 62, 61 or 60) are replaced with the key value specified in the instruction. This value is passed to the CAM together with the mask select. The match address from the CAM is stored in the variable ram at the selected destination. If no length is specified for the A field it is assumed to be 64 bits. Next the CAM result is used to index the external table ram. The destination mask and the BPDEST field are fetched from ram and added into the variable ram at the predefined address for this information.

Instruction Format:

Instruction Fields:
mask - Mask select for CAM lookups
key - Key bits (left aligned for smaller than 4 bit keys)
klen - Key length. (0=2 bits, 1=3 bits, 2=4 bits, 3=reserved)
L+ - 6th length bit for the A field length, allowing lengths up to 64 bits
B - B key field address. The address for the B field of the key, if used.
A len - low 5 bits of the A field length. Any length 1-64 bits may be specified. Lengths that are not multiples of 8 will be padded to 8 bits. The length of the B field is based upon the A field.
A - byte offset in variable memory for the A field.
D - Destination address for the table index returned from the CAM. Also used as the base for any bytes moved from the extended information fields of a table entry.
A rel - Relative information for the B field. Indicates whether the B field is in variable memory or in the FIFO, and if it's in the FIFO, how it is offset

cam LooKup with table LoaD

Operation:
tmp = key & <[(vB)]> & [(fA) | (vA)]
(vD) = cam lookup(tmp,mask)
(v4) = table[cam result].destination mask
(v8) = table[cam result].BPDest
(v12) = 0

Assembler Syntax:
lkld #k,#m,vA,vD
or lkld #k,#m,fA,vD
or lkld #k,#m,vB,vA,vD
or lkld #k,#m,vB,fA,vD

Description: The A field is pulled from either the variable ram or FIFO. Its length can be any number of bytes from 1 to 8. This field is concatenated with an optional B field also pulled from either the variable ram or FIFO.
The B field length is automatically calculated to pad the lookup value to 8 bytes. The top 2, 3 or 4 bits (63 downto 62, 61 or 60) are replaced with the key value specified in the instruction. This value is passed to the CAM together with the mask select. The match address from the CAM is stored in the variable ram at the selected destination. If no length is specified for the A field it is assumed to be 64 bits. Next the CAM result is used to index the external table ram. The destination mask and BPDEST0 and BPDEST1 fields are fetched from ram and loaded into the variable ram at the predefined address for this information. The variable ram entries for BPDEST2 and BPDEST3 are written to 0.

Instruction Format:

Instruction Fields:
mask - Mask select for CAM lookups
key - Key bits (left aligned for smaller than 4 bit keys)
klen - Key length. (0=2 bits, 1=3 bits, 2=4 bits, 3=reserved)
L+ - 6th length bit for the A field length, allowing lengths up to 64 bits
B - B key field address. The address for the B field of the key, if used.
A len - low 5 bits of the A field length. Any length 1-64 bits may be specified. Lengths that are not multiples of 8 will be padded to 8 bits. The length of the B field is based upon the A field.
A - byte offset in variable memory for the A field.
D - Destination address for the table index returned from the CAM. Also used as the base for any bytes moved from the extended information fields of a table entry.
A rel - Relative information for the B field. Indicates whether the B field is in variable memory or in the FIFO, and if it's in the FIFO, how it is offset

LOAD table information

Operation:
(vD) = table[index].offset

Assembler Syntax:
load #i,#o,vD
or load (vI),#o,vD

Description: The external table ram is accessed at a given index and the entries starting with the programmed offset are fetched and copied into either the FIFO or variable ram at the specified destination.
The index may be either specified directly in the instruction or indirectly through a variable.
Instruction Formats:
Indirect table index
Immediate table index
#o - Offset from index at which to begin loading data. Valid values are 0 . . . 31.
D len - Move count. Number of bytes of extended data to move into variable memory location D (specified as the length of D in bytes)
#i - Index into the table (represents address/16)
I - variable memory location containing a 16 bit index into the table (represents address/16). Valid values are 0 . . . 65535.
D - Destination address for the extended information fields of a table entry.
Q - Relative information for the D address. Indicates whether the D is in variable memory or in the FIFO, and if it's in the FIFO, how it is offset.
LOAD destination information from table, ADd it in
Operation:
v(destination mask) = v(destination mask) OR tableram(#I|vI);
if v(bpdest0) = 0 then tempvar = 0
elsif v(bpdest1) = 0 then tempvar = 1
elsif v(bpdest2) = 0 then tempvar = 2
elsif v(bpdest3) = 0 then tempvar = 3
else tempvar = 4
if tempvar < 4 v(bpdest0+tempvar) <= high(tableram((#I|vI) + 1));
Assembler Syntax:
loadad #I
or
loadad (vI)
Description: The external table ram is accessed at a given index. The destination mask for this entry is or'd into the current mask in variable ram. The backplane destinations are stored in the variable ram starting with the first empty one. If none of the backplane destinations are empty, data from the table may be lost. Also see loadd.
Instruction Formats:
Indirect table index
Immediate table index
location D (specified as the length of D in bytes)
#i - Index into the table (represents address/16). Valid values are 0 . . . 65535.
I - variable memory location containing a 16 bit index into the table (represents address/16).
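The loadad operation above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the dictionary keys dest_mask and bpdest, and the idea of passing a table entry as a dictionary, are hypothetical stand-ins for the variable ram locations and table ram words, whose exact layout is not given here.

```python
def loadad(variables, table_entry):
    """Model of loadad: OR in the destination mask, then store one backplane
    destination in the first empty slot.

    variables   -- dict with 'dest_mask' (int) and 'bpdest' (list of four ints);
                   hypothetical stand-in for the variable ram.
    table_entry -- dict with 'dest_mask' and 'bpdest' (ints); hypothetical
                   stand-in for the table ram words at the given index.
    Returns True if the backplane destination was stored, False if it was lost.
    """
    # The destination mask is always accumulated, never overwritten.
    variables["dest_mask"] |= table_entry["dest_mask"]

    # Scan bpdest0..bpdest3 for the first empty (zero) slot.
    for slot, value in enumerate(variables["bpdest"]):
        if value == 0:
            variables["bpdest"][slot] = table_entry["bpdest"]
            return True

    # All four slots are occupied: the table's backplane destination is dropped.
    return False
```

Returning a flag for the "data may be lost" case is a choice made for this sketch; the operation as described simply writes nothing when no slot is empty.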
LOAD destination information from table
Operation:
v(destination mask) = table[index].destination mask
v(bpdest0) = table[index].BPDEST0
v(bpdest1) = 0
Assembler Syntax:
loadd #i
or
loadd (vI)
Description: The external table ram is accessed at a given index. The destination mask for this entry is stored into the current mask in variable ram. The backplane destinations are stored in the variable ram, overwriting existing information. Also see loadad.
Instruction Formats:
Indirect table index
Immediate table index
location D (specified as the length of D in bytes)
#i - Index into the table (represents address/16). Valid values are 0 . . . 65535.
I - variable memory location containing a 16 bit index into the table (represents address/16).
cam LOOKup
Operation:
tmp = key & <[(fB)|(vB)]> & [(vA)]
(vD) = cam lookup(tmp,mask)
Assembler Syntax:
look #k,#m,vA,vD
or
look #k,#m,fA,vD
or
look #k,#m,vB,vA,vD
or
look #k,#m,vB,fA,vD
Description: The A field is pulled from either the variable ram or FIFO. Its length can be any number of bits from 1 to 64. This field is concatenated with an optional B field pulled from either the variable ram or FIFO. The B field length is automatically calculated to pad the lookup value to 8 bytes. The top 2, 3 or 4 bits (63 downto 62, 61 or 60) are replaced with the key value specified in the instruction. This value is passed to the CAM together with the mask select. The match address from the CAM is stored in the variable ram at the selected destination. If no length is specified for the A field it is assumed to be 64 bits. It is always padded to at least 4 bytes.
Instruction Format:
Instruction Fields:
mask - Mask select for CAM lookups
key - Key bits (left aligned for smaller than 4 bit keys)
klen - Key length. (0=2 bits, 1=3 bits, 2=4 bits, 3=reserved)
L+ - 6th length bit for the A field length, allowing lengths up to 64 bits
B - B key field address. The address for the B field of the key, if used.
A len - low 5 bits of the A field length.
Any length 1-64 bits may be specified. Lengths that are not multiples of 8 will be padded up to a multiple of 8 bits. The length of the B field is based upon the A field.
A - byte offset in variable memory for the A field.
D - Destination address for the table index returned from the CAM. Also used as the base for any bytes moved from the extended information fields of a table entry.
A rel - Relative information for the B field. Indicates whether the B field is in variable memory or in the FIFO, and if it's in the FIFO, how it is offset.
MOVe memory TBD update
Operation:
[(fZ) | (vZ) | (rZ)] = [(fN) | (vN) | (rN)]
Assembler Syntax:
mov fN,fZ
or
mov fN,vZ
or
mov fN,rZ
or
mov vN,fZ
or
mov vN,vZ
or
mov vN,rZ
or
mov rN,fZ
or
mov rN,vZ
or
mov rN,rZ
Description: This instruction moves an arbitrary number (from 1 to 8) of bytes from the FIFO or variable space to another location in the FIFO or variable space. Its main purpose is for opening holes in the header of a frame for inserting VLAN or RIF information, or for removing data from the head of a frame. It can also be used to move a single variable 1 to 8 bytes in length between the FIFO and the variable ram. Moves to the registers can be bytes or bit lengths up to 8 bits, and specify an offset.
Instruction Format:
Instruction Fields:
L = length of operands in bits from 1 to 64. Note that this field includes an extended length bit in instruction bit 34
off = bit offset in FIFO for non byte aligned fields.
Zrel = Additional relative field for destination argument.
N = byte offset of LSB in FIFO or register number for argument 2
rel = adjust N for headers automatically or select variables or register as source
Z = variable ram target address (if zero, destination address is same as source 2).
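This description also states that no mov source or destination operand may cross two 32 bit boundaries, and that addresses are given as big-endian LSB addresses. The Python sketch below shows one way to check that rule. It assumes (this reading is an interpretation, not stated outright) that an operand with LSB address a and length n bytes occupies bytes a - n + 1 through a. The function names are hypothetical.

```python
WORD_BYTES = 4  # the FIFO and variable ram split on 32 bit word boundaries


def boundaries_crossed(lsb_addr, length):
    """Count the 32 bit boundaries spanned by one operand.

    lsb_addr -- byte address of the operand's least significant (last) byte
    length   -- operand length in bytes
    Assumption: the operand occupies bytes lsb_addr - length + 1 .. lsb_addr.
    """
    first = lsb_addr - length + 1
    if length < 1 or first < 0:
        raise ValueError("operand does not fit at this address")
    return lsb_addr // WORD_BYTES - first // WORD_BYTES


def max_move_bytes(src_lsb, dst_lsb, limit=8):
    """Largest length (1..limit bytes) for which neither operand crosses
    two 32 bit boundaries; 0 if no length is legal."""
    best = 0
    for length in range(1, limit + 1):
        if min(src_lsb, dst_lsb) - length + 1 < 0:
            break  # operand would start before address 0
        if (boundaries_crossed(src_lsb, length) < 2
                and boundaries_crossed(dst_lsb, length) < 2):
            best = length
    return best
```

Under this reading, a move from f9 to f5 is limited to 6 bytes, while a move from f11 to f7 can carry the full 8 bytes, which agrees with the comments in the VLAN insertion example of this description.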
OR
Operation:
[(fNs2) | (vNs2) | (gNs2)] <= [(fNs2) | (vNs2) | (gNs2)] OR (vMs1)
or
(vZd) <= [(fNs2) | (vNs2) | (gNs2)] OR (vMs1)
or
[(fNs2) | (vNs2) | (gNs2)] <= [(fNs2) | (vNs2) | (gNs2)] OR #
or
(vZd) <= [(fNs2) | (vNs2) | (gNs2)] OR #
Assembler Syntax:
or vM,fN<,vZ>
or
or vM,gN<,vZ>
or
or #,fN<,vZ>
or
or #,vN<,vZ>
or
or #,gN<,vZ>
Description: Source operand 1 from the variable ram or an immediate is ORed with source operand 2 from the FIFO ram, variable ram, or the registers. If the Z field is zero, the result is stored back into source 2. Otherwise the result is stored in variable ram at the address specified in the Z field. The source operand may be any length from 1 to 32 bits. Only one source operand may come from the variable ram. That is, vNs1 OR vNs2 is not supported. If the source 1 operand is a variable, an extra length bit is included allowing 64 bit logical operations.
Instruction Format:
Source 1 = variable
Source 1 = immediate data
Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
M = variable ram source address for argument 1
N = byte offset of LSB in FIFO, variable ram or register number for argument 2
rel = adjust N for headers automatically or select variables or register as source
Z = variable ram target address (if zero, destination address is same as source 2).
STore Immediate
Operation:
(fN) <= #
or
(vZ) <= #
or
(gZ) <= #
Assembler Syntax:
sti #,fN
or
sti #,vZ
or
sti #,gZ
Description: The immediate operand given in the instruction word is stored in either the FIFO, variable space or registers. Alternately the operand can be stored into both the FIFO and variable space at independent locations. As with similar instructions, if the Z field is zero, no variable is written. Thus to write variable number 0 the N field must be zero and the rel field set to select variable space.
Instruction Format:
Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32. NOTE: For stores to the variable ram, the length will be rounded up to the nearest supported size and the data zero extended.
off = bit offset in FIFO for non byte aligned fields.
N = byte offset of LSB in FIFO.
rel = adjust N for headers automatically.
SUBtract
Operation:
[(fNs2) | (vNs2) | (gNs2)] <= [(fNs2) | (vNs2) | (gNs2)] - (vMs1)
or
(vZd) <= [(fNs2) | (vNs2) | (gNs2)] - (vMs1)
or
[(fNs2) | (vNs2) | (gNs2)] <= [(fNs2) | (vNs2) | (gNs2)] - #
or
(vZd) <= [(fNs2) | (vNs2) | (gNs2)] - #
Assembler Syntax:
sub vM,fN<,vZ>
or
sub vM,gN<,vZ>
or
sub #,fN<,vZ>
or
sub #,vN<,vZ>
or
sub #,gN<,vZ>
Description: Source operand 1 from the variable ram or an immediate is subtracted from source operand 2 from the FIFO ram, variable ram, or the registers. If the Z field is zero, the result is stored back into source 2. Otherwise the result is stored in variable ram at the address specified in the Z field. The source operand may be any length from 1 to 32 bits. Only one source operand may come from the variable ram. That is, vNs2 - vNs1 is not supported. If the source 1 operand is a variable, an extra length bit is included allowing 64 bit logical operations.
Instruction Format:
Source 1 = variable
Source 1 = immediate data
Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
M = variable ram source address for argument 1
N = byte offset of LSB in FIFO, variable ram or register number for argument 2
rel = adjust N for headers automatically or select variables or register as source
Z = variable ram target address (if zero, destination address is same as source 2).
SUBtract, Jump if Equal
Operation:
[(fNs2) | (vNs2) | (gNs2)] <= [(fNs2) | (vNs2) | (gNs2)] - (vMs1); if zero PC <= new_PC
or
[(fNs2) | (vNs2) | (gNs2)] <= [(fNs2) | (vNs2) | (gNs2)] - #; if zero PC <= new_PC
Assembler Syntax:
subje vM,fN,#new_PC
or
subje vM,gN,#new_PC
or
subje #,fN,#new_PC
or
subje #,vN,#new_PC
or
subje #,gN,#new_PC
Description: Source operand 1 from the variable ram or an immediate is subtracted from source operand 2 from the FIFO ram, variable ram, or the registers. The result is stored back into operand 2. If the result is zero the PC is replaced with the new_PC field of the instruction. Otherwise execution continues with the next instruction. All jumps are relative, with a range of −128 to 127 instructions from the current pc. The source operand may be any length from 1 to 32 bits. Only one source operand may come from the variable ram. That is, vNs2 - vNs1 is not supported. If the source 1 operand is a variable, an extra length bit is included allowing 64 bit logical operations.
Instruction Format:
Source 1 = variable
Source 1 = immediate data
Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
M = variable ram source address for argument 1
N = byte offset of LSB in FIFO, variable ram or register number for argument 2
rel = adjust N for headers automatically or select variables or register as source
Z = variable ram target address (if zero, destination address is same as source 2).
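The subje behaviour described above (subtract, write the result back into operand 2, and take a relative jump only when the result is zero) can be sketched in Python as follows. Two points are assumptions of this sketch rather than statements from the description: the result is taken to wrap modulo 2^width, and the jump offset is taken to be added directly to the current pc. The function name and its parameters are hypothetical.

```python
def subje(operand2, operand1, pc, offset, width=32):
    """Model one subtract-and-jump-if-equal step.

    operand2 -- source operand 2 (FIFO, variable ram, or register value)
    operand1 -- source operand 1 (variable ram value or immediate)
    pc       -- current program counter, counted in instructions
    offset   -- relative jump distance, -128..127 instructions
    width    -- operand length in bits (assumed to wrap at this width)
    Returns (new_operand2, new_pc).
    """
    if not -128 <= offset <= 127:
        raise ValueError("jump offset out of range")

    # The difference always replaces operand 2, whether or not the jump is taken.
    result = (operand2 - operand1) & ((1 << width) - 1)

    # Jump only when the operands were equal; otherwise fall through.
    new_pc = pc + offset if result == 0 else pc + 1
    return result, new_pc
```

The subjne variant would differ only in the jump condition, taking the branch when the result is non-zero.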
SUBtract, Jump if Not Equal
Operation:
[(fNs2) | (vNs2) | (gNs2)] <= [(fNs2) | (vNs2) | (gNs2)] - (vMs1); if !zero PC <= new_PC
or
[(fNs2) | (vNs2) | (gNs2)] <= [(fNs2) | (vNs2) | (gNs2)] - #; if !zero PC <= new_PC
Assembler Syntax:
subjne vM,fN,#new_PC
or
subjne vM,gN,#new_PC
or
subjne #,fN,#new_PC
or
subjne #,vN,#new_PC
or
subjne #,gN,#new_PC
Description: Source operand 1 from the variable ram or an immediate is subtracted from source operand 2 from the FIFO ram, variable ram, or the registers. The result is stored back into operand 2. If the result is non-zero the PC is replaced with the new_PC field of the instruction. Otherwise execution continues with the next instruction. All jumps are relative, with a range of −128 to 127 instructions from the current pc. The source operand may be any length from 1 to 32 bits. Only one source operand may come from the variable ram. That is, vNs2 - vNs1 is not supported. If the source 1 operand is a variable, an extra length bit is included allowing 64 bit logical operations.
Instruction Format:
Source 1 = variable
Source 1 = immediate data
Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
M = variable ram source address for argument 1
N = byte offset of LSB in FIFO, variable ram or register number for argument 2
rel = adjust N for headers automatically or select variables or register as source
Z = variable ram target address (if zero, destination address is same as source 2).
WAIT
Operation:
PC <= PC if FIFO count not received yet, else PC <= PC + 1
if EOF received before data, PC <= JMP_EOF
Assembler Syntax:
wait #
or
wait fN
Description: Program execution is suspended if the data count has not yet been received. Otherwise program execution continues with the next instruction. If the frame ends before the requested byte is received, this instruction jumps to the location specified in the JMP_EOF register.
Instruction Format:
Instruction Fields:
reserved = don't care
#high = high 8 bits of count
#low = low 8 bits of count
N = address in FIFO to wait for
XOR
Operation:
[(fNs2) | (vNs2) | (gNs2)] <= [(fNs2) | (vNs2) | (gNs2)] XOR (vMs1)
or
(vZd) <= [(fNs2) | (vNs2) | (gNs2)] XOR (vMs1)
or
[(fNs2) | (vNs2) | (gNs2)] <= [(fNs2) | (vNs2) | (gNs2)] XOR #
or
(vZd) <= [(fNs2) | (vNs2) | (gNs2)] XOR #
Assembler Syntax:
xor vM,fN<,vZ>
or
xor vM,gN<,vZ>
or
xor #,fN<,vZ>
or
xor #,vN<,vZ>
or
xor #,gN<,vZ>
Description: Source operand 1 from the variable ram or an immediate is XORed with source operand 2 from the FIFO ram, variable ram, or the registers. If the Z field is zero, the result is stored back into source 2. Otherwise the result is stored in variable ram at the address specified in the Z field. The source operand may be any length from 1 to 32 bits. Only one source operand may come from the variable ram. That is, vNs1 XOR vNs2 is not supported. If the source 1 operand is a variable, an extra length bit is included allowing 64 bit logical operations.
Instruction Format:
Source 1 = variable
Source 1 = immediate data
Instruction Fields:
# = immediate value right justified.
L = length of operands in bits from 1 to 32.
off = bit offset in FIFO for non byte aligned fields.
M = variable ram source address for argument 1
N = byte offset of LSB in FIFO, variable ram or register number for argument 2
rel = adjust N for headers automatically or select variables or register as source
Z = variable ram target address (if zero, destination address is same as source 2).
Using the MOV Instruction
The mov instruction is intended to be used to open holes in a frame for inserting VLAN tags or RIF fields, or to close holes in a frame for the reverse transformations. The instruction is executed as an OR with the value # 0.
The only difference is that the mov instruction allows 64 bit lengths and the destination may be either a different FIFO address (normal instructions only write to the FIFO at the same address as the source operand) or a variable address. The mov instruction is also limited to moving whole bytes. It does not support arbitrary bit alignment.
Because the FIFO and variable ram support split addressing on word boundaries only, there are restrictions on the mov instruction's ability to arbitrarily open and close holes. Specifically, no source or destination operand may cross two 32 bit boundaries. Thus the amount of data that can be moved in a single instruction is limited by:
Min(source_limit, dest_limit)
The chart below shows what combinations of length and starting address cross two 32 bit boundaries. Addresses are given as big-endian LSB addresses (like all instructions):
Another consideration for the mov instruction is relative FIFO addresses. The actual byte address is unknown at compile time, restricting the maximum move length to 5 bytes. In reality they probably pose another problem in that the filter processor as well as the mov instruction are not optimal for moving large amounts of data.
Example 1: Opening hole for VLAN insertion.
Original FIFO contents:
Contents after move:
Instruction Sequence:
mov f3.L32,v60 ; move AC,FC,DA0 and DA1 into tail of variable ram
mov f9.L48,f5 ; move DA2-DA5, SA0,SA1 to base of FIFO
; this move limited to 6 bytes because of address
mov f13.L32,f9 ; move SA2-SA5
or
mov f3.L32,v60 ; move AC,FC,DA0 and DA1 into tail of variables.
mov f11.L8,f7 ; move DA2-DA5, SA0-SA3 into base of FIFO
mov f13.L2,f9 ; move SA4-SA5
Example 2: Closing hole for VLAN extraction.
Original FIFO contents:
Contents after move:
Instruction Sequence:
mov r13.l48,r17 ; move SA0-SA5 up over VLAN (butt up to data)
; this move limited to 6 bytes because of address
mov r7.l64,r11 ; move AC,FC,DA0-DA5 in one shot
Example Instructions for Token-Ring Switching
TBD Needs to be updated
The following instructions assume the variables are laid out as:
; define locations in registers
#define EOFREG r0x0d
#define DESTRING r0x09.l16
#define SCANDONE r0x01.3
#define LASTRING r0x07.2
#define RINGHIT r0x07.1
; define locations in frame data
#define mac_fc f0x01 ; FC field
#define fc_type f0x01.l2.a6 ; frame type field in FC
#define mac_da0 f0x02 ; Destination address MSB
#define gcast_type f0x02.l1.a7 ; Groupcast bit in DA
#define mac_da5 f0x07 ; last byte of DA
#define mac_sa0 f0x08 ; Source Address MSB
#define rif_type f0x08.l1.a7 ; RII bit in SA
#define mac_sa5 f0x0d ; last byte of SA
#define mac_mvec f0x0f ; major vector for MAC frames
#define mac_rc_exp f0xe.l1.a7 ; explorer bit in RIF control word
#define mac_rc_sre f0xe.l1.a6 ; single route explore bit in RIF control
#define mac_rc_len f0xe.l5 ; length field in RIF control
#define mac_rc_odd f0xe.l1 ; check of odd length RIF
; define locations in variable ram
#define dest_mask v0x7
#define flags v0x10
#define mac_flag v0x10.3
#define gcast_flag v0x10.2
#define rif_flag v0x10.1
#define cam_da v0x15
#define cam_sa v0x19
#define cam_dring v0x1b
#define bridge_grp v0x1d
#define dring_copy v0x1f
; control bits in variable ram for forwarding
#define KILL_RIF v0x11.l1.a7 ; reject frames with a RIF
#define BLOCKED v0x11.l1.a6 ; spanning tree blocked state
#define KILL_NORIF v0x11.l1.a5 ; reject frames without a RIF
#define BLOCKorNORIF v0x11.l2.a5 ; includes both of above bits
#define ONLYINVRING v0x11.l1.a4 ; only port in VRING
#define ANYCPU v0x4.l4.a3 ; CPUs four queue bits in dest mask
; constants of interest
#define ISMAC 0b00 ; frame type in FC
#define ISGCAST 0b1 ; DA bit 47 is groupcast indicator
#define ISRIF 0b1 ; SA bit 47 is RIF indicator
#define DAKEY 0b0000 ; use this key and mask for DA/SA lookups
#define DAMASK 0b0000
#define RINGKEY 0b0001 ; use this key and mask for ring lookups
#define RINGMASK 0b0001
#define TRUE 0b11 ; mboolean true
#define FALSE 0b10 ; mboolean false
#define EQUAL 0b11 ; mboolean equal
#define GT_E 0b01 ; mboolean greater than or equal
#define LT 0b00 ; mboolean less than
#define UNKNOWN_SA 0x10000000 ; unknown SA queue in destination mask
#define CPU_QUEUE 0x08000000 ; general CPU queue
#define UNKNOWN_DA 0x20000000 ; unknown DA queue in destination mask
#define MAC_QUEUE 0x40000000 ; MAC frame for CPU
#define BPDU 0x1234 ; equal to where software puts BPDU in cam
#define MCP 0x5678 ; equal to where software puts MCP in cam
; Source code for basic switching
; Note, at execution start the reject flag is clear meaning the frame is
; to be accepted. It will be set as soon as processing determines the
; frame is to be rejected or left alone.
start:
    sti reject,EOFREG ; early EOF cause frame reject
    ; next pullout MAC, and GROUPCAST indicators into flags
    ces ISMAC, fc_type, mac_flag
    ces ISGCAST, gcast_type, gcast_flag
    ; next lookup DA together with bridge group and load as default dest
    lkld DAKEY, DAMASK, bridge_grp, mac_da5.l48, cam_da
    ces ISRIF, rif_type, rif_flag ; pull out RIF indicator
    ; next lookup SA together with bridge group for learning
    look DAKEY, DAMASK, bridge_grp, mac_sa5.l48, cam_sa
    cje TRUE, mac_flag.l2, domac ; if MAC frame jump to mac processing
    ; NOTE next two instructions can be combined if software always
    ; places BPDU address and MCP address together in CAM where only
    ; A0 changes between the two.
    cje BPDU, cam_da.l16, halt ; if BPDU accept frame and done
    cje MCP, cam_da.l16, halt ; if destined to MCP done
    cje FALSE, rif_flag.l2, switchda ; if no RIF, switch by DA
    cje 1, KILL_RIF, reject ; if don't want RIF frames reject
    ; fall through from above into source route processing
dosrcroute:
    cje TRUE, mac_rc_odd, reject ; do length checks on RIF field
    cje 0, mac_rc_len, reject
    cje 4, mac_rc_len, reject
waitscan:
    cje FALSE, SCANDONE, waitscan ; wait till RIF scanning finished
    or 0, DESTRING, dring_copy ; move destring into variables
    cje TRUE, mac_rc_exp, doexplore ; if ARE or SRE jump
    cje FALSE, RINGHIT, reject ; reject if switch not in path
    ; next replace destination mask and BPIDs with destination ring lookup
    lkld RINGKEY, RINGMASK, bridge_grp, dring_copy.l16, cam_dring
    cjne 0, cam_dring, docommon ; if ring known, jump
    cje 1, ONLYINVRING, reject ; else if only in ring reject
    jmp docommon
    ; send all explorer frames to CPU
doexplore:
    sti CPU_QUEUE, dest_mask
    halt
    ; switch by DA processing starts here.
switchda:
    ; if block bit is set or must have RIF, reject
    cjne 0, BLOCKorNORIF, reject
docommon:
    sti halt, EOFREG ; EOF now causes frame to go w/ last status
    ; $$$$$ insert user filters here
    ; as last check before halting, look if SA is unknown and CPU
    ; is not getting a copy of the frame. If so, send a copy to the
    ; unknown SA queue.
    cjne 0, ANYCPU, halt ; if CPU already has a copy exit
    cjne 0, cam_sa.l16, halt ; if SA was known (non-zero) exit
    or UNKNOWN_SA, dest_mask ; fall into halt
    ; can jump here from many places. Whenever processing is deemed complete and
    ; the reject/accept decision is not to be changed, jump here.
halt:
    halt 0
    ; can jump here from many places. Whenever processing is deemed complete and
    ; the frame is to be rejected, jump here
reject:
    halt 1
domac:
Claims (28)
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08/916,487 US6377998B2 (en) | 1997-08-22 | 1997-08-22 | Method and apparatus for performing frame processing for a network |
PCT/US1998/017299 WO1999010804A1 (en) | 1997-08-22 | 1998-08-21 | Method and apparatus for performing frame processing for a network |
EP98943279A EP1044406A4 (en) | 1997-08-22 | 1998-08-21 | Method and apparatus for performing frame processing for a network |
CA002301568A CA2301568C (en) | 1997-08-22 | 1998-08-21 | Method and apparatus for performing frame processing for a network |
AU91108/98A AU9110898A (en) | 1997-08-22 | 1998-08-21 | Method and apparatus for performing frame processing for a network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08/916,487 US6377998B2 (en) | 1997-08-22 | 1997-08-22 | Method and apparatus for performing frame processing for a network |
Publications (2)
Publication Number | Publication Date |
---|---|
US20020010793A1 (en) | 2002-01-24 |
US6377998B2 (en) | 2002-04-23 |
Family
ID=25437355
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US08/916,487 Expired - Lifetime US6377998B2 (en) | 1997-08-22 | 1997-08-22 | Method and apparatus for performing frame processing for a network |
Country Status (5)
Country | Link |
---|---|
US (1) | US6377998B2 (en) |
EP (1) | EP1044406A4 (en) |
AU (1) | AU9110898A (en) |
CA (1) | CA2301568C (en) |
WO (1) | WO1999010804A1 (en) |
Cited By (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020041595A1 (en) * | 2000-10-05 | 2002-04-11 | Marc Delvaux | System and method for suspending and resuming transmission of information without creating significant additional overhead |
US6449214B1 (en) * | 2000-11-28 | 2002-09-10 | Silicon Access Networks | Statistics counter overflow content addressable memory (CAM) and method |
US20020167910A1 (en) * | 2001-05-11 | 2002-11-14 | Gammenthaler Robert S. | Context switching system and method for implementing a high speed link (HSL) in a network element |
US20020172198A1 (en) * | 2001-02-22 | 2002-11-21 | Kovacevic Branko D. | Method and system for high speed data retention |
US6570877B1 (en) * | 1999-04-07 | 2003-05-27 | Cisco Technology, Inc. | Search engine for forwarding table content addressable memory |
US20030152078A1 (en) * | 1998-08-07 | 2003-08-14 | Henderson Alex E. | Services processor having a packet editing unit |
US6704794B1 (en) | 2000-03-03 | 2004-03-09 | Nokia Intelligent Edge Routers Inc. | Cell reassembly for packet based networks |
US6757249B1 (en) * | 1999-10-14 | 2004-06-29 | Nokia Inc. | Method and apparatus for output rate regulation and control associated with a packet pipeline |
US6788701B1 (en) * | 1999-05-14 | 2004-09-07 | Dunti Llc | Communication network having modular switches that enhance data throughput |
US6816924B2 (en) * | 2000-08-10 | 2004-11-09 | Infineon Technologies North America Corp. | System and method for tracing ATM cells and deriving trigger signals |
US20050044252A1 (en) * | 2002-12-19 | 2005-02-24 | Floyd Geoffrey E. | Packet classifier |
US6882642B1 (en) | 1999-10-14 | 2005-04-19 | Nokia, Inc. | Method and apparatus for input rate regulation associated with a packet processing pipeline |
US20050144339A1 (en) * | 2003-12-24 | 2005-06-30 | Wagh Mahesh U. | Speculative processing of transaction layer packets |
EP1551130A1 (en) * | 2003-12-31 | 2005-07-06 | Alcatel | Parallel data link layer controllers providing statistics acquisition in a network switching device |
US6934250B1 (en) | 1999-10-14 | 2005-08-23 | Nokia, Inc. | Method and apparatus for an output packet organizer |
US6988238B1 (en) | 2000-01-24 | 2006-01-17 | Ati Technologies, Inc. | Method and system for handling errors and a system for receiving packet stream data |
US6990101B1 (en) * | 2001-03-23 | 2006-01-24 | Advanced Micro Devices, Inc. | System and method for performing layer 3 switching in a network device |
US20060029038A1 (en) * | 2000-06-23 | 2006-02-09 | Cloudshield Technologies, Inc. | System and method for processing packets using location and content addressable memories |
US7031297B1 (en) * | 2000-06-15 | 2006-04-18 | Avaya Communication Israel Ltd. | Policy enforcement switching |
US7133400B1 (en) * | 1998-08-07 | 2006-11-07 | Intel Corporation | System and method for filtering data |
US20080288663A1 (en) * | 2000-01-24 | 2008-11-20 | Ati Technologies, Inc. | Method and system for handling errors |
US20090182798A1 (en) * | 2008-01-11 | 2009-07-16 | Mediatek Inc. | Method and apparatus to improve the effectiveness of system logging |
US7629982B1 (en) * | 2005-04-12 | 2009-12-08 | Nvidia Corporation | Optimized alpha blend for anti-aliased render |
US20090316588A1 (en) * | 2006-06-30 | 2009-12-24 | Mitsubishi Electric Corporation | Communication node, and ring configuration method and ring establishment method in communication system |
US7778259B1 (en) | 1999-05-14 | 2010-08-17 | Dunti Llc | Network packet transmission mechanism |
US20120243405A1 (en) * | 2011-03-23 | 2012-09-27 | Marc Holness | Systems and methods for scaling performance of ethernet ring protection protocol |
US8458453B1 (en) | 2004-06-11 | 2013-06-04 | Dunti Llc | Method and apparatus for securing communication over public network |
US20140019729A1 (en) * | 2012-07-10 | 2014-01-16 | Maxeler Technologies, Ltd. | Method for Processing Data Sets, a Pipelined Stream Processor for Processing Data Sets, and a Computer Program for Programming a Pipelined Stream Processor |
US20140064271A1 (en) * | 2012-08-29 | 2014-03-06 | Marvell World Trade Ltd. | Semaphore soft and hard hybrid architecture |
US8683572B1 (en) | 2008-01-24 | 2014-03-25 | Dunti Llc | Method and apparatus for providing continuous user verification in a packet-based network |
US9043792B1 (en) * | 2004-11-17 | 2015-05-26 | Vmware, Inc. | Virtual local area network (vlan) coordinator providing access to vlans |
US9413660B1 (en) * | 2008-09-30 | 2016-08-09 | Juniper Networks, Inc. | Methods and apparatus to implement except condition during data packet classification |
US20160364474A1 (en) * | 2015-06-15 | 2016-12-15 | Ca, Inc. | Identifying Data Offsets Using Binary Masks |
US10768958B2 (en) | 2004-11-17 | 2020-09-08 | Vmware, Inc. | Using virtual local area networks in a virtual computer system |
Families Citing this family (41)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20010039564A1 (en) * | 1997-10-20 | 2001-11-08 | Victor Hahn | Log on personal computer |
US6496869B1 (en) * | 1998-03-26 | 2002-12-17 | National Semiconductor Corporation | Receiving data on a networked computer in a reduced power state |
US6658002B1 (en) | 1998-06-30 | 2003-12-02 | Cisco Technology, Inc. | Logical operation unit for packet processing |
KR20000039742A (en) * | 1998-12-15 | 2000-07-05 | 서평원 | Method for monitoring atm cell bus |
US6260082B1 (en) * | 1998-12-23 | 2001-07-10 | Bops, Inc. | Methods and apparatus for providing data transfer control |
US6546010B1 (en) * | 1999-02-04 | 2003-04-08 | Advanced Micro Devices, Inc. | Bandwidth efficiency in cascaded scheme |
US6581102B1 (en) * | 1999-05-27 | 2003-06-17 | International Business Machines Corporation | System and method for integrating arbitrary isochronous processing algorithms in general media processing systems |
US6591304B1 (en) * | 1999-06-21 | 2003-07-08 | Cisco Technology, Inc. | Dynamic, scaleable attribute filtering in a multi-protocol compatible network access environment |
US6983350B1 (en) | 1999-08-31 | 2006-01-03 | Intel Corporation | SDRAM controller for parallel processor architecture |
US6651107B1 (en) * | 1999-09-21 | 2003-11-18 | Intel Corporation | Reduced hardware network adapter and communication |
US6532509B1 (en) | 1999-12-22 | 2003-03-11 | Intel Corporation | Arbitrating command requests in a parallel multi-threaded processing system |
US6694380B1 (en) | 1999-12-27 | 2004-02-17 | Intel Corporation | Mapping requests from a processing unit that uses memory-mapped input-output space |
US6661794B1 (en) | 1999-12-29 | 2003-12-09 | Intel Corporation | Method and apparatus for gigabit packet assignment for multithreaded packet processing |
US6584522B1 (en) * | 1999-12-30 | 2003-06-24 | Intel Corporation | Communication between processors |
US7293103B1 (en) * | 2001-02-20 | 2007-11-06 | At&T Corporation | Enhanced channel access mechanisms for a HPNA network |
US6606681B1 (en) * | 2001-02-23 | 2003-08-12 | Cisco Systems, Inc. | Optimized content addressable memory (CAM) |
US7058087B1 (en) * | 2001-05-29 | 2006-06-06 | Bigband Networks, Inc. | Method and system for prioritized bit rate conversion |
EP1267568A1 (en) * | 2001-06-11 | 2002-12-18 | STMicroelectronics Limited | A method and circuitry for processing data |
US7280549B2 (en) | 2001-07-09 | 2007-10-09 | Micron Technology, Inc. | High speed ring/bus |
US6961808B1 (en) | 2002-01-08 | 2005-11-01 | Cisco Technology, Inc. | Method and apparatus for implementing and using multiple virtual portions of physical associative memories |
US20030217182A1 (en) * | 2002-05-15 | 2003-11-20 | Xiaodong Liu | Interface architecture |
US7136400B2 (en) * | 2002-06-21 | 2006-11-14 | International Business Machines Corporation | Method and apparatus for multiplexing multiple protocol handlers on a shared memory bus |
US20040006724A1 (en) * | 2002-07-05 | 2004-01-08 | Intel Corporation | Network processor performance monitoring system and method |
US7562156B2 (en) * | 2002-08-16 | 2009-07-14 | Symantec Operating Corporation | System and method for decoding communications between nodes of a cluster server |
US7454532B1 (en) * | 2003-04-08 | 2008-11-18 | Telairity Semiconductor, Inc. | Stream data interface for processing system |
US20050060420A1 (en) * | 2003-09-11 | 2005-03-17 | Kovacevic Branko D. | System for decoding multimedia data and method thereof |
US7336673B2 (en) * | 2003-10-17 | 2008-02-26 | Agilent Technologies, Inc. | Creating a low bandwidth channel within a high bandwidth packet stream |
US20050190982A1 (en) * | 2003-11-28 | 2005-09-01 | Matsushita Electric Industrial Co., Ltd. | Image reducing device and image reducing method |
US20060101152A1 (en) * | 2004-10-25 | 2006-05-11 | Integrated Device Technology, Inc. | Statistics engine |
WO2006079901A1 (en) * | 2005-01-26 | 2006-08-03 | Nokia Corporation | Method, apparatus and computer program product providing device identification via configurable ring/multi-drop bus architecture |
US7804832B2 (en) * | 2006-02-13 | 2010-09-28 | Cisco Technology, Inc. | Method and system for simplified network wide traffic and/or flow monitoring in a data network |
WO2008154556A1 (en) * | 2007-06-11 | 2008-12-18 | Blade Network Technologies, Inc. | Sequential frame forwarding |
US9667442B2 (en) * | 2007-06-11 | 2017-05-30 | International Business Machines Corporation | Tag-based interface between a switching device and servers for use in frame processing and forwarding |
EP2045937B1 (en) * | 2007-10-04 | 2019-06-19 | Microchip Technology Germany GmbH | System and method for real time synchronization through a communication system |
US8867341B2 (en) * | 2007-11-09 | 2014-10-21 | International Business Machines Corporation | Traffic management of client traffic at ingress location of a data center |
US8553537B2 (en) * | 2007-11-09 | 2013-10-08 | International Business Machines Corporation | Session-less load balancing of client traffic across servers in a server group |
US8072882B1 (en) * | 2009-01-23 | 2011-12-06 | Tellabs San Jose, Inc. | Method and apparatus for a graceful flow control mechanism in a TDM-based packet processing architecture |
EP2288058B1 (en) * | 2009-08-21 | 2012-05-23 | SMSC Europe GmbH | System and method for detection of multiple timing masters in a network |
US8817621B2 (en) | 2010-07-06 | 2014-08-26 | Nicira, Inc. | Network virtualization apparatus |
US9680750B2 (en) | 2010-07-06 | 2017-06-13 | Nicira, Inc. | Use of tunnels to hide network addresses |
US9426067B2 (en) | 2012-06-12 | 2016-08-23 | International Business Machines Corporation | Integrated switch for dynamic orchestration of traffic |
Family Cites Families (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0436194A3 (en) | 1990-01-02 | 1992-12-16 | National Semiconductor Corporation | Media access controller |
US5136580A (en) * | 1990-05-16 | 1992-08-04 | Microcom Systems, Inc. | Apparatus and method for learning and filtering destination and source addresses in a local area network system |
US5197064A (en) | 1990-11-26 | 1993-03-23 | Bell Communications Research, Inc. | Distributed modular packet switch employing recursive partitioning |
US5524250A (en) * | 1991-08-23 | 1996-06-04 | Silicon Graphics, Inc. | Central processing unit for processing a plurality of threads using dedicated general purpose registers and masque register for providing access to the registers |
US5305321A (en) | 1992-02-24 | 1994-04-19 | Advanced Micro Devices | Ethernet media access controller with external address detection interface and associated method |
US5343471A (en) * | 1992-05-11 | 1994-08-30 | Hughes Aircraft Company | Address filter for a transparent bridge interconnecting local area networks |
US5490252A (en) | 1992-09-30 | 1996-02-06 | Bay Networks Group, Inc. | System having central processor for transmitting generic packets to another processor to be altered and transmitting altered packets back to central processor for routing |
US5491531A (en) | 1993-04-28 | 1996-02-13 | Allen-Bradley Company, Inc. | Media access controller with a shared class message delivery capability |
IL109601A (en) * | 1994-05-09 | 1996-05-14 | Audiogard International Ltd | Device for the verification of an alarm |
US5566178A (en) | 1994-12-22 | 1996-10-15 | International Business Machines Corporation | Method and system for improving the performance of a token ring network |
US5787252A (en) * | 1995-11-01 | 1998-07-28 | Hewlett-Packard Company | Filtering system and method for high performance network management map |
US5724358A (en) * | 1996-02-23 | 1998-03-03 | Zeitnet, Inc. | High speed packet-switched digital switch and method |
US5949974A (en) | 1996-07-23 | 1999-09-07 | Ewing; Carrell W. | System for reading the status and for controlling the power supplies of appliances connected to computer networks |
US5802054A (en) * | 1996-08-15 | 1998-09-01 | 3Com Corporation | Atomic network switch with integrated circuit switch nodes |
US5862338A (en) | 1996-12-30 | 1999-01-19 | Compaq Computer Corporation | Polling system that determines the status of network ports and that stores values indicative thereof |
US5909564A (en) | 1997-03-27 | 1999-06-01 | Pmc-Sierra Ltd. | Multi-port ethernet frame switch |
US5970069A (en) * | 1997-04-21 | 1999-10-19 | Lsi Logic Corporation | Single chip remote access processor |
1997
- 1997-08-22: US application US08/916,487 (US6377998B2), not active, Expired - Lifetime
1998
- 1998-08-21: WO application PCT/US1998/017299 (WO1999010804A1), not active, Application Discontinuation
- 1998-08-21: EP application EP98943279A (EP1044406A4), not active, Withdrawn
- 1998-08-21: AU application AU91108/98A (AU9110898A), not active, Abandoned
- 1998-08-21: CA application CA002301568A (CA2301568C), not active, Expired - Fee Related
Cited By (48)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7333484B2 (en) | 1998-08-07 | 2008-02-19 | Intel Corporation | Services processor having a packet editing unit |
US7133400B1 (en) * | 1998-08-07 | 2006-11-07 | Intel Corporation | System and method for filtering data |
US20030152078A1 (en) * | 1998-08-07 | 2003-08-14 | Henderson Alex E. | Services processor having a packet editing unit |
US6570877B1 (en) * | 1999-04-07 | 2003-05-27 | Cisco Technology, Inc. | Search engine for forwarding table content addressable memory |
US6788701B1 (en) * | 1999-05-14 | 2004-09-07 | Dunti Llc | Communication network having modular switches that enhance data throughput |
US7778259B1 (en) | 1999-05-14 | 2010-08-17 | Dunti Llc | Network packet transmission mechanism |
US6882642B1 (en) | 1999-10-14 | 2005-04-19 | Nokia, Inc. | Method and apparatus for input rate regulation associated with a packet processing pipeline |
US6934250B1 (en) | 1999-10-14 | 2005-08-23 | Nokia, Inc. | Method and apparatus for an output packet organizer |
US6757249B1 (en) * | 1999-10-14 | 2004-06-29 | Nokia Inc. | Method and apparatus for output rate regulation and control associated with a packet pipeline |
US20080288663A1 (en) * | 2000-01-24 | 2008-11-20 | Ati Technologies, Inc. | Method and system for handling errors |
US6988238B1 (en) | 2000-01-24 | 2006-01-17 | Ati Technologies, Inc. | Method and system for handling errors and a system for receiving packet stream data |
US6704794B1 (en) | 2000-03-03 | 2004-03-09 | Nokia Intelligent Edge Routers Inc. | Cell reassembly for packet based networks |
US7031297B1 (en) * | 2000-06-15 | 2006-04-18 | Avaya Communication Israel Ltd. | Policy enforcement switching |
US7330908B2 (en) * | 2000-06-23 | 2008-02-12 | Cloudshield Technologies, Inc. | System and method for processing packets using location and content addressable memories |
US20060029038A1 (en) * | 2000-06-23 | 2006-02-09 | Cloudshield Technologies, Inc. | System and method for processing packets using location and content addressable memories |
US6816924B2 (en) * | 2000-08-10 | 2004-11-09 | Infineon Technologies North America Corp. | System and method for tracing ATM cells and deriving trigger signals |
US20020041595A1 (en) * | 2000-10-05 | 2002-04-11 | Marc Delvaux | System and method for suspending and resuming transmission of information without creating significant additional overhead |
US6449214B1 (en) * | 2000-11-28 | 2002-09-10 | Silicon Access Networks | Statistics counter overflow content addressable memory (CAM) and method |
US6807585B2 (en) * | 2001-02-22 | 2004-10-19 | Ati Technologies, Inc. | Method and system for parsing section data |
US20020172198A1 (en) * | 2001-02-22 | 2002-11-21 | Kovacevic Branko D. | Method and system for high speed data retention |
US6990101B1 (en) * | 2001-03-23 | 2006-01-24 | Advanced Micro Devices, Inc. | System and method for performing layer 3 switching in a network device |
US6934302B2 (en) * | 2001-05-11 | 2005-08-23 | Alcatel | Context switching system and method for implementing a high speed link (HSL) in a network element |
US20020167910A1 (en) * | 2001-05-11 | 2002-11-14 | Gammenthaler Robert S. | Context switching system and method for implementing a high speed link (HSL) in a network element |
US20050044252A1 (en) * | 2002-12-19 | 2005-02-24 | Floyd Geoffrey E. | Packet classifier |
US20050144339A1 (en) * | 2003-12-24 | 2005-06-30 | Wagh Mahesh U. | Speculative processing of transaction layer packets |
US20050198258A1 (en) * | 2003-12-31 | 2005-09-08 | Anees Narsinh | Parallel data link layer controllers in a network switching device |
EP1551130A1 (en) * | 2003-12-31 | 2005-07-06 | Alcatel | Parallel data link layer controllers providing statistics acquisition in a network switching device |
US8458453B1 (en) | 2004-06-11 | 2013-06-04 | Dunti Llc | Method and apparatus for securing communication over public network |
US11893406B2 (en) | 2004-11-17 | 2024-02-06 | Vmware, Inc. | Using virtual local area networks in a virtual computer system |
US10768958B2 (en) | 2004-11-17 | 2020-09-08 | Vmware, Inc. | Using virtual local area networks in a virtual computer system |
US9043792B1 (en) * | 2004-11-17 | 2015-05-26 | Vmware, Inc. | Virtual local area network (vlan) coordinator providing access to vlans |
US7629982B1 (en) * | 2005-04-12 | 2009-12-08 | Nvidia Corporation | Optimized alpha blend for anti-aliased render |
US7852779B2 (en) * | 2006-06-30 | 2010-12-14 | Mitsubishi Electric Corporation | Communication node, and ring configuration method and ring establishment method in communication system |
US7983177B2 (en) * | 2006-06-30 | 2011-07-19 | Mitsubishi Electric Corporation | Communication node, and ring configuration method and ring establishment method in communication system |
US20090316588A1 (en) * | 2006-06-30 | 2009-12-24 | Mitsubishi Electric Corporation | Communication node, and ring configuration method and ring establishment method in communication system |
US20100165998A1 (en) * | 2006-06-30 | 2010-07-01 | Mitsubishi Electric Corporation | Communication node, and ring configuration method and ring establishment method in communication system |
US20090182798A1 (en) * | 2008-01-11 | 2009-07-16 | Mediatek Inc. | Method and apparatus to improve the effectiveness of system logging |
US8683572B1 (en) | 2008-01-24 | 2014-03-25 | Dunti Llc | Method and apparatus for providing continuous user verification in a packet-based network |
US9413660B1 (en) * | 2008-09-30 | 2016-08-09 | Juniper Networks, Inc. | Methods and apparatus to implement except condition during data packet classification |
US20120243405A1 (en) * | 2011-03-23 | 2012-09-27 | Marc Holness | Systems and methods for scaling performance of ethernet ring protection protocol |
US8509061B2 (en) * | 2011-03-23 | 2013-08-13 | Ciena Corporation | Systems and methods for scaling performance of Ethernet ring protection protocol |
US20140019729A1 (en) * | 2012-07-10 | 2014-01-16 | Maxeler Technologies, Ltd. | Method for Processing Data Sets, a Pipelined Stream Processor for Processing Data Sets, and a Computer Program for Programming a Pipelined Stream Processor |
US9514094B2 (en) * | 2012-07-10 | 2016-12-06 | Maxeler Technologies Ltd | Processing data sets using dedicated logic units to prevent data collision in a pipelined stream processor |
US9525621B2 (en) * | 2012-08-29 | 2016-12-20 | Marvell World Trade Ltd. | Semaphore soft and hard hybrid architecture |
US20140233582A1 (en) * | 2012-08-29 | 2014-08-21 | Marvell World Trade Ltd. | Semaphore soft and hard hybrid architecture |
US20140064271A1 (en) * | 2012-08-29 | 2014-03-06 | Marvell World Trade Ltd. | Semaphore soft and hard hybrid architecture |
US20160364474A1 (en) * | 2015-06-15 | 2016-12-15 | Ca, Inc. | Identifying Data Offsets Using Binary Masks |
US9830326B2 (en) * | 2015-06-15 | 2017-11-28 | Ca, Inc. | Identifying data offsets using binary masks |
Also Published As
Publication number | Publication date |
---|---|
AU9110898A (en) | 1999-03-16 |
WO1999010804A1 (en) | 1999-03-04 |
CA2301568C (en) | 2004-06-29 |
US6377998B2 (en) | 2002-04-23 |
EP1044406A1 (en) | 2000-10-18 |
CA2301568A1 (en) | 1999-03-04 |
EP1044406A4 (en) | 2002-06-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20020010793A1 (en) | | Method and apparatus for performing frame processing for a network |
US10425359B2 (en) | | Packet data traffic management apparatus |
US9912590B2 (en) | | In-line packet processing |
US20170237661A1 (en) | | Processing packets by a network device |
US7729351B2 (en) | | Pipelined packet switching and queuing architecture |
JP4264866B2 (en) | | Intelligent network interface device and system for speeding up communication |
EP1014626B1 (en) | | Method and apparatus for controlling network congestion |
US7283528B1 (en) | | On the fly header checksum processing using dedicated logic |
US7337253B2 (en) | | Method and system of routing network-based data using frame address notification |
US7236501B1 (en) | | Systems and methods for handling packet fragmentation |
EP1014648A2 (en) | | Method and network device for creating buffer structures in shared memory |
EP0990990A2 (en) | | Flow control in a fifo memory |
JP2002541732A (en) | | Automatic service adjustment detection method for bulk data transfer |
EP1014649A2 (en) | | Method and system of data transfer control |
US8824468B2 (en) | | System and method for parsing frames |
US20040246956A1 (en) | | Parallel packet receiving, routing and forwarding |
US7002979B1 (en) | | Voice data packet processing system |
US7519728B1 (en) | | Merge systems and methods for transmit systems interfaces |
Legal Events
Code | Title | Description |
---|---|---|
AS | Assignment | Owner name: TRESEQ, INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:NOLL, MICHAEL;CLARKE, MICHAEL;SMALLWOOD, MARK;REEL/FRAME:008999/0848 Effective date: 19980213 |
AS | Assignment | Owner name: NORTEL NETWORKS CORPORATION, CANADA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:TRESEQ, INC.;REEL/FRAME:010530/0499 Effective date: 19991228 |
AS | Assignment | Owner name: NORTEL NETWORKS LIMITED, CANADA Free format text: CHANGE OF NAME;ASSIGNOR:NORTEL NETWORKS CORPORATION;REEL/FRAME:011195/0706 Effective date: 20000830 |
STCF | Information on status: patent grant | Free format text: PATENTED CASE |
FPAY | Fee payment | Year of fee payment: 4 |
FPAY | Fee payment | Year of fee payment: 8 |
FEPP | Fee payment procedure | Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
AS | Assignment | Owner name: AVAYA HOLDINGS LIMITED, NEW JERSEY Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:NORTEL NETWORKS LIMITED;REEL/FRAME:023998/0799 Effective date: 20091218 |
FPAY | Fee payment | Year of fee payment: 12 |
AS | Assignment | Owner name: AVAYA MANAGEMENT L.P., DELAWARE Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:AVAYA HOLDINGS LIMITED;REEL/FRAME:048577/0492 Effective date: 20190211 |
AS | Assignment | Owner name: GOLDMAN SACHS BANK USA, NEW YORK Free format text: SECURITY INTEREST;ASSIGNOR:AVAYA MANAGEMENT L.P.;REEL/FRAME:048612/0598 Effective date: 20190315 Owner name: CITIBANK, N.A., NEW YORK Free format text: SECURITY INTEREST;ASSIGNOR:AVAYA MANAGEMENT L.P.;REEL/FRAME:048612/0582 Effective date: 20190315 |
AS | Assignment | Owner name: AVAYA MANAGEMENT L.P., NEW JERSEY Free format text: RELEASE OF SECURITY INTEREST IN PATENTS AT REEL 48612/FRAME 0582;ASSIGNOR:CITIBANK, N.A., AS COLLATERAL AGENT;REEL/FRAME:063456/0428 Effective date: 20230403 Owner name: AVAYA INC., NEW JERSEY Free format text: RELEASE OF SECURITY INTEREST IN PATENTS AT REEL 48612/FRAME 0582;ASSIGNOR:CITIBANK, N.A., AS COLLATERAL AGENT;REEL/FRAME:063456/0428 Effective date: 20230403 Owner name: AVAYA HOLDINGS CORP., NEW JERSEY Free format text: RELEASE OF SECURITY INTEREST IN PATENTS AT REEL 48612/FRAME 0582;ASSIGNOR:CITIBANK, N.A., AS COLLATERAL AGENT;REEL/FRAME:063456/0428 Effective date: 20230403 |
AS | Assignment | Owner name: AVAYA MANAGEMENT L.P., NEW JERSEY Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 48612/0598);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063691/0294 Effective date: 20230501 Owner name: CAAS TECHNOLOGIES, LLC, NEW JERSEY Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 48612/0598);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063691/0294 Effective date: 20230501 Owner name: HYPERQUALITY II, LLC, NEW JERSEY Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 48612/0598);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063691/0294 Effective date: 20230501 Owner name: HYPERQUALITY, INC., NEW JERSEY Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 48612/0598);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063691/0294 Effective date: 20230501 Owner name: ZANG, INC. (FORMER NAME OF AVAYA CLOUD INC.), NEW JERSEY Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 48612/0598);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063691/0294 Effective date: 20230501 Owner name: VPNET TECHNOLOGIES, INC., NEW JERSEY Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 48612/0598);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063691/0294 Effective date: 20230501 Owner name: OCTEL COMMUNICATIONS LLC, NEW JERSEY Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 48612/0598);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063691/0294 Effective date: 20230501 Owner name: AVAYA INTEGRATED CABINET SOLUTIONS LLC, NEW JERSEY Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 48612/0598);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063691/0294 Effective date: 20230501 Owner name: INTELLISIST, INC., NEW JERSEY Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 48612/0598);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063691/0294 Effective date: 20230501 Owner name: AVAYA INC., NEW JERSEY Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 48612/0598);ASSIGNOR:GOLDMAN SACHS BANK USA., AS COLLATERAL AGENT;REEL/FRAME:063691/0294 Effective date: 20230501 |