US20020161926A1 - Method for controlling the order of datagrams

Info

Publication number
US20020161926A1
Authority
US
United States
Prior art keywords
processing
datagrams
ticket
processing engine
semaphore
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/074,019
Inventor
Ken Cameron
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Clearspeed Solutions Ltd
Rambus Inc
Original Assignee
ClearSpeed Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from GB0103678A
Priority claimed from GB0103687A
Priority claimed from GB0121790A
Application filed by ClearSpeed Technology Ltd
Assigned to CLEARSPEED TECHNOLOGY LIMITED. Assignment of assignors interest. Assignors: CAMERON, KEN
Publication of US20020161926A1
Assigned to CLEARSPEED SOLUTIONS LIMITED. Change of name. Assignors: CLEARSPEED TECHNOLOGY LIMITED
Assigned to CLEARSPEED TECHNOLOGY PLC. Merger. Assignors: CLEARSPEED SOLUTIONS LIMITED
Assigned to CLEARSPEED TECHNOLOGY LIMITED. Change of name. Assignors: CLEARSPEED TECHNOLOGY PLC
Assigned to RAMBUS INC. Assignment of assignors interest. Assignors: CLEARSPEED TECHNOLOGY LTD

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00: Routing or path finding of packets in data switching networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00: Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/04: Generating or distributing clock signals or signals derived directly therefrom
    • G06F1/10: Distribution of clock signals, e.g. skew
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00: Digital computers in general; Data processing equipment in general
    • G06F15/76: Architectures of general purpose stored program computers
    • G06F15/80: Architectures of general purpose stored program computers comprising an array of processing units with common control, e.g. single instruction multiple data processors
    • G06F15/8007: Single instruction multiple data [SIMD] multiprocessors
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00: Computer-aided design [CAD]
    • G06F30/30: Circuit design
    • G06F30/32: Circuit design at the digital level
    • G06F30/327: Logic synthesis; Behaviour synthesis, e.g. mapping logic, HDL to netlist, high-level language to RTL or netlist
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00: Data switching networks
    • H04L12/54: Store-and-forward switching systems
    • H04L12/56: Packet switching systems
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00: Routing or path finding of packets in data switching networks
    • H04L45/74: Address processing for routing
    • H04L45/742: Route cache; Operation thereof


Abstract

A method for controlling the order of datagrams, the datagrams being processed by at least one processing engine 110a, 110b, 110c, the at least one processing engine having at least one input port and at least one output port, wherein each datagram or group of datagrams has a ticket associated therewith, the ticket being used to control the order of the datagram or group of datagrams at the at least one input port of the processing engine 110a, 110b, 110c and at the at least one output port of the processing engine 110a, 110b, 110c.

Description

    TECHNICAL FIELD
  • The present invention relates to a method for controlling the order of datagrams processed in a processing system of multiple processors or multi-tasking operating systems or the like. [0001]
  • BACKGROUND OF THE INVENTION
  • Multiple processors comprise an array of processing elements that contain arithmetic and logic processing circuits organized as a set of data paths. To achieve high performance the processing elements are arranged to perform tasks in parallel. These may be MIMD-based network processors or SIMD-based network processors etc. In such computer systems, which allow several processes to co-exist (e.g. multi-tasking operating system or multi-processor systems), a means to synchronize the processes is needed. [0002]
  • In conventional network processors, data packets or datagrams are given to processing elements on demand, as the processing elements become available. As the processing time per packet varies, the processing elements will generally not finish their work in packet order. To preserve the packet order at the output port, a random access packet queue is required. Processing elements keep track of where in the queue their packet came from and write back to the same location. The problem with the random access packet queue is its complexity. Such complexity makes it difficult to operate at the high packet rates that modern networks must sustain. For example, at OC-768 speed, the queue must handle about 100 million packets per second. [0003]
  • One solution is a “deli counter” algorithm. This permits the processing elements to process data packets in the order that the requests arrived. This algorithm is based on a “supermarket deli counter” in which the customer takes a ticket from a dispenser and waits until their ticket is called. The customer is then served and gives up their ticket. A counter maintains the ticket number and when a server becomes free, the counter is incremented and the next waiting customer is served. In this way the customers are served in the order that they took a ticket. In computer systems, the algorithm enables data packets to be processed in order. If a processing element is not available, the data packet retrieves a ticket from a ticket dispenser and waits until its ticket number is called by a processing element which has become free. The ticket number is maintained and incremented by a counter whenever a processing element becomes available and the processing element accepts the next waiting data packet which is holding the corresponding ticket number. Once the data packet is retrieved, the ticket is “given up”. Once the ticket counter reaches its maximum, the ticket numbers are reused. The maximum of the ticket counter would have to be sufficient to avoid duplication of ticket numbers being used by waiting data packets. [0004]
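  • For illustration only, the basic deli-counter mechanism described above can be sketched in Python (the class and method names are invented for this sketch; the patent does not prescribe any implementation):

    import itertools
    import threading

    class DeliCounter:
        """Sketch of the basic deli-counter algorithm described above."""
        def __init__(self):
            self._tickets = itertools.count()   # the ticket dispenser
            self._now_serving = 0               # counter of the ticket being served
            self._cv = threading.Condition()

        def take_ticket(self) -> int:
            with self._cv:
                return next(self._tickets)

        def wait_turn(self, ticket: int) -> None:
            # Block until this ticket number is called.
            with self._cv:
                self._cv.wait_for(lambda: self._now_serving == ticket)

        def give_up_ticket(self) -> None:
            # A server has become free: increment the counter and call the next ticket.
            with self._cv:
                self._now_serving += 1
                self._cv.notify_all()

  • In a bounded implementation the ticket numbers would wrap around and, as noted above, the range must be large enough that a number is never reused while a waiting packet still holds it.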
  • Since the processing elements may process data packets at different rates, the order of the data packets output from the processing engines cannot be preserved with conventional deli counter algorithms. Therefore, conventional deli counter algorithms still require means of preserving packet order at the output of the processing engine. [0005]
  • SUMMARY OF THE INVENTION
  • The object of the present invention is to provide a method which is capable of allocating datagrams (units of work) or groups of datagrams to processing elements that preserves global data packet order of the work units. [0006]
  • The method is applicable to any data flow processing system that contains multiple processors of the same or different types, for which the order of data elements must be preserved. In particular, the method is applicable to network processors that contain multiple processing elements that are served from a common source. [0007]
  • According to a first aspect of the present invention, there is provided a method for controlling the order of datagrams, the datagrams being processed by at least one processing engine, each of the at least one processing engine having at least one input port and at least one output port, wherein each datagram or each group of datagrams has a ticket associated therewith, the ticket being used to control the order of the datagram or group of datagrams at the at least one input port of the processing engine and at the at least one output port of the processing engine. [0008]
  • The processing engine may comprise a single or a plurality of processing elements. Some of the input ports and output ports of the processing elements share an input and output of the processing engine and hence a ticket. [0009]
  • According to a second aspect of the present invention, there is provided a processing engine for processing datagrams in a predetermined order, the processing engine comprising at least one input port, at least one output port and at least one processing element, the at least one processing element comprising an input port connected to the at least one input port of the processing engine, an output port connected to the at least one output port of the processing engine and arithmetic and logic means, the order of processing datagrams being controlled at the at least one input port of the processing engine and the at least one output port of the processing engine by a ticket associated with the datagram or a group of the datagrams. [0010]
  • According to a third aspect of the present invention, there is provided a processing system comprising a plurality of processing engines for processing datagrams in a predetermined order, the processing engine comprising at least one input port, at least one output port and at least one processing element, the at least one processing element comprising an input port connected to the at least one input port of the processing engine, an output port connected to the at least one output port of the processing engine and arithmetic and logic means, the order of processing datagrams being controlled at the at least one input port of the processing engine and the at least one output port of the processing engine by a ticket associated with the datagram or a group of the datagrams. [0011]
  • The context in which this invention is particularly applicable is a data flow processing system that processes units of data, referred to as datagrams, and that contains multiple independent processors. A datagram represents any kind of a unit of data that requires some processing. A processor in this context is any kind of programmable or non-programmable mechanism that does some transformation of the data unit, possibly but not necessarily reading or modifying some global variables as a side effect. A processor could be a programmable CPU or a fixed-function application-specific integrated circuit (ASIC). If the physical processor supports multiple threads, the processor can be virtual, i.e. one of several threads running on a physical CPU. [0012]
  • The method according to the present invention offers great flexibility in that the processors can remove or inject work units at will, and processors can join or leave the process.[0013]
  • BRIEF DESCRIPTION OF DRAWINGS
  • FIG. 1 is a schematic block diagram illustrating a simplified system utilising the method according to a preferred embodiment of the present invention; [0014]
  • FIG. 2 is a schematic block diagram illustrating a typical multiple processor system utilising the method according to a preferred embodiment of the present invention; [0015]
  • FIG. 3 is a simplified block diagram illustrating the global semaphore unit of FIG. 2 according to the preferred embodiment of the present invention; [0016]
  • FIG. 4 illustrates the ticket semaphore initialised with a queue for data according to the preferred embodiment of the present invention; [0017]
  • FIG. 5 illustrates the Input Buffer semaphores according to the preferred embodiment of the present invention; [0018]
  • FIG. 6 illustrates the Output Buffer semaphores according to a preferred embodiment of the present invention; [0019]
  • FIG. 7 illustrates the processing elements on start-up according to the preferred embodiment of the present invention; [0020]
  • FIG. 8a illustrates the Ticket semaphore and Data FIFO after the MTAP processors have been given Tickets according to the preferred embodiment of the present invention; and [0021]
  • FIG. 8b illustrates the Ticket semaphore and Data FIFO after a processing element has returned its Ticket according to the preferred embodiment of the present invention. [0022]
  • DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
  • FIG. 1 shows a simplified system utilising the method (or algorithm) according to a preferred embodiment of the present invention. The multiple processing system 100 comprises a plurality of processing elements 110a, 110b and 110c. The processing elements 110a, 110b, 110c need not be substantially similar, nor even run at the same speed; indeed one might be a programmable CPU and another an ASIC running a fixed algorithm. Datagrams, that is, the natural unit of data in an application (in a network processor this would be equivalent to a data packet), are supplied to a processing engine from a data source 120. The data source 120 is a functional unit that supplies one or more datagrams to a processor when requested by that processor. Although a single data source is shown here, it can be appreciated that the system may comprise a plurality of data sources. The datagrams are processed by a processing element 110a, 110b, 110c. Upon completion of the processing, the processing element 110a, 110b, 110c writes the datagram to a data sink 130 in the same order that the datagrams were read from the data source 120. The data sink 130 is a functional unit that accepts processed datagrams from a processing element when requested by that processing element. Although a single data sink is shown here, it can be appreciated that the system may comprise a plurality of data sinks. The processing elements 110a, 110b, 110c can drop selected datagrams (i.e. not send them to the data sink) and can inject new datagrams (i.e. create and send a new datagram to the data sink); and processors can enter and leave the processing sequence at any time. The overall state of the datagrams needed by the deli counter algorithm is maintained by another functional unit, the semaphore unit 140. [0023]
  • FIG. 2 illustrates an example of a parallel processing system 200 incorporating the deli counter algorithm according to the preferred embodiment of the present invention. The system 200 comprises a Network Input Port (NIP) 202 and Network Output Port (NOP) 204 and a plurality of multi-threaded array processors (MTAPs) 206 connected to a common bus 208. The MTAP is a single instruction multiple data (SIMD) processor that shares instruction fetch and decode amongst a number of processing elements. Typically the processing elements all execute the same instruction in lock step. In the preferred embodiment of the present invention the system includes four MTAPs 206a, 206b, 206c and 206d (not specifically shown in FIG. 2). Each MTAP 206a, 206b, 206c, 206d has, typically, at least 64 processing elements. [0024]
  • A Global Semaphore Unit 210 is connected to the common bus 208 to synchronise several processes over the plurality of MTAPs 206a, 206b, 206c, 206d. Each MTAP 206 is able to execute its own core instruction stream independently of the other MTAPs. The MTAPs 206 are responsible for managing the packet flow through the system. All communication between the MTAPs 206 is via the other functional blocks, including external memory (not shown in FIG. 2), the Global Semaphore Unit 210, NIP 202, NOP 204 etc., using the common bus 208, such that the MTAPs 206 communicate with each other by means of semaphores. [0025]
  • The function of the system is defined by the software running on the MTAPs 206a, 206b, 206c, 206d. Although in the preferred embodiment a plurality of substantially similar MTAPs 206 are described, the present invention can be utilised in any multi-processor system in which the processor blocks may be of different types. [0026]
  • The massively parallel processor block at the heart of the architecture is used in a similar way to conventional processors, in that it reads and writes data in response to executing a stored program. [0027]
  • The semaphore unit 210 has a number of features. One feature is the ability to maintain a set of semaphores. Each semaphore has associated with it some data which is returned to the client that performs the wait; in the method of the present invention these are called Tickets. The number of Tickets is equal to or greater than the total number of buffers. In the case of the processor illustrated in FIG. 2 there are four buffers per MTAP 206a, 206b, 206c, 206d and four MTAPs, making a total of sixteen buffers. Therefore, the number of Tickets would be greater than or equal to sixteen. Having a ticket gives the holder permission to continue but is not a handle to a particular buffer itself. [0028]
  • On any one MTAP 206a, 206b, 206c, 206d there are at least three threads: one requesting input of data, one for compute and one requesting output. Each thread makes use of a number of semaphores. Some of the semaphores are local, that is, local to a MTAP 206a, 206b, 206c, 206d, and some are global, which may reside either in the global semaphore unit 210 or in specific blocks such as distributors and collectors. [0029]
  • The global semaphore unit 210, as shown in FIG. 3, comprises a ticket dispenser 302. The ticket dispenser 302 includes a FIFO buffer 304 for storing the tickets (semaphores with data) and a counter 306. The global semaphore unit 210 also comprises means 308 for generating and maintaining an Input Buffer Semaphore array and means 310 for generating and maintaining an Output Buffer Semaphore array. [0030]
  • The global semaphore unit 210 is used by the software executing on the MTAPs 206a, 206b, 206c, 206d as a generic control mechanism. The global semaphore unit 210 maintains semaphores on which clients can wait and signal. It also maintains semaphores that have items of data attached. This data is returned to the client that successfully waits on the particular semaphore. [0031]
  • The method is based on cooperating sequential processes using shared global semaphores for synchronisation. Within the semaphore unit there is a semaphore-with-data. A semaphore-with-data comprises a semaphore and a FIFO queue. A semaphore is a counter with the atomic operations signal and wait. Sequential processes work together using semaphores as follows. Initially, a semaphore has a non-negative integer count. When a process performs a wait operation on (waits on) a semaphore, the semaphore count is decremented by one, and if the resulting count is negative, the process is blocked from further execution. When a process performs a signal operation on (signals) a semaphore, the semaphore count is incremented by one, and if the resulting count is non-positive, a process that is currently blocked on the semaphore is permitted to continue execution. [0032]
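  • A minimal sketch of these wait and signal semantics, assuming a threaded software environment (Python is used for all sketches here; the class name and structure are illustrative, not part of the patent):

    import threading

    class CountingSemaphore:
        """Sketch of the wait/signal semantics described above: the count may
        go negative, and a negative count blocks the waiting process."""
        def __init__(self, count: int = 0):
            assert count >= 0            # initially a non-negative integer count
            self._count = count
            self._wakeups = 0            # signals not yet consumed by a blocked waiter
            self._cv = threading.Condition()

        def wait(self) -> None:
            with self._cv:
                self._count -= 1
                if self._count < 0:      # resulting count negative: block
                    self._cv.wait_for(lambda: self._wakeups > 0)
                    self._wakeups -= 1

        def signal(self) -> None:
            with self._cv:
                self._count += 1
                if self._count <= 0:     # a process is blocked: release one waiter
                    self._wakeups += 1
                    self._cv.notify()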
  • A global semaphore unit 210 is attached to a bus 208 which can be used by the software to synchronise behaviour. Each semaphore in the unit is memory mapped into the address space of the semaphore unit 210. The number of semaphores provided by a unit 210, and the number of units attached to the bus 208, can be tuned to the application. [0033]
  • To implement the above, each semaphore would have to count each signal received when there is no currently pending wait, and also queue a list of pending waits if no signals were available to satisfy them. An extension of this is to allow a small item of data to be attached to each signal, which is returned to the wait that is matched with it. This requires signals to be queued and not just counted. [0034]
  • All semaphore state is accessible via the bus, to allow the unit's context to be saved/restored or otherwise modified. [0035]
  • A semaphore-with-data has the following additional behaviour. When a semaphore-with-data is signalled, the signalling process provides a value as a calling argument. The semaphore associated with the semaphore-with-data is dealt with as described above, and in addition a copy of the argument value is inserted into the associated FIFO. When a semaphore-with-data is waited on, a wait operation is performed on the associated semaphore as described above. In addition, a value is extracted from the associated FIFO and returned to the calling process. [0036]
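  • Building on the CountingSemaphore sketch above, a semaphore-with-data can be modelled as that semaphore paired with a FIFO, again purely for illustration:

    from collections import deque

    class SemaphoreWithData:
        """Sketch of a semaphore-with-data: a semaphore plus a FIFO queue.
        signal(value) enqueues the value; wait() returns the matched value."""
        def __init__(self, initial_values=()):
            self._fifo = deque(initial_values)
            self._lock = threading.Lock()
            self._sem = CountingSemaphore(len(self._fifo))

        def signal(self, value) -> None:
            with self._lock:
                self._fifo.append(value)     # copy of the argument into the FIFO
            self._sem.signal()

        def wait(self):
            self._sem.wait()                 # wait on the associated semaphore
            with self._lock:
                return self._fifo.popleft()  # value extracted and returned

  • Python's standard queue.Queue provides equivalent blocking-FIFO behaviour; the explicit form above simply mirrors the description.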
  • The method according to the preferred embodiment comprises a system initialisation section and an operation section. [0037]
  • In the initialisation section, up to N processors can participate in the method. FIG. 3 shows the contents of the global semaphore unit after initialisation, with N=8. [0038]
  • The semaphore-with-data (ticket dispenser 302) in the global semaphore unit 210 is initialised as follows: the semaphore part is initialised with the value N; and the FIFO part is filled with a sequential set of N values, 0 to N-1, called tickets. [0039]
  • An array 308 of N semaphores, called InBuf, in the global semaphore unit is allocated for purposes of reading from the data source. The k-th element of this array is referred to as InBuf[k]. These semaphores are initialised to have a count of zero except for the first (InBuf[0]), which is initialised to have a count of one. [0040]
  • An array 310 of N semaphores, called OutBuf, in the global semaphore unit is allocated for purposes of writing to the data sink. The k-th element of this array is referred to as OutBuf[k]. These semaphores are initialised to have a count of zero except for the first (OutBuf[0]), which is initialised to have a count of one. [0041]
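  • Using the illustrative classes sketched earlier, the initialisation just described would look like this (N=8 to match FIG. 3; the names come from the text, but the code itself is an assumption):

    N = 8   # FIG. 3 is drawn with N=8; N can be any value >= the processor count

    # Ticket dispenser: semaphore part initialised to N, FIFO filled with 0..N-1
    TicketDispenser = SemaphoreWithData(range(N))

    # InBuf and OutBuf: count zero everywhere except element 0, which is one
    InBuf  = [CountingSemaphore(1 if k == 0 else 0) for k in range(N)]
    OutBuf = [CountingSemaphore(1 if k == 0 else 0) for k in range(N)]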
  • The operation section specifies the behaviour of one of the processors that is participating in the method, i.e. reading datagrams from the data source, transforming them, and writing them to the data sink. [0042]
  • 1. Take a ticket, that is wait on TicketDispenser, obtaining a ticket (T). [0043]
  • 2. Return the ticket, that is signal TicketDispenser, with value T. [0044]
  • 3. Wait on InBuf[T]. [0045]
  • 4. Read a datagram from the data source. [0046]
  • 5. Signal InBuf[(T+1) mod N]. [0047]
  • 6. Transform the datagram. [0048]
  • 7. Wait on OutBuf[T]. [0049]
  • 8. Write the datagram to the data sink. [0050]
  • 9. Signal OutBuf[(T+1) mod N]. [0051]
  • 10. Go to step 1. [0052]
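  • The ten steps above map directly onto the illustrative sketch below; source.read(), sink.write() and transform() are assumed interfaces, not part of the patent:

    def processor_loop(source, sink, transform):
        """One participating processor (steps 1-10 above)."""
        while True:
            T = TicketDispenser.wait()       # 1. take a ticket T
            TicketDispenser.signal(T)        # 2. return the ticket
            InBuf[T].wait()                  # 3. wait for this ticket's read turn
            datagram = source.read()         # 4. read a datagram
            InBuf[(T + 1) % N].signal()      # 5. admit the next ticket holder
            result = transform(datagram)     # 6. transform the datagram
            OutBuf[T].wait()                 # 7. wait for this ticket's write turn
            sink.write(result)               # 8. write in global input order
            OutBuf[(T + 1) % N].signal()     # 9. admit the next writer
            # 10. loop back to step 1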
  • The number N of tickets is determined by the number of processors that can be concurrently processing datagrams. The only requirement is that N is at least as large as the number of processors. The method preserves the order of the datagrams. That is, the datagrams are delivered to the output sink in the order they were taken from the input source. To see this, note that initially exactly one of the InBuf elements has the value 1. This means that all the processors will block at step 3 except for the processor that has ticket 0. That processor will read a datagram in step 4 and set InBuf[1] to 1 in step 5. This permits the processor holding ticket 1 to proceed from step 3. In general, after the algorithm has run for a while, at most one of the InBuf semaphores will have the value 1 and the rest will have the value 0. If all the InBuf semaphores are currently 0, one of the processors will eventually execute step 5, causing an InBuf semaphore to be set to 1. This effectively permits the processor with the ticket value corresponding to that InBuf semaphore to proceed to read the next datagram. On the output side, a similar argument applies. At most one of the OutBuf semaphores will have the value 1 and the rest will have the value 0. If all the OutBuf semaphores are currently 0, one of the processors will eventually execute step 9, causing an OutBuf semaphore to be set to 1. This permits the processor with the ticket value corresponding to that OutBuf semaphore to proceed to write the next datagram. [0053]
  • Which of the processors actually handles any datagram is irrelevant. Whichever processor it is will notify the processor that should read next by executing step 5 and will notify the processor that should write next by executing step 9. When a processor takes a ticket, it is committing itself to execute steps 3, 5, 7 and 9 to keep the sequence going properly. [0054]
  • A new processor can join the algorithm at any time by simply taking a ticket and following the basic steps given above. A processor can drop out of the algorithm at any time by simply leaving the flow above at step 9. [0055]
  • In order to drop a datagram, a processor simply omits to execute the write step 8. To inject a datagram, a processor simply omits to execute the read step 4. [0056]
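  • In the same illustrative sketch, a drop is one loop iteration with step 8 omitted (injection is symmetric, omitting step 4 instead):

    def drop_iteration(source):
        """One iteration that drops the datagram: step 8 is omitted, but the
        signalling steps 5 and 9 still run so the sequence keeps moving."""
        T = TicketDispenser.wait()          # step 1
        TicketDispenser.signal(T)           # step 2
        InBuf[T].wait()                     # step 3
        datagram = source.read()            # step 4: the input is still consumed
        InBuf[(T + 1) % N].signal()         # step 5
        OutBuf[T].wait()                    # step 7
        # step 8 (the write) is omitted, so the datagram is dropped
        OutBuf[(T + 1) % N].signal()        # step 9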
  • The method according to the preferred embodiment can handle multiple data sinks. The method is extended to handle multiple data sources by assigning a ticket dispenser semaphore-with-data and a pair of arrays of semaphores to each data source. Then steps 1 and 2 of the method are modified as follows: [0057]
  • 1. Select a TicketDispenser. Take a ticket T from that TicketDispenser. That is, wait on the selected TicketDispenser, returning a ticket value T. [0058]
  • 2. Return the ticket. That is, signal the TicketDispenser selected in step 1 with value T. [0059]
  • The remaining steps are the same, except that they must use the semaphores and data source associated with the selected ticket dispenser. The method does not specify which of the ticket dispensers should be selected in step 1. The choice could be arbitrary, or the fullest ticket dispenser could be selected. A possible variation is that different processors may choose from different sets of ticket dispensers, e.g. if different data sources should be serviced by different processors. [0060]
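  • A sketch of step 1's dispenser selection; the arbitrary policy shown is just one valid choice, since the method deliberately leaves the policy open:

    import random

    def select_dispenser(dispensers):
        # The method leaves the selection policy open: an arbitrary choice
        # (as here) is valid; preferring the fullest dispenser is another option.
        return random.choice(dispensers)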
  • In the multiple processing system shown in FIG. 2, the semaphores used are: [0061]
  • Local semaphores: [0062]
  • FreeBuffer, initialised to four [0063]
  • FullBuffer, initialised to zero [0064]
  • ComputeBuffer, initialised to zero [0065]
  • Global semaphores: [0066]
  • InputBufferSemaphore (16), this is an array of semaphores, [0067]
  • InputBufferSemaphore (0) is initialised to one and [0068]
  • InputBufferSemaphore (1 . . . 15) is initialised to zero. [0069]
  • NIPRequestSemaphore, initialised to four. [0070]
  • OutputBufferSemaphore (16), this is an array of semaphores, [0071]
  • OutputBufferSemaphore (0) is initialised to one and [0072]
  • OutputBufferSemaphore (1 . . . 15) is initialised to zero. [0073]
  • CollectorSemaphore, initialised to one. [0074]
  • Examples of Input Buffer semaphores and Output Buffer semaphores are shown in FIGS. 5 and 6 respectively. [0075]
  • There is also a FIFO of Tickets, which is used to communicate Ticket numbers from the Input thread to the Output thread. [0076]
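  • Restated with the illustrative CountingSemaphore from earlier (the names and initial counts are from the list above; their representation as Python objects is an assumption):

    TOTAL_BUFFERS = 16   # four buffers per MTAP x four MTAPs

    # Local semaphores (one set of these exists on every MTAP)
    FreeBuffer    = CountingSemaphore(4)
    FullBuffer    = CountingSemaphore(0)
    ComputeBuffer = CountingSemaphore(0)

    # Global semaphores
    InputBufferSemaphore  = [CountingSemaphore(1 if k == 0 else 0)
                             for k in range(TOTAL_BUFFERS)]
    NIPRequestSemaphore   = CountingSemaphore(4)
    OutputBufferSemaphore = [CountingSemaphore(1 if k == 0 else 0)
                             for k in range(TOTAL_BUFFERS)]
    CollectorSemaphore    = CountingSemaphore(1)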
  • Input Thread [0077]
    while (true) {
        wait FreeBuffer                 // Local semaphore
        GetTicket (K)                   // Get Ticket value
        PutLocalTicket (K)              // Place in local FIFO for output thread
        wait InputBufferSemaphore (K)   // Wait on ticket semaphore
        wait NipRequestSemaphore        // Wait for NIP (may not be necessary,
                                        // e.g. have a queue in the Distributor
                                        // that does not overflow)
        ReadNIP                         // Issue request to NIP
        signal NipRequestSemaphore      // Release NIP
        PutTicket (K)                   // Put back ticket value
        //
        // Signal the next ticket semaphore
        //
        signal InputBufferSemaphore ((K+1) % TotalNumberOfBuffer)
        wait ReadComplete               // Hold off until DIO finished
        signal FullBuffer               // Buffer available for compute thread
    }
  • Compute Thread [0078]
  • For illustration this is doing both lookups and regular compute [0079]
    while (true) {
        wait FullBuffer          // Wait until buffer available for compute
        ComputeOperation
        signal ComputeBuffer     // Indicate buffer finished with and can be emptied
    }
  • Output Thread

    while (true) {
        wait ComputeBuffer               // Wait until a buffer is available
        GetLocalTicket (K)               // Get value used by Input thread
        wait OutputBufferSemaphore (K)   // Wait for correct output turn
        wait CollectorSemaphore          // Wait for Collector (this is actually
                                         // implemented in the LUE transfer engine)
        write NOP                        // Issue request to NOP
        //
        // Indicate okay for next in turn
        //
        signal OutputBufferSemaphore ((K+1) % TotalNumberOfBuffer)
        wait WriteComplete               // Wait for DIO to complete
        signal CollectorSemaphore        // Release Collector (this is actually
                                         // implemented in the LUE transfer engine)
        signal FreeBuffer                // Return the buffer
    }
  • On start-up each MTAP will get a ticket (wait on a semaphore with data) using GetTicket. The MTAP whose request reaches the semaphore block first will be given ticket value 0, and since the associated InputBufferSemaphore has been pre-signalled it will enable the execution of the Input thread. All the other processors will wait. [0080]
  • As shown in FIG. 4, for a system where the MTAPs are serviced in a round-robin manner, the data 400(0) through to 400(15) in the FIFO of Tickets attached to the Ticket semaphore is initialised in numerically increasing values. The semaphore (or count) 410 is pre-initialised to the number of items in the data FIFO, in this case 16. [0081]
  • FIG. 7 shows an example where the MTAPs 206a, 206b, 206c, 206d have started up and the requests to the semaphore block have resulted in the order A, C, B, D, for example, i.e. MTAP 206a is given ticket 0, MTAP 206c is given ticket 1, MTAP 206b is given ticket 2 and MTAP 206d is given ticket 3. This will be the order of the round-robin sequence. The state of the Ticket semaphore is shown in FIG. 8a. [0082]
  • Once the Input thread of processor 206a has issued its read request to the NIP 202, it will put back its current ticket and signal the next InputBufferSemaphore. The ticket value sent back to the Ticket semaphore will be placed at the end of the linked list, as shown in FIG. 8b, and the associated count will be incremented. This will allow MTAP processor 206c to proceed with its Input thread, which was waiting on the InputBufferSemaphore associated with its ticket value. [0083]
  • MTAP 206a's Input thread will now wait on its local semaphore ReadComplete, which will be signalled by the Direct I/O (DIO) mechanism for that processor. Once the DIO has completed its operation the Input thread will signal a local semaphore, FullBuffer, on which the Compute thread is waiting. The Input thread is now ready to start its set of operations all over again. [0084]
  • MTAP 206a's Compute thread can now proceed with computation and lookups; for illustration the lookup and compute are being performed by one thread, however in a realistic system there will be more than one thread doing this. Once the compute has completed, a local semaphore, ComputeBuffer, is signalled. [0085]
  • MTAP 206a's Output thread can now proceed, as it has been waiting on the local semaphore ComputeBuffer, which has been signalled by the Compute thread. The Output thread now waits on a global semaphore in OutputBufferSemaphore. This is actually an array of semaphores, which at start-up will be initialised to the same semaphore values as the InputBufferSemaphore. That is, initially only the first element of the array will be pre-signalled. The Output processor will issue a request to the Collector and signal the next global semaphore in the array OutputBufferSemaphore. [0086]
  • Should a MTAP need to drop out of the round-robin sequence, this can be achieved by taking a ticket, signalling the next InputBufferSemaphore immediately and missing out the NOP phase. The thread sequencing will then look like this. [0087]
  • Input Thread [0088]
    while (true) {
        wait FreeBuffer                 // Local semaphore
        GetTicket (K)                   // Get Ticket value
        PutLocalTicket (K)              // Place in local FIFO for output thread
        wait InputBufferSemaphore (K)   // Wait on ticket semaphore
        PutTicket (K)                   // Put back ticket value
        signal InputBufferSemaphore ((K+1) % TotalNumberOfBuffer)
        signal ComputeBuffer            // Compute thread is skipped
    }
  • Compute Thread

    The compute thread is not used.
  • Output Thread

    while (true) {
        wait ComputeBuffer               // Wait until a buffer is available to empty
        GetLocalTicket (K)               // Get value used by Input thread
        wait OutputBufferSemaphore (K)   // Wait for correct output turn
        signal OutputBufferSemaphore ((K+1) % TotalNumberOfBuffer)
        signal FreeBuffer                // Return the buffer
    }
  • As mentioned previously, another processor can join the sequence by taking a ticket and skipping the NIP phase. That will require more InputBufferSemaphores and OutputBufferSemaphores. It is up to the user how many more are needed. If the new processor only needs occasional access, the dropping-out-of-sequence method can be used. [0089]
  • In its simplest version, each semaphore occupies X bytes of the address space. Each write to that address is recognised as a signal; each read is recognised as a wait. The read does not return data until a signal has been received. The value written is discarded and the value returned is undefined. The maximum number of pending signals or waits is one, i.e. there is no counting or queuing. This can be extended to allow the value contained in the signalling write to be returned in the waiting read. It can also be extended to allow multiple pending signals and waits, which requires that signals be counted and waits be queued. We may choose that the values written are ignored, or that they be added to the semaphore count value. We may choose that the values read are undefined, or that they contain the current value of the semaphore counter. This can be further extended to allow the value contained in the signalling write to be returned in the read of the wait to which it is matched. This replaces the counter increment/examine options, and it requires that a queue of the pending signals is maintained instead of just a count. In one alternative arrangement the NIP/NOP would operate such that they decide which mini-cores receive packets next; however, this requires that the distribution algorithm be chosen and fixed in hardware. An alternative arrangement is to use software-based flow of control instead of hardwiring it. The objective is to allow the mini-cores/processors to decide what the algorithm is, and thus cast it in software rather than hardware. The motivation for this is twofold: firstly it introduces flexibility, and secondly it simplifies the hardware. [0090]
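  • A minimal model of the simplest variant just described (at most one pending signal, written values discarded) is sketched below. This is an assumption for illustration, not the patent's implementation: write stands for the signalling bus write and read for the blocking wait, with the counting and queuing extensions noted in a closing comment.

    # Minimal model (assumption) of the simplest semaphore of paragraph
    # [0090]: a write to the semaphore's address is a signal, a read is a
    # wait, and the read does not return until a signal has been received.
    # At most one signal is pending; there is no counting or queuing.
    import threading

    class SimpleSemaphore:
        def __init__(self):
            self._cond = threading.Condition()
            self._pending = False                  # at most one pending signal

        def write(self, value):                    # signal: value is discarded
            with self._cond:
                self._pending = True
                self._cond.notify()

        def read(self):                            # wait: blocks until signalled
            with self._cond:
                while not self._pending:
                    self._cond.wait()
                self._pending = False
                return 0                           # returned value is undefined

    # The counting extension replaces the flag with a counter (writes
    # increment it, reads decrement it); the queuing extension keeps a FIFO
    # of written values so each wait returns the value of its matched signal.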
  • The [0091] system bus 208 is preferably of a split transaction type, i.e. reads are two separate transactions, a request and a response. This prevents deadlocks: a deadlock would occur if a wait were issued before a matching signal had been posted, since the wait (read) would tie up the bus and prevent any signal (a write) being sent that would complete the transaction.
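  • The deadlock argument can be made concrete with the toy model below, which is an assumption for illustration only: the read request and the read response travel as separate transfers, so the bus (modelled here as a queue) is free in between and the signalling write can still reach the semaphore block.

    # Toy model (assumption) of a split-transaction read: the request and
    # the response are separate transfers, so the bus is released between
    # them and a signalling write can still be delivered while the read
    # response is outstanding.  With a single blocking read transaction
    # the write could never be sent and the system would deadlock.
    import queue
    import threading

    bus_requests = queue.Queue()                   # transfers towards the semaphore block
    bus_responses = queue.Queue()                  # read responses back to the processor

    def semaphore_block():
        kind, _ = bus_requests.get()               # the wait arrives as a read request
        assert kind == "read_request"
        kind, value = bus_requests.get()           # the matching signal (a write) still
        assert kind == "write"                     # gets through, as the bus is free
        bus_responses.put(value)                   # the read response completes the wait

    threading.Thread(target=semaphore_block, daemon=True).start()
    bus_requests.put(("read_request", None))       # issue the wait
    bus_requests.put(("write", 1))                 # issue the signal
    print(bus_responses.get())                     # the wait completes: prints 1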
  • Although a preferred embodiment of the method and system of the present invention has been illustrated in the accompanying drawings and described in the foregoing detailed description, it will be understood that the invention is not limited to the embodiment disclosed, but is capable of numerous variations and modifications without departing from the scope of the invention as set out in the following claims. [0092]

Claims (14)

1. A method for controlling the order of datagrams, the datagrams being processed by at least one processing engine, each of the at least one processing engine having at least one input port and at least one output port, wherein each datagram or each group of datagrams has a ticket associated therewith, the ticket being used to control the order of the datagram or group of datagrams at the at least one input port of the processing engine and at the at least one output port of the processing engine.
2. A method according to claim 1, wherein the order of the datagrams or group of datagrams at the at least one input port corresponds to the order of the datagrams at the at least one output port.
3. A method according to claim 1, wherein the tickets comprise numerical values.
4. A method according to claim 1, wherein the ticket comprises a semaphore with data associated therewith.
5. A processing engine for processing datagrams in a predetermined order, the processing engine comprising at least one input port, at least one output port and at least one processing element, the at least one processing element comprising an input port connected to the at least one input port of the processing engine, an output port connected to the at least one output port of the processing engine and arithmetic and logic means, the order of processing datagrams being controlled at the at least one input port of the processing engine and the at least one output port of the processing engine by a ticket associated with the datagram or a group of the datagrams.
6. A processing engine according to claim 5, wherein the processing element comprises an element of a multi threaded array processing engine.
7. A processing engine according to claim 5, wherein the processing element can leave or enter the predetermined order.
8. A processing system comprising a plurality of processing engines for processing datagrams in a predetermined order, each processing engine comprising at least one input port, at least one output port and at least one processing element, the at least one processing element comprising an input port connected to the at least one input port of the processing engine, an output port connected to the at least one output port of the processing engine and arithmetic and logic means, the order of processing datagrams being controlled at the at least one input port of the processing engine and the at least one output port of the processing engine by a ticket associated with the datagram or a group of the datagrams.
9. A processing system according to claim 8, wherein datagrams are processed in a round robin manner.
10. A processing system according to claim 8 further comprising a ticket dispenser for giving tickets to a datagram or group of datagrams.
11. A processing system according to claim 10, wherein the tickets are issued on a first come first served basis.
12. A processing system according to claim 8 further comprising a counter for maintaining the value of the current ticket.
13. A processing system according to claim 12, wherein the counter comprises storage means for storing a numerical value.
14. A processing system according to claim 13, wherein once a processing element is allocated a datagram or group of datagrams for processing, the counter is incremented.
US10/074,019 2001-02-14 2002-02-14 Method for controlling the order of datagrams Abandoned US20020161926A1 (en)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
GB0103678A GB0103678D0 (en) 2001-02-14 2001-02-14 Network processing
GB0103678.9 2001-02-14
GB0103687A GB0103687D0 (en) 2001-02-14 2001-02-14 Network processing-architecture II
GB0103687.0 2001-02-14
GB0121790A GB0121790D0 (en) 2001-02-14 2001-09-10 Network processing systems
GB0121790.0 2001-09-10

Publications (1)

Publication Number Publication Date
US20020161926A1 true US20020161926A1 (en) 2002-10-31

Family

ID=27256074

Family Applications (10)

Application Number Title Priority Date Filing Date
US10/468,168 Expired - Fee Related US7290162B2 (en) 2001-02-14 2002-02-14 Clock distribution system
US10/073,948 Expired - Fee Related US7856543B2 (en) 2001-02-14 2002-02-14 Data processing architectures for packet handling wherein batches of data packets of unpredictable size are distributed across processing elements arranged in a SIMD array operable to process different respective packet protocols at once while executing a single common instruction stream
US10/074,022 Abandoned US20020159466A1 (en) 2001-02-14 2002-02-14 Lookup engine
US10/468,167 Abandoned US20040114609A1 (en) 2001-02-14 2002-02-14 Interconnection system
US10/074,019 Abandoned US20020161926A1 (en) 2001-02-14 2002-02-14 Method for controlling the order of datagrams
US11/151,271 Expired - Fee Related US8200686B2 (en) 2001-02-14 2005-06-14 Lookup engine
US11/151,292 Abandoned US20050242976A1 (en) 2001-02-14 2005-06-14 Lookup engine
US11/752,299 Expired - Fee Related US7818541B2 (en) 2001-02-14 2007-05-23 Data processing architectures
US11/752,300 Expired - Fee Related US7917727B2 (en) 2001-02-14 2007-05-23 Data processing architectures for packet handling using a SIMD array
US12/965,673 Expired - Fee Related US8127112B2 (en) 2001-02-14 2010-12-10 SIMD array operable to process different respective packet protocols simultaneously while executing a single common instruction stream


Country Status (6)

Country Link
US (10) US7290162B2 (en)
JP (2) JP2004524617A (en)
CN (2) CN100367730C (en)
AU (1) AU2002233500A1 (en)
GB (5) GB2374443B (en)
WO (2) WO2002065700A2 (en)




JP3587076B2 (en) 1999-03-05 2004-11-10 松下電器産業株式会社 Packet receiver
AU4651000A (en) * 1999-04-23 2000-11-10 Z-Dice, Inc. Gaming apparatus and method
GB2352536A (en) * 1999-07-21 2001-01-31 Element 14 Ltd Conditional instruction execution
USD428484S (en) * 1999-08-03 2000-07-18 Zirk Todd A Copper roof vent cover
US6631422B1 (en) 1999-08-26 2003-10-07 International Business Machines Corporation Network adapter utilizing a hashing function for distributing packets to multiple processors for parallel processing
US6631419B1 (en) * 1999-09-22 2003-10-07 Juniper Networks, Inc. Method and apparatus for high-speed longest prefix and masked prefix table search
US6963572B1 (en) * 1999-10-22 2005-11-08 Alcatel Canada Inc. Method and apparatus for segmentation and reassembly of data packets in a communication switch
AU2470901A (en) * 1999-10-26 2001-06-06 Arthur D. Little, Inc. Bit-serial memory access with wide processing elements for SIMD arrays
JP2001177574A (en) * 1999-12-20 2001-06-29 Kddi Corp Transmission controller in packet exchange network
GB2357601B (en) * 1999-12-23 2004-03-31 Ibm Remote power control
US6661794B1 (en) 1999-12-29 2003-12-09 Intel Corporation Method and apparatus for gigabit packet assignment for multithreaded packet processing
CN100339832C (en) * 2000-01-07 2007-09-26 国际商业机器公司 Method and system for frame and protocol classification
US20030093613A1 (en) * 2000-01-14 2003-05-15 David Sherman Compressed ternary mask system and method
JP2001202345A (en) * 2000-01-21 2001-07-27 Hitachi Ltd Parallel processor
DE60026229T2 (en) * 2000-01-27 2006-12-14 International Business Machines Corp. Method and apparatus for classifying data packets
US6704794B1 (en) * 2000-03-03 2004-03-09 Nokia Intelligent Edge Routers Inc. Cell reassembly for packet based networks
US7139282B1 (en) * 2000-03-24 2006-11-21 Juniper Networks, Inc. Bandwidth division for packet processing
US7089240B2 (en) * 2000-04-06 2006-08-08 International Business Machines Corporation Longest prefix match lookup using hash function
US7107265B1 (en) * 2000-04-06 2006-09-12 International Business Machines Corporation Software management tree implementation for a network processor
US6718326B2 (en) * 2000-08-17 2004-04-06 Nippon Telegraph And Telephone Corporation Packet classification search device and method
DE10059026A1 (en) 2000-11-28 2002-06-13 Infineon Technologies Ag Unit for the distribution and processing of data packets
GB2370381B (en) * 2000-12-19 2003-12-24 Picochip Designs Ltd Processor architecture
USD453960S1 (en) * 2001-01-30 2002-02-26 Molded Products Company Shroud for a fan assembly
US6832261B1 (en) 2001-02-04 2004-12-14 Cisco Technology, Inc. Method and apparatus for distributed resequencing and reassembly of subdivided packets
JP2004524617A (en) 2001-02-14 2004-08-12 クリアスピード・テクノロジー・リミテッド Clock distribution system
GB2407673B (en) 2001-02-14 2005-08-24 Clearspeed Technology Plc Lookup engine
JP4475835B2 (en) * 2001-03-05 2010-06-09 富士通株式会社 Input line interface device and packet communication device
CA97495S (en) * 2001-03-20 2003-05-07 Flettner Ventilator Ltd Rotor
USD471971S1 (en) * 2001-03-20 2003-03-18 Flettner Ventilator Limited Ventilation cover
US6687715B2 (en) * 2001-06-28 2004-02-03 Intel Corporation Parallel lookups that keep order
US6922716B2 (en) 2001-07-13 2005-07-26 Motorola, Inc. Method and apparatus for vector processing
US7257590B2 (en) * 2001-08-29 2007-08-14 Nokia Corporation Method and system for classifying binary strings
US7283538B2 (en) * 2001-10-12 2007-10-16 Vormetric, Inc. Load balanced scalable network gateway processor architecture
US7317730B1 (en) * 2001-10-13 2008-01-08 Greenfield Networks, Inc. Queueing architecture and load balancing for parallel packet processing in communication networks
US6941446B2 (en) 2002-01-21 2005-09-06 Analog Devices, Inc. Single instruction multiple data array cell
US7382782B1 (en) 2002-04-12 2008-06-03 Juniper Networks, Inc. Packet spraying for load balancing across multiple packet processors
US20030235194A1 (en) * 2002-06-04 2003-12-25 Mike Morrison Network processor with multiple multi-threaded packet-type specific engines
US7200137B2 (en) * 2002-07-29 2007-04-03 Freescale Semiconductor, Inc. On chip network that maximizes interconnect utilization between processing elements
US8015567B2 (en) 2002-10-08 2011-09-06 Netlogic Microsystems, Inc. Advanced processor with mechanism for packet distribution at high line rate
GB0226249D0 (en) * 2002-11-11 2002-12-18 Clearspeed Technology Ltd Traffic handling system
US7656799B2 (en) 2003-07-29 2010-02-02 Citrix Systems, Inc. Flow control system architecture
US7620050B2 (en) 2004-09-10 2009-11-17 Canon Kabushiki Kaisha Communication control device and communication control method
US7787454B1 (en) 2007-10-31 2010-08-31 Gigamon LLC Creating and/or managing meta-data for data storage devices using a packet switch appliance
JP5231926B2 (en) 2008-10-06 2013-07-10 キヤノン株式会社 Information processing apparatus, control method therefor, and computer program
US8493979B2 (en) * 2008-12-30 2013-07-23 Intel Corporation Single instruction processing of network packets
US8014295B2 (en) 2009-07-14 2011-09-06 Ixia Parallel packet processor with session active checker

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4860191A (en) * 1985-08-08 1989-08-22 Nec Corporation Coprocessor with dataflow circuitry controlling sequencing to execution unit of data received in tokens from master processor
US5634015A (en) * 1991-02-06 1997-05-27 Ibm Corporation Generic high bandwidth adapter providing data communications between diverse communication networks and computer system
US5313582A (en) * 1991-04-30 1994-05-17 Standard Microsystems Corporation Method and apparatus for buffering data within stations of a communication network
US5544172A (en) * 1993-04-14 1996-08-06 Gpt Limited Method for the digital transmission of data
US5586119A (en) * 1994-08-31 1996-12-17 Motorola, Inc. Method and apparatus for packet alignment in a communication system
US5848290A (en) * 1995-03-09 1998-12-08 Sharp Kabushiki Kaisha Data driven information processor
US6104713A (en) * 1995-05-18 2000-08-15 Kabushiki Kaisha Toshiba Router device and datagram transfer method for data communication network system
US6275508B1 (en) * 1998-04-21 2001-08-14 Nexabit Networks, Llc Method of and system for processing datagram headers for high speed computer network interfaces at low clock speeds, utilizing scalable algorithms for performing such network header adaptation (SAPNA)
US6338078B1 (en) * 1998-12-17 2002-01-08 International Business Machines Corporation System and method for sequencing packets for multiprocessor parallelization in a computer network system
US6654809B1 (en) * 1999-07-27 2003-11-25 Stmicroelectronics Limited Data processing device
US6404752B1 (en) * 1999-08-27 2002-06-11 International Business Machines Corporation Network switch using network processor and methods
US6799267B2 (en) * 2000-03-06 2004-09-28 Fujitsu Limited Packet processor
US20020107903A1 (en) * 2000-11-07 2002-08-08 Richter Roger K. Methods and systems for the order serialization of information in a network processing environment

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030078997A1 (en) * 2001-10-22 2003-04-24 Franzel Kenneth S. Module and unified network backplane interface for local networks
US20050254486A1 (en) * 2004-05-13 2005-11-17 Ittiam Systems (P) Ltd. Multi processor implementation for signals requiring fast processing
US20130174166A1 (en) * 2011-12-29 2013-07-04 Oracle International Corporation Efficient Sequencer
US9542236B2 (en) * 2011-12-29 2017-01-10 Oracle International Corporation Efficient sequencer for multiple concurrently-executing threads of execution
US10310915B2 (en) 2011-12-29 2019-06-04 Oracle International Corporation Efficient sequencer for multiple concurrently-executing threads of execution
US11269692B2 (en) 2011-12-29 2022-03-08 Oracle International Corporation Efficient sequencer for multiple concurrently-executing threads of execution
CN107679621A (en) * 2017-04-19 2018-02-09 北京深鉴科技有限公司 Artificial neural network processing unit
CN107679620A (en) * 2017-04-19 2018-02-09 北京深鉴科技有限公司 Artificial neural network processing unit
US10902315B2 (en) 2017-04-19 2021-01-26 Xilinx, Inc. Device for implementing artificial neural network with separate computation units
US11768714B2 (en) 2021-06-22 2023-09-26 Microsoft Technology Licensing, Llc On-chip hardware semaphore array supporting multiple conditionals

Also Published As

Publication number Publication date
US20110083000A1 (en) 2011-04-07
GB2377519B (en) 2005-06-15
US8127112B2 (en) 2012-02-28
US20050242976A1 (en) 2005-11-03
GB2389689B (en) 2005-06-08
GB2374443B (en) 2005-06-08
JP2004524617A (en) 2004-08-12
JP2004525449A (en) 2004-08-19
US20030041163A1 (en) 2003-02-27
US20070220232A1 (en) 2007-09-20
US7290162B2 (en) 2007-10-30
US7917727B2 (en) 2011-03-29
WO2002065700A3 (en) 2002-11-21
GB2389689A (en) 2003-12-17
GB0319801D0 (en) 2003-09-24
AU2002233500A1 (en) 2002-08-28
GB0203634D0 (en) 2002-04-03
US20020159466A1 (en) 2002-10-31
US20040114609A1 (en) 2004-06-17
US7856543B2 (en) 2010-12-21
CN1613041A (en) 2005-05-04
US20070217453A1 (en) 2007-09-20
GB0203633D0 (en) 2002-04-03
GB2390506A (en) 2004-01-07
GB2374442B (en) 2005-03-23
WO2002065700A2 (en) 2002-08-22
GB2390506B (en) 2005-03-23
WO2002065259A1 (en) 2002-08-22
GB0203632D0 (en) 2002-04-03
CN1504035A (en) 2004-06-09
GB2374443A (en) 2002-10-16
GB2374442A (en) 2002-10-16
CN100367730C (en) 2008-02-06
US20050243827A1 (en) 2005-11-03
US8200686B2 (en) 2012-06-12
GB2377519A (en) 2003-01-15
US20040130367A1 (en) 2004-07-08
US7818541B2 (en) 2010-10-19
GB0321186D0 (en) 2003-10-08

Similar Documents

Publication Publication Date Title
US20020161926A1 (en) Method for controlling the order of datagrams
EP1247168B1 (en) Memory shared between processing threads
He et al. Matchmaking: A new MapReduce scheduling technique
US7443836B2 (en) Processing a data packet
US7882312B2 (en) State engine for data processor
US7676646B2 (en) Packet processor with wide register set architecture
US7047370B1 (en) Full access to memory interfaces via remote request
US6853382B1 (en) Controller for a memory system having multiple partitions
US5353418A (en) System storing thread descriptor identifying one of plural threads of computation in storage only when all data for operating on thread is ready and independently of resultant imperative processing of thread
US20080040575A1 (en) Parallel data processing apparatus
US20040098496A1 (en) Thread signaling in multi-threaded network processor
US6983462B2 (en) Method and apparatus for serving a request queue
US7415598B2 (en) Message synchronization in network processors
US7529800B2 (en) Queuing of conflicted remotely received transactions
US20030088755A1 (en) Method and apparatus for the data-driven synchronous parallel processing of digital data
US5761506A (en) Method and apparatus for handling cache misses in a computer system
DE112005002432B4 (en) Method and apparatus for providing a source operand for an instruction in a processor
US20050132380A1 (en) Method for hiding latency in a task-based library framework for a multiprocessor environment
KR100895536B1 (en) Data transfer mechanism
US5935235A (en) Method for branching to an instruction in a computer program at a memory address pointed to by a key in a data structure
US5848257A (en) Method and apparatus for multitasking in a computer system
Kappes et al. A lock-free relaxed concurrent queue for fast work distribution
CN115129480A (en) Scalar processing unit and access control method thereof
US6829647B1 (en) Scaleable hardware arbiter
US7191309B1 (en) Double shift instruction for micro engine used in multithreaded parallel processor architecture

Legal Events

Date Code Title Description
AS Assignment

Owner name: CLEARSPEED TECHNOLOGY LIMITED, UNITED KINGDOM

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:CAMERON, KEN;REEL/FRAME:012870/0068

Effective date: 20020314

AS Assignment

Owner name: CLEARSPEED TECHNOLOGY PLC, UNITED KINGDOM

Free format text: MERGER;ASSIGNOR:CLEARSPEED SOLUTIONS LIMITED;REEL/FRAME:017596/0727

Effective date: 20040701

Owner name: CLEARSPEED SOLUTIONS LIMITED, UNITED KINGDOM

Free format text: CHANGE OF NAME;ASSIGNOR:CLEARSPEED TECHNOLOGY LIMITED;REEL/FRAME:017596/0686

Effective date: 20040701

AS Assignment

Owner name: CLEARSPEED TECHNOLOGY LIMITED, UNITED KINGDOM

Free format text: CHANGE OF NAME;ASSIGNOR:CLEARSPEED TECHNOLOGY PLC;REEL/FRAME:024576/0975

Effective date: 20090729

AS Assignment

Owner name: RAMBUS INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:CLEARSPEED TECHNOLOGY LTD;REEL/FRAME:024964/0861

Effective date: 20100818

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO PAY ISSUE FEE