WO2004023718A2 - Systems and methods for packet flow regulation and transmission integrity verification of a switching entity - Google Patents

Systems and methods for packet flow regulation and transmission integrity verification of a switching entity

Info

Publication number
WO2004023718A2
Authority
WO
WIPO (PCT)
Prior art keywords
packets
egress
ingress
receipt
acknowledgement
Prior art date
Application number
PCT/CA2003/001353
Other languages
French (fr)
Other versions
WO2004023718A3 (en)
Inventor
Mohammed Sammour
Jean Belanger
Marcelo A. R. De Maria
Original Assignee
4198638 Canada Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 4198638 Canada Inc. filed Critical 4198638 Canada Inc.
Priority to AU2003264210A priority Critical patent/AU2003264210A1/en
Publication of WO2004023718A2 publication Critical patent/WO2004023718A2/en
Publication of WO2004023718A3 publication Critical patent/WO2004023718A3/en

Classifications

    • H - ELECTRICITY
      • H04 - ELECTRIC COMMUNICATION TECHNIQUE
        • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
          • H04L 47/00 - Traffic control in data switching networks
            • H04L 47/10 - Flow control; Congestion control
              • H04L 47/26 - Flow control; Congestion control using explicit feedback to the source, e.g. choke packets
              • H04L 47/27 - Evaluation or update of window size, e.g. using information derived from acknowledged [ACK] packets
              • H04L 47/29 - Flow control; Congestion control using a combination of thresholds
          • H04L 49/00 - Packet switching elements
            • H04L 49/30 - Peripheral units, e.g. input or output ports
            • H04L 49/50 - Overload detection or protection within a single switching element

Definitions

  • the present invention relates generally to switching of packets by a switching entity and, in particular, to methods and apparatus for regulating the flow of packets through the switching entity and verifying the integrity of transmission through the switching entity.
  • Switch fabrics are often used to route traffic between end points in a network.
  • the need for regulating the flow of traffic through a switching entity arises whenever there is a risk of congestion in the switch fabric. Congestion results in the ingress or switch fabric buffers holding a large number of packets which, for some reason, cannot leave the switch fabric at the rate they are entering. This leads to two major problems. Firstly, the degree of out-of-order packets that may be received at the egress will be higher than what the egress is dimensioned for, which can cause significant reordering problems. Disadvantageously, receipt of out-of-order packets can lead to corruption in the reassembled frames.
  • the degree of congestion in the switch fabric may cause the frame reassembly process of certain frames to take a very long time, resulting in a reduction in the outgoing bit rate, effectively introducing blocking.
  • the present invention provides a scheme used for regulating packet flow through the switch fabric.
  • the principle behind this scheme is based on exchanging tokens between the egress and ingress.
  • the present invention may be broadly summarized as a system for regulating packet flow through a switching entity.
  • the system comprises an ingress capable of sending packets to the switching entity in a designated order and an egress capable of receiving packets from the switching entity, re-ordering the packets in the designated order and sending an acknowledgement of receipt of the packets to the ingress upon re-ordering.
  • the ingress is adapted to receive acknowledgements of receipt of packets from the egress, maintain an indication of a number of packets for which an acknowledgement of receipt has not yet been received from the egress, perform a comparison between the number of packets for which an acknowledgement of receipt has not yet been received from the egress and a threshold, and regulate sending packets to the switching entity on the basis of the comparison.
  • the present invention provides a method of regulating packet flow through a switching entity.
  • the method comprises sending packets from an ingress to the switching entity in a designated order, receiving packets from the switching entity at an egress, re-ordering the packets in the designated order and sending an acknowledgement of receipt of the re-ordered packets to the ingress, receiving acknowledgements of receipt of packets from the egress, maintaining an indication of a number of packets for which an acknowledgement of receipt has not yet been received from the egress, performing a comparison between the number of packets for which an acknowledgement of receipt has not yet been received from the egress and a threshold, and regulating sending packets to the switching entity on the basis of the comparison.
  • the present invention provides computer-readable storage media containing a program element for execution by a computing device to implement an ingress for regulating packet flow through a switching entity.
  • the ingress comprises a control entity operative to send packets to the switching entity in a designated order, receive acknowledgements of receipt of re-ordered packets from an egress, maintain an indication of a number of packets for which an acknowledgement of receipt has not yet been received from the egress, perform a comparison between the number of packets for which an acknowledgement of receipt has not yet been received from the egress and a threshold, and regulate sending packets to the switching entity on the basis of the comparison.
  • the present invention provides a scheme used for verifying the transmission integrity of the switch fabric.
  • the principle behind this scheme is based on exchanging marker packets at identifiable reference instants.
  • the present invention may be broadly summarized as a system for assessing integrity of a flow of packets through a switching entity.
  • the system comprises an ingress capable of sending packets to the switching entity, the packets including a reference packet sent to the switching entity at a reference instant and an egress capable of receiving packets from the switching entity and sending to the ingress an acknowledgement of receipt of packets from the switching entity.
  • the ingress is adapted to receive acknowledgements of receipt of packets from the egress, maintain a current indication of packets for which an acknowledgement of receipt has not yet been received from the egress, store a first data element indicative of packets for which an acknowledgement of receipt had not yet been received from the egress at the reference instant, maintain a second data element indicative of packets for which an acknowledgement of receipt is received from the egress between the reference instant and the instant at which an acknowledgement of receipt of the reference packet is received from the egress, perform a comparison of the first and second data elements, and assess integrity of the flow of packets on the basis of the comparison.
  • a method of assessing integrity of a flow of packets through a switching entity comprises sending packets from an ingress to the switching entity, the packets including a reference packet sent to the switching entity at a reference instant, receiving packets from the switching entity at an egress and sending to the ingress an acknowledgement of receipt of packets from the switching entity, receiving acknowledgements of receipt of packets from the egress, maintaining a current indication of packets for which an acknowledgement of receipt has not yet been received from the egress, storing a first data element indicative of packets for which an acknowledgement of receipt had not yet been received from the egress at the reference instant, maintaining a second data element indicative of packets for which an acknowledgement of receipt from the egress is received between the reference instant and the instant at which an acknowledgement of receipt of the reference packet is received from the egress, performing a comparison of the first and second data elements, and assessing integrity of the flow of packets on the basis of the comparison.
  • the present invention provides computer-readable storage media containing a program element for execution by a computing device to implement an ingress for regulating packet flow through a switching entity.
  • the ingress comprises a control entity operative to send packets to the switching entity, the packets including a reference packet sent to the switching entity at a reference instant, receive acknowledgements of receipt of packets from an egress, maintain a current indication of packets for which an acknowledgement of receipt has not yet been received from the egress, store a first data element indicative of packets for which an acknowledgement of receipt had not yet been received from the egress at the reference instant, maintain a second data element indicative of packets for which an acknowledgement of receipt from the egress is received between the reference instant and the instant at which an acknowledgement of receipt of the reference packet is received from the egress, perform a comparison of the first and second data elements, and assess integrity of the flow of packets on the basis of the comparison.
  • Fig. 1 is a block diagram of a switching entity disposed between a plurality of ingresses and a plurality of egresses;
  • Fig. 2 illustrates steps executed by the ingress (on the left-hand side) and egress (on the right-hand side) in order to implement a packet flow regulation scheme in accordance with an embodiment of the present invention
  • Figs. 3 and 4 show additional steps executed by the ingress in order to implement a transmission integrity verification scheme in accordance with an embodiment of the present invention.
  • a system 10 that comprises plural ingresses 12 and plural egresses 14 connected on either side of a switching entity, such as a switch fabric 16.
  • the ingresses 12 and the egresses 14 are functional entities that are logically interconnected to one another although they may or may not be physically distinct.
  • the ingresses 12 receive data from a source external to the system 10. The data is received via a plurality of physical or logical input queues (IQ) 24.
  • the ingresses 12 process the data and provide packets 13 to the switch fabric 16.
  • the ingresses 12 can be uniquely associated with individual input ports 18 of the switch fabric 16.
  • the ingresses 12 provide the packets 13 to the switch fabric 16 via its input ports 18.
  • the switch fabric 16 can be a centralized entity or a distributed entity made up of smaller interconnected entities, and those smaller entities may physically reside in different chassis. Thus, it will be appreciated that the present invention applies to both centralized and distributed architectures of the switch fabric 16.
  • the switch fabric 16 generally has the capacity to switch packets from its input ports 18 to selected ones of a plurality of output ports 20, in accordance with routing instructions.
  • the routing instructions provided to the switch fabric 16 may arrive from an external source and may take the form of instructions to "switch input port A to output port B".
  • the routing instructions will be implicit in the information carried by the received packets themselves.
  • a given packet provided at input port A may specify an end destination, which is examined by the switch fabric 16 and translated by the switch fabric 16 into an output port B based on routing tables and the like.
  • routing tables could be stored in a memory 21 that is internal or external to the switch fabric 16.
  • the egresses 14 can be uniquely associated with individual ones of the output ports 20 of the switch fabric 16. Each of the egresses 14 receives packets 13' from the output ports 20 of the switch fabric 16, performs processing and provides processed data to an optoelectronic converter (O/E) 28 via an output queue (OQ) 26.
  • the opto-electronic converter 28 transforms the processed data received from the output queue 26 into an optical signal that is sent to an entity external to the system 10.
  • packets sharing a common quality of service requirement may have their own packet flow regulation schemes, irrespective of source-destination pair.
  • the characteristic may be priority, bandwidth class, and so on, or any combination of two or more characteristics.
  • the characteristic is the source- destination pair (also known as a "flow")
  • the ingress-side functionality of the packet flow regulation scheme is localized to a single one of the ingresses, denoted 12
  • the egress-side functionality of the packet flow regulation scheme is localized to a single one of the egresses, denoted 14.
  • the ingress-side functionality and the egress-side functionality of each one of a plurality of packet flow regulation schemes may be distributed across multiple ones of the ingresses 12 and egresses 14, respectively.
  • the ingress 12 is adapted to send the packets 13 to the switch fabric 16 in a designated order.
  • This can be referred to as performing a "sequencing operation" on a series of packets arriving at the ingress from an external entity.
  • the sequencing operation is shown as box 202.
  • the designated order is maintained / differentiated through the use of numbers (called sequence numbers).
  • the size of the sequence number space is N.
  • the value of N is chosen to be sufficiently large so as to account for the maximum delay through the switch fabric 16.
  • the designated order can be established, for example, by modifying the header of each packet 13 so as to implement a linked list of numbers from zero to (N-1) and back to zero again. This is but one simple technique that permits the packets 13 to be eventually re-sequenced upon receipt at the egress 14 in a possibly out-of-order fashion.
  • Other boxes pertaining to the ingress 12 in Fig. 2 represent operations that form part of a packet flow regulation scheme that controls the rate at which the packets 13 are released into the switch fabric 16.
  • the packet flow regulation scheme will be described in further detail herein below.
  • the ingress 12 may include various other features, such as an arbiter or scheduler connected to the ingress queues 24, which controls the release of packets into the switch fabric 16, subject to the packet flow regulation scheme.
  • the order of the packets 13' received from the switch fabric 16 may be different from the designated order. This can be the result of queuing, arbitration or other processing features of the switch fabric 16. Accordingly, the egress 14 receives packets 13' from the switch fabric 16 and is adapted to re-order the packets 13' so that they re-acquire the designated order. This is known as a re-sequencing operation, as shown in box 252. The re-sequencing operation can be effected by, for example, examining the contents of the header in each packet 13' received from the switch fabric 16.
  • the re-sequenced packets undergo further processing and the processed data is then sent to the opto-electronic converter 28 via the output queue 26 (see box 254). It will be understood that independent re-sequencing operations may be performed for each of the packet flow regulation schemes being implemented, e.g., on the basis of a characteristic such as source-destination pair, quality of service, priority, bandwidth class, etc.
  • the egress 14 keeps track of successfully re-sequenced packets by updating a "token credit counter" (see box 256), that may be implemented in software, for example.
  • an acknowledgement of receipt of the re-sequenced packets is sent back to the ingress 12.
  • the acknowledgement can be sent to the ingress 12 via the switch fabric 16 or through a separate external link (not shown).
  • the acknowledgement is sent back to the ingress 12 after the processed data corresponding to the packets 13' has exited the output queue 26 (i.e., just prior to opto-electronic conversion by the opto-electronic converter 28).
  • the acknowledgement may take the form of a special packet, referred to as a "token credit packet" (TCP) 22, which distinguishes itself from other packets by, e.g., a code in its header.
  • a token credit packet 22 may be sent each time a packet is re-sequenced at the egress 14; however, this approach may create congestion in the direction from the egress 14 to the ingress 12.
  • a token credit packet 22 is sent once the value of the token credit counter exceeds a value M, where M is a desired integer, possibly although not necessarily less than N. This is shown in box 258. M can be fixed or time varying.
  • M could be made to vary on the basis of a measure of the amount of available return bandwidth through the switch fabric 16. This measure of the amount of available return bandwidth may be effected by the switch fabric 16 and communicated to the egress 14. Alternatively, M can remain fixed, but an additional condition for sending a token credit packet could be that the available return bandwidth be above a particular threshold. Other embodiments contemplate allocating the token credit packet 22 with the highest possible priority in order to ensure that it is timely received at the ingress 12 so as not to unduly slow down the flow through the switch fabric 16.
  • a function of the token credit packet 22 is to acknowledge the receipt of M successfully re-sequenced packets with which a particular token credit packet 22 is associated.
  • where M is fixed, it is not necessary to transmit the value M, as this information can be implicitly known by the ingress 12 from the mere fact that a token credit packet 22 is received (e.g., which will impliedly convey that the token credit counter has reached the value M). In other cases, where M is time varying, this information can be specifically embedded in the body of the token credit packet 22.
  • the token credit counter is updated again. This is shown in box 260.
  • where the egress 14 is designed so as to send a token credit packet 22 after the token credit counter reaches a value M, updating the token credit counter consists of decrementing it by M.
  • the acknowledgement can take the form of a "token refresh packet" (TRP) 40.
  • the appropriate moment for triggering the transmittal of a token refresh packet may be dependent on received information (e.g., receipt of a special type of packet, called a "marker packet", as will be described later on, or an amount of elapsed time since previous transmittal of an acknowledgement).
  • the token refresh packet 40 is sent. This is shown in box 262.
  • the token credit counter is updated. This is shown in box 264 and basically consists of resetting the token credit counter to zero.
  • a function of the token refresh packet 40 is to acknowledge that a certain number of packets have been successfully received and re-sequenced at the egress 14. This number is equal to the value of the token credit counter at the time the acknowledgement is generated. Since this number is not known ahead of time, the value of the token credit counter can be specifically embedded in the body of the token refresh packet 40.
  • the token credit counter is updated again. This is shown in box 264. In particular, this consists of resetting the token credit counter to zero.
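  • By way of illustration only (this sketch is not part of the patent, and the class, method and callback names are assumed), the egress-side bookkeeping of boxes 256 to 264 could be modelled in Python roughly as follows: the token credit counter is incremented per re-sequenced packet, a token credit packet is emitted and the counter decremented by M once the counter reaches M, and a token refresh packet carrying the current count resets the counter to zero. For example, with M = 4, six re-sequenced packets would produce one token credit packet and, on a refresh trigger, one token refresh packet carrying a count of 2:

        class EgressTokenAccounting:
            """Illustrative egress-side counter for token credit / refresh packets."""

            def __init__(self, m, send_ack):
                self.m = m                # threshold M for emitting a token credit packet
                self.credits = 0          # token credit counter (box 256)
                self.send_ack = send_ack  # callback carrying the acknowledgement to the ingress

            def on_packet_resequenced(self):
                self.credits += 1                                 # box 256
                if self.credits >= self.m:                        # box 258
                    self.send_ack({"type": "token_credit"})       # implicitly worth M packets
                    self.credits -= self.m                        # box 260

            def on_refresh_trigger(self):
                # Boxes 262 and 264: e.g. on receipt of a marker packet, acknowledge
                # whatever has accumulated, embedding the exact count in the packet body.
                self.send_ack({"type": "token_refresh", "count": self.credits})
                self.credits = 0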
  • the ingress 12 applies a packet flow regulation scheme in order to alleviate congestion at the switch fabric 16.
  • the ingress 12 keeps track of the number of packets that have been fed to the switch fabric 16 and for which an acknowledgement of receipt has not yet been received from the egress 14. Let the number of such as-yet-unacknowledged packets equal L(t), where t is a time variable denoting that the quantity L(t) will change over time.
  • the ingress 12 keeps track of L(t) by, for example, implementing what can be referred to as a "reverse token bucket", where a "token" refers to a packet for which an acknowledgement has not yet been received.
  • the fill level L(t) of the reverse token bucket is generally increased upon sending packets to the switch fabric 16 and is reduced upon receipt of a token credit packet 22 or token refresh packet 40 from the egress 14.
  • the reverse token bucket may be one of many similar buckets stored in a table in memory, such table being referred to as a "token bucket table".
  • the token bucket table is indexed on a per-FLOW basis and thus, for each different FLOW being handled by the ingress, the appropriate entry in the token bucket table is consulted.
  • the reverse token bucket will, after the fifteenth (15th) packet 13 is sent into the switch fabric 16, drop to the value 10. Subsequently, the fill level L(t) of the reverse token bucket will climb gradually again to 15 and then drop to 10; this pattern will continue indefinitely, assuming the ideal circumstances whereby there is no congestion through the switch fabric 16. Thus, under ideal circumstances, the fill level L(t) of the reverse token bucket is roughly indicative of the expected latency of the switch fabric 16.
  • the reverse token bucket is compared to a threshold, denoted a, which represents a demarcation between an acceptable and an unacceptable delay through the switch fabric 16. This is shown in box 210 in Fig. 2.
  • the comparison can be effected in hardware, software or a combination thereof.
  • the comparison can be effected for each packet or for each group of packets.
  • the ingress 12 can regulate the transmission of packets being fed to the switch fabric 16.
  • if the fill level L(t) of the reverse token bucket is below the threshold a, a next group of packets could continue to be sent to the switch fabric 16 (see box 212), while if the fill level L(t) of the reverse token bucket is greater than or equal to the threshold a, then the ingress 12 could be made to refrain from sending the next group of packets to the switch fabric 16 and, instead, place the packets in temporary storage, such as a buffer (see box 214).
  • the act of refraining from sending packets to the switch fabric 16 need not be applied to the next packet (or group of packets) but rather to the following one.
  • transmission of the next packet (or group of packets) may be permitted regardless of the fill level L(t) of the reverse token bucket, although transmission of the subsequent packet (or group of packets) will be placed on hold.
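  • As a non-authoritative sketch of the ingress-side behaviour of boxes 206, 208 and 210 (the class and method names are assumed, not taken from the patent), the reverse token bucket and its comparison with the backpressure watermark a could look like this in Python; sending proceeds only while may_send_next_group() returns True, otherwise the next group is buffered (box 214) until acknowledgements lower L(t):

        class ReverseTokenBucket:
            """Illustrative count of as-yet-unacknowledged packets, L(t)."""

            def __init__(self, watermark_a):
                self.level = 0                   # L(t)
                self.watermark_a = watermark_a   # threshold a (backpressure watermark)

            def on_group_sent(self, group_size=1):
                self.level += group_size         # box 208: increment when packets are sent

            def on_ack(self, acked_count):
                self.level -= acked_count        # box 206: M for a token credit packet, or the
                                                 # count carried in a token refresh packet

            def may_send_next_group(self):
                # Box 210: hold the next group once L(t) has reached the watermark.
                return self.level < self.watermark_a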
  • a memory element in the ingress is made to contain a value S, which is the time-varying result of the operation sgn(L(t) - a).
  • a binary decision to continue to send, or refrain from sending, packets to the switch fabric 16 is made on the basis of whether S is or is not equal to -1.
  • the value of S may be conveniently stored in a table, which may be termed a "backpressure table" 30.
  • the backpressure table 30 is indexed on a per-FLOW basis and thus, for each flow being handled by the ingress 12, the entry in the backpressure table corresponding to the appropriate FLOW is consulted.
  • An advantage of using the backpressure table 30 instead of performing the comparison of the fill level L(t) of the reverse token bucket to the threshold a stems from the fact that the entries in the backpressure table 30 can be modified from "behind the scenes", namely as a result of the occurrence of other events, not only on the basis of the difference between the fill level L(t) of the reverse token bucket and the threshold a.
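  • A minimal sketch of this S-value variant (helper and table names assumed), in which S = sgn(L(t) - a) is stored per flow in a backpressure table and consulted instead of repeating the comparison, might read:

        def sgn(x):
            return (x > 0) - (x < 0)

        backpressure_table = {}   # flow_id -> S, i.e. sgn(L(t) - a) for that flow

        def update_backpressure(flow_id, level_l, watermark_a):
            # May also be updated "behind the scenes" by other events.
            backpressure_table[flow_id] = sgn(level_l - watermark_a)

        def may_send(flow_id):
            # The ingress keeps sending only while S == -1, i.e. while L(t) < a.
            return backpressure_table.get(flow_id, -1) == -1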
  • the decision to continue to send, or refrain from sending, packets to the switch fabric 16 can be made continually, periodically or sporadically, depending on various design parameters, such as the desired responsiveness of the ingress 12 to changes in the latency through the switch fabric 16, the available processing power of the ingress 12, and so on. For instance, this may lead to a design in which a comparison between the fill level L(t) of the reverse token bucket and the threshold a, or alternatively the reading of the value S, is performed by the ingress 12 for each group of successive packets, so that the result of the comparison will lead to a decision to transmit or not transmit the next (or subsequent group of) packets.
  • the reverse token bucket is incremented by the number of packets in the group (see box 208). This update to the reverse token bucket may be done shortly before or shortly after packets have in actuality been sent to the switch fabric 16. It is recalled that the fill level of the reverse token bucket is decremented upon receipt of a token credit packet or token refresh packet 40 from the egress 14 (see box 206). Thus, it is envisaged that the fill level L(t) of the reverse token bucket may be increased in increments greater than one and decremented in increments of M. It is recalled that the threshold a represents the maximum number of unacknowledged packets that are allowed to leave the ingress 12.
  • the threshold a can thus be referred to as a "backpressure watermark". While in some embodiments a may be fixed, in other embodiments it may vary over time from amongst a set of possible backpressure watermark values, for example. This means that traffic can be throttled even though the switch fabric 16 is not operating at maximum capacity. This enhancement may be used by the egress 14 to slow down (regulate) the rate at which each ingress 12 is sending traffic to it, based on a set of regulation criteria.
  • Examples of such regulation criteria include an indication of resource (e.g., memory) utilization at the egress 14.
  • the egress 14 may be equipped with suitable circuitry, software and/or control logic for monitoring usage of a resource (such as memory) to determine a resource utilization level. If multiple egresses share the same resources (e.g., memory space), then the same resource utilization level could be used by all of the egresses concerned.
  • the egress 14 is adapted to send the resource utilization level to the ingress 12.
  • the ingress 12 determines the backpressure watermark in accordance therewith, e.g., by assessing whether the resource utilization is considered low, medium or high.
  • the egress 14 is adapted to perform the further step of determining the backpressure watermark.
  • This backpressure watermark is then sent to the ingress, e.g., via the switch fabric 16 or an external link.
  • the backpressure watermark may be selected from a fixed set of watermarks that are associated with respective codes known to the ingress 12.
  • the egress 14 may send the code to the ingress 12, allowing the ingress 12 to set the backpressure watermark in accordance with the code. Again, transmission of such information from the egress 14 to the ingress 12 can be achieved through the switch fabric 16 or via a separate link (not shown).
  • a first threshold may be set by the current level of resource utilization and a second threshold may be set by a known storage capacity of the switch fabric 16.
  • it is reasonable for the actual threshold used by the ingress 12 as the backpressure watermark to be the minimum of the two thresholds.
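  • For instance (the mapping, function name and numeric values below are purely assumed for illustration), the ingress could derive a candidate watermark from the utilization class reported by the egress and clamp it by the threshold implied by the known storage capacity of the switch fabric:

        # Assumed example values only: map the egress's reported resource utilization
        # (low / medium / high) to a candidate backpressure watermark.
        WATERMARK_BY_UTILIZATION = {"low": 64, "medium": 32, "high": 8}

        def backpressure_watermark(utilization_class, fabric_capacity_threshold):
            # Use the more restrictive of the two thresholds.
            return min(WATERMARK_BY_UTILIZATION[utilization_class], fabric_capacity_threshold)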
  • the above feature could also be used to implement a slow-start mechanism or a token-based bandwidth allocation and control mechanism.
  • a positive token bucket keeps track of a number of unacknowledged packets that are still allowed to be emitted by the ingress 12.
  • the number of packets so transmitted is subtracted from the positive token bucket.
  • acknowledgements received from the egress 14 will tend to add to the fill level of the positive token bucket. If the fill level of the positive token bucket falls to zero, the ingress 12 is adapted to refrain from sending further packets to the switch fabric 16.
  • the positive token bucket is used to regulate the flow of packets through the switch fabric 16.
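  • The positive-token-bucket alternative could be sketched as follows (illustrative only, names assumed), counting packets the ingress is still permitted to emit rather than packets awaiting acknowledgement:

        class PositiveTokenBucket:
            """Illustrative variant: allowance of packets still permitted to be emitted."""

            def __init__(self, initial_allowance):
                self.allowance = initial_allowance

            def on_group_sent(self, group_size=1):
                self.allowance -= group_size     # transmitted packets are subtracted

            def on_ack(self, acked_count):
                self.allowance += acked_count    # acknowledgements add to the fill level

            def may_send(self, group_size=1):
                # Refrain from sending once the bucket would be emptied.
                return self.allowance >= group_size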
  • the above packet flow regulation scheme implemented by the cooperation of the ingress 12 and egress 14 in the system 10 controls the extent to which out-of-order packets occur, and hence can control congestion through the switch fabric 16 by ensuring that the egress 14 will not receive more packets than what it is prepared to accept.
  • each of the "packets" described above can be considered to be a "frame" consisting of multiple "segments". Multiple frames are assumed to be received from an external entity by the ingress 12 and occupy the designated order.
  • the segments within a given frame also define a specific order within their frame.
  • the ingress 12 and egress 14 implement a pure frame-based mode of the packet flow regulation scheme, which can be specified by a software flag at the ingress 12.
  • the re-ordering of segments within a frame is not considered in the pure frame-based mode of operation, although in another embodiment, both frame-based and non-frame-based modes of the packet flow regulation scheme may coexist simultaneously and independently.
  • To implement the pure frame-based mode of operation of the packet flow regulation scheme, a sequencing and re-sequencing operation is performed amongst the frames themselves. It is recalled that the "packets" referred to above now represent "frames". Each frame has a sequence number which ranges from zero to (N-1).
  • the designated order referred to above refers to the order in which the frames are sent into the switching entity 16 by the ingress 12.
  • the reverse token bucket now counts unacknowledged frames sent by the ingress 12 into the switch fabric 16.
  • L(t) represents the number of as yet unacknowledged frames from the point of view of the ingress 12.
  • the token credit counter tracks the number of properly re-sequenced frames at the egress 14.
  • the egress 14 sends a token credit packet 22 to the ingress 12. Because the ingress 12 operates a frame-based packet flow regulation scheme, the ingress will know that the token credit packet 22 represents a total of M properly re-sequenced frames and updates the fill level L(t) of the reverse token bucket accordingly.
  • while the above-described frame-based mode of operation tends to be characterized by a slower responsiveness to congestion, it tends to utilize less bandwidth in the egress-to-ingress direction than a non-frame-based mode of operation of the packet flow regulation scheme, since a token credit packet is sent by the egress 14 only once M frames have been successfully re-sequenced.
  • the ingress 12 is adapted to perform a monitoring operation on the token credit packets 22 received from the egress 14.
  • the token credit packets 22 can be enhanced so that they identify not only the number of packets for which they are acknowledging receipt, but also the identity of those packets themselves (e.g., by specifying a range of packet numbers).
  • the ingress 12 detects that there is a gap in the packets being acknowledged by the token credit packets 22, the ingress 12 can re-transmit the missing packets 13.
  • missing token credit packets 22 may be symptomatic of a more significant transmission integrity problem, which could necessitate the invocation of a packet verification process.
  • a verification process may also be triggered by the egress 14 which can be designed to monitor transmission integrity through the switch fabric 16 and to detect anomalies such as missing packets or extensive delays, as well as generalized failures of the ingress 12, egress 14 or switch fabric 16.
  • a verification process, or “transmission integrity verification scheme”, in accordance with an embodiment of the present invention contemplates the use of reference, or “marker”, packets.
  • the ingress 12 is adapted to send a marker packet 50 through the switch fabric 16 at a reference instant, hereinafter denoted TREF.
  • while the reference instant may be an instant in time, it may also represent, e.g., the sequence number of the most recently received packet from the outside world.
  • the reference instant TREF refers to an event whose origins are detectable by the ingress 12 and need not be an absolute time reference.
  • the reference instant TREF may occur at predetermined intervals or it may be determined dynamically by a control entity inside or outside the ingress 12.
  • TREF may be an instant in the future that includes an offset indicative of a maximum elapsed time (or number of received packets from the external world) since receipt of the most recent acknowledgement of received packets from the egress 14.
  • TREF may be triggered by a condition in the ingress 12, the egress 14 or the switch fabric 16, such as attaining a maximum permitted resource utilization (e.g., memory or processing).
  • TREF is the time at which an integrity problem is detected, such as a missing acknowledgement or detection of a lost packet.
  • TREF may be neither pre-determined nor determined by a control entity. Instead, it may be set by the arrival of a marker packet 50 from outside the ingress 12, in which case TREF is taken to be the current time.
  • the contents (e.g., header or body) of each received packet can be examined.
  • the marker packet 50 could originate as an ordinary (non-marker) packet received from outside the ingress 12 which is then modified (e.g., by changing its header) to turn it into a marker packet; again TREF is taken to be the current time.
  • the ingress 12 checks to see whether the current time has reached TREF (box 302) and, if so, sends the marker packet 50 into the switch fabric 16. This is shown in Fig. 3 at box 302. It is noted that the steps in Fig. 3 could be executed by the ingress 12 at a point marked by the circled number "3" in Fig. 2, i.e., upon receipt of a packet from an entity external to the ingress 12. Also at the reference instant TREF, the ingress 12 resets a second counter (see box 304), which starts counting the number of packets for which an acknowledgement of receipt is received from the egress 14, starting at time TREF. The value of the second counter at a given time t can be denoted D(t). The manner in which the second counter is incremented will be described in further detail below.
  • L(TREF) represents the number of packets for which an acknowledgement had not been received at the reference instant TREF. It will be appreciated that L(TREF) is in fact the sum of the number of packets in four categories:
  • the egress 14 continues to operate in the manner described above with reference to Fig. 2. Specifically, the egress 14 sends token credit packets 22 at certain instants when the token credit counter in the egress 14 reaches the value M (see box 258). In addition, the egress 14 is adapted to send a token refresh packet 40, which, as recalled, is sent to the ingress 12 without waiting for the token credit counter to reach the value M (see box 262). In accordance with the present embodiment, one of the circumstances under which a token refresh packet 40 is sent to the ingress 12 (i.e., one of the conditions under which box 262 will yield a result of "YES") is upon receipt of the marker packet 50 from the switch fabric 16.
  • the token refresh packet sent to acknowledge successful receipt of the marker packet 50 is specially denoted 40_50.
  • Such a token refresh packet 40_50 indicates to the ingress 12 the value of the token credit counter in the egress 14 at the time the token refresh packet 40_50 is being sent to the ingress 12. Following transmission of the token refresh packet 40_50, the token credit counter in the egress 14 is reset as for any other token refresh packet (see box 264).
  • the acknowledgements registered in this manner include those received by the ingress 12 in the form of a token credit packet 22 that explicitly or implicitly acknowledges successful receipt of M packets by the egress 14, as well as those received by the ingress 12 in the form of a token refresh packet 40 (or 40_50) that acknowledges successful receipt of a number of packets specified in the token refresh packet itself. It is noted that the second counter is being incremented by an amount equal to that by which the reverse token bucket counter is being decremented.
  • the second counter is incremented (see box 308).
  • let TREF+dt denote the time at which the token refresh packet 40_50 is received by the ingress 12 (see boxes 310 and 312). It is at this point that the ingress 12 can be assured that it has received acknowledgements for all packets that belong to categories A), B) and C) above.
  • the number of packets for which these acknowledgements have been received is stored in the value D(TREF+dt) of the second counter, since the second counter started counting when there were still no acknowledgements from any of the packets belonging to categories A), B) and C) above.
  • the marker packet 50 transmitted at time TREF flushes out the token credit counter in the egress 14 and thus, none of the packets which may have been in transit prior to the transmission of the marker packet 50 should be unaccounted for at the time that the acknowledgement of the marker packet 50 is received at the ingress 12.
  • this difference is attributable to the number of lost packets, i.e., category D) above.
  • this difference can be used to correct (i.e., calibrate) the current value L(t) of the reverse token bucket at time t, by subtracting from it the difference L(TREF) - D(TREF+dt) (see box 316). If the difference is negative, then the net change to L(t) will be positive and signifies the possibility that packets were not lost but that excessive / erroneous acknowledgements may have been produced.
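  • Putting the above together, a hedged Python sketch of the verification arithmetic of Figs. 3 and 4 (class and method names assumed; the bucket object is the reverse token bucket sketched earlier, and the count carried by the refresh packet that acknowledges the marker is itself fed to on_ack() before on_marker_acknowledged() is called) might read:

        class IntegrityCheck:
            """Illustrative marker-based transmission integrity verification."""

            def __init__(self, bucket):
                self.bucket = bucket       # exposes .level, i.e. L(t)
                self.l_at_ref = 0          # first data element: L(TREF)
                self.d = 0                 # second data element: D(t)

            def on_marker_sent(self):
                # At TREF: snapshot L(TREF) and reset the second counter (box 304).
                self.l_at_ref = self.bucket.level
                self.d = 0

            def on_ack(self, acked_count):
                # Box 308: count acknowledgements received after TREF.
                self.d += acked_count

            def on_marker_acknowledged(self):
                # At TREF+dt: packets still unaccounted for are presumed lost, and the
                # same difference calibrates L(t) (box 316).
                lost = self.l_at_ref - self.d
                self.bucket.level -= lost      # a negative 'lost' raises L(t) instead
                return lost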
  • all token refresh packets 40 acknowledge the receipt of marker packets 50 at the egress 14 (i.e., 40 is equivalent to 40_50).
  • the token credit counter in the egress 14 is not reset following the transmission of a token refresh packet 40 (i.e., box 264 is eliminated).
  • box 206 will only apply when a token credit packet 22 is received (and not when a token refresh packet 40 is received) at the ingress 12.
  • the algorithm/arithmetic used by the ingress 12 to assess the integrity of the flow of packets will continue to operate as shown in Figs. 3 and 4. This provides yet further independence between the flow regulation and transmission integrity verification schemes.
  • the system 10 can be made to account for loss of the token refresh packet 40_50.
  • the ingress 12 starts a timer at time TREF, i.e., when the marker packet 50 is sent into the switch fabric 16.
  • the timer has an expiry time that is arbitrary and may be designed to take into account an expected reasonable delay before receiving the token refresh packet 40_50 from the egress 14. If no token refresh packet is received prior to the expiry time of the timer, then this may mean that the token refresh packet 40_50 is lost.
  • the ingress 12 may decide to send a second marker packet at an instant TREF2. The ingress 12 then considers TREF2 as being the reference instant.
  • upon receipt of a token refresh packet acknowledging the second marker packet at time TREF2+dt, the ingress 12 computes the number of lost packets (which refers to the number of packets that had been lost at time TREF2) as being L(TREF2) - D(TREF2+dt).
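  • As an assumed-behaviour sketch only (the function and callback names are invented), the timer and retry logic around the marker packet could be expressed as:

        def verify_with_retry(check, send_marker, wait_for_marker_ack, timeout_s=0.5):
            # check: an IntegrityCheck as sketched above; timeout_s is an assumed value
            # approximating a reasonable delay for the token refresh packet to arrive.
            while True:
                check.on_marker_sent()            # each retry redefines the reference instant
                send_marker()
                if wait_for_marker_ack(timeout_s):
                    return check.on_marker_acknowledged()
                # Otherwise the refresh packet is presumed lost; loop and try again.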
  • the ingress 12 and egress 14 may be implemented as a processor having access to a code memory which stores program instructions for the operation of the processor.
  • the program instructions could be stored on a medium which is fixed, tangible and readable directly by the processor, (e.g., removable diskette, CD-ROM, ROM, or fixed disk), or the program instructions could be stored remotely but transmittable to the processor via a modem or other interface device (e.g., a communications adapter) connected to a network over a transmission medium.
  • the transmission medium may be either a tangible medium (e.g., optical or analog communications lines) or a medium implemented using wireless techniques (e.g., microwave, infrared or other transmission schemes).
  • program instructions stored in the code memory can be compiled from a high level program written in a number of programming languages for use with many computer architectures or operating systems.
  • the high level program may be written in assembly language, while other versions may be written in a procedural programming language (e.g., "C") or an object oriented programming language (e.g., "C++" or "JAVA").
  • the functionality of the processor may be implemented as pre-programmed hardware or firmware elements (e.g., application specific integrated circuits (ASICs), electrically erasable programmable read-only memories (EEPROMs), etc.), or other related components.

Abstract

A system for regulating packet flow through a switch fabric and a system for verifying the transmission integrity of the fabric. In the first system, an ingress sends packets to the switching entity in a designated order and an egress receives packets from the switching entity, re-orders the packets in the designated order and sends an acknowledgement of receipt of the packets to the ingress upon re-ordering. The ingress receives acknowledgements of receipt of packets from the egress, maintains an indication of a number of packets for which an acknowledgement of receipt has not yet been received, performs a comparison between the number of packets for which an acknowledgement of receipt has not yet been received and a threshold, and regulates sending packets to the switching entity on the basis of the comparison. The principle behind the second system is based on exchanging marker packets at identifiable reference instants.

Description

SYSTEMS AND METHODS FOR PACKET FLOW REGULATION AND TRANSMISSION INTEGRITY VERIFICATION OF A SWITCHING ENTITY
CROSS-REFERENCES TO RELATED APPLICATION(S)
The present invention claims the benefit under 35 USC §119(e) of prior U.S. provisional patent application Serial no. 60/407,356 to Sammour et al., filed on September 3, 2002 and U.S. provisional application Serial no. 60/407,357 to De Maria et al., filed on September 3, 2002, both incorporated by reference herein.
FIELD OF THE INVENTION
The present invention relates generally to switching of packets by a switching entity and, in particular, to methods and apparatus for regulating the flow of packets through the switching entity and verifying the integrity of transmission through the switching entity.
BACKGROUND OF THE INVENTION
Switch fabrics are often used to route traffic between end points in a network. The need for regulating the flow of traffic through a switching entity arises whenever there is a risk of congestion in the switch fabric. Congestion results in the ingress or switch fabric buffers holding a large number of packets which, for some reason, cannot leave the switch fabric at the rate they are entering. This leads to two major problems. Firstly, the degree of out-of-order packets that may be received at the egress will be higher than what the egress is dimensioned for, which can cause significant reordering problems. Disadvantageously, receipt of out-of-order packets can lead to corruption in the reassembled frames. Secondly, when packets are transmitted in groups (i.e., frames), the degree of congestion in the switch fabric may cause the frame reassembly process of certain frames to take a very long time, resulting in a reduction in the outgoing bit rate, effectively introducing blocking.
In parallel with the need to manage congestion, there is also the need to verify transmission integrity through the switch fabric, especially when there is a reliability issue with the switch fabric, e.g., when there is a risk that the switch fabric will lose packets or when lost packets are detected. The loss of packets is of course detrimental to the integrity of the traffic being transmitted between the end points, as it may cause an interruption in information flow between the end points of a traffic connection.
Current solutions to these problems are not satisfactory and thus there remains a need in the industry to regulate packet flow through a switching entity so as to reduce congestion and also there remains a need in the industry to verify transmission integrity of the switch fabric.
SUMMARY OF THE INVENTION
In accordance with a broad aspect, the present invention provides a scheme used for regulating packet flow through the switch fabric. The principle behind this scheme is based on exchanging tokens between the egress and ingress. Accordingly, the present invention may be broadly summarized as a system for regulating packet flow through a switching entity. The system comprises an ingress capable of sending packets to the switching entity in a designated order and an egress capable of receiving packets from the switching entity, re-ordering the packets in the designated order and sending an acknowledgement of receipt of the packets to the ingress upon re-ordering. The ingress is adapted to receive acknowledgements of receipt of packets from the egress, maintain an indication of a number of packets for which an acknowledgement of receipt has not yet been received from the egress, perform a comparison between the number of packets for which an acknowledgement of receipt has not yet been received from the egress and a threshold, and regulate sending packets to the switching entity on the basis of the comparison.
In accordance with another broad aspect, the present invention provides a method of regulating packet flow through a switching entity. The method comprises sending packets from an ingress to the switching entity in a designated order, receiving packets from the switching entity at an egress, re-ordering the packets in the designated order and sending an acknowledgement of receipt of the re-ordered packets to the ingress, receiving acknowledgements of receipt of packets from the egress, maintaining an indication of a number of packets for which an acknowledgement of receipt has not yet been received from the egress, performing a comparison between the number of packets for which an acknowledgement of receipt has not yet been received from the egress and a threshold, and regulating sending packets to the switching entity on the basis of the comparison.
In accordance with yet another broad aspect, the present invention provides computer-readable storage media containing a program element for execution by a computing device to implement an ingress for regulating packet flow through a switching entity. The ingress comprises a control entity operative to send packets to the switching entity in a designated order, receive acknowledgements of receipt of re-ordered packets from an egress, maintain an indication of a number of packets for which an acknowledgement of receipt has not yet been received from the egress, perform a comparison between the number of packets for which an acknowledgement of receipt has not yet been received from the egress and a threshold, and regulate sending packets to the switching entity on the basis of the comparison.
In accordance with another broad aspect, the present invention provides a scheme used for verifying the transmission integrity of the switch fabric. The principle behind this scheme is based on exchanging marker packets at identifiable reference instants. Accordingly, the present invention may be broadly summarized as a system for assessing integrity of a flow of packets through a switching entity. The system comprises an ingress capable of sending packets to the switching entity, the packets including a reference packet sent to the switching entity at a reference instant and an egress capable of receiving packets from the switching entity and sending to the ingress an acknowledgement of receipt of packets from the switching entity. The ingress is adapted to receive acknowledgements of receipt of packets from the egress, maintain a current indication of packets for which an acknowledgement of receipt has not yet been received from the egress, store a first data element indicative of packets for which an acknowledgement of receipt had not yet been received from the egress at the reference instant, maintain a second data element indicative of packets for which an acknowledgement of receipt is received from the egress between the reference instant and the instant at which an acknowledgement of receipt of the reference packet is received from the egress, perform a comparison of the first and second data elements, and assess integrity of the flow of packets on the basis of the comparison.
In accordance with still another broad aspect of the present invention, there is provided a method of assessing integrity of a flow of packets through a switching entity. The method comprises sending packets from an ingress to the switching entity, the packets including a reference packet sent to the switching entity at a reference instant, receiving packets from the switching entity at an egress and sending to the ingress an acknowledgement of receipt of packets from the switching entity, receiving acknowledgements of receipt of packets from the egress, maintaining a current indication of packets for which an acknowledgement of receipt has not yet been received from the egress, storing a first data element indicative of packets for which an acknowledgement of receipt had not yet been received from the egress at the reference instant, maintaining a second data element indicative of packets for which an acknowledgement of receipt from the egress is received between the reference instant and the instant at which an acknowledgement of receipt of the reference packet is received from the egress, performing a comparison of the first and second data elements, and assessing integrity of the flow of packets on the basis of the comparison.
In accordance with still another broad aspect, the present invention provides computer-readable storage media containing a program element for execution by a computing device to implement an ingress for regulating packet flow through a switching entity. The ingress comprises a control entity operative to send packets to the switching entity, the packets including a reference packet sent to the switching entity at a reference instant, receive acknowledgements of receipt of packets from an egress, maintain a current indication of packets for which an acknowledgement of receipt has not yet been received from the egress, store a first data element indicative of packets for which an acknowledgement of receipt had not yet been received from the egress at the reference instant, maintain a second data element indicative of packets for which an acknowledgement of receipt from the egress is received between the reference instant and the instant at which an acknowledgement of receipt of the reference packet is received from the egress, perform a comparison of the first and second data elements, and assess integrity of the flow of packets on the basis of the comparison.
Various other aspects of the invention address complexity issues arising from the addition of further functionality to each scheme.
These and other aspects and features of the present invention will now become apparent to those of ordinary skill in the art upon review of the following description of specific embodiments of the invention in conjunction with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
In the accompanying drawings:
Fig. 1 is a block diagram of a switching entity disposed between a plurality of ingresses and a plurality of egresses;
Fig. 2 illustrates steps executed by the ingress (on the left-hand side) and egress (on the right-hand side) in order to implement a packet flow regulation scheme in accordance with an embodiment of the present invention;
Figs. 3 and 4 show additional steps executed by the ingress in order to implement a transmission integrity verification scheme in accordance with an embodiment of the present invention.
DETAILED DESCRIPTION
With reference to Fig. 1, there is shown a system 10 that comprises plural ingresses 12 and plural egresses 14 connected on either side of a switching entity, such as a switch fabric 16. The ingresses 12 and the egresses 14 are functional entities that are logically interconnected to one another although they may or may not be physically distinct. Typically, the ingresses 12 and the egresses 14 are implemented using a combination of hardware, software and control logic.
The ingresses 12 receive data from a source external to the system 10. The data is received via a plurality of physical or logical input queues (IQ) 24. The ingresses 12 process the data and provide packets 13 to the switch fabric 16. The ingresses 12 can be uniquely associated with individual input ports 18 of the switch fabric 16. The ingresses 12 provide the packets 13 to the switch fabric 16 via its input ports 18.
The switch fabric 16 can be a centralized entity or a distributed entity made up of smaller interconnected entities, and those smaller entities may physically reside in different chassis. Thus, it will be appreciated that the present invention applies to both centralized and distributed architectures of the switch fabric 16. The switch fabric 16 generally has the capacity to switch packets from its input ports 18 to selected ones of a plurality of output ports 20, in accordance with routing instructions. The routing instructions provided to the switch fabric 16 may arrive from an external source and may take the form of instructions to "switch input port A to output port B". However, it is more common in the field of Internet routing that the routing instructions will be implicit in the information carried by the received packets themselves. For example, a given packet provided at input port A may specify an end destination, which is examined by the switch fabric 16 and translated by the switch fabric 16 into an output port B based on routing tables and the like. Such routing tables could be stored in a memory 21 that is internal or external to the switch fabric 16.
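By way of a non-authoritative illustration (the destination names and port labels below are invented for the example, and the real table format is not specified by the patent), the translation of a packet's end destination into an output port using a routing table stored in the memory 21 could be as simple as a lookup:

    # Hypothetical routing table: end destination -> output port of the switch fabric.
    ROUTING_TABLE = {"destination_X": "output_port_B", "destination_Y": "output_port_B_prime"}

    def select_output_port(packet_destination):
        return ROUTING_TABLE[packet_destination]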
The egresses 14 can be uniquely associated with individual ones of the output ports 20 of the switch fabric 16. Each of the egresses 14 receives packets 13' from the output ports 20 of the switch fabric 16, performs processing and provides processed data to an optoelectronic converter (O/E) 28 via an output queue (OQ) 26. The opto-electronic converter 28 transforms the processed data received from the output queue 26 into an optical signal that is sent to an entity external to the system 10.
It is quite common for different ones of the packets 13 being released by a particular one of the ingresses 12 to be destined for different end destinations and, thus, different ones of the output ports 20 of the switch fabric 16. For example, it is possible that a first packet being received at input port A of the switch fabric 16 will need to be routed to output port B, but that the next packet received at that same input port A will need to be routed to output port B'. The relationship of A to B for the first packet and A to B' for the next packet can be referred to as a "source-destination pair".
For simplicity, but without limiting the scope of the invention, it may be useful to provide features for regulating the flow of packets on a per-source-destination pair basis. In other words, multiple packet flow regulation schemes are implemented, each for regulating the flow of packets sharing a common source-destination pair. These various packet flow regulation schemes may share some common resources, such as total memory or capacity, but the individual processes that implement flow regulation operate independently.
In fact, it is within the scope of the present invention to implement multiple packet flow regulation schemes for regulating the flow of packets sharing common characteristics other than, or in addition to, the source-destination pair. For example, packets sharing a common quality of service requirement may have their own packet flow regulation schemes, irrespective of source-destination pair. Similarly, the characteristic may be priority, bandwidth class, and so on, or any combination of two or more characteristics.
In the following, the specific case is considered where the characteristic is the source- destination pair (also known as a "flow"), so that the ingress-side functionality of the packet flow regulation scheme is localized to a single one of the ingresses, denoted 12, and the egress-side functionality of the packet flow regulation scheme is localized to a single one of the egresses, denoted 14. However, it will be appreciated that in other embodiments, the ingress-side functionality and the egress-side functionality of each one of a plurality of packet flow regulation schemes may be distributed across multiple ones of the ingresses 12 and egresses 14, respectively.
With reference now to Fig. 2, the ingress 12 is adapted to send the packets 13 to the switch fabric 16 in a designated order. This can be referred to as performing a "sequencing operation" on a series of packets arriving at the ingress from an external entity. The sequencing operation is shown as box 202. The designated order is maintained / differentiated through the use of numbers (called sequence numbers). The size of the sequence number space is N. The value of N is chosen to be sufficiently large so as to account for the maximum delay through the switch fabric 16.
As the incoming packets typically consist of a header containing control information (e.g., implicit or explicit routing instructions) and a body containing data, the designated order can be established, for example, by modifying the header of each packet 13 so as to implement a linked list of numbers from zero to (N-1) and back to zero again. This is but one simple technique that permits the packets 13, which may be received at the egress 14 in an out-of-order fashion, to be eventually re-sequenced.
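By way of illustration only, the following sketch (in Python, with the packet layout and the value of N chosen purely for this example and not taken from any particular implementation) shows one possible way the sequencing operation of box 202 might stamp outgoing packets with sequence numbers drawn from a space of size N:

# Illustrative sketch only: stamp each outgoing packet with a sequence number
# that runs from 0 to N-1 and wraps back to 0 (box 202). The packet layout
# (a dict with a "header" field) is an assumption made for this example.
N = 256  # assumed size of the sequence number space

class Sequencer:
    def __init__(self, space_size=N):
        self.space_size = space_size
        self.next_seq = 0

    def stamp(self, packet):
        packet["header"]["seq"] = self.next_seq
        self.next_seq = (self.next_seq + 1) % self.space_size
        return packet

# Example: three packets leave the ingress in the designated order 0, 1, 2.
seq = Sequencer()
outgoing = [seq.stamp({"header": {}, "body": b"data"}) for _ in range(3)]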
Other boxes pertaining to the ingress 12 in Fig. 2 represent operations that form part of a packet flow regulation scheme that controls the rate at which the packets 13 are released into the switch fabric 16. The packet flow regulation scheme will be described in further detail herein below. It should also be appreciated that the ingress 12 may include various other features, such as an arbiter or scheduler connected to the ingress queues 24, which controls the release of packets into the switch fabric 16, subject to the packet flow regulation scheme.
At the egress 14, the order of the packets 13' received from the switch fabric 16 may be different from the designated order. This can be the result of queuing, arbitration or other processing features of the switch fabric 16. Accordingly, the egress 14 receives packets 13' from the switch fabric 16 and is adapted to re-order the packets 13' so that they re-acquire the designated order. This is known as a re-sequencing operation, shown as box 252. The re-sequencing operation can be effected by, for example, examining the contents of the header in each packet 13' received from the switch fabric 16. The re-sequenced packets undergo further processing and the processed data is then sent to the opto-electronic converter 28 via the output queue 26 (see box 254). It will be understood that independent re-sequencing operations may be performed for each of the packet flow regulation schemes being implemented, e.g., on the basis of a characteristic such as source-destination pair, quality of service, priority, bandwidth class, etc.
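Purely as an illustration, and assuming the same hypothetical packet layout as in the previous sketch, the re-sequencing operation of box 252 could be realized along the following lines:

# Illustrative re-sequencing buffer (box 252): out-of-order packets are held
# until the next expected sequence number has arrived, then released in the
# designated order.
class Resequencer:
    def __init__(self, space_size=256):
        self.space_size = space_size
        self.expected = 0          # next sequence number owed to the output queue
        self.pending = {}          # seq -> packet, packets held out of order

    def receive(self, packet):
        self.pending[packet["header"]["seq"]] = packet
        released = []
        while self.expected in self.pending:
            released.append(self.pending.pop(self.expected))
            self.expected = (self.expected + 1) % self.space_size
        return released            # packets handed on in the designated order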
The egress 14 keeps track of successfully re-sequenced packets by updating a "token credit counter" (see box 256), that may be implemented in software, for example. When the value of the token credit counter has reached a certain level, i.e., upon re-sequencing a certain number of the received packets 13', an acknowledgement of receipt of the re-sequenced packets is sent back to the ingress 12. The acknowledgement can be sent to the ingress 12 via the switch fabric 16 or through a separate external link (not shown). In a specific embodiment, the acknowledgement is sent back to the ingress 12 after the processed data corresponding to the packets 13' has exited the output queue 26 (i.e., just prior to opto-electronic conversion by the opto-electronic converter 28).
Specifically, the acknowledgement may take the form of a special packet, referred to as a "token credit packet" (TCP) 22, which distinguishes itself from other packets by, e.g., a code in its header. A token credit packet 22 may be sent each time a packet is re-sequenced at the egress 14; however, this approach may create congestion in the direction from the egress 14 to the ingress 12. Thus, in a specific embodiment, a token credit packet 22 is sent once the value of the token credit counter exceeds a value M, where M is a desired integer, possibly although not necessarily less than N. This is shown in box 258. M can be fixed or time varying. One rationale for varying M would be to ensure that the token credit packet 22 will not be unduly congested on its way back to the ingress 12. Thus, M could be made to vary on the basis of a measure of the amount of available return bandwidth through the switch fabric 16. This measure of the amount of available return bandwidth may be effected by the switch fabric 16 and communicated to the egress 14. Alternatively, M can remain fixed, but an additional condition for sending a token credit packet could be that the available return bandwidth be above a particular threshold. Other embodiments contemplate allocating the token credit packet 22 with the highest possible priority in order to ensure that it is timely received at the ingress 12 so as not to unduly slow down the flow through the switch fabric 16.
A function of the token credit packet 22 is to acknowledge the receipt of M successfully re-sequenced packets with which a particular token credit packet 22 is associated. In one embodiment, where M would be fixed, it is not necessary to transmit the value M, as this information can be implicitly known by the ingress 12 from the mere fact that a token credit packet 22 is received (e.g., which will impliedly convey that the token credit counter has reached the value M). In other cases, where M is time varying, this information can be specifically embedded in the body of the token credit packet 22. Once the token credit packet 22 is sent to the ingress 12, the token credit counter is updated again. This is shown in box 260. In particular, if the egress 14 is designed so as to send a token credit packet 22 after the token credit counter reaches a value M, then updating the token credit counter consists of decrementing it by M.
Another option is to send an acknowledgement on a periodic, irregular or spontaneous basis, regardless of the value of the token credit counter. In such instances, the acknowledgement can take the form of a "token refresh packet" (TRP) 40. The appropriate moment for triggering the transmittal of a token refresh packet may be dependent on received information (e.g., receipt of a special type of packet, called a "marker packet", as will be described later on, or an amount of elapsed time since the previous transmittal of an acknowledgement). Once the appropriate moment is reached, the token refresh packet 40 is sent. This is shown in box 262. A function of the token refresh packet 40 is to acknowledge that a certain number of packets have been successfully received and re-sequenced at the egress 14. This number is equal to the value of the token credit counter at the time the acknowledgement is generated. Since this number is not known ahead of time, the value of the token credit counter can be specifically embedded in the body of the token refresh packet 40. Once the token refresh packet 40 is sent to the ingress 12, the token credit counter is updated again. This is shown in box 264 and consists of resetting the token credit counter to zero.
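The egress-side acknowledgement behaviour described above (boxes 256 through 264) can be summarized in the following hedged sketch; the function send_to_ingress and the dictionary fields are illustrative assumptions, not part of the described embodiment:

# Sketch of the egress-side token credit logic. send_to_ingress is assumed to
# deliver a small control packet back to the ingress (via the switch fabric
# or a separate link); the field names are illustrative only.
class EgressAckLogic:
    def __init__(self, m, send_to_ingress):
        self.m = m                         # acknowledgement granularity M
        self.token_credit_counter = 0
        self.send_to_ingress = send_to_ingress

    def on_resequenced(self, count=1):
        self.token_credit_counter += count                 # box 256
        if self.token_credit_counter >= self.m:            # box 258
            self.send_to_ingress({"type": "TCP", "acked": self.m})
            self.token_credit_counter -= self.m            # box 260

    def on_refresh_trigger(self):
        # box 262: e.g. receipt of a marker packet or an elapsed-time condition
        self.send_to_ingress({"type": "TRP",
                              "acked": self.token_credit_counter})
        self.token_credit_counter = 0                      # box 264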
It will be appreciated that because the token credit counter is updated, read and updated again only once the egress 14 has ensured that the received packets 13' have re-acquired the designated order, new traffic can be fed to the switch fabric 16 by the ingress 12 without endangering the integrity of the re-sequencing operation.
In accordance with an embodiment of the present invention, the ingress 12 applies a packet flow regulation scheme in order to alleviate congestion at the switch fabric 16. In order to apply the packet flow regulation scheme, the ingress 12 keeps track of the number of packets that have been fed to the switch fabric 16 and for which an acknowledgement of receipt has not yet been received from the egress 14. Let the number of such as-yet-unacknowledged packets equal L(t), where t is a time variable denoting that the quantity L(t) will change over time. The ingress 12 keeps track of L(t) by, for example, implementing what can be referred to as a "reverse token bucket", where a "token" refers to a packet for which an acknowledgement has not yet been received. As will be shown in greater detail below (see also boxes 204, 206 and 208 in Fig. 2), the fill level L(t) of the reverse token bucket is generally increased upon sending packets to the switch fabric 16 and is reduced upon receipt of a token credit packet 22 or token refresh packet 40 from the egress 14. The reverse token bucket may be one of many similar buckets stored in a table in memory, such table being referred to as a "token bucket table". The token bucket table is indexed on a per-FLOW basis and thus, for each different FLOW being handled by the ingress, the appropriate entry in the token bucket table is consulted.
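A minimal sketch of the per-flow reverse token bucket, with names invented solely for the purpose of this example, might look as follows:

# Illustrative token bucket table, indexed per flow; L(t) is the number of
# packets sent into the switch fabric for which no acknowledgement has yet
# been received (boxes 204, 206 and 208).
from collections import defaultdict

token_bucket_table = defaultdict(int)       # flow -> fill level L(t)

def on_packets_sent(flow, count):
    token_bucket_table[flow] += count        # box 208: increase on transmission

def on_acknowledgement(flow, acked_count):
    # box 206: acked_count is M for a token credit packet, or the count
    # embedded in a token refresh packet
    token_bucket_table[flow] -= acked_count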
By way of a simplistic example, let M be equal to 5 and let the expected delay through the switch fabric 16 be equivalent to the duration of ten (10) packets. Also, it will be assumed that the packets 13' received by the egress 14 are already in the designated order so that re-sequencing is not required. In such a scenario, the first fifteen (15) packets will be sent out of the ingress 12 without yet receiving an acknowledgement. The reverse token bucket would thus hold a fill level L(t) of 15. Meanwhile, due to the delay through the switch fabric 16, five (5) packets 13' will have emerged from the switch fabric 16 at the egress 14. Assuming that the egress 14 instantaneously recognizes that it has received M = 5 packets 13' and that the egress 14 instantaneously transmits a token credit packet 22 to the ingress 12, and that the ingress 12 is instantaneously capable of updating the reverse token bucket, the reverse token bucket will, after the fifteenth (15th) packet 13 is sent into the switch fabric 16, drop to the value 10. Subsequently, the fill level L(t) of the reverse token bucket will climb gradually again to 15 and then drop to 10; this pattern will continue indefinitely, assuming the ideal circumstances whereby there is no congestion through the switch fabric 16. Thus, under ideal circumstances, the fill level L(t) of the reverse token bucket is roughly indicative of the expected latency of the switch fabric 16.
However, under non-ideal circumstances, the fill level L(t) of the reverse token bucket will rise above the expected delay through the switch fabric 16. Thus, the fill level L(t) of the reverse token bucket effectively becomes a measure of the congestion through the switch fabric 16. Accordingly, in an embodiment of the present invention, the reverse token bucket is compared to a threshold, denoted a, which represents a demarcation between an acceptable and an unacceptable delay through the switch fabric 16. This is shown in box 210 in Fig. 2. The comparison can be effected in hardware, software or a combination thereof. The comparison can be effected for each packet or for each group of packets. On the basis of this comparison, the ingress 12 can regulate the transmission of packets being fed to the switch fabric 16. For example, if the fill level L(t) of the reverse token bucket is less than the threshold a, then a next group of packets could continue to be sent to the switch fabric 16 (see box 212), while if the fill level L(t) of the reverse token bucket is greater than or equal to the threshold a, then the ingress 12 could be made to refrain from sending the next group of packets to the switch fabric 16 and, instead, place the packets in temporary storage, such as a buffer (see box 214). The act of refraining from sending packets to the switch fabric 16 need not be applied to the next packet (or group of packets) but rather to the following one. Thus, for example, transmission of the next packet (or group of packets) may be permitted regardless of the fill level L(t) of the reverse token bucket, although transmission of the subsequent packet (or group of packets) will be placed on hold. A similar effect could be achieved if a memory element in the ingress is made to contain a value S, which is the time-varying result of the operation sgn(L(t) - a). In this case, a binary decision to continue to send, or refrain from sending, packets to the switch fabric 16 is made on the basis of whether S is or is not equal to -1. The value of S may be conveniently stored in a table, which may be termed a "backpressure table" 30. The backpressure table 30 is indexed on a per-FLOW basis and thus, for each flow being handled by the ingress 12, the entry in the backpressure table corresponding to the appropriate FLOW is consulted. An advantage of using the backpressure table 30 instead of performing the comparison of the fill level L(t) of the reverse token bucket to the threshold a stems from the fact that the entries in the backpressure table 30 can be modified from "behind the scenes", namely as a result of the occurrence of other events, not only on the basis of the difference between the fill level L(t) of the reverse token bucket and the threshold a. For example, it may be useful to allow a software module that monitors error conditions or that performs debugging to have access to the backpressure table 30 so as to allow it to exert control over the transmission of packets to the switch fabric 16.
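The comparison of box 210 and the optional backpressure table 30 can be pictured, under the same illustrative assumptions as the earlier sketches, roughly as follows:

# Sketch of the backpressure decision: S = sgn(L(t) - a) is cached per flow in
# a backpressure table and consulted before releasing the next group of
# packets (boxes 210, 212 and 214). All names are illustrative.
def update_backpressure(backpressure_table, flow, fill_level, watermark_a):
    diff = fill_level - watermark_a
    backpressure_table[flow] = (diff > 0) - (diff < 0)   # sign of L(t) - a

def may_send_next_group(backpressure_table, flow):
    # keep sending only while the cached sign is -1, i.e. L(t) < a
    return backpressure_table.get(flow, -1) == -1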
The decision to continue to send, or refrain from sending, packets to the switch fabric 16 can be made continually, periodically or sporadically, depending on various design parameters, such as the desired responsiveness of the ingress 12 to changes in the latency through the switch fabric 16, the available processing power of the ingress 12, and so on. For instance, this may lead to a design in which a comparison between the fill level L(t) of the reverse token bucket and the threshold a, or alternatively the reading of the value S, is performed by the ingress 12 for each group of successive packets, so that the result of the comparison will lead to a decision to transmit or not transmit the next (or subsequent group of) packets. It is noted that upon transmission of a group of packets to the switch fabric 16, the reverse token bucket is incremented by the number of packets in the group (see box 208). This update to the reverse token bucket may be done shortly before or shortly after packets have in actuality been sent to the switch fabric 16. It is recalled that the fill level of the reverse token bucket is decremented upon receipt of a token credit packet or token refresh packet 40 from the egress 14 (see box 206). Thus, it is envisaged that the fill level L(t) of the reverse token bucket may be increased in increments greater than one and decremented in increments of M. It is recalled that the threshold a represents the maximum number of unacknowledged packets that are allowed to leave the ingress 12. The threshold a can thus be referred to as a "backpressure watermark". While in some embodiments a may be fixed, in other embodiments it may vary over time from amongst a set of possible backpressure watermark values, for example. This means that traffic can be throttled even though the switch fabric 16 is not operating at maximum capacity. This enhancement may be used by the egress 14 to slow down (regulate) the rate at which each ingress 12 is sending traffic to it, based on a set of regulation criteria.
Examples of such regulation criteria include an indication of resource (e.g., memory) utilization at the egress 14. In one embodiment, the egress 14 may be equipped with suitable circuitry, software and/or control logic for monitoring usage of a resource (such as memory) to determine a resource utilization level. If multiple egresses share the same resources (e.g., memory space), then the same resource utilization level could be used by all of the egresses concerned. The egress 14 is adapted to send the resource utilization level to the ingress 12. The ingress 12 then determines the backpressure watermark in accordance therewith, e.g., by assessing whether the resource utilization is considered low, medium or high.
In another embodiment, the egress 14 is adapted to perform the further step of determining the backpressure watermark. This backpressure watermark is then sent to the ingress, e.g., via the switch fabric 16 or an external link. Alternatively, the backpressure watermark may be selected from a fixed set of watermarks that are associated with respective codes known to the ingress 12. In such a scenario, the egress 14 may send the code to the ingress 12, allowing the ingress 12 to set the backpressure watermark in accordance with the code. Again, transmission of such information from the egress 14 to the ingress 12 can be achieved through the switch fabric 16 or via a separate link (not shown).
Also, there may exist a plurality of thresholds, at least one of which is time-varying. For example, a first threshold may be set by the current level of resource utilization and a second threshold may be set by a known storage capacity of the switch fabric 16. In this case, it is reasonable for the actual threshold used by the ingress 12 as the backpressure watermark to be the minimum of the two thresholds. In other embodiments, it is envisaged that the above feature could also be used to implement a slow-start mechanism or a token-based bandwidth allocation and control mechanism.
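As a brief illustration of this point only, with hypothetical parameter names, the watermark actually applied by the ingress 12 could be derived as follows:

# Illustrative only: when several thresholds apply, the ingress uses the most
# restrictive one as its backpressure watermark.
def effective_watermark(resource_based_threshold, fabric_capacity_threshold):
    return min(resource_based_threshold, fabric_capacity_threshold)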
In yet another embodiment, instead of implementing a reverse token bucket, which keeps track of as-yet-unacknowledged packets, it is within the scope of the present invention to implement a "positive" token bucket. For example, a positive token bucket keeps track of a number of unacknowledged packets that are still allowed to be emitted by the ingress 12. Each time that a packet is sent to the switch fabric 16 by the ingress 12, the number of packets so transmitted is subtracted from the positive token bucket. Meanwhile, acknowledgements received from the egress 14 will tend to add to the fill level of the positive token bucket. If the fill level of the positive token bucket falls to zero, the ingress 12 is adapted to refrain from sending further packets to the switch fabric 16. Thus, the positive token bucket is used to regulate the flow of packets through the switch fabric 16.
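A comparably hedged sketch of the "positive" token bucket variant, again with invented names, could be:

# Illustrative positive token bucket: the allowance counts packets the ingress
# may still emit without further acknowledgements from the egress.
class PositiveTokenBucket:
    def __init__(self, initial_allowance):
        self.allowance = initial_allowance

    def try_send(self, count):
        # refuse transmission once the bucket would be exhausted
        if self.allowance >= count:
            self.allowance -= count
            return True
        return False

    def on_acknowledgement(self, acked_count):
        # acknowledgements from the egress replenish the allowance
        self.allowance += acked_count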
The above packet flow regulation scheme implemented by the cooperation of the ingress 12 and egress 14 in the system 10 controls the extent to which out-of-order packets occur, and hence can control congestion through the switch fabric 16 by ensuring that the egress 14 will not receive more packets than what it is prepared to accept.
The packet flow regulation scheme of the present invention can also be made to operate in a frame-based mode. Specifically, each of the "packets" described above can be considered to be a "frame" consisting of multiple "segments". Multiple frames are assumed to be received from an external entity by the ingress 12 and to occupy the designated order. In addition, the segments within a given frame also define a specific order within their
"parent" frame. Thus, the segments received at the egress 14 (in an unknown order) need to be reassembled into the appropriate parent frame and the frames themselves need to be reordered into the designated specific order prior to being and released by the egress 14.
In a first variant, the ingress 12 and egress 14 implement a pure frame-based mode of the packet flow regulation scheme, which can be specified by a software flag at the ingress 12. The re-ordering of segments within a frame is not considered in the pure frame-based mode of operation, although in another embodiment, both frame-based and non-frame-based modes of the packet flow regulation scheme may coexist simultaneously and independently. To implement the pure frame-based mode of operation of the packet flow regulation scheme, a sequencing and re-sequencing operation is performed amongst the frames themselves. It is recalled that the "packets" referred to above now represent "frames". Each frame has a sequence number which ranges from zero to (N-1). The designated order referred to above refers to the order in which the frames are sent into the switching entity 16 by the ingress 12. Thus, the reverse token bucket now counts unacknowledged frames sent by the ingress 12 into the switch fabric 16.
The comparison between the fill level L(t) of the reverse token bucket and the threshold a needs to take the frame-based mode of operation into account. Thus, L(t) represents the number of as yet unacknowledged frames from the point of view of the ingress 12. For its part, the token credit counter tracks the number of properly re-sequenced frames at the egress 14. When the token credit counter reaches a value of M, the egress 14 sends a token credit packet 22 to the ingress 12. Because the ingress 12 operates a frame-based packet flow regulation scheme, the ingress will know that the token credit packet 22 represents a total of M properly re-sequenced frames and updates the fill level L(t) of the reverse token bucket accordingly.
Although the above-described frame-based mode of operation tends to be characterized by a slower responsiveness to congestion, it tends to utilize less bandwidth in the egress-to-ingress direction than a non-frame-based mode of operation of the packet flow regulation scheme, since a token credit packet is sent by the egress 14 only once M frames have been successfully re-sequenced.
In another variant of the present invention, which can be used with either mode of operation, the ingress 12 is adapted to perform a monitoring operation on the token credit packets 22 received from the egress 14. The token credit packets 22 can be enhanced so that they identify not only the number of packets for which they are acknowledging receipt, but also the identity of those packets themselves (e.g., by specifying a range of packet numbers). Thus, if the ingress 12 detects that there is a gap in the packets being acknowledged by the token credit packets 22, the ingress 12 can re-transmit the missing packets 13. It should also be noted that missing token credit packets 22 may be symptomatic of a more significant transmission integrity problem, which could necessitate the invocation of a packet verification process. A verification process may also be triggered by the egress 14 which can be designed to monitor transmission integrity through the switch fabric 16 and to detect anomalies such as missing packets or extensive delays, as well as generalized failures of the ingress 12, egress 14 or switch fabric 16.
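If token credit packets are enhanced so as to name the packets they acknowledge, a gap check at the ingress might resemble the following sketch; the range format and all names below are assumptions made for this example only:

# Illustrative gap detection: each enhanced token credit packet is assumed to
# acknowledge a range (first sequence number, count). For simplicity the
# sketch ignores wrap-around of the sequence number space.
def find_unacknowledged(acked_ranges, highest_sent):
    acked = set()
    for first, count in acked_ranges:
        acked.update(range(first, first + count))
    # any sequence number up to the highest one sent that was never
    # acknowledged is a candidate for re-transmission
    return [seq for seq in range(highest_sent + 1) if seq not in acked]

# Example: packets 0..9 were sent, but the acknowledgements skip 4 and 5.
missing = find_unacknowledged([(0, 4), (6, 4)], highest_sent=9)   # -> [4, 5]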
With reference to Fig. 3, a verification process, or "transmission integrity verification scheme", in accordance with an embodiment of the present invention contemplates the use of reference, or "marker", packets. Specifically, the ingress 12 is adapted to send a marker packet 50 through the switch fabric 16 at a reference instant, hereinafter denoted TREF. Although the reference instant may be a reference instant in time, it may also represent, e.g., the sequence number of the most recently received packet from the outside world. Generally, the reference instant TREF refers to an event whose origins are detectable by the ingress 12 and need not be an absolute time reference.
It will be appreciated that different transmission integrity verification schemes may be operating in parallel, each such scheme being applied to packets that undergo a common packet flow regulation scheme. It is recalled that a variety of packet flow regulation schemes may be implemented to handle packets sharing common characteristics such as source-destination pair, bandwidth class, priority and quality of service, to name a few. However, it is also within the scope of the present invention to implement a transmission integrity verification scheme without recourse to an associated packet flow regulation scheme, or, alternatively, to use a packet flow regulation scheme that does not require sequencing and re-sequencing of packets.
The reference instant TREF may occur at predetermined intervals or it may be determined dynamically by a control entity inside or outside the ingress 12. For example, TREF may be an instant in the future that includes an offset indicative of a maximum elapsed time (or number of received packets from the external world) since receipt of the most recent acknowledgement of received packets from the egress 14. In another embodiment, TREF may be triggered by a condition in the ingress 12, the egress 14 or the switch fabric 16, such as attaining a maximum permitted resource utilization (e.g., memory or processing). In yet another embodiment, TREF is the time at which an integrity problem is detected, such as a missing acknowledgement or detection of a lost packet. On the other hand, TREF may be neither pre-determined nor determined by a control entity. Instead, it may be set by the arrival of a marker packet 50 from outside the ingress 12, in which case TREF is taken to be the current time. In order to determine whether a packet received by the ingress 12 is a marker packet, the contents (e.g., header or body) of each received packet can be examined. It should also be noted that the marker packet 50 could originate as an ordinary (non-marker) packet received from outside the ingress 12 which is then modified (e.g., by changing its header) to turn it into a marker packet; again TREF is taken to be the current time.
In any event, the ingress 12 checks to see whether the current time has reached TREF and, if so, sends the marker packet 50 into the switch fabric 16. This is shown in Fig. 3 at box 302. It is noted that the steps in Fig. 3 could be executed by the ingress 12 at a point marked by the circled number "3" in Fig. 2, i.e., upon receipt of a packet from an entity external to the ingress 12. Also at the reference instant TREF, the ingress 12 resets a second counter (see box 304), which starts counting the number of packets for which an acknowledgement of receipt is received from the egress 14, starting at time TREF. The value of the second counter at a given time t can be denoted D(t). The manner in which the second counter is incremented will be described in further detail below.
Also upon transmittal of the marker packet 50 at the reference instant TREF, the ingress 12 takes note of the current fill level of the reverse token bucket (see box 306). This value is denoted L(TREF) and represents the number of packets for which an acknowledgement had not been received at the reference instant TREF. It will be appreciated that L(TREF) is in fact the sum of the number of packets in four categories:
A) packets in transit between the ingress 12 and the egress 14, at time TREF;
B) packets accounted for by the value of the token credit counter at time TREF;
C) packets accounted for by token credit packets 22 on their way towards the ingress 12; and
D) lost packets.
Meanwhile, the egress 14 continues to operate in the manner described above with reference to Fig. 2. Specifically, the egress 14 sends token credit packets 22 at certain instants when the token credit counter in the egress 14 reaches the value M (see box 258). In addition, the egress 14 is adapted to send a token refresh packet 40, which, as recalled, is sent to the ingress 12 without waiting for the token credit counter to reach the value M (see box 262). In accordance with the present embodiment, one of the circumstances under which a token refresh packet 40 is sent to the ingress 12 (i.e., one of the conditions under which box 262 will yield a result of "YES") is upon receipt of the marker packet 50 from the switch fabric 16. For the purposes of more clearly describing the present embodiment, the token refresh packet sent to acknowledge successful receipt of the marker packet 50 is specially denoted 4050. Such a token refresh packet 4050 indicates to the ingress 12 the value of the token credit counter in the egress 14 at the time the token refresh packet 4050 is being sent to the ingress 12. Following transmission of the token refresh packet 4050, the token credit counter in the egress 14 is reset as for any other token refresh packet (see box 264).
Now, returning to the description of operation of the ingress 12, and with reference to Figs. 2 and 4, the value, D(t), of the second counter increases as acknowledgements are received by the ingress 12. The circle in Fig. 2 which contains the numeral "4" indicates a possible point in the processing effected by the ingress 12 to execute the steps indicated by the boxes in Fig. 4. Box 308 is indicative of the fact that the second counter is increased by the number of acknowledgements registered by the ingress 12. The acknowledgements registered in this manner include those received by the ingress 12 in the form of a token credit packet 22 that explicitly or implicitly acknowledges successful receipt of M packets by the egress 14, as well as those received by the ingress 12 in the form of a token refresh packet 40 (or 4050) that acknowledges successful receipt of a number of packets specified in the token refresh packet itself. It is noted that the second counter is being incremented by an amount equal to that by which the reverse token bucket counter is being decremented.
Thus, each time such a token credit packet 22 or token refresh packet 40 (or 4050) is received, the second counter is incremented (see box 308). Let TREF+dt denote the time at which the token refresh packet 4050 is received by the ingress 12 (see boxes 310 and 312). It is at this point that the ingress 12 can be assured that it has received acknowledgements for all packets that belong to categories A), B) and C) above. Moreover, the number of packets for which these acknowledgements have been received is stored in the value D(TREF+dt) of the second counter, since the second counter started counting when there were still no acknowledgements from any of the packets belonging to categories A), B) and C) above. Stated differently, the marker packet 50 transmitted at time TREF flushes out the token credit counter in the egress 14 and thus, none of the packets which may have been in transit prior to the transmission of the marker packet 50 should be unaccounted for at the time that the acknowledgement of the marker packet 50 is received at the ingress 12. Hence, if there is a difference (see box 314) between L(TREF) and D(TREF+dt), this difference is attributable to the number of lost packets, i.e., category D) above. In fact, this difference can be used to correct (i.e., calibrate) the current value L(t) of the reverse token bucket at time t, by subtracting from it the difference L(TREF) - D(TREF+dt) (see box 316). If the difference is negative, then the net change to L(t) will be positive and signifies the possibility that packets were not lost but that excessive / erroneous acknowledgements may have been produced.
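Taken together, the bookkeeping of Figs. 3 and 4 can be summarized in the following hedged sketch, with all names invented for the example rather than drawn from the described embodiment:

# Illustrative marker-based verification: L(TREF) is snapshotted when the
# marker packet is sent, D(t) counts acknowledgements received afterwards, and
# the difference upon acknowledgement of the marker estimates lost packets.
class IntegrityCheck:
    def __init__(self):
        self.l_at_ref = 0          # L(TREF), boxes 304 and 306
        self.second_counter = 0    # D(t)

    def on_marker_sent(self, current_fill_level):
        self.l_at_ref = current_fill_level
        self.second_counter = 0

    def on_acknowledgement(self, acked_count):
        self.second_counter += acked_count               # box 308

    def on_marker_acknowledged(self, current_fill_level):
        lost = self.l_at_ref - self.second_counter       # box 314
        calibrated = current_fill_level - lost           # box 316
        return lost, calibrated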
In a variant, all token refresh packets 40 acknowledge the receipt of marker packets 50 at the egress 14 (i.e., 40 is equivalent to 4050). In this variant, the token credit counter in the egress 14 is not reset following the transmission of a token refresh packet 40 (i.e., box 264 is eliminated). In addition, box 206 will only apply when a token credit packet 22 is received (and not when a token refresh packet 40 is received) at the ingress 12. Meanwhile, the algorithm/arithmetic used by the ingress 12 to assess the integrity of the flow of packets will continue to operate as shown in Figs. 3 and 4. This provides yet further independence between the flow regulation and transmission integrity verification schemes.
In a further variant, the system 10 can be made to account for loss of the token refresh packet 4050. Specifically, according to this variant, the ingress 12 starts a timer at time TREF, i.e., when the marker packet 50 is sent into the switch fabric 16. The timer has an expiry time that is arbitrary and may be designed to take into account an expected reasonable delay before receiving the token refresh packet 4050 from the egress 14. If no token refresh packet is received prior to the expiry time of the timer, then this may mean that the token refresh packet 4050 is lost. At this point, the ingress 12 may decide to send a second marker packet at an instant TREF2. The ingress 12 then considers TREF2 as being the reference instant.
Hence, upon receipt of a token refresh packet acknowledging the second marker packet at time TREF2+dt, the ingress 12 computes the number of lost packets (which refers to the number of packets that had been lost at time TREF2) as being L(TREF2) - D(TREF2+dt). Those of ordinary skill in the art will also appreciate that the ingress 12 and egress 14 may be implemented as a processor having access to a code memory which stores program instructions for the operation of the processor. The program instructions could be stored on a medium which is fixed, tangible and readable directly by the processor (e.g., removable diskette, CD-ROM, ROM, or fixed disk), or the program instructions could be stored remotely but transmittable to the processor via a modem or other interface device (e.g., a communications adapter) connected to a network over a transmission medium. The transmission medium may be either a tangible medium (e.g., optical or analog communications lines) or a medium implemented using wireless techniques (e.g., microwave, infrared or other transmission schemes).
Those skilled in the art should also appreciate that the program instructions stored in the code memory can be compiled from a high level program written in a number of programming languages for use with many computer architectures or operating systems. For example, the high level program may be written in assembly language, while other versions may be written in a procedural programming language (e.g., "C") or an object oriented programming language (e.g., "C++" or "JAVA").
Those skilled in the art should further appreciate that in some embodiments of the invention, the functionality of the processor may be implemented as pre-programmed hardware or firmware elements (e.g., application specific integrated circuits (ASICs), electrically erasable programmable read-only memories (EEPROMs), etc.), or other related components.
While specific embodiments of the present invention have been described and illustrated, it will be apparent to those skilled in the art that numerous modifications and variations can be made without departing from the scope of the invention as defined in the appended claims.

Claims

WE CLAIM:
1. A system for regulating packet flow through a switching entity, comprising:
- an ingress capable of sending packets to the switching entity in a designated order; - an egress capable of receiving packets from the switching entity, re-ordering the packets in the designated order and sending an acknowledgement of receipt of the packets to said ingress upon re-ordering; said ingress being adapted to:
- receive acknowledgements of receipt of packets from said egress; - maintain an indication of a number of packets for which an acknowledgement of receipt has not yet been received from said egress;
- perform a comparison between the number of packets for which an acknowledgement of receipt has not yet been received from said egress and a threshold; and - regulate sending packets to the switching entity on the basis of the comparison.
2. A system as defined in claim 1, wherein said ingress being adapted to regulate sending packets to the switching entity on the basis of the comparison includes said ingress being adapted to: - send packets to the switching entity if the comparison indicates that the number of packets for which an acknowledgement of receipt has not yet been received from said egress does not exceed the threshold; and
- refrain from sending packets to the switching entity if the comparison indicates that the number of packets for which an acknowledgement of receipt has not yet been received from said egress exceeds the threshold.
3. A system as defined in claim 1, wherein the designated order is defined by a sequence space of N sequence numbers, wherein each packet is associated with a corresponding one of the sequence numbers and wherein the threshold is selected so as to be no greater than N.
4. A system as defined in claim 1, wherein said ingress being adapted to regulate sending packets to the switching entity on the basis of the comparison includes said ingress being adapted to: - maintain a memory element indicative of whether the number of packets for which an acknowledgement of receipt has not yet been received from said egress exceeds the threshold;
- send packets to the switching entity if the contents of the memory element indicates that the number of packets for which an acknowledgement of receipt has not yet been received from said egress does not exceed the threshold; and
- refrain from sending packets to the switching entity if the contents of the memory element indicates that the number of packets for which an acknowledgement of receipt has not yet been received from said egress exceeds the threshold.
5. A system as defined in claim 1, wherein said ingress being adapted to perform a comparison between the number of packets for which an acknowledgement of receipt has not yet been received from said egress and a threshold includes said ingress being adapted to perform the comparison for each packet prior to sending that packet to the switching entity.
6. A system as defined in claim 2,
- wherein each packet is associated with a characteristic;
- wherein said ingress is further adapted to determine the characteristic associated with each packet;
- wherein said ingress being adapted to maintain an indication of a number of packets for which an acknowledgement of receipt has not yet been received from said egress includes said ingress being adapted to maintain an indication of a number of packets of each characteristic for which an acknowledgement of receipt has not yet been received from said egress; and
- wherein said ingress being adapted to perform a comparison between the number of packets for which an acknowledgement of receipt has not yet been received from said egress and a threshold includes said ingress being adapted to perform a comparison between the number of packets of the associated characteristic for which an acknowledgement of receipt has not yet been received from said egress and a threshold associated with the associated characteristic.
7. A system as defined in claim 6, wherein the threshold associated with at least one characteristic is time-invariant.
8. A system as defined in claim 6, wherein the threshold associated with at least one characteristic varies in time.
9. A system as defined in claim 2, wherein the threshold associated with all characteristics is a common threshold.
10. A system as defined in claim 6, wherein the characteristic is a source-destination pair.
11. A system as defined in claim 6, wherein the characteristic is a bandwidth class.
12. A system as defined in claim 6, wherein the characteristic is a quality of service.
13. A system as defined in claim 6, wherein the characteristic is a priority.
14. A system as defined in claim 2, wherein the threshold is the lowest of a plurality of thresholds, at least one of which is time varying.
15. A system as defined in claim 1, - wherein said egress is adapted to monitor resource utilization at said egress and send egress resource utilization information to said ingress;
- wherein said ingress is further adapted to receive the egress resource utilization information from said egress; and
- wherein the threshold is selected to be a function of the egress resource utilization information received from said egress.
16. A system as defined in claim 15, wherein the resource is memory.
17. A system as defined in claim 15, wherein the threshold is selected to be a function of the egress resource utilization information received from said egress and a storage capacity of at least one of the switching entity and said egress.
18. A system as defined in claim 1,
- wherein said egress is adapted to: - monitor resource utilization at said egress;
- compute the threshold as a function of the monitored resource utilization at said egress; and
- transmit the threshold to said ingress; - wherein said ingress is further adapted to: receive the threshold from said egress.
19. A system as defined in claim 1, wherein said egress is adapted to: - monitor resource utilization at said egress;
- compute the threshold as a function of the monitored resource utilization at said egress; and
- transmit an indication of the threshold to said ingress; wherein said ingress is further adapted to: - receive the indication of the threshold from said egress; and
- determine the threshold from the received indication of the threshold.
20. A system as defined in claim 19, wherein said ingress is further adapted to determine the threshold by consulting a look-up table.
21. A system as defined in claim 18, wherein said egress is further adapted to transmit the threshold to said ingress via the switching entity.
22. A system as defined in claim 19, wherein said egress is further adapted to transmit the indication of the threshold to said ingress via the switching entity.
23. A system as defined in claim 1, wherein said egress is further adapted to transmit the acknowledgements of received packets to said ingress via the switching entity.
24. A system as defined in claim 1,
- wherein each packet represents a frame consisting of an ordered set of segments;
- wherein said egress being capable of re-ordering the packets in the designated order includes said egress being capable of (I) re-ordering the segments within each frame and (II) re-ordering the frames in the designated order.
25. A system as defined in claim 1, wherein said egress being capable of sending an acknowledgement of receipt of the re-ordered packets to said ingress includes said egress being capable of sending an acknowledgement of receipt of plural re-ordered packets.
26. A system as defined in claim 1, wherein each packet is associated with a characteristic and wherein said egress being capable of sending an acknowledgement of receipt of the re-ordered packets to said ingress includes said egress being capable of sending an acknowledgement of receipt of a number M of re-ordered packets having the characteristic upon re-ordering of M packets received from the switching entity and having that characteristic, wherein M is an integer.
27. A system as defined in claim 26, wherein the characteristic is a source-destination pair.
28. A system as defined in claim 26, wherein the characteristic is a bandwidth class.
29. A system as defined in claim 26, wherein the characteristic is a quality of service.
30. A system as defined in claim 26, wherein the characteristic is a priority.
31. A system as defined in claim 26, wherein M is constant.
32. A system as defined in claim 26, wherein M is time varying.
33. A system as defined in claim 1, wherein said egress is further adapted to monitor a property and wherein said egress being capable of sending an acknowledgement of receipt of the re-ordered packets to said ingress includes said egress being capable of sending an acknowledgement of receipt of the number of re-ordered packets received since the previous time an acknowledgement was sent upon the property satisfying a condition.
34. A system as defined in claim 33, wherein the property is a level of available return bandwidth to said ingress and wherein the condition is that the available return bandwidth to said ingress be above a return bandwidth threshold.
35. A system as defined in claim 1, further comprising: an output queue connected to said egress;
- an electro-optic conversion module connected to said output queue; and
- wherein said egress being adapted to send an acknowledgement of receipt of the reordered packets includes said egress being adapted to send an acknowledgement of receipt of the re-ordered packets upon the re-ordered packets exiting said output queue and prior to the re-ordered packets entering said electro-optic conversion module.
36. A system as defined in claim 1, wherein said egress is further adapted to send acknowledgements of receipt of the re-ordered packets to said ingress in a second designated order.
37. A system as defined in claim 36, wherein said ingress is further adapted to monitor an order in which the acknowledgements of receipt of the packets are received from said egress.
38. A system as defined in claim 37, wherein said ingress is further adapted to determine a degree to which the order in which the acknowledgements of receipt of packets are received from said egress matches the second designated order and to perform an action that depends on the degree of match.
39. A system as defined in claim 38, wherein said ingress is further adapted to determine whether a particular acknowledgement of receipt of packets is lost and, if so, to identify a set of packets associated with the lost acknowledgement of receipt.
40. A system as defined in claim 39, wherein said ingress is further adapted to retransmit to said egress the set of packets associated with the lost acknowledgement.
41. A system as defined in claim 1, wherein each packet is associated with a corresponding one of a plurality of characteristics and wherein said ingress includes:
- a plurality of ingress queues, each ingress queue associated with a respective one of the characteristics; - an arbiter connected to said ingress queues, for controlling the release of packets from said ingress queues into the switching entity.
42. A system as defined in claim 1, said ingress being adapted to monitor transmission integrity through the switching entity and to invoke a verification process upon detecting a transmission integrity problem.
43. A system as defined in claim 42, wherein said ingress being adapted to detect a transmission integrity problem includes said ingress being adapted to detect a missing acknowledgement of receipt of packets.
44. A system as defined in claim 42,
- wherein said ingress being adapted to invoke a verification process includes said ingress triggering the release of a reference packet into the switching entity at a reference instant; - wherein said ingress is further adapted to:
- store a first data element indicative of packets for which an acknowledgement of receipt had not yet been received from said egress at the reference instant;
- maintain a second data element indicative of packets for which an acknowledgement of receipt is received from said egress between the reference instant and the instant at which an acknowledgement of receipt of the reference packet is received from said egress;
- perform a comparison of the first and second data elements; and
- assess integrity of the flow of packets on the basis of the comparison.
45. A system as defined in claim 1, said egress being adapted to monitor transmission integrity through the switching entity and to invoke a verification process upon detecting a transmission integrity problem.
46. A system as defined in claim 45, wherein said egress being adapted to detect a transmission integrity problem includes said egress being adapted to detect a missing packet or a missing packet portion.
47. A system as defined in claim 45,
- wherein said egress being adapted to invoke a verification process includes said egress transmitting a message to said ingress capable of triggering the release of a reference packet into the switching entity at a reference instant;
- wherein said ingress is further adapted to: - store a first data element indicative of packets for which an acknowledgement of receipt had not yet been received from said egress at the reference instant;
- maintain a second data element indicative of packets for which an acknowledgement of receipt is received from said egress between the reference instant and the instant at which an acknowledgement of receipt of the reference packet is received from said egress;
- perform a comparison of the first and second data elements; and
- assess integrity of the flow of packets on the basis of the comparison.
48. A system for regulating packet flow through a switching entity, comprising:
- an ingress capable of sending packets to the switching entity in a designated order;
- an egress capable of receiving packets from the switching entity, re-ordering the packets in the designated order and sending an acknowledgement of receipt of the re-ordered packets to said ingress; - said ingress being adapted to:
- receive acknowledgements of receipt of packets from said egress;
- maintain an indication of a number of packets that are allowed to be transmitted without receiving additional acknowledgements of receipt of packets from said egress; and - regulate sending packets to the switching entity on the basis of the number of packets that are allowed to be transmitted without receiving additional acknowledgements of receipt of packets from said egress.
49. A system as defined in claim 48, wherein said ingress being adapted to regulate sending packets to the switching entity includes said ingress being adapted to refrain from sending packets to the switching entity unless the number of packets that are allowed to be transmitted without receiving additional acknowledgements of receipt of packets from said egress is greater than zero.
50. A system as defined in claim 48,
- wherein each packet belongs to a corresponding frame consisting of an ordered set of N packets that follow the designated order; - wherein said ingress being adapted to regulate sending packets to the switching entity includes said ingress being adapted to refrain from sending packets to the switching entity unless the number of packets that are allowed to be transmitted without receiving additional acknowledgements of receipt of packets from said egress is greater than N.
51. A system as defined in claim 48, wherein said egress being capable of re-ordering the packets includes said egress being capable of re-ordering the N packets corresponding to a common frame prior to sending an acknowledgement of receipt of the N re-ordered packets to said ingress.
52. A method of regulating packet flow through a switching entity, comprising:
- sending packets from an ingress to the switching entity in a designated order;
- receiving packets from the switching entity at an egress, re-ordering the packets in the designated order and sending an acknowledgement of receipt of the re-ordered packets to the ingress;
- receiving acknowledgements of receipt of packets from the egress;
- maintaining an indication of a number of packets for which an acknowledgement of receipt has not yet been received from the egress;
- performing a comparison between the number of packets for which an acknowledgement of receipt has not yet been received from the egress and a threshold; and
- regulating sending packets to the switching entity on the basis of the comparison.
53. Computer-readable storage media containing a program element for execution by a computing device to implement an ingress for regulating packet flow through a switching entity, the ingress comprising:
- a control entity operative to: - send packets to the switching entity in a designated order;
- receive acknowledgements of receipt of re-ordered packets from an egress;
- maintain an indication of a number of packets for which an acknowledgement of receipt has not yet been received from the egress;
- perform a comparison between the number of packets for which an acknowledgement of receipt has not yet been received from the egress and a threshold; and
- regulate sending packets to the switching entity on the basis of the comparison.
54. A system for assessing integrity of a flow of packets through a switching entity, comprising:
- an ingress capable of sending packets to the switching entity, the packets including a reference packet sent to the switching entity at a reference instant; an egress capable of receiving packets from the switching entity and sending to said ingress an acknowledgement of receipt of packets from the switching entity; - said ingress being adapted to:
- receive acknowledgements of receipt of packets from said egress;
- maintain a current indication of packets for which an acknowledgement of receipt has not yet been received from said egress; store a first data element indicative of packets for which an acknowledgement of receipt had not yet been received from said egress at the reference instant;
- maintain a second data element indicative of packets for which an acknowledgement of receipt is received from said egress between the reference instant and the instant at which an acknowledgement of receipt of the reference packet is received from said egress; - perform a comparison of the first and second data elements; and
- assess integrity of the flow of packets on the basis of the comparison.
55. A system as defined in claim 54, wherein said egress being capable of sending an acknowledgement of receipt of packets from the switching entity includes said egress being capable of sending an acknowledgement of receipt of plural packets from the switching entity.
56. A system as defined in claim 54, wherein said egress being capable of sending an acknowledgement of receipt of packets from the switching entity includes said egress being capable of sending an acknowledgement of receipt of M non-reference packets from the switching entity, M being an integer greater than one.
57. A system as defined in claim 56, said egress being further adapted to send an acknowledgement of receipt of a reference packet from the switching entity.
58. A system as defined in claim 57, wherein said ingress is further adapted to perform a correction of the current indication maintained by said ingress on the basis of the comparison.
59. A system as defined in claim 58, wherein the first and second data elements respectively hold first and second values and wherein the correction is performed by evaluating a difference between the first and second data values and subtracting the difference from the current indication maintained by said ingress.
60. A system as defined in claim 54, wherein said ingress is further adapted to:
- maintain an indication of a time elapsed since the reference instant; and
- if the time elapsed since the reference instant exceeds a threshold before an acknowledgement of receipt of the reference packet is received from said egress:
- send a second reference packet to the switching entity at a second reference instant;
- store a third data element indicative of packets for which an acknowledgement of receipt had not yet been received from said egress at the second reference instant;
- maintain a fourth data element indicative of packets for which an acknowledgement of receipt is received from said egress between the second reference instant and the instant at which an acknowledgement of receipt of the second reference packet is received from said egress;
- perform a comparison of the third and fourth data elements; and
- assess integrity of the flow of packets on the basis of the comparison.
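One possible reading of claim 60, sketched in Python with an assumed timeout value and hypothetical helper names; the timer value and the decision to launch a second reference packet are design choices, not dictated by the claim:

import time

def maybe_resend_reference(monitor, send_reference, reference_instant, timeout_s=0.5):
    # If the reference acknowledgement has not arrived within timeout_s seconds,
    # send a second reference packet; its send time becomes the second reference
    # instant, and the third and fourth data elements restart from that point.
    if monitor.awaiting_reference_ack and time.monotonic() - reference_instant > timeout_s:
        send_reference()
        return time.monotonic()
    return reference_instant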
61. A system as defined in claim 54, wherein said ingress is further adapted to mark as reference packets selected packets received from an external entity.
62. A system as defined in claim 54, wherein said ingress is further adapted to create reference packets and introduce the reference packets amongst packets received from an external entity.
63. A system as defined in claim 54, wherein the reference instant is pre-determined.
64. A system as defined in claim 54, wherein said ingress is further adapted to:
- monitor the contents of selected packets; and
- select the reference instant on the basis of the contents of each selected packet.
65. A system as defined in claim 64, wherein said ingress is further adapted to:
- monitor transmission integrity; and
- upon detection of a transmission integrity problem, transmit a reference packet.
66. A system as defined in claim 54, wherein said ingress is further adapted to:
- monitor a condition at said ingress; and
- select the reference instant on the basis of the monitored condition.
67. A system as defined in claim 66, wherein said ingress is further adapted to maintain an indication of time elapsed since receiving the most recently received acknowledgement of receipt of packets from said egress, wherein the condition is the elapsed time exceeding a threshold.
68. A system as defined in claim 54, wherein said ingress is further adapted to:
- monitor a condition in the switching entity; and
- select the reference instant on the basis of the monitored condition.
69. A system as defined in claim 68, wherein said ingress is further adapted to obtain an indication of resource utilization in the switching entity, wherein the condition is the resource utilization in the switching entity exceeding a threshold.
70. A system as defined in claim 54, wherein said ingress is further adapted to:
- monitor a condition at said egress; and
- select the reference instant on the basis of the monitored condition.
71. A system as defined in claim 70, wherein said egress is further adapted to maintain an indication of resource utilization at said egress, wherein the condition is the resource utilization at said egress exceeding a threshold.
72. A system as defined in claim 54, wherein the reference instant is selected as a function of user input.
73. A system as defined in claim 54, wherein each packet is associated with a characteristic and wherein the reference instant, the current indication of packets for which an acknowledgement of receipt has not yet been received from said egress, the indication stored in the first data element and the indication maintained in the second data element are characteristic-dependent.
74. A system as defined in claim 73, wherein the characteristic is a source-destination pair.
75. A system as defined in claim 73, wherein the characteristic is a bandwidth class.
76. A system as defined in claim 73, wherein the characteristic is a quality of service.
77. A system as defined in claim 73, wherein the characteristic is a priority.
78. A system as defined in claim 54, wherein said egress is adapted to transmit the acknowledgements of received packets to said ingress via the switching entity.
79. A system as defined in claim 54, wherein said egress is further adapted to monitor a property and wherein said egress being capable of sending an acknowledgement of receipt of packets to said ingress includes said egress being capable of sending an acknowledgement of receipt of packets received since the previous time an acknowledgement was sent, upon the property satisfying a condition.
80. A system as defined in claim 79, wherein the property is a level of available return bandwidth to said ingress and wherein the condition is that the available return bandwidth to said ingress be above a return bandwidth threshold.
81. A system as defined in claim 54, further comprising:
- an output queue connected to said egress;
- an electro-optic conversion module connected to said output queue;
- wherein said egress being adapted to send an acknowledgement of receipt of packets includes said egress being adapted to send an acknowledgement of receipt of packets upon the packets exiting said output queue and prior to the packets entering said electro-optic conversion module.
82. A system as defined in claim 54, wherein said ingress is further adapted to:
- perform a comparison between the indication of packets for which an acknowledgement of receipt has not yet been received from said egress and a threshold;
- regulate sending packets to the switching entity on the basis of the comparison.
83. A system as defined in claim 82, wherein said ingress being adapted to regulate sending packets to the switching entity on the basis of the comparison includes said ingress being adapted to:
- send packets to the switching entity if the comparison indicates that the number of packets for which an acknowledgement of receipt has not yet been received from said egress is below the threshold; and
- refrain from sending packets to the switching entity if the comparison indicates that the number of packets for which an acknowledgement of receipt has not yet been received from said egress exceeds the threshold.
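For claims 82 and 83, a minimal sketch in Python of threshold-gated release of packets toward the switching entity; the function and parameter names are assumptions made only for this example:

def try_send(packet, send_to_fabric, outstanding, threshold):
    # Send only while the number of unacknowledged packets is below the threshold;
    # otherwise refrain and leave the packet queued at the ingress.
    if outstanding < threshold:
        send_to_fabric(packet)
        return True, outstanding + 1
    return False, outstanding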
84. A system as defined in claim 82, wherein said ingress being adapted to regulate sending packets to the switching entity on the basis of the comparison includes said ingress being adapted to:
- maintain a memory element indicative of whether the number of packets for which an acknowledgement of receipt has not yet been received from said egress exceeds the threshold;
- send packets to the switching entity if the contents of the memory element is indicative of the number of packets for which an acknowledgement of receipt has not yet been received from said egress exceeding the threshold; and
- refrain from sending packets to the switching entity if the contents of the memory element is not indicative of the number of packets for which an acknowledgement of receipt has not yet been received from said egress exceeding the threshold.
85. A system as defined in claim 82, wherein said ingress being adapted to perform a comparison between the number of packets for which an acknowledgement of receipt has not yet been received from said egress and a threshold includes said ingress being adapted to perform the comparison for each packet prior to sending that packet to the switching entity.
86. A system as defined in claim 82,
- wherein each packet is associated with a characteristic;
- wherein said ingress is further adapted to determine the characteristic associated with each packet;
- wherein said ingress being adapted to maintain an indication of a number of packets for which an acknowledgement of receipt has not yet been received from said egress includes said ingress being adapted to maintain an indication of a number of packets of each characteristic for which an acknowledgement of receipt has not yet been received from said egress;
- wherein said ingress being adapted to perform a comparison between the number of packets for which an acknowledgement of receipt has not yet been received from said egress and a threshold includes said ingress being adapted to perform a comparison between the number of packets of the associated characteristic for which an acknowledgement of receipt has not yet been received from said egress and a threshold associated with the associated characteristic.
87. A system as defined in claim 86, wherein the threshold associated with at least one characteristic is time-invariant.
88. A system as defined in claim 86, wherein the threshold associated with at least one characteristic varies in time.
89. A system as defined in claim 86, wherein the threshold associated with all characteristics is a common threshold.
90. A system as defined in claim 86, wherein the characteristic is a source-destination pair.
91. A system as defined in claim 86, wherein the characteristic is a bandwidth class.
92. A system as defined in claim 86, wherein the characteristic is a quality of service.
93. A system as defined in claim 86, wherein the characteristic is a priority.
94. A system as defined in claim 82, wherein the threshold is the lowest of a plurality of thresholds, at least one of which is modifiable.
95. A system as defined in claim 82,
- wherein said egress is adapted to monitor resource utilization at said egress and send egress resource utilization information to said ingress;
- wherein said ingress is further adapted to receive the egress resource utilization information from said egress;
- wherein the threshold is selected to be a function of the egress resource utilization information received from said egress.
96. A system as defined in claim 95, wherein said egress being adapted to send the egress resource utilization information to said ingress includes said egress being adapted to send egress resource utilization information to said ingress along with an acknowledgement of receipt of the reference packet.
97. A system as defined in claim 96, wherein the resource is memory.
98. A system as defined in claim 96, wherein the threshold is selected to be a function of the egress resource utilization information received from said egress and a storage capacity of at least one of the switching entity and said egress.
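One plausible formula for the threshold contemplated in claims 95 to 98, shown as a sketch only; the claims leave open the exact function of egress resource utilization and storage capacity, so this choice is an assumption:

def compute_threshold(egress_capacity, egress_used, fabric_capacity=0):
    # Bound the number of unacknowledged packets by the storage still free at the
    # egress, optionally extended by storage available in the switching entity.
    return max(0, egress_capacity - egress_used) + fabric_capacity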
99. A system as defined in claim 82,
- wherein said egress is adapted to:
- monitor resource utilization at said egress;
- compute the threshold as a function of the monitored resource utilization at said egress; and
- transmit the threshold to said ingress;
- wherein said ingress is further adapted to:
- receive the threshold from said egress.
100. A system as defined in claim 99, wherein said egress being adapted to transmit the threshold to said ingress includes said egress being adapted to send the threshold to said ingress along with an acknowledgement of receipt of the reference packet.
101. A system as defined in claim 99, wherein said egress is further adapted to transmit the threshold to said ingress via the switching entity.
102. A system as defined in claim 82,
- wherein said egress is adapted to:
- monitor resource utilization at said egress;
- compute the threshold as a function of the monitored resource utilization at said egress; and
- transmit an indication of the threshold to said ingress;
- wherein said ingress is further adapted to:
- receive the indication of the threshold from said egress; and
- determine the threshold from the received indication of the threshold.
103. A system as defined in claim 102, wherein said egress being adapted to transmit an indication of the threshold to said ingress includes said egress being adapted to send an indication of the threshold to said ingress along with an acknowledgement of receipt of the reference packet.
104. A system as defined in claim 102, wherein said egress is further adapted to transmit the indication of the threshold to said ingress via the switching entity.
105. A system as defined in claim 54, wherein said ingress is further adapted to:
- maintain a third data element indicative of a number of packets that are allowed to be transmitted without receiving additional acknowledgements of receipt of packets from said egress;
- regulate sending packets to the switching entity on the basis of the third data element.
106. A system as defined in claim 105, wherein said ingress being adapted to regulate sending packets to the switching entity includes said ingress being adapted to refrain from sending packets to the switching entity unless the third data element is indicative of the number of packets that are allowed to be transmitted without receiving additional acknowledgements of receipt of packets from said egress being greater than zero.
107. A system as defined in claim 105,
- wherein each packet belongs to a corresponding frame consisting of an ordered set of N packets;
- wherein said ingress being adapted to regulate sending packets to the switching entity includes said ingress being adapted to refrain from sending packets to the switching entity unless the third data element is indicative of the number of packets that are allowed to be transmitted without receiving additional acknowledgements of receipt of packets from said egress being greater than N.
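A sketch, in Python with hypothetical names, of the credit-style regulation of claims 105 to 107, where "credits" plays the role of the third data element; claim 107 recites strictly more than N credits before a frame of N packets is released, and the exact margin is a design choice:

def release_frame(frame_packets, credits, send_to_fabric):
    n = len(frame_packets)
    if credits <= n:
        return False, credits            # refrain: the whole frame must fit within the allowance
    for packet in frame_packets:
        send_to_fabric(packet)
    return True, credits - n             # credits are replenished by later acknowledgements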
108. A system as defined in claim 54, wherein:
- said ingress is further adapted to send packets to the switching entity in a designated order;
- said egress is further adapted to re-order the packets received from the switching entity in the designated order;
- wherein said egress being adapted to send an acknowledgement of receipt of packets received from the switching entity includes said egress being adapted to send an acknowledgement of receipt of a set of packets received from the switching entity only upon re-ordering the set of packets.
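The following Python sketch illustrates the re-order-before-acknowledge behaviour of claims 108 and 110; sequence numbering, the batch size M and all names are assumptions made only for the example:

class EgressReorderer:
    def __init__(self, send_ack, m=4):
        self.buffer = {}      # sequence number -> packet, held until it is in order
        self.next_seq = 0     # next sequence number expected in the designated order
        self.pending = 0      # re-ordered packets not yet covered by an acknowledgement
        self.send_ack = send_ack
        self.m = m

    def on_receive(self, seq, packet):
        self.buffer[seq] = packet
        released = []
        while self.next_seq in self.buffer:
            released.append(self.buffer.pop(self.next_seq))
            self.next_seq += 1
            self.pending += 1
            if self.pending == self.m:
                self.send_ack(self.m)    # one acknowledgement covers M re-ordered packets
                self.pending = 0
        return released                  # packets handed onward in the designated order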
109. A system as defined in claim 108, wherein each packet represents a frame consisting of an ordered set of segments;
- wherein said egress being capable of re-ordering the packets in the designated order includes said egress being capable of (I) re-ordering the segments within each frame and (II) re-ordering the frames in the designated order.
110. A system as defined in claim 108, wherein each packet is associated with a characteristic and wherein said egress being capable of sending an acknowledgement of receipt of the re-ordered packets to said ingress includes said egress being capable of sending an acknowledgement of receipt of a number M of re-ordered packets having the characteristic upon re-ordering of M packets received from the switching entity and having that characteristic, wherein M is an integer.
111. A system as defined in claim 110, wherein the characteristic is a source-destination pair.
112. A system as defined in claim 110, wherein the characteristic is a bandwidth class.
113. A system as defined in claim 110, wherein the characteristic is a quality of service.
114. A system as defined in claim 110, wherein the characteristic is a priority.
115. A system as defined in claim 110, wherein M is constant.
116. A system as defined in claim 110, wherein M is modifiable.
117. A system as defined in claim 108, wherein said egress is further adapted to send acknowledgements of receipt of the re-ordered packets to said ingress in a second designated order.
118. A system as defined in claim 117, wherein said ingress is further adapted to monitor an order in which the acknowledgements of receipt of the packets are received from said egress.
119. A system as defined in claim 118, wherein said ingress is further adapted to determine a degree to which the order in which the acknowledgements of receipt of packets are received from said egress matches the second designated order and to perform an action that depends on the degree of match.
120. A system as defined in claim 118, wherein said ingress is further adapted to determine whether a particular acknowledgement of receipt of packets is lost and, if so, to identify a set of packets associated with the lost acknowledgement of receipt.
121. A system as defined in claim 120, wherein said ingress is further adapted to re-transmit to said egress the set of packets associated with the lost acknowledgement.
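As a non-authoritative sketch of claims 118 to 121, an acknowledgement can be presumed lost when a later acknowledgement in the second designated order has already arrived; the packets it covered then become candidates for re-transmission (all names hypothetical):

def find_lost_acks(expected_order, received_acks):
    received = set(received_acks)
    seen_positions = [i for i, ack in enumerate(expected_order) if ack in received]
    if not seen_positions:
        return []
    horizon = max(seen_positions)
    # Everything expected before the newest acknowledgement seen, but never
    # received, is presumed lost.
    return [ack for ack in expected_order[:horizon] if ack not in received]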
122. A system as defined in claim 108, wherein each packet is associated with a corresponding one of a plurality of characteristics and wherein said system includes:
- a plurality of ingress queues, each ingress queue associated with a respective one of the characteristics;
- an arbiter connected to said ingress queues, for controlling the release of packets from said ingress queues into the switching entity.
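To ground claim 122, a small Python sketch of per-characteristic ingress queues drained by a round-robin arbiter; the round-robin policy is only an assumption, as the claim does not specify how the arbiter chooses:

import collections

class IngressArbiter:
    def __init__(self, characteristics):
        self.queues = {c: collections.deque() for c in characteristics}
        self.order = list(characteristics)

    def enqueue(self, characteristic, packet):
        self.queues[characteristic].append(packet)

    def release_next(self, send_to_fabric):
        for _ in range(len(self.order)):
            c = self.order.pop(0)
            self.order.append(c)                  # rotate for round-robin fairness
            if self.queues[c]:
                send_to_fabric(self.queues[c].popleft())
                return c
        return None                               # all queues are empty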
123. A method of assessing integrity of a flow of packets through a switching entity, comprising:
- sending packets from an ingress to the switching entity, the packets including a reference packet sent to the switching entity at a reference instant;
- receiving packets from the switching entity at an egress and sending to the ingress an acknowledgement of receipt of packets from the switching entity;
- receiving acknowledgements of receipt of packets from the egress;
- maintaining a current indication of packets for which an acknowledgement of receipt has not yet been received from the egress;
- storing a first data element indicative of packets for which an acknowledgement of receipt had not yet been received from the egress at the reference instant;
- maintaining a second data element indicative of packets for which an acknowledgement of receipt from the egress is received between the reference instant and the instant at which an acknowledgement of receipt of the reference packet is received from the egress;
- performing a comparison of the first and second data elements; and
- assessing integrity of the flow of packets on the basis of the comparison.
124. Computer-readable storage media containing a program element for execution by a computing device to implement an ingress for regulating packet flow through a switching entity, the ingress comprising:
- a control entity operative to:
- send packets to the switching entity, the packets including a reference packet sent to the switching entity at a reference instant;
- receive acknowledgements of receipt of packets from an egress;
- maintain a current indication of packets for which an acknowledgement of receipt has not yet been received from the egress;
- store a first data element indicative of packets for which an acknowledgement of receipt had not yet been received from the egress at the reference instant;
- maintain a second data element indicative of packets for which an acknowledgement of receipt from the egress is received between the reference instant and the instant at which an acknowledgement of receipt of the reference packet is received from the egress;
- perform a comparison of the first and second data elements; and
- assess integrity of the flow of packets on the basis of the comparison.
125. A system as defined in claim 6, wherein the characteristic is a combination of a source-destination pair and at least one of a priority, a bandwidth class and a quality of service.
126. A system as defined in claim 26, wherein the characteristic is a combination of a source-destination pair and at least one of a priority, a bandwidth class and a quality of service.
127. A system as defined in claim 73, wherein the characteristic is a combination of a source-destination pair and at least one of a priority, a bandwidth class and a quality of service.
128. A system as defined in claim 86, wherein the characteristic is a combination of a source-destination pair and at least one of a priority, a bandwidth class and a quality of service.
129. A system as defined in claim 110, wherein the characteristic is a combination of a source-destination pair and at least one of a priority, a bandwidth class and a quality of service.
130. A system as defined in claim 95, wherein said egress being adapted to send the egress resource utilization information to said ingress includes said egress being adapted to send egress resource utilization information to said ingress along with an acknowledgement of receipt of a packet other than the reference packet.
131. A system as defined in claim 99, wherein said egress being adapted to transmit the threshold to said ingress includes said egress being adapted to send the threshold to said ingress along with an acknowledgement of receipt of a packet other than the reference packet.
132. A system as defined in claim 102, wherein said egress being adapted to transmit an indication of the threshold to said ingress includes said egress being adapted to send an indication of the threshold to said ingress along with an acknowledgement of receipt of a packet other than the reference packet.
133. A system as defined in claim 68, wherein said ingress is further adapted to obtain an indication of failure in the switching entity, wherein the condition is the indication of failure in the switching entity.
134. A system as defined in claim 66, wherein said ingress is further adapted to obtain an indication of failure in said ingress, wherein the condition is the indication of failure in said ingress.
135. A system as defined in claim 70, wherein said ingress is further adapted to obtain an indication of failure in said egress, wherein the condition is the indication of failure in said egress.
PCT/CA2003/001353 2002-09-03 2003-09-03 Systems and methods for packet flow regulation and transmission integrity verification of a switching entity WO2004023718A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2003264210A AU2003264210A1 (en) 2002-09-03 2003-09-03 Systems and methods for packet flow regulation and transmission integrity verification of a switching entity

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US40735702P 2002-09-03 2002-09-03
US40735602P 2002-09-03 2002-09-03
US60/407,356 2002-09-03
US60/407,357 2002-09-03

Publications (2)

Publication Number Publication Date
WO2004023718A2 true WO2004023718A2 (en) 2004-03-18
WO2004023718A3 WO2004023718A3 (en) 2004-09-10

Family

ID=31981502

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CA2003/001353 WO2004023718A2 (en) 2002-09-03 2003-09-03 Systems and methods for packet flow regulation and transmission integrity verification of a switching entity

Country Status (2)

Country Link
AU (1) AU2003264210A1 (en)
WO (1) WO2004023718A2 (en)


Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6076112A (en) * 1995-07-19 2000-06-13 Fujitsu Network Communications, Inc. Prioritized access to shared buffers

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DIOT C ET AL: "Impact of out-of-sequence processing on the performance of data transmission" COMPUTER NETWORKS, ELSEVIER SCIENCE PUBLISHERS B.V., AMSTERDAM, NL, vol. 31, no. 5, 11 March 1999 (1999-03-11), pages 475-492, XP004304495 ISSN: 1389-1286 *
KOFMAN D: "Traffic and congestion control in broadband networks" DECISION AND CONTROL, 1996., PROCEEDINGS OF THE 35TH IEEE CONFERENCE ON KOBE, JAPAN 11-13 DEC. 1996, NEW YORK, NY, USA,IEEE, US, 11 December 1996 (1996-12-11), pages 2894-2898, XP010213672 ISBN: 0-7803-3590-2 *
TAHAR S ET AL: "Formal verification of an ATM switch fabric using multiway decision graphs" VLSI, 1996. PROCEEDINGS., SIXTH GREAT LAKES SYMPOSIUM ON AMES, IA, USA 22-23 MARCH 1996, LOS ALAMITOS, CA, USA,IEEE COMPUT. SOC, US, 22 March 1996 (1996-03-22), pages 106-111, XP010157903 ISBN: 0-8186-7502-0 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9059915B2 (en) 2012-08-31 2015-06-16 Cisco Technology, Inc. Multicast replication skip
US8958329B2 (en) 2012-11-20 2015-02-17 Cisco Technology, Inc. Fabric load balancing
US10122645B2 (en) 2012-12-07 2018-11-06 Cisco Technology, Inc. Output queue latency behavior for input queue based device
US9628406B2 (en) 2013-03-13 2017-04-18 Cisco Technology, Inc. Intra switch transport protocol
WO2014158407A1 (en) * 2013-03-14 2014-10-02 Cisco Technology, Inc. Intra switch transport protocol
US9860185B2 (en) 2013-03-14 2018-01-02 Cisco Technology, Inc. Intra switch transport protocol
CN112822119A (en) * 2020-12-31 2021-05-18 北京浩瀚深度信息技术股份有限公司 Flow control method, flow control equipment and storage medium based on reverse token bucket
CN112822119B (en) * 2020-12-31 2022-09-13 北京浩瀚深度信息技术股份有限公司 Flow control method, flow control equipment and storage medium based on reverse token bucket

Also Published As

Publication number Publication date
AU2003264210A1 (en) 2004-03-29
WO2004023718A3 (en) 2004-09-10
AU2003264210A8 (en) 2004-03-29

Similar Documents

Publication Publication Date Title
US11757764B2 (en) Optimized adaptive routing to reduce number of hops
JP4080911B2 (en) Bandwidth monitoring device
US8780719B2 (en) Packet relay apparatus and congestion control method
US6535482B1 (en) Congestion notification from router
US7069356B2 (en) Method of controlling a queue buffer by performing congestion notification and automatically adapting a threshold value
US7369498B1 (en) Congestion control method for a packet-switched network
US6625118B1 (en) Receiver based congestion control
KR101143172B1 (en) Efficient transfer of messages using reliable messaging protocols for web services
US20200236052A1 (en) Improving end-to-end congestion reaction using adaptive routing and congestion-hint based throttling for ip-routed datacenter networks
JP5710418B2 (en) Packet relay apparatus and method
US20050201284A1 (en) TCP optimized single rate policer
US9860185B2 (en) Intra switch transport protocol
US8717893B2 (en) Network stabilizer
WO2008027310A2 (en) Systems and methods for energy-conscious communication in wireless ad-hoc networks
JP5541293B2 (en) Packet receiving apparatus, packet communication system, and packet order control method
JP2003124984A (en) Data distribution managing apparatus, system and method therefor
EP0955749A1 (en) Receiver based congestion control and congestion notification from router
WO2004023718A2 (en) Systems and methods for packet flow regulation and transmission integrity verification of a switching entity
US20230261973A1 (en) Method for distributing multipath flows in a direct interconnect network
CN117395206B (en) Rapid and accurate congestion feedback method for lossless data center network
TWI831622B (en) Apparatus for managing network flow congestion and method thereof
CN115022227B (en) Data transmission method and system based on circulation or rerouting in data center network
US20230353472A1 (en) Method for verifying flow completion times in data centers
US20230269187A1 (en) Apparatus and method for managing network flow congestion
Kojo et al. Supporting low latency near the network edge and with challenging link technologies

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SC SD SE SG SK SL TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP