ANALYSIS OF NETWORK PERFORMANCE
FIELD OF THE INVENTION
The present invention relates generally to communication networks, and specifically to testing and fault discovery in communication networks.
BACKGROUND OF THE INVENTION
Communication networks are in wide use in many technological fields including distributed computing, data exchange and telecommunication applications. Communication networks generally include many nodes, such as bridges, LAN switches, routers, cross-connections and telephone switches. The networks further include communication links, such as cables, point-to-point radio connections and optical fibers, which connect the nodes. The networks also include ports, generally within some of the nodes, for attaching external devices such as computers, terminals, handsets, and multiplexers. These external devices are referred to as end-points, or hosts.
Networks are becoming increasingly complex, especially due to their increasing speeds of operation, the number of units interconnected by a network and the formation of large networks from different types of sub-networks. In addition, networks may transmit concurrently various types of data, such as text, voice, video and other multimedia files. In order to allow for these different types of data, some networks are designed to provide different amounts of bandwidth and different levels of quality of service.
A major issue in both newly-deployed and existing communication networks is testing and trouble-shooting, i.e., checking whether the network is operating according to its specifications and, if not, determining the cause of the network's inadequate performance (for example, the identity of a faulty unit).
Dedicated point-to-point testing equipment is a commonly-used network testing tool. Such equipment is described, for example, in U.S. Patent 5,477,531, whose disclosure is incorporated herein by reference. Usually, dedicated point-to-point testing equipment requires two users to coordinate their operations in order to identify a misbehaving component of the network. To test a large network, the testing equipment must be moved between many ports of the network.
U.S. Patent 5,812,529, whose disclosure is incorporated herein by reference, describes a system and method for acquiring network performance data, built around a "mission server," which interfaces with clients to receive requests for "missions." A typical mission includes operations such as transmission and reception of data packets among devices connected to segments of the network. The mission is performed and/or supported by "sentries," typically software agents running on stand-alone network devices or end-points. The sentries carry out mission operations in response to commands from the mission server, and report back to the mission server on the mission results.
U.S. Patents 5,838,919 and 5,881,237, whose disclosures are incorporated herein by reference, describe methods, systems and computer program products for testing of network performance using test scenarios that simulate actual communications traffic between network endpoints. Specific test protocols are assigned to endpoint nodes on the network. Typically, the nodes are paired, and one of the nodes in the pair communicates the protocol to the other, associated node. A console node sets up the test protocols, initiates their execution and receives data on the test performance from the endpoint nodes.
Application performance measurement tools evaluate the performance of existing or new applications as they are introduced into a network. Typical tools of this sort include "Chariot," produced by Ganymede (Research Triangle Park, North Carolina), and "Webload" and "Webexam," produced by Radview (Tel Aviv, Israel). Such tools, however, do not test the network itself independent of specific applications. Therefore, they cannot readily distinguish between problems whose root causes are in the application and those that are in the network itself.
SUMMARY OF THE INVENTION
It is an object of some aspects of the present invention to provide improved methods and apparatus for locating faults within communication networks.
It is another object of some aspects of the present invention to provide improved methods and apparatus for evaluation of the performance of communication networks.
In preferred embodiments of the present invention, a distributed testing system for evaluation and/or testing of a communication network comprises a plurality of traffic agents coupled to nodes and/or hosts of the network. The traffic agents act as artificial users of the network by, for example, transmitting and receiving packets of data, establishing connections, and determining traffic statistics. The testing system further comprises a testing center, which controls the operations of the traffic agents and receives reports from the agents regarding the results of tests conducted thereby.
In some preferred embodiments of the present invention, the testing center orders at least one of the traffic agents to transmit packets to at least one other traffic agent. The relative times and order of arrival of the packets at the receiving traffic agent or agents are preferably analyzed to find one or more measures of traffic variability. These measures are typically used to determine whether network transmissions are orderly and regular, or whether there are irregularities in packet arrival that may be indicative of network faults. While measurements of packet transmission times are used in network diagnostic systems known in the art, it is generally only the average transmission time that is of concern in these systems. Preferred embodiments of the present invention, on the other hand, make use of comparative statistical properties among the received packets to derive richer diagnostic information. For example, in one of these preferred embodiments, packets are transmitted at regular intervals, and the system compiles statistics on packets that do not reach their destination in order to determine whether packet loss occurs regularly or in bursts. In another preferred embodiment, the order of the arrival of packets at their destination is compared to the order of their transmission, and a measure is derived of the extent to which packets have arrived out of order. Other such comparative variability measures will be apparent to those skilled in the art.
In still another preferred embodiment of the present invention, the traffic agents are used to diagnose problems associated with an application running on a server and accessed over the network. A first traffic agent is installed on a first computer that is also an application server. A second traffic agent, on a second computer, both communicates with the first traffic agent and accesses the application server, by emulating a client of the server or using an actual client program on the second computer. By comparing the performance of these two types of communications, it is possible to assess whether the application service problems are due to difficulties in the application or to network communication delays. This type of comparison cannot be carried out by diagnostic systems known in the art.
In some preferred embodiments of the present invention, the testing center initiates a test by commanding a number of the traffic agents to begin transmitting packets. Typically, two or more of the traffic agents are to begin transmitting substantially simultaneously. Preferably, the command conveyed to the traffic agents includes the current time, as measured by the testing center, and a time at which the transmission is to begin. This method of test initiation obviates the need to send an initiate command to all participating nodes at the start of the test, as is practiced in diagnostic systems known in the art, and generally provides more accurate synchronization of the participating traffic agents.
In one of these preferred embodiments, a pair of traffic agents are commanded to transmit packets to one another and to determine the times of arrival of the packets that they respectively receive. The times of transmission by the two agents are generally uncorrelated. The agents then inform the testing center of the times at which they sent and received the packets, or simply of the difference between their respective send and receive times. The testing center uses these essentially one-way transmission data in order to determine accurately the round-trip travel time of a packet. By contrast, in systems known in the art, measurements of round-trip delay are based on sending a packet from a first node to a second node, and then waiting to receive a return packet from the second node. The measurements thus require consecutive actions by the two nodes and are complicated by processing delays that may occur at one or both ends.
While preferred embodiments are described herein for the most part with reference to tests involving transmission of packets between pairs of traffic agents, the principles of the present invention can also be applied in more complex test scenarios. For example, test agents may be chained, so that each one sends a packet in turn to the next agent in the chain. Alternatively or additionally, multiple agents may send packets simultaneously to the same receiving agent. All such variations are considered to be within the scope of the present invention.
There is therefore provided, in accordance with a preferred embodiment of the present invention, a method for testing of a communication network, using a plurality of traffic agents coupled to communicate via the network, the method including: transmitting a sequence of data packets via the network from a first one of the traffic agents to a second one of the traffic agents; creating a record of the packets in the sequence that were not received at the second traffic agent; and assessing a relative irregularity in the occurrence of packet loss, based on the record.
Preferably, assessing the relative irregularity of packet loss includes detecting bursts of lost packets.
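By way of illustration only, burst detection might be sketched as follows, assuming that each packet carries a consecutive sequence number starting at 0 and that the receiving traffic agent reports which numbers arrived. The function names `loss_bursts` and `mean_burst_length` are hypothetical, and the mean run length is just one possible measure of irregularity, not one prescribed herein.

```python
from typing import Iterable, List, Tuple


def loss_bursts(num_sent: int, received: Iterable[int]) -> List[Tuple[int, int]]:
    """Group lost packets into runs of consecutive sequence numbers.

    Assumes packets were numbered 0 .. num_sent - 1 (an illustrative
    convention). Returns one (first_lost_seq, run_length) pair per run.
    """
    got = set(received)
    bursts: List[Tuple[int, int]] = []
    run_start = None  # sequence number where the current loss run began
    for seq in range(num_sent):
        if seq not in got:
            if run_start is None:
                run_start = seq
        elif run_start is not None:
            bursts.append((run_start, seq - run_start))
            run_start = None
    if run_start is not None:  # loss run extends to the end of the test
        bursts.append((run_start, num_sent - run_start))
    return bursts


def mean_burst_length(bursts: List[Tuple[int, int]]) -> float:
    """Average run length; values well above 1 suggest bursty loss."""
    if not bursts:
        return 0.0
    return sum(length for _, length in bursts) / len(bursts)
```

A mean run length close to 1 would suggest isolated, regularly scattered losses, whereas larger values would suggest that losses cluster in bursts.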
There is also provided, in accordance with a preferred embodiment of the present invention, a method for testing of a communication network, using a plurality of traffic agents coupled to communicate via the network, the method including: transmitting a sequence of data packets via the network from a first one of the traffic agents to a second one of the traffic agents; determining an order of arrival of the packets at the second traffic agent; and comparing the order of arrival to an order in which the packets were transmitted. Preferably, comparing the order of arrival includes finding a measure of discrepancy between the order of arrival and the order in which the packets were transmitted.
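One possible measure of discrepancy, offered here only as an assumption since no specific measure is mandated above, is the number of inversions: pairs of packets that arrived in the opposite order to that in which they were sent. The sketch below assumes that the transmission order is given by increasing sequence numbers; the names `count_inversions` and `disorder_ratio` are hypothetical.

```python
import bisect
from typing import List, Sequence


def count_inversions(arrival_order: Sequence[int]) -> int:
    """Count pairs of packets that arrived in reversed order.

    arrival_order lists sequence numbers in the order they were received;
    lost packets are simply absent from the list.
    """
    seen: List[int] = []  # sequence numbers received so far, kept sorted
    inversions = 0
    for seq in arrival_order:
        pos = bisect.bisect_right(seen, seq)
        # Every earlier arrival with a larger sequence number is an inversion.
        inversions += len(seen) - pos
        seen.insert(pos, seq)
    return inversions


def disorder_ratio(arrival_order: Sequence[int]) -> float:
    """Inversions normalized to [0, 1]; 0 is in order, 1 is fully reversed."""
    n = len(arrival_order)
    if n < 2:
        return 0.0
    return count_inversions(arrival_order) / (n * (n - 1) / 2)
```

Normalizing by the maximum possible number of inversions allows tests with different packet counts to be compared.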
There is additionally provided, in accordance with a preferred embodiment of the present invention, a method for testing of a communication network, using a plurality of traffic agents coupled to communicate via the network, the method including: transmitting a sequence of data packets via the network from a first one of the traffic agents to a second one of the traffic agents; determining respective arrival times of the packets in the sequence; determining a packet transmission delay between the traffic agents responsive to the arrival times; and finding a change in the transmission delay over time.
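As a rough sketch of how a change in delay over time might be found, the apparent delay of each packet (arrival time minus send time) can be fitted with a straight line. This assumes that the send times are known, for example because packets are sent at regular intervals, and that the two clocks differ only by a constant offset, which shifts every delay equally and so does not affect the slope. The least-squares fit and the name `delay_trend` are illustrative assumptions, not the method prescribed above.

```python
from typing import Sequence


def delay_trend(send_times: Sequence[float], arrival_times: Sequence[float]) -> float:
    """Least-squares slope of apparent delay versus send time.

    A positive slope means the delay grows during the test (for example,
    a queue building up); a slope near zero means the delay is stable.
    """
    if len(send_times) != len(arrival_times) or len(send_times) < 2:
        raise ValueError("need at least two matching send/arrival times")
    delays = [a - s for s, a in zip(send_times, arrival_times)]
    n = len(delays)
    mean_t = sum(send_times) / n
    mean_d = sum(delays) / n
    var_t = sum((t - mean_t) ** 2 for t in send_times)
    if var_t == 0:
        raise ValueError("send times must not all be equal")
    cov = sum((t - mean_t) * (d - mean_d) for t, d in zip(send_times, delays))
    return cov / var_t
```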
There is further provided, in accordance with a preferred embodiment of the present invention, a method for testing of a communication network, using a plurality of traffic agents coupled to communicate via the network, the method including: transmitting a sequence of data packets via the network from a first one of the traffic agents to a second one of the traffic agents, the sequence including both communication test packets and packets associated with an application that is accessed via the network; recording arrival characteristics of the packets in the sequence, responsive to receiving the packets at the second traffic agent; and observing a difference in the arrival characteristics of the communication test packets relative to those of the packets associated with the application.
There is moreover provided, in accordance with a preferred embodiment of the present invention, a method for testing of a communication network, using a plurality of traffic agents coupled to communicate via the network, the method including: transmitting a first sequence of data packets via the network from a first one of the traffic agents to a second one of the traffic agents; transmitting a second sequence of data packets via the network from the second one of the traffic agents, responsive to receiving the data packets in the first sequence, to a third one of the traffic agents; recording arrival characteristics of the packets in the second sequence, responsive to receiving the packets at the third traffic agent; and comparing the arrival characteristics of different packets in the sequence so as to determine a measure of variability in transmission of the packets via the network.
There is furthermore provided, in accordance with a preferred embodiment of the present invention, a method for testing of a computer application accessed via a communication network, using a plurality of traffic agents coupled to communicate via the network, the method including: running an instance of the application on a first computer coupled to the network, on which a first one of the traffic agents is also running; exchanging test data packets via the network between a second one of the traffic agents, running on a second computer coupled to the network, and the first traffic agent, so as to determine test packet exchange characteristics generally independent of the application; exchanging application data packets via the network between the second computer and the instance of the application running on the first computer, so as to determine application packet exchange characteristics; and comparing the exchange characteristics of the application and test packets.
Preferably, running the instance of the application on the first computer includes running an application server, and exchanging the application data packets includes transmitting application client messages from the second computer to the first computer. Alternatively or additionally, running the instance of the application includes running a distributed computing application on the first computer, and exchanging the application data packets includes running another instance of the application on the second computer. Further alternatively or additionally, comparing the exchange characteristics includes comparing a delay in the exchange of application data between the first and second computers relative to the exchange of test data.
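The delay comparison might be sketched as follows, assuming that the second computer has collected exchange delays both for test packets sent to the first traffic agent and for application requests sent to the application instance on the same first computer. The use of the median and the name `split_delay` are assumptions made for illustration.

```python
from statistics import median
from typing import Sequence, Tuple


def split_delay(test_delays: Sequence[float],
                app_delays: Sequence[float]) -> Tuple[float, float]:
    """Return (network_delay, application_excess).

    network_delay is the typical delay of the application-independent test
    packets; application_excess is how much longer the application exchange
    typically takes over the same network path.
    """
    network_delay = median(test_delays)
    application_excess = median(app_delays) - network_delay
    return network_delay, application_excess
```

Under these assumptions, a large excess with a small test-packet delay would point toward the application, while a large test-packet delay would point toward the network.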
There is also provided, in accordance with a preferred embodiment of the present invention, a method for determining a round-trip transmission delay in a communication network, including: transmitting a first data packet through the network at a first transmit time, from a first endpoint of the network to a second endpoint of the network; receiving the first data packet at the second endpoint at a first receive time; transmitting a second data packet through the network at a second transmit time, substantially independent of the first transmit and receive times, from the second endpoint to the first endpoint; receiving the second data packet at the first endpoint at a second receive time; and comparing the first and second transmit times and the first and second receive times so as to determine the round-trip transmission delay.
Preferably, transmitting the second data packet includes transmitting the second packet without waiting to receive the first data packet at the second endpoint. Additionally or alternatively, comparing the first and second transmit times and the first and second receive times includes using transmit and receive times recorded in accordance with different clocks maintained at the first and second endpoints. Most preferably, comparing the first and second transmit times and the first and second receive times includes canceling out a relative offset between the different clocks, substantially without an a priori knowledge of the offset.
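The cancellation of the clock offset can be illustrated with a short sketch. Suppose, as an assumption for the example, that the clock at the second endpoint reads ahead of the clock at the first endpoint by an unknown constant offset. The apparent delay of the first packet then equals the true forward delay plus the offset, and the apparent delay of the second packet equals the true return delay minus the offset, so that their sum is the round-trip delay with the offset eliminated. The name `round_trip_delay` is hypothetical.

```python
def round_trip_delay(t1_send: float, t1_recv: float,
                     t2_send: float, t2_recv: float) -> float:
    """Round-trip delay from two independent one-way measurements.

    t1_send: first packet sent, read on the first endpoint's clock
    t1_recv: first packet received, read on the second endpoint's clock
    t2_send: second packet sent, read on the second endpoint's clock
    t2_recv: second packet received, read on the first endpoint's clock

    (t1_recv - t1_send) = forward delay + offset
    (t2_recv - t2_send) = return delay  - offset
    The sum cancels the offset without knowing its value.
    """
    return (t1_recv - t1_send) + (t2_recv - t2_send)


# Example: clocks differ by 1000 s; true delays are 0.03 s and 0.05 s.
offset, forward, back = 1000.0, 0.03, 0.05
rtt = round_trip_delay(
    t1_send=5.0,
    t1_recv=5.0 + forward + offset,
    t2_send=1002.0,
    t2_recv=1002.0 - offset + back,
)
```

This sketch assumes the offset stays constant between the two measurements; clock drift during the test is not accounted for.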
There is additionally provided, in accordance with a preferred embodiment of the present invention, a method for testing of a communication network, using a plurality of traffic agents coupled to communicate via the network and having respective agent clocks that are generally independent of one another, the method including: determining at a testing center a start time at which a test of the network is to begin; sending respective start messages to the traffic agents, each start message containing the start time and a time of sending the start message determined with reference to a local clock maintained by the testing center; and synchronizing initiation of the test by the traffic agents, responsive to the respective start messages.
Preferably, determining the start time includes choosing a time to start the test that is delayed relative to expected times of sending the start messages. Further preferably, synchronizing the initiation of the test includes starting the test at each of the traffic agents at the start time, as indicated by the respective agent clock, corrected responsive to the time of sending contained in the respective start message.
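A traffic agent's handling of a start message might be sketched as follows. The sketch assumes that the transit time of the start message is negligible compared with the lead time before the test, so that the agent can estimate the difference between its own clock and the testing center's clock from the time of sending carried in the message. The class `StartMessage` and the function `local_start_time` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class StartMessage:
    start_time: float  # when the test is to begin, on the testing center's clock
    send_time: float   # when the message was sent, on the testing center's clock


def local_start_time(msg: StartMessage, recv_local: float) -> float:
    """Convert the commanded start time to the agent's own clock.

    recv_local is the agent clock reading when the message arrived.
    Assumption: message transit time is ignored, so the clock offset is
    estimated as recv_local - msg.send_time.
    """
    if msg.start_time < msg.send_time:
        raise ValueError("start time must not precede the time of sending")
    offset = recv_local - msg.send_time
    return msg.start_time + offset


# Example: the center sends at 100.0 and orders a start at 105.0;
# the agent's unsynchronized clock reads 5000.0 on receipt.
begin_at = local_start_time(StartMessage(start_time=105.0, send_time=100.0), 5000.0)
```

Because each agent applies its own correction, agents whose clocks disagree can still begin at approximately the same instant, to within the differences in message transit time.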
There is further provided, in accordance with a preferred embodiment of the present invention, apparatus for testing of a communication network, including: a first traffic agent, coupled to transmit a sequence of data packets via the network; and a second traffic agent, coupled to receive the data packets transmitted by the first traffic agent and to record, responsive to receiving the data packets, an indication of the packets in the sequence that were lost in transmission, wherein a relative irregularity in the occurrence of packet loss is assessed based on the indication.
Preferably, the apparatus includes a testing center, coupled to the network, which is adapted to receive the indication of the packets that were lost in transmission and to assess the relative irregularity in the occurrence of packet loss.
There is moreover provided, in accordance with a preferred embodiment of the present invention, apparatus for testing of a communication network, including: a first traffic agent, coupled to transmit a sequence of data packets via the network; and a second traffic agent, coupled to receive the data packets transmitted by the first traffic agent and to record, responsive to receiving the data packets, an order of arrival of the packets at the second traffic agent, wherein a measure of discrepancy is determined between the order of arrival and an order in which the packets were transmitted.
There is furthermore provided, in accordance with a preferred embodiment of the present invention, apparatus for testing of a communication network, including: a first traffic agent, coupled to transmit a sequence of data packets via the network; and a second traffic agent, coupled to receive the data packets transmitted by the first traffic agent and to record, responsive to receiving the data packets, respective arrival times of the packets in the sequence at the second traffic agent, wherein a change in a transmission delay over time between the first and second traffic agents is detected responsive to the recorded arrival times.
There is also provided, in accordance with a preferred embodiment of the present invention, apparatus for testing of a computer application accessed via a communication network, including: a first computer, coupled to communicate via the network, and configured both to run an instance of the application and to act as a first traffic agent; and a second computer, coupled to communicate via the network with the first computer, and configured both to act as a second traffic agent so as to exchange test data packets via the network with the first traffic agent, generally independent of the application, and to exchange application data packets via the network with the instance of the application running on the first computer, so as to determine and compare characteristics of the exchange of the test data with corresponding characteristics of the exchange of the application data.
Preferably, the instance of the application running on the first computer includes an application server, and the second computer acts as a client of the application.
There is additionally provided, in accordance with a preferred embodiment of the present invention, apparatus for determining a round-trip transmission delay in a communication network, including: a first traffic agent, adapted to be coupled to a first network endpoint and configured to transmit a first data packet through the network at a first transmit time, from the first endpoint to a second endpoint of the network; and a second traffic agent, adapted to be coupled to the second network endpoint, so as to receive the first data packet at a first receive time and to transmit a second data packet through the network to the first network endpoint at a second transmit time, substantially independent of the first transmit and receive times, to be received by the first traffic agent at a second receive time, wherein the first and second transmit times and the first and second receive times are compared so as to determine the round-trip transmission delay.