US20070208894A1 - Modification of a layered protocol communication apparatus - Google Patents
- Publication number
- US20070208894A1 (application US 11/390,488)
- Authority
- US
- United States
- Prior art keywords
- processor
- layer
- version
- connection data
- software
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/38—Information transfer, e.g. on bus
- G06F13/382—Information transfer, e.g. on bus using universal interface adapter
- G06F13/387—Information transfer, e.g. on bus using universal interface adapter for adaptation of different data processing systems to different peripheral devices, e.g. protocol converters for incompatible systems, open system
Definitions
- This invention relates to the field of communications.
- In particular, this invention is drawn to methods and apparatus for modifying a layered protocol communication apparatus, including software modifications associated with different levels of the layered protocol communication apparatus.
- Communication networks are used to carry a wide variety of data.
- a communication network includes a number of interconnected nodes. Communication between source and destination is accomplished by routing data from a source through the communication network to a destination.
- Such a network might carry voice communications, financial transaction data, real-time data, etc., not all of which require the same level of performance from the network.
- the network might be used, for example, to communicate data associated with different classes of service such as “first available”, business data, priority data, or real-time data which place different constraints on the requirements for the delivery of the data including the timeframe within which it will be delivered.
- Disruption to the network can be very costly.
- the revenue stream for many businesses is highly dependent upon the availability of the network.
- the network service provider frequently is under contract to guarantee certain levels of availability to customers and may incur significant financial liability in the event of disruption.
- maintenance is performed on the nodes. Maintenance may also be required to ensure that the nodes support various communication protocols as they evolve over time.
- the maintenance process itself can contribute to disruption of network availability.
- One type of maintenance is a software upgrade. Although nodes with redundant capabilities may avoid the disruption of traffic during the upgrade, providing such redundancies for every node may either be financially or operationally impractical.
- Non-redundant elements in the upgrade path represent a significant risk to uninterrupted traffic flow.
- One approach for performing a software upgrade on non-redundant elements is to physically remove modules with the dated software and replace them with modules for which the software has been updated. This undesirably disrupts all traffic being handled by the module prior to removal.
- a method of modifying a layered protocol communication apparatus includes transferring a control plane from a first processor handling a first layer to a second processor handling a second layer.
- software associated with the first processor is modified prior to transferring the control plane from the second processor back to the first processor for handling.
- Another method of modifying a layered protocol communication apparatus includes transferring a first layer handled by a first processor to a second processor handling a second layer.
- software associated with the first processor is modified prior to transferring the first layer from the second processor back to the first processor for handling.
- FIG. 1 illustrates one embodiment of a layered protocol model for a communications network.
- FIG. 2 illustrates one embodiment of an alternative layered protocol model for a communications network.
- FIG. 3 illustrates one embodiment of a communications network component implementing a layered protocol.
- FIG. 4 illustrates a software download status prior to performing an upgrade of the software for one element associated with an upper level layer of a layered protocol communication apparatus.
- FIG. 5 illustrates the layered protocol communication apparatus after the software upgrade of the element associated with the upper level layer.
- FIG. 6 illustrates transfer of layer functionality from processors at one hierarchical level to a processor at a higher hierarchical level.
- FIG. 7 illustrates transfer of layer functionality from processors at one hierarchical level to another processor at the same hierarchical level.
- FIG. 8 illustrates the apparatus after the software upgrade of the elements normally associated with the transferred layer.
- FIG. 9 illustrates the reconfiguration of layer hardware and the transfer of layer functionality to the processors normally associated with the layer.
- FIG. 10 illustrates the swap in active/standby status for redundant elements at a higher level.
- FIG. 11 illustrates the layered protocol communication apparatus after the software upgrade of another higher level element.
- FIG. 12 illustrates one embodiment of a process for upgrading the software of a communications node.
- FIG. 13 illustrates one embodiment of a preparation phase of the software upgrade process.
- FIG. 14 illustrates one embodiment of the beginning of the execution phase of the software upgrade process.
- FIG. 15 illustrates one embodiment of transferring a control plane between processors at different levels or alternately at the same level of the element hierarchy.
- FIG. 16 illustrates one embodiment of transferring layer functionality between processors.
- FIG. 17 illustrates one embodiment of re-configuring low-level hardware handling the data traffic.
- FIG. 18 illustrates an alternative embodiment of re-configuring low-level hardware handling the data traffic.
- FIG. 19 illustrates one embodiment of the completion of the execution phase of the software upgrade process.
- Protocol layering entails dividing the network design into functional layers and assigning protocols for each layer's tasks.
- the layers represent levels of abstraction for performing functions such as data handling and connection management.
- For each layer, one or more physical entities implement its functionality.
- For example, data delivery and connection management may be put into separate layers, and therefore separate protocols.
- One protocol is designed to perform data delivery, and another protocol performs connection management.
- The protocol for connection management is “layered” above the protocol handling data delivery.
- The data delivery protocol has no knowledge of connection management.
- Likewise, the connection management protocol is not concerned with data delivery. Abstraction through layering enables simplification of the various individual layers and protocols. The protocols can then be assembled into a useful whole. Protocol layering thus produces simple protocols, each with a few well-defined tasks. Individual protocols can also be removed, modified, or replaced as needed for particular applications.
- Implementation of a given functional layer may occur within a single element or be distributed across multiple elements.
- the layering corresponds to a hardware or software hierarchy of elements.
- Each layer interacts directly only with the layer immediately beneath it, and provides facilities for use by the layer above it.
- the protocols enable an entity in one host to interact with a corresponding entity at the same layer in a remote host.
- FIG. 1 illustrates one embodiment of a layered protocol design.
- This four layer model 100 was promulgated by the Defense Advanced Research Projects Agency's (DARPA) Internetwork Project for the United States Department of Defense in the 1970s.
- DARPA Internetwork Project is the forerunner of the modern day ubiquitous Internet.
- the network access layer 110 is responsible for dealing with the specific physical properties of the communications media. Different protocols may be used depending upon the type of physical network.
- the Internet layer 120 is responsible for source-to-destination routing of data across different physical networks.
- the host-to-host layer 130 establishes connections between hosts and is responsible for session management, data re-transmission, flow control, etc.
- the process layer 140 is responsible for user-level functions such as mail delivery, file transfer, remote login, etc.
- Layer 1 network access layer
- Layer 4 process layer
- FIG. 2 illustrates an abstract networking model promulgated by the International Organization for Standardization (ISO). This model is also referred to as the basic reference model or the 7-layer model 200 of the Open Systems Interconnection (OSI) network. Layers 210-230 are referred to as the “lower layers”. Layers 240-270 are referred to as the “upper layers”. The lower layers are concerned with moving packets of data from a source to a destination. The upper layers
- the physical layer 210 describes the physical properties of the communications media, as well as how the communicated signals should be interpreted.
- the data link layer 220 describes the logical organization (e.g., framing, addressing, etc.) of data transmitted on the media.
- the data link layer, for example, handles frame synchronization.
- the network layer 230 defines the addressing and routing structure of the network. More generally, the network layer defines how data can be delivered between any two nodes in the network. Routing, forwarding, addressing, error handling, and packet sequencing are handled at this layer. This layer is responsible for establishing the virtual circuits when communicating between nodes of the network.
- the transport layer 240 is responsible for end-to-end communication of the data between hosts or nodes.
- the transport layer for example, performs a sequence check to ensure that all the packets associated with a file have been received.
- the session layer 250 establishes, manages, and terminates connections between applications. The session layer functions are often incorporated into another layer for implementation.
- the presentation layer 260 describes the syntax of data being communicated.
- the presentation layer aids in the exchange of data between the application and the network. Where necessary, the data is translated to the syntax needed by the destination. Conversions between different floating point formats as well as encryption and decryption are handled by the presentation layer.
- the application layer 270 identifies the hosts to be communicated with, user authentication, data syntax, quality of service, users, etc.
- the types of operations handled by the application layer include execution of remote jobs and opening, writing, reading, and closing files.
- Protocol layers may be defined in other ways. Moreover, the protocol layers do not need to correspond to distinct layers in the hardware hierarchy. Implementation of a layer may be distributed across multiple levels in a hardware hierarchy. Alternatively, a single hardware element might handle more than one layer of the stack.
- FIG. 3 illustrates one embodiment of an apparatus for implementing a layered protocol for a communications network.
- the apparatus may be one node 300 of a larger communications network.
- node 300 is a router.
- Node 300 includes a hierarchy of elements for implementing the various protocol layers. There is not necessarily a one-to-one correspondence between layers and elements handling those layers. Thus, for example, element 330 handles Layers A and B, while element 310 handles Layer C and provides the interface to the physical media which connects apparatus 300 with other network nodes.
- the letter “A” indicates the lowest level in the layered protocol.
- the apparatus of FIG. 3 includes redundant elements as well as non-redundant elements. Elements 310, 320 represent redundant elements. One of the elements is in a standby mode while the other is active.
- the apparatus provides fail-over capabilities so that the standby processor can assume active status and responsibility for the services provided by the former active processor. In such a case, the formerly active processor is placed into a standby mode or a disabled mode until the event that caused the fail-over is resolved.
- Elements 330 - 360 provide the interface to the physical media carrying the communications.
- elements 330 - 360 are referred to as line cards. Although multiple (n) line cards 330 - 360 are illustrated, the line cards are not provided with redundancies in this embodiment.
- elements 330 - 360 might be referred to as “data plane” elements while elements 310 and 320 are referred to as “control plane” elements.
- the data plane examines the destination address or label and sends the packet in the direction and manner specified by a routing table.
- the control plane describes the entities and processes that update the routing tables.
- elements 310 and 320 may include some data plane functions or associated hardware such as a switch matrix.
- elements 330 - 360 may include some aspects of a control plane.
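The split between the two planes can be illustrated with a minimal sketch. It assumes a routing table that is a simple mapping from a destination label to an outgoing port; the names `RoutingTable`, `forward`, and `update_route` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RoutingTable:
    # Destination label -> outgoing port. Labels and ports are illustrative.
    routes: Dict[str, int] = field(default_factory=dict)


def forward(table: RoutingTable, label: str) -> Optional[int]:
    """Data plane: read-only lookup of the outgoing port for a label."""
    return table.routes.get(label)


def update_route(table: RoutingTable, label: str, port: int) -> None:
    """Control plane: install or change the entry the data plane reads."""
    table.routes[label] = port


table = RoutingTable()
update_route(table, "label-17", 3)   # control plane writes the table
port = forward(table, "label-17")    # data plane only reads it
```

Because `forward` only reads the table, it keeps working with the existing entries while whatever calls `update_route` is temporarily unavailable, which is the property the upgrade procedure relies on.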
- Processors 314 or 324 may be responsible, for example, for modifying or updating routing tables utilized by the processors of elements 330 - 360 .
- Lower level processors such as processor 334 are responsible for configuring even lower-level hardware such as hardware 336 .
- Hardware 336 might be a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC), for example.
- Each processor is associated with processor-executable instructions that determine the implementation of a particular protocol layer by that processor.
- The processor-executable instructions may be embodied as “software” or “firmware” depending upon the storage medium or the method used to access these instructions.
- Hereinafter, “software” refers to processor-executable instructions regardless of the storage medium or the method of access, unless indicated otherwise.
- the network component must be upgraded to handle new protocols, expansions to existing protocols, or new or changed features.
- hardware upgrades i.e., replacement of processors
- the component can be upgraded through software upgrades.
- Although different versions of software 312, 322, 332 may reside in the storage medium associated with a particular processor 314, 324, and 334, respectively, an upgrade or change is not effective until the processor has loaded and is executing the desired version. Thus mere storage of a particular version is not sufficient to effect an upgrade or modification.
- the processors must be reset or re-booted to load a different version of the software.
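The difference between a stored version and an executing version can be sketched as follows. This is an illustration only: the `Processor` class, its fields, and the way the boot vector is modelled as a plain version string are assumptions, not the apparatus described in the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    stored_versions: set = field(default_factory=set)
    boot_vector: str = ""       # version selected for the next reset
    running_version: str = ""   # version currently executing

    def download(self, version: str) -> None:
        # Storing a version does not change what is executing.
        self.stored_versions.add(version)

    def reset(self) -> None:
        # Only a reset loads the version the boot vector points at.
        if self.boot_vector not in self.stored_versions:
            raise ValueError("boot vector points at a version that is not stored")
        self.running_version = self.boot_vector


cpu = Processor(stored_versions={"4.1"}, boot_vector="4.1", running_version="4.1")
cpu.download("5.0")
version_after_download = cpu.running_version   # still the starting version
cpu.boot_vector = "5.0"
cpu.reset()
```

Keeping the starting version in `stored_versions` is also what makes it possible to point the boot vector back at it if the upgrade has to be undone.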
- One approach is to upgrade the software of all the processors at the same time. Although this can minimize the total amount of time required for the upgrade, this approach is also likely to render the entire apparatus effectively nonfunctional throughout the entire upgrade process thus incurring a large penalty as a result of unavailability.
- An alternative approach staggers the upgrades across the hierarchical levels. This approach requires more time to upgrade all the software; however, much of the functionality of the apparatus is preserved throughout the upgrade process. In particular, the functioning of an individual layer is substantially preserved while upgrading the software associated with higher protocol layers.
- a layer is transferred from the processor normally handling that layer to a processor at a different hierarchical level in order to preserve some, if not all, of the functionality of the transferred layer during the upgrade of the software associated with the normal processor.
- the data traffic “status quo” should be preserved while upgrading the software.
- the appropriate version of target software is downloaded for each processor.
- the software may be stored in volatile memory or non-volatile memory.
- the target version software is downloaded to a random access memory local to the associated processor.
- the software required for processors at the same hierarchy level will be the same.
- the software required for a processor at one level is not, however, typically the same as the software required for a processor at a different level because of the different functions performed at the different levels.
- the downloading process does not impact data traffic.
- FIGS. 4-11 illustrate this upgrade process graphically for upgrading a node 400 from a starting version (4.1) to a target version (5.0) of software.
- FIG. 4 illustrates the version status of software stored and used by the elements after first downloading the target version. After the download, both the starting version and the target version of software are present for each element.
- software 412 , 422 includes version 4.1 and 5.0 appropriate for processors 414 and 424 .
- elements 430 - 460 have versions 4.1 and 5.0 of the software 432 appropriate for the respective processor 434 .
- the hardware associated with some layers such as the Layer A hardware 436 may only require the re-programming of registers with new values to implement the desired changes for that layer.
- the active element 410 controls the upgrade process until the point at which element 410 must be upgraded.
- the software 522 associated with the standby element 520 is updated first. This is accomplished by performing a reset of processor 524 with the boot vector directed to the target version of the software. After the reset, standby element 520 is executing the target version of the software. The standby element attempts to synchronize with the active element. The standby element retrieves configuration information and checkpoint data from the active element for synchronization. The standby element stores information using the updated version of any database as dictated by the target version of the software. This update has no impact on lower level layers handling data traffic such as the Layer A hardware 536 for elements 530-560.
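The synchronization step can be sketched as a copy from the active element followed by a conversion into the format the target version expects. Everything below is a hypothetical illustration: the `Element` class, the `schema` field, and `convert_record` stand in for whatever database formats the actual software versions use.

```python
def convert_record(record: dict, target_schema: int) -> dict:
    """Hypothetical conversion: copy the record and tag it with the target schema."""
    converted = dict(record)
    converted["schema"] = target_schema
    return converted


class Element:
    def __init__(self, name: str, version: str, schema: int):
        self.name, self.version, self.schema = name, version, schema
        self.config: dict = {}
        self.checkpoint: dict = {}

    def synchronize_from(self, active: "Element") -> None:
        # Retrieve configuration and checkpoint data from the active element
        # and store it in the format dictated by this element's own version.
        self.config = convert_record(active.config, self.schema)
        self.checkpoint = convert_record(active.checkpoint, self.schema)


active = Element("510", "4.1", schema=1)
active.config = {"schema": 1, "interfaces": ["if-0", "if-1"]}
active.checkpoint = {"schema": 1, "sessions": 12}

standby = Element("520", "5.0", schema=2)   # already reset to the target version
standby.synchronize_from(active)
```

The active element's own records are left untouched, so it can keep running the starting version until its turn to be upgraded.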
- fail-over mechanisms can be used to update the active elements.
- the active/standby status of the two elements 510 , 520 can be swapped and a reset can be performed on processor 514 similar to that previously performed on processor 524 .
- the upgrade process proceeds to update lower levels before completely updating the current level.
- the apparatus 500 may return to either the starting version or the target version of the software depending upon when the failure occurred.
- Layer B might provide, for example, “keep alive”, “hello” or other connection maintenance functionality such as that found in layer 3 of the OSI model.
- Such connection maintenance functionality may be required to support various protocols and connections including the Intermediate System-to-Intermediate System (IS-IS) and Open Shortest Path First (OSPF) routing protocols, label switch paths (LSP), etc. If this functionality is absent, one or more connections or sessions will be terminated despite the ability of lower level layers to otherwise continue to forward packets.
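A small worked example shows why the absence of "hello" messages ends a session even though packets could still be forwarded. The numbers are assumptions chosen for illustration (a 3 second hello interval, a 9 second hold time, and a processor that is silent from t = 10 s to t = 40 s); they are not taken from the patent or from any particular protocol configuration.

```python
def session_survives(hello_times, hold_time, end_time):
    """True if no silence between hellos (or before end_time) exceeds hold_time."""
    last_heard = 0.0
    for t in sorted(hello_times):
        if t - last_heard > hold_time:
            return False
        last_heard = t
    return end_time - last_heard <= hold_time


HELLO_INTERVAL = 3.0                 # assumed seconds between hellos
HOLD_TIME = 9.0                      # assumed silence a peer tolerates
RESET_START, RESET_END = 10.0, 40.0  # assumed window in which the processor is down
END = 60.0

all_hellos = [i * HELLO_INTERVAL for i in range(1, 21)]   # 3.0, 6.0, ..., 60.0
hellos_with_gap = [t for t in all_hellos if not (RESET_START <= t < RESET_END)]

# No other processor takes over: the peer hears nothing between 9 s and 42 s.
dropped = not session_survives(hellos_with_gap, HOLD_TIME, END)
# Another processor keeps sending hellos during the reset.
kept = session_survives(all_hellos, HOLD_TIME, END)
```

With these values the silent window is 33 seconds against a 9 second hold time, so the session is lost unless the hello function is handled elsewhere during the reset.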
- Layer B is moved from the processor 634 at one hierarchical level to a processor 614 at a higher hierarchical level. The layer is thus moved to another processor for handling.
- Processor 614 reads the connection data from elements 630 - 660 prior to the transfer. Connection data includes both the static configuration information such as the types of interfaces as well as the dynamic state information regarding the protocols executing on those interfaces.
- Layer B is then transferred from the processors 634 of elements 630 - 660 to processor 614 .
- Processor 614 of active element 610 executes program code supporting Layer B functionality with the initial conditions established by the connection and configuration information read from elements 630 - 660 . This is equivalent to moving the control plane from one processor to another processor at a different location in the processor hierarchy.
- Once Layer B functionality is transferred, a reset is performed on the processors 634 normally associated with Layer B processing.
- the boot vector is directed to the target version of the software. This activity does not disrupt the data traffic handled by the Layer A hardware of elements 630 - 660 .
- FIG. 7 illustrates an alternative embodiment in which the Layer B functionality for node 700 is transferred from one processor 734 to another processor 764 at the same location in the processor hierarchy.
- Processor 764 is not a dedicated redundant resource nor is element 760 redundant to 730 .
- the Layer A hardware 736 of element 730 for example continues to function while relying on a different processor 764 for its Layer B functionality.
- Clearly not all of the processors 734 can be upgraded at the same time.
- the software for all but one of the processors is upgraded at the same time.
- only the software associated with a single processor is upgraded at one time.
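The two scheduling options for a same-level transfer can be sketched as a planning function. This is an assumption-laden illustration: `plan_rounds`, the batch size parameter, and the processor names are hypothetical, and the only rule encoded is that each round needs at least one processor outside the batch to host the transferred layer.

```python
def plan_rounds(processors, batch_size):
    """Return (host, batch) pairs; the host is never part of the batch it covers."""
    if not 1 <= batch_size < len(processors):
        raise ValueError("at least one processor must stay available as host")
    rounds = []
    pending = list(processors)
    while pending:
        batch, pending = pending[:batch_size], pending[batch_size:]
        host = next(p for p in processors if p not in batch)
        rounds.append((host, batch))
    return rounds


cards = ["734-a", "734-b", "734-c", "764"]   # hypothetical processor names

all_but_one = plan_rounds(cards, batch_size=3)   # two rounds
one_at_a_time = plan_rounds(cards, batch_size=1) # one round per processor
```

Upgrading all but one processor at once finishes in two rounds but loads a single host with every transferred layer instance; upgrading one at a time spreads the work over as many rounds as there are processors.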
- FIG. 8 illustrates the node 800 after the reset.
- Processors 834 of elements 830 - 860 are executing the target version (5.0) of the software.
- Processors 834 of elements 830 - 860 then retrieve the connection data associated with Layer B from either the hierarchically higher processor 814 of active element 810 or the processor 864 residing at the same location in the processor hierarchy depending upon where the Layer B functionality was previously transferred.
- the Layer A hardware must be updated to support the various protocol changes resulting from the software update.
- Reconfiguration of the Layer A hardware necessarily disrupts the traffic handled by the Layer A hardware. However, the reconfiguration primarily entails writing values to registers of low-level hardware such as ASICs. Instead of disrupting Layer A functionality throughout the upgrade of the node, Layer A functionality is disrupted only for the relatively short period of time required to reconfigure the low-level hardware. In contrast to the update procedure for the higher level processors, reconfiguration of low-level hardware such as ASICs takes on the order of fractions of a second to seconds.
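The idea that traffic is disturbed only while registers are being rewritten can be sketched with a toy model. The `LayerAHardware` class, the register addresses, and the event log are hypothetical; real ASIC or FPGA programming is device specific.

```python
class LayerAHardware:
    """Toy model of low-level hardware configured through registers."""

    def __init__(self, registers):
        self.registers = dict(registers)
        self.forwarding = True
        self.events = []

    def reconfigure(self, new_values):
        # Traffic is interrupted only for the duration of the register writes.
        self.forwarding = False
        self.events.append("traffic paused")
        for address, value in new_values.items():
            self.registers[address] = value
            self.events.append(f"wrote {address:#06x}")
        self.forwarding = True
        self.events.append("traffic resumed")


hw = LayerAHardware({0x0010: 0x01, 0x0014: 0x00})
hw.reconfigure({0x0014: 0x7F, 0x0018: 0x02})
```

The processor reset that precedes this step is not modelled here because, per the description, it does not touch the forwarding path.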
- FIG. 9 illustrates reconfiguring the Layer A hardware 936 of elements 930 - 960 for node 900 .
- Processors 934 configure their respective Layer A hardware 936 to support the functionality determined by the software upgrade. Following the re-configuration of the Layer A hardware, the transfer of Layer B functionality back to the processors of elements 930 - 960 is completed.
- the processors 934 of elements 930 - 960 begin executing Layer B program code using the retrieved connection data.
- Processors 934 of elements 930-960 handle the control plane for the Layer A hardware 936. Thus the control plane is restored to the elements normally associated with Layer B functionality.
- software 912 can be updated using typical fail-over mechanisms to avoid disruption.
- the active and standby status of elements 1010 , 1020 is swapped such that element 1010 is now in standby mode and element 1020 is the active element. Active element 1020 assumes control for the remainder of the upgrade process.
- FIG. 11 illustrates the result of a reset of processor 1114 using a boot vector pointing to the target version of the software 1112 .
- processor 1114 is executing the target version of the software.
- Standby element 1110 then retrieves configuration and checkpoint information from active element 1120 in order to synchronize with active element 1120 .
- the upgrade of the software at this level of the hierarchy does not disrupt the data traffic handled by the Layer A hardware 1136 .
- the static component of the Layer B connection data (i.e., the configuration data) is not permitted to change throughout the upgrade of the software associated with Layer B. For a router, this could imply that alarms, requests to establish/terminate connections, and routing table updates/modifications are ignored.
- Network components external to node 1100 may terminate connections, for example, but the termination will not be recognized by node 1100 until the upgrade has completed and the termination has been subsequently detected by node 1100 .
- the layered protocols are typically robust and they permit node 1100 to re-detect conditions that were ignored during the upgrade process in the event that such conditions were not resolved prior to the completion of the software upgrade.
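The behaviour of ignoring changes during the upgrade and re-detecting them afterwards can be sketched as follows. The `Node` class, the `isolated` flag, and the connection names are hypothetical and only illustrate the sequence of events described above.

```python
class Node:
    def __init__(self, connections):
        self.connections = set(connections)
        self.isolated = False

    def handle_terminate_request(self, connection):
        """Return True if the request was applied, False if it was ignored."""
        if self.isolated:
            return False
        self.connections.discard(connection)
        return True

    def redetect(self, still_up_at_peers):
        """After the upgrade, drop connections that peers no longer hold."""
        self.connections &= set(still_up_at_peers)


node = Node({"lsp-1", "lsp-2"})
node.isolated = True                                  # upgrade in progress
applied = node.handle_terminate_request("lsp-2")      # ignored for now
during_upgrade = set(node.connections)
node.isolated = False                                 # upgrade finished
node.redetect({"lsp-1"})                              # termination is noticed now
```

The node's view is stale while isolated, but it converges once the condition is detected again after the upgrade.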
- the upgrade process is performed in two phases: a preparation phase and an execution phase as indicated in FIG. 12 .
- the preparation phase is performed in step 1210 . If problems are discovered in the preparation phase as determined by step 1220 , the upgrade to the target version is terminated in step 1230 . Otherwise, the upgrade process continues with the execution phase in step 1240 . If no problems occur during the execution phase, the process is completed with step 1290 .
- the upgrade process may either be “unwound” to the starting version of the software or alternately catastrophic failure mechanisms may be used to complete the upgrade to the target version of the software.
- If the problem occurs after entering the isolation mode, the upgrade process is terminated and catastrophic failover mechanisms are used to upgrade the software to the target version in step 1254. If the problem occurs prior to entering the isolation mode, then the upgrade process is “unwound” to the starting version of the software in step 1260.
- the isolation mode is a mode that prevents the node from accommodating externally requested configuration changes.
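The decision flow of FIG. 12 can be summarized in a short sketch. The function name and the string outcomes are hypothetical; the mapping of a failure after entering isolation mode to step 1254 is inferred from the contrast with the "prior to entering the isolation mode" case and should be read as an assumption.

```python
def upgrade_outcome(preparation_ok, execution_failure=None):
    """execution_failure is None, "before_isolation", or "after_isolation"."""
    if not preparation_ok:
        return "terminated (step 1230)"
    if execution_failure is None:
        return "completed (step 1290)"
    if execution_failure == "before_isolation":
        return "unwound to starting version (step 1260)"
    if execution_failure == "after_isolation":
        # Assumed reading: past this point the node is forced forward.
        return "forced to target version (step 1254)"
    raise ValueError(f"unknown failure point: {execution_failure!r}")


outcomes = [
    upgrade_outcome(False),
    upgrade_outcome(True),
    upgrade_outcome(True, "before_isolation"),
    upgrade_outcome(True, "after_isolation"),
]
```

Entering isolation mode acts as the point of no return: before it the node goes back to the starting version, after it the node goes forward to the target version.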
- FIG. 13 illustrates one embodiment of the preparation phase.
- the target version of the software is downloaded to memory for each processor in the element hierarchy that needs to have its associated software upgraded.
- the starting version may be preserved to enable restoration to the starting version of the software in the event of a failure in the upgrade process.
- In step 1320, the node is checked to ensure that all elements are functioning properly.
- the preparation phase cannot complete successfully unless all elements have full operational functionality.
- the determination of operational functionality might include checking whether the node has operational redundancy, whether all elements are working, and whether any element is in a transitional state (e.g., being reset, updated, etc.).
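The three checks named above (redundancy, working elements, no transitional states) can be sketched as a single function that returns the list of problems found. `ElementStatus` and `preparation_problems` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class ElementStatus:
    name: str
    working: bool = True
    transitional: bool = False   # e.g. being reset or updated


def preparation_problems(elements, has_redundancy):
    """Return a list of reasons the preparation phase cannot complete."""
    problems = []
    if not has_redundancy:
        problems.append("node lacks operational redundancy")
    for element in elements:
        if not element.working:
            problems.append(f"{element.name} is not working")
        if element.transitional:
            problems.append(f"{element.name} is in a transitional state")
    return problems


healthy = [ElementStatus("410"), ElementStatus("420"), ElementStatus("430")]
mid_reset = [ElementStatus("410"), ElementStatus("430", transitional=True)]

ok = preparation_problems(healthy, has_redundancy=True)
blocked = preparation_problems(mid_reset, has_redundancy=True)
```

An empty list corresponds to the case in which the process may continue to the execution phase.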
- FIG. 14 illustrates one embodiment of the beginning of the execution phase.
- a standby element of a redundant plurality of elements is upgraded to a target version of software. In one embodiment, this is accomplished by performing the reset previously described.
- the standby element retrieves configuration and checkpoint data from an active element of the redundant plurality of elements. The standby element performs any necessary data conversions required to bring the retrieved data into conformance with the formats dictated by the target version of the software. At this point, the node no longer has redundancy protection.
- the node is placed into isolation mode in step 1430 to prevent configuration changes.
- For a router, for example, alarms, requests to establish/terminate connections, and routing table modifications are ignored.
- the software for lower level processors may also be upgraded. As previously indicated, however, layer functionality must be preserved throughout the upgrade. In order to preserve layer functionality, the associated control plane is transferred from a processor at one level of the element hierarchy to a processor at the same level or another level of the element hierarchy as indicated in FIG. 15 .
- a control plane is transferred from at least one first processor handling a first layer to a second processor handling a second layer in step 1510 . This is equivalent to transferring the layer or layer portion handled by the first processor to the second processor handling another layer or layer portion.
- the node may have a single first processor or n first processors such as the processors 434 associated with each of elements 430 - 460 .
- the first and second processors are located at different levels of the element hierarchy. Effectively the layer or portion of a layer handled by a first processor is transferred to a second processor at another level of the hierarchy.
- all the processors (e.g. 434 ) handling the first layer or first layer portion prior to the transfer can have a software upgrade at substantially the same time.
- the redundancy approach requires swapping the roles of active and standby components such that upgrades for all elements at the same level cannot occur substantially simultaneously.
- a control plane is transferred from at least one first processor handling an associated first layer to a second processor handling an associated first layer in step 1512 .
- This is equivalent to transferring the layer or layer portion handled by the first processor to a second processor handling another instance of the same layer or layer portion.
- the node may have a single first processor or n first processors such as the processors 434 associated with each of elements 430 - 460 .
- the first and second processors are located at the same level of the element hierarchy. Effectively the layer or portion of a layer handled by a first processor is transferred to a second processor at the same level of the hierarchy. In contrast to the redundancy approach, the second processor is not duplicative or redundant. Prior to transfer of the control plane, the second processor is handling its own instance of the same layer or layer portion.
- In step 1520, the software associated with the at least one first processor is upgraded. This may be accomplished by using a soft reset to force the first processor(s) to load the target version of the software as previously described. This upgrade does not impact data traffic handled by lower level layers.
- In step 1530, the lower level layer hardware associated with the first processor is re-configured. This re-configuration disrupts the data traffic handled by the lower level layer hardware.
- In step 1540, the control plane is transferred back to the at least one first processor.
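- The sequence of FIG. 15 (step 1510 or 1512 through step 1540) can be sketched as follows. This is a minimal illustration only: the `Processor` class, the layer-instance labels, and the function name are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    """Hypothetical stand-in for a processor in the element hierarchy."""
    name: str
    version: str
    layers: set = field(default_factory=set)  # layer instances currently handled


def upgrade_first_processor(first, second, instance, target_version):
    """Walk the FIG. 15 sequence and return the ordered list of steps taken."""
    steps = []
    # Step 1510 / 1512: the second processor takes over the control plane.
    first.layers.discard(instance)
    second.layers.add(instance)
    steps.append("transfer control plane to second processor")
    # Step 1520: the first processor loads the target version of the software.
    first.version = target_version
    steps.append("upgrade first processor software")
    # Step 1530: re-configure the lower level layer hardware. In the text this
    # is the only step that disrupts data traffic; here it is only recorded.
    steps.append("re-configure lower level layer hardware")
    # Step 1540: the control plane returns to the first processor.
    second.layers.discard(instance)
    first.layers.add(instance)
    steps.append("transfer control plane back to first processor")
    return steps


line_card = Processor("processor 434", "4.1", {"Layer B (element 430)"})
controller = Processor("processor 414", "5.0", {"Layer C"})
steps = upgrade_first_processor(line_card, controller, "Layer B (element 430)", "5.0")
```

Tracking each layer instance by its own label lets the same sketch cover the same-level case of step 1512, where the second processor keeps handling its own instance of the layer throughout.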
- FIG. 16 illustrates the transfer of the control plane or layer functionality in greater detail.
- a first processor handling a first layer provides connection data (i.e., the static configuration and dynamic state) to a second processor.
- the second processor is either handling a second layer or another instance of the first layer.
- the first processor terminates handling first layer functions.
- the second processor initiates handling of the first layer functions previously associated with the first processor in step 1630 .
- a first layer being handled by the first processor is transferred to a second processor.
- the software upgrade for the first processor is performed in step 1640 .
- the second processor is handling first layer functionality. This might include, for example, “hello”, “keep alive”, or other functionality required to preserve the status quo with respect to other nodes in the communications network.
- the first processor retrieves the connection data from the second processor in step 1650 .
- the lower level hardware associated with the first processor is re-configured in step 1660 .
- the second processor terminates handling first layer functions in step 1670 .
- the first processor initiates handling first layer functions in step 1680 using the connection data. This is equivalent to transferring the first layer being handled by the second processor back to the first processor for handling.
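- The hand-over of connection data described for FIG. 16 might look like the sketch below. The `LayerHandler` class, the dictionary layout of the connection data, and the `hellos_sent` counter are assumptions made for illustration; the specification only states that connection data comprises static configuration and dynamic state.

```python
import copy


class LayerHandler:
    """Hypothetical processor-side handler for one protocol layer."""

    def __init__(self, name):
        self.name = name
        self.connection_data = None  # static configuration + dynamic state
        self.active = False

    def keep_alive(self):
        # Example of first layer work that must continue during the upgrade.
        if not self.active:
            raise RuntimeError(f"{self.name} is not handling the layer")
        self.connection_data["dynamic_state"]["hellos_sent"] += 1


def hand_over(source, target):
    """Copy connection data to the target, then switch which side is active."""
    target.connection_data = copy.deepcopy(source.connection_data)
    source.active = False  # source terminates handling of the layer
    target.active = True   # target initiates handling of the layer


first = LayerHandler("first processor")
second = LayerHandler("second processor")
first.connection_data = {
    "static_config": {"interface_type": "ethernet"},
    "dynamic_state": {"hellos_sent": 10},
}
first.active = True

hand_over(first, second)  # transfer to the second processor (through step 1630)
second.keep_alive()       # second processor covers while step 1640 runs
hand_over(second, first)  # steps 1650, 1670 and 1680 (step 1660 omitted here)
```

Copying the data rather than sharing it mirrors the text, in which the first processor retrieves the connection data again after its upgrade, including any dynamic state that changed in the meantime.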
- the re-configuration of the low level hardware is typically required in order to support the protocol modifications at the data traffic layer.
- the connection data preserved throughout the upgrade of the control plane for the low level hardware must be re-mapped or otherwise modified to ensure compatibility with the upgraded versions of the protocols instituted by the software upgrade.
- FIG. 17 illustrates one embodiment of re-configuring the low-level hardware.
- a first version of connection data compatible with a first version of a layer is mapped to a second version of connection data compatible with a second version of a layer.
- the connection data includes static configuration data as well as dynamic state data.
- the low-level layer hardware is re-configured in accordance with the second version of connection data. This might entail, for example, writing values to a number of registers.
- This re-configuration disrupts the data traffic handled by the low-level hardware, but writing values to the registers takes on the order of fractions of a second to seconds, a period short enough to avoid causing other nodes in the communications network to take corrective action such as re-routing communications around the node being updated.
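- A minimal sketch of the FIG. 17 approach follows: connection data is mapped from the first version to the second, and every resulting value is written to the hardware. The field names, the `rename` and `defaults` arguments, and the use of a dictionary as a register file are hypothetical.

```python
def map_connection_data(v1_data, rename, defaults):
    """Map version-1 connection data to the version-2 layout.

    `rename` maps old field names to new ones; `defaults` supplies fields
    that exist only in version 2. Both are illustrative assumptions.
    """
    v2_data = dict(defaults)
    for old_name, value in v1_data.items():
        v2_data[rename.get(old_name, old_name)] = value
    return v2_data


def reconfigure_all(registers, v2_data):
    """Write every value to the hardware registers; return the write count."""
    writes = 0
    for register, value in v2_data.items():
        registers[register] = value
        writes += 1
    return writes


v1 = {"port_mode": 1, "vc_table_base": 0x100}
v2 = map_connection_data(
    v1,
    rename={"vc_table_base": "lsp_table_base"},
    defaults={"qos_class": 0},
)
hardware = {}  # stands in for the low-level layer hardware registers
write_count = reconfigure_all(hardware, v2)
```

Here the number of writes always equals the number of fields in the second version of the connection data, whether or not a given register value actually changed.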
- An alternative approach to re-configuring the low-level hardware can potentially decrease the amount of time needed for re-configuration by reducing the number of write operations required.
- the aforementioned re-mapping operation does not necessarily result in a change in value for every register of the low-level layer hardware.
- the number of write operations might be significantly reduced if values are written only to the registers that have changed values.
- FIG. 18 illustrates one embodiment of the alternative approach to re-configuring the low-level hardware.
- a first version of connection data compatible with a first version of a layer is mapped to a second version of connection data compatible with a second version of the layer.
- the first and second versions of the layer refer to the pre- and post-upgrade versions of the layer.
- a read operation is performed to retrieve the current version of the connection data from the low-level layer hardware in step 1820 .
- the current connection data is compared to the second version of the connection data to identify a difference (DIFF) version of the connection data in step 1830 .
- the DIFF version identifies only the registers whose values change and what the new values should be, that is, only the locations that actually require a change.
- the low-level hardware is then re-configured in accordance with the difference version of the connection data in step 1840 .
- the difference version can potentially decrease the amount of time that the data traffic is disrupted by eliminating the time spent writing to registers that do not require changes.
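- The difference-based approach of FIG. 18 can be illustrated with the short sketch below. Register names and values are invented for the example, and a dictionary again stands in for the hardware register file.

```python
def diff_connection_data(current, target):
    """Return only the registers whose target value differs from the current one."""
    return {reg: val for reg, val in target.items() if current.get(reg) != val}


def apply_diff(registers, diff):
    """Write just the changed registers; return the number of writes."""
    for reg, val in diff.items():
        registers[reg] = val
    return len(diff)


# Current connection data as read back from the hardware (step 1820).
current = {"r0": 0x10, "r1": 0x20, "r2": 0x30, "r3": 0x40}
# Second version of the connection data produced by the mapping.
target = {"r0": 0x10, "r1": 0x21, "r2": 0x30, "r3": 0x44}

diff = diff_connection_data(current, target)  # step 1830
writes = apply_diff(current, diff)            # step 1840
```

In this example two of the four registers are written instead of all four, which is the saving in write operations that the alternative approach is aiming for.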
- the remaining elements of the redundant plurality of elements may now be upgraded as indicated in FIG. 19 .
- the upgrade process has been controlled by the active element of the redundant plurality of elements.
- a first selected active element swaps active/standby status with a second selected standby element in step 1910 .
- the first selected element is now a standby element and the second selected element is now the active element.
- the second selected element is now responsible for controlling the remainder of the upgrade process.
- the first selected element is upgraded to a target version of the software in step 1920 . This may be accomplished, for example, by performing a reset of the processor with a boot vector directed to the target version of the software.
- the node exits the isolation mode in step 1930 to enable configuration changes.
- the first selected element retrieves configuration and checkpoint data from the second selected element. At this point the redundant plurality of elements are synchronized and capable of providing redundancy protection.
- step 1940 is performed prior to step 1930 to ensure redundancy before exiting the isolation mode.
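- The completion phase of FIG. 19 can be sketched as an ordered list of steps. The `Element` class and the `sync_before_exit` flag are hypothetical, and the sketch assumes that step 1940 is the retrieval of configuration and checkpoint data, which the text implies but does not state outright.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical stand-in for one element of the redundant plurality."""
    name: str
    role: str  # "active" or "standby"
    version: str
    synchronized: bool = False


def finish_upgrade(first, second, target_version, sync_before_exit=False):
    """Return the ordered step numbers for completing the upgrade."""
    steps = []
    # Step 1910: swap roles so the already-upgraded element takes control.
    first.role, second.role = second.role, first.role
    steps.append(1910)
    # Step 1920: reset the former active element into the target version.
    first.version = target_version
    steps.append(1920)
    # Step 1930 exits isolation mode; step 1940 (assumed) synchronizes the pair.
    order = [1940, 1930] if sync_before_exit else [1930, 1940]
    for step in order:
        if step == 1940:
            first.synchronized = True  # configuration and checkpoint data retrieved
        steps.append(step)
    return steps


element_a = Element("first selected element", "active", "4.1")
element_b = Element("second selected element", "standby", "5.0")
default_order = finish_upgrade(element_a, element_b, "5.0")
```

Passing `sync_before_exit=True` models the variant in which redundancy is restored before the node leaves isolation mode.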
Abstract
A method of modifying a layered protocol communication apparatus includes transferring a control plane from a first processor handling a first layer to a second processor handling a second layer.
Description
- This application claims the benefit of U.S. Provisional Application No. 60/778,437 filed on Mar. 2, 2006.
- This invention relates to the field of communications. In particular, this invention is drawn to methods and apparatus for modifying a layered protocol communication apparatus including software modifications associated with different levels of the layered protocol communication apparatus.
- Communication networks are used to carry a wide variety of data. Typically, a communication network includes a number of interconnected nodes. Communication between source and destination is accomplished by routing data from a source through the communication network to a destination. Such a network, for example, might carry voice communications, financial transaction data, real-time data, etc., not all of which require the same level of performance from the network.
- One metric for rating a communication network is the availability of the network. The network might be used, for example, to communicate data associated with different classes of service such as “first available”, business data, priority data, or real-time data which place different constraints on the requirements for the delivery of the data including the timeframe within which it will be delivered.
- Disruption to the network can be very costly. The revenue stream for many businesses is highly dependent upon the availability of the network. The network service provider frequently is under contract to guarantee certain levels of availability to customers and may incur significant financial liability in the event of disruption.
- In the interest of ensuring the continued availability of the network or the avoidance of an event that might lead to catastrophic disruption, maintenance is performed on the nodes. Maintenance may also be required to ensure that the nodes support various communication protocols as they evolve over time.
- The maintenance process itself can contribute to disruption of network availability. One type of maintenance is a software upgrade. Although nodes with redundant capabilities may avoid the disruption of traffic during the upgrade, providing such redundancies for every node may be either financially or operationally impractical.
- Non-redundant elements in the upgrade path represent a significant risk to uninterrupted traffic flow. One approach for performing a software upgrade on non-redundant elements is to physically remove modules with the dated software and replace them with modules for which the software has been updated. This undesirably disrupts all traffic being handled by the module prior to removal.
- A method of modifying a layered protocol communication apparatus includes transferring a control plane from a first processor handling a first layer to a second processor handling a second layer.
- In one embodiment software associated with the first processor is modified prior to transferring the control plane from the second processor back to the first processor for handling.
- Another method of modifying a layered protocol communication apparatus includes transferring a first layer handled by a first processor to a second processor handling a second layer.
- In one embodiment software associated with the first processor is modified prior to transferring the first layer from the second processor back to the first processor for handling.
- The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements and in which:
- FIG. 1 illustrates one embodiment of a layered protocol model for a communications network.
- FIG. 2 illustrates one embodiment of an alternative layered protocol model for a communications network.
- FIG. 3 illustrates one embodiment of a communications network component implementing a layered protocol.
- FIG. 4 illustrates a software download status prior to performing an upgrade of the software for one element associated with an upper level layer of a layered protocol communication apparatus.
- FIG. 5 illustrates the layered protocol communication apparatus after the software upgrade of the element associated with the upper level layer.
- FIG. 6 illustrates transfer of layer functionality from processors at one hierarchical level to a processor at a higher hierarchical level.
- FIG. 7 illustrates transfer of layer functionality from processors at one hierarchical level to another processor at the same hierarchical level.
- FIG. 8 illustrates the apparatus after the software upgrade of the elements normally associated with the transferred layer.
- FIG. 9 illustrates the reconfiguration of layer hardware and the transfer of layer functionality to the processors normally associated with the layer.
- FIG. 10 illustrates the swap in active/standby status for redundant elements at a higher level.
- FIG. 11 illustrates the layered protocol communication apparatus after the software upgrade of another higher level element.
- FIG. 12 illustrates one embodiment of a process of upgrading the software of a communications node.
- FIG. 13 illustrates one embodiment of a preparation phase of the software upgrade process.
- FIG. 14 illustrates one embodiment of the beginning of the execution phase of the software upgrade process.
- FIG. 15 illustrates one embodiment of transferring a control plane between processors at different levels or alternately at the same level of the element hierarchy.
- FIG. 16 illustrates one embodiment of transferring layer functionality between processors.
- FIG. 17 illustrates one embodiment of re-configuring low-level hardware handling the data traffic.
- FIG. 18 illustrates an alternative embodiment of re-configuring low-level hardware handling the data traffic.
- FIG. 19 illustrates one embodiment of the completion of the execution phase of the software upgrade process.
- Communication networks frequently rely on protocol layering to simplify network designs. Protocol layering entails dividing the network design into functional layers and assigning protocols for each layer's tasks. The layers represent levels of abstraction for performing functions such as data handling and connection management. Within each layer, one or more physical entities implement its functionality.
- For example, the functions of data delivery and connection management may be put into separate layers, and therefore separate protocols. Thus, one protocol is designed to perform data delivery, and another protocol performs connection management. The protocol for connection management is “layered” above the protocol handling data delivery. The data delivery protocol has no knowledge of connection management. Similarly, the connection management protocol is not concerned with data delivery. Abstraction through layering enables simplification of the various individual layers and protocols. The protocols can then be assembled into a useful whole. Protocol layering thus produces simple protocols, each with a few well-defined tasks. Individual protocols can also be removed, modified, or replaced as needed for particular applications.
- Implementation of a given functional layer may occur within a single element or be distributed across multiple elements. Generally, however, the layering corresponds to a hardware or software hierarchy of elements. Each layer interacts directly only with the layer immediately beneath it, and provides facilities for use by the layer above it. The protocols enable an entity in one host to interact with a corresponding entity at the same layer in a remote host.
- FIG. 1 illustrates one embodiment of a layered protocol design. This four-layer model 100 was promulgated by the Defense Advanced Research Projects Agency's (DARPA) Internetwork Project for the United States Department of Defense in the 1970s. The DARPA Internetwork Project is the forerunner of the modern day ubiquitous Internet.
- The network access layer 110 is responsible for dealing with the specific physical properties of the communications media. Different protocols may be used depending upon the type of physical network. The Internet layer 120 is responsible for source-to-destination routing of data across different physical networks.
- The host-to-host layer 130 establishes connections between hosts and is responsible for session management, data re-transmission, flow control, etc. The process layer 140 is responsible for user-level functions such as mail delivery, file transfer, remote login, etc.
- When traversing the layers or “stack” for a given model, the layers are typically numbered ascending from the bottom layer (i.e., Layer 1=network access layer) to the top layer (i.e., Layer 4=process layer). However, enumeration (e.g., numerical or alphabetical) is not intended to be limited to the reference from either the top or bottom unless the context demands it.
- FIG. 2 illustrates an abstract networking model promulgated by the International Standard Organization. This model is also referred to as the basic reference model or the 7-layer model 200 of the Open Systems Interconnection network. Layers 210-230 are referred to as the “lower layers”. Layers 240-270 are referred to as the “upper layers”. The lower layers are concerned with moving packets of data from a source to a destination. The upper layers
- The physical layer 210 describes the physical properties of the communications media, as well as how the communicated signals should be interpreted. The data link layer 220 describes the logical organization (e.g., framing, addressing, etc.) of data transmitted on the media. The data link layer for example, handles frame synchronization
- The network layer 230 defines the addressing and routing structure of the network. More generally, the network layer defines how data can be delivered between any two nodes in the network. Routing, forwarding, addressing, error handling, and packet sequencing are handled at this layer. This layer is responsible for establishing the virtual circuits when communicating between nodes of the network.
- The transport layer 240 is responsible for end-to-end communication of the data between hosts or nodes. The transport layer, for example, performs a sequence check to ensure that all the packets associated with a file have been received. The session layer 250 establishes, manages, and terminates connections between applications. The session layer functions are often incorporated into another layer for implementation.
- The presentation layer 260 describes the syntax of data being communicated. The presentation layer aids in the exchange of data between the application and the network. Where necessary, the data is translated to the syntax needed by the destination. Conversions between different floating point formats as well as encryption and decryption are handled by the presentation layer.
- The application layer 270 identifies the hosts to be communicated with, user authentication, data syntax, quality of service, users, etc. The types of operations handled by the application layer include execution of remote jobs and opening, writing, reading, and closing files.
- Different networks may define the protocol layers in other ways. Moreover, the protocol layers do not need to correspond to distinct layers in the hardware hierarchy. Implementation of a layer may be distributed across multiple levels in a hardware hierarchy. Alternatively, a single hardware element might handle more than one layer of the stack.
- FIG. 3 illustrates one embodiment of an apparatus for implementing a layered protocol for a communications network. The apparatus may be one node 300 of a larger communications network. In one embodiment, for example, node 300 is a router. Node 300 includes a hierarchy of elements for implementing the various protocol layers. There is not necessarily a one-to-one correspondence between layers and elements handling those layers. Thus for example, element 330 handles Layers A and B, while element 310 handles Layer C and provides the interface to the physical media which connects apparatus 300 with other network nodes. The letter “A” indicates the lowest level in the layered protocol.
- The apparatus of FIG. 3 includes redundant elements as well as non-redundant elements. Active elements 310, 320 represent redundant elements. One of the elements is in a standby mode while the other is active. The apparatus provides fail-over capabilities so that the standby processor can assume active status and responsibility for the services provided by the former active processor. In such a case, the formerly active processor is placed into a standby mode or a disabled mode until the event that caused the fail-over is resolved.
- Elements 330-360 provide the interface to the physical media carrying the communications. In one embodiment, elements 330-360 are referred to as line cards. Although multiple (n) line cards 330-360 are illustrated, the line cards are not provided with redundancies in this embodiment.
- For router nodes, elements 330-360 might be referred to as “data plane” elements while elements 310 and 320 are referred to as “control plane” elements. The data plane examines the destination address or label and sends the packet in the direction and manner specified by a routing table. The control plane describes the entities and processes that update the routing tables. In practice, elements 310 and 320 may include some data plane functions or associated hardware such as a switch matrix. Similarly, elements 330-360 may include some aspects of a control plane.
- Processors such as processor 334 are responsible for configuring even lower-level hardware such as hardware 336. Hardware 336 might be a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC), for example.
- Each processor throughout the hierarchy requires a set of processor-executable instructions that determine the implementation of a particular protocol layer by that processor. The processor-executable instructions may be embodied as “software” or “firmware” depending upon the storage medium or the method used to access these instructions. Generally the term software will refer to “processor executable instructions” regardless of the storage medium or the method of access, unless indicated otherwise.
- Occasionally the network component must be upgraded to handle new protocols, expansions to existing protocols, or new or changed features. Although hardware upgrades (i.e., replacement of processors) might be required, typically the component can be upgraded through software upgrades. Although different versions of software particular processor
- Software upgrades necessarily disrupt the functioning of the associated processor. Upgrading or modifying the software associated with a processor renders the processor unavailable and effectively nonfunctional throughout the upgrade. Accordingly, the processor cannot perform its intended functions during the upgrade. The apparatus as a whole cannot fully implement the layered protocol as long as any hierarchy is nonfunctional due to the upgrading of its processor. Outages or loss of service of the apparatus as a whole for even a few minutes may be extremely costly; thus the amount of time that the apparatus is nonfunctional should be minimized.
- One approach is to upgrade the software of all the processors at the same time. Although this can minimize the total amount of time required for the upgrade, this approach is also likely to render the entire apparatus effectively nonfunctional throughout the entire upgrade process thus incurring a large penalty as a result of unavailability.
- An alternative approach staggers the upgrades across the hierarchical levels. This approach requires more time to perform the upgrade of all the software; however, much of the functionality of the apparatus is preserved throughout the upgrade process. In particular, the functioning of an individual layer is substantially preserved while upgrading the software associated with higher protocol layers. When necessary, a layer is transferred from the processor normally handling that layer to a processor at a different hierarchical level in order to preserve some, if not all, of the functionality of the transferred layer during the upgrade of the software associated with the normal processor. Preferably, the data traffic “status quo” should be preserved while upgrading the software.
- Prior to execution of the upgrade, the appropriate version of target software is downloaded for each processor. The software may be stored in volatile memory or non-volatile memory. In one embodiment, the target version software is downloaded to a random access memory local to the associated processor. Typically, the software required for processors at the same hierarchy level will be the same. The software required for a processor at one level is not, however, typically the same as the software required for a processor at a different level because of the different functions performed at the different levels. The downloading process does not impact data traffic.
- FIGS. 4-11 illustrate this upgrade process graphically for upgrading a node 400 from a starting version (4.1) to a target version (5.0) of software.
- FIG. 4 illustrates the version status of software stored and used by the elements after first downloading the target version. After the download, both the starting version and the target version of software are present for each element. Thus software processors software 432 appropriate for the respective processor 434. The hardware associated with some layers such as the Layer A hardware 436 may only require the re-programming of registers with new values to implement the desired changes for that layer. The active element 410 controls the upgrade process until the point at which element 410 must be upgraded.
- Referring to FIG. 5, the software 522 associated with the standby element 520 is updated first. This is accomplished by performing a reset of processor 524 with the boot vector directed to the target version of the software. After the reset, standby element 520 is executing the target version of the software. The standby element attempts to synchronize with the active element. The standby element retrieves configuration information and checkpoint data from the active element for synchronization. The standby element stores information using the updated version of any database as dictated by the target version of the software. This update has no impact on lower level layers handling data traffic such as the Layer A hardware 536 for elements 530-536.
- If an update of the redundant elements is the only update required, then fail-over mechanisms can be used to update the active elements. Using existing fail-over protocols, the active/standby status of the two elements processor 514 similar to that previously performed on processor 524. In one embodiment, when more than one level must be updated, however, the upgrade process proceeds to update lower levels before completely updating the current level. In the event of a failure during the upgrade process, the apparatus 500 may return to either the starting version or the target version of the software depending upon when the failure occurred.
- Although the next lower level of the hardware hierarchy includes several processors 534, these processors are not configured to provide redundancy. Thus performing a reset on these processors may terminate connections or sessions requiring Layer B functionality. Layer B might provide, for example, “keep alive”, “hello” or other connection maintenance functionality such as that found in layer 3 of the OSI model. Such connection maintenance functionality may be required to support various protocols and connections including the Intermediate System-to-Intermediate System (IS-IS) and Open Shortest Path First routing protocols, label switch paths (LSP), etc. If this functionality is absent, one or more connections or sessions will be terminated despite the ability of lower level layers to otherwise continue to forward packets. Failure to provide this functionality will result in the loss of various connections and sessions.
- Referring to FIG. 6, Layer B is moved from the processor 634 at one hierarchical level to a processor 614 at a higher hierarchical level. The layer is thus moved to another processor for handling. Processor 614 reads the connection data from elements 630-660 prior to the transfer. Connection data includes both the static configuration information such as the types of interfaces as well as the dynamic state information regarding the protocols executing on those interfaces.
- Layer B is then transferred from the processors 634 of elements 630-660 to processor 614. Processor 614 of active element 610 executes program code supporting Layer B functionality with the initial conditions established by the connection and configuration information read from elements 630-660. This is equivalent to moving the control plane from one processor to another processor at a different location in the processor hierarchy.
- After Layer B functionality is transferred, a reset is performed on the processors 634 normally associated with Layer B processing. The boot vector is directed to the target version of the software. This activity does not disrupt the data traffic handled by the Layer A hardware of elements 630-660.
- FIG. 7 illustrates an alternative embodiment in which the Layer B functionality for node 700 is transferred from one processor 734 to another processor 764 at the same location in the processor hierarchy. Processor 764 is not a dedicated redundant resource nor is element 760 redundant to 730. The Layer A hardware 736 of element 730, for example, continues to function while relying on a different processor 764 for its Layer B functionality. Clearly not all of the processors 734 can be upgraded at the same time. In one embodiment, the software for all but one of the processors is upgraded at the same time. In an alternative embodiment, only the software associated with a single processor is upgraded at one time.
- FIG. 8 illustrates the node 800 after the reset. Processors 834 of elements 830-860 are executing the target version (5.0) of the software. Processors 834 of elements 830-860 then retrieve the connection data associated with Layer B from either the hierarchically higher processor 814 of active element 810 or the processor 864 residing at the same location in the processor hierarchy depending upon where the Layer B functionality was previously transferred.
- The Layer A hardware must be updated to support the various protocol changes resulting from the software update. Reconfiguration of the Layer A hardware necessarily disrupts the traffic handled by the Layer A hardware, however, the reconfiguration primarily entails writing values to registers of low level hardware such as ASICs. Instead of disrupting Layer A functionality throughout the upgrade of the node, Layer A functionality is disrupted only for the relatively short period of time required to reconfigure the low-level hardware. In contrast to the update procedure for the higher level processors, reconfiguration of low level hardware such as ASICs is on the order of fractional seconds to seconds.
-
FIG. 9 illustrates reconfiguring theLayer A hardware 936 of elements 930-960 fornode 900.Processors 934 configure their respectiveLayer A hardware 936 to support the functionality determined by the software upgrade. Following the re-configuration of the Layer A hardware, the transfer of Layer B functionality back to the processors of elements 930-960 is completed. Theprocessors 934 of elements 930-960 begin executing Layer B program code using the retrieved connection data.Processor 934 of elements 930-960 handle the control plane for theLayer A hardware 936. Thus the control plane is restored to the elements normally associated with Layer B functionality. - In order to finish the upgrade process,
software 912 can be updated using typical fail-over mechanisms to avoid disruption. Referring tonode 1000 ofFIG. 10 , the active and standby status ofelements 1010, 1020 is swapped such thatelement 1010 is now in standby mode and element 1020 is the active element. Active element 1020 assumes control for the remainder of the upgrade process. -
FIG. 11 illustrates the result of a reset of processor 1114 using a boot vector pointing to the target version of the software 1112. After the reset, processor 1114 is executing the target version of the software. Standby element 1110 then retrieves configuration and checkpoint information from active element 1120 in order to synchronize with active element 1120. The upgrade of the software at this level of the hierarchy does not disrupt the data traffic handled by the Layer A hardware 1136.
Booting any of the processors using the target version of the software might take considerable time; however, the functionality of the processors has been “covered” either through redundancy or by moving layer support to a processor at either the same or a different level in the hierarchy. The time required to transfer a control plane back and forth is very short compared to the time required to complete the upgrade and bring the processors online with the target version of software. Such transfer does not disrupt the data traffic handled by the Layer A hardware 1136.
The static component of the Layer B connection data (i.e., the configuration data) is not permitted to change throughout the upgrade of the software associated with Layer B. For a router, this could imply that alarms, requests to establish/terminate connections, and routing table updates/modifications are ignored. Network components external to node 1100 may terminate connections, for example, but the termination will not be recognized by node 1100 until the upgrade has completed and the termination has been subsequently detected by node 1100.
Thus some functionality is lost during the upgrade process; however, the traffic moving capabilities having the greatest impact on availability are maintained throughout the upgrade process. The layered protocols are typically robust and they permit node 1100 to re-detect conditions that were ignored during the upgrade process in the event that such conditions were not resolved prior to the completion of the software upgrade.
To reduce the risk of failure in the upgrade process, the upgrade process is performed in two phases: a preparation phase and an execution phase as indicated in FIG. 12. The preparation phase is performed in step 1210. If problems are discovered in the preparation phase as determined by step 1220, the upgrade to the target version is terminated in step 1230. Otherwise, the upgrade process continues with the execution phase in step 1240. If no problems occur during the execution phase, the process is completed with step 1290.
If problems are encountered during the execution phase as determined by step 1250, the upgrade process may either be “unwound” to the starting version of the software or, alternatively, catastrophic failure mechanisms may be used to complete the upgrade to the target version of the software. In one embodiment, if a problem occurs after entering an isolation mode as determined by step 1252, then the upgrade process is terminated and catastrophic failover mechanisms are used to upgrade the software to the target version in step 1254. If the problem occurs prior to entering the isolation mode, then the upgrade process is “unwound” to the starting version of the software in step 1260. The isolation mode is a mode that prevents the node from accommodating externally requested configuration changes.
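The decision flow of FIG. 12 can be sketched as a small function. This is only an illustrative sketch, not the patented implementation: the names `Outcome`, `run_upgrade`, `prepare`, and `execute` are hypothetical, and the two callbacks stand in for the preparation and execution phases.

```python
from enum import Enum


class Outcome(Enum):
    """Hypothetical labels for the ways the FIG. 12 flow can end."""
    COMPLETED = "completed"                # step 1290
    TERMINATED = "terminated"              # step 1230
    UNWOUND = "unwound"                    # step 1260
    CATASTROPHIC_FAILOVER = "failover"     # step 1254


def run_upgrade(prepare, execute):
    """Run the two-phase upgrade and report how it ended.

    prepare() returns True when the preparation phase finds no problems.
    execute() returns (ok, in_isolation): ok is True when the execution
    phase succeeds; in_isolation tells whether isolation mode had been
    entered at the moment a problem occurred.
    """
    if not prepare():                      # steps 1210 and 1220
        return Outcome.TERMINATED
    ok, in_isolation = execute()           # steps 1240 and 1250
    if ok:
        return Outcome.COMPLETED
    if in_isolation:                       # step 1252
        return Outcome.CATASTROPHIC_FAILOVER
    return Outcome.UNWOUND
```

For example, `run_upgrade(lambda: True, lambda: (False, False))` models a problem that occurs before isolation mode is entered, so the sketch reports that the upgrade is unwound.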
FIG. 13 illustrates one embodiment of the preparation phase. In step 1310, the target version of the software is downloaded to memory for each processor in the element hierarchy that needs to have its associated software upgraded. The starting version may be preserved to enable restoration to the starting version of the software in the event of a failure in the upgrade process.
In step 1320, the node is checked to ensure that all elements are functioning properly. The preparation phase cannot complete successfully unless all elements have full operational functionality. The determination of operational functionality might include checking whether the node has operational redundancy, whether all elements are working, and whether any element is in a transitional state (e.g., being reset, updated, etc.).
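The step 1320 check might be expressed as follows. This is a minimal sketch under stated assumptions: the `Element` record and the `node_ready` function are hypothetical, and the three conditions are the ones the text lists as examples (redundancy, working elements, no transitional state).

```python
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical status record for one element of the node."""
    name: str
    working: bool = True
    transitional: bool = False   # e.g., being reset or updated


def node_ready(elements, has_redundancy):
    """Return (ready, reasons); ready is True only when no check fails."""
    reasons = []
    if not has_redundancy:
        reasons.append("node lacks operational redundancy")
    for element in elements:
        if not element.working:
            reasons.append(f"{element.name} is not working")
        if element.transitional:
            reasons.append(f"{element.name} is in a transitional state")
    return (not reasons, reasons)
```

Returning the list of reasons alongside the boolean is a design choice of this sketch; it lets the caller report why the preparation phase could not complete.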
FIG. 14 illustrates one embodiment of the beginning of the execution phase. In step 1410, a standby element of a redundant plurality of elements is upgraded to a target version of software. In one embodiment, this is accomplished by performing the reset previously described. In step 1420, the standby element retrieves configuration and checkpoint data from an active element of the redundant plurality of elements. The standby element performs any necessary data conversions required to bring the retrieved data into conformance with the formats dictated by the target version of the software. At this point, the node no longer has redundancy protection.
The node is placed into isolation mode in step 1430 to prevent configuration changes. In the case of a router, for example, alarms, requests to establish/terminate connections, and routing table modifications are ignored.
The software for lower level processors may also be upgraded. As previously indicated, however, layer functionality must be preserved throughout the upgrade. In order to preserve layer functionality, the associated control plane is transferred from a processor at one level of the element hierarchy to a processor at the same level or another level of the element hierarchy as indicated in FIG. 15.
In one embodiment, a control plane is transferred from at least one first processor handling a first layer to a second processor handling a second layer in step 1510. This is equivalent to transferring the layer or layer portion handled by the first processor to the second processor handling another layer or layer portion. The node may have a single first processor or n first processors such as the processors 434 associated with each of elements 430-460.
The first and second processors are located at different levels of the element hierarchy. Effectively the layer or portion of a layer handled by a first processor is transferred to a second processor at another level of the hierarchy. In contrast to the redundancy approach, all the processors (e.g. 434) handling the first layer or first layer portion prior to the transfer can have a software upgrade at substantially the same time. The redundancy approach requires swapping the roles of active and standby components such that upgrades for all elements at the same level cannot occur substantially simultaneously.
In an alternative embodiment, a control plane is transferred from at least one first processor handling an associated first layer to a second processor handling an associated first layer in step 1512. This is equivalent to transferring the layer or layer portion handled by the first processor to a second processor handling another instance of the same layer or layer portion. The node may have a single first processor or n first processors such as the processors 434 associated with each of elements 430-460.
The first and second processors are located at the same level of the element hierarchy. Effectively the layer or portion of a layer handled by a first processor is transferred to a second processor at the same level of the hierarchy. In contrast to the redundancy approach, the second processor is not duplicative or redundant. Prior to transfer of the control plane, the second processor is handling its own instance of the same layer or layer portion.
Regardless of whether the control plane is transferred to a processor at the same or a different level of the element hierarchy, after the transfer the software associated with the at least one first processor is upgraded in step 1520. This may be accomplished by using a soft reset to force the first processor(s) to load the target version of the software as previously described. This upgrade does not impact data traffic handled by lower level layers. In step 1530, the lower level layer hardware associated with the first processor is re-configured. This re-configuration disrupts the data traffic handled by the lower level layer hardware. In step 1540, the control plane is transferred back to the at least one first processor.
FIG. 16 illustrates the transfer of the control plane or layer functionality in greater detail. In step 1610, a first processor handling a first layer provides connection data (i.e., the static configuration and dynamic state) to a second processor. Depending upon the location of the second processor in the element hierarchy, the second processor is either handling a second layer or another instance of the first layer. In step 1620, the first processor terminates handling first layer functions. Using the connection data, the second processor initiates handling of the first layer functions previously associated with the first processor in step 1630. Thus a first layer being handled by the first processor is transferred to a second processor.
The software upgrade for the first processor is performed in step 1640. During the upgrade, the second processor is handling first layer functionality. This might include, for example, “hello”, “keep alive”, or other functionality required to preserve the status quo with respect to other nodes in the communications network.
After the upgrade, the first processor retrieves the connection data from the second processor in step 1650. The lower level hardware associated with the first processor is re-configured in step 1660. The second processor terminates handling first layer functions in step 1670. The first processor initiates handling first layer functions in step 1680 using the connection data. This is equivalent to transferring the first layer being handled by the second processor back to the first processor for handling.
The re-configuration of the low level hardware is typically required in order to support the protocol modifications at the data traffic layer. The connection data preserved throughout the upgrade of the control plane for the low level hardware must be re-mapped or otherwise modified to ensure compatibility with the upgraded versions of the protocols instituted by the software upgrade.
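The hand-off and hand-back sequence of FIG. 16 can be illustrated with a toy model. The sketch below rests on several assumptions: `Processor`, `transfer`, and `upgrade_with_handoff` are hypothetical names, connection data is modeled as a plain dictionary, the software upgrade is reduced to changing a version string, and the hardware re-configuration of step 1660 is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    """Hypothetical processor model: layer name -> connection data."""
    name: str
    version: str
    layers: dict = field(default_factory=dict)

    def handles(self, layer):
        return layer in self.layers


def transfer(layer, src, dst):
    """Move handling of one layer from src to dst."""
    data = dict(src.layers[layer])   # src provides the connection data
    del src.layers[layer]            # src stops handling the layer
    dst.layers[layer] = data         # dst starts handling it with that data


def upgrade_with_handoff(layer, first, second, target_version):
    """Upgrade `first` while `second` temporarily covers `layer`."""
    transfer(layer, first, second)   # steps 1610 to 1630
    first.version = target_version   # step 1640 (stand-in for the upgrade)
    transfer(layer, second, first)   # steps 1650, 1670 and 1680
```

In this model the second processor keeps its own layers throughout; only the entry for the transferred layer moves back and forth, which mirrors the point that the second processor is not a redundant duplicate of the first.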
FIG. 17 illustrates one embodiment of re-configuring the low-level hardware. In step 1710, a first version of connection data compatible with a first version of a layer is mapped to a second version of connection data compatible with a second version of a layer. The connection data includes static configuration data as well as dynamic state data. In step 1720, the low-level layer hardware is re-configured in accordance with the second version of connection data. This might entail, for example, writing values to a number of registers. This re-configuration disrupts the data traffic handled by the low-level hardware, but the amount of time required to write values to the registers is on the order of fractions of a second to seconds, a period short enough to avoid causing other nodes in the communications network to take corrective action such as re-routing communications around the node being updated.
An alternative approach to re-configuring the low-level hardware can potentially decrease the amount of time needed for re-configuration by reducing the number of write operations required. The aforementioned re-mapping operation does not necessarily result in a change in value for every register of the low-level layer hardware. The number of write operations might be significantly reduced if values are written only to the registers that have changed values.
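The re-mapping step and the reduced-write idea described above can be sketched together. Everything here is an assumption made for illustration: registers are modeled as a dictionary from address to value, `remap` only renames addresses, and the function names `remap`, `diff`, and `write_registers` are hypothetical. Registers that the second version no longer uses are left untouched in this sketch.

```python
def remap(v1_data, mapping):
    """Map first-version connection data to the second version.

    `mapping` sends a first-version register address to its
    second-version address; values are carried over unchanged.
    """
    return {mapping.get(addr, addr): value for addr, value in v1_data.items()}


def diff(current, target):
    """Keep only the registers whose target value differs from hardware."""
    return {addr: value for addr, value in target.items()
            if current.get(addr) != value}


def write_registers(hardware, values):
    """Write values into the simulated hardware; return the write count."""
    hardware.update(values)
    return len(values)


# Simulated register contents read back from the hardware.
hardware = {0x10: 1, 0x14: 2, 0x18: 3}
# Second-version data: the value at 0x18 moves to 0x1C.
v2 = remap({0x10: 1, 0x14: 2, 0x18: 3}, {0x18: 0x1C})
full_writes = len(v2)                                    # write everything
diff_writes = write_registers(hardware, diff(hardware, v2))
```

With these inputs a full re-configuration would issue three writes, while the difference-based approach issues one, because only register `0x1C` differs from what the hardware already holds.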
FIG. 18 illustrates one embodiment of the alternative approach to re-configuring the low-level hardware. In step 1810, a first version of connection data compatible with a first version of a layer is mapped to a second version of connection data compatible with a second version of the layer. The first and second versions of the layer refer to the pre- and post-upgrade versions of the layer.
A read operation is performed to retrieve the current version of the connection data from the low-level layer hardware in step 1820. The current connection data is compared to the second version of the connection data to identify a difference (DIFF) version of the connection data in step 1830. The DIFF version identifies only the registers that have changes in value and what those values should be. The DIFF version thus identifies only the locations that actually require a change. The low-level hardware is then re-configured in accordance with the difference version of the connection data in step 1840. The difference version can potentially decrease the amount of time that the data traffic is disrupted by eliminating the time spent writing to registers that do not require changes.
The remaining elements of the redundant plurality of elements may now be upgraded as indicated in FIG. 19. Until this point the upgrade process has been controlled by the active element of the redundant plurality of elements. A first selected active element swaps active/standby status with a second selected standby element in step 1910. The first selected element is now a standby element and the second selected element is now the active element. The second selected element is now responsible for controlling the remainder of the upgrade process.
The first selected element is upgraded to a target version of the software in step 1920. This may be accomplished, for example, by performing a reset of the processor with a boot vector directed to the target version of the software. In one embodiment, the node exits the isolation mode in step 1930 to enable configuration changes. In step 1940, the first selected element retrieves configuration and checkpoint data from the second selected element. At this point the redundant plurality of elements are synchronized and capable of providing redundancy protection. In an alternative embodiment, step 1940 is performed prior to step 1930 to ensure redundancy before exiting the isolation mode.
Methods and apparatus for modifying a layered protocol communications apparatus have been described. For example, software is updated for different layers without disrupting lower layer data traffic. In particular, functionality is preserved for a layer either by providing a redundant element to handle the layer or by transferring the layer to an element at the same or a different hierarchical level of the layered protocol hierarchy.
In the preceding detailed description, the invention is described with reference to specific exemplary embodiments thereof. Various modifications and changes may be made thereto without departing from the broader scope of the invention as set forth in the claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
Claims (25)
1. A method of modifying a layered protocol communication apparatus, comprising:
a) transferring a control plane from a first processor handling a first layer to a second processor handling a second layer of a layered protocol.
2. The method of claim 1 wherein the transfer of the control plane from the first processor to the second processor does not interrupt data traffic handled by any layers lower than the first layer.
3. The method of claim 1 wherein step a) further comprises:
i) providing connection data from the first processor to the second processor;
ii) halting the first processor's handling of the first layer; and
iii) initiating handling of the first layer by the second processor using the connection data.
4. The method of claim 1 further comprising:
b) modifying software associated with the first processor.
5. The method of claim 4 wherein step b) comprises performing a soft reset of the first processor with a boot vector directed to a target version of the software.
6. The method of claim 4 further comprising:
c) transferring the control plane from the second processor to the first processor.
7. The method of claim 6 wherein the transfer of the control plane from the second processor to the first processor does not interrupt data traffic handled by any layers lower than the first layer.
8. The method of claim 6 wherein step c) further comprises:
i) providing connection data from the second processor to the first processor;
ii) halting the second processor's handling of the first layer; and
iii) initiating handling of the first layer by the first processor using the connection data.
9. The method of claim 6 further comprising:
d) mapping a first version of the connection data to a second version of the connection data; and
e) configuring a lower layer hardware in accordance with the second version of the connection data, wherein the lower layer is lower than the first layer.
10. The method of claim 6 further comprising:
d) mapping a first version of the connection data to a second version of the connection data;
e) reading a current version of the connection data;
f) comparing the second version and the current version of the connection data to generate a difference version identifying only the changed registers and values; and
g) configuring a lower layer hardware in accordance with the difference version of the connection data, wherein the lower layer is lower than the first layer.
11. A method of modifying a layered protocol communication apparatus, comprising:
a) transferring a first layer handled by a first processor to a second processor handling a second layer of a layered protocol.
12. The method of claim 11 wherein the transfer of the first layer from the first processor to the second processor does not interrupt data traffic handled by any layers lower than the first layer.
13. The method of claim 11 wherein step a) further comprises:
i) providing connection data from the first processor to the second processor;
ii) halting the first processor's handling of the first layer; and
iii) initiating handling of the first layer by the second processor using the connection data.
14. The method of claim 11 further comprising:
b) modifying software associated with the first processor.
15. The method of claim 14 wherein step b) comprises performing a soft reset of the first processor with a boot vector directed to a target version of the software.
16. The method of claim 14 further comprising:
c) transferring the first layer from the second processor to the first processor for handling.
17. The method of claim 16 wherein the transfer of the first layer from the second processor to the first processor does not interrupt data traffic handled by any layers lower than the first layer.
18. The method of claim 16 wherein step c) further comprises:
i) providing connection data from the second processor to the first processor;
ii) halting the second processor's handling of the first layer; and
iii) initiating handling of the first layer by the first processor using the connection data.
19. The method of claim 16 further comprising:
d) mapping a first version of the connection data to a second version of the connection data; and
e) configuring a lower layer hardware in accordance with the second version of the connection data, wherein the lower layer is lower than the first layer.
20. The method of claim 16 further comprising:
d) mapping a first version of the connection data to a second version of the connection data;
e) reading a current version of the connection data;
f) comparing the second version and the current version of the connection data to generate a difference version identifying only the changed registers and values; and
g) configuring a lower layer hardware in accordance with the difference version of the connection data, wherein the lower layer is lower than the first layer.
21. A communication apparatus comprising:
a hierarchy of processors including a first processor associated with a first layer and a second processor associated with a second layer of a layered protocol, wherein a control plane associated with the first processor is transferred to the second processor prior to modifying a software associated with the first processor.
22. The apparatus of claim 21 wherein the apparatus is at least one of a network router and a network switch.
23. The apparatus of claim 21 wherein the first processor provides the second processor with connection data describing a data plane to facilitate the transfer of the control plane.
24. The apparatus of claim 21 wherein the control plane is transferred back to the first processor after the software modification.
25. The apparatus of claim 21 wherein the first processor performs a soft reset with a boot vector pointing to a target version of the software for modifying of the software.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/390,488 US20070208894A1 (en) | 2006-03-02 | 2006-03-27 | Modification of a layered protocol communication apparatus |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US77843706P | 2006-03-02 | 2006-03-02 | |
US11/390,488 US20070208894A1 (en) | 2006-03-02 | 2006-03-27 | Modification of a layered protocol communication apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
US20070208894A1 (en) | 2007-09-06 |
Family
ID=38472697
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/390,488 Abandoned US20070208894A1 (en) | 2006-03-02 | 2006-03-27 | Modification of a layered protocol communication apparatus |
Country Status (1)
Country | Link |
---|---|
US (1) | US20070208894A1 (en) |
Citations (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5778189A (en) * | 1996-05-29 | 1998-07-07 | Fujitsu Limited | System and method for converting communication protocols |
US5989060A (en) * | 1997-05-02 | 1999-11-23 | Cisco Technology | System and method for direct communication with a backup network device via a failover cable |
US6065102A (en) * | 1997-09-12 | 2000-05-16 | Adaptec, Inc. | Fault tolerant multiple client memory arbitration system capable of operating multiple configuration types |
US20020073410A1 (en) * | 2000-12-13 | 2002-06-13 | Arne Lundback | Replacing software at a telecommunications platform |
US20020091826A1 (en) * | 2000-10-13 | 2002-07-11 | Guillaume Comeau | Method and apparatus for interprocessor communication and peripheral sharing |
US20020143969A1 (en) * | 2001-03-30 | 2002-10-03 | Dietmar Loy | System with multiple network protocol support |
US6490631B1 (en) * | 1997-03-07 | 2002-12-03 | Advanced Micro Devices Inc. | Multiple processors in a row for protocol acceleration |
US20030037323A1 (en) * | 2001-08-18 | 2003-02-20 | Lg Electronics Inc. | Method for upgrading data |
US20030140339A1 (en) * | 2002-01-18 | 2003-07-24 | Shirley Thomas E. | Method and apparatus to maintain service interoperability during software replacement |
US20030149970A1 (en) * | 2002-01-23 | 2003-08-07 | Vedvyas Shanbhogue | Portable software for rolling upgrades |
US6622215B2 (en) * | 2000-12-29 | 2003-09-16 | Intel Corporation | Mechanism for handling conflicts in a multi-node computer architecture |
US6691184B2 (en) * | 2001-04-30 | 2004-02-10 | Lsi Logic Corporation | System and method employing a dynamic logical identifier |
US6934880B2 (en) * | 2001-11-21 | 2005-08-23 | Exanet, Inc. | Functional fail-over apparatus and method of operation thereof |
US7055147B2 (en) * | 2003-02-28 | 2006-05-30 | Sun Microsystems, Inc. | Supporting interactions between different versions of software for accessing remote objects |
US20060190775A1 (en) * | 2005-02-17 | 2006-08-24 | International Business Machines Corporation | Creation of highly available pseudo-clone standby servers for rapid failover provisioning |
US20070002841A1 (en) * | 2005-06-03 | 2007-01-04 | Kevin Riley | Publicly-switched telephone network signaling at a media gateway for a packet-based network |
US20070156915A1 (en) * | 2006-01-05 | 2007-07-05 | Sony Corporation | Information processing apparatus, information processing method, and program |
US7260818B1 (en) * | 2003-05-29 | 2007-08-21 | Sun Microsystems, Inc. | System and method for managing software version upgrades in a networked computer system |
US7266816B1 (en) * | 2001-04-30 | 2007-09-04 | Sun Microsystems, Inc. | Method and apparatus for upgrading managed application state for a java based application |
US7305669B2 (en) * | 2002-09-27 | 2007-12-04 | Sun Microsystems, Inc. | Software upgrades with multiple version support |
US7353285B2 (en) * | 2003-11-20 | 2008-04-01 | International Business Machines Corporation | Apparatus, system, and method for maintaining task prioritization and load balancing |
US7444502B2 (en) * | 2005-09-02 | 2008-10-28 | Hitachi, Ltd. | Method for changing booting configuration and computer system capable of booting OS |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090268660A1 (en) * | 1997-07-15 | 2009-10-29 | Viasat, Inc. | Frame format and frame assembling/disassembling method for the frame format |
US20200412586A1 (en) * | 2019-11-29 | 2020-12-31 | Intel Corporation | Communication link re-training |
US11863357B2 (en) * | 2019-11-29 | 2024-01-02 | Intel Corporation | Communication link re-training |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9378005B2 (en) | Hitless software upgrades | |
US7062642B1 (en) | Policy based provisioning of network device resources | |
US6671699B1 (en) | Shared database usage in network devices | |
US7933987B2 (en) | Application of virtual servers to high availability and disaster recovery solutions | |
US6983362B1 (en) | Configurable fault recovery policy for a computer system | |
US6694450B1 (en) | Distributed process redundancy | |
EP1665672B1 (en) | High availability virtual switch | |
US6601186B1 (en) | Independent restoration of control plane and data plane functions | |
US20120079090A1 (en) | Stateful subnet manager failover in a middleware machine environment | |
US7039827B2 (en) | Failover processing in a storage system | |
US7652982B1 (en) | Providing high availability network services | |
US6715097B1 (en) | Hierarchical fault management in computer systems | |
CN115495409A (en) | Method and system for facilitating inter-container communication via cloud exchange | |
WO2010022100A2 (en) | Upgrading network traffic management devices while maintaining availability | |
US6654903B1 (en) | Vertical fault isolation in a computer system | |
US7430735B1 (en) | Method, system, and computer program product for providing a software upgrade in a network node | |
US11601365B2 (en) | Wide area networking service using provider network backbone network | |
US6742134B1 (en) | Maintaining a local backup for data plane processes | |
US20230083347A1 (en) | Near-hitless upgrade or fast bootup with mobile virtualized hardware | |
US7117213B2 (en) | Primary-backup group with backup resources failover handler | |
US20100185682A1 (en) | Object identifier and common registry to support asynchronous checkpointing with audits | |
US20070233867A1 (en) | Method and apparatus for preserving MAC addresses across a reboot | |
US11824773B2 (en) | Dynamic routing for peered virtual routers | |
US10735259B2 (en) | Virtual switch updates via temporary virtual switch | |
EP1782202A2 (en) | Computing system redundancy and fault tolerance |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: TELLABS OPERATIONS, INC., ILLINOIS. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: CURRY, DAVID S.; MCLOUGHLIN, BRUCE; KRISHNAMOORTHY, RAMKUMAR; REEL/FRAME: 018056/0105. Effective date: 20060623 |
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |