METHOD AND ESTIMATOR FOR PROVIDING BUSINESS RECOVERY PLANNING
RELATED APPLICATIONS
This application claims the benefit of U.S. Provisional Application 60/158,259, filed October 6, 1999. This application is related to Application Serial No. entitled "Organization Of Information Technology Functions", by Dove et al. (Attorney docket No. 10022/45), filed herewith. These applications are incorporated herein by reference in their entirety.
BACKGROUND OF THE INVENTION
The biggest challenges in Information Technology ("IT") development today are actually not in the technologies, but in the management of those technologies in a complex business environment. From idea conception to capability delivery and to operation, all IT activities, including strategy development, planning, administration, coordination of project requests, change administration, and managing demand for discretionary and non-discretionary activities and operations, preferably are collectively managed. A shared understanding and representation of IT management is needed because today's technological and business environment demands it. The new technological management orientation should include ways for planning, assessing, and deploying technology within and across enterprises.
Businesses need to balance technological capability with enterprise capability in order to become, or stay, a modern organization that has a chance of survival.
There is a need, therefore, to construct a complete yet simple IT framework that would quickly convey the entire scope of IT capability in a functional composition. Such an IT framework has to be a single framework for describing such IT management. The IT framework should be a framework of all functions; a representation of a complete checklist of all relevant activities performed in an IT organization. A single IT Framework should represent all functions operative in an IT organization.
Within that IT Framework, there is also a need for Risk Management that manages all functions aimed at identifying and securing enterprise assets against various forms of business interruption or loss. By marketing current IT service offerings, increasing customer satisfaction, and building stronger customer relationships, the IT enterprise can better serve its business customers. A risk management capability becomes critical to the IT enterprise as competition to provide IT services begins to increase from outsourcers. Business recovery planning is a key function of risk management. Therefore, to meet this competition, there is a need for improved methods for providing business recovery planning and for an estimator for doing so.
BRIEF SUMMARY OF THE INVENTION
The present invention is defined by the following claims, and nothing in this section should be taken as a limitation on those claims. By way of introduction, one embodiment of the invention is a method for providing a business recovery planning function that includes planning, designing, building, testing, and deploying a business recovery planning function for an IT organization.
In one aspect of the preferred embodiment, the planning step includes developing a business recovery model for the business recovery planning. Preferably, the step of developing a business recovery model may further include defining project vision; assessing risks; and analyzing business impact.
In another aspect of the preferred embodiment, the designing step includes developing business recovery strategies for the business recovery planning function. Preferably, the step of developing business recovery strategies may further include validating interruption and service levels; developing strategies for critical functions; developing strategies for critical resources; performing cost-benefit analysis; and identifying spin-off projects.
In another aspect of the preferred embodiment, the building step includes developing policies and procedures for the business recovery planning function. Preferably, the step of developing policies and procedures may further include developing plan components; developing component procedures; developing task responsibility matrix; defining maintenance approach; developing update procedures; and developing testing strategy.
In another aspect of the preferred embodiment, the building step includes developing learning products for the business recovery planning function. Preferably, the step of developing learning products may further include developing learning materials; conducting training; and evaluating the training.
In another aspect of the preferred embodiment, the building step includes acquiring recovery technology infrastructure for the business recovery planning function. Preferably, the step of acquiring recovery technology infrastructure may further include developing alternate site requirements; developing alternate site RFP; selecting alternate site vendor; and negotiating contracts.
In another aspect of the preferred embodiment, the testing step includes preparing and executing a business recovery product test for the business recovery planning function. Preferably, the step of preparing and executing a business recovery product test may further include preparing the recovery plan test; conducting the recovery plan test; and evaluating the recovery plan test.
In another aspect of the preferred embodiment, the deploying step includes deploying business recovery infrastructure for the business recovery planning function. Preferably, the step of deploying a business recovery infrastructure may further include publishing recovery plan; developing summary presentation; and providing continuing support.
Another aspect of the present invention is a method for providing an estimate for building a business recovery planning function in an information technology enterprise. This aspect of the present invention allows an IT consultant to give on-site estimations to a client within minutes. The estimator produces a detailed breakdown of cost and time to complete a project by displaying the costs and time corresponding to each stage of a project along with each task.
Another aspect of the present invention is a computer system for allocating time and computing cost for building a business recovery planning system in an information technology system.
These and other features and advantages of the invention will become apparent upon review of the following detailed description of the presently preferred embodiments of the invention, taken in conjunction with the appended drawings.
BRIEF DESCRIPTION OF SEVERAL VIEWS OF THE DRAWINGS
The present invention is illustrated by way of example and not limitation in the accompanying figures. In the figures, like reference numbers indicate identical or functionally similar elements.
Figure 1 shows a representation of the steps in a method for providing a business recovery planning function according to the presently preferred embodiment of the invention.
Figure 2 shows a representation of the tasks for refining a business recovery model for the method represented in Figure 1.
Figure 3 shows a representation of the tasks for developing business recovery strategies for the method represented in Figure 1.
Figure 4 shows a representation of the tasks for developing policies and procedures for the method represented in Figure 1.
Figure 5 shows a representation of the tasks for developing learning products for the method represented in Figure 1.
Figure 6 shows a representation of the tasks for acquiring recovery technology infrastructure for the method represented in Figure 1.
Figure 7 shows a representation of the tasks for preparing and executing a business recovery product test for the method represented in Figure 1.
Figure 8 shows a representation of the tasks for deploying a business recovery infrastructure for the method represented in Figure 1.
Figure 9 shows a flow chart for obtaining an estimate of cost and time allocation for a project.
Figures 10a through 10e show one embodiment of an estimating worksheet for an OM business recovery planning estimating guide.
DETAILED DESCRIPTION OF THE INVENTION
For the purposes of this invention, an information technology ("IT") enterprise may be considered to be a business organization, charitable organization, government organization, etc. that uses an information technology system with or to support its activities. An IT organization is the group and associated systems and processes within the enterprise that are responsible for the management and delivery of information technology services to users in the enterprise. In a modern IT enterprise, multiple functions may be organized and categorized to provide comprehensive service to the user. Thereby, an information technology framework for understanding the interrelationships of the various functionalities, and for managing the complex IT organization is provided. The various operations management functionalities within the IT framework include a customer service management function; a service integration function; a service delivery function; a capability development function; a change administration function; a strategy, architecture, and planning function; a management and administration function; a human performance management function; and a governance and strategic relationships function. Risk management plays an important role within the strategy, architecture, and planning function. Business Recovery Planning is a key component of risk management. The present invention includes a method for providing a business recovery planning function and an estimator useful for determining the times and cost to provide such a function.
Before describing the method and system for business recovery planning, some related terms are first described as follows:
Strategy, Architecture & Planning:
Strategy, Architecture & Planning is an important function within IT frameworks. This function creates a strategy and plan that outlines the overall IT capability direction and initiatives as well as the common IT processes, organizations, applications, and technology architecture required to support desired business capabilities with optimal efficiency. This is accomplished by working with the enterprise and business unit strategy teams as well as analyzing technology industry trends. The Strategy, Architecture, & Planning function category provides technology guidance to the broader enterprise through the definition of common application and technology architecture blueprints.
The Strategy, Architecture, & Planning function category seeks to ensure that the information technology framework is aligned with the business and that there is maximum value, interoperability, and re-use of information technology initiatives. This function additionally evaluates, prioritizes, and plans for the recovery of critical business systems through the Risk Management function set. Important functions within Strategy, Architecture & Planning include strategic planning, capability planning, and risk management.
Risk Management:
Risk Management is an important function within Strategy, Architecture & Planning. Risk management encompasses all functions aimed at identifying and securing enterprise assets against various forms of business interruption or loss. Risk management links with other function categories, such as strategic planning and capability planning, to provide a Strategy, Architecture, & Planning function to an IT system. Following the industry's best practices, IT organizations focus on identifying and meeting the needs of the users to provide better quality customer service.
Risk management should include a risk mitigation strategy to develop people, processes, and technology directions to reduce security risks to an acceptable level. This strategy may include identification of asset and loss impact, identification and characterization of threat(s), identification and analysis of vulnerabilities, determination of risks and priorities, and selection of countermeasure security options. Risk management should include a security planning function to develop tactical plans to secure environments in order of business priority, based on the risk assessment and risk mitigation strategy. This function may contain a quick wins identification function to identify the most egregious security holes that also have the minimum cost impact to the client. This function may contain a solution product analysis function to assess security components primarily for longer term solutions. Risk management should include a high availability planning function to develop and maintain contingency plans in support of the risk mitigation strategy. Risk management should include a business recovery function to develop a business recovery plan in the event of a significant or prolonged system failure.
Business Recovery Planning:
Business Recovery Planning is a key component of Risk Management. A business recovery planning function develops a business recovery plan in the event of a significant or prolonged system failure. This function is concerned with how to rebuild required services quickly in the event of a long-term outage that renders sites (remote or central) unusable. Some key responsibilities include plan development and maintenance; training and testing; deployment; disaster recovery; maintaining the disaster recovery strategy and plan; managing hot/cold site coordination; managing hot/cold site testing; and implementing disaster recovery procedures. This framework uses Business Recovery to refer to the development of the strategy for recovering functional capabilities in the case of a major disruption or emergency. Disaster Recovery develops the strategy for recovering a system or a portion of the system in the event of a significant system failure caused by a major disruption or emergency. The following ten functions may be considered part of business recovery planning:
1. Risk Mitigation Strategy Confirmation:
This function identifies outage scenarios to be addressed by the recovery plan. The function develops preliminary recovery strategies across time, quantifies critical resources by function, and develops a recovery timeline. The function also quantifies and qualifies appropriate recovery options and presents all findings to stakeholders.
2. Business Contingency And Resumption Plan Development:
This function identifies and selects alternative recovery sites. It also develops and documents an untested recovery plan and develops and documents appropriate recovery team procedures.
3. Technical Disaster Recovery Design:
This function designs the physical implementation of the hardware and processes recommended in the Risk Mitigation Strategy.
4. Business Recovery Maintenance Procedure Development:
This function develops business recovery plan update procedures, training materials, and task responsibility matrix.
5. Disaster Recovery Maintenance Procedure Development:
This function develops disaster recovery plan update procedures, testing strategy, training materials, and task responsibility matrix.
6. Disaster Recovery Plan Validation:
This function plans and conducts tests to evaluate the validity of the initial disaster recovery plan. It evaluates the test results and revises the disaster recovery plan if necessary. The disaster recovery plan is preferably regularly tested to ensure that the plan remains current.
7. Business Recovery Plan Training:
This function conducts training sessions with the business recovery team and awareness training with employees. The function evaluates the training based on feedback from these sessions.
8. Disaster Recovery Plan Training:
This function conducts training sessions with the disaster recovery team and awareness training with employees. This function evaluates the training based on feedback from these sessions.
9. Business Recovery Plan Approval:
This function creates a presentation to present to management to gain approval of the business recovery plan. Once approval is obtained, this function publishes the business recovery plan.
10. Disaster Recovery Plan Approval:
This function creates a presentation to present to management to gain approval of the disaster recovery plan. Once approval is obtained, this function publishes the disaster recovery plan.
Network Centric Environment:
For the purpose of the present invention, the term "network centric" (or netcentric) should be construed to cover various means of reaching out to customers and partners with computing systems and knowledge over a communications backbone, such as an intranet, extranet, or internet connection. It is valuable to have an understanding of a netcentric environment since carrying out the method of providing business recovery planning within this environment may take special considerations. To define netcentric properly, it is helpful to have a general understanding of a framework that describes the types of applications required in a netcentric computing system. Application logic is preferably packaged into components and distributed from a server to a client over a network connection between the client and server. The client has standardized interfaces so that an application can execute with a client that can run on multiple operating systems and hardware platforms. Further, the application components of the preferred netcentric computing system enable the netcentric computing systems to be adaptable to a variety of distribution styles, from a "thin client" to a "fat client."
Netcentric frameworks preferably support a style of computing where processes on different machines communicate using messages. In this style of computing, "client" processes delegate business functions or other tasks (such as data manipulation logic) to one or more server processes. Server processes respond to messages from clients. Business logic can reside on both the client and server. Clients are typically personal computers (PCs) or workstations with a graphical user interface running a web browser. Servers are preferentially implemented on UNIX, NT, or mainframe machines. In netcentric computing systems, there is a preferred tendency to move more business logic to the servers, although "fatter" clients result from new technologies such as Java and ActiveX. In a netcentric environment, technology, people, and processes may be distributed across global boundaries and business functions/systems may involve multiple organizations. This will generally add complexity to the required systems.
The present invention is directed to providing or building an Operations Management ("OM") business recovery planning function. These specific tasks are described in reference to the Operations Management Planning Chart ("OMPC") that is shown in Figure 1. This chart provides the business integration methodology for capability delivery, which includes tasks such as planning analysis, design, build & test, and deployment. Each OM function includes process, organization, and technology elements that are addressed throughout the following description of the corresponding OM function. The method for providing a Business Recovery Planning function comprises four stages, as described below in connection with Figure 1. The first stage, "capability analysis stage" 102, includes the step of Refining Business Recovery Model 2110. The second stage, "capability release design stage" 104, includes the step of Developing Business Recovery Strategies 2410. The third stage, "capability release build and test stage" 106, includes the steps of Acquiring Recovery Technology Infrastructure 5510, Developing Policies & Procedures 6220, Developing Learning Products 6260, and Preparing & Executing Business Recovery Product Test 5590. The fourth stage, "deployment" 108, includes the step of Deploying Business Recovery Infrastructure 7170. In the following description, the details of the tasks within each step are discussed.
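The stage and step structure just described can be summarized as a simple lookup table. The following is only an illustrative sketch and is not part of the disclosed method; the names STAGES, steps_for_stage, and stage_of_step are hypothetical, while the reference numerals are those used in connection with Figure 1.

```python
# Illustrative only: stages and their steps, keyed by the reference
# numerals used in connection with Figure 1.
STAGES = {
    102: ("capability analysis", {
        2110: "Refining Business Recovery Model",
    }),
    104: ("capability release design", {
        2410: "Developing Business Recovery Strategies",
    }),
    106: ("capability release build and test", {
        5510: "Acquiring Recovery Technology Infrastructure",
        6220: "Developing Policies & Procedures",
        6260: "Developing Learning Products",
        5590: "Preparing & Executing Business Recovery Product Test",
    }),
    108: ("deployment", {
        7170: "Deploying Business Recovery Infrastructure",
    }),
}


def steps_for_stage(stage_id):
    """Return the step numerals of a stage, in the order listed above."""
    _name, steps = STAGES[stage_id]
    return list(steps)


def stage_of_step(step_id):
    """Return the numeral of the stage that contains the given step."""
    for stage_id, (_name, steps) in STAGES.items():
        if step_id in steps:
            return stage_id
    raise KeyError(step_id)
```

For example, steps_for_stage(106) lists the four build and test steps, and stage_of_step(7170) returns 108.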
Step 2110 - Refining Business Recovery Model
In step 2110, the business model requirements for business recovery planning are defined. Current capabilities are defined, the scope of the delivery and deployment effort needed to upgrade the capability is determined, and risk assessment is developed. Figure 2 shows a representation of the tasks for carrying out these functions according to the presently preferred embodiment of the invention. These tasks include Defining Project Vision 2112, Assessing Risks 2114, and Analyzing Business Impact 2116. This task package is essentially a planning/high-level analysis step, and may actually occur as part of the planning phase of a larger business capability or operations architecture development project. The products of this step include project vision, capabilities analysis, impact analysis, and business risk assessment.
Task 2112: Defining Project Vision
Task 2112 includes defining stakeholder values, defining core competencies, and developing shared project vision. For the project analysis to represent the business situation correctly, the organization's stakeholders preferably define the values of the enterprise and the reasons the project is important to the realization or continuation of those values. In addition, the future strategic direction of the organization will be affected by the performance of this project. The events and rationale leading to the project engagement are delineated so that the objectives of the project can be set and realized.
The identification of what the stakeholders believe to be their key competencies and capabilities will influence the stakeholders' perceptions of the organization's competitive advantages. However, there may be a disconnect between what the stakeholders believe are the critical entities and what the project analysis provides. Therefore, the tactical or operational activities and visions are highlighted so that they may be addressed further into the project. For the project to be a success, all parties to the engagement preferably share a common vision as to the justification, objectives, and findings of the project.
Task 2114: Assessing Risks
Task 2114 includes documenting resource inventories, documenting existing controls, evaluating potential interruption, reviewing insurance coverage, documenting third party capabilities, and documenting avoidance and mitigation steps. This task involves the raw data gathering of the client environment included in the scope of the project. This includes documenting facilities and locations, specialty equipment, telecommunications equipment, system applications, data, and support personnel. This information will be utilized during Business Impact Analysis 2116 and Recovery Strategy Development 2410.
The way in which an organization has documented its business process (manual or automated) could place it at risk. If the documentation describing the business processes is old or nonexistent, restoring the process after a disaster may be extremely difficult or entirely impossible. Therefore, a thorough review of the organization's policy statements, business process controls, technology/information backup procedures, off-site storage procedures, change control procedures, and security controls will provide valuable insight into the amount of risk precipitated through a weak control structure.
An organization also faces risk due to its geographical location and the type of building it occupies. This task allows the project team to determine what environmental factors or threats could affect the normal operations of the organization if they were to occur, through an act of nature or an accidental or intentional act of a human. An observation will be made as to how the threat could occur, how much damage could be caused because of the occurrence, what controls are in place to avoid or mitigate the threat and its consequences, and the overall likelihood that the threat could, indeed, occur.
Having adequate business interruption or liability insurance is also part of the arsenal required to deal with a business outage. Therefore, the project team will preferably review the levels of insurance to determine what is and is not covered during the recovery and restoration process. Also, any gaps in the insurance coverage may serve as an additional risk or threat to the organization until the gap can be rectified or nullified.
Integral to any assessment of organizational risk is an analysis of the support received from external third-parties during an emergency. If the support provided is inadequate, then the organization may be taking on an additional risk factor due to slow or poor response. Therefore, this step will allow the project team to assess the gap between what is currently being provided by external third-parties and what would be required in an emergency situation.
After all of the risk gaps have been identified, the project team will preferably prioritize the activities needed to avoid or mitigate the consequences of the potential risks or threats. Each recommended activity is quantified in terms of dollars and time so that management and stakeholders can select those which they determine will provide them with the most cost-effective return for avoiding or mitigating a potential risk.
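As a purely illustrative sketch of such a prioritization, the recommended activities could be ranked by the loss they are expected to avoid per dollar spent. The ranking rule, the MitigationActivity record, the loss_avoided_dollars estimate, and all figures below are assumptions made for illustration; the description does not prescribe a particular formula.

```python
from dataclasses import dataclass


@dataclass
class MitigationActivity:
    name: str
    cost_dollars: float          # estimated cost to carry out the activity
    duration_days: float         # estimated time to carry out the activity
    loss_avoided_dollars: float  # hypothetical estimate of loss avoided


def rank_by_cost_effectiveness(activities):
    """Sort by loss avoided per dollar (highest first); shorter duration breaks ties.

    Assumes every activity has a positive cost.
    """
    return sorted(
        activities,
        key=lambda a: (-a.loss_avoided_dollars / a.cost_dollars, a.duration_days),
    )


# Hypothetical example data.
activities = [
    MitigationActivity("Off-site backup rotation", 20_000, 30, 200_000),
    MitigationActivity("Redundant telecom link", 50_000, 60, 250_000),
    MitigationActivity("Update process documentation", 10_000, 45, 80_000),
]
ranked = rank_by_cost_effectiveness(activities)
```

Management and stakeholders would still make the final selection; a ranking of this kind only orders the candidates for their review.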
Task 2116: Analyzing Business Impact
Task 2116 includes documenting the business environment, analyzing business functions, determining potential work-arounds, evaluating business resources, performing impact analysis, and obtaining management approval. To understand fully the extent of the business operations under study, the business process drivers preferably are reviewed and incorporated into the other steps of the project. The more information the project team can assimilate, the better the opportunity to develop and implement a recovery plan that is in tune with the organization's business processes, aspirations, and culture.
When the business environment has been documented, completed surveys are returned for review and analysis. All of the data is compiled according to the way it was recorded on the survey. Initial analysis will typically show that some subjectivity and bias have filtered through. Therefore, interviews should be scheduled with the appropriate personnel to validate, reinforce, or expand upon the answers provided. At the completion of this step, the project team receives its first glimpse of the truly critical business functions required to continue business operations during an emergency situation.
Critical business functions may have contingency or work-around procedures that are activated when the usual system resources are not available for use. Work-arounds can extend the time before technical recovery steps are needed. These work-arounds can take the form of hard copy forms, rerouting work to another location, etc. This task documents the work-arounds, how soon they are activated, and how long they can be effectively utilized.
Information gained from the survey will also show what the respondents believe to be the critical resource set. The quantity and type of critical resources should then be compared and contrasted with the total number on hand. These critical resources represent what management believes to be the bare minimums needed to continue business operations in an emergency situation.
Once this information has been analyzed, the tangible and intangible impacts are quantified and qualified to determine the increasing level of harm to the organization as an emergency extends through time without being eliminated or held in check through avoidance, mitigation, or recovery steps. A resulting graphical display highlights the organization's financial position over time if nothing is done to counteract the consequences. This then allows the project team, in conjunction with management and stakeholders, to determine whether a recovery strategy should be invoked due to time of outage or amount of dollars lost.
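The accumulation of impact over time, and the point at which a recovery strategy would be invoked, can be sketched as follows. This is a minimal illustration under stated assumptions: the per-day loss figures, the two thresholds, and the function names are hypothetical and are not taken from the worksheets of the invention.

```python
from itertools import accumulate


def cumulative_loss(daily_losses):
    """Running total of quantified losses, one entry per outage day."""
    return list(accumulate(daily_losses))


def recovery_trigger_day(daily_losses, max_outage_days, max_loss_dollars):
    """Return the first outage day (1-based) on which either limit is reached.

    Returns None if neither the time limit nor the dollar limit is reached
    within the days covered by daily_losses.
    """
    for day, total in enumerate(accumulate(daily_losses), start=1):
        if day >= max_outage_days or total >= max_loss_dollars:
            return day
    return None


# Hypothetical losses that grow as the outage continues.
daily = [5_000, 10_000, 20_000, 40_000, 80_000]
totals = cumulative_loss(daily)
```

Plotting totals against the outage day gives the kind of graphical display described above, and recovery_trigger_day reflects the two decision criteria named in the text: time of outage and amount of dollars lost.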
In a netcentric environment, additional complexity may contribute to the business recovery process in several ways:
Ownership- When information is shared between multiple organizations, the questions of information ownership can introduce challenges into business recovery planning. Agreements with extranet partners that define ownership in business recovery scenarios become desirable.
Communication- The communication plan for business recovery should consider global users and extranet partners.
Loss of Data/Business Transactions- Recovery of netcentric-based transaction information for internet/extranet users should be considered.
Legal Impacts- Organizations should consider international legal requirements regarding access to, reporting on, and loss of business information due to a business interruption.
Internet Service Providers (ISPs)- ISPs should build adequate recovery mechanisms. As ISPs offer desirable connections to the Internet, it is important that business recovery plans are incorporated into their own network infrastructure to protect ISP customers from service outages. Business recovery capabilities should be assessed during all ISP selection efforts.
To conclude this task 2116 and the "capability analysis stage" 102, the project team presents the total findings of risk to management and stakeholders for their concurrence 110. The approval of the risks and their anticipated impacts will formulate the basis for the development of recovery strategies in the next step of the project. In the "capability release design stage" 104, business recovery strategies are developed by function 2410.
Step 2410 - Developing Business Recovery Strategies
In step 2410, the strategies for critical functions and resources included in the scope of the project are designed, and any related projects that may be required to effectively implement the overall strategy are identified. Figure 3 shows a representation of the tasks for carrying out these functions, according to the presently preferred embodiment of the invention. These tasks include Validating Interruption and Service Levels 2411, Developing Strategies for Critical Functions 2412, Developing Strategies to Recover Critical Resources 2413, Performing Cost/Benefit Analysis 2414, and Identifying Spin-Off Projects 2415. The products of this step include business recovery strategies and spin-off projects.
Task 2411: Validating Interruption and Service Levels
Task 2411 includes the revalidation of the outage scenarios to be addressed, confirmation of the outage windows which the critical functions can tolerate over time, and identification and documentation of the levels of service that need to be provided for each function during an emergency. These three items will drive the strategies, developed in tasks 2412 and 2413, to recover the critical functions and resources within the organization.
Task 2412: Developing Strategies for Critical Functions
Task 2412 includes identification of the recovery priorities and time frames of the critical functions as they correlate to the outage scenarios. Depending on the outage scenario in question, functional criticality may vary. The recovery strategies will typically center around the timing and makeup of functional activities in relation to their distance in time from the point of the disruption. Finally, a disaster declaration table will be produced which will approximate the factors leading to a recovery decision as time advances, before a point is reached where recovery is no longer viable or effective.
Task 2413: Developing Strategies for Critical Resources
Task 2413 includes determining resource recovery strategies. As a critical business function is being restored, its corresponding resource set preferably is restored at the same time so as to ensure recoverability of the previously agreed-to functional service level. To guarantee timeliness and cost-effectiveness, the resource recovery strategies will be determined by selecting the method of recovery (pre-determination, pre-arrangement, or redundancy) which best fits the functional requirements in the given outage scenarios. Additionally, a resource recovery timeline will be developed.
Task 2414: Performing Cost/Benefit Analysis
Task 2414 includes selection of the most opportune recovery strategies and time frames based on the corresponding benefits. All of the previously identified recovery strategies, tables, and timelines are quantified in terms of dollars and time. The resulting matrix highlights the most cost-effective and appropriate recovery options (tactics) available to management. These options will then be selected and finalized for management approval.
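By way of illustration only, the quantification described for task 2414 can be sketched in Python. The option names reuse the recovery methods named in task 2413, but the dollar figures, the loss rate, and the simple model (readiness cost plus loss accrued while the function is down) are hypothetical assumptions and are not taken from the described method.

```python
from dataclasses import dataclass


@dataclass
class RecoveryOption:
    name: str               # recovery method label
    readiness_cost: float   # hypothetical dollars to keep the option available
    recovery_hours: float   # hypothetical time needed to restore the function


def rank_options(options, loss_per_hour, max_outage_hours):
    """Return viable options, cheapest first.

    An option is viable only if it restores the function within the
    tolerable outage window. Cost is modeled (as an assumption) as
    readiness cost plus the loss accrued during the recovery time.
    """
    viable = [o for o in options if o.recovery_hours <= max_outage_hours]
    return sorted(
        viable,
        key=lambda o: o.readiness_cost + o.recovery_hours * loss_per_hour,
    )


options = [
    RecoveryOption("redundancy", 500_000, 1),
    RecoveryOption("pre-arrangement", 120_000, 24),
    RecoveryOption("pre-determination", 20_000, 96),
]
ranked = rank_options(options, loss_per_hour=5_000, max_outage_hours=48)
for option in ranked:
    print(option.name)
```

In this sketch the third option is excluded because its recovery time exceeds the assumed 48-hour outage window, and the remaining two are ordered by modeled cost.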
Task 2415: Identifying Spin-Off Projects
Task 2415 includes identifying ancillary projects which would aid the organization's recovery effort in some fashion. This may occur throughout the risk assessment (2114) and recovery strategy development phases (2412, 2413) of the project. At this point, the projects are quantified and prioritized for management approval.
To conclude the "capability release design stage" 104 and the recovery strategy development step 2410 of the project, the project team presents the recommended recovery strategies for management and stakeholder approval 112. This action will, in turn, determine the recovery tactics utilized in the development and implementation of the actual recovery plan. In the third stage, the "capability release build and test stage" 106, the recovery plan is developed by functions 6220, 6260, and 5510 and then tested by function 5590.
Step 6220 - Developing Policies & Procedures
In step 6220, policies and procedures are developed. The method of the present invention includes producing a finalized, detailed set of new policies, procedures, and reference materials; documenting user procedures in enough detail to enable smooth execution of new business recovery planning tasks; and developing the process for plan maintenance and support. Figure 4 shows a representation of the tasks for carrying out these functions, according to the presently preferred embodiment of the invention. The tasks include Developing Plan Components 6221, Developing Component Procedures 6222, Developing Task Responsibility Matrix 6223, Defining Maintenance Approach 6224, Developing Update Procedures 6225, and Developing Testing Strategy 6226. The products of this step include recovery policies and procedures, recovery plan components, and maintenance policies and procedures.
Task 6221: Developing Plan Components
Task 6221 includes developing a Response Plan. The Response Plan is the first document that should be referred to during an emergency. This document is divided into smaller plans or sections that address emergency response procedures, escalation plans, notification plans, disaster declaration plans, and the Crisis Management Center plan. Each of these plans has a particular purpose and set of procedures that are to be followed. The purpose of this task is to identify and document those plans. The necessary procedures are developed by the tasks 6222-6225 and are then tested by task 6226.
Task 6222: Developing Component Procedures
Task 6222 includes developing assessment procedures, mitigation procedures, operational procedures, restoration procedures, and recovery team procedures. Assessment determines the extent of damage caused by the disaster.
This information is used by management to determine which, if any, recovery options are necessary. Once these options have been initiated, further assessment is required which will allow management to determine the requirements for restoring the damaged site and whether it is restorable. This task focuses on developing the procedures that are required to perform an accurate damage assessment to provide information for management decisions.
Also in this task, procedures for mitigating critical resources and establishing their operation at an alternate site will be developed. These procedures will aid in the effective recovery of the critical resources and may provide for a minimum of losses to the critical business functions.
Once all of the resources have been recovered at the alternate site, the organization will preferably have a schedule for operating the critical functions. This schedule will document how available resources should be used to support these critical functions. The documentation produced will include staffing and operating schedules.
While the critical functions are being performed at the alternate site, steps can be undertaken to restore the permanent site. Procedures will be developed to begin restoration of the damaged site or selection of a new site based on the assessment of the damage. Recovery team members preferably know what their responsibilities are, who they are to contact, and where they should report to perform their tasks. All of the procedures that were identified in the previous development steps will be grouped and organized for the particular team that will perform them.
Task 6223: Developing Task Responsibility Matrix
Task 6223 includes documenting the maintenance responsibility for each developed procedure. The matrix identifies who is responsible for the procedure, how often the procedure should be reviewed, and the last date the procedure was updated.
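A minimal sketch of such a matrix, assuming a simple record per procedure, is shown below in Python. The field names, the example procedures and owners, and the helper that flags procedures whose review interval has elapsed are hypothetical and serve only to illustrate the three items the matrix records.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class MatrixEntry:
    procedure: str              # procedure being maintained
    owner: str                  # who is responsible for the procedure
    review_interval_days: int   # how often the procedure should be reviewed
    last_updated: date          # last date the procedure was updated


def overdue(entries, today):
    """Return the names of procedures whose review interval has elapsed."""
    return [
        e.procedure
        for e in entries
        if today - e.last_updated > timedelta(days=e.review_interval_days)
    ]


matrix = [
    MatrixEntry("Damage assessment", "Facilities lead", 180, date(2000, 1, 15)),
    MatrixEntry("Notification plan", "Plan coordinator", 90, date(2000, 6, 1)),
]
print(overdue(matrix, today=date(2000, 8, 1)))
```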
Task 6224: Defining Maintenance Approach
Task 6224 includes identifying and documenting the organizational triggers which could require changes to the recovery plan. These triggers could include functional reorganizations, new product development, hardware and software upgrades and conversions, telecommunication expansion, outsourcing, corporate mergers or downsizing, etc.
Task 6225: Developing Update Procedures
Task 6225 includes developing materials to provide the organization with the procedures to update the business recovery plan. Based on the triggers identified in task 6224, procedures are defined to guide the submission of changes and the application of those changes to the plan. The procedures developed are then incorporated into formal change control procedures in existence at the organization.
Task 6226: Developing Testing Strategy
Task 6226 includes developing specific strategies to guide the organization in testing the plan as constructed by tasks 6221-6225. The strategies outline what components of the plan to test, which components should be tested together, what levels of testing should occur, and how often the components should be tested. The product of this task will be used in the Recovery Plan Testing phase 5590 for guidance each time a test is conducted. Criteria are developed for use in evaluating the results of the test and assessing its validity. This is also a checkpoint to review the untested business recovery plan and associated maintenance procedures with the stakeholders.
Step 6260 - Developing Learning Products
In step 6260, learning objectives and learning activities are defined, the learning approach and requirements are translated into finalized learning products, and training is conducted for all target audiences. This task package should be completed prior to the Product Test 5590, so that personnel with ongoing business recovery responsibilities may participate in the testing process. Figure 5 shows a representation of the tasks for carrying out these functions, according to the presently preferred embodiment of the invention. The tasks include Developing Learning Materials 6261, Conducting Training 6265, and Evaluating Training 6269. The products of this step include learning products and learning test model.
Task 6261: Developing Learning Materials
Task 6261 includes developing training material for three purposes. The first is to provide background information on business recovery planning for the various audiences. The second is to educate the team members on the structure of the Response Plan document and their role in the Plan. The third is to provide scenario-based exercises that familiarize the team members with the recovery procedures while also teaching the team to work together.
Task 6265: Conducting Training
Task 6265 includes conducting recovery team training, conducting staff awareness training, and conducting management training.
Recovery team training is done to prepare the key business recovery planning team members to execute the business recovery plan. A combination of awareness training and hands-on exercises prepares the team members for their roles in executing the Response Plan. Staff awareness training is conducted to give staff at the organization knowledge of the business recovery plan. The training should teach staff how to recognize and report potential threats, as well as how to respond to these threats, and what their responsibilities will be during recovery. Finally, this training should teach participants to identify updates which may be required to the plan and the procedures for initiating those updates.
Management has a key role in ensuring that the business recovery plan is workable and is kept up to date. This includes staff compliance with change control processes and participation in plan exercises. Management training will highlight these points and also the importance of their decision making role during a crisis.
Task 6269: Evaluating Training
Task 6269 includes evaluating the training after each session. Any changes identified should be approved and then applied to the training program. Since the recovery plan is constantly being updated to reflect current conditions, it is important to be sure that the training materials are also up to date, and that the staff and team members are benefiting from their training.
The following netcentric considerations may influence processes and procedures in a business recovery planning function. With additional global partners involved to support the business function, the processes to update/maintain business recovery documentation may become more complex, and procedures of a business recovery plan may have to be documented in multiple languages. Processes to communicate to a global organization and global anonymous users in the event of a disaster may become more complex. Processes to identify, test, and reestablish interfaces and jobs between dispersed systems may become more complex. Research may be needed to ensure all international netcentric legal requirements are met by the business recovery plan.
Step 5510 - Acquiring Recovery Technology Infrastructure
In this step, the arrangements for any alternate site processing facilities which may be needed in the event of a business interruption are planned and executed. If choices are available, it will be decided who will supply the facilities and services and how they will be supplied. Figure 6 shows a representation of the tasks for carrying out these functions, according to the presently preferred embodiment of the invention. These tasks include Developing Alternate Site Requirements 5511, Developing Alternate Site RFP 5513, Selecting Alternate Site Vendor 5515, and Negotiating Contracts 5517. The products of this step include hot-site vendor selection and contract.
Task 5511: Developing Alternate Site Requirements
Task 5511 includes documenting and prioritizing the organization's requirements for alternate sites. This should be done before submitting a Request For Proposal (RFP) to third-party vendors. The documentation would include minimum requirements for hardware, software, telecommunications, workstations, and specialized services. These requirements would include additional capacity for the above as required at critical time periods (1 hour, 4 hours, 24 hours, etc.). The requirements should be reviewed with management.
Task 5513: Developing Alternate Site RFP
Task 5513 includes preparing a form letter to the selected vendors that documents the required resources and services, and the time periods within which they are preferably provided as determined by 5511. The form letter, the RFP, also requires vendor pricing for services. The RFP should state the deadline for vendor response. A list of reputable vendors should be compiled based on prior experience with the vendors or trade journal surveys.
Task 5515: Selecting Alternate Site Vendor
Task 5515 includes mapping the vendor responses to the requirements as given in the RFP and selecting the most cost-effective solution. This is preferably done with a grid showing the prioritized requirements on one axis and each vendor on the other axis. It should be expected that follow-up calls will be necessary to clarify responses that are not clear or do not address the stated requirements. The findings and recommendations should be reviewed with the sponsoring organization.
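The grid described for task 5515 can be sketched as follows in Python, as one possible illustration. The priority weights, the 0 to 5 rating scale, and the vendor names are hypothetical assumptions; vendor pricing from the RFP responses would still have to be weighed against the resulting scores to arrive at the most cost-effective solution.

```python
def score_vendors(weights, responses):
    """Score each vendor against the prioritized requirements.

    weights:   requirement -> priority weight (higher means more important)
    responses: vendor -> {requirement: rating from 0 (unmet) to 5 (fully met)}

    A requirement that a response does not address is scored 0 and is
    listed for a follow-up call with that vendor.
    """
    scores, follow_up = {}, {}
    for vendor, ratings in responses.items():
        scores[vendor] = sum(
            weight * ratings.get(req, 0) for req, weight in weights.items()
        )
        follow_up[vendor] = [req for req in weights if req not in ratings]
    return scores, follow_up


weights = {"hardware": 3, "telecommunications": 2, "workstations": 1}
responses = {
    "Vendor A": {"hardware": 5, "telecommunications": 3, "workstations": 4},
    "Vendor B": {"hardware": 4, "telecommunications": 5},
}
scores, follow_up = score_vendors(weights, responses)
print(scores)
print(follow_up)
```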
Task 5517: Negotiating Contracts
Task 5517 includes assisting in vendor negotiations. While the organization may have its own staff that specializes in vendor negotiations, assistance may be appropriate in some cases. In other cases, the project's scope may include negotiation and contract review.
Step 5590 - Prepare & Execute Business Recovery Product Test
In this step, all the elements of the business recovery plan (as developed in 6220, 6226, and 5510) are ensured to be accurate and to function as designed. Figure 7 shows a representation of the tasks for carrying out these functions, according to the presently preferred embodiment of the invention. These tasks include Preparing the Recovery Plan Test 5591, Conducting the Test 5593, and Evaluating the Test 5595. The products of this step include recovery plan test method.
Task 5591: Preparing the Recovery Plan Test
Task 5591 includes planning carefully for the test. This includes evaluation of what should be tested, how it should be tested, who the test should involve, and what the expected results would be. If the recovery plan is properly tested, the risk of failure after an actual disaster is significantly reduced. Deficiencies encountered in previous tests are prime candidates for follow-up testing. This task covers all of the preparatory tasks for a test. A Test Coordinator is responsible for the planning, monitoring, and post-test evaluation.
Task 5593: Conducting the Test
Task 5593 includes briefing the test team about the test (unless it is an unannounced test) and then executing the test. As the test is conducted, documentation should be produced for evaluation following the test. A key rule in any test is to stick to the script and utilize only the documentation and resources that are available from expected sources and locations.
Task 5595: Evaluating the Test
Task 5595 includes conducting a post-test evaluation. This evaluation should reveal any oversights that may exist in the plan and verify that the recovery plan can be executed effectively. This is accomplished by a review of the test logs, expected outputs, and deviations from plan procedures. A final evaluation is written by the Test Coordinator and includes action plans that address deficiencies in the plan procedures. The recovery plan should be updated to correct the deficiencies. Deficiencies encountered during the test may reflect a lack of understanding of roles in the recovery process or unfamiliarity with plan procedures. These can be addressed by changes to the recovery plan training program.
Some areas of consideration for a business recovery product test 5590 within a netcentric environment include ensuring any extranet scenarios are tested in a business recovery environment. There may be multiple systems shared by various partners dispersed throughout the globe supporting multiple business functions, and testing the point in time interfaces between these systems can be challenging.
To conclude the "capability release build and test stage" 106 of the project, the test findings and recommendations are presented to the stakeholders for their approval 114. This task is very desirable for maintaining management awareness of and commitment to business recovery planning efforts. In the fourth stage, the "deployment stage" 108, the business recovery infrastructure is deployed by function 7170.
Step 7170 - Deploying Business Recovery Infrastructure
In this step, the plan is finalized and distributed, and arrangements are made for ongoing support. Figure 8 shows a representation of the tasks for carrying out these functions, according to the presently preferred embodiment of the invention. These tasks include Publishing the Recovery Plan 7171, Developing Summary Presentation 7173, and Providing Continuing Support 7175. The products of this step include the recovery plan.
Task 7171: Publishing the Recovery Plan
Task 7171 includes developing the final version of the recovery plan following the product test. If the plan is fairly short, complete copies may be distributed to all appropriate personnel. Large plans can be broken down to a level where team members receive only the introductory material and procedures relevant to their tasks. This is especially true of emergency response plans. Distribution of plans should be controlled by a coordinator to ensure that copies go to the right people and locations.
Task 7173: Developing Summary Presentation
Task 7173 includes preparing a formal presentation which should be given to senior management of the organization. The presentation should explain the work which was done as well as the content of the recovery plan. Visual aids and handout documentation should be used to support the presentation. Following the formal recovery plan presentation, senior management will be asked to approve the recovery plan. This sign-off should be documented in a cover letter inserted at the front of the master copy of the recovery plan.
Task 7175: Providing Continuing Support
Task 7175 includes the finalization of a schedule for future testing of plan components. This would include test time frames and objectives.
Following initial testing and deployment, a second test should typically be scheduled within twelve months. The schedule should be cleared with alternate site vendors to ensure they can accommodate the test requirements. The plan coordinator should ensure that the procedures for maintaining the recovery plan, which were developed in step 6220 (specifically tasks 6224 and 6225), are followed on a regular basis. Also, if components of the plan become obsolete due to environmental or other changes, the plan coordinator should recommend to management that a project be set up to formally revise the plan using the processes outlined in these task packages.
In addition to the method for providing the business recovery planning function, the present invention also includes a method and apparatus for providing an estimate for building a business recovery planning function in an information technology enterprise. The method and apparatus generate a preliminary work estimate (time by task) and financial estimate (dollars by classification) based on input of a set of estimating factors that identify the scope and difficulty of key aspects of the function.
Previous estimators only gave bottom line cost figures and were directed to business rather than OM functions. It could take days or weeks before an IT consultant produced these figures for the client. If the project resulted in a total cost either above or below the projected estimate, there was no way of telling who or what was responsible for the discrepancy. Therefore, a need exists for an improved estimator.
Figure 9 is a flow chart of one embodiment of a method for providing an estimate of the time and cost to build a business recovery planning function in an information technology organization. In Figure 9, a provider of a business recovery planning function such as an IT consultant, for example Andersen Consulting, obtains estimating factors from the client 202. This is a combined effort, with the provider adding expertise and knowledge to help in determining the quantity and difficulty of each factor. Estimating factors represent key business drivers for a given OM function. Table 1 lists and defines the factors to be considered along with examples of a quantity and difficulty rating for each factor.
As an illustration of a preferred embodiment of the invention, the provider determines an estimating factor for the number of business functions 202 with the help of the client. Next, the difficulty rating 204 is determined. Each of these determinations depends on the previous experience of the consultant. The provider or consultant with a high level of experience will have a greater likelihood of determining the correct number and difficulty ratings. The number and difficulty ratings are input into a computer program. In the preferred embodiment, the computer program is a spreadsheet, such as EXCEL, by Microsoft Corp. of Redmond, Washington, USA. The consultant and the client will continue to determine the number and difficulty ratings for each of the remaining estimating factors 206.
After the difficulty rating has been determined for all of the estimating factors, this information is transferred to an assumption sheet 208, and the assumptions for each factor are defined. The assumption sheet 208 allows the consultant to enter comments relating to each estimating factor and to document the underlying reasoning for a specific estimating factor.
Table 1
Next, an estimating worksheet is generated and reviewed 210 by the consultant, client, or both. An example of a worksheet is shown in Figures 10a-e. The default estimates of the time required for each task will populate the worksheet, with time estimates based on the number factors and difficulty rating previously assigned to the estimating factors that correspond to each task. The amount of time per task is based on a predetermined time per unit required for the estimating factor multiplied by a factor corresponding to the level of difficulty. Each task listed on the worksheet is described above in connection with details of the method for providing the business recovery planning function. The same numbers in the description of the method above correspond to the same steps, tasks, and task packages of activities shown on the worksheet of Figures 10a-e. The worksheet is reviewed 210 by the provider and the client for accuracy. Adjustments can be made to task level estimates by either returning to the factors sheet and adjusting the units 212 or by entering an override estimate in the 'Used' column 214 on the worksheet. This override may be used when the estimating factor produces a task estimate that is not appropriate for the task, for example, when a task is not required on a particular project.
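The default task estimate and the 'Used' override described above can be sketched in Python as follows. The difficulty ratings and their multipliers, as well as the example quantities, are hypothetical assumptions; the description does not state the actual values used on the worksheet.

```python
# Hypothetical difficulty multipliers; actual values are not given in the text.
DIFFICULTY_MULTIPLIER = {"simple": 0.75, "moderate": 1.0, "complex": 1.5}


def task_days(units, days_per_unit, difficulty, used_override=None):
    """Estimate the days required for one task.

    The default is the number of units of the estimating factor, times the
    predetermined time per unit, times the difficulty multiplier. A value
    entered in the 'Used' column replaces the default, for example 0 when
    the task is not required on a particular project.
    """
    if used_override is not None:
        return used_override
    return units * days_per_unit * DIFFICULTY_MULTIPLIER[difficulty]


# 12 business functions at an assumed 0.5 day each, rated complex.
default_estimate = task_days(12, 0.5, "complex")
# The same task overridden because it is not needed on this project.
overridden = task_days(12, 0.5, "complex", used_override=0)
print(default_estimate, overridden)
```

Checking the override against `None` rather than truthiness lets an explicit override of 0 days take effect.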
Next, the provider and the client review and adjust, if necessary, the personnel time staffing factors for allocations 216 for the seniority levels of personnel needed for the project. Referring to Figures 10a-e, these columns are designated as Partner - "Ptnr", Manager - "Mgr", Consultant - "Cnslt", and Analyst - "Anlst", respectively. These allocations are adjusted to meet project requirements and are typically based on experience with delivering various stages of a project. It should be noted that the staffing factors should add up to 1.
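As an illustration of the staffing factors, the following Python sketch splits a task's estimated days across the four seniority levels and rejects factors that do not add up to 1. The particular shares shown are hypothetical.

```python
import math


def allocate_days(task_days, staffing_factors):
    """Split a task's days across seniority levels.

    staffing_factors maps a level (Ptnr, Mgr, Cnslt, Anlst) to its share
    of the task; the shares should add up to 1.
    """
    total = sum(staffing_factors.values())
    if not math.isclose(total, 1.0):
        raise ValueError(f"staffing factors add up to {total}, expected 1")
    return {level: task_days * share for level, share in staffing_factors.items()}


# Hypothetical allocation for a 10-day task.
factors = {"Ptnr": 0.05, "Mgr": 0.15, "Cnslt": 0.40, "Anlst": 0.40}
allocation = allocate_days(10, factors)
print(allocation)
```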
The consultant or provider and the client then review the workplan 218, and may optionally include labor to be provided by the client. In one embodiment, the workplan contains the total time required in days per stage and in days per task required to complete the project. Tasks may be aggregated into a "task package" of subtasks or activities for convenience. A worksheet, as shown in Figures 10a-e, may also be used for convenience. This worksheet may be used to adjust tasks or times as desired, from the experience of the provider, the customer, or both.
Finally, a financial estimate is generated in which the provider and client enter the agreed-upon billing rates for Ptnr, Mgr, Cnslt, and Anlst 220. The total estimated payroll cost for the project will then be computed and displayed, generating final estimates. A determination of out-of-pocket expenses 222 may then be applied to the final estimates to determine a final project cost 224. Preferably, the provider will review the final estimates with an internal functional expert 226. Other costs may also be added to the project, such as hardware and software purchase costs, project management costs, and the like. Typically, project management costs for managing the provider's work are included in the estimator. These are task dependent and usually run between 10 and 15% of the tasks being managed, depending on the level of difficulty. These management allocations may appear on the worksheet and work plan. The time allocations for planning and managing a project are typically broken down for each of a plurality of task packages where the task packages are planning project execution 920, organizing project resources 940, controlling project work 960, and completing project 990, as shown in Figure 10a.
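The financial roll-up can be illustrated with the following Python sketch. It assumes, hypothetically, that billing rates are expressed per day, that the project management allocation is applied as a fraction of labor cost, and that out-of-pocket expenses are a single lump sum; the rates and day counts are invented for the example.

```python
def project_cost(days_by_level, daily_rate, pm_share=0.10, expenses=0.0):
    """Compute a final project cost from staffed days and billing rates.

    days_by_level: seniority level -> total days from the workplan
    daily_rate:    seniority level -> agreed billing rate per day (assumed unit)
    pm_share:      project management allocation as a fraction of labor
                   cost (the description cites a typical 10 to 15% range)
    expenses:      out-of-pocket expenses added to the estimate
    """
    labor = sum(days * daily_rate[level] for level, days in days_by_level.items())
    management = labor * pm_share
    return labor + management + expenses


days = {"Ptnr": 1, "Mgr": 3, "Cnslt": 8, "Anlst": 8}
rates = {"Ptnr": 4000, "Mgr": 2500, "Cnslt": 1500, "Anlst": 1000}
total = project_cost(days, rates, pm_share=0.10, expenses=2000)
print(round(total, 2))
```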
It will be appreciated that a wide range of changes and modifications to the methods as described are contemplated. Accordingly, while preferred embodiments have been shown and described in detail by way of examples, further modifications and embodiments are possible without departing from the scope of the invention as defined by the examples set forth. It is therefore intended that the invention be defined by the appended claims and all legal equivalents.