Virtualizing Disaster Recovery Using Cloud Computing


IBM Global Technology Services
Thought Leadership White Paper

Virtualizing disaster recovery using cloud computing
Protect your applications quickly with a resilient cloud

January 2013

Contents

- Executive summary
- Traditional disaster recovery: a choice between cost and speed
- The pressure for continuous availability
- Thinking in terms of interruptions and not disasters
- Cloud-based business resilience: a welcome, new approach
- Facilitating improved control with portal access
- Building confidence and refining disaster recovery plans with more frequent testing
- Supporting optimized application recovery times with tiered service levels
- More efficiently supporting mixed environments with virtualized disaster recovery
- Enabling bandwidth savings with a local presence
- Coexisting more effectively with traditional disaster recovery
- Conclusion

Executive summary

Almost from the beginning of widespread adoption of computers, organizations realized that disaster recovery was a necessary component of their information technology (IT) plans. Business data had to be backed up, and key processes like order entry, billing, payroll and procurement needed to continue even if an organization’s data center was disabled due to a disaster. Over time, two distinct disaster recovery models emerged: dedicated and shared. Although both of these approaches were effective, they often forced organizations to choose between cost and speed.

As we fast forward 50 years to today’s “always-on” world, it is apparent that the flow of information and commerce in our global business environment never sleeps. With the demands of an around-the-clock world, organizations need to think in terms of application continuity in the face of interruptions, not just as a result of infrequent disasters. Likewise, disaster recovery service providers need to enable more seamless, nearly instantaneous failover and failback of critical business applications. Yet given the reality that most IT budgets are flat or even lower than they once were, organizations must be able to obtain these services without incurring significant up-front or ongoing expenditures.

Cloud-based business resilience can provide an attractive alternative to traditional disaster recovery, offering both the shorter recovery time associated with a dedicated infrastructure and the reduced costs that are consistent with a shared recovery model. With pay-as-you-go pricing and the ability to scale up as conditions change, cloud computing can help organizations meet the expectations of today’s frenetic, fast-paced environment where IT demands continue to increase but budgets do not.

This white paper discusses traditional approaches to disaster recovery and describes how organizations can use cloud computing to help plan for both the mundane interruptions to service (such as cut power lines, server hardware failures and security breaches) and less frequent disasters. The paper examines key factors you should consider when planning for the transition to cloud-based business resilience and in selecting your cloud partner.

Traditional disaster recovery: a choice between cost and speed

As shown in Figure 1, when choosing a disaster recovery approach, organizations have traditionally based their decision on the level of service required as measured by two recovery objectives:

- Recovery time objective (RTO): the amount of time between an outage and the restoration of operations
- Recovery point objective (RPO): the point in time when data is restored, which reflects the amount of data that ultimately can be lost during the recovery process.

Figure 1. Measuring level of service required by RPO and RTO (RPO: how much data is lost; RTO: how long to recover; each on a scale of days, hours and minutes)

In traditional disaster recovery models (dedicated and shared), organizations are forced to make the tradeoff between cost and speed to recovery, as illustrated in Figure 2.

Figure 2. Traditional disaster-recovery approaches include shared and dedicated models (cost, low to high, plotted against speed to recovery, weeks to minutes)

In a dedicated model, the infrastructure is dedicated to a single organization. This type of disaster recovery can offer a faster time to recovery compared to other traditional models because the IT infrastructure is duplicated at the disaster recovery site and is ready to be called upon in the event of a disaster. Although this model can reduce RTO because the hardware and software are preconfigured, it does not eliminate all delays. The process is still dependent on receiving a current data image, which involves transporting physical tapes and a data restoration process. This approach is also costly because the hardware sits idle when not being used for disaster recovery. Some organizations use the backup infrastructure for development and test to mitigate the cost, but that introduces additional risk into the equation. Finally, the data restoration process adds variability into the process. As illustrated in Figure 3, data restoration can take up to 72 hours including the tape retrieval, travel and loading process.
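The two recovery objectives defined in this section can be checked against a real or rehearsed recovery event with simple date arithmetic. The Python sketch below is illustrative only: the function name `measure_recovery`, the timestamps and the objective values are assumptions made for this example and are not part of any IBM tool.

```python
from datetime import datetime, timedelta


def measure_recovery(last_good_copy, outage_start, service_restored):
    """Return (achieved_rpo, achieved_rto) as timedeltas.

    achieved_rpo: age of the restored data image when the outage began,
                  that is, how much data is lost.
    achieved_rto: time between the outage and the restoration of operations.
    """
    achieved_rpo = outage_start - last_good_copy
    achieved_rto = service_restored - outage_start
    return achieved_rpo, achieved_rto


# Hypothetical event: a nightly tape backup, an outage the next afternoon,
# and service restored two days later after tape retrieval and loading.
rpo, rto = measure_recovery(
    last_good_copy=datetime(2013, 1, 14, 23, 0),
    outage_start=datetime(2013, 1, 15, 15, 0),
    service_restored=datetime(2013, 1, 17, 15, 0),
)

# Hypothetical objectives chosen by the business.
rpo_objective = timedelta(hours=24)
rto_objective = timedelta(hours=4)

print("Achieved RPO:", rpo, "| objective met:", rpo <= rpo_objective)
print("Achieved RTO:", rto, "| objective met:", rto <= rto_objective)
```

In this made-up event the data loss stays within the 24-hour tolerance, but the two-day restoration misses the 4-hour recovery objective, which is the kind of gap the tape-based process described above can produce.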

Figure 3. Time to recovery using a dedicated infrastructure (timeline from interruption to recovery: declaration, HW setup, SW setup, data restore; labels: 6 hrs or less; data restore 4 - 72 hrs)

In a shared disaster recovery model, the infrastructure is shared among multiple organizations. Shared disaster recovery is designed to be more cost effective because the off-site backup infrastructure is shared among multiple organizations. After a disaster is declared, the hardware, operating system and application software at the disaster site must be configured from the ground up to match the IT site that has declared a disaster, and this process can take hours or even days. In addition, the data restoration process must be completed as shown in Figure 4, resulting in an average of 48 to 72 hours to recovery.

Figure 4. Time to recovery using a shared infrastructure (timeline from interruption to recovery: declare, min 4 hrs; HW setup, min 8-24 hrs; SW setup, min 4 hrs; data restore, 4 - 72 hrs)

The pressure for continuous availability

According to the IBM 2011 chief information officer (CIO) study, organizations are being challenged to keep up with the growing demands on their IT departments while keeping their operations up and running and making them as efficient as possible. Furthermore, users and customers are becoming more technologically sophisticated. Research shows that usage of Internet-connected devices is growing about 42 percent annually, giving clients and employees the ability to quickly access huge amounts of storage. However, in spite of the pressure to do more, organizations are spending a large percentage of their funds to maintain their existing infrastructures. At the same time, their IT budgets remain essentially flat.[1]

With dedicated and shared disaster recovery models, organizations have traditionally been forced to make tradeoffs between cost and speed. As the pressure to achieve continuous availability and reduce costs continues to increase, organizations can no longer accept tradeoffs. Although disaster recovery was originally intended for critical batch “back-office” processes, many organizations are now dependent on real-time applications and an online presence as the primary interface to their customers. Any downtime reflects directly on their brand image, and customers view any interruption of key applications such as e-commerce, online banking and customer self-service as being unacceptable. As a result, the cost of a minute of downtime may be thousands of dollars.
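To make the cost of downtime concrete, the arithmetic can be sketched in a few lines of Python. Everything here is an assumption for illustration: the function name `downtime_cost` and the figure of 2,000 dollars per minute are hypothetical, and the 48-hour outage is simply the low end of the shared-model average quoted above.

```python
def downtime_cost(outage_hours, cost_per_minute):
    """Estimate the direct cost of an outage as duration times a flat rate.

    A flat per-minute rate is a simplification; real losses also include
    harder-to-measure effects such as damage to brand image.
    """
    return outage_hours * 60 * cost_per_minute


# Hypothetical rate: 2,000 dollars per minute of downtime.
rate = 2_000

for hours in (48, 6, 1):
    print(f"{hours:>2} h outage: ${downtime_cost(hours, rate):,}")
```

Even with a modest assumed rate, the difference between a recovery measured in days and one measured in hours dominates the result, which is why the tradeoff between cost and speed becomes harder to accept.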

Thinking in terms of interruptions and not disasters

Traditional disaster recovery methods rely on “declaring a disaster” in order to use the backup infrastructure during events such as hurricanes, tsunamis, floods or fires. However, most application availability interruptions are due to more mundane everyday occurrences. Although organizations need to plan for the worst, they also must plan for the more likely: cut power lines, server hardware failures and security breaches.

Figure 5. Types of business interruptions (weather 54%, power 19%, HW/SW 17%, other 10%)

Figure 5 shows the kinds of disruptions IBM has helped its customers respond to over the past few years. Although weather is the root cause of just over half of the disasters declared, almost 50 percent of the declarations are due to other causes.

These statistics are from IBM clients who actually declared a disaster, but organizations also experience interruptions for which they do not declare a disaster. In an around-the-clock world, organizations must move beyond disaster recovery and think in terms of application continuity. It is crucial that they plan for the recovery of critical business applications rather than for infrequent, momentous disasters, and build resiliency plans accordingly.

Cloud-based business resilience: a welcome, new approach

Cloud computing offers an attractive alternative to traditional disaster recovery. The cloud is inherently a shared infrastructure: a pooled set of resources with the infrastructure cost distributed across everyone who contracts for the cloud service. This shared nature makes cloud an ideal model for disaster recovery. Even with a broader definition of disaster recovery that includes more mundane service interruptions, the need for disaster recovery resources is sporadic. Because all of the organizations relying on the cloud for backup and recovery are very unlikely to need the infrastructure at the same time, costs can be reduced and the cloud can speed recovery time.

Cloud-based business resilience managed services like IBM SmartCloud Virtualized Server Recovery are designed to balance economical, shared physical recovery with the speed of a dedicated infrastructure. Because the server images and data are continuously replicated, recovery time can be reduced dramatically to less than an hour, and on a machine basis, to only minutes per server. However, the costs are moderated by the shared model.

Figure 6. Speed to recovery using cloud computing (timeline from interruption to recovery: declaration, HW setup, SW setup, data restore)

Figure 7. A cloud-based approach to business resilience (cost, low to high, plotted against speed to recovery, weeks to minutes, for the dedicated, shared and virtualized-using-cloud models)

Cloud-based business resilience offers benefits over traditional disaster recovery models:

- More predictable monthly operating expenses can help you reduce the unexpected and hidden costs of do-it-yourself approaches.
- Having the disaster recovery infrastructure in the cloud can help you reduce up-front capital expenditure requirements.
- You can more easily scale up cloud-based business resilience managed services based on changing conditions.
- Portal access reduces the need to travel to the recovery site, which can help you save time and money.

Although the cloud offers multiple benefits as a disaster recovery platform, there are several other advantages that a cloud-based business resilience solution should provide, including:

- Easier-to-use portal access with failover and failback capability
- Support for disaster recovery testing
- Tiered service levels
- Support for mixed and virtualized server environments
- Global reach and local presence
- Migration from and coexistence with traditional disaster recovery

Figure 8. IBM SmartCloud Virtualized Server Recovery portal (diagram: client servers and storage, the IBM SmartCloud Virtualized Server Recovery portal, and the cloud hosted at IBM Resiliency Centers)

The next few sections describe these considerations in greater detail.

Facilitating improved control with portal access

Disaster recovery has traditionally been an insurance policy that organizations hope not to use. In contrast, cloud-based business resilience can actually increase IT’s ability to provide service continuity for key business applications. Because the cloud-based business resilience service is accessed through a web portal, IT management and administrators gain a dashboard view to their organization’s infrastructure.

For example, clients can access the SmartCloud Virtualized Server Recovery portal via the Internet and identify which of their servers they want to protect and replicate. Through this portal, customers can download the SmartCloud Virtualized Server Recovery client software to install on their covered servers. Once the environment is defined through the portal, users can view the protection status of their servers, generate reports and conduct other administrative tasks.

Figure 9. An administrative view of the recovery portal

Although having an administrative view through a portal is useful, it is critical that the portal also provides the opportunity to initiate a failover and failback. With SmartCloud Virtualized Server Recovery, clients can use the portal to fail over in near real-time (for the “always available” service-level protected servers described later), reducing the need to contact the cloud service provider (IBM in this case) to declare a disaster or to initiate the failover. With the ability to fail over from the portal and not need a formal disaster declaration, IT can be much more responsive to the more mundane outages and interruptions previously described.

Building confidence and refining disaster recovery plans with more frequent testing

One traditional challenge of disaster recovery is the lack of certainty that the planned solution will work when it is most needed. Typically, organizations test their failover and recovery only once or twice per year on average, which is hardly sufficient given the pace of change that most IT departments experience. As a result of this lost sense of control, some organizations have brought disaster recovery in house, diverting critical IT focus from mainline application development.

Figure 10. DR Testing view with IBM SmartCloud Virtualized Server Recovery

Cloud-based business resilience provides the opportunity for more control and more frequent and granular testing of disaster recovery plans, even at the server or application level. SmartCloud Virtualized Server Recovery provides a disaster recovery testing view in the portal so that IT can test the failover and failback processes more frequently.

Clients can generally tailor testing to their schedule. For example, a critical e-commerce application can be tested prior to a peak online shopping period such as Cyber Monday. Or an online banking system can be tested after a version upgrade in order to assess if the failover and failback processes still work seamlessly.

Supporting optimized application recovery times with tiered service levels

Cloud-based business resilience offers the opportunity for tiered service levels that help organizations to differentiate applications based on their importance to the organization and the associated tolerance for downtime. For example, SmartCloud Virtualized Server Recovery provides two premium service-level options: gold and silver. These tiers enable organizations to optimize their spending, paying more for mission-critical applications that require nearly continuous availability and paying less for noncritical applications.

SmartCloud Virtualized Server Recovery service levels

Service level: Gold (always-available virtual machine)
  RTO (until system boot start): “Minutes per server” is typically less than an hour; full RTO is dependent upon configurations
  Description: For mission-critical applications that require near-zero RTO/RPO and that need a recovery infrastructure with near-continuous availability for use beyond recovery services

Service level: Silver (disaster and test virtual machine)
  RTO (until system boot start): Same as Gold service when servers are immediately available
  Description: For applications that need rapid recovery in minutes and that need a cloud recovery infrastructure that is remotely accessible at the time of disaster
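One way to think about tiering is as a simple rule that maps each application's tolerance for downtime to a service level. The Python sketch below is a hypothetical illustration only: the `Application` class, the `choose_service_level` function and the 60-minute threshold are assumptions for this example and do not reflect how IBM assigns gold or silver service.

```python
from dataclasses import dataclass


@dataclass
class Application:
    name: str
    tolerable_downtime_minutes: int  # how long the business can run without it


def choose_service_level(app, gold_threshold_minutes=60):
    """Pick a tier from downtime tolerance.

    The threshold is an assumed planning value: applications that cannot
    tolerate more than this much downtime go to the higher-cost gold tier.
    """
    if app.tolerable_downtime_minutes <= gold_threshold_minutes:
        return "gold"
    return "silver"


# Hypothetical portfolio.
portfolio = [
    Application("e-commerce storefront", 5),
    Application("online banking", 15),
    Application("internal reporting", 240),
]

for app in portfolio:
    print(f"{app.name}: {choose_service_level(app)}")
```

Keeping the rule explicit makes it easier to revisit the spending decision when an application's importance to the organization changes.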

With SmartCloud Virtualized Server Recovery, the frequency of the data replication and the resulting RPO and RTO are based upon the service level assigned to the server. Multiple servers supporting the same application and business process can be collectively assigned the same group and service level to help provide consistency and synchronization for failover and failback operations.

More efficiently supporting mixed environments with virtualized disaster recovery

The notion of a “server image” is an important part of traditional disaster recovery. As the complexity of IT departments has increased, including multiple server farms with possibly different operating systems (OS) and OS levels, the ability to respond to a disaster or outage becomes more complex. Organizations are often forced to recover on different hardware, which can take longer and increase the possibility of errors and data loss. Organizations are implementing virtualization technologies in their data centers to help remove some of the underlying complexity and optimize infrastructure utilization caused by the growing number of virtual machines installed over the past several years. According to a recent IBM survey of CIOs, 98 percent of respondents either had already implemented virtualization or had plans to implement it within the next 12 months.[2]

Cloud-based business resilience solutions must offer both physical-to-virtual (P2V) and virtual-to-virtual (V2V) recovery in order to support these types of environments. SmartCloud Virtualized Server Recovery supports virtualized, non-virtualized and mixed environments, including those with multiple operating systems.

Enabling bandwidth savings with a local presence

Cloud-based business resilience requires ongoing server replication, making network bandwidth an important consideration when adopting this approach. A global provider like IBM offers the opportunity for a local presence, thereby reducing the distance that data must travel across the network. With SmartCloud Virtualized Server Recovery, the client’s server configuration, operating system, application software and associated data are replicated to the IBM Resiliency Center across the Internet or designated network connection. Although data will be replicated to the closest IBM Resiliency Center running SmartCloud Virtualized Server Recovery, added resiliency and backup can be provided within the IBM network of secure centers.
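A rough, first-pass feel for the bandwidth that ongoing replication needs can be obtained from the volume of data that changes each day. The Python sketch below is a back-of-the-envelope calculation under stated assumptions: the function name `required_mbps`, the decimal gigabyte convention and the sample change volumes are all hypothetical, it ignores compression, protocol overhead and the initial synchronization, and it is not the method used by any IBM tool.

```python
def required_mbps(changed_gb_per_day, replication_window_hours=24.0):
    """Average throughput (megabits per second) needed to ship one day's
    changed data within the given window.

    Assumes decimal units (1 GB = 10**9 bytes) and an evenly spread load;
    real change rates are bursty, so peak needs will be higher.
    """
    bits = changed_gb_per_day * 8 * 10**9
    seconds = replication_window_hours * 3600
    return bits / seconds / 10**6


# Hypothetical daily change volumes for a group of protected servers.
for gb in (10, 50, 200):
    print(f"{gb:>3} GB/day -> {required_mbps(gb):.2f} Mbps sustained")
```

An estimate like this only bounds the sustained average; a provider-supplied assessment remains the better basis for sizing a network connection.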

IBM offers a SmartCloud Virtualized Server Recovery Synchronization and Bandwidth Estimator to assist with the assessment of network bandwidth requirements. The estimator can confirm your capacity needs even though many of our clients may not need to increase their capacity.

Clients should identify all servers that support a single business application and include those servers in a single Virtualized Server Recovery plan. The solution can provide cross-server consistency for failover and failback, helping to enhance security and reduce risk.

Coexisting more effectively with traditional disaster recovery

Although cloud-based business resilience offers many advantages for mission-critical and customer-facing applications, an efficient enterprise-wide disaster recovery plan will likely include a blend of traditional and cloud-based approaches. SmartCloud Virtualized Server Recovery can help ease the transition from traditional methods by allowing clients to use it in conjunction with existing data back-up solutions like IBM SmartCloud Managed Backup or other traditional tape-based recovery methods.

In a recent study, respondents indicated that reducing data loss was the most important objective of a successful disaster recovery solution.[3] With coordinated disaster recovery and data back-up, data loss can be reduced and reliability of data integrity improved.

Conclusion

Cloud computing offers a compelling opportunity to realize the recovery time benefits of dedicated disaster recovery with the cost structure benefits of shared disaster recovery. However, disaster recovery planning is not something that should be taken lightly; cloud security and resiliency are critical considerations. SmartCloud Virtualized Server Recovery is hosted within the IBM network of Resiliency Centers, so clients can feel confident that IBM is helping to protect their sensitive data. In addition, there is no need to rush in. Clients can start to work with SmartCloud Virtualized Server Recovery with as few as five virtual machines under managed contract, so getting started is easier and relatively risk free.

With more than 1,800 dedicated business continuity professionals and more than 160 business resilience centers located around the world, respected industry analysts recognize IBM as a leader in business continuity and resilience. Our virtually unparalleled experience is based on more than 50 years of business resilience and disaster recovery experience and more than 9,000 disaster recovery clients. Further, IBM has been in the systems business for 60 years, and just about no other company understands systems and security like IBM does. Using our vast business process and technology expertise, we can help you design and implement a business resilience solution that meets your organization’s needs.


For more information

To learn more about virtualizing disaster recovery and managing business resiliency, please contact your IBM marketing representative or IBM Business Partner, or visit the following website: ibm.com/services/continuity

Additionally, IBM Global Financing can help you acquire the IT solutions that your business needs in the most cost-effective and strategic way possible. We’ll partner with credit-qualified clients to customize an IT financing solution to suit your business goals, enable effective cash management, and improve your total cost of ownership. IBM Global Financing is your smartest choice to fund critical IT investments and propel your business forward. For more information, visit: ibm.com/financing

Notes

[1] IBM 2011 CIO study
[2] IBM 2011 CIO study
[3] IBM 2011 CIO study

Copyright IBM Corporation 2013

IBM Corporation
IBM Global Services
Route 100
Somers, NY 10589

Produced in the United States of America
January 2013

IBM, the IBM logo, ibm.com, and SmartCloud are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the web at “Copyright and trademark information” at ibm.com/legal/copytrade.shtml

This document is current as of the initial date of publication and may be changed by IBM at any time. Not all offerings are available in every country in which IBM operates.

THE INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS” WITHOUT ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING WITHOUT ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OR CONDITION OF NON-INFRINGEMENT. IBM products are warranted according to the terms and conditions of the agreements under which they are provided.

BUW03013-USEN-04
