Best Practices for Building an Enterprise Disaster Recovery Playbook

About a year ago I started to contemplate writing a book on the topic of ‘Backup Redesign for Enterprise Organizations’. I even went so far as to register the domain name www.backupredesign.com in anticipation of writing and releasing a book on that topic. Fast forward to today and I am still examining how to best tackle the specific subject of disaster recovery (DR) in such a manner that it meets the needs of enterprise organizations.

Having once been responsible for hundreds of TBs of storage (which was a lot not that long ago) and mission critical applications at an enterprise data center, I do not think it is possible to write a book that authoritatively and comprehensively covers everything that every enterprise data center needs to successfully execute a DR plan.

Enterprise organizations have far too much data, with requirements for DR that are far too diverse, for me to believe that a universal playbook can be written that addresses all of their concerns. In fact, many of them would be happy just to have an internal playbook that meets their specific DR needs.

However, enterprises that seek to put together their own playbook struggle to do so. A wide variety of DR technologies are now available, but these technologies often solve only a specific segment of an enterprise's overall DR problems and do not provide a comprehensive DR solution.
 
This suspicion was recently confirmed again by SEPATON. It has been talking to end users from large organizations and the majority of them are still grappling with developing DR strategies and then deploying the right technologies that address their particular sophisticated requirements for DR.

However, this does not mean that enterprise organizations have to assume their situation is hopeless. There are some best practices that they can follow to develop an appropriate strategy that will then help them select the right technologies for their environment. These best practices include:

  • Plan for enterprise class performance and capacity requirements. Enterprise organizations tend to habitually underestimate the number of applications in their environment that they need to protect and recover. While IT architects are understandably trying to be conservative and not waste precious IT dollars and resources, once a DR solution is proven successful, everyone who was standing on the sidelines comes forward and wants to take advantage of the solution.

This is why it is imperative that enterprises select solutions that can scale to meet the enterprise performance and capacity requirements that will surely be thrust upon them. Look for architectures that can scale nodes for performance and storage for capacity. Solutions such as those from SEPATON make it possible to put together a 5-year DR plan that actually makes sense.

  • Ensure you can meet recovery time objectives (RTOs). Successfully protecting and then recovering data is not necessarily good enough if it cannot be done in a time frame that meets defined RTOs. Organizations need to ensure that whatever DR solution or solutions they select align with the RTOs of the many applications they are responsible for protecting. This can be particularly challenging with deduplication solutions, which often negatively impact recovery performance. Consider solutions that use forward referencing or SEPATON’s DeltaCache Recovery™, which provide the fastest restore performance on the newest data.
  • Monitor changing requirements. This may prove to be one of the most challenging aspects of implementing an appropriate DR solution or solutions. Neither DR strategies nor the applications being protected are static. The criticality of applications changes over time: some become more important to recover quickly while others decrease in importance. Whenever possible, organizations should look to use a single platform that meets the recovery requirements of as many of their applications as possible, which minimizes the need to change their recovery approach as priorities shift.
  • Pay as you grow. This has become one of the emerging requirements for DR solutions in the last decade – the ability to start small, prove it can work and then scale it out so enterprises can pay as they grow. Look for products that offer grid storage architectures that can independently scale either capacity or performance.
  • Aim to reduce OPEX for IT administrators. IT administrators are already pulled in multiple directions and, with the uncertainty surrounding Cap and Trade and Healthcare Reform at the federal level, most organizations are hesitant to hire more employees. This makes it even more important for enterprise organizations to select solutions that are familiar to IT administrators and require little time to set up and manage. Automation is no longer a luxury but a necessity. Your DR playbook should favor robust storage management: thin provisioning, dynamic capacity expansion and the kind of reporting tools that truly help to monitor and plan the storage environment.
  • Know your bandwidth requirements. An excerpt from a recent Q3 2009 Forrester Research study revealed that the majority of enterprises have at least one data center, are increasing the distance between their production and recovery data centers and are using their recovery data center for multiple purposes.

That is great news, but it only works if there is sufficient network bandwidth to pass all of the data from one site to the other. Technologies that measure the rate of data change can give enterprises insight into how much network bandwidth they need, while technologies like deduplication can reduce the amount of data they need to send and make more effective use of the bandwidth that is available.

  • Test early and often. Enterprises need to have defined testing plans and then stick with them. This is often easier said than done but the best way to ensure this happens is to automate and simplify as much of the DR process as possible. Part of this includes making sure the data needed for testing is where it needs to be in advance of the start of the test using technologies such as replication.
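
The bandwidth practice above comes down to simple arithmetic, and a rough sketch can make it concrete. The Python function below is illustrative only: the function name, its parameters and the default 80% link efficiency are assumptions made for this example, not figures from SEPATON, Forrester or any product. It estimates the sustained link speed needed to move one day's changed data to the recovery site within a replication window, optionally reduced by a deduplication ratio.

```python
def required_bandwidth_mbps(changed_gb_per_day, dedup_ratio=1.0,
                            window_hours=24.0, link_efficiency=0.8):
    """Estimate the sustained link speed (megabits per second) needed to
    replicate one day's changed data within the replication window.

    All parameters are hypothetical planning inputs:
      changed_gb_per_day -- measured daily change, in decimal gigabytes
      dedup_ratio        -- e.g. 5.0 means 5:1 reduction before sending
      window_hours       -- hours per day available for replication
      link_efficiency    -- usable fraction of the link (assumed 0.8)
    """
    if changed_gb_per_day < 0:
        raise ValueError("changed_gb_per_day must not be negative")
    if dedup_ratio < 1.0:
        raise ValueError("dedup_ratio must be at least 1.0")
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    if not 0 < link_efficiency <= 1:
        raise ValueError("link_efficiency must be in the range (0, 1]")

    # Gigabytes -> bits, reduced by whatever deduplication achieves.
    bits_to_send = changed_gb_per_day * 1e9 * 8 / dedup_ratio
    window_seconds = window_hours * 3600

    # Divide by efficiency because only part of the raw link is usable.
    return bits_to_send / window_seconds / link_efficiency / 1e6


if __name__ == "__main__":
    # Example inputs only: 500 GB of daily change, 5:1 deduplication,
    # and an 8-hour overnight replication window.
    mbps = required_bandwidth_mbps(500, dedup_ratio=5.0, window_hours=8.0)
    print(f"Estimated link speed needed: {mbps:.1f} Mbps")
```

Running the same calculation with and without a deduplication ratio shows how much of the link requirement deduplication might offset. Real change rates and achievable deduplication ratios vary by application, so they should be measured rather than assumed.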

There is still no secret DR playbook that enterprises can pull out of their back pocket and expect to magically work for their environment. However, best practices for developing a workable DR strategy do exist, and organizations can leverage them to build a DR playbook that works for them. In so doing, they can remove the confusion that persists around DR, build a strategy and select the technologies that fit their environment.
