Disaster Recovery Planning Requires More Than Scheduled Backups
Some disasters are big, like hurricanes and blackouts, and affect many businesses at once. Others are small, like a server crash, and affect only yours. In either scenario, you need a strategy to get back in business. Restoring backups is only part of the challenge.
Lost Data and Manual Workarounds
When systems go down, even if you have reliable backups, you’ll lose any data that wasn’t captured in the most recent backup. You may need a way to recreate that data. If any transactions were in progress when the system went down, you’ll need to identify them and bring them to a consistent, completed state. If the systems can’t be restarted quickly, you need manual, alternate processes for conducting business, plus a way to enter the data from those processes into the systems once they come back online.
You don’t want to have to figure that out in the middle of a crisis, so your business should create a comprehensive disaster recovery (DR) plan. The plan needs to spell out the detailed procedures to be followed when systems go down. This includes specifying the sequence of application restarts and any application-specific recovery processes.
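One way to make the restart sequence explicit is to record which applications depend on which, and derive the order from that. The following is a minimal sketch, assuming Python 3.9 or later; the application names and their dependencies are hypothetical placeholders, not a recommendation for any particular environment.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical applications, each mapped to the services it depends on.
DEPENDENCIES = {
    "database": set(),
    "message_queue": set(),
    "auth_service": {"database"},
    "order_service": {"database", "message_queue", "auth_service"},
    "web_frontend": {"auth_service", "order_service"},
}


def restart_sequence(dependencies):
    """Return an order in which every application starts after its dependencies."""
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as err:
        # err.args[1] holds the applications that form the cycle.
        raise ValueError(f"circular dependency in restart plan: {err.args[1]}") from err


if __name__ == "__main__":
    for step, app in enumerate(restart_sequence(DEPENDENCIES), start=1):
        print(f"{step}. restart {app}")
```

Keeping the dependency map alongside the written plan means that adding a new application forces someone to state what it depends on, and a circular dependency is reported before a crisis rather than during one.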
Test Plans to Identify Omissions
Writing a plan down is a good start, but it’s easy to overlook steps. Companies should test their DR plans before there’s a crisis. Testing exposes missing steps and configuration changes that never made it into the plan, and it shows how long the recovery tasks actually take. Updating the plan with this information makes a real disaster simpler to handle.
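Feeding test results back into the plan can be as simple as comparing what the plan said against what the test team recorded. The sketch below assumes the plan and the test log are both available as step-name-to-minutes mappings; the step names and durations are invented for illustration.

```python
def review_test_run(planned_minutes, measured_minutes):
    """Compare a DR test log against the plan and report what needs updating."""
    planned = set(planned_minutes)
    measured = set(measured_minutes)
    return {
        # Steps the team had to perform that the plan never mentioned.
        "missing_from_plan": sorted(measured - planned),
        # Steps in the plan that the test did not exercise.
        "not_exercised": sorted(planned - measured),
        # Steps that took longer than estimated: (planned, measured).
        "overruns": {
            step: (planned_minutes[step], measured_minutes[step])
            for step in sorted(planned & measured)
            if measured_minutes[step] > planned_minutes[step]
        },
    }


if __name__ == "__main__":
    # Hypothetical plan estimates and test measurements, in minutes.
    plan = {"restore database": 45, "restart app servers": 15, "verify order entry": 10}
    log = {"restore database": 70, "restart app servers": 12, "update DNS records": 20}
    for finding, detail in review_test_run(plan, log).items():
        print(f"{finding}: {detail}")
```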
Although the idea of testing DR plans is simple, carrying out a test is a challenge. The only way to run a true test is to shut down a production system and execute the steps in the plan. Because this necessarily impacts business operations, these tests need to be carefully scheduled and coordinated with both business and IT teams. A desk walkthrough of the plan with everyone who would be affected is an alternative; it can identify some omissions, but it’s not as thorough as executing the plan.
Other test strategies fall between desk walkthroughs and full outage simulations; they have less impact on operations but are also less effective at identifying issues in the recovery plan. One strategy is to incorporate DR testing into scheduled maintenance windows, since the system will already be down. Another strategy is to focus on application-level tests that bring down and recover applications separately. That approach doesn’t stress the system to the same extent as a full DR test but does help application support teams understand their application’s recovery process.
Use Technology to Support Disaster Recovery
Companies can increase the odds their DR process will succeed by using technology and services appropriately. A reliable backup and restoration process is critical; rather than relying on ad-hoc scripts, companies should use backup tools and services that provide monitoring and alert when critical backups fail. Configuration management tools can help ensure that new servers, storage devices, and applications are not accidentally omitted from the backups.
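The check that nothing has been left out of the backups can be illustrated by comparing an inventory of systems against the record of successful backups. This is only a sketch: the host names are hypothetical, the 24-hour limit is an assumed policy, and in practice the inventory would come from your configuration management tool and the timestamps from your backup tool.

```python
from datetime import datetime, timedelta, timezone

# Assumed policy: every system needs a successful backup within the last 24 hours.
MAX_BACKUP_AGE = timedelta(hours=24)


def backup_gaps(inventory, last_success, now, max_age=MAX_BACKUP_AGE):
    """Return (never_backed_up, stale) host lists for the given inventory."""
    never_backed_up = sorted(host for host in inventory if host not in last_success)
    stale = sorted(
        host
        for host in inventory
        if host in last_success and now - last_success[host] > max_age
    )
    return never_backed_up, stale


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical inventory and last-successful-backup timestamps.
    inventory = ["app01", "db01", "files01"]
    last_success = {
        "app01": now - timedelta(hours=2),
        "db01": now - timedelta(hours=30),
    }
    never_backed_up, stale = backup_gaps(inventory, last_success, now)
    print("No backup on record:", never_backed_up)
    print("Backup older than policy allows:", stale)
```

Running a comparison like this on a schedule, and alerting when either list is non-empty, catches a newly added server before it becomes the one system you can’t restore.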
Another way to ensure disaster recovery success is to work with an IT services firm like Prescient Solutions. Our team will take a strategic look at your business and design a comprehensive strategy covering hardware, software, and backup and recovery processes. We can implement the strategy for you, and monitor and oversee its operational status. We’ll help you create your DR test plan and execute it to make sure it succeeds. Contact us for a free disaster recovery assessment.