It is only during a real incident that the weaknesses of your DR plan are realised.
Your DR plan covers the transition to a disaster footing, but what happens when the floodwaters recede, you regain access to your premises after the fire, or the power comes back on? Something that is often missed in DR plans is the process of returning to normal. How do you transition from your disaster footing to your normal systems and infrastructure? Have you documented the steps you need to return as carefully as you designed your emergency actions? We have seen businesses with an otherwise terrific plan come unstuck when normality is within reach.
Unlike an emergency situation, you may feel that time is on your side. Your DR plan has worked, you’ve probably had a few days or weeks of thinking on your feet and you have trodden the tightrope between excitement and fear. In a disaster situation, IT professionals are astonishingly adept at keeping going on a mix of caffeine and adrenalin.
When the urgency is over, the adrenalin leaves your body, and you are likely to be left feeling drained. There is still a lot to do, though, and it will help enormously if you have planned and practised your switch back to your usual systems. It is not a matter of simply reversing the DR steps.
Think in terms of outcomes and user needs, and the priorities will become clear. It may be that the sales team is affected little by staying away from the office – after all, they’re on the road most of the time. If you’re in the business of manufacturing or retail, though, keeping the warehouse and production line working is likely urgent.
In testing, you will ensure that your people have access to data, and that the tools to access that data are within reach. You will undoubtedly check how your essential tasks hold up under pressure. But it makes sense to also see which applications and services work just fine in the mid-term without reversing to their original environment.
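As a rough illustration of prioritising by outcome, the return to normal can be sketched as a simple ordering exercise. The service names, the 1-to-5 impact scale and the `runs_well_in_dr` flag below are assumptions made up for this example; your own testing would supply the real values.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    business_impact: int   # hypothetical scale: 1 (low) to 5 (critical)
    runs_well_in_dr: bool  # assumed to come from your own DR testing


def plan_failback(services):
    """Split services into those to return first and those that can wait."""
    # Services struggling in the DR environment go back first,
    # highest business impact at the front of the queue.
    return_first = sorted(
        (s for s in services if not s.runs_well_in_dr),
        key=lambda s: s.business_impact,
        reverse=True,
    )
    # Services that work fine in DR can stay put for the mid-term.
    can_wait = [s for s in services if s.runs_well_in_dr]
    return return_first, can_wait


# Illustrative data only, loosely based on the examples in the text.
services = [
    Service("sales CRM", 2, True),
    Service("warehouse management", 5, False),
    Service("production line control", 5, False),
    Service("intranet", 1, False),
]

return_first, can_wait = plan_failback(services)
for s in return_first:
    print("Return first:", s.name)
for s in can_wait:
    print("Can stay in DR for now:", s.name)
```

A list like this is no substitute for testing, but writing it down before an incident gives whoever is running the return a starting order to challenge.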
In reality, though, the weaknesses of your DR plan are only realised during a real downtime incident. For example, how does resolution continue once your IT staff have worked for 30 hours getting the PROD server back up and running? Who backs up your IT team? Do you have a ‘DR Partner’ you can turn to for assistance, one that understands your particular environment?
Timing is also key. If your operations are working as they should in the DR environment, why disrupt them unnecessarily? It means more out-of-hours work in many cases, or impact minimisation in others, but scheduling transitions is something of an art form. This does assume that your DR server is adequately configured to run your business, and is not simply a ‘token server’ to tick the auditor’s boxes. If your DR environment is not adequately configured, you need to decide which components of your business you can do without for the duration of the downtime incident. That scenario might change your priority for getting back to the normal environment.
Our many years of experience tell us that many organisations end up continuing operations in their DR datacentre and opt not to return to their original environment at all, while others assess and prioritise based on business outcome. The convenience of having infrastructure managed and secured with enterprise equipment, either temporarily or permanently, can make a lot of sense when you’re overstretched.
Just as your security controls and access protocols must be ready, and connectivity must be available to your employees and your trading partners when you transition to your DR environment, the same is the case on the return journey. It is rarely a matter of just slotting back into place. Our usual refrain of ‘test, test, and test again’ applies more than ever here. If you do a regular, independent security audit, include this part of your DR plan as a consideration – while you have a security specialist on-site, you may as well pick their brains.
In both security and more general IT audits, we work with our clients on documentation. It is, let’s face it, one of the duller parts of IT, usually seen as a necessary evil, but good documentation is a saviour when returning to normality after a disaster. Who has time to document everything? It definitely suffers in most businesses where resources are limited, but given that a lack of documentation can be risky in disaster situations and their aftermath, it isn’t something that should stay in the too-hard basket for long. If your IT partner doesn’t discuss this with you, they should.
And, of course, the downtime incident will always strike when your senior IT Technician is out of the country on leave, the IT Manager is at 35,000 feet and your senior IT Administrator is on a Rostered Day Off because they just worked all weekend upgrading your SQL Server. So, how good is that DR documentation? Could the Operations Manager, for instance, just pick it up and run with it?
While there is no substitute for your involvement through the DR process, it is worth accepting the help that is on offer. Your partner will likely hold documentation about your equipment and projects, and should be happy to work through an inventory for insurance or practical purposes. If customers have bought time by using DRaaS options, they often use the breathing space to re-assess their original environment. Cloud providers vary, but our DRaaS choice was made in part because of the ready expertise on offer.
Whether you return to the same footing, or to a new normal, we’re here to help. To chat about how you extend your plan back to normal, give us a call. Get in early and get us involved in your DR planning so we can hit the ground running on the day.