The biggest reason organizations implement HA/Clustering solutions in their environments is to provide immediate, real-time failover of an application stack that is critical to business operations. Along with this comes the ability to protect their data and servers from a location-specific event (a disaster that causes an outage to a facility or to components within that facility).
For reasons like these, HA/Clustering software includes features that enable synchronous or asynchronous replication of data sets and/or file systems. The selection of one or both of these options depends on the specific use case and the challenge the end-user is trying to solve.
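The trade-off between the two modes can be illustrated with a deliberately simplified model. The sketch below assumes that a synchronous write is acknowledged only after the remote site confirms it, so each write pays the network round trip, while an asynchronous write is acknowledged locally and leaves a backlog that would be lost if the primary site failed. The function names and the figures are hypothetical and are not taken from any particular product.

```python
def sync_write_latency_ms(local_write_ms, round_trip_ms):
    """Simplified model: a synchronous write waits for the remote acknowledgement."""
    return local_write_ms + round_trip_ms


def async_write_latency_ms(local_write_ms):
    """Simplified model: an asynchronous write is acknowledged locally."""
    return local_write_ms


def async_exposure_seconds(backlog_bytes, link_bytes_per_second):
    """Time needed to drain the unreplicated backlog; data at risk if the primary is lost."""
    if link_bytes_per_second <= 0:
        raise ValueError("link_bytes_per_second must be positive")
    return backlog_bytes / link_bytes_per_second


# Hypothetical figures: 1 ms local write, 20 ms round trip between sites,
# 500 MB of pending changes on a 50 MB/s link.
sync_ms = sync_write_latency_ms(1.0, 20.0)
async_ms = async_write_latency_ms(1.0)
exposure_s = async_exposure_seconds(500_000_000, 50_000_000)
print(sync_ms, async_ms, exposure_s)
```

In this model, synchronous replication trades write latency for zero backlog, while asynchronous replication keeps writes fast at the cost of a window of unreplicated data.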
A key requirement for HA/Clustering software is support for replication. Many packages now work with third-party storage array replication software, which enables end-users to use their preferred storage system to perform the replication.
It is also crucial that the HA/Clustering software integrate with application-level replication (for example, Oracle DataGuard or Exchange DAGs). These use cases are driven by circumstances where it makes more sense to use the replication techniques provided by the application vendor.
Finally, to offer as much flexibility as possible, it is crucial that the HA/Clustering software offer its own internal replication capability. There are times when application and/or storage replication techniques are not available, not required, or simply too costly to implement.
The HA/Clustering software should ideally also support the ability to simulate replication requirements. For example, an end-user who wants to deploy a cluster, or has already deployed one, may not be sure how much bandwidth will be required or how much latency will be incurred when replicating from source to destination. This feature enables the end-user to capture metrics over a period of time to determine how much bandwidth is required and what latency will look like.
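As a rough illustration of how captured metrics translate into a bandwidth figure, the sketch below converts sampled change volumes into average and peak throughput. It assumes the simulation tool reports the number of bytes changed in each fixed-length sampling interval; the function name and the sample values are hypothetical.

```python
from statistics import mean


def required_bandwidth_mbps(changed_bytes_per_interval, interval_seconds):
    """Return (average, peak) bandwidth in Mbit/s implied by the sampled change volumes."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    if not changed_bytes_per_interval:
        raise ValueError("at least one sample is required")
    # bytes per interval -> bits per second -> megabits per second
    rates = [b * 8 / interval_seconds / 1_000_000 for b in changed_bytes_per_interval]
    return mean(rates), max(rates)


# Hypothetical samples: bytes changed in each 60-second interval.
samples = [150_000_000, 75_000_000, 300_000_000, 75_000_000]
average_mbps, peak_mbps = required_bandwidth_mbps(samples, 60)
print(f"average {average_mbps:.1f} Mbit/s, peak {peak_mbps:.1f} Mbit/s")
```

Sizing the link near the peak suits synchronous replication, where writes cannot queue; a link sized closer to the average may be acceptable for asynchronous replication if a temporary backlog is tolerable.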
A requirement for any successful business continuity plan is the ability to actually test (and test regularly) the failover and recovery of an application to a different location. In the past this has been a very difficult task to prepare for; in some cases it was an all-or-nothing proposition. You pushed the button and “prayed to the heavens” that it all worked as expected.
No longer. With many HA/Clustering software packages offering the ability to perform what is commonly referred to as a “fire drill,” the days of worry and angst go away. This functionality helps keep the end-user’s BC plan functional and allows it to be tested on a more timely and regular basis.