I have been looking into clustering and high availability (HA) a lot lately, mostly as it pertains to my day job but also because I just like to keep myself up to speed on things. What has been changing? An abundance of things!
Thinking back to the late ’90s and early ’00s, it was difficult, if not impossible, to get a cluster up and running without significant amounts of pre-planning and coordination from many groups (Network, Server, Application, DBAs, etc.). This was back in the day before VMware and other hypervisors, when if you wanted to protect an application from having a single point of failure, your only option was to cluster that application.
Even once that was done, you had better hope that the application was one that the cluster software supported; otherwise you would be hacking things together and praying that the failover groups and everything else functioned as expected. Assuming you got the cluster up at all, there was nothing to provide simulated failover testing, or to have the cluster software run a simple health check on its functions to ensure everything was working as expected.
I can tell you that with all the advancements in the last 15+ years, things have changed radically. System administrators now have a myriad of tool sets to assist in the deployment of clustering and high availability technologies. Here are a few of the feature sets I have come across that I wish I had “back in the day.”
- Pre-Install Checklist
The ability to ensure that all cluster components are configured properly prior to the installation of the cluster software. In essence, a real-time checklist of public, private, and heartbeat interfaces, as well as shared storage configurations, and most importantly patch levels on the server(s) that the cluster will be installed on.
- Failover Testing (No Impact to Production) – Fire Drill
The ability to perform non-invasive checks of the infrastructure to validate HA configurations, which can be automated and run on a scheduled basis. A functional tool that can pick up mistakes made by system administrators (like adding storage and forgetting to add the mount point to the cluster).
- Ongoing Health Checks to validate Cluster Health
Dashboards, reports, and notifications that provide an “at a glance” look at the health of the cluster across your landscape, notifying the administration team when functions breach a threshold.
- Robust Application Integration / Simple Frameworks to Integrate Non-Standard Applications
Almost every shrink-wrapped application out there is supported by one cluster product or another. If not, most clustering software provides a very simple framework to get the application into the cluster and protected.
- Physical to Virtual – Virtual to Physical Failover Capability
The ability to fail over from physical to virtual (P2V) and from virtual to physical (V2P). This function allows for a very robust clustering configuration, which can include the entire application stack (Web to App to DB, etc.).
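To make the fire-drill idea above a bit more concrete, here is a minimal sketch in Python of the kind of check such a tool might run: comparing what is actually mounted on a node with what the cluster has been told to manage. This is not how any particular cluster product does it. The function name, the ignore list, and the example paths are all hypothetical, and a real tool would read both lists from the host and the cluster configuration rather than take them as arguments.

```python
def find_unprotected_mounts(mounted, cluster_managed, ignore=("/", "/boot", "/tmp")):
    """Return mount points present on the host but missing from the cluster config.

    All three arguments are plain lists of paths (hypothetical inputs);
    trailing slashes are ignored when comparing.
    """
    def normalize(path):
        # "/data/db/" and "/data/db" should compare equal; keep "/" as "/"
        return path.rstrip("/") or "/"

    managed = {normalize(p) for p in cluster_managed}
    skipped = {normalize(p) for p in ignore}

    unprotected = set()
    for path in mounted:
        candidate = normalize(path)
        if candidate in skipped or candidate in managed:
            continue
        unprotected.add(candidate)
    return sorted(unprotected)


if __name__ == "__main__":
    # Hypothetical example: /data/db was added to the host but never to the cluster
    host_mounts = ["/", "/boot", "/data/app", "/data/db/"]
    cluster_mounts = ["/data/app"]
    for mount in find_unprotected_mounts(host_mounts, cluster_mounts):
        print(f"WARNING: {mount} is mounted but not protected by the cluster")
```

Run on a schedule, a check like this touches nothing in production; it only reads configuration and reports the drift.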
The clustering of yesterday is certainly gone, and the clustering of today makes everything much simpler, while still providing the robustness any company may be looking for to protect their applications, whether that be in a local data center, connected to a metro data center, or something a bit farther, crossing the country or even the world.