Grid computing is starting to appear in some unlikely places. It is easy to assume that grid computing lives primarily in the world of academia or high-tech corporate IT engineering labs. In these environments, computer scientists typically have the time and expertise to engineer complicated, high-performance, low-cost computing solutions that can perform tasks like mapping the human genome or identifying possible new sites to drill for oil. But applying grid computing to a low-tech problem like backup and recovery? That almost seems like a mismatch.
However, that is exactly what is happening. As companies are discovering, data protection shares many characteristics with the types of problems that grid computing was designed to address. Consider:
- Companies want to minimize their expenditures. The value of data protection is almost impossible to quantify on a day-to-day basis; most companies can document the true cost of not protecting their data only after some type of disaster occurs. This prompts companies to limit how much they spend on data protection, since it is viewed as a cost to the business rather than a source of revenue.
- It is complicated. Backing up one server is easy. Backing up and recovering hundreds of servers, running different applications with varying amounts of data across an enterprise, requires far more sophisticated data protection software. That software needs to automatically adjust and allocate the appropriate resources (capacity, performance, etc.) to ensure that each backup completes within the application server’s backup window. Of course, it needs to do this inexpensively.
- It is performance intensive. During the backup window, the data protection software may need to analyze TBs of data, move it across corporate networks and then store it on disk. To minimize the amount of storage capacity needed, data is deduplicated before it is stored. Every stage in this process – analysis, data movement and deduplication – requires high levels of performance. However, since this all occurs as part of the data protection process, the cost of these compute cycles must be kept to a minimum.
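To make the deduplication step above concrete, here is a minimal Python sketch of hash-based deduplication. This is an illustration of the general technique (fixed-size chunking with SHA-256 fingerprints), not Asigra's actual algorithm; the function names and chunk size are assumptions chosen for the example.

```python
import hashlib

def deduplicate(data: bytes, chunk_size: int = 4096):
    """Split a data stream into fixed-size chunks, storing each unique
    chunk only once. Returns the chunk store plus the ordered list of
    hashes ("recipe") needed to rebuild the original stream."""
    store = {}    # SHA-256 digest -> chunk bytes (unique chunks only)
    recipe = []   # ordered digests referencing chunks in the store
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:
            store[digest] = chunk   # new chunk: consume storage
        recipe.append(digest)       # duplicate: store only a reference
    return store, recipe

def reassemble(store, recipe):
    """Rebuild the original stream from the store and the recipe."""
    return b"".join(store[d] for d in recipe)
```

In this toy case, 12 KB of input containing repeated chunks is reduced to two stored chunks plus a three-entry recipe; real products use content-defined chunking and persistent indexes, but the space-for-hashing trade-off is the same.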
Asigra also recognized this trend towards the adoption of grid computing in data protection and, in its Televaulting 8.0 release, introduced its version of grid computing so it could help companies dynamically and economically adapt to the demands of enterprise backup. Here is an overview of how Asigra’s implementation of grid computing works:
- A master, or parent, DS-Client exists at each site with one or more child DS-Clients. (The DS-Clients may exist on physical or virtual machines.)
- Each of these DS-Clients (parent and children) is assigned specific application servers to protect.
- The parent DS-Client monitors the progress of the backup jobs on all of its child DS-Clients and functions as a job scheduler. As a child DS-Client completes its backup jobs, the parent re-assigns queued jobs from child DS-Clients that are still busy to the idle child, which then starts backing up those application servers.
- This process continues until the backup jobs on all application servers are complete.
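The parent/child scheduling loop described above can be sketched as a simple work-stealing simulation. This is a hypothetical illustration of the scheduling pattern, not Asigra's implementation; the function and child names are invented for the example, and one "tick" stands in for a child finishing a backup job.

```python
from collections import deque

def run_backups(assignments):
    """Simulate the parent DS-Client's scheduling loop.
    assignments maps each child DS-Client name to its initial job list.
    Returns which jobs each child actually completed."""
    queues = {child: deque(jobs) for child, jobs in assignments.items()}
    completed = {child: [] for child in queues}
    while any(queues.values()):
        for child, queue in queues.items():
            if queue:
                # Busy child: finish the next job in its own queue.
                completed[child].append(queue.popleft())
            else:
                # Idle child: the parent re-assigns a queued job
                # from the child with the most work remaining.
                busiest = max(queues, key=lambda c: len(queues[c]))
                if queues[busiest]:
                    completed[child].append(queues[busiest].pop())
    return completed
```

With an uneven assignment such as `{"child1": ["a", "b", "c", "d"], "child2": ["e"]}`, child2 finishes early and picks up a job originally queued for child1, so the whole backup set completes sooner than it would with static assignments.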
The worlds of grid computing and data protection are beginning to merge. The big difference is that as grid computing finds its way into data protection, companies like Asigra are finding ways to take the complexity out of grid computing while keeping its low-cost and high-performance benefits. This combination means that going forward, companies can expect simpler and faster ways to deliver improved levels of data protection and recovery for their enterprise while maintaining, or even lowering, their data protection costs.