As I was going through the job interview process for a storage administrator position nearly seven years ago, my prospective employer took me on a tour of the data center in which I would eventually work. Having always worked in smaller data centers, I was taken aback by this tour. Entire rooms were filled with tape libraries, while other rooms were filled with racks of tapes staged for offsite delivery or prepped to go back into the tape libraries.
Yet by the time I left that company about 18 months ago, most of that tape infrastructure was gone, replaced as disk became the predominant medium for backup and recovery. Notably, my former employer financially justified the changeover from tape to disk before the advent of deduplication.
This is an interesting point to ponder: had deduplication been as widely available on disk subsystems three or four years ago as it is now, would my company have bought disk with deduplication or without? That may seem like a stupid question, but companies should not assume that just because a disk library supports deduplication, it is a better choice than simply buying raw disk and backing up all of their data to it.
Before adopting deduplication, one needs to consider all of the trade-offs. The processing overhead of deduplicating the data, the additional time needed to recover it and the risks of storing data in a deduplicated state all contribute to the argument that deduplication is not a slam dunk in every circumstance.
So am I saying that one should stop reading this blog about HYDRAstor’s deduplication features and start shopping for disk drives at Wal-Mart? Not exactly. What I am saying is that storage administrators, engineers and architects need to take a step back and look at the larger picture of what problems they are really trying to solve before implementing deduplication.
While reducing their data stores is one of the problems they are trying to solve, it is not the only one. The real problem most companies are trying to solve with disk is improving their backup and recovery windows, which, at the enterprise level, becomes a much more complex architectural challenge than just buying a storage system that supports deduplication.
This is what makes HYDRAstor an extremely compelling story. Yes, it does deduplication, and arguably better than other products. But more importantly, its architecture addresses the reason companies introduced disk-based backup in the first place: to improve the speed of their backups and recoveries. Because it can scale performance and capacity independently, companies can take advantage of the benefits of deduplication while still getting fast backup and recovery times.
Deduplication will store data more efficiently in a smaller footprint – no one disputes that. Yet companies that prioritize data reduction ratios above backup and recovery performance will quickly be reminded that faster backup and recovery times, not data reduction, were their primary motivation for switching to disk in the first place.