Recently Kelly Polanski (another DCIG analyst) and I had a rather lengthy discussion about the value of keeping archive and backup data on disk versus tape long term. We agreed that using disk in some form as an initial backup target makes sense in most environments, but once we started to debate the merits of keeping data on disk versus tape long term, the issue became cloudier. While DCIG has previously argued that eDiscovery is becoming a more compelling reason to keep archive and/or backup data on disk long term, our concerns centered on the fact that some disk-based archival and backup storage systems can become as problematic as tape.
Kelly is talking to some end-users and resellers for whom disk storage systems are starting to lose some of their shine when used for archiving and backup. When I pressed her further, she indicated that these accounts started using disk for archive, backup or both, and that this often solved their initial backup problems by shortening backup windows and increasing the number of successful backups. But they are now encountering new challenges that are just as serious as the ones they used to have with tape. Key issues include:
- Inadequately sized for capacity. This may force the organization to introduce another storage system or appliance to archive more data, so it now needs to manage multiple devices.
- No way to extend the deduplication benefits from the first appliance. Many archiving and backup storage systems now deduplicate data, but if a second appliance is introduced, organizations lose whatever deduplication benefits the first appliance delivered, since the new appliance has to start deduplicating data from scratch.
- Inadequately sized for performance. Rehydrating large amounts of data for either recovery or eDiscovery searches can put an inordinate load on the appliance, which results in unacceptably slow recovery or search times and also slows concurrent normal backup or archive access.
- Proprietary access only. Some archiving storage systems can only be accessed via proprietary network APIs, so only applications that support those APIs can access or retrieve data from the archiving system.
- No upgrade path. Once the data is on the storage system, how do you upgrade the system or migrate the data off it to a new one? Kelly told me that some companies she is dealing with have simply stopped trying because it is too expensive to either migrate the data or upgrade the system. She is aware of at least one company that simply turned its archiving storage system off because it was full. Now they power it on only when some business need dictates.
The last situation would be almost comical if it were not true, and you can be sure that if one company is doing this, others probably have already done so or are thinking about it. Granted, I do not know all of the circumstances surrounding this event, but it goes to the point that deploying disk-based storage systems for either archive or backup requires more forethought than most companies are giving it.
In the near term, disk almost always expedites archives, backups, eDiscovery searches and recoveries, but as the days turn into weeks, weeks into months and months into years, can the system really adapt its capacity and performance to meet the growing data volumes that are sure to come? Obviously many systems cannot.
Organizations that are already encountering problems with their disk-based storage systems, or that are looking to avoid these types of episodes, are advised to look at storage systems such as the NEC HYDRAstor, which is based on a grid storage architecture. It can scale capacity and performance independently, since users can add new Accelerator nodes for performance or Storage nodes for capacity. It can also eliminate the upgrade and deduplication problems that other storage systems present, since all HYDRAstor nodes are part of one logical configuration. All nodes integrate into a single shared pool of storage, and deduplication is distributed and processed globally to guarantee a single copy across the entire grid. New nodes can be added dynamically or even used to replace aging, out-of-date technology. Older nodes can be retired, and the system automatically transitions the data onto the remaining nodes.
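To make the deduplication point concrete, here is a minimal sketch of why two standalone appliances store more data than one shared pool. It is a toy model and not how HYDRAstor or any particular product works: the `DedupStore` class, the fixed 4-byte chunks and the use of SHA-256 as the chunk fingerprint are all assumptions made for illustration.

```python
import hashlib


class DedupStore:
    """Toy deduplicating store (hypothetical): keeps one copy of each unique chunk."""

    def __init__(self, chunk_size: int = 4):
        self.chunk_size = chunk_size
        self.chunks = {}  # SHA-256 digest -> chunk bytes

    def write(self, data: bytes) -> int:
        """Store data and return how many new chunks had to be written."""
        new_chunks = 0
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                self.chunks[digest] = chunk
                new_chunks += 1
        return new_chunks


monday = b"AAAABBBBCCCC"
tuesday = b"AAAABBBBDDDD"  # shares two of its three chunks with monday

# Two standalone appliances: the second one starts with an empty index,
# so it cannot recognize chunks the first appliance already holds.
appliance_1, appliance_2 = DedupStore(), DedupStore()
isolated_total = appliance_1.write(monday) + appliance_2.write(tuesday)

# One shared pool: a single index covers every backup.
pool = DedupStore()
pooled_total = pool.write(monday) + pool.write(tuesday)

print(isolated_total, pooled_total)  # prints "6 4"
```

In this toy case the two separate appliances write six chunks between them, while the shared pool writes four, because the second backup only adds the one chunk the pool has not seen before. Real systems differ in chunking and indexing, but the gap comes from the same place: an index that does not span devices cannot deduplicate across them.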
I’m sure that what Kelly described is not an isolated incident, and it should illustrate that organizations need to think more strategically about the disk storage systems that they use for their archive and backup needs. Using almost any disk system in lieu of tape can often solve some immediate archival or backup problems, but it can also create larger, more complicated issues down the road. It is for these reasons that organizations should consider new storage systems such as the NEC HYDRAstor that meet today’s needs but can also scale to meet tomorrow’s.