Over the last few months DCIG has spent a fair amount of time researching and documenting specific reasons why tape will not die. Green IT is the reason we most often hear cited for retaining tape, though new disk-based deduplication and replication technologies, coupled with new disk storage system designs based on grid storage architectures, can offset some of those concerns. So before organizations decide that after 30, 90 or 180 days they should immediately move their archival and backup data, deduplicated or otherwise, from disk to tape just to save money, they should consider that keeping data on disk provides certain intangible savings from an eDiscovery perspective that are not always feasible on tape.
Immutability, portability and remote recoveries, along with capacity and power savings, are reasons sometimes used to justify copying some archival and backup data to tape. But if organizations focus solely on these reasons, they fail to account for the new need to access and search these data stores that today’s more litigious environment is creating. Specifically, as organizations respond to Electronic Data Discovery (EDD) requests during periods of litigation, they may find that the costs of recovering and searching data on tape for eDiscovery offset whatever cost savings tape initially provided.
According to a recent article by George Socha and Tom Gelbmann, the 2008 Socha-Gelbmann survey, which examined the state of eDiscovery in 2007, found that commercial expenditures on EDD topped $2.7 billion in 2007. Further, they forecast that expenditures on eDiscovery will grow by 21% in 2008, 20% in 2009 and 15% in 2010, which equates to a $4.5 billion market by 2010.
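The compounding behind that forecast is easy to check. The short Python sketch below is illustrative only: the function name `project_market` is hypothetical, and the only inputs are the $2.7 billion base and the three growth rates quoted above. It simply applies each year's growth to the prior year's total.

```python
def project_market(base, growth_rates):
    """Apply each year's growth rate in turn and return the running totals."""
    values = []
    current = base
    for rate in growth_rates:
        current *= 1 + rate
        values.append(current)
    return values


# Figures quoted in the article: $2.7 billion in 2007,
# then forecast growth of 21%, 20% and 15%.
projection = project_market(2.7, [0.21, 0.20, 0.15])

for year, value in zip((2008, 2009, 2010), projection):
    print(f"{year}: ${value:.2f} billion")
```

Run as written, the 2010 figure comes out at roughly $4.5 billion, which is consistent with the market size cited in the forecast.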
However, whether or not your organization participates in that trend will likely depend on how you store your data. Charles Skamser, VP of Sales and Marketing for Trial Solutions, makes the point in a post on his ediscoveryconsulting blog that the 2006 changes to the FRCP (Federal Rules of Civil Procedure) create new requirements for organizations to utilize new eDiscovery technologies, but how much you pay will likely vary according to the sort of media you store your data on. The cost to simply index 2 million files stored on tape can run as high as $75,000, while the cost to access, index and search archived and backup data stored on disk can be a fraction of that.
This is where keeping archived and backup data online on disk, as opposed to on tape, can result in dramatic cost savings for organizations. If data on tape needs to be restored so it can be accessed and searched to satisfy an eDiscovery request, the restore could take hours, days or even weeks to complete. That does not take into account the extra storage capacity that might be required to recover all of this data or the time required to index and search it.
So if organizations spend all of this time just retrieving the data, they are left with little or no time to assess it. Since the FRCP only gives organizations 30 days to respond to an EDD request, they may find themselves unable to produce the data or, if they do produce it, unable to fully understand the nature of the information they just shared and its implications for their organization.
Compounding the problem, failure to provide the information could result in the presiding judge issuing the dreaded “death instructions”. When a judge issues these instructions to a jury, he or she is essentially communicating to them that your organization has something to hide and should be viewed in the most negative light. So when one starts to add up the costs of retrieving the data from tape, the shortened time or outright inability to assess the recovered data before sharing it, and the possibility that you may lose the case simply because you cannot access and produce the data quickly enough, suddenly using tapes for “Green IT” does not look so appealing anymore.
These new eDiscovery demands, driven by updates to laws like the FRCP, make it clear that aging backup and archival data stores are not as static as they were in the past. As more large organizations look to balance how they use disk and tape in their environments, moving some data off to tape certainly makes sense for some of the reasons I mentioned above. But organizations that assume they can move all of their archived and backup data off to tape after a set period of time may expose themselves to increased eDiscovery costs and even lost lawsuits because they produce data too slowly or not at all.
Organizations are right to weigh the costs of keeping data on disk or tape but, as they do so, they also need to factor in today’s more litigious environment and the cost of not having instant access to these data stores. New disk-based deduplicating grid storage systems like the NEC HYDRAstor make the decision between disk and tape a little easier. While the HYDRAstor can cost-effectively keep data on disk long term, it also provides an option for organizations to copy data off to tape without impacting day-to-day backups or recoveries. Further, since the HYDRAstor uses standard network file sharing protocols, organizations can potentially use new eDiscovery tools to access and search archival and backup data stored on the HYDRAstor without first having to recover the data.
Tape has its place in today’s data centers, but organizations need to consider more than just cooling, heating and power when deciding how best to manage their data. We live in the information age, and being unable to access and search data is no longer something most organizations can afford. Changes to laws like the FRCP, and new ones being promised by the current Congress, should make every organization realize that whatever savings tape provides may be more than offset by an inability to respond to legal eDiscovery requests and the court losses that may result.