Right now Yahoo Finance is counting down what it considers the top 10 tech trends for 2010. However, some of the trends in its top 10 are defined so broadly that you have to question how real they are. When it lists ‘Data Centers’ as its #2 trend and then identifies nearly every technology company in the space as being part of it, the trend stops meaning much. My list of what I consider the more subtle storage trends of 2010 will be a bit more specific about which features, products, services and/or vendor alliances support these theories.
Disclaimer: The trends listed here are in no particular order, other than that I see far more interest in these technologies among end-users than I do in re-introducing tape as their primary backup target.
The Emergence of Disk-based Backup 2.0. I am not sure if I should classify this trend as disk-based backup 2.0 or 3.0. However, since I do not recall disk-based backup with deduplication ever being classified as 2.0 (with 1.0 being plain disk-based backup to a NAS or VTL target), let’s start with the assumption that disk-based backup solutions offering deduplication defined 1.0.
Backup 2.0 solutions go further. The inclusion of features like deduplication and replication is assumed, but new products are starting to introduce features like high availability and better management of the backup data. The reason is simple: now that organizations have solved their day-to-day tactical backup problems, they can turn their attention to better management and to actually automating the entire backup process, including the replication, placement and recovery of the data. This cannot be done solely with disk, deduplication and replication; it requires better data management policies and solutions that are highly available.
Continuous Data Protection (CDP) will start to compete with more traditional backup and recovery approaches. This was reinforced by Symantec’s 2010 State of the Data Center report that was released this past Monday. While CDP has been steadily maturing over the last few years, it seems enterprises of all sizes are taking notice of its progress and recognizing the full breadth of features that it can bring to bear on expediting and simplifying not only backup but also recovery and disaster recovery, and even on lowering test and development costs. While it is way too early to say CDP will displace once-a-day backups in 2010, 2010 appears to be the year that the clock begins counting down to when that will occur.
Thin provisioning will continue to get the nod over deduplication on high-end primary storage systems. While some storage providers, probably most notably NetApp, are making deduplication available on their primary storage systems, what will ultimately drive the adoption of either of these technologies on primary storage is cost. Right now, I have to argue that thin provisioning is winning and will continue to win the battle in one important area on high-end storage systems: upfront cost.
Deduplication introduces processing overhead and that means organizations have to size the controller heads on their storage systems appropriately to deduplicate the data. This will almost certainly add to the cost of the system up front and possibly further down the road depending on the demands deduplication places upon the system.
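To make the source of that overhead concrete, here is a minimal sketch of fixed-size block deduplication. It is an illustration only and not how any particular vendor implements it; the `dedupe` and `restore` function names, the 4 KB block size and the use of SHA-256 are all assumptions made for the example. The point is that every block written has to be hashed and looked up, and that work lands on the controller.

```python
import hashlib


def dedupe(data: bytes, block_size: int = 4096):
    """Split data into fixed-size blocks and keep one copy of each unique block.

    Hypothetical sketch: real arrays differ in block size, hashing and indexing.
    """
    store = {}   # fingerprint -> block contents (the unique blocks kept on disk)
    recipe = []  # ordered fingerprints needed to rebuild the original data
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        # This hash is computed for every block written: the processing overhead.
        fingerprint = hashlib.sha256(block).hexdigest()
        store.setdefault(fingerprint, block)
        recipe.append(fingerprint)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the unique blocks and the recipe."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


# 20 identical blocks plus 1 different block: 21 blocks written, 2 stored.
data = (b"A" * 4096) * 20 + b"B" * 4096
store, recipe = dedupe(data)
print(len(recipe), "blocks written,", len(store), "blocks stored")
```

The savings come only after the hashing and lookup work is done, which is why the controller heads have to be sized for it.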
Thin provisioning avoids this upfront processing load, and some initial surveys suggest that it provides the same reductions in storage consumption as deduplication. One recent survey conducted by TechValidate shows that Senior IT Architects have been able to overprovision their storage by 2,000%. This equates to about the same 20:1 reduction in storage that deduplication delivers. While the organizations that saw these types of capacity savings were those with more sophisticated IT staff, these are also the ones more likely to be deploying and managing high-end storage systems.
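The arithmetic behind that comparison can be sketched in a few lines. This assumes the 2,000% figure means provisioned (logical) capacity is 20 times the physical capacity actually installed, which is my reading of the survey number; the function names and the 100 TB figure are hypothetical and used only for illustration.

```python
def reduction_ratio(provisioned_percent: float) -> float:
    """Logical-to-physical ratio, given provisioned capacity as a percent of physical.

    Assumption: 2,000% is read as "logical capacity is 20x physical capacity".
    """
    if provisioned_percent <= 0:
        raise ValueError("provisioned_percent must be positive")
    return provisioned_percent / 100.0


def physical_needed_tb(logical_tb: float, ratio: float) -> float:
    """Physical capacity needed to back a given logical capacity at this ratio."""
    return logical_tb / ratio


ratio = reduction_ratio(2000)
print(f"{ratio:.0f}:1 reduction")
print(f"{physical_needed_tb(100, ratio):.0f} TB physical backs 100 TB provisioned")
```

Keep in mind this is arithmetic only: the ratio holds only as long as the provisioned capacity is not actually written to.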
Archiving in all of its forms will gather momentum in 2010. It used to be that if users thought of archiving at all, they thought of it in the context of email archiving. No more. Judges are no longer sympathetic to organizations that cannot produce requested documents in a timely fashion, and no organization is immune from having to produce its documents when requested.
So while email archiving tends to get the most attention, continuing data growth in both the unstructured (file servers) and structured (database) segments of the market is creating new user requirements to archive these data stores as well. On the file archiving side, it can often be justified in two key ways: the ability to purchase less expensive storage to keep data online and reductions in backup time.
The same principle holds true on the database side, but one of the roadblocks to database archiving has been that managing the databases that house archived data has been as painful as just leaving the data in the primary database tables. However, products from companies like Informatica alleviate much of this pain, and the recent alliance that Informatica struck with CommVault suggests that companies are actively looking for a broader range of archiving solutions than just the email and file archiving solutions that CommVault natively offers.
Cloud storage will be the most talked about non-adopted trend of 2010. Personally I see a very bright future for cloud storage, just not in 2010. Most of the products are still in their early stages, users are still kicking the tires and the ‘gotchas’ associated with cloud storage are still being discovered. Until the majority of those ‘gotchas’ associated with cloud storage are understood, documented and worked around, do not expect to find cloud storage anywhere but in the shops of a few early adopters.
Organizations are finally getting serious about data management. There is no one force driving this trend. Part of it is that management is getting sick and tired of asking their IT staff to report on the status of their corporate data and only getting incomplete reports. Part of it is that organizations are worried about the increasing demands that the judicial system is placing on them to produce data when requested. Another part is that server virtualization is centralizing data so that organizations can quantify the scope of the problem. Maybe the largest driver is that the process of managing distributed data has finally gotten so broken that organizations intuitively know that it is costing them more money in terms of staff, hardware and software not to fix it than to fix it once and for all.
Have a good weekend and stop by again next Friday as I once again begin my weekly musings as to that week’s events in the storage industry and how they stand to impact end users.