Organizations tend to evaluate storage arrays primarily in terms of how much storage capacity they offer. But as storage arrays add features such as deduplication and thin provisioning, storage efficiency is taking on new importance as an evaluation criterion when selecting a storage array. This raises the question of what role, if any, a storage array’s storage efficiency features should play in the final buying decision.
Storage efficiency is gaining momentum as more storage arrays add features such as deduplication, MAID, solid state drives (SSDs), storage tiering, system-wide disk striping and thin provisioning. The introduction of these features is already leading some to claim that traditional benchmarks, such as the number of TBs to which a storage array can scale, are becoming less relevant in the final storage buying decision. There is some merit to this claim.
Thin provisioning is a good example, as it changes the dynamics of how efficiently capacity on a storage array is used. Almost every major storage vendor now offers thin provisioning in some form, which may change how important it is for a storage array to physically scale into the hundreds of TBs or even PBs of data.
For instance, using thin provisioning, a storage array can now virtually scale into the PBs. Some vendors even claim that their users are achieving utilization levels of up to 1500% on their storage arrays using thin provisioning.
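A utilization figure above 100% only makes sense if “utilization” is read as oversubscription: the virtual capacity presented to hosts divided by the physical capacity behind it. The arithmetic below is a minimal sketch using hypothetical numbers; the 1500% figure is the vendors’ claim, not a measurement of any real array.

```python
# Hypothetical numbers for illustration only; the 1500% figure is the
# vendors' claim, not a measurement of any real array.
physical_capacity_tb = 100        # usable physical capacity in the array
provisioned_virtual_tb = 1500     # sum of all thinly provisioned volume sizes

# Read "utilization" here as oversubscription: virtual capacity presented
# to hosts divided by the physical capacity backing it.
oversubscription = provisioned_virtual_tb / physical_capacity_tb
print(f"utilization: {oversubscription:.0%}")  # -> utilization: 1500%
```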
These claims make sense when I think back to my own experience as a storage administrator and how storage requests grew.
If the application owner thought that the application would need 100 GB of storage, he might pad that request by 50%. The system administrator would then receive that request, remember the last time an application owner underestimated the amount of storage he needed, and add another 50 GBs to it.
Finally it reached me (the storage administrator), who had just come in from a hellish weekend of on-call duty because an application encountered an out-of-space condition at 2 a.m. on a Sunday morning. So I would add another 100 GBs to that request “just to be safe.”
Granted, this might be an extreme case, and it surely did not happen every time or even most of the time. But the point is, it is easy to see how an application that needed at most 100 GBs of storage could turn into a 300 GB allocation just by everyone taking precautions.
Leveraging a feature like thin provisioning theoretically keeps everyone happy while avoiding this overallocation. The storage administrator can thinly provision the 300 GBs that he thinks the application might need, preventing the out-of-space condition that got everyone up in the middle of the night. In the meantime, if the application only uses 50 GBs of data, the other 250 GBs of requested capacity are never physically allocated and remain available for other applications.
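To make the mechanics concrete, here is a minimal sketch of allocate-on-write thin provisioning. The ThinVolume class and its coarse 1 GB block granularity are hypothetical simplifications for illustration, not any vendor’s implementation; real arrays track allocation at much finer extent or page granularity.

```python
class ThinVolume:
    """Minimal sketch of allocate-on-write thin provisioning.

    Hypothetical model for illustration; real arrays do this at the
    extent or page level with far more bookkeeping.
    """
    BLOCK_SIZE_GB = 1  # coarse 1 GB blocks to keep the example simple

    def __init__(self, virtual_size_gb):
        self.virtual_size_gb = virtual_size_gb  # the size the host sees
        self.allocated = {}                     # block index -> data

    def write(self, block, data):
        if block * self.BLOCK_SIZE_GB >= self.virtual_size_gb:
            raise ValueError("write beyond virtual size")
        self.allocated[block] = data            # physical space consumed here

    def physical_used_gb(self):
        return len(self.allocated) * self.BLOCK_SIZE_GB


# The 300 GB request from the example above: the host sees 300 GB,
# but only the 50 GBs actually written consume physical capacity.
vol = ThinVolume(virtual_size_gb=300)
for blk in range(50):
    vol.write(blk, b"app data")
print(vol.virtual_size_gb, "GB presented,", vol.physical_used_gb(), "GB consumed")
```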
So the questions become, “How do you properly measure the storage efficiency of thin provisioning in general?” and “How do you measure how well it is implemented on a specific storage array?”
Merely saying that a storage system uses storage efficiently because it supports thin provisioning is misleading when one considers the following environments.
First, when a thinly provisioned volume is presented to a VMware server, VMware may zero out every block in the newly allocated volume. So unless the storage array has some means of recognizing that VMware wrote zeros to all of those blocks, whatever benefit thin provisioning initially offered is immediately negated.
Therefore, to say that a storage array manages storage efficiently simply because it supports thin provisioning is inaccurate, at least when comparing it to storage arrays that support thin provisioning and also have a means of reclaiming the blocks to which VMware writes zeros.
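The idea behind that reclamation is straightforward: inspect incoming blocks on the write path and treat an all-zero block as unallocated rather than storing it. The sketch below is a hypothetical illustration of that technique, reusing the simple dictionary-of-allocated-blocks model from above; it is not how any particular array implements zero detection.

```python
def write_with_zero_detect(allocated, block, data):
    """Minimal sketch of zero-detection on the write path.

    `allocated` maps block index -> data for blocks that consume
    physical space. A block that arrives as all zeros is treated as
    unallocated, so a hypervisor zeroing an entire volume does not
    inflate it.
    """
    if data.count(0) == len(data):         # incoming block is all zeros
        allocated.pop(block, None)         # reclaim any space it used before
    else:
        allocated[block] = data            # normal allocate-on-write

# VMware-style initialization: zeroing every block of a 300-block volume.
allocated = {}
for blk in range(300):
    write_with_zero_detect(allocated, blk, bytes(4096))
print("blocks consuming physical space:", len(allocated))  # -> 0

# A real (non-zero) write still allocates as usual.
write_with_zero_detect(allocated, 0, b"real data")
print("blocks consuming physical space:", len(allocated))  # -> 1
```

Without that zero check, the initialization loop alone would have allocated all 300 blocks, which is exactly the behavior that negates thin provisioning on arrays lacking this capability.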
A second example of how thin provisioning may not manage storage capacity efficiently came to my attention just a few weeks ago. A storage administrator was provisioning storage to users for the first time on a new storage system that supported thin provisioning, and he was looking forward to finally avoiding out-of-space conditions. So he allocated 1 TB (or whatever the amount was) to each application that previously had access to only about 50 GBs of storage capacity.
What he did not anticipate was the reaction of his application owners to this additional storage capacity. His users discovered that they were no longer at or near capacity for their applications and now, thanks to their newly caring storage administrator, they suddenly had hundreds of GBs at their disposal. So data that they had kept tucked away on local disk drives, thumb drives or wherever began to find its way onto these thinly provisioned volumes.
This had exactly the opposite effect of what the storage administrator expected. While it solved his out-of-space problems on the application side, he unexpectedly began running into out-of-space conditions on his storage arrays, since his users were treating the virtual capacity on the thinly provisioned volumes as real, physical storage capacity available for their immediate use.
Here again, simply making thin provisioning available on a storage array is not always a good indicator that it will lead to more efficient storage management; in this case it made an already bad situation worse.
The more I study storage arrays in preparation for the release of next year’s DCIG Midrange Array Buyer’s Guide, the more I realize that simply counting features is a poor way to measure how effectively a storage array delivers on attributes such as storage efficiency.
While adding features like deduplication, storage tiering, thin provisioning and the others mentioned above is an important first step toward making storage systems more efficient, these features alone are insufficient. Rather, it is features paired with the right supporting cast of management policies that enable organizations to achieve the storage efficiencies they expect when they purchase a storage array.
As more storage vendors add features that enable storage to be used more efficiently, organizations need to recognize that the storage industry is still in the early stages of the transformation from measuring storage arrays by their total capacity to measuring them by how efficiently they manage storage. Therefore, as organizations look to buy storage arrays that promise storage efficiency, they need to verify exactly what storage efficiencies these vendors are promising, whether their storage arrays are actually capable of delivering on them, and whether they will realize them in their own environments.