1 in 10 Midrange Arrays Scale to Over 1 Petabyte; Does a Cache Crisis in Midrange Arrays Loom?

Last month I announced that DCIG is putting together its first annual Midrange Array Buyers Guide. A lot has happened since then: over the last two weeks, responses have been pouring in to the questionnaires I sent to more than 20 storage providers representing over 100 midrange array models. While it is still too early to announce any winners, and results are still being tabulated, I am prepared to share some preliminary findings on total storage capacity and cache sizes in midrange arrays.

Over the last few years the industry has witnessed an explosion in the storage capacity of hard disk drives (HDDs). It was just three years ago in 2007 that the industry saw the shipment of the first one (1) terabyte (TB) HDD. Now 2 TB HDDs are readily available with 31% of the midrange arrays that will be included in the final DCIG Buyers Guide containing 2 TB drives.

These large capacity drives (600 GB FC and SAS drives coupled with 1 and 2 TB SATA drives) put large amounts of storage capacity at the fingertips of any size organization that wants it. In fact, just to be included in the survey, a midrange array had to scale beyond 30 TB of storage capacity, which is more capacity than many large enterprises possessed even as recently as a decade ago. Unless a midrange array scales past 30 TB it does not even make the list, and even with that cutoff, over 70 midrange arrays survived.

This combination of 600 GB, 1 TB and 2 TB drives is contributing to skyrocketing storage capacities on midrange arrays. Nearly 13% of the midrange arrays to be included in the Buyers Guide scale to over 1 petabyte (PB) of storage capacity, and 1 in 4 can exceed 400 TB. One that particularly caught my eye was the Nexsan Databeast. It can scale up to 4 PB of storage capacity, more than double its nearest competitor in total storage capacity.

Granted, many organizations may never hit these limits and will likely use only a fraction of the storage capacity these midrange arrays make available to them, which is probably a good thing. If users did scale these midrange arrays out and took advantage of their high-end storage capacities, they might fall victim to a potential shortcoming in these midrange arrays: the relatively small size of their cache.

This was the other statistic that caught my eye as I was tabulating these results: the growing disparity between the amount of storage capacity these systems can support and the size of the cache on their controllers. Fully 42% of the midrange arrays supported 4 GB or less of cache in their controllers.

The good news is that a roughly equal number (39%) scale to over 8 GB of cache, but one needs to keep these numbers in perspective. The midrange arrays supporting these higher cache sizes were the same ones claiming they could support hundreds of TBs, if not PBs, of storage capacity.

The problem that starts to creep in here is that even if a midrange array scales up to 64 GB of cache, as the Xiotech Emprise 7000 does, it also supports over 1 PB of storage. So if an organization fully populates it with disk drives, the ratio of cache to storage capacity is a measly 0.000064 (0.0064%). In this circumstance, if performance is a concern, you may actually be better off buying an entry-level InforTrend ESVA F20 system that scales to only 32 TB of storage capacity and 4 GB of cache but sports a cache-to-storage ratio of 0.000125. That is still low, but the ratio is roughly twice as good as the Emprise 7000's when the latter is fully populated.

Granted, this is a hypothetical situation, but as organizations of all sizes adopt more midrange arrays as primary storage, and do so in virtual server environments, cache size matters. More cache potentially means faster virtual server boot times, better application performance and maybe even faster backups.

So does a midrange array such as the HP EVA6400, which scales to over 200 TB but supports only 8 GB of cache in its controllers, become a performance bottleneck in these virtual environments? Fully configured, the ratio of cache to storage capacity on the EVA6400 is 0.000037, once again worse than the ratio found on the entry-level InforTrend ESVA F20. So you have to wonder: is this really the right midrange array for my virtual environment?
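For concreteness, the cache-to-capacity ratios quoted above amount to a few lines of arithmetic. The sketch below is purely illustrative: the helper function and its names are hypothetical, and the 216 TB capacity figure for the EVA6400 is an assumption inferred from the quoted 0.000037 ratio (the article says only "over 200 TB").

```python
def cache_ratio(cache_gb, capacity_tb):
    """Return cache as a decimal fraction of total storage capacity.

    Uses decimal units throughout: 1 TB = 1,000 GB.
    """
    return cache_gb / (capacity_tb * 1000)

# Figures cited in the article (cache in GB, capacity in TB);
# the 216 TB EVA6400 capacity is an assumption, as noted above.
arrays = {
    "Xiotech Emprise 7000": (64, 1000),  # 64 GB cache, ~1 PB capacity
    "InforTrend ESVA F20":  (4, 32),     # 4 GB cache, 32 TB capacity
    "HP EVA6400":           (8, 216),    # 8 GB cache, ~216 TB capacity
}

for name, (cache_gb, capacity_tb) in arrays.items():
    print(f"{name}: {cache_ratio(cache_gb, capacity_tb):.6f}")
```

Printed to six decimal places, these reproduce the 0.000064, 0.000125 and 0.000037 figures discussed above.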

Solid state drives (SSDs) are being heralded by some as a potential solution to some of these problems (and, to the EVA6400's credit, it does support SSDs), but that approach only works when all of an application's data is kept on SSD. That makes it an expensive solution that only solves the problem for a subset of applications.

So I suspect that this abysmal cache-to-storage-capacity ratio is a big part of the reason that, in the last couple of weeks, a number of announcements have come out in which SSDs are being used in a cache-like configuration.

Last week FalconStor announced a new alliance with Violin Memory that leverages SSDs this way. Users that have virtualized their storage capacity with the FalconStor NSS can now add the Violin 1010 appliance to it. The NSS then copies active data from the virtualized disk to SSD on the Violin 1010 to accelerate application performance.

3PAR did something similar this week. It announced new support for SSDs on its InServ Storage Servers and, arguably more importantly, announced its new Adaptive Optimization feature. This feature enables InServ storage servers to identify hot blocks of storage within the array and automatically move those blocks, regardless of which volume or application they are associated with, to SSD.
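3PAR has not published the internals of Adaptive Optimization, but the basic idea described above, counting accesses per block region and promoting the hottest regions to an SSD tier irrespective of volume boundaries, can be sketched in a few lines. Everything here (the class, names and rebalancing policy) is a hypothetical toy model, not 3PAR's implementation.

```python
from collections import Counter

class TieringSketch:
    """Toy model of sub-volume tiering: count I/O per block region,
    then promote the hottest regions to a fixed number of SSD slots."""

    def __init__(self, ssd_slots):
        self.io_counts = Counter()  # region id -> accesses this interval
        self.ssd_slots = ssd_slots  # how many regions fit on SSD
        self.on_ssd = set()         # regions currently on the SSD tier

    def record_io(self, region):
        """Tally one access against a block region."""
        self.io_counts[region] += 1

    def rebalance(self):
        """Promote the N most-accessed regions, regardless of which
        volume or application they belong to, then reset the counters."""
        hottest = {r for r, _ in self.io_counts.most_common(self.ssd_slots)}
        promote = hottest - self.on_ssd   # regions to copy up to SSD
        demote = self.on_ssd - hottest    # regions to push back to HDD
        self.on_ssd = hottest
        self.io_counts.clear()
        return promote, demote
```

For example, with two SSD slots and an access pattern that hits region "a" three times, "b" twice and "c" once, a rebalance promotes "a" and "b" while "c" stays on spinning disk.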

So does a cache crisis loom in midrange arrays? Arguably yes. Judging by these recent announcements from 3PAR, FalconStor and others, the application latency problems that result from insufficient cache are already showing up; they are just not being widely reported or discussed. The good news is that SSDs, whether placed in the storage network or in the midrange array itself and used as a form of cache, are shaping up as a viable workaround to the cache limitations that almost all midrange arrays face as their storage capacities skyrocket.

Those are my thoughts for this week. Have a good weekend and thanks for stopping by!
