Today many enterprises and cloud storage providers ask, “What scale-out storage solution will enable us to economically and easily house our burgeoning Big Data stores?” However, small and midsized enterprises (SMEs) put a slightly different spin on that same question by asking, “What scale-out storage system will enable us to affordably address our ‘bigger data’ problems?” SMEs are finding an answer to their question in the form of the Gridstore Scale-out Storage platform.
Enterprises and cloud storage providers are rightfully concerned about implementing economical storage solutions in response to the challenges that Big Data presents. Some IT executives attending the 2011 Gartner Data Center conference anecdotally shared that they were seeing unstructured data growth in the range of 400 to 800% annually. This rate of data growth has led one analyst firm to forecast a 61% compound annual growth rate for storage over the next four years.
To respond to this growth, many enterprises and cloud storage providers turn to scale-out network-attached storage (NAS) solutions. Scale-out NAS starts in relatively small configurations, scales to hundreds or thousands of terabytes of storage capacity, makes it easy to add more capacity and manages all of that capacity centrally as one logical repository.
These features of scale-out NAS are attracting mid-tier enterprises. Even though they do not necessarily have the same ‘Big Data’ challenges as large enterprises, they have their own set of ‘bigger data’ challenges and need affordable storage solutions that can scale to hold tens or even hundreds of TBs of data.
However, scale-out NAS solutions targeted at enterprises and cloud storage providers break down in SMEs. To scale to meet the needs of enterprises and cloud storage providers, scale-out NAS solutions are architected to deliver functionality well beyond what mid-tier organizations need. As such, the typical initial investment for an enterprise scale-out NAS solution is quite large – certainly in the tens if not hundreds of thousands of dollars – putting it well beyond most SMEs’ budgets.
To understand why that is, it is worth examining why scale-out NAS is so costly. Storage capacity certainly contributes to the cost of enterprise scale-out NAS systems. However, other features play a much larger role in these costs. These include:
- Redundant disk controllers. Storage capacity is fronted by controllers that provide front-end network connectivity as well as CPU and memory. To swiftly handle storage traffic, these disk controllers are typically both powerful and robust, which makes them expensive.
- Complex clustering software. Complex clustering software is needed to make all of the controllers function as one logical storage pool. While its complexity is hidden from the users of these systems, this software does require a lot of processing power. This precludes the introduction of more economical hardware for use as disk controllers.
- High-performance backplane networks. Clustered scale-out storage solutions rely on a high-speed backplane network for inter-node traffic. To avoid potential bottlenecks, more costly 10GbE, 40GbE or even InfiniBand networks are utilized.
- 3-way replicas. Like the Google File System, clustered scale-out storage most often creates two replicas in addition to the original data. This allows two nodes to fail concurrently without data loss or disruption. However, this technique adds the capital cost of 3X the required capacity and, on an operational basis, 3X the data to power, cool and manage.
- Up to 50% of IOPS consumed by replicas. Every time a write occurs on one node in the cluster, two more writes occur on two other nodes, which can result in as much as 50% of IOPS being used to write replicas. To make up for the lost IOPS, expensive flash storage is often added to boost performance.
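The capacity and IOPS figures in the last two items can be checked with some simple arithmetic. The sketch below is only an illustrative model, not a measurement of any product: the function names are hypothetical, and it assumes every client read costs one back-end I/O while every client write costs one I/O per copy.

```python
def raw_capacity_tb(usable_tb: float, copies: int = 3) -> float:
    """Raw capacity needed when every block is stored `copies` times."""
    return usable_tb * copies


def replica_iops_share(write_fraction: float, copies: int = 3) -> float:
    """Fraction of back-end IOPS spent writing replicas.

    Assumed model: a client read costs 1 back-end I/O; a client write costs
    `copies` back-end I/Os, of which (copies - 1) are replica writes.
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    read_fraction = 1.0 - write_fraction
    total_backend = read_fraction + write_fraction * copies
    replica_writes = write_fraction * (copies - 1)
    return replica_writes / total_backend


if __name__ == "__main__":
    print(raw_capacity_tb(100))        # raw TB behind 100 TB of usable data
    print(replica_iops_share(0.5))     # half reads, half writes
    print(replica_iops_share(1.0))     # write-only workload
```

Under these assumptions, 100 TB of usable data needs 300 TB of raw disk, and a workload that is half writes spends half of its back-end IOPS on replicas. A write-heavier mix would push that share higher in this model.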
What gets overlooked is that the cost of storage capacity is NOT the primary cost in these scale-out NAS solutions. Rather, it is the cluster model and its architecture that drive up the cost. So if a scale-out NAS provider could successfully decouple the controllers and software from the storage without sacrificing functionality or performance, the cost of scale-out NAS should plummet.
That is exactly what Gridstore has done with its Scale-out Storage platform. The Gridstore architecture provides the convenience and desirability of scale-out NAS by creating a virtual storage grid that eliminates the most costly components of cluster-based scale-out NAS solutions – their redundant controllers, complex clustering software, backplane networks and 3-way replicas.
Gridstore’s virtual storage grid consists of vControllers. This zero-cost vController software resides on clients and in essence provides the same functionality as the clustering software found on enterprise scale-out NAS systems – only without the cost as there are no controllers hosting the clustering software.
By placing the vController software on the clients accessing the Gridstore Storage Nodes, Gridstore distributes the storage processing out to them. This technique eliminates any potential for the scale-out NAS disk controller to become a bottleneck since there is no controller.
Gridstore reads and writes data in parallel to multiple nodes without replicating the data. Data is encoded before leaving the vController and sent in parallel stripes directly to the storage nodes. There are no background replicas so no IOPS are consumed performing this task. Further, the encoding used by Gridstore offers the same level of protection from failed nodes without the overhead and cost of 3-way replicas.
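To illustrate the general idea of encoding data into stripes instead of copying it, here is a minimal sketch that uses single XOR parity. It is not Gridstore's encoding, which the article does not describe. The function names are hypothetical, and single parity can rebuild only one lost stripe, so a scheme that survives two failed nodes would need a stronger code such as Reed-Solomon.

```python
from functools import reduce
from typing import List, Optional


def _xor_columns(stripes: List[bytes]) -> bytes:
    """XOR equal-length byte strings together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*stripes))


def encode_stripes(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-length stripes and append one parity stripe."""
    stripe_len = -(-len(data) // k)              # ceiling division
    padded = data.ljust(stripe_len * k, b"\0")   # zero-pad the final stripe
    stripes = [padded[i * stripe_len:(i + 1) * stripe_len] for i in range(k)]
    return stripes + [_xor_columns(stripes)]


def recover_stripes(stripes: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing stripe (marked None) from the survivors."""
    missing = [i for i, s in enumerate(stripes) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can rebuild only one lost stripe")
    repaired = list(stripes)
    if missing:
        survivors = [s for s in stripes if s is not None]
        repaired[missing[0]] = _xor_columns(survivors)
    return repaired


if __name__ == "__main__":
    pieces = encode_stripes(b"bigger data", k=4)   # 4 data stripes + 1 parity
    damaged = list(pieces)
    damaged[2] = None                              # simulate one failed node
    restored = recover_stripes(damaged)
    # Stripping the padding this way assumes the payload has no trailing zeros.
    print(b"".join(restored[:4]).rstrip(b"\0"))
```

With four data stripes and one parity stripe, this toy layout stores 1.25X the original data rather than 3X, which is the kind of saving an encoded layout is aiming for.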
The vController technology has some other benefits as well. The only storage traffic crossing the network occurs when the vController reads data from or writes data to a Gridstore Storage Node, which may eliminate some network traffic and expedite response times.
For example, if an application needs to access data and that data resides in the vController client cache, there is no need to make a trip across the network at all. Moving the storage processing out to the client ensures that processing is done where it is needed, without network latency. This enables the vController to make intelligent choices about what data to send and receive before any network traffic occurs.
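The cache behavior described here can be pictured as a read-through cache on the client. The following is a generic sketch, assuming a simple least-recently-used eviction policy. The class name, the `fetch_remote` callback and the `network_reads` counter are all hypothetical and are not part of any Gridstore interface.

```python
from collections import OrderedDict
from typing import Callable


class ClientReadCache:
    """Small LRU read-through cache in front of a remote fetch function."""

    def __init__(self, fetch_remote: Callable[[str], bytes], capacity: int = 128):
        self._fetch_remote = fetch_remote   # stands in for a network read
        self._capacity = capacity
        self._blocks: "OrderedDict[str, bytes]" = OrderedDict()
        self.network_reads = 0              # counts trips across the network

    def read(self, block_id: str) -> bytes:
        if block_id in self._blocks:
            # Cache hit: serve locally and mark the block as recently used.
            self._blocks.move_to_end(block_id)
            return self._blocks[block_id]
        # Cache miss: go to the storage node, then remember the result.
        data = self._fetch_remote(block_id)
        self.network_reads += 1
        self._blocks[block_id] = data
        if len(self._blocks) > self._capacity:
            self._blocks.popitem(last=False)  # evict the least recently used
        return data


if __name__ == "__main__":
    cache = ClientReadCache(lambda block_id: block_id.encode(), capacity=2)
    cache.read("report.doc")
    cache.read("report.doc")                  # second read never leaves the client
    print(cache.network_reads)
```

Only the first read of a block crosses the network in this sketch. Repeat reads are served from client memory until the block is evicted.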
SMEs do not have the Big Data challenges of enterprises and cloud storage providers, but they still have their own set of ‘bigger data’ challenges that they are trying to solve. As such, they are looking for a scale-out NAS solution that matches their particular storage requirements and fits within their budget.
The Gridstore Scale-out Storage platform accomplishes that by providing the convenience and flexibility of enterprise scale-out NAS solutions. However, since it has been engineered to meet the specific workloads and budgets of mid-tier organizations, it can deliver this functionality at a fraction of the cost of enterprise scale-out NAS solutions. As such, Gridstore Scale-out Storage provides an answer to the ‘bigger data’ question that many SMEs have without incurring the ‘bigger costs’ of scale-out NAS that they are looking to avoid.