Bad news is only bad until you hear it, and then it’s just information followed by opportunity. Information may arrive in political, personal, technological and economic forms, and it creates opportunity, which brings people, vision, ideas and investment together. When thinking about a future history of 2013 opportunities, three (3) come to mind:
- Solid state storage
- 64-bit ARM servers
- Common slot architecture for servers
While two of these are not new by themselves, combining all three is a recipe for something the market will need. The most novel of the three is common slot architecture for servers, which allows an Intel or AMD x86 CPU and a Samsung or Calxeda ARM CPU to be plugged into the same board and system. But let’s start by looking at solid state storage’s impact on storage architecture. It can eliminate or mitigate at least four (4) storage architecture constraints:
- IOPS – Inputs/Outputs per Second
- Latency – The time between when the workload generator makes an IO request and when it receives notification of the request’s completion.
- Cooling – The design and implementation of racks and HVAC for data centers
- Amperage – The design and implementation of electrical systems for data centers and cities
While some may disagree with the assertion, a majority will agree that solid state storage modules or disks (SSD) are fast, much faster than their hard disk drive (HDD) brethren. In less than two years, published IOPS measurements have increased from a few hundred thousand to over one (1) million, as expressed in “Performance of Off-the-Shelf Storage Systems is Leaving Their Enterprise Counterparts in the Dust.” Thus it can be assumed the median IOPS requirement is somewhere between a few hundred thousand and one (1) million.
In that regard, it’s fair to say that most applications and systems would perform quite well with the median output of a solid state storage system. Thus, when implementing an all solid state storage system the median IOPS requirement can be met – CHECK.
Secondary to IOPS is latency. Latency is a commonly overlooked requirement when gauging the suitability of a storage system for an application. While defined above, latency is also referred to as “overall response time” (ORT), as noted by Don Capps, chair of SPECsfs. In 2012 Mr. Capps wrote to DCIG suggesting this format when sharing SPECsfs results: “XXX SPECsfs2008_cifs ops per second with an overall response time of YYY ms.”
ORT and IOPS do not track each other; a high IOPS number does not result in lower latency. For example, the Alacritech, Inc. ANX 1500-20 posts 120,954 SPECsfs2008_nfs ops per second with an overall response time of 0.92 ms, whereas the Avere Systems, Inc. FXT 3500 (44 Node Cluster) posts 1,564,404 SPECsfs2008_nfs ops per second with an overall response time of 0.99 ms. In both cases the ORT is under 1 ms and meets the latency requirements for the broadest application cases, but the IOPS differ by roughly 13x.
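To make that contrast concrete, here is a minimal sketch in Python that recomputes the ratios from the two published SPECsfs2008_nfs results above. The ops/sec and ORT figures come from the disclosures cited in the text; the 1 ms latency target and everything else are assumptions for illustration, not part of either vendor’s disclosure.

```python
# A minimal sketch comparing the two published SPECsfs2008_nfs results cited above.
# The ops/sec and ORT numbers come from the text; the 1 ms latency target and the
# variable names are assumptions for illustration only.

results = {
    "Alacritech ANX 1500-20": {"ops_per_sec": 120_954, "ort_ms": 0.92},
    "Avere FXT 3500 (44-node cluster)": {"ops_per_sec": 1_564_404, "ort_ms": 0.99},
}

LATENCY_TARGET_MS = 1.0  # assumed latency requirement for the "broadest application cases"

for name, r in results.items():
    verdict = "meets" if r["ort_ms"] <= LATENCY_TARGET_MS else "misses"
    print(f"{name}: {r['ops_per_sec']:,} ops/s, ORT {r['ort_ms']} ms "
          f"({verdict} the {LATENCY_TARGET_MS} ms target)")

iops_ratio = (results["Avere FXT 3500 (44-node cluster)"]["ops_per_sec"]
              / results["Alacritech ANX 1500-20"]["ops_per_sec"])
ort_ratio = (results["Avere FXT 3500 (44-node cluster)"]["ort_ms"]
             / results["Alacritech ANX 1500-20"]["ort_ms"])
print(f"IOPS ratio: ~{iops_ratio:.1f}x   ORT ratio: ~{ort_ratio:.2f}x")
```

The sketch simply shows a ~13x gap in throughput alongside a ~1.1x gap in response time, which is the whole point: the two metrics must be evaluated separately.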
The examples above are designed to illustrate a point: architecting a system to balance IOPS and latency can consume hours of discussion about controllers, memory, disk and networking (as does any performance baselining and bottleneck detection exercise). In contrast, SSD can meet the IOPS requirement while delivering low latency with little modification or discussion. Consequently, latency and IOPS are easily balanced when using SSD – CHECK.
The final two constraints mitigated when using SSD compound each other – cooling and power. Let’s take cooling first. For a system to be properly cooled it must be either properly powered or favorably located geographically. For simplicity, let’s assume you can’t build your data center in Prineville, OR. In that regard, it must be properly powered.
Since power must be adequate, the first thing a storage architect must consider is whether or not they can cool and power storage devices. Larger capacity systems offering higher IOPS and balanced latency require more power to cool and run them, thus compounding requirements. An architect must work with data center operations to balance cooling power with storage device power.
Here is where borrowing from Jerome Wendt, Lead Analyst and President of DCIG, is prudent:
Quantifying performance on storage systems has always been a bit like trying to understand Russia. Winston Churchill once famously said in October 1939, “I cannot forecast to you the action of Russia. It is a riddle, wrapped in a mystery, inside an enigma; but perhaps there is a key. That key is Russian national interest.”
Power is limited to the amperage available from a public utility. Limits on available amperage create a fixed constraint. Choosing storage with reduced power and cooling needs mitigates the consequences of that fixed constraint. In that regard, SSD reduces the complexities introduced by conflicting power and cooling architectures. While some may disagree, we know SSD requires less power and less cooling, and with less cooling, power needs are further reduced. SSD can or will eliminate the complexity related to power and cooling requirements – CHECK (Reference: The real costs to power and cool, IDC 06/2008).
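To see how the fixed amperage constraint plays out, here is a back-of-the-envelope sketch in Python. Every figure in it (amperage budget, voltage, PUE, per-device wattage) is an assumption for illustration, not a measured or published value; the only point is that halving device wattage roughly doubles the devices a fixed utility feed can support.

```python
# A back-of-the-envelope sketch of the fixed amperage constraint described above.
# All figures (amperage budget, voltage, PUE, per-device wattage) are assumptions
# for illustration only, not vendor or utility specifications.

BUDGET_AMPS = 100   # assumed amperage the utility can deliver to the storage rows
VOLTS = 208         # assumed facility distribution voltage
PUE = 1.8           # assumed power usage effectiveness (cooling overhead included)

it_load_watts = BUDGET_AMPS * VOLTS / PUE  # watts left for devices after cooling overhead

assumed_device_watts = {
    "HDD (assumed ~10 W active)": 10.0,
    "SSD (assumed ~5 W active)": 5.0,
}

for device, watts in assumed_device_watts.items():
    count = int(it_load_watts // watts)
    print(f"{device}: ~{count} devices fit within the {BUDGET_AMPS} A budget")
```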
Articulating storage needs isn’t based solely on capacity. Storage architects must consider IOPS, latency (ORT), the capacity required to meet IOPS and latency needs, data center rack space in the form of “U space“, square footage for cooling, and physical (cooling/power) operations. While some will disagree that these are still required with SSD, their complexity is significantly reduced, if not eliminated, in a broad deployment of SSD.
While SSD can meet capacity, IOPS and ORT requirements while reducing power and cooling costs, many of today’s all-flash memory storage arrays are based on x86 software and hardware. It is x86 that creates a barrier to entry for data center deployment of SSD. While some may argue for x86 processing despite its high power-to-heat requirement, we know that ARM can deliver processing with substantially lower power-to-heat requirements.
To the point of power-to-heat, Calxeda published benchmarks indicating a 10x x86-to-ARM difference in power consumption versus data production, at the cost of roughly 5 ms of additional storage response time. From a marketing standpoint 10x is a great number, but even a 5x difference is “good enough.” 5x enables one to start thinking about replacing individual network attached storage (NAS) systems with private cloud scale-out storage systems using ARM processors and solid state storage modules or disk.
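As a rough sanity check on the “good enough” claim, here is a hedged sketch in Python. The x86 node wattage and IOPS figures are invented for illustration; they are not Calxeda’s published benchmark numbers.

```python
# A rough sanity check on the "good enough" claim. The x86 baseline figures below
# are invented for illustration; they are not Calxeda's published benchmark numbers.

X86_NODE_WATTS = 400      # assumed power draw of one x86 storage node
X86_NODE_IOPS = 50_000    # assumed IOPS delivered by that node

for advantage in (10, 5):
    arm_watts = X86_NODE_WATTS / advantage  # same work at 1/advantage of the power
    print(f"{advantage}x efficiency: ~{X86_NODE_IOPS:,} IOPS at ~{arm_watts:.0f} W "
          f"per node instead of {X86_NODE_WATTS} W")
```

Whether the advantage is 10x or 5x, the power and cooling freed up per node is large enough to change how many scale-out nodes a fixed electrical budget can host.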
In that regard, it is my opinion that the market will desire ARM servers based on common slot architectures. As noted above, common slot architecture allows an Intel or AMD x86 CPU and a Samsung or Calxeda ARM CPU to be plugged into the same board and system. Slot homogenization will reduce dependency on specific manufacturers’ motherboard designs (e.g., Intel) and allow for better elasticity in data center deployments.
As a result of homogenization, the market will pressure ARM processor vendors to enter the scale-out NAS space in 2013. To that end, Calxeda quietly noted its desire in late 2012 to enter the enterprise storage market in this piece by Stacy Higginbotham of GigaOM. Ms. Higginbotham writes:
Its tests show roughly a 4X improvement in IOPs for a rack of Calxeda SoCs versus x86-based systems. Adding Calxeda’s SoCs also cuts complexity because the entire system of processing and networking components are integrated on the SoC, and the terabit-plus fabric between cores also offers more network capacity between cores in a system – the so-called east-west networking traffic.
Calxeda’s commentary muted the value of SSD, because Calxeda believes buyers of power-hungry storage systems aren’t concerned about power consumption. Instead, it believes storage systems are looking for more IOPS by adding processing and memory capability to work through a backlog of disk operations. While that assertion has some flaws, the real value of ARM is power consumption and a reduced heat signature. ARM combined with SSD delivers an annuity of operational and capital expenditure savings.
Complementing Calxeda’s commitment to ARM is Apache‘s port of popular software that meets big data processing and storage requirements. Some will argue that SSD doesn’t make sense in big data. But common sense indicates that filling 10 PB of spinning disk (HDD) over a period of a few years requires you to start migrating about the time the last TB is added. Controller aging alone requires the data to find a new home almost immediately, or requires a common slot server upgrade.
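The arithmetic behind that claim is simple enough to sketch in Python. The 10 PB figure comes from the text; the sustained migration rate below is an assumption for illustration only.

```python
# The arithmetic behind "start migrating when the last TB is added".
# Capacity is from the text; the sustained migration rate is an assumption.

CAPACITY_PB = 10
MIGRATION_RATE_GB_PER_S = 1.0   # assumed sustained rate while production load continues

capacity_gb = CAPACITY_PB * 1_000_000           # 10 PB in decimal gigabytes
seconds = capacity_gb / MIGRATION_RATE_GB_PER_S
days = seconds / 86_400
print(f"Migrating {CAPACITY_PB} PB at {MIGRATION_RATE_GB_PER_S} GB/s takes ~{days:.0f} days")
```

Even at that assumed rate the copy alone runs for roughly four months, on controllers that are already several years old, which is why the migration effectively starts the moment the system is full.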
An all-SSD, ARM-based scale-out storage system using Open Compute common slot architecture reduces or eliminates the top four (4) storage architecture constraints AND delivers a storage ecosystem with the flexibility to exist in excess of 10 years.
Further complementing the marriage, ARM and SSD should have similar data center architecture requirements to tape. For example, let’s track a company like Quantum with StorNext. It may port StorNext to ARM and take advantage of $1/GB SSD prices as a way to transition customers from tape to new storage systems. Using ARM and SSD, very little would need to change in data center power and cooling.
Finally, look for companies like Samsung to be a powerful force in componentry as they continue to produce SSD and start the development of their ARM server processors. DCIG believes that as 2013 progresses, we’ll experience a pull from the market for these storage systems long before the manufacturers are geared up to push them.