One of the major problems facing enterprises adopting server virtualization is the heavy I/O load generated by their virtual machines (VMs). Beyond that, companies that also virtualize their desktops to create a virtual desktop infrastructure (VDI) face the added challenge of provisioning and managing large numbers of new virtual workstations. These I/O-intensive applications of virtualization technology present a major stumbling block to VM technology adoption.
Today we continue our blog series with Virsto Software CEO Mark Davis, where we discuss how Virsto manages the available physical server's resources through its position in the virtual server's hypervisor.
Ben: Resource management is certainly a big issue in virtualization. One of the biggest problems DCIG historically sees in this area is throughput and disk I/O. Virsto solves some of those problems by combining I/O streams and using gold images. To deal with the burdens associated with these disk throughput issues, does Virsto set aside some memory for caching to prepare the streams before sending them to the SAN or NAS or whatever is behind your vDisk?
Mark: It is pretty small, on the order of 100 megabytes per physical server. It is quite negligible, actually.
Ben: If a CPU on the box is completely pegged out at 100 percent, you have to take some performance hit there, correct?
Mark: Yes, in theory. Although again, what we find is that if CPU utilization is pegged out and a company adds our software, it will actually get CPU cycles back. Our software is doing extra work, but it makes all the other use of server resources much more efficient, so the net effect is that the physical server always gets CPU cycles back.
Ben: Where does Virsto fit? In vSphere it would sit within the vSphere stack itself, right?
Mark: Yes. Virsto installs into ESX, just as in Microsoft Hyper-V it installs into the Hyper-V partition itself. In the case of ESX, Virsto installs as a virtual machine. So, Virsto is a virtual appliance.
We install one virtual appliance per physical server. Each of these virtual appliances is aware of each other in a VMware data center or VMware cluster. This virtual appliance is what does all the intelligent things to the I/O.
How we do that is, when we do writes, for example, instead of sending the writes from the 50 VMs running on that server as 50 different I/O streams, we coalesce them all into a single sequential write stream that we write to a log. Instead of trying to write directly to the end disk, we write to a log.
This log is on a non-volatile piece of storage – sometimes a rotating disk, sometimes a solid state disk, according to the customer’s requirements and configuration. When we write into this log, it allows us to much more efficiently use the I/O buffers and use the hardware. As soon as the write hits the log, we can acknowledge the write back to the host. From that moment forward, that data is secure.
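The coalescing-and-log write path Mark describes can be sketched in a few lines of Python. This is a minimal illustration of the general log-structured idea, not Virsto's actual implementation; the class and method names are assumptions.

```python
# Illustrative sketch: coalescing random per-VM writes into one sequential log,
# then destaging to back-end storage later. Not Virsto's actual code.

class WriteLog:
    """Append-only log standing in for a non-volatile device (disk or SSD)."""

    def __init__(self):
        self.entries = []       # stands in for sequential on-disk records
        self.next_offset = 0

    def append(self, vm_id, lba, data):
        """Fold a VM's random write into the sequential log and acknowledge it."""
        self.entries.append({"off": self.next_offset, "vm": vm_id,
                             "lba": lba, "data": data})
        self.next_offset += len(data)
        # Once the write hits stable log media, it is safe to ack the host.
        return "ack"

    def destage(self, backend):
        """Later, drain the log and place data on the back-end store."""
        for e in self.entries:
            backend[(e["vm"], e["lba"])] = e["data"]
        self.entries.clear()


log = WriteLog()
log.append("vm1", 0, b"aaaa")   # writes from many VMs land in one stream
log.append("vm2", 8, b"bbbb")
backend = {}
log.destage(backend)            # optimized placement happens out of band
```

The key property is that the hot path does a single sequential append per write, while placement on the end disks is deferred and batched.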
At a later time, we take the data out of that log and place it in a highly optimized way on the back end storage. In the process of doing all this, we handle things like snapshots or provisioning of, I do not know, 10,000 desktops made out of a clone from a single golden master snapshot. We do this process out of the log.
This unique architecture is what allows us to deliver not only super high performance, because we have optimized the writes through the log, but also optimized reads, because we can do really smart data placement. We can also provide super rapid provisioning, because provisioning a new virtual machine as a clone of a previous VMDK is literally as simple as making a tiny little mark in the log. Everything before the mark is pre-snapshot, and everything after is post.
We can also do these huge numbers of clones. We have been able to demonstrate making thousands, even tens of thousands, of these fully-readable, fully-writable, full-performance clones: thousands in literally a handful of seconds, and tens of thousands on the order of a few minutes.
We recently did some performance benchmarking with a major storage vendor, a major server vendor, and Microsoft. Without our software it took 32 hours to get 1,000 virtual machines provisioned. With our software, provisioning the storage for that was a matter of a couple of minutes.
Ben: Well, there is nothing really to read and write in that situation, because it is deploying from a master, and through the deduplication process there is really nothing to move around, right?
Mark: Yes. I would not say “nothing,” there is just not much and there is a lot less to move around.
Ben: You get a bunch of marks in log files and some minimal copying – a block here, a block there, change the host name, and that sort of thing. But it is pretty small compared to the initial VM, right?
Mark: Exactly. It is very little, and of course there is the provisioning problem. There is also, in virtual desktops in particular, the well-known problem of the boot storm, when everybody is booting their virtual machine at the same time.
That causes a whole bunch of I/O which is heavily write-dominated. Then there is the logon storm, which is even worse – really, really write-oriented. This is where most snapshot and cloning technology falls down, because almost any cloning technology other vendors have really degrades performance as you do writes to it.
In fact, it dramatically degrades performance to the point that many vendors who offer these features have to tell you in their application notes, "Do not use this in production." It is fine for backup, it is fine for non-production work. But the performance characteristics are simply too slow to use in production.
This, we think, is a problem. We think there is a great reason to do space efficiency, whether you call it de-dupe or "no dupe." There is a great reason to thin provision storage, because we all know that most storage that gets allocated in the world never gets written to.
But at the same time, we need to be able to have high performance so that the end user experience in a VDI environment is super sharp and crisp. You cannot do that with any other writable clone technology that we are aware of.
In Part I of this interview series, we looked at how Virsto creates a storage hypervisor in VMware vSphere to give incredible boosts in performance, even using traditional storage.
In Part II of this interview series, we looked at what the virtual machine I/O problem is and how Virsto fixes it.
In Part IV of my blog series with Virsto Software CEO Mark Davis, we will look at how Virsto fits into the private cloud infrastructure storage space and what it does to optimize the performance of SSDs.