Data Protection, I/O Bottlenecks and iSCSI SANs Top User Concerns at Quarterly Omaha VMUG Meeting

To prepare this week’s Friday recap blog, I went on the road. Well, sort of. I actually went down the road about 5 miles to attend the quarterly Omaha VMware User Group (VMUG) meeting that was held at the Bellevue, NE, CoSentry data center facility.

Going into the VMUG meeting, I was expecting to find maybe 40 – 60 users in attendance. However, upon my arrival I found a steady stream of cars pulling into the parking lot and over 200 users registered to attend, and I counted more than 150 people physically present at the event. Anyone who still doubts the impact virtualization is having on organizations need doubt no more.

While some might be surprised to learn that there were 200 people registered to attend an Omaha VMUG meeting (some are probably surprised just to learn that there are more than 200 people living in the Omaha area), I did not rule out the possibility of a good turnout. Companies like Google and Yahoo have been building data centers in the Omaha area for the past few years, adding to the already sizable number of data centers that call Omaha home.

The presence of all of these data centers also means that large IT staffs are needed to architect, design and support them, so I knew there was the potential for a lot of people to attend. Still, 200 people exceeded my expectations, and it was clear that those in attendance were hungry to learn and collaborate.

The event was broken down into four general tracks – a lab, a main floor presentation, a tour of the CoSentry data center and an interactive architectural white board session. Of the four tracks, it was the interactive architectural white board session that drew the most interest. At least half of those at the event huddled around the white board and the four systems engineers hosting the session, straining to hear what they had to say about how to best implement and support VMware.

The questions and comments that the users shared were insightful as to the challenges they were having, and they seemed to break down into three main areas of concern.

First, protecting virtual machines is a major concern, which comes as no surprise. However, what I did find surprising was how few were using native VMware backup utilities like VMware Consolidated Backup (VCB) or the vStorage API found in vSphere, and how many were using snapshot utilities found on disk storage systems.

While one should by no means consider my results below a scientific sample, when the presenters asked the users how many were using VCB or the vStorage API, I saw maybe 5 or 6 hands go up. But when they asked how many were using the snapshot feature on the disk storage system attached to their virtual servers, there was a noticeable ripple in the audience, with maybe a third nodding their heads yes or raising their hands.

I find this significant since I blogged earlier this week about how CommVault® Simpana® had added new support for Dell EqualLogic systems to its SnapProtect™ feature. While I mused in that first blog on a number of reasons why users would find this new functionality beneficial, it did not occur to me until I was with this VMware user group that many users are still unfamiliar with the native backup utilities found in VMware.

What might also be slowing the adoption of these native VMware backup technologies is that the systems engineers on hand were evidently not big fans of VMware’s backup utilities. While they did not make any outright disparaging remarks, they did comment that the vStorage API still has a lot of maturing to do, which effectively raises doubts in the minds of users.

The second item of note that came up during this white board session was the number of people reporting I/O bandwidth constraints in their virtual environments. The systems engineers again shared that they regularly see I/O bottlenecks, especially in environments with blade servers.

The specific problem that users can encounter is that as they load multiple virtual machines (VMs) on blade servers, all of these VMs start contending for storage resources across the same shared backplane. This can cause a significant hit in performance for all of the VMs hosted across all of the blade servers in a specific rack.

Aggravating this performance problem, the engineers warned, is that there is no really good way to measure performance on individual VMs. On physical machines, you turn on Performance Monitor in Windows and watch the queue length; if it is consistently greater than 1, you know you have a performance problem.

Not so in virtual environments. When these application servers are virtualized, their performance monitors become largely irrelevant. Since so many virtual machines share the same physical resources (backplanes, network cards, and paths to disk), there is no way to definitively determine whether the reading displayed in the performance monitor accurately reflects the performance of the application on that virtual machine.

For example, the application on the virtual machine could now be negatively impacted by the activity of another virtual machine on that same physical machine. In the case of blade servers that share the same backbone to physical storage as other physical servers, it is even possible the performance impact could come from other virtual machines on other physical blade servers accessing the same storage.
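The physical-machine rule of thumb the engineers described (a queue length consistently greater than 1) can be sketched in a few lines of Python. This is only an illustration: the function name, the sample readings, and the reading of "consistently" as "at least 90% of samples" are all assumptions of mine, not anything presented at the meeting.

```python
from typing import Sequence


def has_sustained_queue(
    samples: Sequence[float],
    threshold: float = 1.0,     # rule of thumb from the text: queue length > 1
    min_fraction: float = 0.9,  # assumption: "consistently" = 90% of samples
) -> bool:
    """Return True if at least min_fraction of the samples exceed threshold."""
    if not samples:
        return False
    above = sum(1 for s in samples if s > threshold)
    return above / len(samples) >= min_fraction


# Hypothetical queue-length readings taken at a fixed interval.
busy = [2.1, 1.8, 3.0, 2.4, 1.6, 2.2, 1.9, 2.7, 0.9, 2.3]
idle = [0.2, 0.1, 1.4, 0.3, 0.0, 0.2, 0.5, 0.1, 0.3, 0.2]

print(has_sustained_queue(busy))
print(has_sustained_queue(idle))
```

As the engineers cautioned, a check like this is only meaningful on a physical machine. Inside a VM, the same readings may reflect contention from other virtual machines rather than the application being measured.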

The third and final problem that still shows up is virtual server environments implemented with iSCSI SANs on existing corporate LANs. While keeping iSCSI traffic off of corporate LANs should be considered best practice and has been written about many times before, it was clear from the engineers' comments that they still regularly encounter iSCSI SANs running over corporate LANs.

One engineer shared the story of going into a customer account to install a new IP phone system and noticing that its performance was abysmal. He did some investigation and discovered that the Ethernet switch that this IP phone traffic was to traverse was already running at 85% utilization. Further investigation uncovered that someone had implemented an iSCSI SAN across the corporate LAN.

So he recommended that the customer purchase a new Ethernet switch and move the servers using the iSCSI SAN onto a physically separate network. After that traffic was moved off of the corporate LAN, application performance on all of the servers jumped dramatically.

After the engineer finished telling that story, one user in the crowd spoke out. He said, “If a vendor tells you that you can run iSCSI over your existing Ethernet network, that should be a clear signal that you need to look for another vendor to implement your iSCSI SAN.”

In any case, there was good stuff at the Omaha VMUG meeting, and I recommend that anyone who wants to learn more about how to best implement virtualization make a point of attending one in their local community.

Also, a note to any storage vendors reading this blog: the Omaha VMUG is setting up a virtual test lab and needs used but good server, networking and storage hardware and software. If you can help them out with any donations, please email me at jerome.wendt@dcig.com and I will put you in touch with the appropriate individual.

Have a good weekend everyone!
