VMware And Its Storage - Faster, Smarter, Stronger... Free?
July 12th, 2010
Sometimes, though rarely, you do get something for free. VMware has had APIs built into its code since the beginning, and while some of the earlier features, such as VCB (VMware Consolidated Backup), were a little rough around the edges, a new set of APIs is due out in vSphere 4.1 that is really going to impact the performance and scalability of your virtual infrastructure.
I will be the first to admit that I tend to see things from the storage side of the equation, so this latest news is particularly interesting to me. But anyone into squeezing the best bang out of their virtualized hardware investment should be pretty jazzed about this. These latest APIs specifically target how VMware can leverage a “smart” storage array to make virtual guests go even faster on existing hardware. The new “family” of APIs is called the vStorage APIs for Array Integration (VAAI). The name differentiates them from existing APIs such as the ones for data protection, multipathing, and Site Recovery Manager.
I have long said that we in IT don’t fix issues, we push them around, and this is exactly what these three new APIs do. More specifically, they take tasks that your server hardware is doing and move those tasks to the storage array hardware. This has two major benefits. First, server resources such as CPU and RAM can now serve the virtual machines themselves rather than the “administrative” work of the care and feeding of the underlying VMFS. Second, these tasks generate considerable network traffic (IP or FC, depending on your storage network’s flavor) between the array and the server infrastructure; so again, more of the network’s resources go to serving the needs of the business applications and less to the underpinnings of vSphere.
There are a few caveats (I know you saw it coming). For now, it’s all about block-level access, meaning raw device mappings and VMFS, so if you were one of the early NFS adopters you are going to have to sit on the bench for a while yet. You will also have to be using an array that supports the APIs (kinda obvious). The good news is that while EMC will be the first kid on the block with the new toys, the APIs are based on standard SCSI commands, so other manufacturers should not be too far behind.
OK, now onto the goodies…
First up is hardware accelerated locking. One of the features that makes VMware the data center tool it is revolves around the ability of multiple physical systems to work together in a cluster. Since all the machines see all the guest files at the same time, file locking is a big deal. If any two servers try to write to the same file at the same time, well, bad things happen. This locking process takes commands before, during, and after an actual update. When you have many machines performing these updates, this amounts to millions of commands. The new API reduces a large number of lock commands to a single SCSI command. While this will have some effect on performance, the bigger win is that the reduced command count will allow VMware clusters to become much larger, because less effort goes into arbitrating all this locking.
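To make the "many commands versus one" idea concrete, here is a toy sketch in Python. It is not VMware's or any array vendor's code. The class `ToyArray`, the host names, and the exact command sequence for the legacy path (reserve, read, write, release) are assumptions made for illustration; the offloaded path assumes the single command behaves like an atomic compare-and-write, which is how this kind of locking is commonly described.

```python
class ToyArray:
    """Hypothetical stand-in for an array holding one on-disk lock record."""

    def __init__(self):
        self.lock_owner = None  # None means the lock record is free
        self.commands = 0       # count of SCSI-like commands received

    # Legacy-style primitives: each call costs one command on the wire.
    def reserve(self):
        self.commands += 1

    def read(self):
        self.commands += 1
        return self.lock_owner

    def write(self, owner):
        self.commands += 1
        self.lock_owner = owner

    def release(self):
        self.commands += 1

    # Offloaded primitive: the array compares and updates in one step.
    def compare_and_write(self, expected, new):
        self.commands += 1
        if self.lock_owner == expected:
            self.lock_owner = new
            return True
        return False


def legacy_acquire(array, host):
    """Assumed legacy flow: reserve the LUN, check, update, release."""
    array.reserve()
    try:
        if array.read() is None:
            array.write(host)
            return True
        return False
    finally:
        array.release()


def offloaded_acquire(array, host):
    """One command: take the lock only if nobody holds it."""
    return array.compare_and_write(None, host)


legacy, offloaded = ToyArray(), ToyArray()
legacy_acquire(legacy, "esx01")
offloaded_acquire(offloaded, "esx01")
print(legacy.commands, offloaded.commands)
```

In this toy model the legacy path costs four commands per successful lock and the offloaded path costs one. Real command counts will differ; the point is only that the arbitration moves into the array.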
The next new feature is called hardware accelerated zero. VMware zeros out blocks inside a virtual machine’s file when the file expands. This means that as you add data to a virtual machine, two or more writes are often necessary to actually get your data into the file. This is a huge overhead. With the new API, the host only needs to tell the array how much space to zero out, and the array performs the task rather than the host. This can reduce the IO overhead by a factor of 2 to 10. Data writes are already expensive in terms of parity calculation, so this will be a huge improvement in overall array write performance.
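Here is a small Python sketch of why offloaded zeroing saves traffic. It is a toy model, not the real protocol: `ToyLun`, the 512-byte block size, and the `write_same` method (one block-sized pattern plus a repeat count) are all assumptions chosen for illustration.

```python
BLOCK_SIZE = 512  # bytes per block; an assumed value for this sketch


class ToyLun:
    """Hypothetical LUN that tracks how much payload the host sent it."""

    def __init__(self, num_blocks):
        # Start with non-zero contents so zeroing is visible.
        self.data = bytearray(b"\xff" * (num_blocks * BLOCK_SIZE))
        self.payload_bytes = 0

    def write(self, lba, payload):
        """Ordinary write: the whole payload crosses the wire."""
        self.payload_bytes += len(payload)
        start = lba * BLOCK_SIZE
        self.data[start:start + len(payload)] = payload

    def write_same(self, lba, count, pattern):
        """Offloaded write: one block of pattern, repeated by the array."""
        assert len(pattern) == BLOCK_SIZE
        self.payload_bytes += len(pattern)
        start = lba * BLOCK_SIZE
        self.data[start:start + count * BLOCK_SIZE] = pattern * count


def zero_from_host(lun, lba, count):
    """The host sends every zeroed block itself."""
    zeros = bytes(BLOCK_SIZE)
    for i in range(count):
        lun.write(lba + i, zeros)


def zero_offloaded(lun, lba, count):
    """The host sends one zeroed block and says how many times to repeat it."""
    lun.write_same(lba, count, bytes(BLOCK_SIZE))


host_lun, array_lun = ToyLun(8), ToyLun(8)
zero_from_host(host_lun, 2, 4)
zero_offloaded(array_lun, 2, 4)
print(host_lun.payload_bytes, array_lun.payload_bytes)
```

Both LUNs end up with identical contents, but in this model the host-driven path ships every zeroed block while the offloaded path ships a single block no matter how large the range is.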
The final, and perhaps most obvious, tool introduced is hardware accelerated copy. Here, instead of the VMware hosts moving the files for Storage vMotion and for the creation of machines from VM templates, the copy request is handed to the array, which simply moves the data internally. While this is not an everyday occurrence in most shops, the savings in network, array, and server resources are huge.
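A back-of-the-envelope sketch in Python shows where the network savings come from. Everything numeric here is an assumption for illustration: the 40 GiB virtual machine, the chunk size per copy request, and the size of each request descriptor are made-up values, not figures from VMware or any array vendor.

```python
GIB = 2 ** 30
MIB = 2 ** 20


def host_copy_traffic(size_bytes):
    """Host-based copy: data is read to the host, then written back."""
    return 2 * size_bytes


def offloaded_copy_traffic(size_bytes, chunk_bytes, descriptor_bytes):
    """Offloaded copy: only small 'copy this range' requests cross the wire."""
    commands = -(-size_bytes // chunk_bytes)  # ceiling division
    return commands * descriptor_bytes


vm_size = 40 * GIB  # hypothetical VM being cloned
print(host_copy_traffic(vm_size))
print(offloaded_copy_traffic(vm_size, chunk_bytes=4 * MIB, descriptor_bytes=64))
```

Under these assumed numbers the host-based clone pushes twice the VM's size across the storage network, while the offloaded clone sends well under a megabyte of requests. The array still has to move the data internally, so the work does not vanish; it just stops consuming host and network resources.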

To me, the most amazing thing about these new tools is that (assuming your array supports them) you will be getting them for free when vSphere 4.1 comes out later this year. I guess we can chalk it up to our maintenance dollars at work deep in the confines of VMware. So keep those maintenance contracts up to date. The upgrade is going to be worth it!