Deduplication Wars: EMC Avamar vs CommVault Simpana

OK, so you have decided that deduplication is the best thing ever and a must-have for your backup needs. The next big question on the horizon is what KIND of deduplication is right for you. Two of the big hitters in the market today are EMC’s Avamar and CommVault’s Simpana. Both products seem to be doing very well in the wild, and they approach deduplication in completely different ways.
In the case of Avamar, the product deduplicates at the client using variable block deduplication. Once the scan is complete on the client server and the deduplication hashes are created, the client checks back with the Avamar Data Store appliance farm to see which blocks the farm has not seen, and then only the truly unique blocks across the environment (not just that server) are sent over the wire. This results in extremely high levels of deduplication AND remarkably fast backups, since very little data is normally left to send after deduplication and comparison to the rest of the environment. The data is stored on the EMC Data Store appliance, which presents a pretty simple GUI for recovery. The only major chink in the armor of Avamar is that it does not have the ability to natively create tapes for those data sets you may want to retain longer than you have space to keep on the appliance farm.
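To make the source-side flow concrete, here is a minimal Python sketch of the idea: chunk at the client, hash each chunk, ask the store which hashes it has never seen, and send only those. Everything in it is an assumption for illustration. `chunk_variable`, `DataStore`, and `backup` are hypothetical names, the rolling value is a toy, and none of this is Avamar's actual algorithm or API.

```python
import hashlib
import random


def chunk_variable(data, mask=0x3F, min_size=16, max_size=256):
    """Cut data where a toy rolling value over recent bytes hits a pattern.

    Boundaries follow the content rather than fixed offsets. The sizes are
    tiny illustrative values, not anything a real product uses.
    """
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling = ((rolling << 1) + byte) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= min_size and (rolling & mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, rolling = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


class DataStore:
    """Hypothetical stand-in for the central store shared by all clients."""

    def __init__(self):
        self.blocks = {}

    def missing(self, digests):
        # The client asks: which of these hashes have you never seen?
        return [d for d in digests if d not in self.blocks]

    def put(self, digest, chunk):
        self.blocks[digest] = chunk

    def restore(self, recipe):
        return b"".join(self.blocks[d] for d in recipe)


def backup(data, store):
    """Client-side backup: returns (recipe of hashes, bytes actually sent)."""
    chunks = chunk_variable(data)
    recipe = [hashlib.sha256(c).hexdigest() for c in chunks]
    unique = dict(zip(recipe, chunks))
    sent = 0
    for digest in store.missing(unique):
        store.put(digest, unique[digest])  # only unseen chunks cross the wire
        sent += len(unique[digest])
    return recipe, sent


if __name__ == "__main__":
    rng = random.Random(0)
    shared = bytes(rng.getrandbits(8) for _ in range(20000))  # common data
    server_a = shared + b"A" * 500
    server_b = shared + b"B" * 500
    store = DataStore()
    _, sent_a = backup(server_a, store)
    recipe_b, sent_b = backup(server_b, store)
    print("server A:", len(server_a), "bytes, sent", sent_a)
    print("server B:", len(server_b), "bytes, sent", sent_b)
    assert store.restore(recipe_b) == server_b
```

The second server sends only a small fraction of its data because the store already holds the chunks it shares with the first, which is the "across the environment, not just that server" effect described above.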
CommVault came at the deduplication process a completely different way and leveraged their existing tape archive construct to create a form of fixed block deduplication. In this case the clients do the same thing they always did: run their scans, package the data, and then shoot it out. Once the data gets to the Media Agent, the deduplication occurs and the data is written out to any disk target supported by CommVault. Since the deduplication is fixed block, the deduplication ratios are not as good as with variable block (as in Avamar), but certainly much better than typical compression. Since the deduplication occurs on the Media Agent, there are no savings in backup window time. The good news is that this is CommVault: cutting tapes is its forte and completely automated, with the ability to have different retentions on each type of media to fit all your compliance desires within a single tool. Also, since the format of the media archive did not change, restores are just as fast with deduplicated data as they were with plain Jane backup to disk, which is huge if you have a lot of data to restore. Avamar can be sluggish in terms of restore on the smaller deployments but is still certainly functional for your everyday restore needs.
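A rough Python sketch of target-side, fixed block deduplication shows why the ratios tend to trail variable block. The full stream arrives first, then it is cut at fixed offsets and hashed. `MediaAgentStore`, `fixed_blocks`, and the 128-byte block size are assumptions made up for this illustration, not CommVault's implementation or settings.

```python
import hashlib
import random

BLOCK_SIZE = 128  # illustrative only; real block sizes are far larger


def fixed_blocks(data, size=BLOCK_SIZE):
    """Cut data at fixed offsets, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class MediaAgentStore:
    """Hypothetical target-side store: dedupes a stream after it arrives."""

    def __init__(self):
        self.blocks = {}

    def ingest(self, data):
        """Store unseen blocks and return how many new blocks were written.

        The whole stream has already crossed the wire by the time this runs,
        so deduplication saves disk here, not network or backup window.
        """
        new = 0
        for block in fixed_blocks(data):
            digest = hashlib.sha256(block).digest()
            if digest not in self.blocks:
                self.blocks[digest] = block
                new += 1
        return new


if __name__ == "__main__":
    rng = random.Random(1)
    original = bytes(rng.getrandbits(8) for _ in range(8192))
    store = MediaAgentStore()
    print("first full backup, new blocks:", store.ingest(original))

    # An in-place change touches a single block.
    patched = bytearray(original)
    patched[1000] ^= 0xFF
    print("one byte overwritten, new blocks:", store.ingest(bytes(patched)))

    # One byte inserted at the front shifts every block boundary.
    shifted = b"\x00" + original
    print("one byte inserted, new blocks:", store.ingest(shifted))
```

An in-place overwrite costs one new block, but a single inserted byte shifts every later boundary, so nearly every block looks new. Content-defined (variable) boundaries are meant to avoid exactly that.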
In the grand scheme, Avamar is the holy grail of backup speed since it only ever sends fractions of incremental data over the wire to the target, which reduces not only backup times but also the impact of backup on both the hosts and the network. I also give CommVault a major tip of the hat for how they leveraged their existing technology and morphed it into a deduplication technology that brings huge benefits to their current customer base while staying on the commodity hardware bandwagon.
Obviously there are many more features to both products worth investigating and comparing but now you know how the two differ technically in terms of the deduplication angle.
Photo Credit: JTony via Flickr


Thanks for the insight.
We’re currently a CommVault v9 shop that is looking at Avamar from two angles: one, as a replacement for our deployment of Barracuda for some remote sites, and two, as a means of supporting existing Avamar sites that may want to co-op with us as their DR site.
We’re testing version 5 currently and I must admit that the Avamar console reminds me of SyncSort – a product we left years ago. I’m reserving judgement on the performance until I’ve used it a bit longer. The sales team from EMC promises features and modules that we’ll need when v6 is released – we’ll see.
Any updates or opinions that you might have on these products would be greatly appreciated.
Cheers,
Norm
Norm,
Avamar certainly has the most “baked” source-based dedupe on the market, and version 6 cures some of the challenges around VMware backup performance through proxy, improves scalability, and brings a host of other worthy advancements. The key question in my mind is what your retention requirements are. While there are methods for Avamar to send its data to tape or a Data Domain box, these methods are not as robust as your CommVault incumbent, and it may be a while before they are, which matters if extended retention is a requirement.
Now that CommVault is offering source-based dedupe, I would think it would absolutely be worth a look. Depending on your recovery model and your typical restore size, it may make sense to have a Media Agent at the remote site to facilitate faster restores. No one wants to restore 100G over the wire, ugh.
I also agree with the interface comment that Avamar has a way to go, but I am enough of a gear head to see past it if I need the best dedupe codec out there. No one likes to learn a new backup interface, especially if you are coming from one of the more mature products out there, but I assure you it would not take long to get used to Avamar; it’s just a matter of spending the time with it.
In terms of supporting other Avamar sites out there, this is more of a business question than a technology question. Many shops with affiliations with other “sister” companies do this today with great success. The question is whether it fits your model or not. If so, and you want to put it in to support existing Avamar sites, the Barracuda displacement may be a nice collateral win. Watch those cost models and change rates though. CommVault and Avamar have different pricing structures, so it may be the almighty dollar that drives the decision if both technologies can get the job done.
So Norm, thanks for the response, and as usual I think I asked more questions than I answered but keep me in the loop on how things progress and as new questions arise and I will be happy to chime in.
David
It’s great to see your view on de-duplication. It’s a hotly contested topic right now in the industry, with many taking sides on source or target based de-duping.
I sit in the CommVault camp right now, as I have done for the past 3 or more years since investing a lot of time and training skilling up in the product.
I do see value in both methodologies but would like to add that while the Simpana offering as it stands does not offer source based de-duping, there is at least the option to place a MediaAgent at the source location to stage de-duplicated data prior to sending to a central storage farm. I will add, though, that this does not come without some cost in software licensing.
Hopefully we will soon see further evolution of both EMC’s Avamar and CommVault’s Simpana products. Until then, we should always evaluate the options and consider what fits our requirements and environment best.
David:
Thanks for writing up this great summary!
We currently use CommVault to back up our remote file and Exchange servers to local tape based media agents. I was really excited to hear about CommVault’s de-dupe functionality in 8.0 and the huge potential to save money on tape costs, but at the same time disappointed that they couldn’t work some magic into the client to do the dedupe before sending the data over the wire to the media agent. Perhaps they will add this in one of their next service pack releases.
Anyways, I’m kicking off a project next week to re-architect our remote office backups in hopes that we can easily transition to WAN based backups using a dedupe solution.
Since we are a big EMC shop, I’ve heard lots about Avamar from our local EMC rep and have been very impressed with what I’ve seen so far, except for the lack of tape out support like you mentioned. They say they have it coming in their next release, but from the details I’ve heard thus far, it sounds rather clunky. Hopefully, the demo that we have set up for next week will shed some light on this. Either way, it’s going to be a tough decision to make. We either save money on tape with CommVault dedupe, using our existing remote media agents to perform the dedupe before the data is sent over the WAN, or we save money and time in the long run by eliminating the remote backup server and switching to Avamar’s source based dedupe backup solution.
Further insight on this topic would be great!
Thanks
Bryan