How much storage should you provision for your VM? Never worry about it again
Storage provisioning is one of the most crucial first steps in enterprise application deployment.
If you plan to share storage between hosts in a cluster, getting it right gets even harder.
And if you need to set a RAID level, whoa nelly!
Setting the correct RAID level, managing the needs of different VM workloads, and balancing that against cost over time add up to a forbidding calculation. Add in the different workload characteristics of the applications and the need to run consolidated workloads on a single storage array, and you have an ugly decision to make. Although one could argue that delivering the highest level of performance to the application is the ultimate goal of storage provisioning, most storage admins don’t have the luxury of a limitless budget.
This post is Part 3 in our series exploring how flash impacts VM performance and deployment strategies. Our first post covered the impact of VM host / storage array block alignment, the second post covered the performance differences between different VM types, and this third post will cover how different LUN characteristics and datastore sizes impact performance.
Storage Provisioning for Performance Sucks!
Storage admins provisioning LUNs usually rely on historical perspective (lessons learned working with the application admins), vendor white papers, and application best practices for storage. However, when that information isn’t available or relevant, you have to experiment with predictive and adaptive schemes to make a LUN provisioning decision. In the predictive scheme, several LUNs with different storage characteristics are provisioned and a VMFS datastore is created on each. In the adaptive scheme, one large LUN with a certain RAID level is created and multiple VMDKs are carved out of it. Applications are then run with the VMs consuming the different VMDKs on the different datastores. If application performance is not in the acceptable range (likely), a new RAID layout or cache tuning is applied and the whole experiment is repeated until performance meets the requirement. When multiple VMs share a LUN, VMware vSphere provides disk shares and Storage I/O Control (SIOC) as mechanisms to prioritize VMs or give them equal access.
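To make the tedium of that adaptive approach concrete, here is a schematic Python sketch of the loop; the candidate layouts, the latency SLO, and the random “measurement” are purely hypothetical, an illustration of the process rather than real tooling.

```python
import random

# Schematic sketch (not real tooling) of the adaptive trial-and-error loop:
# provision a layout, run the workload, measure, and repeat until it passes.
LATENCY_SLO_MS = 10.0
CANDIDATE_LAYOUTS = ["RAID-10", "RAID-5 + cache tuning", "RAID-6 + cache tuning"]

def run_workload_and_measure(layout: str) -> float:
    """Stand-in for re-provisioning the LUN and re-running the application."""
    return random.uniform(2.0, 25.0)  # pretend latency measurement in ms

for layout in CANDIDATE_LAYOUTS:
    latency_ms = run_workload_and_measure(layout)
    print(f"{layout}: {latency_ms:.1f} ms")
    if latency_ms <= LATENCY_SLO_MS:
        print(f"Keeping {layout}.")
        break
else:
    print("No layout met the SLO; re-tune and start the experiment over.")
```

Every pass through that loop is a new layout, a new datastore, and another round of application testing, which is exactly why the method is so slow.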
The majority of storage problems originate from calculating IOPS (I/O operations per second) without considering the underlying RAID geometry. There are a number of articles on this topic; check out Aaron Delp’s blog, Duncan’s blog, Scott Lowe’s blog, or Scott Lowe’s other post on the subject. Given that there are only so many IOPS per drive and a finite number of drives, one has to balance LUN provisioning against the drives at one’s disposal.
The choice of RAID level dictates the IOPS penalty for writes; for example, if you configure RAID-5 or RAID-6 you are looking at a penalty of 4 and 6 I/Os respectively. So if a VM is going to generate 1,000 IOPS and has a known read/write ratio, you should account for the I/O penalty to provide the necessary and sufficient physical disks backing the LUN you are about to create.
For the curious reader, the formula for calculating backend IOPS based on the RAID level is below, followed by a quick worked example:
Backend IOPS = (Total IOPS × %Read) + (Total IOPS × %Write × RAID penalty)
- Total IOPS is the IOPS required by the application
- %Read and %Write are the application’s read and write percentages
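Here is a minimal Python sketch of that formula; the 1,000 IOPS workload, the 70/30 read/write split, and the ~180 IOPS per 15K RPM spindle figure are illustrative assumptions, and the write penalties are the commonly cited per-RAID-level values.

```python
# Back-of-the-envelope calculator for the backend IOPS formula above.
# Write penalties are the commonly cited values for each RAID level.
RAID_WRITE_PENALTY = {"RAID-10": 2, "RAID-5": 4, "RAID-6": 6}

def backend_iops(total_iops: float, read_pct: float, raid_level: str) -> float:
    """Backend IOPS = (Total x %Read) + (Total x %Write x RAID penalty)."""
    write_pct = 1.0 - read_pct
    return total_iops * read_pct + total_iops * write_pct * RAID_WRITE_PENALTY[raid_level]

# Hypothetical VM: 1,000 IOPS at a 70/30 read/write ratio on a RAID-5 LUN.
iops = backend_iops(1000, 0.70, "RAID-5")
print(iops)        # 700 + (300 * 4) = 1900.0 backend IOPS
print(iops / 180)  # ~11 drives, assuming ~180 IOPS per 15K RPM spindle
```

Note how the 1,000 front-end IOPS the application asks for turn into roughly 1,900 backend IOPS before a single drive has even been chosen.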
While this calculation looks simple enough, in the real world it works more by trial and error: if the application starts to experience high latency and the choice of RAID level turns out to be the culprit, a new layout has to be created and the steps repeated all over again. This method is both inefficient and unpredictable.
Efficiency and Cost Tradeoff
The next issue the storage admin has to tackle is whether to create a few large LUNs or many small LUNs. This is a tricky question: if you create a few large LUNs, you end up over-provisioning and wasting storage. Traditional spindle-based storage is not cheap, and under-utilizing the backend storage just to avoid bottlenecks is clearly not an optimal design. And if many applications with different workloads share the same large LUN, storage performance problems are all but certain. Thin provisioning is one way of solving the over-capacity issue, but it still doesn’t address the IOPS and latency problems described earlier.
If you create many small LUNs, you end up in a storage management nightmare, with hundreds of LUNs and no way to keep tabs on the ever-growing demand (storage sprawl, anyone?).
All these provisioning shenanigans are directly related to cost: storage admins want to maximize their investment, but they also want to keep applications running smoothly and deliver optimal performance to end users.
Provisioning with the FlashArray is Easy
The FlashArray does away with these issues by handling provisioning inside the array’s architecture. It eliminates performance and capacity provisioning tasks: you don’t need to care about IOPS calculations, RAID penalties, disks per LUN, or any of the other things storage admins constantly worry about.
No IOPS Calculations
The FlashArray creates thin-provisioned LUNs that span all the drives in the array, which ensures full utilization of those drives and also provides reliability and availability for the LUN. Storage admins do not have to worry about how many drives back a given LUN, or about any of the complicated math and formulas associated with it. They just create the LUN and we handle the inner mechanics of provisioning, which is arguably how all storage arrays should work, flash or otherwise. So that’s it:
- No more tweaking.
- No more trial-and-error experiments with different layouts.
- Just stop doing it. Period.
We have made this ambiguous and unpredictable decision obsolete, while delivering the enterprise-class resiliency, data integrity, performance, and reliability you can’t live without. We did this so users don’t have to manage storage anymore. It is possible through the unique combination of our patented RAID-3D protection scheme, our end-to-end data integrity fabric, and the FlashCare technology that earned us the name “the flash whisperers” from industry bloggers.
A deeper dive into the architecture can be found on the Purity OE page, Nigel Poulton’s blog, and our enterprise reliability page. We are able to achieve hundreds of thousands of IOPS with sub-millisecond latency. This leads to the next benefit of the FlashArray: there are no pesky performance problems to debug. That almost sounds like nirvana; it is.
No RAID Penalty
Yes, that’s right. LUN provisioning doesn’t have to take the RAID penalty into account; there is absolutely no RAID penalty to worry about during provisioning. All the data on a Pure Storage FlashArray is protected by RAID-3D technology that is purpose-built for flash. RAID-3D implements three orthogonal levels of protection, each designed to heal around a specific failure mode of flash.
The first level is cross-device RAID, delivering at least dual-parity protection for all SSDs in the system, and allowing the FlashArray to restore perfect parity in minutes even after multiple drive failures, all without user intervention.
At the second level, RAID-3D ensures that the array delivers sub-millisecond latency even if a particular SSD is intermittently unavailable, by rebuilding the data from parity (i.e., we treat multi-millisecond latency as a failure condition).
And finally, the third level of RAID-3D is end-to-end data integrity protection from bit errors. Raw MLC flash suffers from higher bit error rates than rotating disk, so the Purity operating environment implements a completely independent third level of parity to auto-correct bit errors within a particular segment.
A Few Big LUNs or Many Small LUNs: Who Cares??
We have performed many tests in our lab and received consistent feedback from customers about LUN performance: it doesn’t matter whether it’s a large LUN or a small LUN, the performance tends to be the same. All LUNs are created EQUAL in all respects, performance included. It doesn’t matter whether you use a few large 64 TB LUNs and datastores (the new LUN maximum in vSphere 5.0) in the case of VMware or create a number of smaller thin-provisioned LUNs; there is no performance penalty. Indeed, using a thin-provisioned datastore on an already thin-provisioned LUN is perfectly valid. The VMware vStorage APIs for Array Integration (VAAI) provide a thin provisioning API that helps the FlashArray act on out-of-space conditions and also provides a means to implement soft and hard thresholds. Thin-on-thin is a valid use case (see the second post for a performance smackdown between the different VMFS formats). We also support the ‘volume grow’ operation in case you want to resize a small LUN at a later date. You can have it any way you want it, now or later!
Pure Storage makes LUN provisioning as easy as 1 – 2 – 3.
With Pure Storage, storage provisioning is extremely simple. It takes just three steps, sketched in the example below:
- Create a cluster and host group
- Create a LUN of any size
- Connect the LUN to the cluster and host group
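To show just how short that workflow is, here is a minimal sketch assuming the purestorage Python REST client; the array address, API token, host names, WWNs, volume name, and sizes are all hypothetical, and the same steps can be done in a few clicks in the GUI or with the Purity CLI.

```python
import purestorage

# Hypothetical array address and API token; substitute your own.
array = purestorage.FlashArray("flasharray.example.com", api_token="API-TOKEN")

# 1. Create the hosts and a host group for the cluster (WWNs are made up).
array.create_host("esx-01", wwnlist=["21:00:00:24:ff:11:22:33"])
array.create_host("esx-02", wwnlist=["21:00:00:24:ff:44:55:66"])
array.create_hgroup("vsphere-cluster", hostlist=["esx-01", "esx-02"])

# 2. Create a LUN (volume) of any size; no RAID or drive-count math required.
array.create_volume("vmfs-datastore-01", "4T")

# 3. Connect the LUN to the cluster's host group.
array.connect_hgroup("vsphere-cluster", "vmfs-datastore-01")

# The 'volume grow' operation mentioned earlier is just as simple, e.g.:
# array.extend_volume("vmfs-datastore-01", "8T")
```

That is the entire provisioning exercise: no backend IOPS formula, no RAID layout, no drive counts.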
By adding inline deduplication and compression, Pure Storage is able to provide customers with 5-10X more space than the actual raw capacity of the FlashArray. Combined with the use of consumer-grade MLC drives, the cost per usable GB of storage is in the range of $5-$10.
In short, there are no RAID geometries to select, no flash caches or tiers to provision, no caching/tiering policies to set, and no operator intervention required for fail-overs and device rebuilds. If the cost is equivalent to a disk array, why not go all flash?
This concludes our three-part blog series on how Pure Storage all-flash storage arrays drastically change how one thinks about storage provisioning, management, and day-to-day operations, especially in virtualized data centers. Please feel free to comment and share your thoughts.