Are Flash/Disk Hybrids Just HSM 2.0?
Hallelujah! Tiering within a storage appliance (intra-array or sub-LUN tiering, as opposed to the universally practiced inter-array tiering) is very hard to get right. The challenges are reminiscent of those faced with Hierarchical Storage Management (HSM) in the 90s. HSM was an attempt to marry an array of disks and an automated tape library within a single appliance. HSM sounds great in principle: faster, more expensive storage for hot data, and cheaper, slower storage for cold data. So why didn’t HSM take off?
The problems with HSM are (1) complexity of management (another tier to manage, complex policies to decide what data goes where) and (2) the lack of predictable performance. While better automation can address #1, #2 is inherent to HSM: whenever you miss disk and hit tape, you’re looking at a latency spike of multiple orders of magnitude. It is very hard to design applications in the face of such widely disparate response times. And if you were going to configure enough disk to ensure your users never hit tape, you were arguably better off with a performance disk tier and a separate automated tape library for backup and archiving (i.e., inter-array tiering).
Our argument is that tiering across flash and hard drives within a single array is HSM 2.0. End users face similarly disparate latencies as they fall through solid-state flash to mechanical disk, particularly as their vendors employ the most cost-effective (and slower) multi-gigabyte SATA drives. From the perspective of a modern CPU doing the random I/O required for virtualization and database workloads, these drives really do look like tape.
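To make the latency argument concrete, here is a minimal Python sketch of the average response time of a two-tier hybrid. The 100-microsecond flash figure, the 10-millisecond disk figure, and the function name are illustrative assumptions, not measurements from any product.

```python
# Illustrative assumptions only: rough per-I/O service times in microseconds.
FLASH_US = 100.0      # assumed flash read latency
DISK_US = 10_000.0    # assumed random read latency on a SATA drive


def mean_latency_us(hit_rate, fast_us=FLASH_US, slow_us=DISK_US):
    """Average latency when a fraction `hit_rate` of I/Os is served by the fast tier."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * fast_us + (1.0 - hit_rate) * slow_us


if __name__ == "__main__":
    for hit_rate in (0.99, 0.95, 0.90):
        mean = mean_latency_us(hit_rate)
        print(f"hit rate {hit_rate:.0%}: mean {mean:,.0f} us, "
              f"{mean / FLASH_US:.1f}x the flash latency")
```

Under these assumed numbers, even a 95% flash hit rate averages roughly six times the flash latency, and every individual miss still pays the full disk penalty. That is the unpredictability the HSM comparison is pointing at.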
Our thesis, and Forrester’s, is that dedupe and compression will do the same for flash in performance storage that they did for hard drives in backup and archiving: enable a faster but more expensive medium to be cost-competitive with a slower, cheaper one. With the 5-10X deduplication and compression ratios that Pure Storage has seen for our customers’ virtualization and database workloads, you really can get all-flash storage at below the price you have been paying for enterprise disk arrays of 15K hard drives (and that’s without any flash cache)! These savings from data reduction cannot easily be extended to mechanical disk, because deduplication is random I/O intensive, for which disk is >95% inefficient.
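The cost claim comes down to simple arithmetic: the effective price per usable gigabyte is the raw price divided by the data reduction ratio. The sketch below works that through. The 5X and 10X ratios come from the range quoted above, while both raw prices and the function names are hypothetical placeholders chosen only to show the calculation.

```python
def effective_cost_per_gb(raw_cost_per_gb, reduction_ratio):
    """Cost per usable GB after dedupe and compression."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return raw_cost_per_gb / reduction_ratio


def break_even_ratio(flash_raw_per_gb, disk_raw_per_gb):
    """Reduction ratio at which flash matches disk that gets no reduction."""
    return flash_raw_per_gb / disk_raw_per_gb


if __name__ == "__main__":
    FLASH_RAW = 10.0  # hypothetical raw $/GB for flash, not a quoted price
    DISK_RAW = 2.5    # hypothetical raw $/GB for 15K disk, not a quoted price

    for ratio in (5, 10):  # the 5-10X range cited in the text
        cost = effective_cost_per_gb(FLASH_RAW, ratio)
        print(f"{ratio}X reduction: flash at ${cost:.2f} per usable GB "
              f"vs disk at ${DISK_RAW:.2f}")
    print(f"break-even ratio: {break_even_ratio(FLASH_RAW, DISK_RAW):.1f}X")
```

With these placeholder prices, flash undercuts disk once data reduction exceeds 4X. The comparison assumes the disk array gets no reduction at all, which is the simplification the paragraph above argues for.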
With all-flash storage that is faster, more power- and space-efficient, and easier to manage than disk or HSM 2.0 hybrids of flash and disk, why buy disk?