Why Flash Changes Everything, Part I

You may have noticed the phrase “Flash changes everything” on the Pure Storage home page. Well, not quite everything—in fact, part of the raison d’etre for companies like Pure Storage will be to deliver flash in form factors that make it very easy for end users to adopt it. But flash really does necessitate dramatic changes inside the storage systems that purport to offer its benefits.

Flash is simply a profoundly different medium than mechanical disk. Before the industry shortened its name, we talked about flash memory, with good reason: flash behaves far more like a persistent version of DRAM, that is, as a true random access device with no position dependence or mechanical components.


Our claim is that no matter how flash is packaged for incorporation in the data center – whether as a PCIe server add-on, as a shared appliance, or within an enterprise array – the internal systems software that manages the reading and writing of that flash should be radically different than the software that looks after rotating disk. Here, for your consideration, are the first five of our top ten reasons why flash really does change everything within the storage software managing that flash:

10. Parallelism rules

Disks are serial devices in that you read and write one bit at a time, and parallelization is achieved by pooling disks together. Algorithms for optimizing disk performance, then, strive to balance contiguity (within disks) and parallelization (across disks). Flash memory affords a much higher degree of parallelization within a device – the flash storage equivalent to a sequential disk contains more than 100 dies that can work in parallel. For flash, then, the trick is to maximize parallelization at all levels.
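To make the point about in-device parallelism concrete, here is a minimal Python sketch. It is our own illustration rather than any vendor's code: the function names, the round-robin placement, and the 100-microsecond page read time are all assumptions chosen for easy arithmetic. It spreads a batch of page reads across dies and estimates how long the batch takes when every die works independently.

```python
def stripe_across_dies(num_pages, num_dies):
    """Assign page i to die (i % num_dies): simple round-robin placement."""
    layout = [[] for _ in range(num_dies)]
    for page in range(num_pages):
        layout[page % num_dies].append(page)
    return layout


def batch_time_us(num_pages, num_dies, page_read_us=100):
    """Estimate batch completion time in microseconds.

    Assumes dies serve their own queues independently, so the batch
    finishes when the busiest die finishes. page_read_us is an
    illustrative figure, not a measured one.
    """
    layout = stripe_across_dies(num_pages, num_dies)
    busiest_queue = max(len(queue) for queue in layout)
    return busiest_queue * page_read_us


if __name__ == "__main__":
    for dies in (1, 10, 100):
        print(dies, "dies:", batch_time_us(1000, dies), "us")
```

Under these assumptions the batch time shrinks in proportion to the number of dies kept busy, which is why the software above the flash should aim to keep every die's queue full.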

9. Don’t optimize for locality

For disk, the goal is to read or write a significant amount of data when the head reaches the right spot on the platter, typically 16KB to 128KB. The reason for this is to optimize the productive transfer time (time spent reading or writing) versus the wasted seek time (moving the head) and rotational latency (spinning the platter). Since seek times are typically measured in milliseconds and transfer times in microseconds, performance is optimized by contiguous reads/writes (think thicker stripe sizes), and sophisticated queuing and scheduling to minimize seek and rotational latency. For flash, all these optimizations add unnecessary overhead and complexity, since sequential and random access perform the same.
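The trade-off between productive transfer time and wasted positioning time can be worked through with a little arithmetic. The Python sketch below is an illustration only: the function name and the default figures (4 ms seek, 2 ms rotational latency, 100 MB/s transfer rate) are assumptions picked for round numbers, not measurements of any particular drive.

```python
def productive_fraction(io_kb, seek_ms=4.0, rotation_ms=2.0,
                        transfer_mb_per_s=100.0):
    """Fraction of an I/O's total service time spent moving data.

    All defaults are illustrative assumptions. Setting seek_ms and
    rotation_ms to zero models a device with no positioning cost.
    """
    transfer_ms = io_kb / 1024 / transfer_mb_per_s * 1000
    return transfer_ms / (seek_ms + rotation_ms + transfer_ms)


if __name__ == "__main__":
    for size_kb in (4, 16, 128, 1024):
        disk = productive_fraction(size_kb)
        no_seek = productive_fraction(size_kb, seek_ms=0.0, rotation_ms=0.0)
        print(f"{size_kb:5d} KB  disk: {disk:.3f}  no positioning cost: {no_seek:.3f}")
```

With a positioning cost, larger contiguous transfers spend a larger share of their time doing useful work, which is the whole rationale for locality optimizations on disk. With the positioning cost set to zero, the fraction is 1.0 at every I/O size, so there is nothing left for those optimizations to win back.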

8. No need to manage volume layout and workload isolation

For traditional disk storage, the systems software and the storage administrator generally need to carefully keep track of how logical volumes are mapped to physical disks. The issue, of course, is that a physical disk can only do one thing at a time, and so the way to minimize contention (spikes in latency, drops in throughput) is to isolate workloads as much as possible across different disks. With flash memory, all volumes are fast. After all, do you care to which addresses your application is mapped in DRAM?

7. No costly RAID geometry trade-offs

Of course, flash still requires that additional parity be maintained across devices to ensure data is not lost in the face of hardware failures. In traditional disk RAID, there is an essential trade-off between performance, cost, and reliability. Wide RAID geometries leveraging several spindles and dual-parity are the “safest.” They reduce the chance of data loss with low space overhead, but performance suffers because of the extra parity calculations. Higher-performance schemes involve narrower stripes and more mirroring, but achieve this performance at the cost of more wasted space. Flash’s performance enables the best of both worlds: ultra-wide striping for protection with multiple parity levels at very low overhead, while maintaining the highest levels of performance. And of course, dramatically faster re-build times when there is a failure.
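The space side of this trade-off is easy to quantify. The following Python sketch is a simplified illustration: the function name and the three example geometries (a mirror, a 4+1 stripe, and a 20+2 stripe) are hypothetical choices of ours, and the calculation deliberately ignores spares, metadata, and other real-world overheads.

```python
def parity_overhead(data_devices, parity_devices):
    """Fraction of raw capacity consumed by redundancy in one stripe."""
    return parity_devices / (data_devices + parity_devices)


if __name__ == "__main__":
    # Hypothetical geometries, chosen only to illustrate the trend.
    geometries = {
        "mirror (1+1)": (1, 1),
        "narrow single parity (4+1)": (4, 1),
        "wide dual parity (20+2)": (20, 2),
    }
    for name, (data, parity) in geometries.items():
        print(f"{name}: {parity_overhead(data, parity):.1%} of raw capacity")
```

In this simplified model the wide dual-parity stripe tolerates two device failures while giving up the smallest share of capacity, which is why wide geometries are attractive whenever the performance penalty can be absorbed.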

6. No update in place

Traditional storage solutions often strive to keep data in the same place; then there’s no need to update the map defining the location of that data, and it makes for easier management of concurrent reads and writes. For flash, such a strategy is the worst possible, because a dataset’s natural hot spots would burn out the associated flash as the data is repeatedly rewritten. Since any flash solution will need to move data around for such wear leveling, why not take advantage of this within the storage software stack to more broadly amortize the write workload across the available flash?
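As a rough illustration of the alternative to update in place, here is a toy Python sketch of a remapping layer. It is not how any real flash translation layer or array is implemented: the class name, the one-block-per-write granularity, and the "least-worn free block" policy are simplifying assumptions made for the example.

```python
class RemappingStore:
    """Toy store that never overwrites data in place.

    Each write lands on the least-worn free block, the logical-to-physical
    map is updated, and the previously used block is erased and recycled.
    """

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks
        self.free = set(range(num_blocks))
        self.location = {}  # logical address -> physical block
        self.data = {}      # physical block -> payload

    def write(self, logical_addr, payload):
        # Pick the free block with the fewest erases (ties go to the lowest index).
        target = min(self.free, key=lambda b: (self.erase_counts[b], b))
        self.free.remove(target)
        self.data[target] = payload
        old = self.location.get(logical_addr)
        self.location[logical_addr] = target
        if old is not None:
            # The stale copy is erased and its block returns to the free pool.
            del self.data[old]
            self.erase_counts[old] += 1
            self.free.add(old)

    def read(self, logical_addr):
        return self.data[self.location[logical_addr]]


if __name__ == "__main__":
    store = RemappingStore(num_blocks=8)
    for version in range(80):
        store.write(0, version)  # one very hot logical address
    print("latest value:", store.read(0))
    print("erase counts:", store.erase_counts)
```

Rewriting a single hot logical address 80 times costs 79 erases in total. Updated in place, all 79 would hit the same block; with remapping they are spread almost evenly across all eight blocks, at the price of maintaining the map.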

OK, we’ll stop there for now. With any luck, we’re starting to shed some light on the fact that simply dropping SSDs into a storage solution designed around the idiosyncrasies of mechanical disk cannot deliver the full capabilities of flash. Instead, we believe flash is deserving of the same consideration that disk has enjoyed for decades – that the systems software managing the reading and writing of flash be optimized for flash’s own idiosyncrasies. Check out Part II of this list in a future blog post when we explore the rest of the top ten reasons that flash really does change everything.

About the Author

Scott Dietzen is the CEO of Pure Storage and a three-time successful entrepreneur with WebLogic, Zimbra, and Transarc.

  • Chris Saari

    Dietz, RAM is still an order of magnitude faster than flash, and unless you’ve gone off and changed the major OSes there is a cache of your data in RAM in between the CPU and the mass storage. Several caches if we’re counting L1/L2/L3 and the RAM disk cache. Point? Locality still makes a huge difference, so I think point 9 is misleading.

  • http://mnot.net/ Mark Nottingham

    I read point nine as referring to physical locality, not access locality…

  • Scott Dietzen

    Yes, my point was only that intra-tier locality for random-access media like DRAM or flash is generally irrelevant. This stands in sharp contrast to locality within a sequential-access medium like hard disk, where it is far more important. No disputing that inter-tier locality matters a great deal. One subtlety I did miss is that intra-tier locality does matter if it impacts the efficiency of caching in the faster tiers above; that is, managing to squeeze more “hot” data into a particular page in one tier could improve cache efficiency up the stack.

  • http://deranfangvomende.wordpress.com darkfader

    Hi,

    also about #9:
    This reminds me a lot of an “inner workings” whitepaper on Oracle TimesTen – an in-memory database:
    They came to the same conclusion – that once you put disk seek times out of the game, you’ll find
    a) all your “most important” optimizations become irrelevant or even overhead and
    b) that you’ll have to optimize in totally different regions, because suddenly a code path that never looked like an issue compared to disk seeks has become your biggest “new” issue to optimize.

    Few people have figured out that they should use “noop” scheduling with SSDs on Linux, because the all-important command queueing etc. introduces overhead you’d never have otherwise. Even fewer try an external journal to remove that IO hotspot.

    I don’t know what PureStorage will do differently, but I know we’ll all enjoy a bit better computing when we can stop looking at disk IO latencies :)

  • Steve Leeper

    He’s talking about locality of access on the disk itself. Caching algorithms look at a hit on the disk and then make certain assumptions that data in and around the block being accessed are more likely to be the next candidates to be read by the application. This feeds into pre-fetching algorithms as well. Net-net is that the locality of access discussion is focused more on spinning disk rather than the various levels of cache within a server, or for that matter, the file system buffers maintained by the OS.

    Also, when referring to L1/L2/L3 cache, those are tremendously small layers of cache. L1 is contained directly on the CPU and thus quite small, as the transistor budget dedicated to the L1 cache must not rob the chip of processing power for cores. L2 cache gets bigger but alas, is slower than on-chip L1. L3 gets even slower out in DRAM and also suffers from the challenge of diminishing returns on additional layers of caching and the management/interaction between the layers of that cache.

    So with all due respect, point 9 isn’t really misleading when one is reading it in the proper context which again, is locality of access in terms of the disk itself. 
