- Flash Array
Purity is a storage operating environment built from scratch around the unique benefits and idiosyncrasies of flash. Purity’s core is FlashCare™, which virtualizes the underlying SSDs into a unified pool. On top of FlashCare run the Purity services that provide resiliency, deduplication and compression, and consistent performance of the FlashArray.
Flash is very different from disk: writes are far more expensive than reads, there is no random IO penalty, and data must be erased before it can be rewritten rather than overwritten in place. Purity is optimized for the underlying geometry of the flash, employing an append-only data layout, read/write segregation, real-time QoS monitoring, and load balancing to ensure consistent sub-millisecond performance.
Purity employs a unique RAID scheme, RAID-3D, specifically designed for solid-state. RAID-3D provides dual parity or better protection with minimal space overhead, and actually helps reduce average latency. Purity also delivers active/active high availability in order to rapidly self-heal around any failures in underlying hardware subcomponents.
MLC flash has limited write cycles, but Purity writes to the flash in a manner that ensures long life even when running at maximum write IOPS 24x7. The FlashArray stays within our flash manufacturer's specifications, and we make available a 5-year warranty on all hardware components in the array, including the SSDs.
In a world of evolving connectivity choices, an architecture designed for multi-protocol storage is essential. The FlashArray was designed with complete separation between its protocol support and data storage subsystems, giving it the flexibility to support today's popular protocols, and to be ready for tomorrow's.
One of the reasons that traditional arrays have had such a hard time offering advanced data services (deduplication, compression, fine-grained allocation units or chunks) is that they are limited by their metadata. These architectures require that 100% of the metadata fit in the controller's DRAM for performance. The FlashArray virtualizes its storage down to a 512-byte geometry, meaning that scaling to PBs of user data entails managing trillions of metadata objects, far more than can feasibly be stored in DRAM. Purity employs a scalable, distributed, multi-tiered metadata scheme, where metadata is written broadly across the array and protected in the same manner as user data. Frequently-accessed portions of that metadata are kept in higher-performance data structures, and the most frequently accessed is cached in the controller's DRAM. This rich metadata architecture is arguably Purity's chief asset, and what will ultimately demand the rewrite of traditional storage controller software to get to the solid-state future.
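To make the scale concrete, here is a rough back-of-the-envelope calculation. It assumes the worst case of one metadata object per 512-byte unit, and the 32-byte entry size is purely a hypothetical figure chosen for illustration; neither is a published Purity number.

```python
SECTOR_BYTES = 512  # virtualization granularity stated in the text


def metadata_objects(user_bytes: int, unit_bytes: int = SECTOR_BYTES) -> int:
    """Number of unit-sized chunks needed to map user_bytes, rounded up."""
    return -(-user_bytes // unit_bytes)


PIB = 2 ** 50  # one pebibyte of user data

# Worst case: every 512-byte unit is tracked by its own metadata object.
objects_per_pib = metadata_objects(PIB)  # 2**41, roughly 2.2 trillion

# Hypothetical entry size, assumed only for this illustration.
ASSUMED_ENTRY_BYTES = 32
metadata_bytes = objects_per_pib * ASSUMED_ENTRY_BYTES  # 2**46 bytes = 64 TiB
```

Under these assumptions a single pebibyte of user data would need tens of tebibytes of metadata, which is why holding all of it in controller DRAM is not feasible.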
Purity writes data in a virtualized, append-only data layout. This ensures data is spread across the large pool of flash memory, and all data placement is aligned to the 512-byte data geometry. This results in extremely effective deduplication and compression on a large scale.
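As a minimal sketch of how an append-only layout and 512-byte alignment can work together with deduplication, consider the toy store below. The class name, the use of SHA-256 to detect duplicate chunks, and the in-memory dictionaries are all assumptions made for illustration; the text does not describe how Purity identifies duplicates, and compression is omitted entirely.

```python
import hashlib

CHUNK = 512  # data geometry stated in the text


class AppendOnlyStore:
    """Toy append-only store with chunk-level deduplication (illustrative only)."""

    def __init__(self) -> None:
        self.log = bytearray()  # physical space: only ever appended to
        self.index = {}         # content hash -> offset of the chunk in the log
        self.volume_map = {}    # logical block address -> offset in the log

    def write(self, lba: int, data: bytes) -> None:
        if len(data) % CHUNK:
            raise ValueError("data must be a multiple of 512 bytes")
        for i in range(0, len(data), CHUNK):
            chunk = data[i:i + CHUNK]
            digest = hashlib.sha256(chunk).digest()
            offset = self.index.get(digest)
            if offset is None:
                # New content: append it, never overwrite existing data.
                offset = len(self.log)
                self.log += chunk
                self.index[digest] = offset
            # Overwrites only repoint the logical address.
            self.volume_map[lba + i // CHUNK] = offset

    def read(self, lba: int) -> bytes:
        offset = self.volume_map[lba]
        return bytes(self.log[offset:offset + CHUNK])
```

Writing three chunks of which two are identical consumes only two chunks of physical space. An overwrite leaves the old chunk in place for a background process to reclaim later, which this sketch does not model.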
Meanwhile, the FlashArray is actively moving data in the background to ensure the longevity of both the data and the underlying flash memory. The append-only data layout reduces the time it takes to write because it eliminates the need to arrange related data close together in the physical address space.
Purity is informed by deep knowledge of the geometry of the flash memory and the SSD’s embedded controller chip, understanding page sizes, erase blocks, and how various write strategies impact intra-SSD data movement and write amplification. Purity employs this knowledge to only write data to the flash memory in the ideal manner for that particular SSD – optimizing its performance, minimizing internal data movement, and maximizing lifespan.
Flash memory architectures and the SSDs that contain them change with each major advancement in semiconductor process technology. These changes typically arrive annually, and the resulting diversity of solid-state storage devices over time must be an architectural consideration for any flash-based storage array. Purity’s flash personality layer fingerprints each SSD (vendor, model, firmware, controller), so it can optimize data placement and performance for each SSD’s unique characteristics, as well as mix and match different SSDs in the same system over a typical 3-5 year deployment.
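One way to picture a personality layer is as a lookup from an SSD fingerprint to a set of tuning parameters. The sketch below assumes this shape; the `Fingerprint` fields follow the four attributes named above, but the `Profile` fields, the example device, and all numeric values are hypothetical and do not describe any real SSD or Purity's internal tables.

```python
from dataclasses import dataclass
from typing import NamedTuple


class Fingerprint(NamedTuple):
    vendor: str
    model: str
    firmware: str
    controller: str


@dataclass(frozen=True)
class Profile:
    page_bytes: int         # smallest unit the flash programs at once
    erase_block_bytes: int  # smallest unit the flash erases at once


# Conservative fallback for unrecognized devices (hypothetical values).
DEFAULT_PROFILE = Profile(page_bytes=4096, erase_block_bytes=1 << 20)

# Hypothetical table of known devices.
PROFILES = {
    Fingerprint("VendorA", "X100", "1.0", "ctrl-1"): Profile(8192, 2 << 20),
}


def profile_for(fp: Fingerprint) -> Profile:
    """Return the tuning profile for a device, or the default if unknown."""
    return PROFILES.get(fp, DEFAULT_PROFILE)


def aligned_write_size(fp: Fingerprint, requested: int) -> int:
    """Round a write up to a whole number of that device's erase blocks."""
    block = profile_for(fp).erase_block_bytes
    return -(-requested // block) * block
```

Because the profile is resolved per device, two different SSD models in the same system can each receive writes sized to their own erase blocks.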
The FlashArray actively moves data from one physical location in the solid-state storage to another in order to balance flash memory wear, manage overwrites and deletions, refresh data, and heal around any underlying SSD failures. Purity budgets for this continuous background optimization, reserving sufficient bandwidth for it even in the highest IOPS situations.
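The idea of budgeting for background work can be sketched as a simple bandwidth split per scheduling interval. The function name, the 10 percent reservation, and the rule that idle capacity flows to background work are assumptions for illustration only; the text does not state how Purity sizes or enforces its reservation.

```python
def split_bandwidth(total: int, user_demand: int, background_demand: int,
                    reserved_percent: int = 10) -> tuple[int, int]:
    """Split `total` IO units between user and background work.

    Background work is guaranteed up to `reserved_percent` of the total
    (an assumed figure); user IO gets the rest; any capacity the user
    does not need is handed back to background work.
    """
    reserved = total * reserved_percent // 100
    background = min(background_demand, reserved)
    user = min(user_demand, total - background)
    spare = total - background - user
    background += min(background_demand - background, spare)
    return user, background
```

With `total=1000` and both demands saturated, this gives user IO 900 units and background work its reserved 100. When user demand drops to 200, background work expands to use the idle 800.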
The goal is consistent, sub-millisecond latency across all users and applications. To this end, Purity times every single IO, allowing it to make real-time decisions about how to shape IO traffic accordingly. If, for any reason, an IO request is not returned within an allotted period, the FlashArray schedules reads from other locations and races them to get the fastest and most consistent IO. Purity also deprioritizes internal processes such as data refreshes or garbage collection relative to user IO. These techniques contribute to the consistent sub-millisecond IO that sets the FlashArray apart.
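The racing behavior described above resembles what is often called a hedged read. The following is a minimal sketch of that general pattern using Python's standard thread pool, not Purity's implementation; `hedged_read`, its parameters, and the use of zero-argument callables to stand in for reads from different locations are all hypothetical.

```python
import concurrent.futures as cf
import time


def hedged_read(primary, alternates, deadline_s, pool):
    """Issue `primary`; if it misses the deadline, race it against `alternates`.

    `primary` and each item in `alternates` are zero-argument callables that
    return the requested data. The first read to complete wins.
    """
    pending = {pool.submit(primary)}
    done, pending = cf.wait(pending, timeout=deadline_s)
    if not done:
        # Deadline missed: launch reads from other locations and race them all.
        pending |= {pool.submit(alt) for alt in alternates}
        done, pending = cf.wait(pending, return_when=cf.FIRST_COMPLETED)
    for future in pending:
        future.cancel()  # best effort; reads already running are not interrupted
    return next(iter(done)).result()
```

A caller would pass a shared `concurrent.futures.ThreadPoolExecutor` as `pool`. Error handling is deliberately left out: if the winning read raises, the exception propagates to the caller.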