- Flash Array
Flash is very different from disk: writes are far more expensive than reads, there is no random IO penalty, and data must be erased before it can be rewritten rather than overwritten in place. Purity is optimized for the underlying geometry of the flash, employing an append-only data layout, read/write segregation, real-time QoS monitoring, and load balancing to ensure consistent sub-millisecond performance.
Purity employs a unique RAID scheme, RAID-3D, specifically designed for solid-state storage. RAID-3D provides dual-parity or better protection with minimal space overhead, and actually helps reduce average latency. Purity also delivers active/active high availability in order to rapidly self-heal around any failures in underlying hardware subcomponents.
MLC flash has limited write cycles, but Purity writes to the flash in a manner that ensures long life even when running at maximum write IOPS 24x7. The FlashArray stays within our flash manufacturer's specifications, and we make available a 5-year warranty on all hardware components in the array, including the SSDs.
Traditional arrays have a hard time offering advanced data services (deduplication, compression, fine-grained allocation units or chunks) because they are limited by their metadata. These architectures require that 100% of the metadata fit in the controller's DRAM for performance. The FlashArray virtualizes its storage down to a 512-byte geometry, meaning that scaling to PBs of user data entails managing trillions of metadata objects, far more than can feasibly be stored in DRAM. Purity employs a scalable, adaptive, multi-tiered metadata scheme in which metadata is written broadly across the array and protected in the same manner as user data. Frequently accessed portions of that metadata are kept in higher-performance data structures, and the most frequently accessed portions are cached in the controller's DRAM. The flexible metadata structure has numerous benefits: virtually uncapped scale, variable block size (512B to 32KB) for comprehensive data reduction (inline deduplication and compression), and real-world performance. The variable block size prevents misalignment of blocks in virtualized environments. This rich metadata architecture is arguably Purity's chief asset.
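The tiered lookup described above can be illustrated with a minimal sketch. It is not Purity's implementation: the class name `TieredMetadata`, the LRU eviction policy, and the use of plain dictionaries to stand in for the DRAM cache, the higher-performance structures, and the flash-resident metadata are all assumptions made for illustration.

```python
from collections import OrderedDict


class TieredMetadata:
    """Hypothetical three-tier metadata lookup: DRAM cache, fast index, flash."""

    def __init__(self, dram_capacity, persistent_map):
        self.dram = OrderedDict()         # small LRU cache standing in for controller DRAM
        self.dram_capacity = dram_capacity
        self.fast_index = {}              # stands in for the higher-performance structures
        self.persistent = persistent_map  # stands in for metadata stored across the array
        self.hits = {"dram": 0, "index": 0, "flash": 0}

    def lookup(self, key):
        # Tier 1: the hottest entries live in the DRAM cache.
        if key in self.dram:
            self.dram.move_to_end(key)
            self.hits["dram"] += 1
            return self.dram[key]
        # Tier 2: warm entries live in the fast index.
        if key in self.fast_index:
            self.hits["index"] += 1
            value = self.fast_index[key]
        else:
            # Tier 3: everything else is fetched from the persistent copy.
            self.hits["flash"] += 1
            value = self.persistent[key]
            self.fast_index[key] = value
        self._cache(key, value)
        return value

    def _cache(self, key, value):
        # Promote the entry into DRAM, evicting the least recently used one if full.
        self.dram[key] = value
        self.dram.move_to_end(key)
        if len(self.dram) > self.dram_capacity:
            self.dram.popitem(last=False)
```

The point of the sketch is that the DRAM tier can be much smaller than the full metadata set: an entry evicted from DRAM is still found in a lower tier, only more slowly.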
Purity writes data in a virtualized, append-only data layout. This ensures data is spread across the large pool of flash memory, and all data placement is aligned to the 512-byte data geometry. This results in extremely effective deduplication and compression on a large scale.
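As a rough illustration of how an append-only layout and 512-byte alignment work together with deduplication, consider the sketch below. It is a simplification under stated assumptions, not Purity's data path: `AppendOnlyStore`, its in-memory log, the fixed 512-byte chunking, and the use of SHA-256 as the only duplicate test are all hypothetical choices (a production system would typically also verify matching bytes and would compress the data).

```python
import hashlib

BLOCK = 512  # alignment unit taken from the text


class AppendOnlyStore:
    """Hypothetical append-only log with block-level deduplication."""

    def __init__(self):
        self.log = bytearray()  # data is only ever appended, never overwritten
        self.by_hash = {}       # content hash -> offset of the stored copy
        self.volume_map = {}    # logical block address -> offset in the log

    def write(self, lba, data):
        if len(data) % BLOCK != 0:
            raise ValueError("data must be a multiple of 512 bytes")
        for i in range(0, len(data), BLOCK):
            chunk = bytes(data[i:i + BLOCK])
            digest = hashlib.sha256(chunk).digest()
            offset = self.by_hash.get(digest)
            if offset is None:
                # New content: append it to the end of the log.
                offset = len(self.log)
                self.log += chunk
                self.by_hash[digest] = offset
            # Duplicate or not, the logical address now points at the stored copy.
            self.volume_map[lba + i // BLOCK] = offset

    def read(self, lba):
        offset = self.volume_map[lba]
        return bytes(self.log[offset:offset + BLOCK])
```

Note that an overwrite never touches existing bytes: it appends the new block and repoints the map, leaving the old block to be reclaimed later by background processing.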
The FlashArray actively moves data from one physical location in the solid-state storage to another in order to balance flash memory wear, manage overwrites and deletions, refresh data, and heal around any underlying SSD failures. Purity budgets for this continuous background optimization, reserving sufficient bandwidth for it even in the highest IOPS situations.
Purity is informed by deep knowledge of the geometry of the flash memory and the SSD's embedded controller chip: it understands page sizes, erase blocks, and how various write strategies affect intra-SSD data movement and write amplification. Purity uses this knowledge to write data to the flash memory only in the ideal manner for that particular SSD, optimizing its performance, minimizing internal data movement, and maximizing lifespan.
Flash memory architectures and the SSDs that contain them change with each major advancement in semiconductor process technology, typically an annual event. The resulting diversity of solid-state storage devices over time must be an architectural consideration for any flash-based storage array. Purity's flash personality layer fingerprints each SSD (vendor, model, firmware, controller) so that it can optimize data placement and performance for each SSD's unique characteristics, and so that several kinds of SSD can be mixed and matched in the same system over a typical 3-5 year deployment.
The goal is consistent, sub-millisecond latency across all users and applications. To this end, Purity times every single IO, allowing it to make real-time decisions about how to shape IO traffic accordingly. If an IO request is not returned within an allotted period for any reason, the FlashArray schedules reads from other locations and races them to get the fastest and most consistent IO. Purity also deprioritizes internal processes such as data refreshes and garbage collection relative to user IO. These techniques contribute to the consistent sub-millisecond IO that sets the FlashArray apart.
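The idea of racing reads once an IO exceeds its allotted time can be sketched in a few lines. This is a generic "hedged read" pattern, not Purity's scheduler: the function name `hedged_read`, the use of threads, and the callables passed in as read sources are all assumptions for illustration.

```python
import concurrent.futures as cf


def hedged_read(primary, fallbacks, deadline_s):
    """Return the first result from primary, or from a race once the deadline passes.

    primary and fallbacks are hypothetical zero-argument callables that each
    return the requested data from a different location.
    """
    pool = cf.ThreadPoolExecutor(max_workers=1 + len(fallbacks))
    try:
        pending = {pool.submit(primary)}
        # Give the primary read its allotted time.
        done, pending = cf.wait(pending, timeout=deadline_s)
        if not done:
            # Deadline missed: launch the alternative reads and race them all.
            pending |= {pool.submit(f) for f in fallbacks}
            done, pending = cf.wait(pending, return_when=cf.FIRST_COMPLETED)
        # Simplification: an exception from the winning read propagates to the caller.
        return next(iter(done)).result()
    finally:
        # Do not block on the slower reads; they finish in the background.
        pool.shutdown(wait=False)
```

The design choice worth noting is that the extra reads are issued only after the deadline, so the common fast case costs a single IO while the slow case is bounded by the quickest alternative.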
Active/active high availability enables rapid self-healing from any failures in underlying hardware subcomponents. This allows the FlashArray to support non-disruptive maintenance, expansion, and upgrades of everything, all without performance impact. Flash is poor at simultaneous reads and writes. Purity addresses this issue by micro-scheduling each SSD, segregating reads from writes so that a given SSD handles only one type of I/O at a given time, which maximizes performance. Reads from SSDs locked for write operations are served dynamically by rebuilding data from parity. The FlashArray is actively moving data in the background to ensure the longevity of both the data and the underlying flash memory. The append-only data layout reduces the time it takes to write because it eliminates the need to arrange related data close together in the physical address space.
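Serving a read by rebuilding data from parity, rather than waiting for an SSD that is busy writing, can be illustrated with simple XOR parity. The sketch below uses single parity only, which is weaker than the dual-parity-or-better protection attributed to RAID-3D; the function names and the `busy` set are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def read_block(stripe, index, busy):
    """Read stripe[index], rebuilding it from the other members if its device is busy.

    busy is the set of stripe positions whose devices are currently locked for writes.
    """
    if index not in busy:
        return stripe[index]
    if any(i in busy for i in range(len(stripe)) if i != index):
        # Single parity can only reconstruct one unavailable member.
        raise RuntimeError("too many busy devices for single-parity rebuild")
    others = [blk for i, blk in enumerate(stripe) if i != index]
    return xor_blocks(others)
```

Because the XOR of all members of a stripe is zero, any one block equals the XOR of the rest, so the read completes without touching the device that is locked for writing.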