00:00
Hi everyone, thank you so much for coming this afternoon. I'm Justin Heindell, and I lead product management and business development for our hyperscale line of business. I'm gonna talk a little bit more about our collaboration with Meta that Charlie referenced this morning in his keynote.
00:20
My colleague Pete is gonna go into more detail on the core technology behind that engagement, which is our DirectFlash technology, and then Justin, who will be here in a minute, is going to talk more about what that means for the enterprise. OK. So between Charlie's keynote and this slide, we're not exactly burying the lede, right?
00:48
In December we announced a win with a top-four hyperscaler on our earnings call, and then in February Meta published an engineering blog — see there on the right, or I guess your left — which shared some details about our engagement. And then in May we jointly presented at Meta's @Scale conference, sharing more details
01:17
of our collaboration. So through these venues, Meta has shared some information publicly about the challenges they're seeking to solve, namely performance and power. On the performance front, hard drives are getting more dense,
01:36
albeit slowly, but they're not delivering more performance. This is forcing hyperscalers to choose between two not-so-great options: either they deploy more TLC SSDs, which are more costly and less space-efficient, or they significantly over-provision their hard disk drive resources to meet those performance needs.
02:04
Now neither of those options provides significant power efficiency. In fact, either one drastically increases the amount of power they need to deliver on their performance needs for their warmer data. This is in contrast to flash-based media, and especially QLC-based media, which delivers on all of those dimensions: performance, power,
02:24
and density. Also, I'd be remiss not to note that green colour there on the screen. That in particular is OCP green — Open Compute Project green — and while we would prefer there were a lot more orange, I can attest to the fact that those are actually DirectFlash Modules.
02:50
So let's talk a little bit more about power. Industry analysts project that hyperscale HDD demand will grow to a whopping almost 2 zettabytes by 2028. That's zettabytes with a Z — I haven't had to say that word very frequently. It's pretty amazing, and in contrast, data centre power is really failing to match
03:13
that really steep growth curve. So this is a picture of Three Mile Island. Some of you might recognise it. I don't, because I was only two when the nuclear accident happened at this facility in 1979.
03:34
And this picture is really a strong indication of just how challenging the power availability problem is for hyperscalers. There have been numerous recent announcements of hyperscale nuclear power projects, right? And they're not just after cheap power — they're after power at any cost — so it's really
03:56
an indication of just how severe the power scarcity is. And safety concerns aside, a big problem is that these power projects won't be online until the latter half of this decade. In contrast, of course, our efficient QLC solution is available today and can help bend that power consumption curve.
04:20
So not only does our efficiency drive really significant power efficiency gains, but we can also impact space, as well as e-waste, at similar scales. So let's talk a little bit about how this all fits into a hyperscale storage system. If we were to double-click on the software-defined cloud storage layer that Charlie
04:47
described this morning, we'd find that hyperscalers are disaggregated. They don't buy enterprise appliances, for the most part. They have separate layers for protocol access, for metadata access, for the control plane, as well as for the data layer. And within that data layer there's typically a large chunk store backed by many, many exabytes of hard disk drives, and traditionally,
05:15
of course, as I referred to earlier, those hard disk drives have been the most efficient way to deploy warm and cold storage. So a big initiative within our hyperscale group has been adapting our technology in a way that doesn't require these hyperscalers to change this topology. Rather, we've refactored elements of Purity as well as our DirectFlash Modules to
05:46
fit right into their existing infrastructure. So remember those zettabytes I talked about, with the capital Z — serving those markets can require many, many bits of flash storage. In response to that, over the past several months, Pure has announced three strategic partnerships specifically for the hyperscale space with key NAND flash partners:
06:14
Kioxia, Micron, and most recently SK hynix. This enables us to deliver not only a platform with a common experience to our hyperscalers, but one backed by a very robust supply chain. So I'll hand it to Pete to talk a little bit more about the solution we built. Thank you, Justin. I'm gonna use this. So if we go one layer
06:44
deeper, what are we doing to integrate with the hyperscalers? As Justin mentioned, this is a refactoring of some of the same elements that we've developed for our enterprise fleet — namely the DirectFlash Modules, which are now 99% of the bits that we ship, and portions of Purity — and they work together in this architecture. I'll go into a bit later
07:07
how they work together. But we've bundled these together in a way that we can easily integrate with hyperscalers. One key attribute of this architecture is that it's gonna enable the whole stack. You can't necessarily solve all of the storage problems with one exact SKU or module, but the architecture, with different modules of
07:33
different capacities and different types of NAND, can cover everything from the highest performance tiers — we're currently chasing the AI and gen-AI applications that a lot of the world is chasing and a lot of that demand is coming from — but over time we'll see that moving down through the traditional warm hard drive tier, and in the future even an archive tier, because this architecture is very
07:59
flexible that way, very efficient, and it enables coverage of all of that space. Now when I say coverage, really it's the high density that covers the deepest tier and the highest performance that covers the very top, and they kind of go together. We use a term we call performance density, which I'm sure you're all familiar with: in any
08:23
array, sometimes you want to use all the performance of that array but you don't need a huge capacity; sometimes you don't need a lot of performance, but you do need a lot of capacity. That ratio is really what I mean when I say we're covering all those different applications. Now reliability is something we'll also go into later — how we get it and why it matters — but
08:43
let's just say at this point that it's critical: if you want to scale into this zettabyte world, you really need to be reliable, and it's a key factor in how hyperscalers work. And then efficiency — I'm gonna use that term over and over. It takes many forms, but it's one of the central themes of this architecture and how we're really bringing a lot of value into these environments.
09:05
Now when we compare to SSDs, we can just beat SSDs on cost. We can provide more bits at the same cost or even lower cost, and that's a big advantage when we compete where folks are already aware that flash is the way to go. With hard drives, it's still true that the acquisition cost of a hard drive is lower on a per-terabyte basis, but when you talk about the environment and the inefficiencies
09:33
that come with it — before you even talk about performance, just density, power, space, and those factors — you can definitely come out ahead with flash. The minute you talk about performance at all, there's no doubt: you don't really even talk about winning on a per-terabyte basis, because if you want performance in a hard drive world, you generally have to short-stroke
09:58
or over-provision that capacity, and now you're buying multiples of the capacity — flash just becomes cheaper. So that's not new, right? We've been doing this for over a decade, always competing against hard drives, and I think of that as us competing with the highest-performing tier of hard drives. The 15K hard drives that we used to use are gone,
10:21
right? We've driven them out of existence. Now we're into the last tier, but we still compete with the most performance-sensitive use cases. We're marching our way down, we're into that last tier, and the end is in sight. I'm sure you've heard a lot of our material about how we'll replace all the world's hard drives, and that's definitely happening.
10:43
It's key, when we want to integrate, that we can integrate easily, right? We don't wanna go and change somebody else's interface in order to work with us. This is a unique technology, but we package it in a way that uses standard interfaces and is easy to integrate, and that includes the observability and manageability platforms. At that level, they look like NVMe SSDs.
11:10
They can use all the same telemetry, all the same fields, monitoring, and that sort of thing. And Justin talked about our supply chain, which is very critical: different types of NAND can go into this system, and to the customer it all looks like basically the same thing, maybe a slight variation here or there in the characteristics,
11:30
but there's no change in the architecture or the integration, and that makes it pretty easy to integrate as well. OK, so how did we get here? It was over a decade ago that we started this project, and like I said, it has now taken over almost all of the enterprise bits that we
11:49
ship. When we set out to solve this problem, we went back to first principles and asked: what are the core attributes of basically any storage that we wanna develop? So these are the basics — we could probably think of more, and we can go into other details,
12:08
but these are really the core, principal attributes of the technology that we want to apply. I'm gonna talk about how it works next, but at this point I want to just go through these points and clarify how important they are. So density really drives a lot of everything else we're talking about. You can
12:32
probably think back to how dense your storage was 10 years ago — we're using different exponents on the capacity now; zettabytes is the new exciting one — but density clearly just marches forward, and the further we can push density, the more it creates all sorts of other efficiencies. It's obvious when you look at how much storage you can fit in a small amount of space: in a single drive, in a single node, in a rack, or
12:56
even in one of these giant data centres — and clearly the need for that total capacity is blowing up. Performance is the next thing you think of, and again, think about how fast your storage was 10 years ago. You need to continually improve the speed in order to keep up with that density or you're gonna start to strand capacity, and that's really
13:19
the Achilles heel of the modern hard drive: it can incrementally get larger, but it doesn't go any faster. So again, if performance matters at all in your application, you're just gonna have to start over-provisioning; you can't take advantage of the largest drives, and you can't stay on the cutting edge for cost.
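To make that over-provisioning point concrete, here's a rough sizing sketch. The workload and drive figures below are invented for illustration, not numbers from the talk; the point is only that once IOPS rather than capacity sets the drive count, you end up buying a multiple of the capacity you actually need.

```python
# Illustrative only: why a performance-bound HDD tier strands capacity.
# All figures are assumptions, not vendor specs or figures from this talk.

required_iops = 50_000           # assumed workload demand
required_capacity_tb = 2_000     # assumed usable capacity needed (2 PB)

hdd_iops = 150                   # a nearline HDD does on the order of 100-200 IOPS
hdd_capacity_tb = 24             # assumed large nearline HDD

drives_for_capacity = -(-required_capacity_tb // hdd_capacity_tb)   # ceiling division
drives_for_performance = -(-required_iops // hdd_iops)

drives_needed = max(drives_for_capacity, drives_for_performance)
overprovision_factor = drives_needed / drives_for_capacity

print(f"Drives needed for capacity alone:     {drives_for_capacity}")
print(f"Drives needed to hit the IOPS target: {drives_for_performance}")
print(f"Over-provisioning factor:             {overprovision_factor:.1f}x")
```

With these assumed numbers the IOPS target, not the capacity, dictates roughly four times as many spindles, which is the "buying multiples of the capacity" effect Pete describes.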
13:38
That's kind of a losing battle for the hard drive world. Reliability — I'm gonna hit this one several times — but if you wanna scale, reliability is key. If you're in the enterprise, you think of reliability as key because you think of it more at a node level: this node holds some of my most precious data. It can't go down.
14:00
Maybe we apply replication and we can enhance that a bit, but that has to be bulletproof, and that means every component in the system needs to be bulletproof. In hyperscale it's a little different, because you can let some of the components fail. We're still dealing with real-world electronic systems; some of it's gonna fail, so you
14:19
provision for that. But if you want to get to very, very large scale, it can't happen very often. So the notion of 'let it fail' is true — you design around the fact that failures will happen — but they certainly don't want it happening all the time, because that starts to drag in other inefficiencies: rebuilds and so forth. And then energy efficiency is the other one
14:40
that we'll hit over and over. I do think nuclear is probably a good solution to some of these problems — that's probably controversial, but certainly the energy demand is there and that's one solution. Fundamentally, though, the less power you can use, the better off you are. That's always true — saving a watt here or there is always gonna matter —
15:01
but often it means that you can't deploy what you want to deploy because you don't have enough power. In a lot of cases we find that somebody's got a fixed power budget: they're either retrofitting some existing infrastructure and they want to fit as much as possible into it, or they have a new deployment and they still
15:20
want to fit as much as possible. So energy efficiency is always going to be a big deal. And of course we want to make it easy to integrate, and again that supply chain is gonna be critical for us as we scale. OK, so just a little bit about how it works and how it relates back to these basic characteristics. The two core architectural principles
15:41
that we set out to use to deliver these attributes are, number one, the DirectFlash architecture. The 'direct' part comes from physical addressing, and what that means is that software has a physical view of where data is placed within the modules — physical meaning which module, down to which die, which block, which page, all the way down through what we call the flash geometry — so
16:10
that we know precisely where every bit of data is placed. So that's direct access, or physical addressing. The other core principle is what's commonly known as a host-based FTL. That's a mouthful, but 'host-based' simply means that work that used to run on an individual SSD in the traditional architecture is now run on the host — on the server side — in software.
16:36
That gives us an extraordinary amount of flexibility, it helps us improve quality in a few different ways that I'll touch on, and it helps us move quickly as well. We can adapt very quickly, we can change interfaces, and it's much easier to update software than it is to do firmware upgrades. FTL was the other part of that mouthful,
16:57
and that means flash translation layer. That's one of the core jobs a traditional SSD does: to appear and behave, at least logically, like a hard drive. That means you get a linear address space that's contiguous, and you can address every block in there. But that's not how flash works underneath, right?
17:18
Flash has dies and blocks and pages, and you can't use them all simultaneously; they're not symmetric in terms of performance attributes, and you can't even use them randomly. You can't, for example, write to a flash block randomly — you have to write through it sequentially. So even though a traditional SSD is attempting to behave like a hard drive, it's not underneath, and bridging that
17:44
gap is what the flash translation layer does: it maps from this apparently linear space into this complicated, messy physical space. It creates inefficiency trying to overcome that, it creates a lot of extra work, it's hard to manage, and it's actually one of the biggest sources of failures in drives. I'll get back to that.
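As a mental model of what that translation involves — a toy sketch added for illustration, not Pure's implementation or any vendor's firmware — the host sees a flat logical address space, but flash pages can only be programmed in order within a block, so every overwrite lands on a new physical page and leaves a stale one behind for garbage collection:

```python
# Toy flash translation layer: maps a flat logical address space onto
# (die, block, page) locations. Purely illustrative; real FTLs also handle
# wear levelling, bad blocks, power-loss safety, and much more.

PAGES_PER_BLOCK = 4
BLOCKS_PER_DIE = 2
DIES = 2

class ToyFTL:
    def __init__(self):
        self.l2p = {}       # logical page -> (die, block, page)
        self.cursor = 0     # next free physical page, programmed strictly in order
        self.stale = set()  # physical locations whose data was overwritten

    def _next_location(self):
        # Flash constraint: pages inside a block must be programmed sequentially,
        # so physical pages are simply handed out in order.
        die, rest = divmod(self.cursor, BLOCKS_PER_DIE * PAGES_PER_BLOCK)
        block, page = divmod(rest, PAGES_PER_BLOCK)
        if die >= DIES:
            raise RuntimeError("out of space: garbage collection would be needed here")
        self.cursor += 1
        return (die, block, page)

    def write(self, logical_page):
        if logical_page in self.l2p:
            self.stale.add(self.l2p[logical_page])  # old copy becomes garbage
        self.l2p[logical_page] = self._next_location()

    def read(self, logical_page):
        return self.l2p.get(logical_page)

ftl = ToyFTL()
for lp in (0, 1, 0, 2):                 # overwriting logical page 0 strands a physical page
    ftl.write(lp)
print(ftl.read(0), len(ftl.stale))      # (0, 0, 2) and 1 stale page awaiting GC
```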
18:04
That's why it helps on reliability. OK, so how does this architecture, and the attributes that are unique to it, drive the physical and logical attributes we want to achieve in the drives? To get density: by removing that FTL from the drive and moving it into host software, you
18:28
remove a lot of other stuff with it, physical and logical. To maintain what they call a logical-to-physical map — the map that bridges the gap between the linear space and the flash space, so an address in logical space maps to a location in physical space — you have to hold that map somewhere. That map is generally held in DRAM, because
18:53
you wanna be able to access it quickly, right? A traditional drive has a ratio of about 1,000 to 1 between flash and the DRAM required to hold that map. So when drives were maybe up to 1 terabyte, that meant you needed a gigabyte of DRAM — not a big deal. When drives are 300 terabytes, to hold a map
19:18
that's 300 gigabytes — bigger than the biggest DIMMs you're probably using anywhere in your environment. They're not cheap, they burn a lot of power, and they're a similar size to the drives we're building, so that doesn't work, right? We use more of a ratio of about a million to 1. We still use DRAM in the system,
19:40
but we don't use it for that map, so instead of having this big, expensive, power-hungry pile of DRAM in the system, we can use that space to put in more flash. That drives density, efficiency, and a lot of these other attributes, and it's one of the main ways we get to higher density. Performance actually follows a similar pattern. The work required to manage
20:06
that map, and then to do garbage collection — which is really the process of maintaining the map when something gets overwritten: you have to go and move things around and rewrite — sucks up a huge amount of resources within an SSD. So I'm not gonna hide anything: we still need to do garbage collection,
20:26
but when it's done in software, and when it's done across a pool of drives, it becomes much more efficient. That work still has to get done; in our case we do it in software running on the host. We do still have some of that map to maintain, but we can maintain it at a much coarser granularity.
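To put rough numbers on that map, here's a quick back-of-the-envelope sketch. The 4 KB and 4 MB indirection units and the 4-byte entry size are illustrative assumptions, not Pure's actual parameters, but they line up with the roughly 1,000-to-1 and million-to-1 flash-to-DRAM ratios described above.

```python
# Rough logical-to-physical map sizing for a 300 TB module at two different
# indirection-unit (IU) granularities. All parameters are illustrative assumptions.

TB = 10**12

def map_size_bytes(capacity_bytes, indirection_unit_bytes, bytes_per_entry=4):
    entries = capacity_bytes // indirection_unit_bytes
    return entries * bytes_per_entry

capacity = 300 * TB

fine   = map_size_bytes(capacity, 4 * 1024)         # 4 KB IU, classic drive-side map
coarse = map_size_bytes(capacity, 4 * 1024 * 1024)  # 4 MB IU, much coarser host-side map

print(f"4 KB IU map: ~{fine / 10**9:.0f} GB of DRAM")    # roughly 300 GB, ~1,000:1
print(f"4 MB IU map: ~{coarse / 10**6:.0f} MB of DRAM")  # roughly 300 MB, ~1,000,000:1
```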
20:46
It's actually adjustable, so you can choose, even on the fly, how much granularity you want in managing the lower-level flash. If you want to maintain a large map and have fine granularity, you can do that with host resources; but if you want — the term for this is the indirection unit — if you use a larger indirection unit, you can actually reduce the size of that
21:11
map to a pretty trivial amount, and that's what we're doing in a lot of these hyperscale environments. In that case the map still exists, but the amount of resources used has gone down by a factor of 100 or more. The result of removing all of that out of the drive is that the drive itself is very predictable — actually perfectly predictable — so the software knows, across a
21:38
whole pool of drives, all in one node, exactly what operations are outstanding, including user IOs and the garbage-collection IOs that still need to get done, and now we can coordinate those across the system. For example, if we want to read some data the user demanded, and that read could fall behind a write that needs to happen for garbage
22:03
collection — why don't we put that write off and serve the user read first, give good low latency, and get the garbage-collection write done whenever, because it's not performance-critical? It needs to get done eventually, but we can schedule around it. With a traditional SSD you're at the whims of the drive, and you have
22:22
maybe 24 drives in a system all doing their own thing. With operations uncoordinated across those drives, you're gonna get the long-tail problem — you're gonna get the worst result of everything in the system. We can get the opposite: the best result of everything in the
22:40
system, by precisely coordinating what happens and when.
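A cartoon of that coordination — my own sketch, not Purity's actual scheduler — is simply a priority queue in which latency-sensitive user reads always dispatch ahead of deferrable garbage-collection writes:

```python
# Toy IO scheduler: user reads are latency-critical, garbage-collection writes
# are deferrable, so dispatch reads first. Illustrative only; a real system
# also balances per-die queues, write amplification, fairness, and starvation.
import heapq
from itertools import count

PRIORITY = {"user_read": 0, "user_write": 1, "gc_write": 2}  # lower = sooner
_seq = count()  # preserves FIFO order within a priority class

def submit(queue, kind, target):
    heapq.heappush(queue, (PRIORITY[kind], next(_seq), kind, target))

queue = []
submit(queue, "gc_write", "die-3")    # internal housekeeping, not latency-critical
submit(queue, "user_read", "die-3")   # a user is waiting on this one
submit(queue, "user_write", "die-1")
submit(queue, "user_read", "die-0")

while queue:
    _, _, kind, target = heapq.heappop(queue)
    print(f"dispatch {kind:10s} -> {target}")
# The user reads dispatch first and the garbage-collection write runs last:
# it has to happen eventually, just not right now.
```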
23:03
OK, reliability. We talked about how important this is, but how does this architecture provide reliability? It's easy to think, well, we just take care in our factories, we have good quality, and we fix our bugs. That's one attribute of it, but a lot of people can get good at that; it's not the main reason this is such a big difference. The main reason goes back to the simplicity we've created in the drive by removing a lot of the complexity that runs it. Remember, an SSD is really a small embedded system. It's got a bunch of cores.
23:23
It's got a lot of work to do, but it has limited resources. SSDs are designed to work around NAND failures, do their garbage collection, do all these things — but when one is doing all of those things, and there's heavy demand from the system, and then you get maybe a compound failure, that's when they fall over.
23:44
So again, we've removed a lot of that work from the drive, period, and we've removed that failure mode from the drives, and we see a huge increase in reliability. This manifests as easier operations, because you're not out there constantly replacing drives, and as efficiency, because we don't have to add as much redundancy or parity in the system to manage those failures. It really creates a flywheel of benefits that keeps feeding back into
24:12
these other attributes. And then efficiency — the way we draw the picture here looks like the energy efficiency we talked about, but really, performance comes from efficiency. Being efficient in that scheduling and in the IOs we're doing is a form of efficiency that leads to high performance.
24:33
When we're efficient in the use of space in the module, it leads to density. When we eliminate some of this work, or do it more efficiently, it leads to power savings. So that's why efficiency is really the core theme here that gives us everything else we're chasing. OK, density is a fun one, because we're the biggest in the world and we like to brag about
24:55
it. It's actually been true for a very long time, and I was gonna draw this line back to when we crossed over to be larger than hard drives, because there was a time when SSDs used to be really small, right? I went back to when we introduced this technology into production, and we've always been larger than the hard drives that were available
25:15
at the time, and certainly larger than competing SSDs, and we've maintained that. Then a few years ago we were the first to introduce QLC technology into the enterprise. Hyperscalers are generally later adopters of QLC than the enterprise, so I'd say we were probably the first in the world to really introduce it at significant scale.
25:40
And you can see a little inflexion in the curve there. That was 5 years ago, and we were already making 49 terabyte drives — at that time probably double the size of the leading-edge hard drives in the world, if you could manage the performance density problem and even use those. But you can see how flat that hard drive curve is.
26:02
It's pretty linear. My mental model is that they increase about 2 terabytes per year. And it's fascinating — if you've looked at this technology, it's one of the most amazing achievements of humankind. But that doesn't mean it's easy, and it doesn't mean it's really scaling the way a
26:19
semiconductor technology does. It takes significant leaps of physics and engineering and manufacturing to make these incremental gains. So it's a losing battle for the hard drive world: even as amazing as the technology is, it's not producing the gains that semiconductors and flash
26:40
are. But traditional SSD vendors — and you can see there's an inflexion in that curve — have seen what we're doing, and they're realising there's a lot of value. I think AI has accelerated this demand for very high-capacity drives and high density, and they're chasing the other attributes that we provide, the ones we've been going over here,
27:02
but they don't have the same fundamental architecture, so they're limited in what they can do. So we're putting the pedal to the metal here — we've really had our own inflexion point. We've had 75 terabyte drives in production for a couple of years, we currently have 150 terabyte drives in production, and I think we talked about on
27:25
the main stage today that we have 300 terabyte modules coming this year — and we're not gonna stop. I look at the roadmap out another 5-6 years past that, and it keeps looking pretty aggressive. And so then we can keep doubling down on all these other attributes that are so
27:43
important to scaling. It's fun. OK, this one's an obvious eye chart, but I love it. I've been using this for a few years now, and it's just this amazing result that highlights the performance advantage. So if you go all the way back
28:06
to the early days of FlashArray, before we had DirectFlash: we were always good at managing SSDs. Even though all the SSDs in the system could go off and do garbage collection whenever they wanted, we would really thoroughly characterise what they do under every workload and every type of situation to find out:
28:26
is there a limit to how much you can write to this drive at a given time before it starts to go into some other mode? Can you control when it does garbage collection by managing the fullness? When can you get good read performance out of it? For every drive we used, we had to do this super extensive calibration and then programme
28:45
into the system how to manage it — basically to avoid it getting into these bad behaviour zones — and we got pretty good at it. So that green shows that, versus what I just call naive use — you pop a bunch of SSDs in the system and hope for the best — we were 3 to 4 times better than that naive use of SSDs.
29:07
What the chart shows as a proxy for performance is write amplification. Write amp is really defined as the amount of writes the media is doing over the amount of writes the user did. Because of all the managing of the system — garbage collection, parity, all these other attributes of the system — you generally have to do extra writes just to
29:32
maintain the data and maintain the array. So a high write amp means the system is doing all this extra work and the user doesn't get the benefit of it — you don't get the benefit of the array or the system that you bought — so we want write amp to be as low as possible. That means peak performance is gonna be as high as possible.
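For reference, write amplification is just that ratio. Here's a tiny sketch with invented byte counts — only the definition itself comes from the talk:

```python
# Write amplification = bytes physically written to the NAND
#                       ------------------------------------
#                       bytes the host/user asked to write
# Values above 1 mean media writes are spent on housekeeping; values below 1
# are possible when data reduction shrinks data before it ever lands on flash.

def write_amplification(nand_bytes_written, host_bytes_written):
    return nand_bytes_written / host_bytes_written

print(write_amplification(nand_bytes_written=400, host_bytes_written=100))  # 4.0 (naive SSD use)
print(write_amplification(nand_bytes_written=70,  host_bytes_written=100))  # 0.7 (with data reduction)
```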
29:55
Latency distributions are as good as possible too, so it's a good proxy for the overall performance of the system. Now, when we introduced DirectFlash, we took that ratio down another 3x. It's kind of hard to see here — it's a log scale — but you can see that the majority of the red dots, which are the DirectFlash systems, are actually below 1, and that means the system is generally writing less data than the
30:21
user wrote. A large component of that is outside the DirectFlash architecture and more attributable to the Purity architecture: we do data reduction, and that really cuts down on the overall amount of writes, and the result is great. The other cool thing here is that these slant down to the right, so as write throughput increases we get more and more
30:45
benefit. That's a little counterintuitive, and it's generally because, when there are a lot of writes in flight, inline data reduction and technologies like that actually see an increased benefit from data that got written to the system and then overwritten again before it could ever be written down to flash at all. OK, last little bit on reliability.
31:09
Again, it's super critical in terms of scaling. What I put here — and you can find some good public references on the reliability of these components — is that an SSD's annual failure rate is typically around 1%. You'll find some models from some vendors that are better. You'll find some models from some vendors,
31:29
often the same vendors, that vary quite a bit — they can be better or worse — but they average around 1%, or a little higher. Again, we're good at managing SSDs within Purity, so along with the performance benefits we were able to increase the durability, or the reliability, a bit. And then with our telemetry — because we have
31:50
a lot of systems phoning home; about 80% of our fleet phones home every 30 seconds — we get this super rich data set and we're able to improve that a little bit more. But with DirectFlash we were able to make a big step change. I put about 0.2% here because that's what we measure just based on returns, but a lot of those returns are for whatever reason, and when we get the material back we study it and find the real
32:17
rate of failure is probably about half that — so our true rate of failure is probably more like 0.1% on an annual basis. And we can continuously improve that from the real-time statistics and analytics we get from the fleet: we monitor, process, and run our analytics continuously, and then we feed the insights back, either in real time to fix situations or into the product, so that we can improve future generations of the product.
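To see why a few tenths of a percent matter at scale, here's a naive expected-replacements calculation. The 50,000-module fleet size is the figure mentioned later in this session; applying the failure rates this simply is my own illustration:

```python
# Expected drive replacements per year = fleet size x annual failure rate (AFR).
# A naive expectation, ignoring correlated failures, infant mortality, etc.

def expected_replacements(fleet_size, annual_failure_rate):
    return fleet_size * annual_failure_rate

fleet = 50_000   # single-data-centre DirectFlash population cited later in the talk
for label, afr in [("typical SSD (~1%)", 0.01),
                   ("DirectFlash, measured returns (~0.2%)", 0.002),
                   ("DirectFlash, true rate (~0.1%)", 0.001)]:
    print(f"{label:40s} -> ~{expected_replacements(fleet, afr):.0f} replacements/year")
```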
32:41
OK, so now I'm gonna hand it to Justin to help us understand the difference between all these needs at hyperscale and how that affects the enterprise. Thanks, Pete. Everyone hear me OK?
33:00
Well, I could listen to Pete talk for hours, and I have. Justin did a good job of talking about the hyperscale problem, and Pete talked about how we approached solving that problem so that we would be in a position to actually solve hyperscaler problems,
33:22
but how many people in this room have data estates that are zettabytes? No — no hands should go up, otherwise you don't know your metric prefixes. How many people here have tens of exabytes of data? All right, there are probably some people in here with an exabyte. So you may think to yourself: why do these scale things matter for me?
33:44
Well, it turns out that the things enterprises care about are really just different manifestations of the same characteristics that Pete talked about. We were trying to understand how to build systems that work better for us, that are easier for us, and that then provide a better solution to our customers. But what does a customer care about?
34:05
In the enterprise, customers care about reliability. You don't want to have any downtime in your mission-critical systems, and you end up building all sorts of complexity around that in order to make sure of it. So a more reliable system means the other parts of your infrastructure can be simpler.
34:29
If you have a super reliable system, maybe you don't need two of them backing each other up, because the math says this one system by itself is reliable enough. And that's important because, again, it drives efficiencies, especially around cost. Everyone would rather have the fastest thing in their environment; the only reason we don't is because that
34:51
tends to cost more. So enterprises care not just about efficiency but also about cost, and even the second-order costs — things like total cost of ownership. Consistent performance is super important for enterprises, because you often need to design around a fixed set of infrastructure.
35:14
I may have to go buy this storage array. This storage array needs to last me this many years, and it's going to perform this well, and I need it to perform this well because that's how I'm going to size the kinds of workloads I'm gonna put on it. So consistent performance is really important, and that's where things like the tail
35:32
latencies that Pete talked about are really the kinds of things that make it difficult to adopt the most efficient technologies like QLC. It's why we were so early to market with QLC, and why so many of the other vendors are still laggards in that respect. And enterprises don't have unlimited resources. They don't have the resources of a hyperscaler,
35:56
so they need to do as much as they can with as little as they can, and they need that to be flexible so it can meet all sorts of different kinds of workloads. And again, limited resources means I don't have an army of people to go and manage this, so I need it to be simple, because I need to be efficient with my people.
36:18
So these are all the same problems that a hyperscaler has. They're actually just different manifestations, in different domains, of the same underlying business needs. They are at a different scale, right? That's why it's called hyperscale. But all of the things that Pete talked about, all the things that Justin talked about,
36:39
are applicable in any one of your organisations. And the really cool thing is that you all are actually in a better, more advantageous position than a hyperscaler. Because a hyperscaler has all of this complex, bespoke, specialised infrastructure. They have to integrate our technology into that
37:05
stack in a very specific way — and we're working to make that as easy and as seamless as possible — but because of their scale, they don't have the luxury of essentially going out and buying something in a box that provides them all those benefits. And you all do. So even though most folks in here don't have
37:28
even a fraction of the capacity that a hyperscaler has, all of the advantages that they are chasing — the ones that DirectFlash and Purity bring them — are already yours. For any customer in here who's already a Pure customer, you already have all of these advantages in your data centre.
37:45
So you actually have better technology than some of the hyperscalers do, and it's all put together in a completely simple-to-use appliance. And the great thing, for everyone in here who may have a Pure system already, is that all of the engineering work being done by Pete and the amazing team behind our hyperscale line of business is going to provide benefits across all of Pure's customers.
38:15
Because, famously, I think Google said scale breaks everything. We're finding out more and more about NAND as we deploy at these even larger scales, and that means the efficiency, reliability, and performance improvements we derive in these exotic hyperscale environments are all still part of Purity.
38:42
It's just a very different implementation of Purity in a different context, but all of that technology is going to bring itself into the enterprise as well. And it's going to drive significantly more reliability, because at hyperscale the difference between 0.2% and 0.1% failure rates is huge, so all of this additional telemetry, like Pete talked about, is
39:07
gonna let us drive even better reliability, even more predictability. There was a statistic that there are 50,000 DirectFlash Modules right now in one data centre, and the annual failure rate of that specific population is less than 0.1% — I think you said 0.08%. So in a well-managed, well-conditioned environment,
39:36
we know that DirectFlash is more reliable than anything else, and that results in benefits not just at hyperscale but also at enterprise scale, because it means you can get more usable capacity, more effective capacity, out of the same infrastructure. And lastly, the relationships: we've always had excellent relationships with our NAND manufacturing
39:59
partners, but this level of engagement has taken that to an entirely different level. We are becoming one of the most important partners to all of these companies going forward, and that's giving us even more insight, more capability, more partnership, and in some cases more leverage to build better systems.
40:25
And so, like the slide says here, I think this is a rising tide that really is going to lift all boats. So what does this impact mean for folks in this room, even at a smaller scale, where maybe I'm not talking about exabytes but about petabytes,
40:50
tens of petabytes, or even just hundreds of terabytes? Using 80% less space and power matters. Using 90% less — five times less rack space than the hard disk-based systems — matters. And
41:11
reducing carbon emissions by 85% matters. But even if you didn't care about those other things, the total cost of ownership, when you look out over the longer time horizons that flash makes possible — you really do start to see those savings even at the smaller scales. Because the reason
41:35
that people have bought storage arrays every 5 years, or bought for 3 years and then extended for years 4 and 5, is because that's when hard drives start failing en masse. The reason that Evergreen is even possible for us as an architecture is because flash is fundamentally different from disk.
41:53
So that means when you're looking at total cost of ownership, don't look at it on the horizon of a disk-based system or even an SSD-based system, because the way we build systems fundamentally changes that calculus to span an even longer time horizon. Once you take that into account, it can completely flip the cost equation.
42:17
So this really cool, amazing hyperscale stuff that I'm sure all of you came here to learn about — it's super cool, and I love it. But what's funny is that it's actually stuff that you all already have. Anyone in here who's a Pure customer already has DirectFlash technology,
42:43
and I think that's really amazing: you can go back to your organisations and say, yeah, the stuff in my data centre is the stuff the hyperscalers wish they had, right? The stuff they're working on integrating now. Their problems are at a totally different scale, so they have to solve for those problems, but you have the luxury here:
43:03
we've been able to solve a lot of those problems for you. And while I love talking about this stuff — I love explaining to customers why DirectFlash is so amazing and revolutionary — you also don't need to know any of it, because the system doesn't force you to be an expert in any of these things. It allows the system to be simple and to just do its job.
43:22
But hopefully now all of you have a greater appreciation and a greater understanding of the absolutely amazing engineering work that we do, and have done for more than a decade now, to make these systems as simple, as elegant, and as long-lived as possible, so that you all get a better experience. So thank you very much.
43:47
Make sure to play the games, and please check out the Pure Community. We also have the Gartner Peer Insights link here. I think we're almost right at time, but we're happy to take a few questions, and then maybe take questions outside
44:03
after we invariably have to vacate. But thank you all very much for coming.