1:16:49 Webinar

Tired of Energy Savings & Efficiency “Gas”-lighting? Better Science is the Answer

Who knew that the best coffee break conversations would end up happening online? Each month, Pure’s Coffee Break series invites experts in technology and business to chat about the themes driving today’s IT agenda - much more ‘podcast’ than ‘webinar’. Pure technology, our goal is to educate and entertain rather than sell.

This webinar first aired on 11 January 2023

00:01

OK. I think with that, we are roughly on time. I still see people streaming in, which is great as always; that's why we record these. But we will go ahead and get started. Just in case we ever want to cut these later for video purposes, I'm gonna do a quick pause and then dive in. Welcome to the first Coffee Break of 2023, on January 11th.

00:21

My name is Andrew Miller, principal technologist at Pure Storage and your Coffee Break host. I'm joined today, for the second time, by Brian. Thank you, Brian, for joining again. It's always great to have a literal rocket scientist back on. A little bit of housekeeping: as you hopefully know, this is a series, as indicated by the cool logo.

00:40

We are now literally into the second year of Coffee Break. All of these are online, so you can find the recordings later, and they've actually done better than average, I think due to the solution focus. You know, we talk about Pure stuff, but it's in a larger context. You can also find them on purestorage dot com slash event center; that covers both the coffee

00:59

break and our Flash Crew customer webinars. Our Tech Talks, which are more kind of specific webinars, and previous ones are there too. A little more housekeeping, in case part of your motivation, or maybe all of your motivation, for coming is the coffee: there's a little bit of a revision on this that we changed.

01:15

Actually, it may or may not be Starbucks, depending on where you're joining us from. And I was remiss in not changing the slide before: a couple of months back, we started to shift this over to business emails only, just because of how many people are joining us. And in some cases, there might be clever folks out there that are doing multiple registrations.

01:31

I applaud the effort, but I think you understand how the game works here too. So if you don't see your coffee card, make sure you double-check that. Into introductions. As always, I'm your host, Andrew Miller; I appreciate you joining today and in the past. And while I usually try not to say much about my background, when it relates to energy,

01:50

I actually remember the first time being responsible for a data center and actually understanding how the HVAC worked, and the power that came in, and the UPSs that were huge, and even the UPS tech that one time might have knocked the UPS offline while he was servicing it. It got eerily quiet outside my office when that happened. It was like,

02:07

I've never heard it not sound like that before. I'm joined today, though, by Brian Gold. Brian, do you mind introducing yourself again? I know you did a year ago, but please. Yeah, it's great to be here, Andrew. Thanks for hosting, as always. I'm Brian Gold.

02:22

I'm a VP of technology here at Pure. I've been with Pure Storage for, it's coming up on 10 years, so quite a while at this point. I never thought I'd actually work for 10 years at one company. It's kind of weird in the modern, certainly Silicon Valley, era, but it's been awesome and a lot of fun. I have a pretty varied background, from

02:43

academia back in Pittsburgh and a few other places in and around the Bay Area. And yeah, it's great to talk with folks about some of the really deeper technology stuff that we're working on here today. Congratulations as well. I think since you last joined us, as we were talking, it sounds like your job responsibility has changed

03:02

dramatically. But you're now listed as a VP, so, you know, congratulations, for whatever that's worth, as well. Please make sure to join us next month. I'm gonna be focusing back in on VMware. Like it says: "VMware and Pure: We're Not Done Yet." It's a little bit of an allusion to the fact that we've been doing so much with VMware for so long, as

03:24

they've been doing stuff with us too. There's new, good stuff that we've been doing, and there might even be some fun stuff you haven't heard about that we're gonna touch on. Last time on this topic, Cody Homan joined me; this time I'm gonna be joined by David ST, and he and I actually had the pleasure of presenting at Tech Field Day Extra back at VMware Explore last year.

03:41

So we'll be diving into all things VMware and Pure. With that, into the topic. Now, it was kind of fun to do a little bit of brainstorming. I wanna give kudos both to yourself, Brian, and to Jeff Pickett, who's joining us and manning the chat and the Q&A line there. You know,

03:58

kind of thinking about, you know, how do we approach this topic? Because we do sometimes see in the industry just a lot out there around energy savings and efficiency, and everything is simple and everything's better. It's like, can this actually really be true? And is it a focus now? What we wanted to do this month is not do the same thing that we did last month, we as Pure, that is. So we had a great webinar

04:22

last month with Kevin Rickon Don Kerouac. We went into kind of all things energy-savings focused. You think about stuff like, you look at Northern Ireland, Ireland, where the government has said we can't build more data centers there. This is getting discussed especially outside the US, but also with the larger customers that I see in the US too.

04:42

But you might sometimes see claims like this, and this was kind of the centerpiece of that webinar, about, you know, 84.7% lower energy usage, and it can almost sound too good to be true, right? That's a pretty dramatic number. So we were trying to think about how we take a kind of Coffee Break-y approach to this, really focusing on the answer.

05:07

So, what we wanted to do: we're into the agenda. You're like, where's the normal agenda slide? Here it is. So, like it says, "Please let the energy claims be more than marketing." We're right after Christmas, and, you know, I've watched various Christmas series over the years.

05:22

A recent favorite is Polar Express, right? Polar Express is, you know, like, "I want to believe." It almost ends up being about a child believing in something more, even something transcendent. But in this case, can it be more than just claims? So that's what we wanted to explore. Four parts, as always. First,

05:39

wandering into kind of the history of flash a little bit and what we've seen in the industry: is flash just a flash in the pan, maybe? Second, really actually going into the physics. This is especially where I'm glad to have Brian here because, and I've said this before, hopefully it's not too awkward, he's the professor that you enjoyed listening to,

05:57

kind of thing. Let's just walk through some of this. Third, two plus two equals five: is it a hardware discussion? Is it a software discussion? Why, yes, it is. And last but not least, you know, how do we actually then think about, if we think about things at a very micro level,

06:11

what about the macro level? What about petabytes, exabytes? Are there principles that can apply here? So with that, I think, Brian, if I'm not mistaken, we were going to actually kick off the first poll here, if you don't mind, Emily. So, just in reference to the topic of energy consumption, because that's what we're gonna

06:31

talk about today. We're gonna kind of orbit around it and even go really deep there. So I'm just gonna leave this poll up here for a minute as we kind of kick it off. So as we think about this as a topic, Brian, especially with 10 years here at Pure, all of it working heavily with flash and designing storage systems: both at Pure and maybe even as an industry, how would you describe this energy topic and energy stuff?

06:56

New, been there for a while, different in some ways? Help me out there. So, it's certainly been around. If you go back 10 or 15 years, it was peripheral to energy overall, but there were big calls, especially in the build-out of hyperscale web architectures, things like Google, Facebook, Amazon and so forth, for what was called energy

07:21

proportionality and power proportionality: if we put, you know, a kilowatt into some bit of IT equipment, how much utility do we get out of that? And that's something that, as we built hyperscale data centers, mattered a lot. And you had kind of a single platform that, you know, would derive benefit from that efficiency.
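
The "kilowatt in, utility out" question can be written down as simple arithmetic. The following is only a minimal sketch: the function names and sample figures are hypothetical, and the idle-versus-peak ratio is one common way of reading "energy proportionality", not a definition given in the webinar.

```python
def utility_per_kilowatt(useful_work: float, power_kw: float) -> float:
    """Useful work delivered per kilowatt drawn (the work unit is up to the caller)."""
    if power_kw <= 0:
        raise ValueError("power_kw must be positive")
    return useful_work / power_kw


def idle_power_fraction(idle_kw: float, peak_kw: float) -> float:
    """Share of peak power still drawn when idle; 0.0 would be perfectly proportional."""
    if peak_kw <= 0:
        raise ValueError("peak_kw must be positive")
    return idle_kw / peak_kw


# Hypothetical figures, for illustration only.
print(utility_per_kilowatt(useful_work=500.0, power_kw=2.0))
print(idle_power_fraction(idle_kw=0.5, peak_kw=1.0))
```

A system that draws half its peak power while doing no work is far from proportional, which is why utilization and efficiency end up being the same conversation.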

07:45

And so it got talked about there, but what's happened over the last, you know, 10 or really five or even two years is that it's come more and more to the forefront. And you've seen it, whether it's the rise of virtualization and moving from physical servers to virtual servers to get higher utilization and higher efficiency, or the

08:07

changes in networking to have more efficiency. There's a common theme here: energy efficiency is actually a story of just efficiency. And that can be great because it serves the dual requirement of lowering my costs: if I'm more efficient with the IT equipment, I'm gonna consume less energy, hopefully, and I'm also gonna spend less money to get to an outcome.

08:33

And that, you know, kind of win-win scenario is something we've thought about basically every day for the last 10, 12 years here at Pure. I even think about the density, and we didn't even talk about this before, like the first time that I started to do heavy stuff with UCS and UCS chassis, like server chassis or blade chassis.

08:51

And you couldn't actually fill up the rack anymore because you couldn't get enough power into the rack, because you'd made it so dense already, kind of thing, thanks to VMware and other stuff. So let's maybe even focus it on storage. With storage, we've kind of spent maybe a decade throwing SSDs or NAND at this, but not necessarily for energy reasons, though.

09:13

Right. Yeah, you know, you look at the cost of flash 10 years ago and it was clearly a premium relative to spinning disks. And so, especially for really large-scale data sets, you'd say, well, OK, I'm physically rotating something, and it's not the most energy-efficient thing to keep something spinning at, you know, 7200 RPM or whatever,

09:37

but it's really cheap and I need a lot of it. So for certain use cases you're stuck with it. But, you know, five or 10 years ago, and certainly still today, for anything that needs performance, I'm looking at flash. And we've seen the rise of all-flash storage systems. We've helped be a part of that with what we've done at Pure, but there's always been a performance element to that

09:59

because flash, you know, has that cost premium associated with it. And so today's platforms, especially really large-scale ones, still have a fair bit of disk in them. And that's, in some ways, where the next area of our attention can focus on energy efficiency, because we've got to get rid of these spinning things.

10:22

If we're really gonna drive down power consumption for storage systems, and if we can do it in a cost-effective way, then we get that win-win of energy proportionality and cost efficiency. It is this perfect storm. I think we've all heard data is exploding.

10:39

Whatever the numbers are, 3x per year, 10x per year, say, zettabytes are big numbers, at the same time that there are the energy pieces around that and even the physical aspects of the data center. So let's look at that, then, through a little bit of a cost lens. There's a piece of kind of "why all-flash, and why now?"

10:57

And I think you said you have this chart actually printed and, like, stuck up on your office wall, if I'm not mistaken. Help explain why that is. But, we think a lot about trends, and it's helpful to look at long-term trends, both historically and in projections going forward, to understand the decisions and choices we make in building a product or designing a large-scale system.

11:18

What are the big macro forces that, you know, we're working against or working with? And so this data is actually publicly available. It's a really fantastic data set that looks at consumer prices; for some components, it actually goes back decades. There's a professor who's been keeping track of consumer prices for electronic components and publishes this

11:43

regularly. And so, in particular, we've normalized everything here to 2022 dollars, factoring in inflation over the years, and it's looking at dollars per gigabyte of just hard drives and SSDs. We've put in some trend lines, and we see a slow convergence there. We're getting closer,

12:09

and you can see the path we've gotten with flash components, you know, using SSDs as a proxy for that. It's getting us closer, but we've still got a little bit of a gap here today, and part of our mission is to help close that gap. I wanna make sure to put this in, because when you and I were talking, it was like,

12:28

we don't want to put in too much preface stuff; we wanna just dive into the stuff. But fundamentally, the energy usage, just by using flash, is a good bit lower than with hard drives, although there are definitely challenges there in trying to close the gap. So, back to you. Yeah. So, you know,

12:45

we want to build cost-effective systems. They've got to be resilient, reliable. I saw a question come through quickly in the chat there about longevity. We'll talk about how we handle that. That's a huge focus for us and for anybody. But there are things, just on a cost front, that we as system builders can do to

13:06

close this gap that exists at the media level and get all-flash designs that are cost-competitive with hard-drive-based designs. And it's things like, well, continue to adopt the latest, greatest, cheapest NAND. Today, that means using QLC media; it'll be PLC, you know, here in a couple of years. We've got to have really efficient parity: how do we encode resiliency into a system?

13:32

Something we've put a lot of time and effort into at Pure is having best-in-class data reduction, so that if there's any redundancy in the information you're sending us, we can squeeze that out and need fewer physical bits on the back end, and so on and so forth, to get really, really cost-effective systems. But it's not so simple.

13:58

And in fact, that first thing I mentioned there, keeping pace with where flash is going at a technology level, means that the cost of NAND goes down, but we've got to address the endurance problems. And as folks who are familiar with this part of the technology landscape may know, as we store more information per cell, and as we make those cells smaller,

14:26

we tend to have worse endurance characteristics. We'll talk in depth about why, and how we overcome that. But that's actually just the surface-level problem. There are actually many problems in the land, and this is the full chart I have printed on the wall to think about. The problem is actually worse than you might think, because one of the solutions you can,

14:51

you know, apply to try to overcome some of these issues is to put a bunch of DRAM in a system and do caching. But DRAM stopped scaling in its cost curve a few years ago. There are dips and valleys based on supply chain and some technology trends, but it's basically flat, and that has huge implications for how we'll build systems.

15:16

And so all of this is just the context for how we wake up every day as architects, here at Pure at least, and think about how we're gonna build future storage systems for a huge variety of use cases, some of which may be very performance-sensitive, some of which might be more archival and capacity-optimized.

15:36

And we've got to take all these raw components and build really, really efficient systems out of them. Doing that requires a fundamental understanding of what the building blocks are: DRAM, NAND, hard drives, et cetera. I'm gonna do a quick pause.

15:56

Let me close the last poll here and we'll share it back with everybody; there are the shared results. So it's interesting to see 50% of folks saying energy consumption is much more important, 40% saying it's about the same, and 10%, yeah, you know. So I think you'll find the rest of

16:15

this interesting, right, for other reasons. But at the same time, that actually tracks with trends, and in some cases we even see different distributions by company size, because that affects the total energy bill, and even by geography, you know, US, EMEA, APJ, et cetera, kind of thing. So, as you mentioned, like, this is not,

16:33

I mean, the problem is worse than you might think, but to even kind of understand that... And I got out of order. I'm like, could you launch the second poll here? As you can tell, we planned this out, but not precisely. So, second poll coming up right now; we'll let this sit out there. And actually, so, yes, OK. So this is a little bit similar:

16:54

sustainability higher than it was before. So we'll let that one sit out there for a minute. I won't read the poll, and we'll come back to it. So, we think about, then, what are the building blocks that we truly have? Because in a way, we've had SSDs, we've had flash, for a while. And what I really enjoyed in this, Brian, is just how you go all the way down into

17:14

how it really works. So, do you mind walking through it, almost just at a physics level? Yeah. Well, so first, a little bit of why, right? If we really want to solve these macro problems, at some point we need to understand how everything works. Even if, you know, we at Pure Storage, or other system builders and system architects,

17:32

might not be fabricating the individual NAND cells or chips, I need to know the properties that they have so that I can factor in where they're going and what we're gonna do about some of the challenges at a system level. And so I've enjoyed learning from and talking to, you know, real deep experts on the physics of flash and then thinking about how we are gonna work with

17:56

those? So here's a quick, you know, primer on how flash works way down at the electron level. The picture on the left here is a transistor. If you've never seen something like this, don't worry; it's really not as complicated as you might think. Think of it like a light switch.

18:16

And what we do with a transistor of any kind is apply some voltage, some effort, across the top and bottom. That's like flipping the switch on. If I apply a certain amount of effort, a voltage, I'll switch this switch on and current will flow. What folks back in the fifties figured out was that if we apply a lot of voltage,

18:42

like, kind of a crazy amount of potential across that top and bottom gap, it will actually cause a physical effect called quantum tunneling, where these electrons that are flowing from left to right in our transistor will shoot across this insulator gap and get trapped. And that's kind of what we have shown in this cartoon:

19:05

these little circles. Think of these as trapped charges, and they'll stay there. Even when you take all the power away, they'll stay there, which is kind of an amazing physical property. While they're there, though, and they're sort of there en masse, a bunch of charge, they change the electromagnetic field, and therefore they change the amount of effort it

19:28

takes to turn the switch on. And that allows us to now understand whether we've trapped those charges or not. And that's what the graph on the right shows: if we have thousands of these transistors, some with trapped charges and some without, we can actually observe which group a given transistor is in based on whether this electromagnetic field is

19:53

disrupted. And if we can tell that condition, well, we can call one of those states one and the other state zero. And congratulations, we've just stored a single bit of information. Binary, all the way back to binary. And if we could tell a lot of charge from no charge,

20:15

well, maybe we could tell some charge from a little less charge. And so if we have the ability to differentiate four different levels, we can actually store two bits of information. If we can tell eight different levels of charge, we can store three bits, and so on, to get all the way to what you can see in production today, where we can store four bits of information.
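
The relationship between bits per cell and distinguishable charge levels is just a power of two. As a minimal sketch (the dictionary and function names are illustrative; the cell-type labels are the common industry terms used elsewhere in this conversation):

```python
# Bits stored per cell for each common cell type.
CELL_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def charge_levels(bits_per_cell: int) -> int:
    """Number of charge levels a cell must distinguish to hold this many bits."""
    return 2 ** bits_per_cell


def levels_table() -> dict:
    """Map each cell type to the number of levels it must distinguish."""
    return {name: charge_levels(bits) for name, bits in CELL_TYPES.items()}


for name, levels in levels_table().items():
    print(f"{name}: {CELL_TYPES[name]} bit(s) per cell -> {levels} levels")
```

Each extra bit doubles the number of levels that have to fit in the same voltage window, which is where the endurance discussion that follows comes from.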

20:39

So that's 16 different levels, which, if you step back and think about it, is kind of amazing: we're harnessing all this underlying physics, and we stack these things up in these incredibly sophisticated 3D stacked dies. You know, a petabyte of information is quadrillions of bits, all stored, you know,

21:06

using these sort of quantum physics effects. And we don't necessarily think about that every day as we use storage devices, but it's pretty amazing stuff. But pretty much, there is no free lunch. As we mentioned up front, endurance has long been a concern with flash, and

21:31

it gets worse. And so, yeah, the way to think about this, now that we've introduced this idea that, hey, we have these different threshold voltage values we want to tell the difference between to be able to store and retrieve information: well, we kind of wish they were perfectly divided, and I could tell, you know, this many electrons

21:50

is this value and this other many electrons is that other value. But in fact, the different distributions have some overlap, and it's actually even worse than that: they change over time. As we write values to this, the distributions drift and degrade,

22:14

and this is, you know, a visual of what we know to be true of the realities of NAND: it has lower endurance, and every time we pack more information in or make it more dense, it gets worse. There's all kinds of quirks and caveats that we as system architects have to deal with, because they are tied back to these physical properties. Like, quantum tunneling is literally smashing electrons through tiny,

22:40

tiny, tiny microscopic films of glass. And so you end up with a trend over the last 10 or 15 years, as we've gone from single-level cell, storing a single bit of information, to the modern quad-level cell flash, where we have roughly 100x fewer write cycles.

23:02

And when we do a write, it takes a long time; it takes, like, spinning-disk latencies to erase or write new values. Those have huge impacts on how storage systems should be built. So, man, whenever I've looked at this and talked about this, I mean, this is dramatic. We're talking an order-of-magnitude shift, two orders of magnitude, as far as the

23:27

endurance. There was actually a great question already from someone about endurance, right? That is one of the key things we have to solve for. So, at a general industry level, we're kind of still leaving Pure out of this in a way. Although we think about this topic a lot, this is more educational, agnostic-type content.

23:42

How have SSD vendors solved this in general? Well, so, the first thing we want to do is run away and hide. As consumers of flash, we want to pretend none of this stuff exists. And in fact, there are even the basic programming challenges: with flash, you can't do random overwrites; you have to do what are called out-of-place updates, where I

24:05

can only erase this block of cells and then append new information into it. And so what we do is create an illusion layer in an SSD, an illusion layer that says: don't worry, pretend this effect isn't there. We'll have what a traditional hard drive would present to you,

24:26

which is this linear block address space, where I can do random four-kilobyte, sector-level overwrites. And it's the job of the SSD, through the flash translation layer and all of its machinery inside of the drive, to hide the physical effects and to remap that logical address space onto the really complex underlying physical structure.
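
The "illusion layer" idea can be pictured with a toy model. The sketch below is an assumption-heavy illustration, not how any real SSD firmware works: the `ToyFTL` class, its fields, and the page-granular layout are all hypothetical, and garbage collection and erase blocks are left out entirely. It only shows the core trick: an overwrite of the same logical address lands on a fresh physical page, and a mapping table hides that from the caller.

```python
class ToyFTL:
    """Toy flash translation layer: logical block address -> physical page."""

    def __init__(self, num_pages: int):
        self.pages = [None] * num_pages  # physical pages, written append-only
        self.next_free = 0               # append-only write pointer
        self.l2p = {}                    # mapping table: logical -> physical
        self.stale = set()               # physical pages holding superseded data

    def write(self, lba: int, data: bytes) -> int:
        """Out-of-place update: never overwrite, always append to a fresh page."""
        if self.next_free >= len(self.pages):
            raise RuntimeError("no free pages; a real FTL would garbage-collect here")
        old = self.l2p.get(lba)
        if old is not None:
            self.stale.add(old)          # old copy becomes garbage to reclaim later
        phys = self.next_free
        self.pages[phys] = data
        self.l2p[lba] = phys
        self.next_free += 1
        return phys

    def read(self, lba: int) -> bytes:
        """Follow the mapping table to the current physical copy."""
        return self.pages[self.l2p[lba]]


ftl = ToyFTL(num_pages=4)
ftl.write(7, b"v1")
ftl.write(7, b"v2")       # same logical address, new physical page
print(ftl.read(7), ftl.l2p, ftl.stale)
```

The caller sees a plain overwrite of address 7; underneath, the first copy is now stale and will eventually have to be collected, and the `l2p` table is exactly the kind of mapping state that has to live in memory somewhere.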

24:52

The problem with that is that the illusion layer knows nothing about the file system, your application, your database, your VM environment. It knows absolutely nothing. And so it's stuck, basically, with a set of heuristics and designs that are more or less designed for arbitrary, random workloads you might find on a laptop or a phone or in a data center. And it's really hard to be good at all of those

25:22

things and yet know nothing. And it gets really expensive. To tie this back to the energy and cost efficiency discussions: all that illusion layer that we're keeping in these drives, these off-the-shelf SSDs, well, the typical rule of thumb is that you'll have about 0.1% of the flash capacity in DRAM to keep these mapping tables, mapping logical addresses to

25:52

physical. And that might not sound like a lot, like, 0.1%. But, you know, we're seeing storage systems that are now pushing 1000 petabytes in scale. Well, a 10-petabyte storage system has 10 terabytes of DRAM inside the SSDs. That's not server memory; that's embedded DRAM. And remember that earlier chart that showed that DRAM is not getting cheaper.
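
The 0.1% rule of thumb is easy to check with integer arithmetic. This is a small sketch under stated assumptions: decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes), and a hypothetical helper name, `mapping_dram_bytes`, that is not from any real tool.

```python
TB = 10**12  # decimal terabyte, in bytes
PB = 10**15  # decimal petabyte, in bytes


def mapping_dram_bytes(flash_bytes: int, parts_per_thousand: int = 1) -> int:
    """DRAM needed for mapping tables at roughly 0.1% (1 part per 1000) of flash."""
    return flash_bytes * parts_per_thousand // 1000


dram = mapping_dram_bytes(10 * PB)
print(dram // TB, "TB of embedded DRAM for 10 PB of flash")
```

At that ratio the embedded DRAM grows linearly with capacity, so every additional petabyte of flash behind the illusion layer brings another terabyte of always-refreshed memory with it.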

26:18

So it's really expensive, and it draws power, because you've got to keep it refreshed all the time. And so it's quite inefficient to do it that way. The problem here is that the SSD, being kind of a black box to outside software, throws away any knowledge that might have existed in the file system or in the database or in whatever application you're running about metadata updates, different types of,

26:45

you know, write streams, different background tasks that might be happening. It has no knowledge. And so we've got to burn all this DRAM and power budget to support what might be a laptop-oriented workload, even though you're running some VMware or SQL Server or Oracle or whatever in your data center. It was fun talking about this.

27:06

I mean, it's amazing this works at all. You talk about quantum tunneling, you talk about, like, applied physics. Like, the modern SSDs line didn't just make it onto the slide; I mean, you wrote this slide, actually. They're engineering marvels, but there's a lot further to go.

27:23

Yeah, they really are. And, you know, it's a testament to the industry and how much opportunity solid state presents that enormous R&D investment has been made. If you don't know anything about the workload, it's hard to do better than an SSD. But if we're building, you know, petabyte-scale storage systems for data

27:44

center type workloads, we do know a lot more, and we've got to go further. So we're now into how we think about this as Pure a little bit. And by the way, we're still orbiting around energy in everything here. We kind of went all the way down, and we're kind of pulling it back up again. So this is how it truly all works, all

28:03

the way down, because, I mean, it's fundamentally more efficient than spinning disk. But then there are challenges with NAND and flash, using the terms interchangeably. So how do we develop solutions around this? And this even goes to the idea of, and I first saw this quote, actually, in a Steve Jobs presentation, I know I'm giving free advertising to Apple, but you know,

28:22

but, well, kind of thing: the quote from Alan Kay, "People who are really serious about software should make their own hardware." That is what we're kind of alluding to here, you know, two plus two equals five. Is this a software question? Is it a hardware question? It's both, actually. That's what we've done from a Pure standpoint.

28:43

I think about how we co-design these two systems at that depth, right? Exactly. So if we are the designers and architects of the software that is implementing a file system, an object store, a large-scale storage system, even if it's presenting, you know, block abstractions, we know so much more than you would if you were building an

29:08

SSD. So we wanna take advantage of that knowledge: we know all these different write streams, and we want to design the structures in the hardware and the structures in the software together. And so this picture on the left, well, we'd need, like, another hour to talk about the data structure fundamentals, so we'll skip some of that,

29:29

but rest assured, there's a lot of work that's been done to design really efficient data structures to represent all the complex metadata. And we're gonna co-design those structures with the hardware that we're also building on top of the underlying media, such that we get all the context and knowledge of what's going on in that higher-level software. But we also have direct control and visibility

29:57

into that hardware and into that media. And that's the key enabling insight, if you will, that we can use to overcome some of the really big technical problems that exist in modern flash and in building efficient systems at scale, especially when it comes to energy savings. So I think there are three areas that we wanted to highlight there, kind of both benefits but also

30:19

problems we solve. First, around lifetime, right? Yeah, so we've hopefully made it clear that we've got cheaper and denser components that have come with modern flash fabrication technology, but that presents an enormous endurance problem. So how do we overcome that? Well, the key is to get rid of that illusion layer and have that context, have all of the knowledge

30:43

of the different streams of write activity. Our software, which is called Purity, now understands what those different streams of metadata and data are, and we can place and align that data onto flash in such a way that we minimize what's called write amplification, so that we're not having to collect as much garbage and rewrite things on the back end when you as an end user didn't actually do a new

31:12

write. And so if we can minimize that write amplification, we can extend endurance no matter what the flash technology is under the covers. Minimizing write amplification is a good thing. And this graph shows really the ultimate measure of that, which is annual return rates; think of it like an annual failure rate, effectively.
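
Why lower write amplification extends endurance can be seen with a back-of-the-envelope model. This is a simplified sketch, not Pure's method: the function names are hypothetical, the formula ignores over-provisioning, wear-leveling imperfections, and retention effects, and the sample numbers are made up for illustration.

```python
def write_amplification(nand_bytes_written: float, host_bytes_written: float) -> float:
    """Ratio of bytes physically written to flash vs. bytes the host asked to write."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return nand_bytes_written / host_bytes_written


def estimated_lifetime_years(capacity_tb: float, pe_cycles: float,
                             host_tb_per_day: float, waf: float) -> float:
    """Rough lifetime: total writes the media can absorb / flash writes per year."""
    total_nand_writes_tb = capacity_tb * pe_cycles   # write budget of the media
    nand_tb_per_day = host_tb_per_day * waf          # host writes, amplified
    return total_nand_writes_tb / nand_tb_per_day / 365


# Hypothetical device: same media and workload, two different amplification factors.
print(write_amplification(nand_bytes_written=300, host_bytes_written=100))
print(estimated_lifetime_years(capacity_tb=10, pe_cycles=1000, host_tb_per_day=10, waf=1.0))
print(estimated_lifetime_years(capacity_tb=10, pe_cycles=1000, host_tb_per_day=10, waf=2.0))
```

In this model lifetime is inversely proportional to the amplification factor, so halving write amplification doubles the estimated life regardless of how many program/erase cycles the underlying cell type allows.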

31:33

So, lower is better, right? We want to get fewer things back that fail; you wanna send us fewer things back. The blue bar is kind of your typical SSD number. And, you know, we built a software system in the early, early generations of FlashArray, so this is the FA-300, FA-400 series, where

31:58

we were using off-the-shelf SSDs, and we were better than the industry average because we wrote the software knowing it was flash, but we were still working through that illusion layer. And when we moved to our DirectFlash modules, that's that co-designed hardware and software, we saw roughly a 3x improvement, and we've been able to make it better by collecting

32:21

some basic telemetry from call-home systems through Pure1, where we can see, you know, hey, how do we improve some of our heuristics and algorithms to do that scheduling and placement in a way that reduces write amplification and further extends the lifetime of these devices. And this has held up phenomenally well for us as we moved from MLC flash to TLC flash to further generations of TLC,

32:48

and we've now got multiple years of production experience with QLC. And we see really no end in sight for our ability to overcome these problems that are plaguing the rest of the industry. Efficiency, lifetime: that's what underpins Pure actually having flat and fair maintenance, because we actually know that they'll last longer. But flash is still about performance,

33:13

right? It is, and we can't give that up as we move to the latest generations of flash. And so a big way in which we overcome these longer program and erase times is by doing fewer writes. So it's actually a side effect of having lower write amplification that we get some performance improvements. But we also have, by engineering

33:35

these systems end to end, we have the ability to use NVRAM and to use the right amounts of DRAM caches such that we can mitigate, we can't eliminate, but we can mitigate some of these tail latencies. And at the same time, avoid just throwing hardware at the problem, which is gonna increase energy costs and is gonna increase just actual capital costs. And so a big part of what we've been able to do

34:00

is take a lot of the DRAM out of the drive, consolidate it, pool it, use it more efficiently in the whole system. And that's really allowed us to unlock the scale of what you're seeing already in the market with the DirectFlash modules, where today we've got a 48 terabyte DFM that's based on QLC media. And while I unfortunately can't give you precise,

34:27

my legal team won't allow me to give you precise dates. There's some exciting stuff happening in our future over the next couple of years where we're really gonna see massive increases in scale, and we'll use this to serve different parts of the market. You know, we will continue to serve the really high-performance parts, but we're likely to be able to serve an increasing number of more

34:52

capacity-centric. We're already doing this with the FlashArray//C and the FlashBlade//S platforms, and we're just gonna keep pushing further and further, because that intersection is across all of our products now. And that intersection of software and hardware, it's what I came to Pure to do. I have a background in hardware but have always worked kind of right at this boundary level, and

35:16

it's a fun place to be. You can really affect big outcomes when you control both sides of that. So it's been pretty awesome to build this out here at Pure. To kind of highlight a little bit, for those who are joining us partway through: a key part here is that Pure doesn't use SSDs, they actually build the equivalent.

35:35

And there was actually a great video with um oh I'm free in the youtube channel, you know, called actually the whole array is like one big SSD, almost, if you will, because it sees and understands everything. I think, Brian, you were saying it's 500 engineers that are basically behind this one slide, that are working to make the things on this true and not just marketing.

35:53

Yeah, 500 engineers, you know, seven, eight years, no big deal. Yeah, it gets reduced to one slide. But like I said, this is a short version of what we walk through, and even the density, you know, 48 terabytes today with QLC. This is where we're going all the way back to the title. The answer is truly better science.

36:14

Like, we're not making this up. We've actually had, you know, 10 years of engineering. That's part of why we say a 10-year head start on competitors that have just been using SSDs off the shelf, if you will. But by the way, we've actually been thinking about this, and this will actually be a final poll question. There's other topics that we could do that are

36:30

truly better science and engineering, beyond this DirectFlash topic. We may come back later and do some other Coffee Breaks on this. That'll be a later poll. I think we're at the end of section three, we're roughly on time. I was kind of cheat back and forth a little bit.

36:44

I'm gonna close the poll here and then share the results back with you. Thank you, everybody, for answering. We had about 1200 people answer, which is pretty amazing. So, have you or your company specifically designated energy savings or increased sustainability as one of your top priorities for 2023? 63% yes. So that's interesting to kind of map

37:05

that even to the previous poll a little bit. I won't spend too much time playing with those thoughts right now because we've got one more good section to go, section four. But even before we say what it is: we've been all the way down here. I really appreciate how you walk through it, because to me

37:23

it's very understandable, but we've been all the way down at this layer. What if, though, we solve the micro, we do all these amazing things, but we miss the macro? We're in data centers. We've got stranded capacity and arrays that aren't running. We need to avoid stranded capacity, performance- or capacity-wise, or stranded resources,

37:40

either performance or capacity, but we can't get too close to the red line or else things fall over. So we need that magical, you know, it's almost a Goldilocks thing: not too big, not too small, just right. So this is where we go into thinking about this

37:54

through an energy efficiency lens. This is part of what we do at the micro level. But is there stuff here that we can do and think about at a macro level, at an exabyte level? And this is where, Brian, I love how you have conversations that wander all the way to that scale as well. Right. Yeah. And so whether you're building an exabyte-scale

38:16

private cloud or even a much smaller, say, petabyte-type deployment at a smaller shop, what we've seen over the last few years is that growing complexity is becoming as much the challenge, and it affects cost efficiency, it affects energy efficiency. And so this cartoon picture is there to represent what's happened when we have

38:43

lots of storage frames that are individually named and carefully curated, and they have complex runbooks and people attached to the care and feeding. What happens is you end up with a lot of operational costs that can dominate what it costs to procure and power the actual equipment. And, not to mention,

39:09

just be a drag on developer efficiency and deploying new applications. So this is a problem that we've really been thinking about a lot over the last few years in particular: that macro, what happens at scale? It's a little bit of pets versus cattle. Often that gets applied at a server layer, we talk about Kubernetes or containers, but it applies at a storage level too.

39:32

Absolutely. And so one of the things that we've seen from customers that are deploying really large-scale private clouds, and they've been doing that for five or 10 years, is what we often call the pod-based architecture, where we say, OK, we might still have 20-30% of traditional estate that are special cases. But for 70-80% of the

39:59

environment, can we standardize on some kind of cloud management system and have these pods that are maybe roughly rack scale, or tied to racks of compute, storage and networking, and have just simple, consistent APIs, very much like what you get in a public cloud consumption experience for the consumer of the infrastructure, but also to try to streamline the operations of what does it mean?

40:27

You know, how do I get out of the world where I've got 1000 different special snowflakes of things? We are thinking about this, though. And yeah, so to tie this into all the energy efficiency pieces we've been talking about here, right? This is exactly like you mentioned, Andrew. We wanna avoid having optimized the

40:51

micro and missing out on the macro. We've got, we think, the most efficient storage systems; you can think of that as the micro, that's the individual building blocks that go into these pods. We've got to make it easy for customers to deploy larger-scale collections, the full pod, if you will, without getting locked into something that's unique from Pure.

41:14

We've got to make it possible for a customer to build a customized management system that's unique to their business, unique to their workflows and requirements. And that's really the role of Pure Fusion, a relatively new product here at Pure. It's been generally available for a couple of quarters now and we've got great reception early here in the market.

41:37

What it does is it allows us to extend the cloud management APIs to the pod-based, you know, the 70 or 80%, but also to places that might be more in the traditional estate, where you do still need a storage array with best-in-class data management features and services, or where you can't deploy a full rack-scale pod. Maybe you've got kind of an IT closet for an edge site or an SMB location.

42:06

We can unify that management plane for most things and, again, simplify the operations. And what that allows you to do, sort of full picture, is get the energy efficiency and the cost efficiency all the way down to the flash and how we treat the hardware, but also the operational efficiency of, I need to start thinking about my global footprint. And that's what, you know,

42:30

this picture shows: an example global topology where, in a cloud architecture, I've got regions and availability zones and I have different storage classes. As an operator of this, I can't be thinking about every single one of these storage arrays as individual pets. I need to think about them in aggregate and have APIs that allow me to manage storage volumes and the placement of them in aggregate.
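As a rough illustration of what managing placement in aggregate can look like, here is a small Python sketch. It is not the Pure Fusion API: the `Array` type, the `place_volume` function, the region and class names, and the "most free capacity wins" policy are all hypothetical, chosen only to show a request expressed as region plus storage class rather than as a named array.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Array:
    """One storage array in the fleet (hypothetical model, not a real API)."""
    name: str
    region: str
    zone: str
    storage_class: str
    free_tb: float


def place_volume(fleet: List[Array], region: str, storage_class: str,
                 size_tb: float) -> Optional[Array]:
    """Pick an array for a new volume from a policy, not from a name.

    Assumed policy: among arrays in the requested region and storage class
    with enough room, choose the one with the most free capacity.
    Returns None when nothing in the fleet can satisfy the request.
    """
    candidates = [
        a for a in fleet
        if a.region == region
        and a.storage_class == storage_class
        and a.free_tb >= size_tb
    ]
    if not candidates:
        return None
    target = max(candidates, key=lambda a: a.free_tb)
    target.free_tb -= size_tb  # reserve the capacity on the chosen array
    return target


if __name__ == "__main__":
    fleet = [
        Array("array-a", "us-east", "az1", "performance", 40.0),
        Array("array-b", "us-east", "az2", "performance", 120.0),
        Array("array-c", "us-east", "az1", "capacity", 900.0),
    ]
    chosen = place_volume(fleet, "us-east", "performance", 10.0)
    print(chosen.name if chosen else "no capacity available")
```

The consumer never names an array; the provider side can add, drain, or replace arrays without changing how requests are written, which is the pets-versus-cattle point applied to storage.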

43:00

And that's exactly what we're doing with Fusion. It needs to be policy driven, it needs to be infrastructure or storage as code. It's a buzzword, I know, but it means it needs to have some level of self-service and the idea of consumers and providers, so people can consume the storage as people provide it. You know,

43:18

that's actually maybe some cloud concepts applied at scale. So what's interesting is when we first started talking about some of this, I think you and I were working this out, and I was like, let's think about how we actually handle both the micro and the macro. And Pure Fusion, I don't know, inherently, let's just be real,

43:32

we designed for energy efficiency. But if you don't have efficiency at a macro level, you might give away all the work that you did at the micro level that would be really be able to think about. Exactly. And, and you know, the great thing is just focusing on efficiency brings you energy efficiency and it brings you cost efficiency and it can bring you people efficiency, which is another way of looking at

43:53

cost efficiency. And, you know, so even if you're in the 37% that maybe energy efficiency wasn't yet a business focus, I'm guessing cost efficiency probably was. So it's a, it's an opportunity to have a real win, win. So hopefully, I kind of wandered. We, we started with,

44:12

with, with the topic, right? You know, energy savings, energy efficiency, gas lighting wandered all the way in through some context, you know, better science that fashion electrons through glass. Sometimes it's fun to write the titles here. I try to make sure they're still real, but I'm hoping that as we went through this now when you see something like this from pure

44:32

84.7% lower energy usage, that's a dramatic number. And you hear us talk about how we can deliver IOPS and density in one piece of equipment, and give some comments about how challenging to impossible it is for the competition to do that. These numbers, incidentally, are based on real comparisons that we did. You know, we can share

44:51

specific details individually. But it's actually specific comparisons against competitors. We didn't just make these up. But I'm hoping that, frankly, even without sharing some of the specific numbers and the comparison stuff, what we went through helps you understand how this is actually really possible at a science level. And we didn't even hit on some of

45:10

all the data reduction stuff that we do inside flash. You alluded to it a little bit, Brian. Don Carra went deep into that in the previous webinar that we alluded to. But I'm hoping this makes a little bit more sense. I think with that, we have come down the home stretch. We have a poll and we still have the

45:26

drawing. Brian, thank you as always for being such a great guest. Any final thoughts to bring us home before we kind of let our hair down and go into Q and A? I'll be sure to let all my hair down here. Thanks. No, it's always great to be on with you, Andrew, and great to see so many folks so

45:41

engaged. So, thank you. No, thank you. So the last poll, if you don't mind going ahead and launching that, Emily. This is almost a little bit of us just figuring out that we're two years into this, and the response to this as a series has been phenomenal, frankly. And this is always for a technical audience,

46:01

but sometimes it's like, do we go deeper, do we hit on topics that are a little bit broader even though they apply to a technical audience? Or sometimes even thinking about recovering engineers: you may be a CIO and you still find this stuff interesting because you want to understand how it really works, kind of thing. Right. So we'll leave this open for a minute: would you like another session that goes deeper

46:17

on these topics? So far the "absolutely" is far and away leading. And we'll leave the poll up here for a second for the drawing. This is for an Ember mug, retail value of $130, the kind you can control with your phone, because hey, that's cool. And it's actually kind of useful too.

46:34

My wife actually uses ours much more, to keep her tea warm for extended sessions. You know, when you're just kind of sipping your tea. Our winner today is Al Z from North Dakota. Thank you so much for joining us. Now, please remember, we do have another next month: VMware and Pure.

46:52

We're not done yet. I'll be joined by David Stamen. And with that, I think we're done with the formal part, Brian. We made it through, and actually, I think we covered everything roughly as we planned. I think I'll pull the music back up here a little bit.

47:07

If it's too loud for people, let me know. And it's really a question maybe even first I'll just toss to you. There's always things you're thinking of putting in or not. Is there anything you wanted to add in, where you're like, oh, I gotta keep moving and I wish I could have spent a little more time on that? Good

47:22

question. Well, so there's so much work that's gone into the software side of Purity for the last, it's actually 13 years that folks have been working on how to engineer these flash-friendly data structures. And it's in some ways harder to talk about because there aren't physical pieces of equipment to show off the way we can with the hardware.

47:48

But there's a lot there that we've really put huge investments in and are very proud of, and it amounts to a lot of benefits for a customer, whether it's energy efficiency, operational simplicity, all the Evergreen capabilities. If you've seen what we can do with a FlashArray or a FlashBlade, that actually ties back to technical

48:12

underpinnings that go back 13 years. And yeah, several, many hundreds of people working on it. Well, it wasn't many hundreds in the beginning, but, you know, a lot of work. Even some of that, I mean, some of it is just that you're building a platform that can refresh. We've done multiple Coffee Break sessions, some where we focused on Evergreen.

48:32

So, you know, but we just keep oring around it because so much of what we do and it takes all this engineering work to make it through. Actually, it's, it just, just claims. OK. So, um I'm looking through the questions here, I'll pick one up. Uh Nick, Nick asked about future tech that may be taking the place of flash devices in the future.

48:56

And you know, the short answer is we don't see anything in the next decade, which is about as far as anybody can, I think, credibly look forward in the media landscape, that's gonna take the place of flash. And in many ways, our bet on the future is that there's flash and there's tape in the data center. Hard drives, you know, this is more my personal belief, not an official Pure Storage

49:24

line per se, but I believe hard drives are gonna go away, and I think they'll go away slow at first and then all at once, as the saying goes, over the next really few years. And it's for a lot of the reasons we talked about: you can cut the gap by building really smart systems at scale. And what's gonna be left is you have flash for performance and capacity, and then tape.

49:50

Nobody's touching tape for a long time in terms of the deep, deep, deep, coldest archive. I was on the partner landscape, actually, when I first heard the term "flape". It's actually a term, it's flash plus tape. So you can go Google it. It still feels a little weird to say it. I think it's more appropriate to say, but

50:08

it's actually a term there for the idea that flash plus tape is what's gonna be around long term in the industry, though. Let's see. So also, I think it's probably worth commenting a bit on some of the poll results that we saw. And I would say, you know, not terribly surprising from my discussions with customers at all

50:28

different sizes of organizations. There is an increasing focus on energy and sustainability. But let's also be honest, there are lots of folks, especially in the current economic climate, that are trying to figure out how they get cost effectiveness, and if energy efficiency comes along with that, it's great, but they might not make decisions solely based

50:49

on energy. So I didn't see anything that was terribly surprising, but that's also where you see us really looking for ways to get both as positive outcomes. Just by improving efficiency overall, can we gain a lot of the energy efficiency along with it?

51:11

And as we look at how we design systems and focus on energy efficiency, we also look at things like, and this ties back to endurance, well, one of the ways you can look at cost efficiency, especially in a TCO model, is how often do I have to replace the device, right? In a spinning, mechanical hard drive, I've maybe got 3 to 5 years expected lifetime.

51:37

Even with QLC drives, we can extend that life out quite substantially. And that has TCO advantages. It also has a huge impact on sustainability beyond energy: sustainability meaning we're digging up fewer rare earth materials out of the ground, we've got fewer manufacturing processes, we've got fewer shipping containers that are moving stuff back and forth.
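The replacement-frequency argument can be sketched with a few lines of Python. The 3 to 5 year hard drive figure comes from the discussion above; the 10-year planning horizon and the 10-year device lifetime used for comparison are placeholder assumptions for illustration, not figures stated in the webinar, and `replacements_needed` is a hypothetical helper.

```python
import math


def replacements_needed(horizon_years: float, lifetime_years: float) -> int:
    """Count how many times a device must be swapped out over a planning horizon.

    The initial purchase is not counted; a device whose lifetime covers
    the whole horizon needs zero replacements.
    """
    if horizon_years <= 0 or lifetime_years <= 0:
        raise ValueError("horizon and lifetime must be positive")
    return max(0, math.ceil(horizon_years / lifetime_years) - 1)


if __name__ == "__main__":
    horizon = 10  # assumed planning horizon in years
    for label, lifetime in [("3-year device", 3), ("5-year device", 5),
                            ("10-year device (assumed)", 10)]:
        print(label, replacements_needed(horizon, lifetime))
```

Each avoided replacement removes a purchase, a manufacturing run, and a shipment from the model, which is why longer device life shows up in both the TCO and the sustainability columns.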

52:03

But fundamentally, we've got less cost going out the door for equipment to be installed. So it's another way to get that win-win while solving for some really important problems on the sustainability front. And even this one, I think you kind of hit on this, but Tim was kind enough to put a question in. He called it, you know, a striking result when you asked the audience about sustainability,

52:26

and he asked specifically about the difference between the carbon footprint of enterprise disk and SSD, or maybe flash. I mean, there has to be a huge difference there just from the energy consumption standpoint. More thoughts there? Yeah, so there are lots of ways. My understanding is we want to look at carbon footprint all the way through the life

52:45

cycle, from manufacturing to deployment to e-waste. And so again, I've got fewer things that need to get replaced. I'm not burning, you know, eight watts of power per spindle to do nothing, because that's what you've got to do with a hard drive. And so we're able to be more energy proportional during the life,

53:03

we're able to have fewer change cycles, as we have to replace fewer failed components. But a lot of the true carbon footprint also comes from being more efficient in all the things around the media. So if I have the right amount of CPU and DRAM and networking,

53:26

and I say "right" meaning it's proportional to the utility you get out. If you're building a tier two system, you don't want to stuff in tons of CPU and DRAM that are gonna be burning power and spitting out carbon when you're not getting utility out. And so efficiency is about proportionality to utility, and that affects everything

53:53

in sustainability, but also cost. So I'm gonna take one here and actually go in a little bit of a different direction, from KG. I was trying to leave last names out, just in case. Pure's vision for 2023 for sustainability and reducing energy: I think you can read between the lines, or even explicitly, we've covered a lot of that. However, I also want to make sure to highlight

54:15

that we actually have an ESG report. It says 2021, but it came out in 2022. It's highlighted there, and it actually walks through, literally, our vision. It starts with a letter from our CEO. I don't think I could point to anyone better to state what our vision is officially for this stuff than Charlie G. I'll put that link in the chat for folks.

54:34

So I would recommend looking at that, because this is a focus. And as you mentioned, Brian, it's not just about the flash, but a lot of other areas. There was an interesting one from Michael about the energy tradeoff of throwing more drives or more capacity at the problem versus deduplication and crunching of the data.

54:59

Because it's kind of a tradeoff of the compute versus the capacity. You mind elaborating there? Yeah. So we thought about this actually from the very beginning, because we don't want to do a ton of CPU-intensive work if there's no benefit. And so, whether it's for compression or deduplication, live in the system in both FlashArray and

55:24

in the compression we do in FlashBlade, we will sample data and figure out whether it's reducible. And generally speaking, we've looked at this analysis repeatedly as we've improved our reduction algorithms over and over again in releases. And in general, if it's possible to reduce the

55:44

data, it's generally beneficial, but we have to be efficient in how we do it. And so we do that dynamically by sampling data and only applying the expensive compute and memory to reduce if it's beneficial. You know, we don't want to spend a whole bunch of watts and CPU cycles on doing that and then get a 2% return. So we don't. There's three more,

56:10

one's a little bit open-ended, but I think we have time. One, I'll take a little bit of this in my mind, Brian. So from JD: how can one make the case for sustainability initiatives to senior management? I even think actually of Jeff who's helped it, help out the two, he's very passionate about this topic. The thought, and I'm not gonna try and

56:27

boil the ocean. The one thought I want to go back to is what you mentioned, Brian about how it can be efficiency period. And so if you can drive solutions that are efficient from a performance perspective and a cost perspective that also have sustainability and energy linked into them, that's how you can drive it, right? So depending on your organization,

56:44

you may not be able to drive it solely based on sustainability. But if you can align that to other areas, that can often be a path forward. That's frankly the one that I often see a lot, especially within the Americas, to be very candid, where there's that alignment. And the other way

57:02

to think about this is risk tolerance. You know, if we'd had this session a year ago, we wouldn't necessarily have anticipated some of the changes that came in Europe around the cost of energy and what's happened. It's hard to predict some of these things. And so being more efficient,

57:23

you know, having a more efficient architecture is also a way for organizations to mitigate some of the inherent risk that may exist, simply unknowable and uncontrollable risk from outside. I think the last one I may just do, and this is maybe more of just a fun one to end with, if there's a story that you wanna throw in. But if not,

57:45

this is from um the bank, it taste apologies if I'm getting your name wrong. What's the biggest challenge I realize this? I don't know if this is going to be a direct question about, but sometimes like the like if you look back over 10 years, you've been doing this, like what is something that you didn't expect that turned out a certain way unless I'm just putting on the spot without enough caffeine.

58:00

So, oh no, no, there's um there's, there's uh you know, any anybody who's built large scale systems has usually a collection of fun war stories of specific, you know, times when, when things were really difficult. But across the board, uh the thing that sticks with me is how we can draw those trend lines. We had those graphs of cost and you know, there's lots of these kind of trend lines and

58:25

you can do the math, but sometimes you step back and it's hard to realize just what exponential growth looks like. When I started at pure, the largest flash array that the team was shipping at the time. This was 2013. They had a huge party because we had a 30 terabyte raw storage array.

58:47

The entire storage array could support a maximum of 30 terabytes. There are still t-shirts that we have that say, you know, "30 terabytes, yay", because it was a big accomplishment. The largest flash module that we ship right now is 48 terabytes,

59:04

right? Think about that. And we put hundreds of those into systems, right? So the biggest challenge across the board is just keeping up with that kind of scale and all the things that you have to do to make systems successful when what you thought was big yesterday is all of a sudden kind of medium size, and it's gonna be small in

59:28

a couple of years old stuff. It's amazing. And then we build more, then we build more. It's like, ah thank you. I'm gonna share the last poll results here for those who stayed with us. Uh, there is a pretty strong bent toward, um, another session that goes deeper on this. So Brian,

59:44

may I figure out the timing and have you back for some of the kind of better, better signs of all 23 and four that you've been three and four you've been thinking about. Awesome as well. A final reminder for those who stayed right to the end, which is actually almost 1600 of you: please join us again next month, VMware and Pure. We are not done yet. Brian,

01:00:04

we're right at the closing. Any final thoughts to bring us home? All right, thanks everybody, and thanks for hosting, but thanks everybody for great attention and awesome questions. I appreciate it. Thanks all. We will look forward to seeing you next month. On behalf of Pure Storage,

01:00:17

Brian Gold. This is Andrew Miller. They must.

Hybrid Cloud
Evergreen//One
Portworx
FlashStack
Enterprise Applications
Enterprise Data Protection
FlashArray//C
FlashArray//X
Business Continuity
Data Analytics
Coffee Break
Pure1
Data Warehouse
FlashBlade
Pure as-a-Service

Andrew Miller

Lead Principal Technologist, Pure Storage

Brian Gold

VP Technology, Pure Storage

For January, let’s embrace the Merriam-Webster 2022 Word of the Year - gaslighting - but with a positive spin. This month host Andrew Miller invites back a return guest - Brian Gold (both a literal rocket scientist and also the professor you enjoyed listening to) - to explore how Pure’s Better Science enables up to 1) 80% energy savings, 2) 96% less space required, and 3) dramatically higher storage density - all with 4) much longer equipment lifetime (3x lower DirectFlash module failure rates than SSD’s). Yes, we realize it’s almost hard to believe.

As always, we’ll keep it educational and start by exploring the industry landscape, how flash / NAND works at a low level, and then move into Pure’s engineering advantage and application at scale.

Topics will include:

The History of Flash in the Datacenter - eventually the solution became the problem. Insane data growth didn’t help either.
The Physics of Flash - Quantum Mechanics, Device Physics, Materials Science, oh my!
DirectFlash - Pure’s Hardware & Software Co-Design = how we deliver on the dramatic numbers above.
Start Small & Grow Forever - Exploring Exascale Systems and how to use the same building blocks for a single system as for exascale and pod-based architectures. Yes, this involves Pure Fusion.
The team will stay on after the webinar answering any questions for those that want to stay longer!
