Webinar · 36:53

Architecting Data Infrastructure for AI at Scale

Explore how to optimise and scale AI with FlashArray™, FlashBlade®, and Portworx®—from vector DBs to model inferencing and data tuning.
This webinar first aired on 18 June 2025.
The first 5 minutes of our recorded webinars are open; if you are enjoying them, we'll ask for a little information to finish watching.
Transcript
      00:01
Thank you all for joining. I know there are a lot of really good sessions all at the same time; there were about three of them I wanted to go to myself, so I appreciate you spending the time with me. I'm John Owings, director of Cloud Native Strategy, and I also run our cloud native architect team globally,
      00:21
so all of the Portworx experts out in the field who work with Pure. Today we're going to talk a little bit about how important Kubernetes is to an AI environment. It's more of an infrastructure look at it, but we'll have a little demo at the end.
      00:37
Hopefully the video works. But before I dive in too much, a little statistic from breakfast yesterday: 50% of our table didn't realize one thing. Everyone here is staying in this resort,
      00:53
right? Who here realized there's a rain shower head in the ceiling, at least before you got in? And who didn't realize it until later? Half of our table at breakfast
      01:14
didn't see it. Two days in, they were asking, why is there only a handle in the shower? I said, guys, I'm going to share this because it was hilarious to me. I saw it right away, but this is my third year in this hotel, so I've had time to figure it out.
      01:31
And the other guy at the table, one who did see it, said, I saw it, but what's really a bummer is when you switch it over, it comes out freezing cold. I said, yeah, you turn it to that first, before you get in the shower.
      01:48
So, a little pro tip from me there. Now you can leave; that's your one nugget. You can go get that good shower and it'll be great. We'll move past that.
      02:07
We have an agenda, but you know AI is powering everything; this is not a huge surprise. We started with things like predictive artificial intelligence, then retraining LLMs and fine-tuning them into small language models, which is still all a thing.
      02:27
One of the things I've found most exciting in the last 12 to 18 months is the emergence of retrieval-augmented generation. Being at a data storage company, where we store all kinds of data, unstructured data, databases, all those fun things, that caught my eye.
      02:48
At KubeCon in Chicago, was that the end of '23? I was trying to remember whether it was Detroit or Chicago. We saw some demos of Ollama. Has anyone tried Ollama before? It's a great little open source tool to pull down open source models
      03:10
and play with them on your laptop. I saw those demos and thought, you know what, we're going to do some fun stuff with this. That's part of the demo later, so we'll talk a little more about it, because it's about avoiding hallucinations.
      03:21
I try to explain this to my kids, because my oldest thinks that ChatGPT, she just calls it "chat", knows everything, right? And I'm like, if it doesn't know the answer, it'll just make it up. And she's like,
      03:37
it will? Yes, it will. Obviously these models train on a lot of data, so they don't make up too much, but you don't want to be the one who finds out it just fabricated an answer for you. As for the challenges you run into, data accuracy is obviously a huge one. Access is another: how do I get access to
      03:56
it? How do I give my team access to it, and do that in a simple manner? I think all of us are smart enough to figure out how to get to the data; we can create mounts, do all kinds of manual work, write shell scripts. There are lots of ways
      04:13
to work around it, but how do we do it in a way that's going to scale in the enterprise? That's a huge question. And then obviously cost optimization: time to first token, token-based pricing, wasted GPUs. Who here knows that GPUs cost a lot of money?
      04:32
Right? Yes. I was doing some list-price math from stuff I looked up online. A fully populated SuperPOD, if you paid list price, and this is before you add the nice FlashBlade//EXA or anything like it,
      04:49
with the number of GPUs topping out at something like 16,384 minus one, comes to about $660 million. So we'll probably put everyone in here down for two, right? Because you have to be redundant. That's what's at stake: if you don't have the data, or you don't get to the data fast enough,
      05:11
those things are just burning cycles for no reason. They still draw electricity when idle, obviously, and as they start working they consume more, but that's what you want them to do. If they're sitting idle, they're wasted. What all that really comes down to, and I guess the slide worked out, they changed it
      05:31
from a white background to black because that's the theme for the conference, and it still seems readable, is that Kubernetes is at the center of it. I talk to a lot of customers, and every AI project is a Kubernetes project in some way, shape, or form. There may be some people out there who have built or use
      05:54
bespoke schedulers, because they've been doing HPC since I was in college in the early '90s. But most enterprises, most of the customers we interact with, aren't going to rebuild that. They're already running Kubernetes for their container development environment,
      06:16
maybe even their Red Hat OpenShift virtualization environment. So they're going, OK, we're already building a best practice around using Kubernetes as a workload scheduler; let's use the same thing for AI. So in the middle of all this you have Kubernetes,
      06:33
whatever distribution you like; we obviously partner tightly with several of them. Then you have NVIDIA, who also saw that Kubernetes was going to be at the center of all this; they purchased a company called Run:ai that does scheduling for the GPUs. If you've seen the demo,
      06:54
it's pretty cool stuff. This isn't necessarily a promotion of that one solution, but what it does is let you request resources. So John logs in, I have access to run two GPUs, and whatever workload I try to run,
      07:12
I get my two GPUs. But Nathan is more important than me and gets priority, so he can come on, and if there are no spare resources, it kicks me off. There are some cool things it does there. The point is, it extends the scheduling of Kubernetes,
      07:31
right? Kubernetes is made to go, hey, you need two CPUs and this much RAM, you run here. Very simplistic, but that's what it does. When you add GPUs, it gets a little more complicated, and then there's multi-user access and RBAC and all the fun things they add on top,
      07:47
but it extends that scheduler, in the same way Portworx with Pure extends the Kubernetes scheduler, which otherwise just doesn't know about data. Through things we've built like Stork, it can say, hey, this workload needs access to this data, whether it's a database,
      08:08
unstructured data, even object, and this is where you're going to get it. So together we have this cycle: GPUs need data, Portworx has access to the data, and together you have a solution that revolves around Kubernetes.
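To make that concrete, here's a minimal sketch of a single workload asking the scheduler for both GPUs and data. It's illustrative only; the image and claim names are hypothetical, with the PVC assumed to be backed by a Portworx storage class:

```yaml
# Hypothetical pod spec: the scheduler only places it where two GPUs
# are free, and the Portworx-backed volume supplies its data wherever it lands.
apiVersion: v1
kind: Pod
metadata:
  name: training-job
spec:
  containers:
  - name: trainer
    image: registry.example.com/trainer:latest   # hypothetical image
    resources:
      limits:
        nvidia.com/gpu: 2          # extended resource exposed by the NVIDIA device plugin
    volumeMounts:
    - name: training-data
      mountPath: /data
  volumes:
  - name: training-data
    persistentVolumeClaim:
      claimName: training-data-pvc # hypothetical claim on a Portworx storage class
```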
      08:30
And that lets you unify those silos. What I generally focus on is: how do we get the Jupyter notebooks running? How do we run our inference Python code, those types of things, and still pull data out of things like Postgres,
      08:54
out of other SaaS apps, out of all those other places, in a way that abstracts it away for the actual consumers of AI. In the enterprise, that means developers and data science teams, who in general you don't want running the infrastructure. You want to provide them a platform where
      09:19
they can just log in, and the faster and easier their access to those resources, the quicker your business will get a result from AI. If they spend a year tweaking parameters in Linux and re-scripting where to access the data, you're already a year behind your competition, who's
      09:46
already done it. So on top of that, Portworx comes in, and yes, there's the SRE silhouette over on the side of the slide. That platform team is building not just a developer
      10:01
DevOps platform; it is now a GPU-as-a-service platform. What we want to do is provide that abstraction to access the data, whether it's on a FlashArray or FlashBlade, the things we're all familiar with here at Pure, but also other sources. For example, all of those servers come with local NVMe drives.
      10:23
How do we take advantage of those in a distributed manner? This is actually a pretty fun use case in the cloud with Run:ai. Say you're doing a training job and you do not have a layer like Portworx, and someone comes in with a higher priority because it's Nathan.
      10:43
Sorry, for those who don't know, Nathan's here in the front of the room, so I'm just going to use his name as the example. Because he has higher priority, it'll pause the training job and start up his inference, because that's not going to take a week to run; we can finish that real fast.
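Run:ai brings its own scheduler for this, but the underlying idea maps onto a plain Kubernetes PriorityClass. A minimal sketch, with made-up names, of how a cluster expresses "inference preempts training":

```yaml
# Hypothetical priority class: pods that reference it can preempt
# lower-priority pods (like long-running training) when GPUs run out.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: inference-high
value: 1000000                     # higher number wins
preemptionPolicy: PreemptLowerPriority
globalDefault: false
description: "Interactive inference jumps the queue ahead of batch training."
```

A pod opts in with `priorityClassName: inference-high` in its spec.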
      10:58
So then it runs. But when it pauses the training job, it actually writes a checkpoint down to disk. And if you don't have a layer like Portworx distributing that data, the job will reschedule on another node when one becomes available and restart from the beginning, because the checkpoint is gone. That's a huge bummer, and everyone's like,
      11:20
why does it, it said it would be done in 48 hours, and now it still says it'll be done in 48 hours; I don't understand why the time hasn't gone down. So providing that holistic view of where the data goes and where it lives is going to enable your customers, your consumers of GPU as a service, to get there faster.
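In Portworx terms, the fix is a replicated volume, so the checkpoint still exists on another node when the pod gets rescheduled. A sketch with assumed names and parameter values:

```yaml
# Hypothetical storage class: repl "3" keeps three synchronous copies
# of every volume, so a rescheduled training pod still finds its checkpoint.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: px-checkpoints
provisioner: pxd.portworx.com      # Portworx CSI provisioner
parameters:
  repl: "3"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: checkpoint-pvc             # mount this at the trainer's checkpoint path
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: px-checkpoints
  resources:
    requests:
      storage: 100Gi
```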
      11:34
Now, obviously, RAG. I talked about it a little last year. Who here is RAGged out? We overuse that term, but what it really means is getting the most accurate response you can using your own
      11:59
data. I've done this in a couple of different ways, even internally at Pure: there's data we're not going to just upload to OpenAI. So I can take that data, run it here, vectorize it, run queries, and get my answer back internally. We can do that with an open source model and get the
      12:19
same exact result. Well, not the same exact result from each model, but I've tested them: what does DeepSeek do versus Meta versus ChatGPT? It's always fun to see, with the same source data, what answers you get. Maybe that'll be a paper for next year. But yeah, it's about getting the most accurate answer you can.
      12:39
Avoid the hallucinations, because, and we'll see in the demo what this means, when you let the model choose on its own, you get some fun stuff. The benefits are cost, accuracy, which I think is a very important one, relevance, and time to market.
      13:01
If you have people sitting around retraining the whole model, hey, we added some new things to the database, could you retrain the model? That doesn't fly. If we have to wait a week, by then that PDF, or whatever you put into the vector database, isn't good anymore.
      13:21
And where does all the data go? There are a few examples here in a Kubernetes environment. Oh, they highlighted it, so that makes it easier to see. This was actually a tool I downloaded from Hugging Face, a pre-built photo booth thing that turns you into something funny, and it downloads a lot of data every
      13:45
single time it runs. Where it goes is an ephemeral directory within Kubernetes, and it all looks like it's running fine until you say go. The app actually starts to write data, the ephemeral storage fills up, and you get this fun error:
      14:06
you overran your request, or rather: the container consumed more ephemeral storage than it requested, which was zero. That's not good; it causes the process to crash. And the thing is, a lot of these solutions, or even the examples you'll find on the internet, are written by people who write code but don't think about the data storage portion. So
      14:28
on their laptop it worked totally fine; put it into Kubernetes and you get something not ideal. So we've got to think about where all that data goes. This is just a cache, right? Just some hashes that let the app run. So we've got to think: OK, there's local storage,
      14:44
there's also the RAG data, there are lots of different places we need to look at, and we need to make them fast and quick to access. Portworx does that: once I replaced that ephemeral storage with a Portworx volume, it worked.
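In manifest terms, the fix looks roughly like this; a sketch, not the actual Hugging Face app's spec, with hypothetical names and paths:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: photo-booth                # hypothetical name for the demo app
spec:
  containers:
  - name: app
    image: registry.example.com/photo-booth:latest  # hypothetical image
    resources:
      limits:
        ephemeral-storage: "1Gi"   # the limit the cache kept blowing past
    volumeMounts:
    - name: cache
      mountPath: /root/.cache      # assumed cache path; writes here now land on the PVC
  volumes:
  - name: cache
    persistentVolumeClaim:
      claimName: photo-booth-cache # hypothetical Portworx-backed claim
```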
      15:03
We also want to provide for scale, and at scale you look at things like concurrent read/write access via the FlashBlade. Being able to run not just one model but hundreds or thousands as you add more users is going to be extremely important, so the concurrency that comes from something like a FlashBlade is key. That's why,
      15:24
whenever I talk to anyone on our Pure team and we say AI, they immediately go, is that FlashBlade? Yes, but you also use Portworx to glue those things together. Second, concurrent read/write I/O for the
      15:42
distributed KV cache. Now, this is one where, and I think I've worked in sales a little too long, the KV cache is this: as you get answers from the model, it creates a cache, and if you're running a distributed environment for an LLM, you can
      16:04
have the cache on one node and not on the other, so you have to rebuild that cache every time. A distributed cache lets you save it onto a FlashBlade and then get accurate answers as you scale across multiple GPUs. And the result, on the slide, is lots of llama icons.
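In Kubernetes terms, that shared cache is just a ReadWriteMany claim against file storage. A sketch, with the storage class name assumed:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kv-cache                   # hypothetical shared cache claim
spec:
  accessModes: ["ReadWriteMany"]   # every inference pod, on any node, mounts the same cache
  storageClassName: px-sharedv4    # hypothetical shared-file class fronting FlashBlade
  resources:
    requests:
      storage: 500Gi
```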
      16:28
From Pure, obviously, operational efficiency: the same benefits you get from every other workload you've run with Pure are what you're going to get with AI, plus build-anywhere and self-service. That's where Portworx comes in. We have customers that go, you know what, as much as we'd love
      16:50
to help customers get Pure hardware, some of them are going to say, hey, to start out with this AI environment we're just going to rent some GPUs from Amazon or Azure. With Portworx, we build that platform so I can actually move data: OK, we built some models, or we built some
      17:11
proofs of concept that we really liked; let's move those on-prem. The reverse of that is, hey, we built some really nice models on-prem, and instead of the 32 GPUs I paid for here, I now need 1,000,
      17:33
and then Portworx will move that data up into the cloud and let you scale that way. Oh, I love build slides, by the way. So, the enterprise data cloud: has anyone heard that term yet this week? When it comes to delivering the things you see along the top,
      18:00
and I love the Hugging Face logo there, so professional, very AI startup, the question is how we get access to them. That data cloud is really where Pure will shine, whether it's through Fusion,
      18:21
like, hey, we have a fleet now and we have to manage all those things, or through Portworx, because we need to integrate into Kubernetes. Think of both as software layers that are going to drive efficiency for you. And then obviously there's the stuff we've spent so much fun and time on over the last 10 to 12 years at Pure,
      18:44
like our file replication. When I started at Pure there was no replication, no snapshots, and the biggest array was 5.5 terabytes. So everyone think about the latest Pure array you bought and what that is in comparison. But as we extend those things,
      19:04
Portworx Enterprise is the flagship of Portworx. It automates those connections on top; most customers are going to be on Red Hat, but you can see we support cloud, on-prem, and the edge. We see a lot of AI at the edge.
      19:21
One of the use cases is a grocery store chain we work with. They use AI at the edge to watch self-checkout, because everyone has to use self-checkout now, right? They actually watch via video and do analysis as you scan. The way the guy told the story is hilarious: people will take a ribeye,
      19:41
put it in their cart, grab a Kool-Aid packet, hold the packet in their hand, and scan its UPC code. The system watches and says, hey, he's holding a steak, that's $32 now thanks to inflation, and it just rang up as 79 cents. Something's not right there.
      20:02
So inventory control is still a very important thing in that environment. Then you add things like backup, being able to protect the data, because running AI doesn't mean the requirement for enterprise policies goes away. The business isn't going to say, oh, it's the new Kubernetes environment, it's OK that the cluster went down and the apps went offline.
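The Kubernetes-native building block underneath that protection is a volume snapshot; a minimal sketch with assumed class and claim names:

```yaml
# Hypothetical snapshot of the volume holding the RAG data; backup tooling
# layers scheduling and restore on top of this primitive.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: vectordb-nightly
spec:
  volumeSnapshotClassName: px-snapclass     # assumed snapshot class
  source:
    persistentVolumeClaimName: vectordb-pvc # assumed claim for the vector database
```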
      20:27
So having backup to be able to restore, and DR to be able to fail over in a timely manner, is the correct way to go about it. A lot of customers look at it like, hey, how do I build this in a way where I can avoid some of those problems? But at the end of the day, my soapbox is that the decisions you make now will
      20:49
determine how much tech debt you have as you get to scale. You're like, oh, I'm just running three servers, it doesn't matter. But once the boss sees it and likes it, they're going to say, well, let's put more things on it, and all of a sudden you're in a spot where you have to go to them and say,
      21:06
hey, we actually need a real solution to solve this. And it might not even be a data storage problem; it could be networking or load balancing, and we need a professional enterprise solution for it. If you're already way past getting into production, it's going to be painful, whether through cost or time.
      21:27
So the right decision now will help you get there. This is something we announced at Supercomputing last year: Portworx Data Services is a database-as-a-service platform built on top of Portworx, and one of the things we're looking at is adding MLOps capabilities to it. So getting models,
      21:51
getting databases that are tuned for AI, built right into it. It says coming soon; I just wanted to give everyone a preview. Hopefully everyone understands these things sometimes change, so you never know, but that's one of the things we're looking at, because more and more of our customers are coming to us saying,
      22:10
hey, we need faster ways to deploy these things, and we don't want to become experts in how to turn on pgvector in Postgres, or SQL Server's new thing. If we can curate that and bundle it for them, they'll take it. So as you do that, you create those development environments, and
      22:29
you want to be able to get value fast. One of the key things is to train the models in the cloud or on-prem, like I was talking about earlier. Avoid wasting the GPUs by getting access to the data you need, when you need it. High-performance storage for vector databases running on Kubernetes,
      22:46
because obviously, the faster the vector database can respond, the closer to real time your answer is. You'll see in my demo I have one GPU and a database somewhere else, so it responds pretty quickly, but to be transparent with everybody, you'll see a little pause there.
      23:07
I didn't speed it up to make it look fast and cut the seconds out. A data platform for self-service is the other important part: don't give everyone access to Kubernetes; give them a platform where they can do GPU as a service and just use templates to deploy apps.
      23:25
I was talking to a customer two or three weeks ago, explaining all the things we do with storage classes, being able to point data onto a FlashBlade, onto a FlashArray, or onto the Portworx data store itself, and they asked, well, does the customer have to choose all those things?
      23:42
I said, no, you're going to build them a database template with that already built in: this is a transactional-type data store, it goes on the FlashArray; this is a RAG repository, it goes over on the FlashBlade. The important thing is that we
      24:03
build those platforms and then hand them to the consumers, the users above us, so they can do what they need to, rather than going, hey, I put my database on this kind of storage, and, oh, that should have been on something else.
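What that template boils down to is pre-built storage classes that the platform team picks for the user. A sketch of the idea: the class names are made up, and the `backend` parameters follow my recollection of Portworx Direct Access, so treat them as assumptions:

```yaml
# Transactional databases land on FlashArray block storage...
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: db-transactional
provisioner: pxd.portworx.com
parameters:
  backend: "pure_block"    # assumed: FlashArray Direct Access
  repl: "2"
---
# ...while the RAG document repository goes to FlashBlade file storage.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rag-repository
provisioner: pxd.portworx.com
parameters:
  backend: "pure_file"     # assumed: FlashBlade Direct Access
```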
      24:22
Node and zone failures, those are always bad news, and then you run AI models under the same stack. That's the point of the enterprise data cloud: we've got to have the same stack running across the entire enterprise. We do have customers running these things,
      24:39
so you're not the first one. We're learning lots of things as we go, but we've also proven it through several solutions, including investment in being able to bypass the CPU and get lower latency. Has anyone used this, experimented with it, or even know what it
      25:01
is? This is basically a way we've worked out with NVIDIA to skip the CPU: NFS over RDMA. You tell it the protocol is RDMA instead of TCP, and it uses the bus to talk to the storage rather than waiting on the CPU to hand off every instruction.
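On the Kubernetes side, that protocol switch can show up as nothing more than mount options on a persistent volume. A sketch with a placeholder server address; the option names follow common NFS-over-RDMA convention, not a specific FlashBlade document:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: flashblade-rdma            # hypothetical name
spec:
  capacity:
    storage: 10Ti
  accessModes: ["ReadWriteMany"]
  mountOptions:
  - nfsvers=3
  - proto=rdma                     # RDMA instead of TCP; needs RDMA-capable NICs end to end
  - port=20049                     # conventional NFS-over-RDMA port
  nfs:
    server: 192.0.2.10             # placeholder data VIP
    path: /training-data
```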
      25:28
So that's a pretty cool thing. It's one I don't get to test every day, because they don't just hand my lab 100-gig Ethernet and all the fun things that go along with it. On solutions: I was just in a panel,
      25:49
I think a couple of you were in it, where we talked about the GenAI Pod for finance. What that is is a turnkey, start-small-and-scale offering that comes with the things you need, compute, GPUs, and networking, to get you started quickly. I think that's officially GA today,
      26:15
so everyone can order five of those; that'd be awesome. We have certifications with NVIDIA. I think that's where a lot of people are going to start: hey, who is
      26:29
partnered with NVIDIA to get the whole software stack working together? We've done those things. And this is the big one this week. I think they have a mockup of it; I went over there and said, you guys have the EXA here, and I saw it
      26:47
was partly stickers, which is fine; I won't criticize those guys, they can't exactly ship this stuff around the country easily. But when I look at this and think about where we've come from to get to 10-terabytes-per-second kind of performance, it's pretty amazing, right?
      27:06
And of course, because it's FlashBlade, with the same Purity APIs, Portworx integrates with it right away. So think about that. Now, as we get closer to the demo, I need pictures to see how it all works. Obviously all the marketing slides are
      27:24
cool, and you get a little bit of stats, and I try to sprinkle in enough stories to tell you how people are using it. But the way I picture this is: in a Kubernetes environment you have a control plane, don't run workloads there, and then you're going to have something like AI nodes and compute nodes.
      27:40
Depending on where you start, you're not going to run every workload on your very expensive GPU servers, so you're going to have ways to segregate those and schedule them accordingly.
      28:01
      that runs on. Non-GPU powered servers. And then obviously, the AI model, the meta, the meta llama 3 that I'm gonna run in the back end, it runs over on the AI nodes, Portworks puts storage within each one of those nodes, but also attaches us to the flash blade. For those types of workloads and this is my favorite part is we're gonna talk a little bit
      28:25
And this is my favorite part: we're going to talk a little bit about barbecue. This environment is barbecue as a service, PX-BBQ. If you see any of our technical marketing engineers who created the code for this, make sure you
      28:41
ask them for some Frank paraphernalia; I don't know if there's any left. Frank is the fictional owner of this fictional barbecue company. No pigs are harmed in the making of this, but it is a front end that attaches to MongoDB and lets you order barbecue.
      29:01
But me being one of the pit masters of the barbecue company, I can't always stop making barbecue to go answer questions about what's on the menu. So we look at how we can take something like Llama plus a vector database, I used Neo4j, and there are a thousand other choices out there,
      29:22
so it's not a promotion of one over another, and build a bot that can retrieve that data and give us an accurate answer. Any questions so far? Now we're going to see how the video looks on this. Like any Kubernetes demo, it starts with the CLI, because that's where the cool kids go
      29:43
to run stuff in Linux. So I'm going to apply the front end: some services that deploy the database and a web front end. As you saw in the picture, it's a replica set of three to make the web front end scalable, with just a small MongoDB in the back end.
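The manifests behind that apply boil down to a three-replica Deployment plus a Service; a simplified sketch, not the actual PX-BBQ source:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bbq-frontend               # hypothetical name
spec:
  replicas: 3                      # the replica set of three behind the web front end
  selector:
    matchLabels: {app: bbq-frontend}
  template:
    metadata:
      labels: {app: bbq-frontend}
    spec:
      containers:
      - name: web
        image: registry.example.com/px-bbq:latest  # hypothetical image
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: bbq-frontend
spec:
  type: LoadBalancer               # "kubectl get service" hands back this URL
  selector: {app: bbq-frontend}
  ports:
  - port: 80
    targetPort: 8080
```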
      30:04
In real time, there we go, get the service, and we're going to point our browser at that. Boom. Oh, we should probably make sure it's running too; the URL is nice, but if those pods aren't running,
      30:27
it's not going to work very well. Everything's running, beautiful. So there's Frank, and we're going to order our barbecue: brisket, French fries, coleslaw.
      30:43
Let's switch to cornbread, and the pork special. Remember what category the pork special was in: it was a drink. Just remember that. So we've placed our order, and we can go back to home.
      31:00
Now we're waiting for our order. You can see down at the bottom I have a chat link; it isn't going to work until I deploy this part. This is the actual AI demo: it deploys the vector database, a Python agent that lets me grab PDFs and vectorize them,
      31:20
and the code behind that chat button. Once that runs, one of the other big things about running this stuff is that the Ollama container is pulling down that Llama 3 model, which is 8 gigabytes because I'm using the smaller one; there's actually something like a 45-gigabyte one.
      31:39
If every developer in your organization were pulling that down over and over again, that would be bad. So what we want to do is put it somewhere everyone can share, something like a FlashBlade, then point all my code at that one place, or download it internally rather than pulling it over the internet again and again.
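One way to express that shared model store, assuming Ollama as the runtime (OLLAMA_MODELS is Ollama's environment variable for relocating its model directory; the claim name is made up):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama
spec:
  replicas: 1
  selector:
    matchLabels: {app: ollama}
  template:
    metadata:
      labels: {app: ollama}
    spec:
      containers:
      - name: ollama
        image: ollama/ollama:latest
        env:
        - name: OLLAMA_MODELS      # point Ollama's model directory at the shared volume
          value: /models
        volumeMounts:
        - name: models
          mountPath: /models
      volumes:
      - name: models
        persistentVolumeClaim:
          claimName: shared-models # hypothetical RWX claim on FlashBlade; pull once, share everywhere
```

So the model gets pulled once, and every pod or developer namespace that mounts the claim reads the same copy.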
      31:58
Especially when I did this: it took me several hundred iterations to get the demo right, and think about your developers doing lots of different iterations. So we're going to ask Frank. There's Frank: how can I help you today?
      32:15
What is the pork special? Everyone remember, it was a drink. And of course he creates a great answer; Frank is very creative. The crown jewel: it's got onion rings in it, barbecue sauce,
      32:27
"you know, Frank, that sounds like a whole lot of pork, and you know what, you'd be absolutely right." It sounds like a super awesome drink, if anyone's paying attention. So let's give it some context. That's the hallucination,
      32:43
right? If you don't give it context, it'll make up the answer based on his personality, because we tell Frank: you work at a barbecue place, you do this, here's your menu. And Frank's going to go and create an answer.
      33:01
So what we're going to do is upload an FAQ on Portworx Barbecue. Once that's uploaded and vectorized, you can see it running up there, and through this interface I can ask it directly. That has none of the Frank personality, so I should get basically the straight answer on what the pork special is.
      33:23
According to the FAQ, the pork special is a secret recipe known only to our pit masters and considered highly classified information. It is vegan. So that's good, right? I would think most of my drinks should be pretty vegan. I mean, there are some that aren't, but definitely not ones with pork
      33:39
or onion rings or barbecue sauce in them. So let's go back to Frank. It's this easy, because Frank's already talking to the model. Now we say, hey Frank, what is the pork special? And he gives you: let me tell you, oh, it's a secret,
      33:55
but it's vegan. So we get the gist of it, but of course Frank adds his personality to it. We can keep asking, and we'll see it eventually break down. Do you have barbecue tacos? This is one of my favorite things.
      34:17
I've shown this demo enough times that they've added barbecue tacos to the menu on the newest version. So, we don't have them, but I say, hey, can you ask Eric? Eric Shanks is the guy who created the app. Can you add it? And of course, at some point along the way, Frank forgets what he's talking about.
      34:39
He starts talking about barbecue tofu. But if I had put the rest of that info into the FAQ, we would have gotten a better answer. So obviously, as we fine-tune these things, we just keep adding more and more data to make them better.
      34:58
He does promise we're going to add barbecue tofu to the menu, "but don't tell anyone else I'm willing to try new things." That's kind of funny. And that is the end of the demo. How are we doing on time? Perfect, we have a few minutes for Q&A. Does anyone have any questions or anything else
      35:19
you want to know? I did this demo because it's kind of funny, but it's also something people really do. There we go. Oh yeah, ROSEN is a Portworx and Pure customer.
      35:37
They inspect basically every pipeline in the world. It's actually really cool: they send a device down the pipeline and it measures all kinds of things about what's going on in those oil or gas pipelines, and they're able to tell,
      35:56
using AI to synthesize it all: hey, repair these things; you're within certain specs; or, a.k.a., turn this off because you're going to start shooting oil into the wilderness. All kinds of cool stuff they do with that. It's a great use case.
      36:13
If you want to read more about it, search on our website and you'll find the whole use case with more details about what they do. Oh yeah, and if you don't have those socks, you should probably go get them, because they're probably going to go fast.
      36:33
I think they say "ctrl-alt-deleted" or something on them. So those are available. Does anyone have any questions? I think that's really what I was trying to get to.
      • Artificial Intelligence
      • Portworx
      • Data Analytics
      • FlashBlade
      • Pure//Accelerate