00:00
Hi everyone. My name is Casey Lai, Vice President of AI at Pure Storage. Really excited to have in the studio today Jacob Lieberman from NVIDIA. Jacob, very, very nice to see you again. Thank you. Yes, would you mind telling us a little about yourself and what you do at NVIDIA?
00:18
Yeah, so my name's Jacob Lieberman, and I'm a director of enterprise product at NVIDIA and of a new initiative we launched. We have named it the AI Data Platform. Very cool, very exciting. Well, Jacob, the first thing I want to talk about: I think everyone, when they think of AI, pictures the coolest copilots, right? The biggest and baddest models, and they
00:36
often forget about the data. And so I’d love to talk to you about that. Why do you think data matters for AI? What's your perspective on this? Well, this is one of my favorite subjects, Casey. So, yeah, there's been a massive rush of enthusiasm around gen AI and all of its capabilities. But somewhere in the midst of it all,
00:56
we lost sight of the fact that data is still king. So whether you're training a model, fine-tuning a model, or retrieving additional context through RAG to inform your LLM generations, you need secure access to high-quality data. Basically, you don't want the garbage-in, garbage-out problem and crazy hallucinations.
01:19
Right? That's true. All right, right. Well then, in addition to that, I think the performance also matters, right? Because you need to ensure the data gets to the GPUs fast enough so you don’t have idle GPUs.
01:31
Nobody likes that. Right, so without GPUs, it's really not possible to prepare data for AI at scale and to keep up with the velocity of the data: the rate at which data changes and the rate at which the data grows. So, that's number one. Number two, you know, building these pipelines to make data AI-ready is complex,
01:55
and they have many stages. There are many handoffs between different personas and users of the data. And at any moment during one of those handoffs, somebody could drop the ball. Totally. Yeah, I think I see this as probably one of the biggest challenges that's getting in the way from an AI inference
02:12
perspective. You know, up to now, most workloads have been training, so people have been really, really focused on that. But now, it's going to shift to where most of the time, effort, and money is actually going to be focused on inference. And so, the reason why it’s interesting is because of what you just said.
02:28
The minute you get to inference, right, you can only get good inference and good consumption if the data is actually AI-ready. Well, there are many challenges. I mean, first of all, enterprise data is unstructured. Ninety percent of the data an enterprise acquires is unstructured in nature.
02:46
And there are many modalities: video, audio, text, PDFs with graphics, images, presentations, spreadsheets. Combine those things, and it becomes quite challenging to extract insight from the data. Right? Well, if you can't even get the data to be AI-ready, you're not getting any insights, right?
03:08
I think that's definitely key. And so, I think that's why we're very excited about what we're doing here at Pure Storage. We announced at GTC last week the introduction of a new product called Pure Storage Data Stream, where we are specifically focused on this challenge. So, the first thing that Data Stream is going
03:27
to do is address that specific area, so that you get one workflow, one product that's going to automate the whole process to actually generate data sets for AI in minutes. Second, what we're going to do is make sure it's super easy to consume the output and put some governance around it. Right? What they should use and what not to use,
03:51
who can see, who cannot see: those types of things should be in there. And then third, we're entering an age where we have agents. Right? Agents are part of our digital workforce. So you have to now think about how agents are going to consume. I think these are very important capabilities in Data Stream,
04:07
right, to accelerate and simplify the process of making data AI-ready. Right? And so, you can think about Pure Storage as really taking an active role to be there for the customer for every step of their AI journey. And what I love about it is that it's all centered around the data.
04:25
Which is really the core competency of Pure Storage: protecting that data. And then it builds on top of that, but data is always at the core. Jacob, it was a pleasure having you at the studio. Thank you so much for doing this. We really enjoyed it. Had a blast! Yeah, me too.