00:00
Why don't we get started? I'm Bill Lynch, and I have the pleasure of leading our life sciences and genomics subvertical. This is a topic very near and dear to our collective hearts up here: how do you simplify and speed up genomics pipelines using the appropriate storage, which in this case, obviously, is the Pure Storage platform? So I'm joined by two
00:31
wickedly smart people. I'm gonna let Craig and Vikash introduce themselves. Craig, you wanna go first? Here we go, can you hear me? Everyone hear me? Yeah, I'm Craig Kelly. I am with
00:54
McMaster University. I'm the infrastructure manager for the Faculty of Science, and my job is to help support faculty and researchers in their day-to-day lives. All right. So my name is Vikash Roy Chowdhury.
01:16
I'm a director of product management, focused on high performance computing and AI, where genomics is obviously a fast-growing area. HPC is already there, and we already have genomics workloads on HPC, but AI is something new and is really ramping up big time in the genomics space.
01:44
Great, and you're gonna hear from both of them. You're not gonna hear from me all that much longer because, as I said, these are the two really smart guys. It's a great story from McMaster. One of the things we're really cognizant of is how our solutions improve
02:02
outcomes. It's great to have infrastructure stories, but at the end of the day, what did it do for that researcher? What did it do for that bioinformatician? What did it do for any of the researchers? So one of the things I wanna leave you with before
02:21
I've done my part is this. It's the drum I bang all the time: typically researchers have little things squirreled away; you can call them shadow IT, shadow AI, shadow genomics, but you will typically see underperforming storage, and that's one of the reasons why. When we move from underperforming storage to flash storage, to Pure, we
02:50
see increases in speed of up to 24 times, which McMaster has seen. So storage is not storage is not storage. I've had people tell me, oh yeah, we went out and bought a Seagate hard drive from Costco, and it's like, no, no, you don't want that for your work. You want a solution, a platform that was specifically engineered for HPC and
03:18
AI workloads, and that's one of the reasons why our FlashBlade platform works so well with genomics workloads. One other thing: when you talk to the researchers, and Craig was just talking about this, one of the things that's really good about McMaster is that in the McArthur Lab the researchers and IT maintain
03:46
continuous communication, so the researchers are relaying their needs and IT is delivering on those needs. But typically what we hear from the researchers is: I just wanna go faster and I want my data to be easier to work with. I don't want to be a data janitor; I want to be a data scientist. They'll tell us, I don't really
04:12
care about storage, and we tell them you should care. But that's one of the stories we're gonna hear today from Craig: how do you bridge that divide between IT and the researchers, the folks actually doing the work? So I'm gonna skip this slide, and with that I'm gonna turn it over to Craig and let him tell us what's going on at McMaster.
04:39
Yeah, thank you, Bill. So really, at McMaster University we started a research cluster about 7 or 8 years ago now. I had a lot of researchers coming to me saying, hey, how do I speed up my performance? As Bill alluded to, they had a computer under their desk that they did their research on and
05:04
couldn't understand why it took forever to process their genome sequence or whatever they were working on. So I worked with one of our researchers, Andrew McArthur, and we said, hey, you know, there's gotta be a better way. He came to me and said, how can we make this better, and what can we do here at McMaster to help? So we
05:28
came up with the idea that we would build this cluster, and I'm actually one of the founding members. We've finally made it official: we have a facility now called the Advanced Computing Facility at McMaster University that launched May 1st, so I'm super excited about that. And it really
05:47
evolved over the years. We started with 5 PIs bringing their hardware to us and saying, hey, how can we do this better? Where we ended up is that we're now at 42 PIs in our cluster, and the cluster has actually grown quite a bit. We have 633 CPU cores, we have about 11 terabytes
06:16
of memory for that compute, we have 8 GPU cards (I don't know how many cores, quite a large number), and about 1.5 petabytes of storage that never seems to slow its growth, which is kind of crazy, but you're here to help us with that.
06:38
So in that cluster we have some high performance compute through Hewlett Packard Enterprise and through Cisco, and then Pure Storage is the big piece we're talking about here today. Our Apollo allows us to do some really cool AI stuff that we'll talk about later in the slides.
06:59
Our Superdome does some really crazy high performance analysis of genomes, and then obviously FlashBlade is a differentiator. All right, so how did we get here, and what does FlashBlade bring to the table for us? Sorry, I'm flipping through my notes here to keep myself on track, because otherwise I could talk about this for a week, and I'm standing
07:30
between you and lunch, so I don't wanna do that. So really, like I said, we have this huge research initiative at McMaster. We're one of the top research institutions in Canada; for research dollars we're actually the leader in Canada, which is pretty cool.
07:50
We brag about that all the time. There are a couple of universities that are bigger in total dollars, but dollars per researcher, we're number one, so they can suck it. But really, at the end of the day, we wanna make sure that we're changing the world with our research, and how do we do that?
08:07
So I'll talk to you today about some of the large genome data sets that were brought to us as we came up with a solution. I'll talk about how we used it to help the Ontario government with COVID variant response and tracking, and, more recently, how we've used it for AI drug discovery, which is again super, super cool.
08:38
All right. So, real-world impact. That's what everyone wants to know: how has FlashBlade really changed what we do and how we do it? At the end of the day, that's been the biggest piece for us. We were fortunate to have a large grant, and it included some storage. I was talking with Brent, saying, OK, how do we,
09:07
how can we leverage this grant to really revolutionize our research? And this is the problem we were having: a genome sequence would take a week, realistically one week, to process on one of these shadow-IT computers underneath the desk. So we approached it and said, OK, a lot of these jobs are single-threaded and
09:33
memory-intensive, so let's throw a heavy CPU box with lots of memory at it and that'll speed it up. And it did, a little bit; maybe we got to 5 days. We said, OK, how can we really change that? Talking with Brent, FlashBlade was brand new at the
09:53
time, just about to launch. I think we were one of the first in Canada to actually get a FlashBlade, which is a super proud moment for us. We put the FlashBlade in, and Brent was like, look, Pure is here to earn the business; we wanna make sure this works for you. It's try before you buy, 100%.
10:13
If this thing doesn't work, we'll pull it out, you'll never hear from me again, no problem, no worries. I said, OK, great, let's try it out. We put it in, our researchers ran their genomic research model against it, and it was literally done in like half a day, like 4 hours.
10:33
We went from a week to 4 hours. Our researchers came back absolutely flabbergasted, with no idea how this was possible, and really it was through the power of what FlashBlade brings. The highlight here: this change was not incremental, it was transformative, and for us there's no other way to state it. We really went from a week to hours, and what that
10:57
allowed us to do is, well, I'm getting ahead of myself. One of our dean's goals was to really change the way health care and research work together. How we approached that: antimicrobial resistance, AMR, is kind of a huge topic right
11:25
now in research and health care. We have all these pathogens out there that are becoming more and more resistant to the medications that we have, and in our hospitals in the city of Hamilton, where we are located, we have patients that are dying because the medications that we can provide to them just don't work anymore.
11:50
So we've partnered with our hospital institutions to actually take the pathogen or condition the patient is struggling with, maybe a cancer, maybe some sort of disease, and say, OK, how can we treat it with medication? The previous way we would do that is we'd give them a medication, try it for a couple of months, and see how it worked. Well, in critical patients, that's
12:18
time they don't have. So what we're able to do now with FlashBlade is actually take that scan from the patient, introduce the medications through a slide scan, and see what the results might be in hours. So now, at the end of the day, we can run that research through to see what it looks like, how a
12:45
patient might react to certain drugs. Maybe there's a drug chemical we can introduce that'll help save them from where they are today and maybe give them a better outcome. It's really something that's pretty awesome for us; it's allowed us to really help and transform the people who come to our hospitals.
13:10
And then COVID hit and the world stopped. McMaster, as a leading research institution in Canada, said, hey, how can we help? So we actually went to our partners: we went to Hewlett Packard, we went to Cisco, and we went to Pure, and we said,
13:27
hey, this COVID thing is pretty detrimental. We would like to do something about it as a research institution, but we need some more hardware. And Pure was actually able to step up to the plate and provide us some additional storage for FlashBlade at no cost to us, which was amazing, and, you know,
13:48
definitely a thing that they do all the time. It's going on in the expo right now, where they're doing a good thing for a local organization. So I'm happy that we were able to benefit from that, but what it allowed us to do is actually track the variant population: how variants were coming in, where they were being
14:08
brought in from, and the spread of those variants across the COVID pandemic. The Ontario government came to us and said, hey, we don't have the hardware to do this; can we use your cluster? And we, of course, being good corporate citizens, said absolutely, with our partnership with Pure we can actually make this happen in real time.
14:31
And of course COVID was such a crazy thing to start with. No one really knew where these variants were coming from and how they were able to mutate and move through society. So at McMaster we were able to do those scans, several a day, of all the
14:52
different tests that were coming in, to figure out all the different variants, the Alpha and the Delta, and once Omicron hit, of course, things were absolutely crazy. It really allowed us to, and we actually still do, monthly scans for the government to help assist them in their COVID tracking, initially until they could get
15:15
their hardware up to date and do the scans themselves. So that's something really cool that we're able to do with FlashBlade. All right, so this is our shameless plug. On the screen here is a link to
15:37
a researcher who just recently came over to McMaster, Dr. Jonathan Stokes. He was able to use our research cluster and our FlashBlade to study with AI, and this was my first actual introduction to using AI for research, and it came out better than we could have ever anticipated. He was actually able to
16:04
take, through a large language model, a bunch of chemicals and how those chemicals work, and a bunch of microbes and how the microbes work, and then actually wrote some AI code to say, this is how I want them to interact, and for the first time ever generated a new drug to use against a microbe that had never been able to be attacked,
16:35
and used that to potentially eradicate this microbe, which is absolutely amazing. Out of that he created a company called Stoked Bio, which is now going to test that chemical against that microbe and then hopefully do a lot of further testing and new drug discovery, which will really put McMaster at the forefront of AI and
17:03
research, so I'm super excited about that. And that, in a nutshell, is McMaster University and our approach to research in high performance computing. Craig, thanks. We can hold questions to the end, or does anybody have questions now for Craig?
17:29
Anything now? OK, well... I'm curious about the antibiotic. What is it targeting? It's a very specific pathogen; I don't remember the name. It's something I couldn't pronounce anyways. I'm an IT guy, not a researcher, but it's a very specific pathogen that
17:49
he's targeting, and the drug is, uh, abaucin, I think. I'm sure I'm saying that wrong as well. Yeah, really cool. Great, a question for you: when people are starting projects, are they coming to you more often?
18:10
Yeah, definitely. That was kind of the biggest hurdle for us: getting researchers to come to us first. So pre-grant, instead of buying a computer and sticking it under their desk, how do we get them to come to us and say, hey, I wanna be part of your cluster? Like I said, it started off with 5 researchers all saying, hey,
18:35
we've got some money and we've got some funds; what can we do, how can we do it better, and if we pool, what does that look like? We're up to 42 PIs right now in our cluster. Great, OK, we'll have some more for you at the end. OK.
18:55
Vikash Roy Chowdhury is gonna lead us through some updated feature, function, and performance with the platform, so Vikash, take it away. Yeah, actually I found very fascinating what Craig just mentioned about how the researchers reached out to the cluster and how they were doing all of this. The first thing that comes to my mind is when you do
19:19
research. And going through this phases of the research process, right? You do from primary, secondary, tertiary. When you go through this process, process doesn't change. You could be using any software for that. But the biggest thing that I really uh uh I connected with what Craig was mentioning is
19:35
when you have so many different pathogens and different projects running, how do you make them work seamlessly in different tenants on the storage, so that it's no bother, like running Coke and Pepsi together, right? They're running completely independently on the same storage, in the same namespace. So then you have a logical partitioning between
20:03
these projects where you carve out the storage capacity that you need, and you can set the performance parameters. For example, if one project is more critical than another and needs more storage resources, how do I set my quality of service to perform better for that project over the others? And I can quickly switch over if another project has a different requirement.
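To make that idea concrete, here is a minimal, purely illustrative sketch of per-project tenancy and QoS in Python. The class and field names are hypothetical and are not the Pure Storage management API; the point is only that each project gets its own carved-out capacity and performance ceilings, and that one project can be reprioritized without touching the others.

```python
# Illustrative only: a toy model of per-project partitioning and QoS,
# not the Pure Storage management API. All names here are hypothetical.
from dataclasses import dataclass

@dataclass
class ProjectTenant:
    name: str              # e.g. one lab or study, isolated from the others
    quota_tb: float        # capacity carved out of the shared namespace
    bandwidth_gbps: float  # QoS ceiling on throughput for this tenant
    iops_limit: int        # QoS ceiling on small-file operations

def reprioritize(tenants, critical_name, boost=2.0):
    """Give one critical project more headroom without touching the others."""
    for t in tenants:
        if t.name == critical_name:
            t.bandwidth_gbps *= boost
            t.iops_limit = int(t.iops_limit * boost)
    return tenants

tenants = [
    ProjectTenant("amr-surveillance", quota_tb=200, bandwidth_gbps=5, iops_limit=50_000),
    ProjectTenant("covid-variants",   quota_tb=100, bandwidth_gbps=2, iops_limit=20_000),
]
reprioritize(tenants, "covid-variants")  # e.g. during an outbreak response
for t in tenants:
    print(t)
```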
20:34
Imagine these things happening natively in the storage layer. That gives you so much flexibility to add more projects, and you don't have to worry about it, because, as Bill rightly said in the beginning, data is important, access to data is important. The data scientists don't want to be data janitors, just like Bill mentioned earlier.
20:57
Why would they care? Why doesn't the storage take care of all of that for the end user? That end-user experience is what we're trying to deliver in this entire workflow, starting with simplicity. As Craig pointed out, he's been using FlashBlade right from the beginning, when we first started to ship FlashBlade.
21:16
They were early adopters and are still with us. That means the storage has some merit, and the changes you've already heard about, the new revisions, the announcements and launches we did at this event, the new generation is just going to add more.
21:35
It's just making your experience better as you start to scale the number of projects. Now, the reason I have this slide is that we talked about speed, but another key thing that is very important is that these phases have very distinct workload characteristics. As Craig was mentioning, somebody can buy a hard drive, stick it under their desk, run on it,
22:01
and wonder why it's taking such a long time. The reason it takes so long is that, say, you're doing demultiplexing, BCL to FASTQ; that needs high concurrency, and DAS or local storage is not designed for that. So that's really why storage like ours has been running successfully in the McArthur Lab's research center: there are distinct characteristics to each phase of the workload.
22:34
For example, when you go to sorting and deduplication, that is a highly sequential workload with heavy metadata. So you see the diversity of the workload, and you can't have one generic kind of storage doing everything; it's not like a Swiss Army knife.
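As a rough sketch of those two shapes, the snippet below (Python, with a stand-in function instead of a real BCL-to-FASTQ tool) shows why demultiplexing hits storage with many concurrent small operations, while the sort/dedup phase looks more like sustained sequential streaming. It is illustrative only; the sample count and worker count are arbitrary assumptions.

```python
# A minimal sketch of the two workload shapes described above; the per-sample
# "demultiplex" step is a stand-in function, not a real BCL-to-FASTQ tool.
from concurrent.futures import ThreadPoolExecutor
import time

SAMPLES = [f"sample_{i:03d}" for i in range(96)]  # e.g. one plate's worth

def demultiplex(sample: str) -> str:
    # Stand-in for per-sample BCL-to-FASTQ conversion: many concurrent,
    # relatively small reads and writes, so the storage sees high IOPS
    # and heavy metadata traffic.
    time.sleep(0.01)
    return f"{sample}.fastq.gz"

# Phase 1: high concurrency. All samples hit the filesystem at once,
# which is exactly what a single local disk under a desk handles poorly.
with ThreadPoolExecutor(max_workers=32) as pool:
    fastqs = list(pool.map(demultiplex, SAMPLES))

# Phase 2: sorting/dedup is closer to one large sequential stream per file,
# so the same storage now has to deliver sustained read/write throughput.
for fq in fastqs:
    pass  # stand-in for a streaming sort/markdup pass over each file

print(f"demultiplexed {len(fastqs)} samples")
```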
22:55
So you need storage that is capable of handling your IOPS, which means small files and high metadata, and also high throughput when you're doing a lot of reads and writes. That is where FlashBlade really shines, because yes, you only see the performance, but behind the scenes the reason we are successful is that we've gotten close to these kinds of workloads and how to handle them. And if
23:21
you look at the bottom here, the role it plays is simple: it is scalable, it has high performance, which you've already heard evidence of from Craig, but what we're also trying to do is data consolidation and future-proofing for AI, for when you're doing clinical research and drug discovery, which is what is picking up pace in the market right now.
23:42
How do we handle that? Why do you have to have another silo of storage? Why can't we just use the same data set from the core data, because the analysis part and the clinical research could be merged together on a standard data platform? You could have different processes, but why would you move your data?
24:04
Keep in mind that the volume of data generated through sequencers and through secondary analysis is in petabytes, depending on the size of the organization and the number of samples you are actually analyzing. So in that high-volume scenario,
24:22
how do you handle moving data between silos? You are now saving time by running jobs a lot faster, but at the same time you're spending time moving data between different silos. How does that work? So the solution is, yes, this storage is capable of handling different kinds of diverse workloads.
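To put some rough numbers on the cost of moving data between silos, here is a back-of-the-envelope calculation. The link speeds and the 70% efficiency factor are illustrative assumptions, not measurements from any particular site.

```python
# Back-of-the-envelope only: how long copying a silo'd dataset takes.
# Link speeds and efficiency are illustrative assumptions, not measurements.
def transfer_days(dataset_tb: float, link_gbps: float, efficiency: float = 0.7) -> float:
    bits = dataset_tb * 1e12 * 8                      # dataset size in bits
    seconds = bits / (link_gbps * 1e9 * efficiency)   # effective line rate
    return seconds / 86_400

for tb in (100, 1_000):                               # 100 TB and 1 PB
    for gbps in (10, 100):
        print(f"{tb:>5} TB over {gbps:>3} Gb/s ~ {transfer_days(tb, gbps):5.1f} days")
```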
24:44
Why not just keep the data where it is? We already have certifications from NVIDIA for the various AI integrations, like the SuperPOD certification and NCP, on the same platform. If you're already using the same platform, why not throw that workload onto it? And I can admit, every piece of hardware has its limits.
25:05
So instead of a single chassis you may need three chassis; it cannot be just one single chassis serving everything for you, but we always have the means to grow the physical footprint so that the data doesn't have to go anywhere. What that means is that you have more control over your data; apart from giving you performance,
25:25
we're also giving you more control. Control means auditing, compliance, security, and data recovery from any kind of disruption. You're talking about COVID times; we are having cyber attacks. You're losing your data, your data is locked up by some ransomware.
25:46
How do you recover from those things? Those are the things that make a big difference. And this is the slide I was talking about: on the left-hand side is all the sequencing that you're doing, and on the right is where you're doing the folding and drug discovery. This is all the research work.
26:02
What I'm trying to say is that you don't have to have different silos of storage. The same data platform, with an integrated solution with the applications upstream, can really do the job for you. All you, or the data scientists, care about is accessing the data. Now, this is a new product we have.
26:25
All right. So we have a hybrid model now. It's called Zero Move Tier, and I think it's an excellent product for genomics workloads, because with the high volume of data being generated, you don't have all of the data actively used all the time.
26:52
You're probably using 20% of the data; the remaining 80% could be sitting idle. Why would that sit idle on high-performance flash when you don't need it? But at the same time, researchers definitely want access to that data, maybe a week later. For that week, your TCO is going to be higher. Why not drive the TCO down?
27:14
So that's where you have high performance computing on the FlashBlade//S500 blades in the chassis, and underneath that we have expansion chassis for the cold data. So you literally have all your data in one namespace; if you run out of space, just add more blades. That's all you care about, because the data doesn't have to move
27:39
until you decide, OK, I'm done and dusted with this project, I'm not going to touch it, I'm going to archive it for compliance reasons, and that is where you can move it to our FlashBlade//E for long-term retention.
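As a toy illustration of that hot/cold split (roughly 20% active, 80% idle), the sketch below classifies files by last-access age. This is not how Zero Move Tier works internally; the array handles placement transparently within one namespace. The one-week window is just an assumption borrowed from the example above.

```python
# Toy illustration of the hot/cold split described above. NOT how Zero Move
# Tier is implemented; it only makes the 20/80 idea concrete. Note that many
# filesystems mount with noatime, so treat access times as approximate.
import time
from pathlib import Path

HOT_WINDOW_DAYS = 7   # assumption: data untouched for a week counts as "cold"

def split_hot_cold(root: str):
    now = time.time()
    hot, cold = [], []
    for p in Path(root).rglob("*"):
        try:
            if not p.is_file():
                continue
            age_days = (now - p.stat().st_atime) / 86_400
        except OSError:
            continue  # skip files we cannot stat
        (hot if age_days <= HOT_WINDOW_DAYS else cold).append(p)
    return hot, cold

if __name__ == "__main__":
    hot, cold = split_hot_cold(".")
    total = len(hot) + len(cold) or 1
    print(f"hot: {len(hot)} files ({100 * len(hot) / total:.0f}%), cold: {len(cold)}")
```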
28:02
So if you look at it from a data lifecycle perspective, not only are you speeding up the data coming out, but the time to move data between different phases is also eliminated, not reduced, eliminated. That's the value you're getting, and these are some real results that we're seeing from customers and from testing that we have done internally.
28:22
We see about 50% faster, which, as Craig already mentioned, obviously gives you a lot of speed, but there's also about 25% on TCO. That's ZMT, Zero Move Tier, a new product which is an absolutely good fit for genomics given the volume of data being generated. Another thing we have is for when you have bigger sites:
28:46
you can actually start to collaborate. Say you have a genome sequencing center and a genome analysis cluster at two different sites. You know what a FASTQ file is; it's a big one.
29:02
What do you normally do? FTP it, copy it over. You don't have to. You just move the file on demand: only the metadata is copied over. The file is physically not there; just the metadata is mirrored between the two sites. Whenever the other site starts to access the data and the data is not available locally,
29:22
it just pulls the data on demand. So what happens is you're not only reducing the network bandwidth between the two sites, but at the same time you're getting the data scientists over in the other region up and running a lot faster. You don't have to replicate everything;
29:39
you just move the data on demand. What that means is, if you have global or regional teams, you've got HIPAA zones, so you don't want the data to leave for a different site. You don't have to move the entire data set; you process a FASTQ file, and the
30:01
VCF and BAM files that are generated don't have to come back. They can still sit in that region, and you don't have to move the data back. If you choose to, you can, but normally for HIPAA zones and all of the compliance restrictions you have, you don't have to move the data.
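Here is a conceptual sketch, in Python, of that "mirror the metadata, pull the bytes on first access" idea. The class and method names are hypothetical and not a Pure Storage API; it exists only to show why the second site can browse the full catalog while the network carries data only when a file is actually read.

```python
# Conceptual sketch only: "mirror the metadata, fetch the bytes on first
# access." Class and method names are hypothetical, not a Pure Storage API.
class RemoteStub:
    """What the second site sees before anyone opens the file: metadata only."""
    def __init__(self, name: str, size: int, source_site: str):
        self.name, self.size, self.source_site = name, size, source_site
        self._data = None          # no bytes transferred yet

    def read(self) -> bytes:
        if self._data is None:     # first access triggers the on-demand pull
            print(f"pulling {self.name} ({self.size} bytes) from {self.source_site}")
            self._data = b"\0" * self.size   # stand-in for the actual transfer
        return self._data

# The sequencing site "replicates" a FASTQ to the analysis site: only the
# stub (metadata) crosses the wire until someone actually reads it.
catalog = {"sampleA.fastq.gz": RemoteStub("sampleA.fastq.gz", 4_000, "sequencing-site")}
_ = catalog["sampleA.fastq.gz"].read()   # bytes move only now, on demand
_ = catalog["sampleA.fastq.gz"].read()   # second read is served locally
```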
30:24
We provide all of that capability natively on the storage. You set it up one time and you're done; unless you really want to go in and make changes, if everything is working, just leave it there. So we're not only providing secure multi-tenancy, so we can run multiple tenants for the different projects you're spinning up; at the same time we're providing quality of service, which means that if one project needs more resources than another, you can set that up
30:44
automatically, one time, from the UI. Then you have the ability to tier your data without moving it, which means you're providing dynamic SLAs. And the third one is replicating across sites on demand. So you can do so much with your data, seamlessly, between different sites,
31:07
so that the end-user experience becomes a lot easier. That's the important part, because they don't have to worry about the physical hardware; they just need access to the data, and that's exactly what we are providing here. So this is the last slide I'd like to conclude with. From all the various testing and customer experiences put together, in a nutshell, this is what we have gathered so far, and our
31:33
work is still in progress, because with the new generation of platforms that just launched, we still have a lot of work to do; this is all from our existing generation of platforms. So if you look here, we are not only 37.5x faster, we can also have 35x more samples analyzed. That means you can run samples
31:55
in parallel; that's the scalability we're providing. And then obviously all of the other Pure value that we offer, OpEx versus CapEx, Evergreen, all of that is also available. All right. Well, Vikash, Craig, thank you. Questions from the group? I know we ran a little over, but
32:20
I know I have one. Can you talk a tiny bit about the importance of being able to handle really large files and really small files at the same time when it comes to genomics workloads? Great question. I was showing you the slide; I just want to quickly move back to it to give you visual context for what Bill is pointing
32:41
to. So what happens here is, come on, yeah, this one. As I was giving you an example of the various parts of your workflow, you have small files and large files. In our blade architecture, with the launch we did just yesterday of the
33:05
FlashBlade//S500 R2 and //S200 R2, the IOPS are always tied to the blade, that is, to the CPU speed of the blade; we've announced that. So with the newer blades we'll probably see more. I have already
33:22
seen a 2.7x improvement in a benchmark that I ran for genomics, and that's purely on IOPS alone. For throughput, we have those 37-terabyte DFMs and the 75s, depending on the capacity you have in the new generation, which give you an additional boost for large I/O to multiple files. So the combination of the blades and the DFMs gives you the performance that you need for small files versus large files.
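For a feel of the two access patterns being contrasted here, below is a rough, self-contained micro-benchmark in Python: many small file creates (IOPS and metadata bound) versus one large sequential write (throughput bound). The file counts and sizes are arbitrary assumptions, and the absolute numbers depend entirely on whatever filesystem you point it at.

```python
# A rough, self-contained micro-benchmark of the two access patterns: many
# small files (IOPS/metadata bound) versus one large sequential file
# (throughput bound). Results depend entirely on the filesystem under test.
import os, tempfile, time

def small_files(dirpath: str, count: int = 2_000, size: int = 4_096) -> float:
    payload = os.urandom(size)
    start = time.perf_counter()
    for i in range(count):
        with open(os.path.join(dirpath, f"f{i:05d}"), "wb") as fh:
            fh.write(payload)
    return count / (time.perf_counter() - start)          # files per second

def large_file(dirpath: str, total_mb: int = 256, chunk_mb: int = 8) -> float:
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    start = time.perf_counter()
    with open(os.path.join(dirpath, "big.bin"), "wb") as fh:
        for _ in range(total_mb // chunk_mb):
            fh.write(chunk)
        fh.flush()
        os.fsync(fh.fileno())
    return total_mb / (time.perf_counter() - start)        # MB per second

with tempfile.TemporaryDirectory() as d:
    print(f"small files: {small_files(d):,.0f} creates/s")
    print(f"large file : {large_file(d):,.0f} MB/s")
```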
33:51
Great, thanks. And Craig, one last one for you: how do you build a culture? What was unique about the way you built the culture to bring IT and the researchers together to put the cluster together and make it not just check boxes on the IT side, but meet the researchers' needs?
34:14
Yeah, really, that's been the biggest challenge for sure, and we approached it in a way that we wanted to be collaborative. At the end of the day, we wanted to make sure that their research was important and available on demand when they needed it, and that we were there to support them and help them through it.
34:38
There are still some researchers out there that do their own thing, don't get me wrong; I'd love to say we have all of them in-house and have all the answers, but I really don't. But the fact that we're going into our 8th year doing this and we've gone from 5 to 42 PIs definitely speaks to the ability that we have on the IT side. We also have one of the other founding members, Andrew McArthur;
35:02
he's a researcher who kind of did this on his own in a corporate environment before he came to McMaster, so he's out there speaking with researchers as well and pointing them to us, saying, hey, this is how we do it, this is why it's better, this is what you get for your money as opposed to the little bit you're gonna get over here. But even with Dr. Stokes:
35:26
the reason he came is that he's a Hamilton native who studied and worked at Stanford, and the reason he didn't stay is because the research cluster at McMaster helped compel him to come home, which is massive. Great. OK. Well, if there are no other questions, we're gonna wrap it up.
35:46
Thank you everybody for attending. Craig, thank you for joining us; Vikash, as always, thank you.