
Faster File Operations for AI, Analytics, and More

Run file operations up to 20x faster with RapidFile Toolkit. Optimize AI, analytics, and more.
00:00
Large data sets are the norm in modern computing. So why are we still using single-threaded file commands like it's the '80s? If you're managing millions of files with outdated tools, it's time to upgrade. Pure Storage's RapidFile Toolkit is built for speed: up to 20 times faster than traditional
00:20
Linux commands. Stick around and we'll show you how it works. Managing large-scale file systems with standard Linux commands is like trying to shovel snow with a teaspoon. You'll eventually get the job done, but it's going to be painfully slow.
00:41
Today's high-performance environments, whether it's AI/ML, EDA, DevOps, or analytics, demand something faster, something built for modern workloads. Enter RapidFile Toolkit, a set of high-performance tools designed to supercharge your file operations, built from the ground up to leverage Pure Storage FlashBlade's massively parallel architecture.
01:07
This toolkit allows you to copy, move, list, and analyze files up to 20 times faster than the traditional Linux utilities. And the best part: it's a drop-in replacement for commands you already use. No learning curve, no complex scripting, just immediate performance gains. Let's take a look at what it can do.
01:32
Using the list command, Unix can list files quickly. This is a basic command process. With the FlashBlade file system, you can enumerate the file structure and capacity. This file system contains hundreds of files.
01:54
You can see here the data science folder and its subdirectories, including thousands of files. For a standard toolkit installation, we use RPM to install the toolkit. We can then mount the data science directory with a set of files. Note we are using nconnect=16 and NFS version
02:36
3. On the right-hand side you can see the metrics for the particular workload it creates: there's a very small block size and not a whole lot of throughput. This process is very metadata-specific. It took the platform 29 seconds to do a simple word count.
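The mount options mentioned above can be sketched as follows. This is a minimal illustration, not the exact command used in the demo: the server name, export path, and mount point are hypothetical placeholders, and the helper `nfs_mount_argv` is invented here. The only parts taken from the transcript are NFS version 3 and `nconnect=16`, expressed with the standard Linux NFS mount options `vers` and `nconnect`.

```python
def nfs_mount_argv(server_export, mount_point, nconnect=16, nfs_version=3):
    """Build the argument list for a Linux NFS mount.

    nconnect asks the NFS client to open several TCP connections to the
    server; vers selects the NFS protocol version.
    """
    options = f"vers={nfs_version},nconnect={nconnect}"
    return ["mount", "-t", "nfs", "-o", options, server_export, mount_point]


# Hypothetical export and mount point; substitute your own values.
argv = nfs_mount_argv("flashblade.example.com:/datascience", "/mnt/datascience")
print(" ".join(argv))
# prints: mount -t nfs -o vers=3,nconnect=16 flashblade.example.com:/datascience /mnt/datascience
```

The function only builds the command. Running it (for example with `subprocess.run(argv, check=True)`) requires root privileges and a reachable NFS server.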
03:02
Now we unmount and mount in between each of these tests so that you can see that we're not caching any information. So now if we do this with pls instead of ls, you can see the difference. You can see it's using a lot more resources now. And if you take a look at the difference here, you're seeing it take just 3 seconds
03:29
with pls. A difference between 29 and 3 seconds is significant. The find command in particular is a very latency-heavy process because it does a tree walk. Tree walks usually take a lot of time. As you saw with pfind, we're able to reduce the time it actually took us to do a tree walk.
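The ls versus pls comparison above, with an unmount and mount between runs, can be sketched as a small timing harness. This is an assumption-laden illustration: `time_command`, `cold_cache_benchmark`, and `speedup` are names invented here, and the `remount` callback stands in for whatever unmount and mount steps your environment needs. Only the 29-second and 3-second figures come from the demo.

```python
import subprocess
import time


def time_command(argv):
    """Run argv, discard its output, and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start


def cold_cache_benchmark(commands, remount):
    """Time each command, calling remount() first so that no run
    benefits from data cached by an earlier one."""
    results = {}
    for name, argv in commands.items():
        remount()
        results[name] = time_command(argv)
    return results


def speedup(baseline_seconds, new_seconds):
    """Ratio of the baseline time to the new time."""
    return baseline_seconds / new_seconds


# Figures quoted in the demo: 29 seconds with ls, 3 seconds with pls.
print(round(speedup(29, 3), 1))  # 9.7
```

A call might look like `cold_cache_benchmark({"ls": ["ls", "-R", path], "pls": ["pls", path]}, remount)`. The exact arguments used in the demo are not shown in the transcript, so treat those command lines as placeholders.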
03:57
To find the files that were required. That's the power of RapidFile Toolkit, a simple yet game-changing upgrade for managing large-scale file systems. Whether you're processing millions of files in AI and analytics, working with massive data sets, or just tired of waiting on slow operations, this toolkit helps you move faster and get more work done.
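To illustrate why a tree walk is latency-bound and why issuing directory listings in parallel can help, here is a minimal sketch. It is not how pfind is implemented, which the transcript does not describe. It is a generic breadth-first walk that lists all directories on one level concurrently, and `list_dir`, `parallel_walk`, and the `workers` default are choices made for this example.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def list_dir(path):
    """Return (subdirectories, files) found directly under path."""
    subdirs, files = [], []
    try:
        with os.scandir(path) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    subdirs.append(entry.path)
                else:
                    files.append(entry.path)
    except OSError:
        pass  # unreadable directory: skip it in this sketch
    return subdirs, files


def parallel_walk(root, workers=16):
    """Breadth-first tree walk that lists every directory on the
    current level concurrently instead of one at a time."""
    found = []
    level = [root]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while level:
            next_level = []
            for subdirs, files in pool.map(list_dir, level):
                found.extend(files)
                next_level.extend(subdirs)
            level = next_level
    return found
```

A serial walk waits for each directory listing to return before requesting the next one, so total time grows with the number of directories multiplied by the per-request latency. Overlapping the requests hides part of that latency. How much it helps depends on the file system, the client, and the shape of the tree.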
04:26
If you're a FlashBlade customer, give it a try. You'll be amazed at how much faster routine tasks and automation will run. And if you want to see more ways Pure Storage is helping organizations work smarter, check out Pure360, your hub for quick overviews, expert walkthroughs, and interactive demos, all designed to simplify your infrastructure and help you achieve more.
