What Is Big Data?

Today’s businesses collect vast amounts of data from a variety of sources, and that data must often be analyzed in real time. Big data refers to data that is too big, too fast, or too complex to process using traditional techniques.

The Three Vs of Big Data

While the concept of big data has been around for a long time, industry analyst Doug Laney was the first to describe the three Vs of big data, in 2001:

  • Volume: The quantity of data that must be processed (usually a lot, ranging from gigabytes to exabytes or more)
  • Variety: The wide-ranging types of data, both structured and unstructured, streaming from many different sources
  • Velocity: The speed at which new data is streaming into your system
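
To make the three definitions above concrete, here is a minimal Python sketch of how rough proxies for volume, variety, and velocity might be computed for one batch of incoming records. The record layout, the `source` field, and the `profile_batch` helper are hypothetical names chosen for illustration; they are not part of any particular product or library.

```python
import json
from collections import Counter


def profile_batch(records, window_seconds):
    """Return rough per-batch proxies for volume, variety, and velocity.

    Assumptions (illustrative only): each record is a JSON-serializable dict,
    and an optional "source" key says where the record came from.
    """
    # Volume: total serialized size of the batch in bytes.
    volume_bytes = sum(len(json.dumps(r).encode("utf-8")) for r in records)
    # Variety: how many records arrived from each distinct source.
    variety = Counter(r.get("source", "unknown") for r in records)
    # Velocity: records arriving per second over the observation window.
    records_per_second = len(records) / window_seconds
    return {
        "volume_bytes": volume_bytes,
        "variety": dict(variety),
        "records_per_second": records_per_second,
    }


if __name__ == "__main__":
    batch = [
        {"source": "web", "clicks": 3},
        {"source": "iot", "temperature_c": 21.5},
        {"source": "web", "clicks": 7},
    ]
    print(profile_batch(batch, window_seconds=1.5))
```

In practice these measures would be tracked continuously rather than per batch, but the sketch shows that each V corresponds to a different, independently measurable property of the same data.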

Some data experts extend the definition to four, five, or more Vs. The fourth and fifth Vs are:

  • Veracity: The quality of the data with respect to its accuracy, precision, and reliability
  • Value: The benefit the data provides, or what it is worth to your business

While the list can go all the way up to 42 Vs, these five are the most commonly used to define big data.

The Benefits of Hosting Big Data on All-Flash Arrays

The benefits of using all-flash storage for big data include:

  • Higher velocities (55-180 IOPS for HDDs vs. 3K-40K IOPS with SSDs)
  • Massive parallelism with up to 64K queues for I/O operations
  • NVMe performance and reliability
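
As a rough illustration of what the IOPS ranges quoted above mean in practice, the following Python sketch estimates how long a purely random I/O workload would take at each end of those ranges. The workload size of one million operations is an assumption chosen for the example, and the model ignores caching, queue depth, and block size.

```python
# IOPS ranges as quoted in the list above (low end, high end).
HDD_IOPS = (55, 180)
SSD_IOPS = (3_000, 40_000)


def seconds_for(ops, iops):
    """Time in seconds to complete `ops` random I/Os at a sustained rate of `iops`."""
    return ops / iops


def summarize(ops):
    """Compare best-case and worst-case completion times for HDDs and SSDs."""
    return {
        "hdd_best_s": seconds_for(ops, HDD_IOPS[1]),
        "hdd_worst_s": seconds_for(ops, HDD_IOPS[0]),
        "ssd_best_s": seconds_for(ops, SSD_IOPS[1]),
        "ssd_worst_s": seconds_for(ops, SSD_IOPS[0]),
        # Slowest SSD vs. fastest HDD, and fastest SSD vs. slowest HDD.
        "min_speedup": SSD_IOPS[0] / HDD_IOPS[1],
        "max_speedup": SSD_IOPS[1] / HDD_IOPS[0],
    }


if __name__ == "__main__":
    # Hypothetical workload: one million random I/O operations.
    for name, value in summarize(1_000_000).items():
        print(f"{name}: {value:,.1f}")
```

Under these simplified assumptions, even the slowest SSD figure is roughly 17 times faster than the fastest HDD figure, and the gap widens to several hundred times at the other extreme.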

Why Choose Pure Storage for Your Big Data Needs?

The relative volume, variety, and velocity of big data are constantly changing. To keep pace as your data grows in size and speed, you’ll want to invest consistently in the latest storage technologies. Advances in flash memory have made it possible to deliver custom all-flash storage solutions for all your data tiers. Here’s how Pure Storage® can help power your big data analytics pipeline:

  • All the benefits of all-flash arrays 
  • Consolidation into a unified, performant data hub that can handle high-throughput data streaming from a variety of sources
  • Truly non-disruptive Evergreen™ upgrades with zero downtime and no data migrations
  • A simplified data management system that combines cloud economics with on-premises control and efficiency
  • Fast and efficient scale-out flash storage with FlashBlade®