What Is a Time-series Database (TSDB)?

When you need to show minute-by-minute, day-by-day, or any other date-range analytics, a time-series database is a natural fit. A time-series database stores data points together with their timestamps so that trends within a range of time can be shown to users. It's often used for visualizing time-based information and for analytics.

What Is a Time-series Database?

In a time-series database, every record contains a timestamp. The timestamp can be used to display a single data point or in graphing and analytics. The time-series database is used specifically for information that requires a date range, such as tracking the weather or querying for specific events logged for monitoring purposes.

What Is a Time-series Collection?

A time-series database stores all data, but a time-series collection, as the term is used here, is a slice of data taken from the database and returned to the application. Time-series collections are retrieved from the database in the form of a data set, and the data set contains the data points for the given date range. The user or application sends the time range as input to the database, and the database returns a collection containing every data point that falls within that range.
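The range lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular database implements it: the `readings` list and the `query_range` function are hypothetical names, and the data lives in memory, sorted by timestamp.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime

# Hypothetical in-memory "database": (timestamp, value) pairs sorted by timestamp.
readings = [
    (datetime(2024, 1, 1, 0, 0), 20.1),
    (datetime(2024, 1, 1, 0, 1), 20.4),
    (datetime(2024, 1, 1, 0, 2), 20.3),
    (datetime(2024, 1, 1, 0, 3), 20.9),
    (datetime(2024, 1, 1, 0, 4), 21.2),
]


def query_range(points, start, end):
    """Return every point whose timestamp falls in [start, end], inclusive."""
    timestamps = [ts for ts, _ in points]
    lo = bisect_left(timestamps, start)   # first index with ts >= start
    hi = bisect_right(timestamps, end)    # first index with ts > end
    return points[lo:hi]


# The application supplies the time range; the result is the "collection".
collection = query_range(
    readings,
    datetime(2024, 1, 1, 0, 1),
    datetime(2024, 1, 1, 0, 3),
)
```

Because the points are kept in timestamp order, the lookup uses binary search to find the two ends of the range instead of scanning every record.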

How a Time-series Database Works

Usually, a time-series database is created to capture large amounts of data for future analysis. Users set their date range in an application, and the database returns a set of data points. The database works by capturing data at intervals. For example, a stock ticker might display changes in a stock price every minute. The database stores the stock name, price, and timestamp so that it keeps a record of the stock price for every minute, both for analytics and as historical information.
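As a rough sketch of the stock ticker example, each record could hold the three fields named above. The `Tick` class, the `capture` function, the symbol "ACME", and the prices are all made up for illustration; a real system would read prices from a market feed rather than a list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Tick:
    """One stored record: stock name, price, and the time it was captured."""
    symbol: str
    price: float
    timestamp: datetime


def capture(symbol, prices, start, interval=timedelta(minutes=1)):
    """Build one Tick per interval, the first one stamped at `start`."""
    return [
        Tick(symbol, price, start + i * interval)
        for i, price in enumerate(prices)
    ]


# Three illustrative prices captured one minute apart.
ticks = capture("ACME", [101.5, 101.7, 101.2], datetime(2024, 1, 2, 9, 30))
```

Each `Tick` is immutable, which reflects how time-series records are typically appended once and then only read.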

A time-series database returns records in chronological order so that developers can build visualizations without much overhead. Because the database typically keeps its data organized by timestamp, it can usually sort a large data set much faster than a front-end web application could. The application tells the database how the data should be ordered, and the database returns it in that order, ready to display. For example, a user might request stock prices for a specific date range, sorted in ascending order.
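The idea of letting the database do the sorting can be illustrated with Python's built-in `sqlite3` module. SQLite is a general-purpose relational database, not a time-series database, so this is only a stand-in; the `prices` table, the `prices_between` function, and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (symbol TEXT, price REAL, ts TEXT)")
# Rows are inserted out of order on purpose. ISO 8601 strings sort
# chronologically when compared as text.
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [
        ("ACME", 101.2, "2024-01-02T09:32:00"),
        ("ACME", 101.5, "2024-01-02T09:30:00"),
        ("ACME", 101.7, "2024-01-02T09:31:00"),
        ("ACME", 102.0, "2024-01-02T09:45:00"),
    ],
)


def prices_between(conn, symbol, start, end, ascending=True):
    """Return (timestamp, price) rows in the range, sorted by the database."""
    order = "ASC" if ascending else "DESC"  # fixed keywords, never user text
    cursor = conn.execute(
        "SELECT ts, price FROM prices "
        "WHERE symbol = ? AND ts BETWEEN ? AND ? "
        f"ORDER BY ts {order}",
        (symbol, start, end),
    )
    return cursor.fetchall()


result = prices_between(
    conn, "ACME", "2024-01-02T09:30:00", "2024-01-02T09:35:00"
)
```

The application only passes the range and the sort direction; the rows arrive already ordered and can be handed straight to a chart.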

Popular Time-series Databases: Comparisons

Every time-series database has a back-end engine, which is used to store and retrieve data. The engine must be fast and efficient to store large amounts of data while being able to retrieve it with very little latency. You could store time-based data in a traditional database, but several time-series databases on the market are built specifically for querying and storing this type of data.

Why Use InfluxDB: Open Source TSDB

Open source databases are preferred by some developers because they can fork the codebase and make their own changes to the base product. InfluxDB is an open source time-series database that can store thousands of data points every second. If you want to monitor infrastructure such as IoT devices for industrial applications, InfluxDB is a good choice.

Prometheus vs. InfluxDB

The main difference between InfluxDB and Prometheus is the way data is collected. With InfluxDB, an application or agent pushes data to the database, where it is stored and later queried. Prometheus works the other way around: each monitored application exposes its current metrics over an HTTP endpoint, and the Prometheus server polls (scrapes) those endpoints at a regular interval and stores what it reads. For large enterprises with systems in numerous locations, data from either database can be reviewed in a central dashboard, but the pull model requires the Prometheus server to be able to reach every endpoint it scrapes.
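The two collection styles, an application sending data and a database polling for it, can be contrasted in a small sketch. The `Store` class and its `write` and `scrape` methods are hypothetical and do not correspond to the client API of InfluxDB or Prometheus; timestamps are plain integers to keep the example short.

```python
class Store:
    """Hypothetical in-memory store that supports both collection styles."""

    def __init__(self):
        self.points = []  # (metric name, value, timestamp)

    def write(self, name, value, ts):
        # Push model: the application calls this whenever it has a new value.
        self.points.append((name, value, ts))

    def scrape(self, targets, ts):
        # Pull model: the store asks each target for its current value.
        for name, read_value in targets.items():
            self.points.append((name, read_value(), ts))


store = Store()

# Push: the application decides when data is sent.
store.write("cpu_load", 0.42, ts=1)

# Pull: the store decides when to poll; each target is a callable
# that returns the current reading.
store.scrape({"queue_depth": lambda: 7}, ts=2)
```

In the push style the sender controls timing, while in the pull style the collector does, which is why a pull-based collector must be able to reach each target.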

TimescaleDB vs. InfluxDB

InfluxDB is a NoSQL database, while TimescaleDB is a relational database built as an extension to PostgreSQL. Relational databases work very differently from NoSQL databases. A relational database works with tables and keys that can be used to join data stored in each table. It's important to know how a database stores its data because that determines the query syntax: TimescaleDB is queried with standard SQL, while InfluxDB has its own query languages, such as InfluxQL. If you know the data that will be stored and can organize it into tables, then TimescaleDB is a viable option.
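To show what "tables and keys that can be used to join data" looks like in practice, here is a minimal sketch using Python's built-in `sqlite3` module. It uses plain SQLite rather than TimescaleDB, so nothing specific to TimescaleDB is shown; the `sensors` and `readings` tables and their contents are assumptions for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- One row per sensor, identified by a primary key.
    CREATE TABLE sensors (
        id INTEGER PRIMARY KEY,
        location TEXT NOT NULL
    );
    -- One row per measurement, linked back to its sensor by sensor_id.
    CREATE TABLE readings (
        sensor_id INTEGER NOT NULL REFERENCES sensors(id),
        ts TEXT NOT NULL,
        temperature REAL NOT NULL
    );
    INSERT INTO sensors VALUES (1, 'boiler room'), (2, 'loading dock');
    INSERT INTO readings VALUES
        (1, '2024-01-01T00:00:00', 71.5),
        (2, '2024-01-01T00:00:00', 18.0),
        (1, '2024-01-01T00:01:00', 72.1);
""")

# Join the two tables on the key to attach a location to each reading.
rows = db.execute("""
    SELECT s.location, r.ts, r.temperature
    FROM readings AS r
    JOIN sensors AS s ON s.id = r.sensor_id
    WHERE s.location = 'boiler room'
    ORDER BY r.ts
""").fetchall()
```

The sensor's location is stored once in `sensors` and combined with its time-stamped readings at query time, which is the kind of up-front table design the relational approach expects.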

Elasticsearch vs. InfluxDB

Elasticsearch is a search and analytics engine commonly used in enterprise applications. It stores each record as a "document" in an index, and it can split an index into shards spread across several servers, which helps queries scale over very large data sets. Whether it performs better than InfluxDB depends on the workload. Elasticsearch is best suited to large data sets where applications and users retrieve results that could span millions of data points from a wide range of timestamps. For example, Elasticsearch is useful for reviewing log files that monitor a large enterprise network environment for suspicious user activity.

When to Use Time-series Databases

Most time-series databases are used for monitoring hardware or software so that a large collection of data can be used to analyze specific events. To get a clear picture of events within an environment, you need a lot of data collected from numerous sources. For example, IoT sensors might collect temperature data from multiple machines. A time-series database stores temperature for every minute of the day so that engineers can identify any anomalies and remediate them before machinery fails.
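A simple version of the anomaly check in the temperature example might look like the sketch below. The readings, the `find_anomalies` function, and the 5-degree tolerance are assumptions chosen for illustration; production monitoring typically uses more robust methods.

```python
from statistics import median

# Hypothetical per-minute temperature readings: (minute, temperature).
temperatures = [
    (0, 70.1),
    (1, 70.4),
    (2, 69.8),
    (3, 70.2),
    (4, 95.0),  # a spike that should be flagged
    (5, 70.0),
]


def find_anomalies(points, tolerance=5.0):
    """Flag points that differ from the median by more than `tolerance`."""
    typical = median(value for _, value in points)
    return [
        (minute, value)
        for minute, value in points
        if abs(value - typical) > tolerance
    ]


anomalies = find_anomalies(temperatures)
```

The median is used as the baseline instead of the mean so that a single extreme reading does not shift the value the other readings are compared against.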

Relational vs. Time-series Databases

Many time-series databases use a NoSQL data model, which is a common way to store loosely structured data. With loosely structured data, developers can store values such as a timestamp and a name without first organizing them into predefined table columns. Relational databases require developers to store data in a defined schema, so they are less convenient for time-series data with unknown fields and data types, although relational time-series databases such as TimescaleDB do exist. For example, a relational database rejects a value whose type does not match its column, while a schemaless NoSQL store can accept new fields and value types as they arrive.

NoSQL vs. Time-series Databases

For unknown values, a NoSQL database is the preferred method. You should choose a database that supports NoSQL such as InfluxDB or Elasticsearch. These time-series databases offer bulk data storage with fast performance during queries. They’re mostly reserved for large enterprise applications and are much more difficult to deploy. An incorrectly configured NoSQL database could inhibit performance during query processing.

Benefits of Time-series Databases

Every database stores information, but a time-series database is built specifically for time-based analytics. Its benefit is the ability to store large amounts of data in which every data point carries a timestamp. Because it's built with large data sets in mind, a time-series database is often faster and more efficient than a traditional database at inserting new records and retrieving large data sets.

Time-series databases are often better suited to queries involving dates and times, and they store time-series data much more efficiently. Any organization that wants to store monitoring data will benefit from a time-series database. Applications benefit from its ability to retrieve large data sets for analytics, visualizations, financial trends, activity information, and changes in an environment that happen frequently during the day at different intervals.

Disadvantages of Time-series Databases

As with any advanced infrastructure, time-series databases are more difficult to deploy and configure properly. Most of them are NoSQL databases, and a NoSQL database that is deployed without being optimized will suffer from poor performance. Someone within the organization needs to understand how to configure and optimize the database.

Businesses looking to store time-series data need the resources to store large amounts of data. Data can be stored in the cloud, but it will increase IT costs. The infrastructure to support data storage and time-series database processing can be costly.

Conclusion

If you need a better solution for time-based data, a time-series database is a good choice. Review the different types of database engines, consider costs, and find one that scales with the growth of the business and the increase in data storage. Remember to review configurations and optimization options to ensure the database runs as efficiently as possible.
