TimescaleDB: Leveraging PostgreSQL for Reliability
Michael J. Freedman is a Professor in the Computer Science Department at Princeton University, as well as the co-founder and CTO of Timescale, which provides an open-source database that scales out SQL for time-series data. His research broadly focuses on distributed systems, networking, and security, and has led to commercial products and deployed systems reaching millions of users daily. Honors include a Presidential Early Career Award (PECASE), Sloan Fellowship, NSF CAREER Award, ONR Young Investigator Award, DARPA CSSG membership, and multiple award publications.
Time-series databases are one of the fastest-growing segments of the database market, spreading across industries and use cases. Common requirements include ingesting high volumes of structured data; answering complex queries with good performance over both recent and historical time intervals; and performing specialized time-centric analysis and data management.
Today, many developers working with time-series data turn to polyglot solutions: a NoSQL database to store their time-series data (for scale) and a relational database for associated metadata and key business data. Yet this leads to engineering complexity, operational challenges, and even referential integrity concerns.
In this talk, I will explain how one can avoid these operational problems by re-engineering Postgres to serve as a general data platform, including for high-volume time-series workloads. In particular, TimescaleDB is an open-source time-series database, implemented as a Postgres plugin, that improves insert rates by 20x over vanilla Postgres and delivers much faster queries, all while offering full SQL (including JOINs). TimescaleDB achieves this by storing data on an individual server in a manner more common to distributed systems: heavily partitioning (sharding) data into chunks to ensure that hot chunks corresponding to recent time records are maintained in memory.
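The idea of partitioning by time can be illustrated with a minimal Python sketch. This is not TimescaleDB's implementation; the fixed one-day interval and the names `CHUNK_INTERVAL`, `chunk_start`, and `insert` are assumptions made up for illustration. Each record is routed to the chunk covering its timestamp, so recent records cluster in a small number of "hot" chunks.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Assumed fixed chunk width, for illustration only.
CHUNK_INTERVAL = timedelta(days=1)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunk_start(ts, interval=CHUNK_INTERVAL):
    """Return the start time of the chunk that contains ts."""
    index = (ts - EPOCH) // interval  # whole intervals elapsed since the epoch
    return EPOCH + index * interval


def insert(chunks, ts, value):
    """Append a (timestamp, value) record to the chunk covering ts."""
    chunks[chunk_start(ts)].append((ts, value))


# Hypothetical records: two fall on the same day, one on the previous day.
chunks = defaultdict(list)
insert(chunks, datetime(2019, 3, 21, 13, 0, tzinfo=timezone.utc), 1.0)
insert(chunks, datetime(2019, 3, 21, 13, 50, tzinfo=timezone.utc), 2.0)
insert(chunks, datetime(2019, 3, 20, 9, 0, tzinfo=timezone.utc), 3.0)
```

In this toy version, the chunk for the most recent day receives most of the writes, which is the property that lets a real system keep only the latest chunks in memory.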
I will focus on two newly released features of TimescaleDB and discuss how they ease time-series data management: (1) the automated adaptation of time-partitioning intervals, which the database learns by observing data volumes; and (2) continuous aggregations in near real time, which are robust to late-arriving data and transparently support queries across different aggregation levels. I will also describe how these capabilities have been leveraged across several different use cases.
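To make the notion of a continuous aggregation that tolerates late-arriving data concrete, here is a small Python sketch. It is an assumption-laden toy, not how TimescaleDB implements the feature: the class `ContinuousAverage` and its methods are hypothetical names. It keeps a running sum and count per time bucket, so a late record simply updates its bucket, and coarser aggregation levels can be derived from the finer ones.

```python
from collections import defaultdict


class ContinuousAverage:
    """Toy incremental average per time bucket (timestamps in seconds)."""

    def __init__(self, bucket_seconds):
        self.bucket_seconds = bucket_seconds
        self.sums = defaultdict(float)
        self.counts = defaultdict(int)

    def add(self, ts, value):
        # Late-arriving records are handled the same way as fresh ones:
        # they update the running state of the bucket they belong to.
        bucket = ts - ts % self.bucket_seconds
        self.sums[bucket] += value
        self.counts[bucket] += 1

    def average(self, bucket):
        return self.sums[bucket] / self.counts[bucket]

    def rollup(self, coarse_seconds):
        # Derive a coarser aggregate from the stored sums and counts.
        # Assumes coarse_seconds is a multiple of bucket_seconds.
        coarse = ContinuousAverage(coarse_seconds)
        for bucket, total in self.sums.items():
            target = bucket - bucket % coarse_seconds
            coarse.sums[target] += total
            coarse.counts[target] += self.counts[bucket]
        return coarse


per_minute = ContinuousAverage(60)
per_minute.add(0, 10.0)
per_minute.add(130, 30.0)
per_minute.add(30, 20.0)  # arrives late, but lands in the first bucket
per_hour = per_minute.rollup(3600)
```

Storing sums and counts rather than finished averages is what makes both the late update and the rollup straightforward in this sketch, since averages alone cannot be combined correctly.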
- 2019 March 21 13:00
- 50 min
- New York Ballroom West
- Postgres Conference