Performant Time-Series Storage and Continuously Up to Date Materialized Aggregates
Presented by:
Matvey Arye
Mat has been working on data infrastructure in both academia (Princeton, PhD) and industry. As one of TimescaleDB's core architects, he works on performance, scalability, and query power. Previously, he attended Stuyvesant, The Cooper Union, and Princeton.
No video of the event yet, sorry!
Time-series workloads (e.g. data from sensors, IoT devices, finance, or even satellites) are one of the fastest-growing segments of the database market, spreading across industries and use cases. Today, many developers working with time-series data turn to NoSQL databases for storage at scale and to relational databases for managing associated metadata and key business data. This split leads to engineering complexity, operational challenges, and even referential integrity concerns.
In this talk, we will explain how Timescale re-engineered Postgres to efficiently handle time-series data alongside relational data. We’ll share how TimescaleDB, which is implemented as a Postgres extension, improves insert rates by 20x over vanilla Postgres and achieves much faster queries while offering full SQL.
Additionally, time-series data analysis often requires aggregates over significant periods of time. To support these types of queries more efficiently, we have implemented the infrastructure for creating continuously up-to-date aggregates without write amplification, using features of Postgres (partial aggregates, invalidation triggers, proper locking/transaction safety, background workers, union views, the query planner, etc.). In this talk, we will also describe how and when to use these pre-calculated results to speed up your queries.
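As a rough illustration of the ideas named above (partial aggregates, invalidation, a background refresh, and a union of materialized and raw data), here is a minimal Python sketch. It is a toy in-memory model, not TimescaleDB's implementation: the class `ContinuousAvg`, its methods, the `BUCKET` width, and the watermark scheme are all hypothetical names and assumptions chosen for this example.

```python
BUCKET = 3600  # bucket width in seconds (assumed for this example)


class ContinuousAvg:
    """Toy model of a continuously maintained per-bucket average."""

    def __init__(self):
        self.raw = []         # (timestamp, value) rows, standing in for the raw table
        self.partials = {}    # bucket start -> (sum, count) partial aggregate state
        self.invalid = set()  # materialized buckets that received late rows
        self.watermark = 0    # buckets starting before this are materialized

    def insert(self, ts, value):
        # An insert appends the row and, for an already materialized bucket,
        # only records an invalidation. The aggregate is not rewritten here,
        # which is how this sketch avoids write amplification.
        self.raw.append((ts, value))
        bucket = ts - ts % BUCKET
        if bucket < self.watermark:
            self.invalid.add(bucket)

    def _partial(self, bucket):
        # Recompute the (sum, count) partial state for one bucket from raw rows.
        values = [v for t, v in self.raw if bucket <= t < bucket + BUCKET]
        return (sum(values), len(values))

    def refresh(self, up_to):
        # Stand-in for a background worker: materialize newly completed
        # buckets and recompute the invalidated ones.
        new_watermark = up_to - up_to % BUCKET
        todo = set(self.invalid)
        todo.update(
            t - t % BUCKET
            for t, _ in self.raw
            if self.watermark <= t < new_watermark
        )
        for bucket in todo:
            self.partials[bucket] = self._partial(bucket)
        self.invalid.clear()
        self.watermark = max(self.watermark, new_watermark)

    def query(self):
        # Stand-in for a union view: materialized partials below the
        # watermark, combined with raw rows at or above it. Finalizing
        # (sum / count) happens only at query time.
        merged = dict(self.partials)
        for t, v in self.raw:
            if t >= self.watermark:
                b = t - t % BUCKET
                s, c = merged.get(b, (0.0, 0))
                merged[b] = (s + v, c + 1)
        return {b: s / c for b, (s, c) in sorted(merged.items()) if c}


ca = ContinuousAvg()
ca.insert(10, 1.0)
ca.insert(20, 3.0)
ca.insert(3700, 10.0)
ca.refresh(up_to=3600)    # materializes the bucket starting at 0
first = ca.query()        # bucket 0 from partials, bucket 3600 from raw rows
ca.insert(30, 5.0)        # late row: only marks bucket 0 as invalid
ca.refresh(up_to=3600)    # recomputes the invalidated bucket
second = ca.query()
```

In this sketch a late row stays invisible in the materialized bucket until the next refresh. Storing `(sum, count)` rather than the final average is what lets partial states be combined or recomputed independently of how the result is finalized.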
- Date:
- 2019 July 19 09:30 EDT
- Duration:
- 40 min
- Room:
- Room
- Conference:
- Philly 2019
- Language:
- Track:
- Dev
- Difficulty: