Creating Continuously Up-to-Date Materialized Aggregates
Time-series workloads (e.g. data from sensors, IoT devices, finance, or even satellites) are generally insert-mostly, and data typically arrives in time order, at regular or irregular intervals. Given the high velocity and continuous nature of time-series writes, insert performance is paramount. But what is the use of inserting a significant amount of data if you can't analyze, visualize, and act on it effectively? Unlike in many OLTP workloads, you often don't need the granularity of each data point; rather, reports on aggregates over significant periods of time, along with other analyses, are the key to making good decisions with the data you store.
This talk describes how TimescaleDB (a time-series database packaged as a PostgreSQL extension) has implemented the infrastructure for creating continuously up-to-date aggregates without write amplification, using features of Postgres (partial aggregates, invalidation triggers, proper locking and transaction safety, background workers, union views, the query planner, etc.). It also covers how and when to use these pre-calculated results to speed up your queries.
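To make the moving parts concrete, here is a minimal in-memory sketch in Python of three of the ideas named above: partial aggregates, invalidation on write, and a union of materialized and raw data at query time. It is only an illustration of the general technique, not TimescaleDB's implementation. The class `ContinuousAvg`, the helpers `combine` and `finalize`, and the one-hour `BUCKET` width are all hypothetical names and choices made for this example.

```python
BUCKET = 3600  # bucket width in seconds (hypothetical choice for this sketch)


def bucket_of(ts):
    """Return the start of the time bucket that contains timestamp ts."""
    return ts - ts % BUCKET


def combine(a, b):
    """Merge two partial aggregate states of the form (sum, count)."""
    return (a[0] + b[0], a[1] + b[1])


def finalize(state):
    """Turn a partial state (sum, count) into the final average."""
    total, count = state
    return total / count


class ContinuousAvg:
    """Toy continuous aggregate: per-bucket average over (ts, value) rows."""

    def __init__(self):
        self.raw = []         # stands in for the raw time-series table
        self.partials = {}    # bucket -> (sum, count): the materialization
        self.invalid = set()  # buckets written to since the last refresh

    def insert(self, ts, value):
        # The write path only appends the row and records an invalidation.
        # It never recomputes the aggregate, so inserts stay cheap.
        self.raw.append((ts, value))
        self.invalid.add(bucket_of(ts))

    def _partial_from_raw(self, bucket):
        vals = [v for ts, v in self.raw if bucket_of(ts) == bucket]
        return (sum(vals), len(vals))

    def refresh(self):
        # Stands in for a background worker: re-materialize only the
        # buckets that were invalidated, then clear the invalidation log.
        for b in self.invalid:
            self.partials[b] = self._partial_from_raw(b)
        self.invalid.clear()

    def query(self):
        # "Union view": materialized results for buckets that are still
        # valid, plus on-the-fly aggregation of raw rows for stale ones.
        result = {}
        for b, state in self.partials.items():
            if b not in self.invalid:
                result[b] = finalize(state)
        for b in self.invalid:
            result[b] = finalize(self._partial_from_raw(b))
        return result
```

Storing the partial state `(sum, count)` rather than the finished average is what allows coarser results to be derived later: two hourly states can be merged with `combine` and then passed to `finalize`, which is not possible with two already-finalized averages.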
- 2019 September 19 12:00
- 50 min
- Silicon Valley 2019