Advanced compression in TimescaleDB with hybrid row/columnar storage
Michael J. Freedman is the co-founder and CTO of TimescaleDB, an open-source database that scales SQL for time-series data, and a Professor of Computer Science at Princeton University. His research focuses on distributed systems, networking, and security.
Previously, Freedman developed CoralCDN (a decentralized CDN serving millions of daily users) and Ethane (the basis for OpenFlow / software-defined networking). He co-founded Illuminics Systems (acquired by Quova, now part of Neustar) and is a technical advisor to Blockstack.
Honors include: Presidential Early Career Award for Scientists and Engineers (PECASE, given by President Obama), SIGCOMM Test of Time Award, Caspar Bowden Award for Privacy Enhancing Technologies, Sloan Fellowship, NSF CAREER Award, Office of Naval Research Young Investigator Award, DARPA Computer Science Study Group membership, and multiple award-winning publications. Prior to joining Princeton in 2007, he received his Ph.D. in computer science from NYU's Courant Institute, and his bachelor's and master's degrees from MIT.
Storage systems like databases and file systems have long used compression to reduce their storage footprint. Yet the most effective compression techniques were traditionally limited to column stores, where greater data-type locality opens up more advanced compression options. It has often been assumed that these opportunities stem from fundamental differences between column-store and row-store architectures.
The TimescaleDB engineering team recently introduced a compression scheme that challenges this assumption. The technique uses regular Postgres values to store data from many rows in columnar form and supports all Postgres data types. Additionally, the recent release of TimescaleDB uses state-of-the-art compression techniques to achieve storage usage on par with dedicated column stores.
During this talk, Michael will discuss how TimescaleDB native compression combines the best of both worlds: (1) all of the benefits of PostgreSQL, including the insert performance and shallow-and-wide query performance that a row store offers for recent data, and (2) the compression and query performance of a columnar store for deep-and-narrow queries, which read only the compressed columns specified in a query.
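As a rough illustration of the idea in the abstract, the Python sketch below packs a batch of rows into one array per column and then reads back a single column. It is not TimescaleDB's implementation: the function names (`pack_rows`, `delta_encode`, `delta_decode`) are hypothetical, and plain delta encoding stands in for whatever type-specific algorithms the real system applies.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def pack_rows(rows: List[Row]) -> Dict[str, List[Any]]:
    """Turn a batch of rows into one array per column (hypothetical helper)."""
    if not rows:
        return {}
    # Assumes every row has the same keys as the first one.
    return {name: [row[name] for row in rows] for name in rows[0]}


def delta_encode(values: List[int]) -> List[int]:
    """Keep the first value, then store each later value as a difference."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas: List[int]) -> List[int]:
    """Invert delta_encode by accumulating a running total."""
    out: List[int] = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


# Recent data arrives as ordinary rows (the row-store side).
rows = [
    {"time": 1000, "device": "a", "cpu": 0.5},
    {"time": 1010, "device": "a", "cpu": 0.6},
    {"time": 1020, "device": "a", "cpu": 0.7},
]

# Older data is packed into per-column arrays (the columnar side).
batch = pack_rows(rows)
batch["time"] = delta_encode(batch["time"])

# A deep-and-narrow query touches only the column it asks for.
cpu_values = batch["cpu"]
timestamps = delta_decode(batch["time"])
```

Regularly spaced timestamps collapse to small, repetitive deltas once they sit next to each other, which is the kind of data-type locality the abstract refers to. Reading `batch["cpu"]` leaves the other arrays untouched.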
- 2020 March 23 16:10
- 50 min
- Postgres Conference 2020