Presented by:

Michael Freedman

TimescaleDB

Michael J. Freedman is the co-founder and CTO of TimescaleDB, an open-source database that scales SQL for time-series data, and a Professor of Computer Science at Princeton University. His research focuses on distributed systems, networking, and security.

Previously, Freedman developed CoralCDN (a decentralized CDN serving millions of daily users) and Ethane (the basis for OpenFlow / software-defined networking). He co-founded Illuminics Systems (acquired by Quova, now part of Neustar) and is a technical advisor to Blockstack.

Honors include: Presidential Early Career Award for Scientists and Engineers (PECASE, given by President Obama), SIGCOMM Test of Time Award, Caspar Bowden Award for Privacy Enhancing Technologies, Sloan Fellowship, NSF CAREER Award, Office of Naval Research Young Investigator Award, DARPA Computer Science Study Group membership, and multiple award-winning publications. Prior to joining Princeton in 2007, he received his Ph.D. in computer science from NYU's Courant Institute, and his bachelor's and master's degrees from MIT.

No video of the event yet, sorry!

Time-series data tends to accumulate very quickly, across devops, IoT, industrial and energy, finance, and other domains. To drive real-time decisions and data science, software developers often seek to wrangle this large volume of data into a variety of database systems.

In this talk, Michael discusses the five objectives for scaling a database for time-series workloads -- total storage volume, insert rate, query concurrency, query latency, and fault-tolerant replication -- and how these objectives have different needs.

More concretely, this talk also contains a technical deep dive into how TimescaleDB leveraged its chunk-based architecture to go from a primary-replica system on PostgreSQL to a scale-out distributed time-series database that can scale to tens of millions of metrics per second, store petabytes of data, and process queries even faster via better parallelization. Michael describes how this architecture, compared to a traditional sharded system, enables a much broader set of capabilities that time-series workloads call for (e.g., both scale up and scale out, elasticity without data movement, partitioning flexibility, and age-based data retention, tiering, and reordering).
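
As a rough illustration of why a chunk-based layout makes age-based retention cheap, the toy in-memory sketch below partitions rows into fixed-width time chunks and expires old data by dropping whole chunks rather than deleting individual rows. It is not TimescaleDB's implementation: the names `ChunkedTable`, `CHUNK_INTERVAL`, `chunk_start`, and `drop_older_than` are hypothetical, and the one-day chunk width is an assumption chosen for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

CHUNK_INTERVAL = timedelta(days=1)  # hypothetical chunk width (assumption)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunk_start(ts):
    """Return the start of the fixed-width time interval that contains ts."""
    return EPOCH + ((ts - EPOCH) // CHUNK_INTERVAL) * CHUNK_INTERVAL


class ChunkedTable:
    """Toy time-partitioned table: one list of rows per time chunk."""

    def __init__(self):
        self.chunks = defaultdict(list)  # chunk start -> [(ts, value), ...]

    def insert(self, ts, value):
        # Route each row to the chunk that covers its timestamp.
        self.chunks[chunk_start(ts)].append((ts, value))

    def drop_older_than(self, cutoff):
        """Drop every chunk whose whole interval ends at or before cutoff."""
        expired = [s for s in self.chunks if s + CHUNK_INTERVAL <= cutoff]
        for s in expired:
            del self.chunks[s]  # whole chunk removed, no row-by-row delete
        return len(expired)

    def query(self, start, end):
        """Return rows in [start, end), scanning only overlapping chunks."""
        rows = []
        for s, chunk in self.chunks.items():
            if s < end and s + CHUNK_INTERVAL > start:
                rows.extend(r for r in chunk if start <= r[0] < end)
        return sorted(rows)
```

In this sketch, retention cost depends on the number of expired chunks rather than the number of expired rows, and a time-bounded query skips chunks that cannot contain matching data.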

Duration: 50 min
Conference: Postgres Conference 2020
Track: Distributed SQL
Difficulty: Hard