TOASTing an Elephant: Building a Custom Data Warehouse Using PostgreSQL
David is an R&D Engineer at TimescaleDB; before that he was a Data Engineer at Moat Inc, where he focused mostly on Postgres. How he got there from battery-engineering and electrochemistry startups is a story too long for a short bio, but you can ask him if you'd like (it even involves PGConf!).
No video of the event yet, sorry!
At Moat we add ~500 million rows each day to just a few tables in our data warehouse. Previously, we used Postgres to hold the most recent month of data (~15 TB of disk), and we love its ability to handle concurrency and its cost compared to the commercial column stores we use for our historical data. By combining composite types, arrays, and Postgres's TOAST mechanism, we were able to build a custom data warehouse that offers the best aspects of Postgres and a commercial column-store database, tailored specifically to our needs without modifying any of the Postgres source. Our new store uses ~1 TB of disk per month, has sped up our ETL considerably (we're now compute- rather than I/O-bound), and will allow us to migrate our full historical data warehouse to a single Postgres instance.

This talk will cover:

- An overview of our data problems and why we chose to go this route
- A deep dive into the different Postgres mechanisms (TOAST, arrays, composite types, SRFs) we use to make our solution work, how we put them together, and how others might apply similar techniques to their use cases
- Strengths and weaknesses of the approach, and lessons learned
- Upcoming features in Postgres that will make this approach even more attractive
- Future work/things that aren't upcoming (yet!?) that would really help us out
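To illustrate the general idea (this is a hypothetical sketch, not the speaker's actual schema): many logical rows can be packed into a single array-of-composite column, so that large values are automatically compressed and stored out of line by TOAST, while set-returning functions like `unnest()` unpack them at query time.

```sql
-- Hypothetical example: a composite type holding one logical "row" of metrics.
CREATE TYPE metric AS (
    ts          timestamptz,
    impressions bigint,
    clicks      bigint
);

-- One physical row per (site, day); the points array holds many metric
-- values. Once the array exceeds ~2 kB, Postgres TOASTs it: the value is
-- compressed and moved to out-of-line storage, shrinking the main heap.
CREATE TABLE daily_metrics (
    site_id int,
    day     date,
    points  metric[],
    PRIMARY KEY (site_id, day)
);

-- Unpack with unnest(), a set-returning function: each array element
-- becomes a row again, with the composite's fields as columns.
SELECT d.site_id, d.day, p.ts, p.impressions, p.clicks
FROM daily_metrics d, unnest(d.points) AS p;
```

The trade-off is column-store-like compression and far fewer heap tuples, at the cost of rewriting an entire array to update any single element within it.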
- 50 min
- PGConf US 2017 [PgConf.US]
- Use Cases