Do you vacuum every day?
Hannu has been working with PostgreSQL since it was called Postgres95 (and also played around with Postgres 4.2, without the "SQL", a little before that). His oldest surviving post in the postgresql-hackers@ mailing list archives is from January 1998, proposing the use of an index for fast ORDER BY queries with LIMIT.
Hannu was the first DBA at Skype, where he wrote patches that let VACUUM work on more than one table in parallel and invented the sharding and remote call language pl/proxy to make it easy to use PostgreSQL in an infinitely scalable way. During that time he also participated in the design of other enterprise features such as CREATE INDEX CONCURRENTLY, partitioned tables, and anything else that was required to run PostgreSQL 24/7.
After Skype he did 10+ years of PostgreSQL consulting all over the world as part of 2ndQuadrant.
For the last three years he has been a PostgreSQL Database Engineer at Google, working mostly with Cloud SQL.
PostgreSQL has an amazing MVCC architecture, but it falls over in a small number of well-known cases (and a new one we recently found with temp tables). I have been a happy PostgreSQL user since 1995, and it has been possible to run global-scale databases despite the MVCC limitations, if you really knew how it works, at least since 2004/5, when we built the Skype database backend on PostgreSQL and scaled it to hundreds of millions of concurrent users.
In the beginning we did tricks like killing the vacuum of a large table every few minutes to let small tables be vacuumed, and I also wrote a Python script that could shrink/repack tables online purely from the client side. Then I wrote a patch to let PostgreSQL vacuum more than one table in parallel, making the first trick unnecessary.
Fast forward to the present: while the automation and autotuning of vacuuming have hugely advanced, there are still some rare cases where untuned and unmonitored vacuuming can cause problems. This talk will be about these cases, and about how running PostgreSQL VMs at scale has finally made it unavoidable to deal with them, because what is “rare” for a single user, or even a busy consultant, becomes a daily occurrence “at scale”. And you really do not want to be told to “start PostgreSQL in single-user mode” to fix things.
- 2022 April 7 16:40 PDT
- 20 min
- Silicon Valley 2022