What is the optimal value of shared_buffers for your database? A use case of data-driven optimization based on database experiments
More than 17 years of experience with various DBMSes, including more than 13 years with PostgreSQL. M.S. from MIPT and ISP RAS, specializing in database systems.
Founder of Postgres.ai.
Founder of #RuPostgres (PostgreSQL user group for Russian-speaking users, the second largest on Meetup.com, with 2000+ members).
Committee Chair for Database/Architecture Sections of Highload++, RITFest, and PGDay Russia.
Russian press contact at PostgreSQL Global Development Group.
What is the optimal value of shared_buffers for your database and workload? 8 GiB? Or 16? Is it worth using 70% of RAM if you run Postgres 11 on a server with 512 GiB of memory?
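To make the question concrete, here is a minimal sketch of how shared_buffers is usually inspected and changed; the 16GB value is purely illustrative, not a recommendation:

```sql
-- Check the current setting
SHOW shared_buffers;

-- Change it persistently (ALTER SYSTEM is available since Postgres 9.4);
-- the value below is only an example
ALTER SYSTEM SET shared_buffers = '16GB';

-- shared_buffers cannot be changed on the fly: restart the server
-- (pg_ctl restart or your service manager) for the change to take effect
```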
Of course, you could try dozens of different shared_buffers values directly in production, but if your database is mission-critical for your business, that is not an option.
To find the optimal value, we will conduct database experiments. We’ll discuss:
- how to reproduce the production workload in the “lab” environment,
- how to reduce the duration of a single experimental run,
- what to measure (latency? throughput? both? something else?) and how (see the sketch after this list),
- how the optimal value depends on workload, and finally,
- how machine learning can help to predict the optimal value even without experiments!
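As one illustration of the measurement side, here is a sketch using the pg_stat_statements extension to compare per-query latency and call counts between experimental runs. It assumes pg_stat_statements is listed in shared_preload_libraries; note that the column is named total_time in Postgres 11 and total_exec_time in Postgres 13 and later.

```sql
-- Assumes the pg_stat_statements extension is available; it aggregates
-- call counts and total execution time per query, giving both a
-- throughput view and a latency view of a run.
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Reset counters at the start of an experimental run
SELECT pg_stat_statements_reset();

-- At the end of the run, look at the heaviest queries
SELECT query,
       calls,
       round(total_time::numeric / calls, 2) AS avg_latency_ms,
       round(total_time::numeric, 2)         AS total_time_ms
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;
```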
During our journey to find the optimal shared_buffers value, we will use Nancy CLI, an open-source framework we built specifically to automate running large numbers of database experiments in a systematic way.
In the talk, we'll also dive into technical details of the shared buffers implementation in Postgres, discussing why values of shared_buffers that are too low or too high lead to much worse performance than one might expect.
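As a small teaser for that part, here is a hedged sketch of how the contents of the buffer cache can be inspected with the pg_buffercache extension; the 8192-byte block size is the default build-time value.

```sql
-- Assumes the pg_buffercache extension is available:
-- which relations occupy the most shared buffers right now?
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

SELECT c.relname,
       count(*)                         AS buffers,
       pg_size_pretty(count(*) * 8192)  AS buffered   -- 8 KiB default block size
FROM pg_buffercache b
JOIN pg_class c
  ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database
                            WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;
```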
- 2019 March 21 09:00 EDT
- 50 min
- Postgres Conference
- Use Cases