How do we monitor 100+ PostgreSQL machines of TomTom's mapmaking platform in the AWS cloud
10+ years of experience with Java and RDBMS. Not really a DBA, but a software engineer responsible for delivering high-quality software that solves problems.
Expert Software Developer at TomTom since 2012, focusing on database tuning and creating scalable software.
Co-speaker Rafał Hawrylak.
Everyone working with a business-critical PostgreSQL deployment understands how important it is to set up proper monitoring. Basic checks that say "my db works ok!" are essential, and in DevOps projects everyone should know them.
However, it is also necessary to have insight into database performance based on values from the PostgreSQL system catalog views.
Alerting requires real-time metrics, but data must also be stored for root cause analysis and for implementing software improvements.
Based on our experience, we would like to answer the following questions about monitoring:
- Why does anyone need monitoring and alerting? It is not only about catching incidents.
- How to handle configuration and maintenance of a scalable PostgreSQL environment in the cloud.
- What open-source solutions are available on the market.
- Queries, logs, statistics - which metrics to monitor at the PostgreSQL level, and how.
- CPU, IO, memory - which metrics to monitor at the system level, and how.
- What happens when hundreds of databases need to be monitored.
- 50 min
- PostgresConf US 2018
- Operations and Administration