Cross-site Replication for Massively Parallel Postgres
Shivram Mani is a distributed systems enthusiast and a committer to projects including Greenplum, PXF, and Apache HAWQ. In the past he worked on data munging at Google and on web search federation and search analytics at Yahoo. Currently at VMware, he contributes to areas ranging from Greenplum external connectors (Hadoop, Spark, cloud, and relational datastores) to cluster-wide backup/restore and cross-site replication.
Postgres provides several alternatives for replication, including logical replication, synchronous and asynchronous WAL-based replication, and continuous-archive-based recovery. All of these are well suited to standalone instances, with the latter two a better fit when the replica is geographically remote. Greenplum is a massively parallel, shared-nothing distributed database in which multiple Postgres instances are clustered together for analytical workloads. The challenge of maintaining a cross-site replica of such an MPP database is keeping every Postgres instance globally consistent with the others.
In this talk we will discuss the practical challenges with the available alternatives for cross-site cluster replication and how our solution addresses them. We will demonstrate how Greenplum uses Postgres restore points (used for point-in-time recovery) as a building block to provide incremental, consistent, cluster-wide replication with a very low RTO (recovery time objective). Furthermore, we will discuss how these global restore points work alongside other distributed transactions and allow users to control the RPO (recovery point objective).
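To make the restore-point idea concrete, here is a minimal toy sketch in Python. It is not Greenplum's implementation: the `Segment` and `Cluster` classes, the tuple-based log records, and the helper functions are all hypothetical names invented for illustration. The sketch assumes that a cluster-wide restore point is written to every instance's log without interleaving with a distributed commit, and that the remote site replays each instance up to the same named marker (in stock Postgres, a named restore point is created with `pg_create_restore_point` and targeted with `recovery_target_name`). The second half shows what goes wrong when a marker lands in the middle of a distributed commit.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One Postgres instance, reduced to an append-only list of log records."""
    name: str
    wal: list = field(default_factory=list)

    def replay_until(self, label):
        """Return the records a replica applies when it stops at the marker."""
        marker = ("RESTORE_POINT", label)
        if marker not in self.wal:
            raise ValueError(f"{self.name} has no restore point {label!r}")
        return self.wal[: self.wal.index(marker)]


class Cluster:
    """Hypothetical coordinator over several segments (toy model only)."""

    def __init__(self, segment_names):
        self.segments = {n: Segment(n) for n in segment_names}
        self.participants = {}  # txid -> set of segment names it wrote to

    def commit_distributed(self, txid, segment_names):
        # Assumption: a distributed commit reaches all of its segments
        # before any restore point marker is written.
        self.participants[txid] = set(segment_names)
        for n in segment_names:
            self.segments[n].wal.append(("COMMIT", txid))

    def create_global_restore_point(self, label):
        # The same named marker is appended to every segment's log.
        for seg in self.segments.values():
            seg.wal.append(("RESTORE_POINT", label))


def recovered_txids(cluster, label):
    """Per segment, the committed transactions visible after replaying to label."""
    return {
        name: sorted(rec[1] for rec in seg.replay_until(label) if rec[0] == "COMMIT")
        for name, seg in cluster.segments.items()
    }


def is_globally_consistent(cluster, label):
    """True if every visible transaction is visible on all segments it touched."""
    visible = recovered_txids(cluster, label)
    return all(
        txid in visible[other]
        for txids in visible.values()
        for txid in txids
        for other in cluster.participants[txid]
    )


# Coordinated case: markers never split a distributed commit.
cluster = Cluster(["seg0", "seg1"])
cluster.commit_distributed(1, ["seg0", "seg1"])
cluster.create_global_restore_point("rp1")
cluster.commit_distributed(2, ["seg0", "seg1"])
cluster.commit_distributed(3, ["seg1"])
cluster.create_global_restore_point("rp2")
cluster.commit_distributed(4, ["seg0"])  # after rp2, so not yet recoverable

# Uncoordinated case: the marker lands between the two halves of commit 9.
skewed = Cluster(["seg0", "seg1"])
skewed.participants[9] = {"seg0", "seg1"}
skewed.segments["seg0"].wal.append(("COMMIT", 9))
skewed.create_global_restore_point("uncoordinated")
skewed.segments["seg1"].wal.append(("COMMIT", 9))

print(recovered_txids(cluster, "rp2"))
print(is_globally_consistent(cluster, "rp2"))
print(is_globally_consistent(skewed, "uncoordinated"))
```

In this model, work committed after the most recent restore point (transaction 4 above) is what a failover would lose, so how often restore points are created is the knob that bounds the RPO. Because the replica only ever replays forward from one marker to the next, recovery is incremental.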
- 50 min
- Postgres Conference 2020
- Distributed SQL