Uncage the elephant: distributed PostgreSQL
Started hacking on multidimensional indexes in Postgres back in 2012. Since then, has also worked on full-text search, two-phase commit, logical decoding, and distributed transactions.
Recently, we have seen a rise of databases that claim to provide high availability and scalability without sacrificing classical transactional guarantees. In most cases, such systems are developed from scratch, so their internal architecture can be designed around the distributed algorithms they rely on. PostgreSQL, in contrast, is a 30-year-old database: some of its architectural decisions are a good foundation for building a distributed database, while others are serious obstacles.
It turns out that integrating distributed transactions into PostgreSQL is quite straightforward, and most of the infrastructure is already present. However, highly available replication based on a consensus protocol such as Raft is hard to integrate properly because of the limitations and peculiarities of the existing replication and logging mechanisms. Developing a custom replication protocol therefore becomes a viable option, despite the difficulty of verifying and model checking such a protocol.
In this session, we'll talk about the efforts of the PostgreSQL community to adopt ideas from Spanner-like databases, particularly isolated distributed transactions and highly available replication. We'll discuss the tradeoff between changing the existing system to accommodate new functionality and adapting the new functionality to fit the existing system. In the replication part, we'll also share our experience with TLA+ model checking of the discussed protocols.
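As a rough illustration of the "infrastructure already present" point, PostgreSQL ships the SQL commands `PREPARE TRANSACTION`, `COMMIT PREPARED`, and `ROLLBACK PREPARED`, which are the building blocks of two-phase commit. The sketch below is not the design presented in the talk; it is a minimal coordinator loop, assuming a hypothetical in-memory `Participant` class that records statements instead of talking to a real server. The names `Participant`, `can_prepare`, and `two_phase_commit` are invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical in-memory stand-in for one PostgreSQL node."""
    name: str
    can_prepare: bool = True
    log: list = field(default_factory=list)

    def execute(self, sql: str) -> bool:
        # Record the statement instead of sending it to a real server.
        self.log.append(sql)
        # In this toy model, only PREPARE TRANSACTION is allowed to fail.
        if sql.startswith("PREPARE TRANSACTION"):
            return self.can_prepare
        return True


def two_phase_commit(gid: str, participants: list) -> bool:
    """Commit on every node, or roll back on every node that prepared."""
    prepared = []
    # Phase 1: ask each node to durably prepare its local transaction.
    for node in participants:
        if node.execute(f"PREPARE TRANSACTION '{gid}'"):
            prepared.append(node)
        else:
            # One refusal aborts the whole distributed transaction.
            for p in prepared:
                p.execute(f"ROLLBACK PREPARED '{gid}'")
            return False
    # Phase 2: every node voted yes, so the decision is commit.
    for node in participants:
        node.execute(f"COMMIT PREPARED '{gid}'")
    return True
```

Against a real cluster, each `execute` call would go through a database driver, and prepared transactions would need to be enabled by setting `max_prepared_transactions` to a nonzero value. The sketch also ignores coordinator failure between the two phases and cross-node snapshot isolation, which is where most of the difficulty lies.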
- 20 min
- Postgres Conference 2020
- Distributed SQL