Scaling PostgreSQL Change Data Capture using Open Source Airbyte
Presented by:

Rodi Reich Zilberman
Handling Change Data Capture (CDC) in PostgreSQL for large-scale synchronization presents unique challenges, particularly during the initial snapshot process. Traditional approaches can fail when the snapshot process exceeds the WAL retention period, leading to lost changes or costly resynchronizations. This issue becomes critical in high-transaction environments or with exceptionally large tables.
To address these challenges, Airbyte, an open-source data integration platform, introduced the WAL Acquisition Synchronization System (WASS). WASS implements an iterative synchronization algorithm explicitly designed for PostgreSQL’s WAL-based architecture. Instead of completing the entire snapshot before reading changes, WASS sets a timeout for the snapshot process. When the timeout is reached, the system temporarily halts the snapshot and reads the WAL to catch up on changes. Once the CDC reads are complete, the snapshot resumes. This alternating process continues until both the snapshot and WAL-based CDC synchronization are completed, ensuring no data is lost and enabling reliable replication for large-scale PostgreSQL databases.
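The alternating loop described above can be illustrated with a small in-memory simulation. This is a minimal sketch, not Airbyte's implementation: `SimulatedSource`, `alternating_sync`, and `on_slice` are hypothetical names, the "WAL" is a plain Python list, and a per-round row budget (`rows_per_slice`) stands in for the snapshot timeout so that the example stays deterministic.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class SimulatedSource:
    """Toy stand-in for a table plus its change log (hypothetical, not a real API)."""
    rows: Dict[int, str] = field(default_factory=dict)
    wal: List[Tuple[str, int, Optional[str]]] = field(default_factory=list)

    def write(self, key: int, value: str) -> None:
        # Every insert/update changes the table and appends a log entry.
        self.rows[key] = value
        self.wal.append(("upsert", key, value))

    def delete(self, key: int) -> None:
        self.rows.pop(key, None)
        self.wal.append(("delete", key, None))


def alternating_sync(
    source: SimulatedSource,
    rows_per_slice: int,
    on_slice: Optional[Callable[[SimulatedSource, int], None]] = None,
) -> Tuple[Dict[int, str], int]:
    """Alternate bounded snapshot slices with change-log catch-up.

    rows_per_slice is an assumed stand-in for the snapshot timeout.
    on_slice simulates writes arriving while a snapshot slice is running.
    Returns the destination contents and the number of rounds used.
    """
    dest: Dict[int, str] = {}
    last_key: Optional[int] = None   # snapshot cursor (highest key copied so far)
    wal_pos = len(source.wal)        # log position recorded before the snapshot starts
    snapshot_done = False
    rounds = 0

    while not snapshot_done or wal_pos < len(source.wal):
        rounds += 1

        # Phase 1: copy a bounded slice of the table in primary-key order.
        if not snapshot_done:
            pending = sorted(
                k for k in source.rows if last_key is None or k > last_key
            )
            for key in pending[:rows_per_slice]:
                dest[key] = source.rows[key]
                last_key = key
            snapshot_done = len(pending) <= rows_per_slice
            if on_slice is not None:
                on_slice(source, rounds)  # concurrent writes during the slice

        # Phase 2: pause the snapshot and replay the log up to its current end.
        while wal_pos < len(source.wal):
            op, key, value = source.wal[wal_pos]
            if op == "upsert":
                dest[key] = value
            else:
                dest.pop(key, None)
            wal_pos += 1

    return dest, rounds
```

Because the log is drained after every slice, the unread backlog is bounded by one slice of work rather than by the duration of the full snapshot, which is the property the abstract attributes to the alternating approach. Replaying changes as idempotent upserts and deletes is a design choice of this sketch that lets a row be safely copied by the snapshot and touched by the log in either order.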
In summary, WASS integrates iterative snapshotting and real-time WAL reads to provide a scalable and resilient solution for CDC in PostgreSQL. This approach minimizes data loss risks, supports high transaction volumes, and ensures seamless replication for even the most extensive databases, making it a significant advancement in PostgreSQL data integration techniques.
The speaker is Rodi Reich-Zilberman, senior software engineer (data warehouses) at Airbyte, with more than 20 years of experience in software engineering focused on data integration.
- Date: 2025 March 20 10:30 EDT
- Duration: 20 min
- Room: Space Coast 1&2
- Conference: Postgres Conference 2025
- Language:
- Track: Ops
- Difficulty: Medium