Schedule - PGConf.EU 2023
A journey into postgresql logical replication
Late last year Toggl Track engineering team embarked on a journey to develop a postgres-based close to real-time OLAP database system to better serve our reporting needs. Toggl Track has been using a monolithic Postgres database since its beginnings and the goal was to progressively shift away our report queries from the transactional database creating a CDC on top of it, leveraging logical replication by using data-changing events, and applying whatever transformations are required to reach the desired - report-friendly - data model, OLAP.
We hit a number of difficulties in the process, one of them was caused by incorrect assumptions when it comes to the usage of logical replication events coming out of Postgres, as - when tapped into the production databases - our log offsets (LSNs) would go haywire due to the high transactional load, not easily replicable in dev or staging environments.
Took months of experimentation, debugging, and workarounds till we finally concluded that we were indeed losing data in production, as the issue would only present itself when the client (golang application) was restarted and when that restart coincided with out-of-order log offsets. After that, took an attempt to build our own WAL parser and a few email exchanges with pg developers to figure out the correct way to keep track of logical replication events, and keep the correct offsets in check.
There are a number of open-source projects making the same bad assumptions when it comes to logical replication event order, and there is very little if any material on how to properly do this.