What Is CDC?

Change Data Capture (CDC) is a mechanism that tracks row-level changes (inserts, updates, deletes) in a database and notifies downstream systems in the order they occur. In disaster recovery scenarios, CDC is often used for real-time synchronization from a primary database to a standby one.

    source ----------> CDC ----------> sink

Apache SeaTunnel CDC

SeaTunnel CDC supports two types of synchronization:

- Snapshot Reading: Reads historical data from tables
- Incremental Tracking: Captures and reads incremental change logs from tables

Lock-Free Snapshot Sync

Why emphasize “lock-free”? Because some CDC platforms (e.g., Debezium) may lock tables during historical data sync. SeaTunnel avoids this by reading snapshots without locking. Here’s the basic snapshot read process:

    storage------------->splitEnumerator----------split---------->reader
                                ^                                   |
                                |                                   |
                                \-----------------report------------/

- Split Assignment: The splitEnumerator divides table data into multiple chunks (splits) based on a specified field (e.g., table ID or unique key) and step size.
- Parallel Processing: Each split is routed to a different reader for parallel reading. Each reader occupies one connection.
- Event Feedback: After completing a split, the reader reports progress back to the splitEnumerator.
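The split assignment step can be pictured with a minimal sketch. It assumes a numeric split key and half-open ranges; `SplitPlanner`, `Split`, `plan`, and `toSql` are hypothetical names chosen for illustration, not SeaTunnel classes, and the real enumerator may choose boundaries differently.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: every name here is hypothetical, not SeaTunnel's API.
// Assumes the split key is a numeric column.
final class SplitPlanner {

    /** One chunk of the table, covering keys in [splitStart, splitEnd). */
    static final class Split {
        final String splitId;
        final long splitStart; // inclusive
        final long splitEnd;   // exclusive

        Split(String splitId, long splitStart, long splitEnd) {
            this.splitId = splitId;
            this.splitStart = splitStart;
            this.splitEnd = splitEnd;
        }
    }

    /** Cuts the key range [minKey, maxKey] into chunks of at most stepSize keys. */
    static List<Split> plan(String table, long minKey, long maxKey, long stepSize) {
        if (stepSize <= 0) {
            throw new IllegalArgumentException("stepSize must be positive");
        }
        List<Split> splits = new ArrayList<>();
        long start = minKey;
        int index = 0;
        while (start <= maxKey) {
            // Exclusive end, capped just past the largest key.
            long end = Math.min(start + stepSize, maxKey + 1);
            splits.add(new Split(table + "-" + index++, start, end));
            start = end;
        }
        return splits;
    }

    /** The kind of range query a reader could generate for one split. */
    static String toSql(String table, String keyColumn, Split s) {
        return "SELECT * FROM " + table
                + " WHERE " + keyColumn + " >= " + s.splitStart
                + " AND " + keyColumn + " < " + s.splitEnd;
    }

    public static void main(String[] args) {
        for (Split s : plan("orders", 1, 10, 4)) {
            System.out.println(s.splitId + ": " + toSql("orders", "id", s));
        }
    }
}
```

With keys 1 to 10 and a step size of 4, this yields three splits covering [1, 5), [5, 9), and [9, 11). Each one can be handed to a separate reader, which is why the number of splits read in parallel bounds the number of database connections in use.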

Each split sent to a reader contains metadata:

    String splitId                 // Routing ID
    TableId tableId                // Table ID
    SeatunnelRowType splitKeyType  // Field type used for splitting
    Object splitStart              // Split start point
    Object splitEnd                // Split end point

The reader generates SQL statements based on this info. Before reading, it records the log position (low watermark) for the split. Once complete, it reports:

    String splitId        // Split ID
    Offset highWatermark  // Log position after split is processed

Incremental Synchronization

After snapshot reading, any changes in the source DB are continuously captured and synced in real time. Unlike the snapshot phase, this stage reads from the DB’s log (e.g., MySQL binlog) and typically uses a single-threaded reader to reduce pressure on the DB.

    data log------------->splitEnumerator----------split---------->reader
                                 ^                                   |
                                 |                                   |
                                 \-----------------report------------/

During incremental sync, all snapshot splits and tables are merged into a single split.
Metadata for the incremental split:

    String splitId
    Offset startingOffset                  // Earliest log start across all splits
    Offset endingOffset                    // Log end position; null if continuous
    List<TableId> tableIds
    Map<TableId, Offset> tableWatermarks   // Watermarks per table from snapshot phase
    List<CompletedSnapshotSplitInfo> completedSnapshotSplitInfos  // Snapshot split details

CompletedSnapshotSplitInfo includes:

    String splitId
    TableId tableId
    SeatunnelRowType splitKeyType
    Object splitStart
    Object splitEnd
    Offset watermark   // High watermark from report

The incremental phase finds the smallest watermark from snapshot splits and starts reading from that log position.

Exactly-Once Guarantee

Whether during snapshot or incremental sync, the database may still be changing. How does SeaTunnel ensure exactly-once processing?

Snapshot Phase

During snapshot reading, imagine a split is being read and data changes (e.g., insert k3, update k2, delete k1). Without special handling, updates could be missed. SeaTunnel solves this by:

- Reading the low watermark from the database log before the split
- Reading data for split {start, end}
- Recording the high watermark after the split
- If high = low, no change occurred. If high > low, a change occurred during the read.

When high > low, SeaTunnel:

- Caches snapshot data in memory
- Replays log events between the low and high watermark onto the in-memory table in log order, matching rows by primary key

The reader then reports the high watermark.

                  insert k3      update k2      delete k1
                      |              |              |
                      v              v              v
     bin log --|---------------------------------------------------|-- log offset
         low watermark                                      high watermark

     CDC read data:  k1 k3 k4
                        |
                        | replay
                        v
     Final result:   k2 k3' k4

Incremental Phase

Before starting incremental sync, SeaTunnel validates the snapshot phase. It checks for inter-split changes, that is, data changes that may have occurred between splits.
SeaTunnel handles this by:

- Finding the minimum watermark from all snapshot splits
- Starting to read from that log position
- For each log event, checking if the data was already processed in a snapshot split
- If not, it's inter-split data and will be corrected
- After all tables are validated, true incremental sync begins

              |------------filter split2-----------------|
              |----filter split1------|
    data log -|-----------------------|------------------|----------------------------------|- log offset
        min watermark         split1 watermark   split2 watermark                     max watermark

Fault Tolerance and Checkpointing

How does SeaTunnel support pause and resume? It uses the Chandy-Lamport distributed snapshot algorithm.

Assume two processes, p1 and p2: p1 has variables X1 Y1 Z1, p2 has X2 Y2 Z2:

    p1                 p2
    X1:0               X2:4
    Y1:0               Y2:2
    Z1:0               Z2:3

p1 initiates a global snapshot by recording its local state and sending a marker to p2. Before p2 receives the marker, p2 sends message M to p1.

    p1                                  p2
    X1:0     -------marker------->      X2:4
    Y1:0     <---------M----------      Y2:2
    Z1:0                                Z2:3

p2 receives the marker and records its own state. p1 then receives message M and logs it separately, since it has already taken its snapshot.

Final snapshot:

    p1        M        p2
    X1:0               X2:4
    Y1:0               Y2:2
    Z1:0               Z2:3

In SeaTunnel CDC, markers are sent to all nodes (readers, splitEnumerators, and writers), and each stores its current state for recovery.
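
The p1/p2 walkthrough can be replayed as a small single-threaded simulation. This is a sketch of the textbook two-process case, not SeaTunnel's checkpoint code: `MarkerSnapshotDemo`, `Node`, and the string-based `MARKER` message are hypothetical, and each node's inbox stands in for a FIFO channel from its peer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a two-process marker snapshot with hypothetical names.
final class MarkerSnapshotDemo {
    static final String MARKER = "MARKER";

    static final class Node {
        final Map<String, Integer> state = new LinkedHashMap<>();
        final Deque<String> inbox = new ArrayDeque<>();  // FIFO channel from the peer
        Map<String, Integer> recordedState;              // null until the local snapshot is taken
        final List<String> recordedInFlight = new ArrayList<>();
        boolean peerMarkerSeen;

        /** Records local state once, then sends a marker to the peer. */
        void snapshot(Node peer) {
            if (recordedState == null) {
                recordedState = new LinkedHashMap<>(state);
                peer.inbox.add(MARKER);
            }
        }

        /** Delivers the next message waiting in the inbox. */
        void receive(Node peer) {
            String msg = inbox.remove();
            if (MARKER.equals(msg)) {
                snapshot(peer);         // the first marker triggers the local snapshot
                peerMarkerSeen = true;  // stop recording this channel
            } else if (recordedState != null && !peerMarkerSeen) {
                // Arrived after our snapshot but was sent before the peer's:
                // it belongs to the channel state.
                recordedInFlight.add(msg);
            }
        }
    }

    static Node[] run() {
        Node p1 = new Node();
        Node p2 = new Node();
        p1.state.put("X1", 0); p1.state.put("Y1", 0); p1.state.put("Z1", 0);
        p2.state.put("X2", 4); p2.state.put("Y2", 2); p2.state.put("Z2", 3);

        p1.snapshot(p2);    // p1 records its state and sends a marker to p2
        p1.inbox.add("M");  // p2 sends M before it has received the marker
        p2.receive(p1);     // p2 gets the marker, records its state, replies with a marker
        p1.receive(p2);     // p1 gets M after its own snapshot, so M is logged
        p1.receive(p2);     // p1 gets p2's marker and stops recording the channel
        return new Node[] {p1, p2};
    }

    public static void main(String[] args) {
        Node[] nodes = run();
        System.out.println("p1: " + nodes[0].recordedState + " in flight " + nodes[0].recordedInFlight);
        System.out.println("p2: " + nodes[1].recordedState + " in flight " + nodes[1].recordedInFlight);
    }
}
```

The recorded result matches the final snapshot above: both local states are captured as they were, and M is kept as in-flight channel state on p1's side, so a restore from this snapshot does not lose it.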