What Are Conflict-free Replicated Data Types (CRDTs)?

Written by krishnansrinath | Published 2021/05/15

TLDR The fundamental idea is this: You have data stored on multiple replicas, and you need to deal with conflicts across those replicas. CRDTs solve this fundamental problem in data storage and make collaboration seamless and conflict-free. There are two categories of CRDTs: state-based and operation-based. Operation-based CRDTs propagate state by transmitting only the update operations. State-based CRDTs send their full local state to other replicas.

What do these things have in common?
(1) Syncing your calendar, notes, and contacts from a mobile device to your other devices.
(2) Team members collaborating through software such as Google Docs or Trello.
(3) Large-scale data storage and processing systems on a public cloud like Amazon AWS, Microsoft Azure, or Google Cloud.
The answer is that all such systems need to deal with the fact that data may be modified concurrently. The fundamental idea is this: You have data. This data is stored on multiple replicas. You need to deal with conflicts across those replicas.
Broadly speaking, there are two possible ways of dealing with such data modifications:
  • Strongly consistent replication: In this model, the replicas coordinate with each other to decide when and how to apply the modifications. This approach enables strong consistency models such as serializable transactions and linearizability. However, waiting for this coordination reduces the performance of these systems; moreover, it is impossible to make any data changes on a replica while it is disconnected from the rest of the system (e.g. mobile device with intermittent connectivity).
  • Optimistic replication: In this model, users may modify the data on any replica independently of any other replica, even if the replica is offline or disconnected from the others. This approach enables maximum performance and availability, but it can lead to conflicts when multiple clients or users concurrently modify the same piece of data. These conflicts then need to be resolved when the replicas communicate with each other.
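To make the conflict concrete, here is a minimal sketch of two replicas that edit the same integer while disconnected and then sync by simply overwriting each other. The `naive_sync` function and the variable names are hypothetical, chosen only for illustration; real systems resolve conflicts in many different ways.

```python
def naive_sync(local_value, remote_value):
    """Hypothetical sync rule: the remote value simply overwrites the local one."""
    return remote_value


start = 5

# Both replicas begin with the same value and are edited while disconnected.
replica_a = start + 10   # device A adds 10
replica_b = start - 20   # device B subtracts 20

# The replicas exchange values at the same time and each adopts the other's.
a_after_sync = naive_sync(replica_a, replica_b)
b_after_sync = naive_sync(replica_b, replica_a)

# What the users actually meant: both edits applied to the starting value.
intended = start + 10 - 20

print(a_after_sync, b_after_sync, intended)  # -15 15 -5
```

Neither replica ends up with the intended value, and the two replicas do not even agree with each other. This is the kind of conflict that has to be resolved when replicas communicate.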

Enter Conflict-free Replicated Data Types

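CRDTs are data structures built for optimistic replication: any replica can be updated independently, and concurrent updates are merged automatically so that all replicas converge to the same state.
There are two categories of CRDTs: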
  1. State-based 
  2. Operation-based
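Operation-based CRDTs (also called commutative replicated data types, or CmRDTs) propagate state by transmitting only the update operations. For example, a CmRDT of a single integer might broadcast the operations (+10) or (−20). Replicas receive the updates and apply them locally. Because concurrent operations are designed to commute, replicas can apply them in different orders and still arrive at the same value.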
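The single-integer example can be sketched in a few lines of Python. This is an illustrative sketch, not a production design: the `OpCounter` class and the `broadcast` helper are hypothetical names, and the sketch assumes the network delivers every operation to every other replica exactly once.

```python
from dataclasses import dataclass, field


@dataclass
class OpCounter:
    """Sketch of an operation-based counter replica (hypothetical names)."""
    value: int = 0
    outbox: list = field(default_factory=list)  # operations not yet broadcast

    def update(self, delta: int) -> None:
        """Apply a local update and queue the operation for other replicas."""
        self.value += delta
        self.outbox.append(delta)

    def receive(self, delta: int) -> None:
        """Apply an operation that another replica broadcast."""
        self.value += delta


def broadcast(sender: OpCounter, receivers: list) -> None:
    """Assumed transport: deliver each queued operation exactly once."""
    for delta in sender.outbox:
        for replica in receivers:
            replica.receive(delta)
    sender.outbox.clear()


a, b = OpCounter(), OpCounter()
a.update(+10)      # replica A applies (+10) locally
b.update(-20)      # replica B applies (-20) locally, concurrently

broadcast(a, [b])  # B receives (+10)
broadcast(b, [a])  # A receives (-20)

print(a.value, b.value)  # -10 -10
```

Only the small operations travel over the network. The replicas applied them in different orders, yet both converge to −10 because integer addition commutes.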

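State-based CRDTs (also called convergent replicated data types, or CvRDTs) send their full local state to other replicas, which merge the received state into their own.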
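A state-based version of the same integer can be sketched as follows. The article does not specify how received state is combined, so this sketch assumes a well-known design, the PN-Counter, in which each replica tracks its own increments and decrements and a merge takes the per-replica maximum. The class and method names are hypothetical.

```python
class PNCounter:
    """Sketch of a state-based counter (PN-Counter design, hypothetical names)."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.incs = {}  # replica id -> total amount added by that replica
        self.decs = {}  # replica id -> total amount subtracted by that replica

    def update(self, delta):
        """Record a local update against this replica's own entry."""
        totals = self.incs if delta >= 0 else self.decs
        totals[self.replica_id] = totals.get(self.replica_id, 0) + abs(delta)

    @property
    def value(self):
        return sum(self.incs.values()) - sum(self.decs.values())

    def state(self):
        """Full local state, as it would be sent to other replicas."""
        return dict(self.incs), dict(self.decs)

    def merge(self, remote_state):
        """Merge a received state by taking the per-replica maximum."""
        remote_incs, remote_decs = remote_state
        for local, remote in ((self.incs, remote_incs), (self.decs, remote_decs)):
            for rid, total in remote.items():
                local[rid] = max(local.get(rid, 0), total)


a, b = PNCounter("a"), PNCounter("b")
a.update(+10)
b.update(-20)

a.merge(b.state())
b.merge(a.state())
b.merge(a.state())  # receiving the same state twice changes nothing

print(a.value, b.value)  # -10 -10
```

Taking the maximum makes the merge insensitive to ordering and to duplicate deliveries, which is why this style tolerates an unreliable network at the cost of sending more data per message.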

Real-world applications

Collaborative note-taking applications like Nimbus Notes use CRDTs for collaborative editing. Apple implements CRDTs in the Notes app for syncing offline edits between multiple devices.
Navigation applications like TomTom use CRDTs to synchronize navigation data between a user's devices.
Distributed databases like Riak and Redis use CRDTs to implement globally distributed databases.
In a world where most of the applications that we use on the internet are distributed in nature, CRDTs solve a fundamental problem in data storage and make collaboration seamless and conflict-free.

Written by krishnansrinath | Product builder | People Developer | Engineering Management
Published by HackerNoon on 2021/05/15