Some of the popular implementations of a Kanban board are the following:
- Trello
- JIRA board
- Microsoft Planner
- Asana board
Requirements
- The user can create lists and assign tasks on the Kanban board
- The user can see changes on the Kanban board in near real-time
- The user can make offline changes on the Kanban board
- The Kanban board is distributed
Data storage
Database schema
- The primary entities are the boards, the lists, and the tasks tables
- The relationship between the boards and lists tables is 1-to-many
- The relationship between the lists and tasks tables is 1-to-many
- An additional table is created to track the versions of the board
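As a rough illustration of the entities above, the schema might be modeled with Python dataclasses. The class and field names (`TaskList`, `position`, `version`, `updated_at`, and so on) are assumptions made for this sketch, not a prescribed schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    task_id: str
    list_id: str        # parent list: one list has many tasks
    title: str
    position: int = 0   # hypothetical ordering field within the list


@dataclass
class TaskList:
    list_id: str
    board_id: str       # parent board: one board has many lists
    name: str
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Board:
    board_id: str
    name: str
    lists: List[TaskList] = field(default_factory=list)


@dataclass
class BoardVersion:
    board_id: str
    version: int        # hypothetical counter, incremented on every change
    updated_at: float   # Unix timestamp of the change


# Build a tiny board: one list holding one task.
board = Board(board_id="b1", name="Sprint board")
todo = TaskList(list_id="l1", board_id=board.board_id, name="To do")
todo.tasks.append(Task(task_id="t1", list_id=todo.list_id, title="Write docs"))
board.lists.append(todo)
```

In a document store, the nested `lists` and `tasks` could be embedded in a single board document, which is one way to read the 1-to-many relationships described above.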
Type of data store
- A document store such as MongoDB persists the task metadata; it suits denormalized data because subfields of a document can be queried and indexed
- The transient data (board activity level) is stored on a cache server such as Redis
- The download queue for board activity is implemented using a message queue such as Apache Kafka
High-level design
- The changes on the Kanban board are synchronized in real-time through a WebSocket connection
- The offline client synchronizes the delta of changes when connectivity is reacquired
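The offline delta described above can be sketched as a small client-side change log. This is a minimal sketch under stated assumptions: `Change`, `OfflineChangeLog`, and their methods are hypothetical names, and timestamps come from the client clock.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Change:
    timestamp: float   # client-side time of the edit
    task_id: str
    field_name: str
    value: str


class OfflineChangeLog:
    """Hypothetical client-side log of edits made while offline."""

    def __init__(self) -> None:
        self._changes: List[Change] = []
        self._last_synced = 0.0  # timestamp of the newest change already sent

    def record(self, change: Change) -> None:
        self._changes.append(change)

    def delta(self) -> List[Change]:
        """Return the changes not yet sent, oldest first."""
        pending = [c for c in self._changes if c.timestamp > self._last_synced]
        return sorted(pending, key=lambda c: c.timestamp)

    def mark_synced(self, sent: List[Change]) -> None:
        """Remember the newest change the server has acknowledged."""
        if sent:
            self._last_synced = max(c.timestamp for c in sent)


log = OfflineChangeLog()
log.record(Change(2.0, "t1", "title", "Write docs v2"))
log.record(Change(1.0, "t1", "title", "Write docs"))

first_delta = log.delta()        # both edits, ordered by timestamp
log.mark_synced(first_delta)     # connectivity is back, the delta was sent

log.record(Change(3.0, "t2", "title", "Review PR"))
second_delta = log.delta()       # only the edit made after the last sync
```

Only the changes newer than the last acknowledged timestamp are sent, so the payload stays small after a long offline period. A production client would probably prefer server-issued sequence numbers over wall-clock timestamps, since client clocks can drift.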
Workflow
- Client A opens a WebSocket connection through the load balancer to push changes in real time
- The load balancer (HAProxy) uses the round-robin algorithm to distribute client connections across the available WebSocket servers
- The changes on the Kanban board are persisted on the document store
- Consistent hashing (key = board ID) is used to delegate the WebSocket connection to the relevant publish-subscribe (pub-sub) server
- The pub-sub server (Apache Kafka) forwards the change event to the WebSocket servers subscribed to that board
- The WebSocket server fetches the changes from the document store
- The WebSocket connection provides full-duplex communication with the other listening clients through the load balancer
- Client B receives the changes on the Kanban board
- The cache server stores the transient metadata such as the activity level of a session or the temporary authentication key
- The download queue to replay the changes in the sequential order is implemented using a message queue
- The changes are asynchronously propagated to the followers (replicas) of the document store to achieve eventual consistency
- The CDN serves the single-page dynamic application to the client to improve latency
- The single-page application is cached on the browser to improve the latency of the subsequent requests
- An event-driven architecture might be a good choice for the instant propagation of updates
- The client invokes the server logic through a thin wrapper over a WebSocket connection
- The LRU policy is used to evict entries from the cache servers
- The document store makes it relatively trivial to run different versions of the Kanban board against the same database without major DB schema migrations
- The document store is replicated using the leader-follower topology
- The online clients fetch the changes from the leader document store
- The offline clients fetch the changes from the followers of the document store
- The offline clients store the time-stamped changeset locally and send the delta of the changeset when the connectivity is reacquired
- The changeset is replayed in sequential order to prevent hierarchy violations, such as a task being applied before its parent list exists
- The client synchronizes only the recently viewed and the starred boards to improve the performance
- The client must establish a new WebSocket connection (over TCP) when a server fails
- The last-write-wins policy is used to resolve conflicts on the board
- Exponential backoff must be implemented on the synchronization service to improve the fault tolerance
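The consistent-hashing step in the workflow above (key = board ID) can be sketched with a small hash ring. This is one possible illustration, not the article's implementation: the `HashRing` class, the virtual-node count, and the server names are assumptions.

```python
import bisect
import hashlib
from typing import List, Tuple


class HashRing:
    """Hypothetical hash ring mapping a board ID to a pub-sub server."""

    def __init__(self, servers: List[str], vnodes: int = 100) -> None:
        # Place several virtual nodes per server on the ring to even out load.
        self._ring: List[Tuple[int, str]] = []
        for server in servers:
            for i in range(vnodes):
                self._ring.append((self._hash(f"{server}#{i}"), server))
        self._ring.sort()
        self._keys = [h for h, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)

    def server_for(self, board_id: str) -> str:
        # Walk clockwise to the first virtual node after the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_right(self._keys, self._hash(board_id))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["pubsub-1", "pubsub-2", "pubsub-3"])
assigned = ring.server_for("board-42")  # same board ID always maps to the same server
```

Because every update for a board hashes to the same server, all subscribers of that board meet at one place. When a server is removed, only the boards that were assigned to it move elsewhere.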
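The exponential backoff mentioned in the workflow could look roughly like the following retry helper. It is a sketch under assumptions: `backoff_delays`, `sync_with_retry`, the default delays, and the use of random jitter are illustrative choices, not part of the design above.

```python
import random
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def backoff_delays(base: float = 0.5, cap: float = 30.0,
                   attempts: int = 5) -> Iterator[float]:
    """Yield capped exponential delays: base, 2*base, 4*base, ..."""
    for attempt in range(attempts):
        yield min(cap, base * (2 ** attempt))


def sync_with_retry(sync: Callable[[], T], attempts: int = 5,
                    base: float = 0.5, cap: float = 30.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `sync` until it succeeds, backing off after each ConnectionError."""
    delays = list(backoff_delays(base, cap, attempts))
    for attempt, delay in enumerate(delays):
        try:
            return sync()
        except ConnectionError:
            if attempt == len(delays) - 1:
                raise  # out of retries, surface the last error
            # Jitter: wait a random amount up to the current delay so that
            # many reconnecting clients do not retry at the same instant.
            sleep(random.uniform(0, delay))
    raise ValueError("attempts must be at least 1")
```

A client would wrap its synchronization call, for example `sync_with_retry(push_changeset)`, where `push_changeset` is a hypothetical function that raises `ConnectionError` while the server is unreachable.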