Shyam Arjarapu

@shyam.arjarapu

Mastering MongoDB - Introducing multi-document transactions in v4.0

Support for transactions in MongoDB has been something long desired by many. With MongoDB v4.0, the wait is now over. Welcome to multi-document transactions for replica sets.

In a MongoDB application, when two entities have a parent-child relationship, such as orders and order details, you typically embed the child documents (order details) in the parent document (order). This schema design not only makes reads faster but also helps with atomicity, because a write operation on a single document is atomic. But how do you maintain atomicity when working with multiple documents?

This is one of many articles in the multi-part series Mastering MongoDB - One tip a day, created solely to help you master MongoDB by learning one tip a day. Over the next few articles, I would like to share various tips to help you answer the above question. This article discusses multi-document transactions, a new feature in MongoDB v4.0: its applications, use case scenarios, and finally some hands-on lab exercises.

Mastering — Multi-document transactions

What are Transactions

In MongoDB, a write operation on a single document is atomic, even if the operation modifies multiple embedded documents within that document. When a single write operation, such as updateMany, modifies multiple documents, the modification of each document is atomic, but the operation as a whole is not. Some use cases also require you to perform multiple write operations as a single unit. In such scenarios you need transactions to enforce atomicity across multiple write operations.
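To make the single-document case concrete, here is a minimal sketch. The helper name addLineItem and the field names details, total, qty and price are assumptions made up for illustration; orders stands for a collection handle such as db.orders.

```javascript
// Hypothetical helper: `orders` is a collection handle such as db.orders.
// Field names (details, total, qty, price) are illustrative assumptions.
// Both modifications target the same document, so they are applied atomically
// by a single updateOne call and no transaction is needed.
function addLineItem(orders, orderId, item) {
  return orders.updateOne(
    { _id: orderId },
    {
      $push: { details: item },               // append the embedded child document
      $inc: { total: item.qty * item.price }  // adjust the parent-level running total
    }
  );
}
```

In the shell this could be called as addLineItem(db.orders, 1, { sku: "A", qty: 2, price: 5 }). If the order details lived in a separate collection instead, the two writes would touch two documents and would no longer be atomic as a pair.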

Why use Transactions

Until MongoDB v4.0, the only way to emulate transactions was to implement a two-phase commit in your application. Emmanuel Olaojo wrote an article, “Fawn: transactions in MongoDB”, on two-phase commits using the Fawn module for Node.js applications. Please be aware that two-phase commits offer only transaction-like semantics: applications can still read intermediate data while a two-phase commit or rollback is in progress.

MongoDB v4.0 introduces multi-document transactions for replica sets, which can be used across multiple operations, collections, and documents. Multi-document transactions provide a globally consistent view of data and enforce all-or-nothing execution to maintain data integrity. Transactions have the following properties:

  • When a transaction is committed, all data changes made in the transaction are saved.
  • If any operation in the transaction fails, the transaction aborts.
  • When a transaction aborts, all data changes made in the transaction are discarded.
  • Until a transaction is committed, no write operations in the transaction are visible outside the transaction.
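The all-or-nothing behaviour described above can be sketched as a small wrapper. The names runInTransaction and work are hypothetical; the sketch only assumes a session object that offers startTransaction, commitTransaction and abortTransaction, and it deliberately leaves out any retry logic.

```javascript
// Hypothetical wrapper: run `work` inside a transaction on `session`.
// If `work` throws, the transaction is aborted and the error is rethrown;
// otherwise the transaction is committed and the result of `work` is returned.
function runInTransaction(session, work) {
  session.startTransaction();
  var result;
  try {
    result = work(session);
  } catch (err) {
    session.abortTransaction();  // discard every change made by `work`
    throw err;
  }
  session.commitTransaction();   // make all changes visible together
  return result;
}
```

The commit sits outside the try block on purpose, so that a failed commit is reported as-is instead of triggering an abort on a transaction that may already be finished.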

Hands-On lab exercises

This lab exercise helps you understand how to use transactions within a MongoDB shell. Since multi-document transactions are available only for replica sets, please ensure that you have at least a one-member replica set rather than a standalone mongod.

Before we get started, I want you to be aware of a few points:
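One quick way to check this from the shell is sketched below. The helper name isReplicaSetMember is made up for illustration; it relies on the isMaster response carrying a setName field only when the server is part of an initiated replica set.

```javascript
// Hypothetical helper: `db` is the shell's database handle.
// A standalone mongod reports no setName in its isMaster response.
function isReplicaSetMember(db) {
  var reply = db.isMaster();
  return typeof reply.setName === "string" && reply.setName.length > 0;
}
```

Running isReplicaSetMember(db) before the exercises gives an early hint if the shell is connected to a standalone server.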

  • You can only specify read/write (CRUD) operations on existing collections.
  • A multi-document transaction cannot include an insert operation that would result in the creation of a new collection.
  • Transactions are associated with a session.
  • At any given time, you can have at most one open transaction for a session.
  • To associate read and write operations with an open transaction, you pass the session to the operations.
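In the shell, the association between operations and a session can be sketched as follows. The helper name openSessionCollection is hypothetical; the idea is that a collection handle obtained through the session carries that session with every operation, whereas db.person does not.

```javascript
// Hypothetical helper: start a session and return a collection handle bound to it.
// Operations on `coll` belong to the session (and to its open transaction, if any);
// operations on the plain db.<collName> handle stay outside of it.
function openSessionCollection(db, collName) {
  var session = db.getMongo().startSession();
  var coll = session.getDatabase(db.getName()).getCollection(collName);
  return { session: session, coll: coll };
}
```

A typical use would be var h = openSessionCollection(db, "person"); followed by h.session.startTransaction() and writes through h.coll.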

Setup environment

First, you need an environment to play around in. If you already have a MongoDB v4.0 replica set environment, you may skip this step.

A bash script to download the MongoDB v4.0 release candidate and create a one-member replica set

The next set of exercises illustrates how commitTransaction and abortTransaction work and, most importantly, how multiple write operations with write conflicts lead to an aborted transaction. To help you understand better, I have included the output of these commands right below them as comments.

Saving data changes with commitTransaction

The MongoDB shell commands below show a person collection with some sample data in it. Notice that the document added in the transaction is not visible on the db.person collection object until the transaction on session1 is committed.

MongoDB commands illustrating that the data changes in a transaction are saved when the transaction is committed.
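A minimal sketch of such a session follows. It is not the author's original script: the function name commitDemo and the sample document are assumptions, and the person collection is assumed to exist already, since a transaction cannot create it.

```javascript
// Sketch only. Assumes `db` is the shell's database handle on a replica set
// member and that the `person` collection already exists.
function commitDemo(db) {
  var person = db.getCollection("person");  // handle outside the transaction
  var session1 = db.getMongo().startSession();
  var personInTxn = session1.getDatabase(db.getName()).getCollection("person");

  session1.startTransaction();
  personInTxn.insertOne({ _id: 3, fname: "fname-3", lname: "lname-3" });

  // Same lookup through both handles, while the transaction is still open.
  var seenInsideTxn = personInTxn.findOne({ _id: 3 }) !== null;
  var seenOutsideBeforeCommit = person.findOne({ _id: 3 }) !== null;

  session1.commitTransaction();
  var seenOutsideAfterCommit = person.findOne({ _id: 3 }) !== null;
  session1.endSession();

  return {
    seenInsideTxn: seenInsideTxn,
    seenOutsideBeforeCommit: seenOutsideBeforeCommit,
    seenOutsideAfterCommit: seenOutsideAfterCommit
  };
}
```

Given the transaction properties listed earlier, the new document should be found inside the transaction, not found through db.person before the commit, and found through db.person after it.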

Discarding data changes with abortTransaction

The MongoDB shell commands below show a person collection with some sample data in it. Notice that the document added in the transaction is never visible on the db.person collection object: it is discarded when the transaction on session1 is aborted.

MongoDB commands illustrating that the data changes in a transaction are discarded when the transaction is aborted on the session.
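The abort case can be sketched the same way. Again this is not the author's original script: abortDemo and the sample document are assumptions, and the person collection is assumed to exist.

```javascript
// Sketch only. Assumes `db` is the shell's database handle on a replica set
// member and that the `person` collection already exists.
function abortDemo(db) {
  var person = db.getCollection("person");  // handle outside the transaction
  var session1 = db.getMongo().startSession();
  var personInTxn = session1.getDatabase(db.getName()).getCollection("person");

  session1.startTransaction();
  personInTxn.insertOne({ _id: 5, fname: "fname-5", lname: "lname-5" });
  var seenOutsideBeforeAbort = person.findOne({ _id: 5 }) !== null;

  session1.abortTransaction();  // discard the insert
  var seenOutsideAfterAbort = person.findOne({ _id: 5 }) !== null;
  session1.endSession();

  return {
    seenOutsideBeforeAbort: seenOutsideBeforeAbort,
    seenOutsideAfterAbort: seenOutsideAfterAbort
  };
}
```

Both flags should come back false: the insert is hidden while the transaction is open and gone once it is aborted.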

Transactions with no write conflicts can be committed

In the code below, you may notice that multiple write operations (insert, update and delete) are invoked from multiple scopes, both inside and outside of a transaction. As long as there is no WriteConflict, you can successfully commit the transaction.

MongoDB commands illustrating transactions with no write conflicts can successfully commit
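A minimal sketch of this scenario is shown here. It is not the author's original script: noConflictDemo and the sample values are assumptions, and it presumes that the person collection already holds documents with _id 1 and _id 2. Every write touches a different document, so no two writers compete for the same one.

```javascript
// Sketch only. Assumes `db` is the shell's database handle on a replica set
// member and that `person` already contains documents with _id 1 and _id 2.
function noConflictDemo(db) {
  var person = db.getCollection("person");  // handle outside the transaction
  var session1 = db.getMongo().startSession();
  var personInTxn = session1.getDatabase(db.getName()).getCollection("person");

  session1.startTransaction();
  personInTxn.insertOne({ _id: 4, fname: "fname-4" });             // inside: new document
  person.updateOne({ _id: 1 }, { $set: { fname: "changed-1" } });  // outside: document 1
  personInTxn.deleteOne({ _id: 2 });                               // inside: document 2
  session1.commitTransaction();
  session1.endSession();

  // Read the three documents back through the plain handle.
  return {
    one: person.findOne({ _id: 1 }),
    two: person.findOne({ _id: 2 }),
    four: person.findOne({ _id: 4 })
  };
}
```

After the commit, document 1 should carry the update made outside the transaction, document 2 should be gone, and document 4 should exist.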

Transactions with write conflicts are aborted

If two or more write operations modify the same document from different scopes, the data changes in one transaction conflict with the data changes in the others. When such a WriteConflict occurs, the conflicting operation fails with a TransientTransactionError and its transaction is aborted. These writes could be any combination of insert, update and delete operations. The example below shows two delete operations on the same document invoked in two different transactions.

MongoDB commands illustrating multiple transactions with write conflicts are aborted
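The conflict can be sketched with two sessions. This is not the author's original script: conflictDemo and hasTransientLabel are hypothetical names, and reading the label from an errorLabels array on the thrown error is an assumption about how the shell surfaces it. The person collection is assumed to contain a document with _id 1.

```javascript
// Hypothetical helper: true when the error carries the TransientTransactionError
// label (assumed to be exposed as an `errorLabels` array on the error object).
function hasTransientLabel(err) {
  return Array.isArray(err.errorLabels) &&
         err.errorLabels.indexOf("TransientTransactionError") !== -1;
}

// Sketch only. Assumes `person` exists and contains a document with _id 1.
function conflictDemo(db) {
  var session1 = db.getMongo().startSession();
  var session2 = db.getMongo().startSession();
  var person1 = session1.getDatabase(db.getName()).getCollection("person");
  var person2 = session2.getDatabase(db.getName()).getCollection("person");

  session1.startTransaction();
  session2.startTransaction();
  person1.deleteOne({ _id: 1 });    // first writer on document 1

  var secondFailed = false;
  var transient = false;
  try {
    person2.deleteOne({ _id: 1 });  // same document from a second transaction
  } catch (err) {
    secondFailed = true;
    transient = hasTransientLabel(err);
    session2.abortTransaction();
  }

  session1.commitTransaction();
  if (!secondFailed) { session2.commitTransaction(); }
  session1.endSession();
  session2.endSession();

  var stillExists = db.getCollection("person").findOne({ _id: 1 }) !== null;
  return { secondFailed: secondFailed, transient: transient, stillExists: stillExists };
}
```

An application would normally retry the aborted transaction from the start when it sees the transient label; that part is left out here to keep the sketch short.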

Summary

Support for multi-document transactions on replica sets is just the beginning. Future releases may address transactions across sharded deployments and the various isolation levels that you may have been exposed to in relational databases. I want to remind you of an important point:

“Just because you now have support for transactions, you should not design your data model around third normal form. You must always have an effective MongoDB schema design to ensure your application is highly performant.”

Multi-document transactions incur a greater performance cost than single-document writes. So, “What is the performance cost of using transactions?” Great question! But that’s a topic for another day.

Hopefully, you learned something new today as you scale the path to “Mastering MongoDB - One tip a day”.
