Mastering MongoDB - Introducing multi-document transactions in v4.0

Shyam Arjarapu (@shyam.arjarapu)

Support for transactions in MongoDB has long been desired by many. With MongoDB v4.0, the wait is now over. Welcome to multi-document transactions for replica sets.
In a MongoDB application, when you have a parent-child relationship between two entities, such as orders and order details, you typically embed the child documents (order details) in the parent document (order). Such a schema design not only helps with faster reads, but also helps meet atomicity requirements, because a write operation on a single document is atomic. However, how do you maintain atomicity while working with multiple documents?
This is one of the many articles in the multi-part series, Mastering MongoDB - One tip a day, created solely for you to master MongoDB by learning ‘one tip a day’. Over a few articles in the series, I would like to give various tips to help you answer the above question. This article discusses multi-document transactions, a new feature in MongoDB v4.0: its applications, use case scenarios and, finally, some hands-on lab exercises.

Mastering — Multi-document transactions

What are Transactions

In MongoDB, a write operation on a single document is atomic, even if the operation modifies multiple embedded documents within that document. When a single write operation, like updateMany, modifies multiple documents, the modification of each document is atomic, but the operation as a whole is not. Some use cases also require you to perform multiple write operations as a single atomic unit. In such scenarios you need transactions to enforce atomicity across multiple write operations.

Why use Transactions

Until MongoDB v4.0, the only way you could emulate transactions was by implementing a two-phase commit in your application. Emmanuel Olaojo wrote an article, “Fawn: transactions in MongoDB”, on two-phase commits using the Fawn module for Node.js applications. Please be aware that two-phase commits can only offer transaction-like semantics. While using a two-phase commit, it is possible for applications to return intermediate data during the two-phase commit or rollback.
MongoDB v4.0 introduces multi-document transactions for replica sets, which can be used across multiple operations, collections, and documents. Multi-document transactions provide a globally consistent view of data and enforce all-or-nothing execution to maintain data integrity. Transactions have the following properties:
  1. When a transaction is committed, all data changes made in the transaction are saved.
  2. If any operation in the transaction fails, the transaction aborts.
  3. When a transaction aborts, all data changes made in the transaction are discarded.
  4. Until a transaction is committed, no write operations in the transaction are visible outside the transaction.
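The first three properties can be sketched as a small wrapper. This is only an illustration, not part of MongoDB: runInTransaction is a hypothetical helper name, and it assumes the session object exposes startTransaction, commitTransaction and abortTransaction, as the shell sessions used later in this article do.

```javascript
// Hypothetical helper (not a MongoDB API): run `work` as one all-or-nothing unit.
// `session` is assumed to expose startTransaction(), commitTransaction()
// and abortTransaction(), like the shell session objects used in this article.
function runInTransaction(session, work) {
    session.startTransaction({
        readConcern: {level: 'snapshot'},
        writeConcern: {w: 'majority'}
    });
    var result;
    try {
        // every read and write inside `work` must go through `session`
        result = work(session);
    } catch (error) {
        // properties 2 and 3: an operation failed, so discard all changes
        session.abortTransaction();
        throw error;
    }
    // property 1: all changes made by `work` are saved together
    session.commitTransaction();
    return result;
}

// Possible use in the shell (assumes a running replica set, not executed here):
// var session = db.getMongo().startSession();
// runInTransaction(session, function (s) {
//     var coll = s.getDatabase('test').getCollection('person');
//     coll.insert({"_id": 3, "fname": "fname-3", "lname": "lname-3"});
// });
// session.endSession();
```

The commit sits outside the try block on purpose, so that a failed commit is reported to the caller rather than followed by an abort call.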

Hands-On lab exercises

This lab exercise helps you understand how to use transactions within the MongoDB shell. Since multi-document transactions are available only for replica sets, please ensure that you have a replica set with at least one member rather than a standalone mongod.
Before we get started, I want you to be aware of a few points:
  1. You can only specify read/write (CRUD) operations on existing collections.
  2. A multi-document transaction cannot include an insert operation that would result in the creation of a new collection.
  3. Transactions are associated with a session.
  4. At any given time, you can have at most one open transaction for a session.
  5. To associate read and write operations with an open transaction, you pass the session to the operations.
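Because of points 1 and 2, a collection has to exist before a transaction writes to it. A minimal sketch of that check follows. The helper name ensureCollectionExists is made up for illustration, and the database argument is assumed to behave like the shell's db object, with getCollectionNames() and createCollection(name).

```javascript
// Hypothetical helper: make sure a collection exists BEFORE starting a
// transaction, since an insert inside a transaction cannot create it.
// `database` is assumed to behave like the shell's `db` object.
function ensureCollectionExists(database, name) {
    var existing = database.getCollectionNames();
    if (existing.indexOf(name) !== -1) {
        return false;  // nothing to do, the collection is already there
    }
    database.createCollection(name);
    return true;       // the collection was created by this call
}

// Possible use in the shell, outside of any transaction (not executed here):
// ensureCollectionExists(db, 'person');
```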

Setup environment

First, you need an environment to experiment with. If you already have a MongoDB v4.0 replica set environment, you may skip this step.
curl -O https://downloads.mongodb.com/osx/mongodb-osx-x86_64-enterprise-4.0.0-rc0.tgz
tar -xvzf mongodb-osx-x86_64-enterprise-4.0.0-rc0.tgz
rm mongodb-osx-x86_64-enterprise-4.0.0-rc0.tgz
mv mongodb-osx-x86_64-enterprise-4.0.0-rc0  v4.0.0-rc0
mkdir data
v4.0.0-rc0/bin/mongod --dbpath data --logpath data/mongod.log --fork --replSet rs0 --port 38000
v4.0.0-rc0/bin/mongo --port 38000 --eval "rs.initiate()"
A bash script to download the MongoDB v4.0 release candidate and create a 1-member replica set
The next set of exercises illustrates how commitTransaction and abortTransaction work and, most importantly, how write conflicts between multiple write operations lead to an aborted transaction. To help you follow along, I have included the output of each command right below it as a comment.

Saving data changes with commitTransaction

The MongoDB shell commands below show a person collection with some sample data in it. Notice that the document added in the transaction is not visible through the db.person collection object until the transaction on session1 is committed.
// v4.0.0-rc0/bin/mongo --port 38000

// ****************************************************
// On a sample person collection with two documents _id 1, 2
// Insert new document, _id 3, inside session1 scope
// Understand how the find operation on these scopes change
// from startTransaction to commitTransaction
// ****************************************************

// drop and recreate person collection with 2 documents _id 1, 2
use test;
db.person.drop();
db.person.insert({"_id": 1, "fname": "fname-1", "lname": "lname-1"});
db.person.insert({"_id": 2, "fname": "fname-2", "lname": "lname-2"});

// create session1 and a collection object using session1 and start a transaction on it
var session1 = db.getMongo().startSession();
var session1PersonColl = session1.getDatabase('test').getCollection('person');
session1.startTransaction({readConcern: {level: 'snapshot'}, writeConcern: {w: 'majority'}});

// insert a document _id 3, inside a transaction/session1
session1PersonColl.insert({"_id": 3, "fname": "fname-3", "lname": "lname-3"});
// WriteResult({ "nInserted" : 1 })

// find the documents from collection and session1
db.person.find()
// { "_id" : 1, "fname" : "fname-1", "lname" : "lname-1" }
// { "_id" : 2, "fname" : "fname-2", "lname" : "lname-2" }

// notice that the insert on session1 is only visible to it.
session1PersonColl.find()
// { "_id" : 1, "fname" : "fname-1", "lname" : "lname-1" }
// { "_id" : 2, "fname" : "fname-2", "lname" : "lname-2" }
// { "_id" : 3, "fname" : "fname-3", "lname" : "lname-3" }

// commit and end the session
session1.commitTransaction()
session1.endSession()

// show the documents after committing the transaction
db.person.find()
// { "_id" : 1, "fname" : "fname-1", "lname" : "lname-1" }
// { "_id" : 2, "fname" : "fname-2", "lname" : "lname-2" }
// { "_id" : 3, "fname" : "fname-3", "lname" : "lname-3" }
MongoDB commands illustrating that the data changes in a transaction are saved when the transaction is committed.

Discarding data changes with abortTransaction

The MongoDB shell commands below show a person collection with some sample data in it. Notice that the document added in the transaction is never visible through the db.person collection object, and that it is discarded once the transaction on session1 is aborted.
// v4.0.0-rc0/bin/mongo --port 38000

// ****************************************************
// On a sample person collection with two documents _id 1, 2
// Insert new document, _id 3, inside session1 scope
// Understand how the find operation on these scopes change
// from startTransaction to abortTransaction
// ****************************************************

// drop and recreate person collection with 2 documents _id 1, 2
use test;
db.person.drop();
db.person.insert({"_id": 1, "fname": "fname-1", "lname": "lname-1"});
db.person.insert({"_id": 2, "fname": "fname-2", "lname": "lname-2"});

// create session1 and a collection object using session1 and start a transaction on it
var session1 = db.getMongo().startSession();
var session1PersonColl = session1.getDatabase('test').getCollection('person');
session1.startTransaction({readConcern: {level: 'snapshot'}, writeConcern: {w: 'majority'}});

// insert a document _id 3, inside a transaction/session1
session1PersonColl.insert({"_id": 3, "fname": "fname-3", "lname": "lname-3"});
// WriteResult({ "nInserted" : 1 })

// find the documents from collection and session1
db.person.find()
// { "_id" : 1, "fname" : "fname-1", "lname" : "lname-1" }
// { "_id" : 2, "fname" : "fname-2", "lname" : "lname-2" }

session1PersonColl.find()
// { "_id" : 1, "fname" : "fname-1", "lname" : "lname-1" }
// { "_id" : 2, "fname" : "fname-2", "lname" : "lname-2" }
// { "_id" : 3, "fname" : "fname-3", "lname" : "lname-3" }

// abort the transaction and end the session
session1.abortTransaction()
session1.endSession()

// show the documents after aborting the transaction
db.person.find()
// { "_id" : 1, "fname" : "fname-1", "lname" : "lname-1" }
// { "_id" : 2, "fname" : "fname-2", "lname" : "lname-2" }
MongoDB commands illustrating that the data changes in a transaction are discarded when the transaction is aborted on the session.

Transactions with no write conflicts can be committed

In the code below, multiple write operations (insert, update and delete) are invoked from multiple scopes, both inside and outside of transactions. As long as there is no WriteConflict, you can successfully commit these transactions.
// v4.0.0-rc0/bin/mongo --port 38000

// ****************************************************
// On a sample person collection with two documents _id 1, 2
// Insert new document, _id 3, inside session1 scope
// Update a document, _id 1, in session2 scope
// Delete a document, _id 2, directly on collection
// Understand how the find operation on these scopes change
// from beginning of transaction till after commit
// ****************************************************

// drop and recreate person collection with 2 documents _id 1, 2
use test;
db.person.drop();
db.person.insert({"_id": 1, "fname": "fname-1", "lname": "lname-1"});
db.person.insert({"_id": 2, "fname": "fname-2", "lname": "lname-2"});

// create session1 and a collection object using session1 and start a transaction on it
var session1 = db.getMongo().startSession();
var session1PersonColl = session1.getDatabase('test').getCollection('person');
session1.startTransaction({readConcern: {level: 'snapshot'}, writeConcern: {w: 'majority'}});

// create session2 and a collection object using session2 and start a transaction on it
var session2 = db.getMongo().startSession();
var session2PersonColl = session2.getDatabase('test').getCollection('person');
session2.startTransaction({readConcern: {level: 'snapshot'}, writeConcern: {w: 'majority'}});

// The find operations on all collections inside/outside transactions show same data
db.person.find()
// { "_id" : 1, "fname" : "fname-1", "lname" : "lname-1" }
// { "_id" : 2, "fname" : "fname-2", "lname" : "lname-2" }

session1PersonColl.find()
// { "_id" : 1, "fname" : "fname-1", "lname" : "lname-1" }
// { "_id" : 2, "fname" : "fname-2", "lname" : "lname-2" }

session2PersonColl.find()
// { "_id" : 1, "fname" : "fname-1", "lname" : "lname-1" }
// { "_id" : 2, "fname" : "fname-2", "lname" : "lname-2" }


// insert a document _id 3, inside a transaction/session1
session1PersonColl.insert({"_id": 3, "fname": "fname-3", "lname": "lname-3"});
// WriteResult({ "nInserted" : 1 })

// update a document _id 1, inside transaction/session2
session2PersonColl.updateOne({"_id": 1}, {"$set": {"fname": "fname-1U", "lname": "lname-1U"}} );
// { "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }

// delete the document _id 2 directly on the collection, outside any transaction
db.person.deleteOne({"_id": 2});
// { "acknowledged" : true, "deletedCount" : 1 }

// The find operations on all collections inside/outside transactions show different set of data
// notice that all the uncommitted data changes inserts/updates are not visible on db.person
db.person.find();
// { "_id" : 1, "fname" : "fname-1", "lname" : "lname-1" }

// notice that the insert on session1 is only visible to it.
// the delete operation is not visible to session1
session1PersonColl.find()
// { "_id" : 1, "fname" : "fname-1", "lname" : "lname-1" }
// { "_id" : 2, "fname" : "fname-2", "lname" : "lname-2" }
// { "_id" : 3, "fname" : "fname-3", "lname" : "lname-3" }

// notice that the update on session2 is only visible to it.
// the delete operation is not visible to session2
session2PersonColl.find()
// { "_id" : 1, "fname" : "fname-1U", "lname" : "lname-1U" }
// { "_id" : 2, "fname" : "fname-2", "lname" : "lname-2" }


// commit and end the session1 and session2
session1.commitTransaction()
session1.endSession()
session2.commitTransaction()
session2.endSession()

// The find operation on the collection now shows committed changes
db.person.find()
// { "_id" : 1, "fname" : "fname-1U", "lname" : "lname-1U" }
// { "_id" : 3, "fname" : "fname-3", "lname" : "lname-3" }
MongoDB commands illustrating that transactions with no write conflicts can successfully commit
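One way to reason about why the commits above succeed is to compare the _id values each scope wrote to. The sketch below is only a mental model, not something MongoDB exposes: the server detects conflicts itself, and overlappingIds is a hypothetical helper.

```javascript
// Mental model only: a write conflict needs at least one document that
// two different scopes both modified. `overlappingIds` is a made-up helper.
function overlappingIds(firstWriteSet, secondWriteSet) {
    return firstWriteSet.filter(function (id) {
        return secondWriteSet.indexOf(id) !== -1;
    });
}

// The example above: session1 inserted _id 3, session2 updated _id 1.
var noOverlap = overlappingIds([3], [1]);
// noOverlap is [] so both transactions can commit

// The next section: session1 writes _id 1 and 2, session2 deletes _id 2.
var overlap = overlappingIds([1, 2], [2]);
// overlap is [2], the document the two transactions fight over
```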

Transactions with write conflicts are aborted

If two or more write operations from different transactions modify the same document, the data changes in one transaction conflict with those in the other. When such a WriteConflict occurs, the conflicting operation fails with a TransientTransactionError label and its transaction is aborted. The conflicting writes can be any combination of insert, update and delete operations. The example below shows two delete operations on the same document invoked in two different transactions.
// v4.0.0-rc0/bin/mongo --port 38000

// ****************************************************
// On a sample person collection with two documents _id 1, 2
// Update a document, _id 1, in session1 scope
// Delete a document, _id 2, in session1 scope
// Delete a document, _id 2, in session2 scope
// Understand how the find operation on these scopes change
// ****************************************************

// drop and recreate person collection with 2 documents _id 1, 2
use test;
db.person.drop();
db.person.insert({"_id": 1, "fname": "fname-1", "lname": "lname-1"});
db.person.insert({"_id": 2, "fname": "fname-2", "lname": "lname-2"});

// create session1 and a collection object using session1 and start a transaction on it
var session1 = db.getMongo().startSession();
var session1PersonColl = session1.getDatabase('test').getCollection('person');
session1.startTransaction({readConcern: {level: 'snapshot'}, writeConcern: {w: 'majority'}});

// create session2 and a collection object using session2 and start a transaction on it
var session2 = db.getMongo().startSession();
var session2PersonColl = session2.getDatabase('test').getCollection('person');
session2.startTransaction({readConcern: {level: 'snapshot'}, writeConcern: {w: 'majority'}});

// The find operations on all collections inside/outside transactions show same data
db.person.find()
session1PersonColl.find()
session2PersonColl.find()
// { "_id" : 1, "fname" : "fname-1", "lname" : "lname-1" }
// { "_id" : 2, "fname" : "fname-2", "lname" : "lname-2" }


// update a document _id 1, inside a transaction/session1
session1PersonColl.updateOne({"_id": 1}, {"$set": {"fname": "fname-1U", "lname": "lname-1U"}} );
// { "acknowledged" : true, "matchedCount" : 1, "modifiedCount" : 1 }

// delete a document _id 2, inside a transaction/session1
session1PersonColl.deleteOne({"_id": 2});
// { "acknowledged" : true, "deletedCount" : 1 }

// The find operation on session1 shows modified _id 1 and no _id 2
session1PersonColl.find()
// { "_id" : 1, "fname" : "fname-1U", "lname" : "lname-1U" }

// The find operation on session2 shows unmodified _id 1 and _id 2
session2PersonColl.find()
// { "_id" : 1, "fname" : "fname-1", "lname" : "lname-1" }
// { "_id" : 2, "fname" : "fname-2", "lname" : "lname-2" }

// The delete _id 2 operation inside transaction/session2 would result
// in WriteConflict and session2 transaction being aborted
session2PersonColl.deleteOne({"_id": 2});
// 2018-05-28T08:59:04.545-0500 E QUERY    [js] WriteCommandError: WriteConflict :
// WriteCommandError({
// 	"errorLabels" : [
// 		"TransientTransactionError"
// 	],
// 	"operationTime" : Timestamp(1527515941, 1),
// 	"ok" : 0,
// 	"errmsg" : "WriteConflict",
// 	"code" : 112,
// 	"codeName" : "WriteConflict",
// 	"$clusterTime" : {
// 		"clusterTime" : Timestamp(1527515941, 1),
// 		"signature" : {
// 			"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
// 			"keyId" : NumberLong(0)
// 		}
// 	}
// })
// WriteCommandError@src/mongo/shell/bulk_api.js:420:48
// Bulk/executeBatch@src/mongo/shell/bulk_api.js:902:1
// Bulk/this.execute@src/mongo/shell/bulk_api.js:1150:21
// DBCollection.prototype.deleteOne@src/mongo/shell/crud_api.js:363:17
// @(shell):1:1

// The find operation on the collection shows unmodified _id 1 and _id 2
db.person.find()
// { "_id" : 1, "fname" : "fname-1", "lname" : "lname-1" }
// { "_id" : 2, "fname" : "fname-2", "lname" : "lname-2" }

// The find operation on session1's collection shows modified _id 1 and no _id 2
session1PersonColl.find()
// { "_id" : 1, "fname" : "fname-1U", "lname" : "lname-1U" }

// abort and end the session1 and session2
session1.abortTransaction()
session1.endSession()
session2.abortTransaction()
session2.endSession()

// The find operation on the collection shows the original data, since both transactions were aborted
db.person.find()
// { "_id" : 1, "fname" : "fname-1", "lname" : "lname-1" }
// { "_id" : 2, "fname" : "fname-2", "lname" : "lname-2" }
MongoDB commands illustrating that transactions with write conflicts are aborted
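Since the WriteConflict error above carries the TransientTransactionError label, an application can often simply run the whole transaction again. Here is a minimal sketch of such a retry loop. The names hasTransientTransactionError and retryOnTransientError and the maxAttempts parameter are assumptions made for illustration, and the session is assumed to expose the same methods as the shell sessions above.

```javascript
// Returns true when the error carries the TransientTransactionError label,
// as the WriteCommandError shown above does in its errorLabels array.
function hasTransientTransactionError(error) {
    return Boolean(error) && Array.isArray(error.errorLabels) &&
        error.errorLabels.indexOf('TransientTransactionError') !== -1;
}

// Hypothetical helper: retry the whole transaction up to `maxAttempts` times
// when it fails with a transient error; rethrow any other error immediately.
function retryOnTransientError(session, work, maxAttempts) {
    if (maxAttempts < 1) {
        throw new Error('maxAttempts must be at least 1');
    }
    var lastError;
    for (var attempt = 1; attempt <= maxAttempts; attempt++) {
        session.startTransaction({
            readConcern: {level: 'snapshot'},
            writeConcern: {w: 'majority'}
        });
        var result;
        try {
            result = work(session);
        } catch (error) {
            // same cleanup the article performs on session2 after the conflict
            session.abortTransaction();
            if (!hasTransientTransactionError(error)) {
                throw error;
            }
            lastError = error;
            continue;  // start over with a fresh transaction
        }
        session.commitTransaction();
        return result;
    }
    throw lastError;  // still failing after maxAttempts tries
}
```

A cap on the number of attempts is a design choice of this sketch; without one, two transactions that keep colliding could retry forever.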

Summary

Support for multi-document transactions on replica sets is just the beginning. Future releases may address transactions across sharded deployments and the various isolation levels that you may have been exposed to in relational databases. I want to remind you of an important point:
“Just because you now have support for transactions does not mean you should design your data model around third normal form. You must still have an effective MongoDB schema design to ensure your application is highly performant.”
Multi-document transactions incur a greater performance cost than single-document writes. So, “What is the performance cost of using transactions?”. Great question! But that’s a topic for another day.
Hopefully, you learned something new today as you scale the path to “Mastering MongoDB - One tip a day”.
