It’s been over 10 years since MongoDB was released, but the SQL vs. NoSQL debate rages on.
On one hand, NoSQL databases (of which Mongo is one specific type, a document store) are flexible, fast to write to, and created with the modern web in mind.
On the other hand, SQL databases (of which Postgres is one of the most popular) are still the backbone of applications today. Everyone knows how to use them and many incorporate NoSQL-like optimizations, making the line a bit blurry.
Either way, whenever you start a new application, you have to choose a database. This can be a scary prospect as it’s nearly impossible to change your core data model once you’ve built something significant. That’s why the debate is important, even now.
In this article, I’ll compare PostgreSQL and MongoDB. You’ll see how they stack up in speed, usability, deployment options, and scalability.
While it’s a complex topic (and I’m sure I’ll miss a few things that you can deride me for in the comments), I hope this gives you a great starting point for picking your next database.
PostgreSQL is a traditional, open-source RDBMS (relational database management system). It is not owned by a private corporation or entity and the source code is available free of charge. It has earned a strong reputation for reliability, extensibility, feature robustness, and performance.
A relational database like PostgreSQL is a collection of data items organized in tables. A table consists of rows, and each row contains the same set of columns. PostgreSQL uses primary keys to uniquely identify each row (a.k.a. record) in a table, and foreign keys to assure the referential integrity between two related tables.
PostgreSQL is ACID-compliant, transactional, has updatable and materialized views. It also supports triggers, functions, stored procedures, and foreign keys.
In PostgreSQL, database schemas and models need to be defined ahead of time, and data must match this schema to be stored in the database.
MongoDB is a document-oriented database that stores data in JSON-like documents with dynamic schema. You can store your records without worrying about the data structure such as the number of fields or types of fields to store values. MongoDB documents are essentially JSON objects.
In MongoDB, collections of documents do not require a predefined structure and columns can vary for different documents. MongoDB offers an expressive query language and strong consistency.
PostgreSQL stores data in the form of tables. Each table will have a predefined set of columns and a table is a collection of rows. Each row has a set of values corresponding to each column.
MongoDB stores data in the form of collections. Each collection will have a set of keys and corresponding values. A collection will have a set of documents.
These naming differences might seem petty, but they’re important to grasp as using the wrong names can confuse other engineers on your team.
PostgreSQL |
MongoDB |
---|---|
Table |
Collection |
Column |
Key |
Value |
Value |
Records |
Documents |
Without an index, any database will have to to a full-table-scan, essentially searching through all records one by one, in the order they appear to find the matching query. This operation becomes extremely time-consuming as your table size increases.
An index sets up a column in a specified order to help optimize query performance by fetching results quickly. But indexes also add overhead to the database system as a whole, so they should be used sensibly.
This video gives a great overview of indexes from a more technical perspective.
PostgreSQL includes built-in support for regular B-tree, Hash, GiST, and GIN. Each index type uses a different algorithm that is best suited to different types of queries. By default, the CREATE INDEX
command creates B-tree indexes, which are usually a good option. Indexes in PostgreSQL also support the following:
Indexes are preferred in MongoDB. If an index is missing, every document within the collection must be searched to select the documents that were requested in the query. This can greatly hamper performance and read times.
Indexes in MongoDB also support:
One of the biggest differences between using PostgreSQL and MongoDB in practice is the query syntax.
To demonstrate this, let’s look at how to create a customers
table with id, name, email
columns and compare basic CRUD operations using PostgreSQL and MongoDB.
customers
table/collectionPostgreSQL : In PostgreSQL, we need to define table schema before inserting records.
CREATE TABLE customers (
id int,
name varchar(255),
email varchar(255),
PRIMARY KEY (id)
);
MongoDB: In MongoDB, there is no need to pre-define collection schema as it allows dynamic schema.
customers
table/collectionPostgreSQL :
INSERT INTO `customers` (`id`, `name`, `email`)
VALUES (1, 'Mr. XYZ', '[email protected]');
MongoDB :
db.customers.insert(
{id: 1, name: "Mr. XYZ", email: "[email protected]"}
)
customers
table/collectionPostgreSQL :
UPDATE customers set name='Mr. ABC' where name='Mr.XYZ';
MongoDB :
db.customers.update({"name": "Mr. XYZ"}, {$set: {"name": 'Mr. ABC'}}, {multi: true})
customers
table/collectionPostgreSQL :
DELETE from customers where id=1;
MongoDB :
db.customers.remove({ id: 1 })
customers
table/collectionPostgreSQL :
select * from customers;
MongoDB :
db.customers.find()
If you are familiar with javascript and JSON, you will find MongoDB Query Language to be simpler and easy to learn whereas if you’re familiar with traditional SQL(Structured Query Language), you’ll likely find PostgreSQL more natural.
PostgreSQL is available on Windows, macOS, and Linux, and all the major cloud providers offer support for it. Many also provide zero-downtime migration if you ever need to move your data and offer a pay-as-you-go pricing model.
MongoDB similarly runs on all major operating systems and most of the big cloud providers. MongoDB also offers its own proprietary service called Atlas to deploy and scale your instance in the cloud.
In either case, managed hosting is probably a good idea for small teams. Having your database go down can be really scary, so unless you have the expertise in-house, you’re probably better off paying for managed hosting.
Replication lets you create multiple copies of your data to improve:
There are two methods for replication in PostgreSQL:
Synchronous replication or 2-safe replication. In synchronous replication, transactions on the primary database are declared complete only when those changes have been replicated to all the replicas. Unless both databases crash simultaneously, data won't be lost.
Asynchronous replication means that transactions on the primary server can be declared complete when the changes have been done on at least the primary server. These changes are then replicated to the replicas when they’re available. The replica servers can remain out-of-sync for a certain duration, which is called a replication lag.
Both synchronous and asynchronous modes both have their pros and cons. In synchronous mode, write latency will increase but the data replication is always in sync. In asynchronous mode, writes will be faster, but there is a slight chance of data inconsistency, especially if the primary database goes down.
A replica set in MongoDB offers redundancy and high availability. Using a similar model (primary and secondary nodes), you can replicate data across multiple servers to ensure no downtime or lost data in the case of an outage.
Typically, the primary node will receive all write operations and records changes data sets in its operation log.
Meanwhile, the secondaries replicate the primary's oplog and apply the operations to their data sets. If the primary is unavailable, an eligible secondary will hold an election to elect itself the new primary.
MongoDB uses role-based access control and flexible permissions that admins can set. All data is encrypted with TLS in transit, and it’s possible to encrypt documents in a collection at rest using a master key.
PostgreSQL supports the same encryption features as MongoDB and its authentication model is similar. Users can be granted roles and privileges, giving them permissions over particular database operations and against particular data sets.
Interestingly, both PostgreSQL and MongoDB are open-source, but their supporting organizations are financed much differently.
PostgreSQL is maintained by PostgreSQL Global Development Group and is financially supported by sponsors (mostly large tech companies). The PostgreSQL community has been active for over 20 years so there are innumerable tutorials, guides, and extensions available.
MongoDB is run by a for-profit corporation, and while also open-source, the company offers commercial support and hosting services to fund the project. Most of their paid services are geared towards enterprise development, so most small-time projects will never need to pay for MongoDB. The MongoDB community is also very active, so finding resources to solve specific problems is usually pretty easy, although slightly less so than PostgreSQL in my experience.
PostgreSQL has been around longer, so there’s probably more large-scale enterprise applications using it, but that doesn’t mean it’s always better.
MongoDB can be a great choice if you need scalability and caching for real-time analytics; however, it is not built for transactional data (accounting systems, etc.). MongoDB is frequently used for mobile apps, content management, real-time analytics, and applications involving the Internet of Things.
Ultimately, there’s no “right” option for every project, but here’s how I’d think about picking between PostgreSQL and MongoDB:
I hope this has given you some insights into choosing between these two popular databases. If you have questions, leave a comment, and let’s keep the conversation going.