CEO of Dashbird. 13y experience as a software developer & 5y of building Serverless applications.
NoSQL is a family of database management systems that exists as an antithesis to SQL databases, in that they don’t store data in a relational model. As such, data can be stored in almost any shape a developer chooses, within reason of course. This flexibility comes from the fact that NoSQL databases don’t require a fixed schema in the same way that SQL databases do.
Non-relational databases date back to the 1960s and 1970s, but NoSQL didn’t gain recognition until the 2000s, when both Google and Amazon started heavily investing in R&D and published their work on Bigtable and Dynamo. It makes sense that these two tech giants placed resources into it, especially since NoSQL is built for high performance and availability, something both these companies require.
That being said, NoSQL doesn’t necessarily stand for “No SQL” as much as it stands for “Not Only SQL”. This is because while it’s both possible and standard that NoSQL databases don’t use a schema, ultimately you absolutely can use a schema if you’d like to. In fact, some NoSQL database types might benefit from it since they don’t approach the issue of data duplication the way that SQL does.
Therefore, a NoSQL schema still has value depending on the context.
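To make the idea of an optional schema concrete, here is a minimal sketch of application-side validation in Python. The `USER_SCHEMA` mapping and the `validate` helper are hypothetical names made up for illustration; they are not part of any particular NoSQL database, and real products that offer validation do it in their own ways.

```python
# Hypothetical schema: field name -> required Python type.
USER_SCHEMA = {"name": str, "email": str}


def validate(document, schema):
    """Return a list of problems; an empty list means the document passes."""
    problems = []
    for field, expected_type in schema.items():
        if field not in document:
            problems.append(f"missing field: {field}")
        elif not isinstance(document[field], expected_type):
            problems.append(f"wrong type for field: {field}")
    return problems


# Extra fields are allowed: the schema only constrains what it names.
ok = {"name": "Ada", "email": "ada@example.com", "nickname": "countess"}
bad = {"name": "Bob", "email": 42}

print(validate(ok, USER_SCHEMA))
print(validate(bad, USER_SCHEMA))
```

The point of the sketch is that the schema is opt-in: documents that skip validation can still be stored, while the ones you care about can be checked before they are written.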
Probably the biggest positive about NoSQL is the fact that it’s horizontally scalable rather than vertically scalable like most SQL databases. With a traditional SQL database, getting better performance usually means buying better hardware for a single server, such as more powerful CPUs, more RAM, or faster disks.
Because of how NoSQL is made, it’s much more modular, and expansion for NoSQL simply means adding more of the same. So you can just add another server to the cluster, let the database place a shard of the data on it, and you’re pretty much set in terms of getting better performance. This style of scalability means that NoSQL can be considerably cheaper to grow than SQL, since several commodity machines usually cost less than one high-end server.
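As a rough illustration of how data can be spread across shards, here is a small Python sketch that picks a shard by hashing the key. `shard_for` and `ShardedStore` are hypothetical names, and each shard is just an in-memory dictionary standing in for a separate server.

```python
import hashlib


def shard_for(key, shard_count):
    """Pick a shard index from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Toy store: each dict stands in for one server in a cluster."""

    def __init__(self, shard_count):
        self.shards = [{} for _ in range(shard_count)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(shard_count=4)
for i in range(1000):
    store.put(f"user:{i}", {"id": i})

# Number of keys that landed on each shard.
sizes = [len(shard) for shard in store.shards]
print(sizes)
```

Simple modulo hashing like this moves most keys whenever the shard count changes, so treat it only as a sketch of the idea; production systems tend to use schemes such as consistent hashing to limit that reshuffling.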
On a similar note, NoSQL is truly made for performance, with some data models being able to handle millions of transactions per second. This is due to how NoSQL works as a philosophy; instead of spreading related data out over multiple tables, it’s typically kept together in a single record or table. Ultimately that means that a query can often be answered from one place without any joins, which frees up a lot of processing power to run more queries.
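The difference between joining tables and reading one self-contained record can be sketched in a few lines of Python. The customer and order data below is invented for illustration, and plain dictionaries stand in for tables and records.

```python
# Normalized layout: two "tables" that must be joined at query time.
customers = {1: {"name": "Ada"}}
orders = [
    {"order_id": 100, "customer_id": 1, "total": 25.0},
    {"order_id": 101, "customer_id": 1, "total": 40.0},
]


def orders_with_join(customer_id):
    """Scan the orders table and combine matches with the customer row."""
    customer = customers[customer_id]
    return {
        "name": customer["name"],
        "orders": [
            {"order_id": o["order_id"], "total": o["total"]}
            for o in orders
            if o["customer_id"] == customer_id
        ],
    }


# Denormalized layout: everything about a customer lives in one record.
customer_records = {
    1: {
        "name": "Ada",
        "orders": [
            {"order_id": 100, "total": 25.0},
            {"order_id": 101, "total": 40.0},
        ],
    }
}


def orders_single_read(customer_id):
    """One lookup returns the whole answer, no join needed."""
    return customer_records[customer_id]


print(orders_with_join(1) == orders_single_read(1))
```

Both functions return the same answer, but the second one does a single lookup. The trade-off is that the denormalized record has to be kept up to date wherever the same data is copied.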
Finally, the one thing that tends to often be brought up is the lack of a schema, which allows for a ton of flexibility. That means not only can you change the type of data while the NoSQL database is running, but you can also change the data model or add an additional one as well. In other words, NoSQL is simply great for flexibility and scalability.
Of course, NoSQL isn’t perfect and it does have some disadvantages.
One rather large problem that comes with NoSQL is that it’s not as mature as SQL. Keep in mind that even though non-relational databases have been around for decades, NoSQL only really picked up in the 2000s. Compare that with SQL, which has had constant development and support since the 1970s, and we can understand the “why.”
This ultimately means that finding an expert in NoSQL can be rather difficult. Similarly, support online is mostly corralled into data model specific forums, or even just the developers of these data models themselves.
Another smaller issue is the fact that NoSQL isn’t made to remove data duplication as SQL does, and therefore database sizes can quickly become massive. Thankfully this isn’t as big an issue nowadays with the cheap cost of storage, compared to even the nineties where several hundred GB could cost you an arm and a leg.
When it comes to NoSQL data models, there are actually quite a few to pick from, but generally, they are broken down into these four categories: key-value, column-oriented, document, and graph.
Key-value stores use a hash table that maps a unique identifier (the key) to a piece of data (the value). The data can be pretty much anything; key-value stores are incredibly versatile, and there are several databases that fall under this model, each with its own specialization.
Also, this type of data model is made for high-performance and high-volume applications, which makes sense when you know that it was popularized by Amazon’s Dynamo and is the basis of DynamoDB. As you can imagine, they might have millions of queries a second that need to be processed, so this data model is perfect for them.
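A key-value store can be sketched with nothing more than a Python dictionary, which is itself a hash table. The `KeyValueStore` class below is a hypothetical toy for illustration and does not mirror the API of DynamoDB or any other product.

```python
class KeyValueStore:
    """Toy in-memory key-value store backed by a Python dict (a hash table)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # The store does not inspect the value; it can be any object.
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        """Remove a key and report whether it existed."""
        return self._data.pop(key, None) is not None


kv = KeyValueStore()
kv.put("session:abc", {"user_id": 7, "cart": ["book", "pen"]})
kv.put("counter:visits", 1024)

print(kv.get("session:abc"))
print(kv.get("missing", "not found"))
```

Every operation goes through the key, which is what keeps lookups fast and also why this model offers little help when you need to search by the contents of the values.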
What’s interesting with the column-oriented data model is that it sort of flips the standard SQL design concept on its head. Instead of storing data in rows, the data is stored in columns, and related columns can be grouped into column families. In essence, each row key points to one or more families of columns rather than to a fixed set of fields.
This makes the system really efficient when it comes to data access and data aggregation. Sadly though, it’s not that great for complex querying.
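To see why a column layout helps aggregation, here is a small Python sketch that stores the same invented weather readings by row and by column. It only illustrates the layout idea and is not a model of how any specific column-oriented database stores data on disk.

```python
# Row-oriented layout: one record per reading.
rows = [
    {"city": "Oslo", "temp_c": 4.0, "humidity": 80},
    {"city": "Rome", "temp_c": 16.0, "humidity": 60},
    {"city": "Paris", "temp_c": 10.0, "humidity": 70},
]

# Column-oriented layout: one list per field, positions line up across lists.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Aggregating by row touches every field of every record...
avg_from_rows = sum(row["temp_c"] for row in rows) / len(rows)

# ...while the column layout only needs the one list it cares about.
temps = columns["temp_c"]
avg_from_columns = sum(temps) / len(temps)

print(columns["city"])
print(avg_from_rows, avg_from_columns)
```

With the column layout, an average over `temp_c` never has to read the city or humidity values, which is the effect the paragraph above describes.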
Document stores keep each record as a self-describing document, typically in a format such as JSON or XML. Traditionally speaking, when it comes to SQL, fitting that kind of nested data into tables means either splitting it across several tables or squeezing it into a single column, which hamstrings both approaches. Since NoSQL doesn’t use a fixed schema or relational storage, you can simply store the document as it is. Aside from that, there’s also the general flexibility of having data in any shape you want since it’s stored inside documents.
Document stores can also be considered a form of key-value store, or at least some of them can. That just goes to show how versatile key-value stores really are.
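Here is a minimal Python sketch of a document store: documents are kept under an ID (the key-value part) but can also be searched by a nested field. The `insert`, `get_path` and `find` helpers and the sample documents are hypothetical and not taken from any real database API.

```python
import json

# The collection maps a document ID to a JSON-compatible document.
collection = {}


def insert(doc_id, document):
    # Round-trip through JSON to store an independent, JSON-compatible copy.
    collection[doc_id] = json.loads(json.dumps(document))


def get_path(document, dotted_path):
    """Follow a dotted path such as 'address.city'; return None if absent."""
    current = document
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def find(dotted_path, value):
    """Return the IDs of documents whose field at dotted_path equals value."""
    return [
        doc_id
        for doc_id, doc in collection.items()
        if get_path(doc, dotted_path) == value
    ]


# Documents in the same collection do not need to share a shape.
insert("u1", {"name": "Ada", "address": {"city": "London"}})
insert("u2", {"name": "Bob", "address": {"city": "Oslo"}, "tags": ["admin"]})
insert("u3", {"name": "Cy"})

print(find("address.city", "Oslo"))  # ['u2']
```

Note that the document without an address is simply skipped by the query rather than causing an error, which is the kind of flexibility document stores are known for.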
In a graph data model, information is stored as nodes and edges. The nodes themselves store information, such as names, addresses, etc., while edges describe the relationship between the nodes. Graph data models are meant to easily represent the relationship between different sets of data, and as the name suggests, work perfectly for graphs.
The idea is that having data represented like this gives much quicker and more straightforward insight into how the data is connected, which can then be used for data analysis.
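The node-and-edge structure can be sketched in Python as a dictionary of nodes plus a list of labeled edges. The people, the `FOLLOWS` relationship and the `hops_between` helper below are invented for illustration, and the traversal is an ordinary breadth-first search rather than the query engine of any real graph database.

```python
from collections import deque

# Nodes hold the information; edges describe relationships between nodes.
nodes = {
    "alice": {"city": "Oslo"},
    "bob": {"city": "Rome"},
    "carol": {"city": "Paris"},
    "dave": {"city": "Lima"},
}
edges = [
    ("alice", "FOLLOWS", "bob"),
    ("bob", "FOLLOWS", "carol"),
    ("carol", "FOLLOWS", "dave"),
]


def neighbors(node, relation):
    """Nodes reachable from `node` over one edge with the given label."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]


def hops_between(start, goal, relation):
    """Breadth-first search; returns the number of hops, or None if unreachable."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, distance = queue.popleft()
        if node == goal:
            return distance
        for nxt in neighbors(node, relation):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, distance + 1))
    return None


print(hops_between("alice", "dave", "FOLLOWS"))  # 3
print(hops_between("dave", "alice", "FOLLOWS"))  # None, edges are directed
```

Questions like "how many hops separate these two people" would need a chain of self-joins in a relational model, whereas here they fall out of walking the edges.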
Since NoSQL isn’t a singular DBMS, each individual database model works somewhat differently. It would be correct to say that as a whole, NoSQL works on the basis of not using a relational model for data.
Considering that NoSQL stands for “Not Only SQL” it might be a good idea to describe it in relation to that. When it comes to SQL, data is organized into schemas in the form of tables, columns, and rows. NoSQL, on the other hand, doesn’t bother with a schema or these types of structures, and therefore any NoSQL database is flexible with how it handles data.
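The contrast can be shown with Python’s built-in `sqlite3` module next to a plain list of dictionaries. The `users` table and its fields are made up for illustration, and the list is only a stand-in for a schemaless store, not a real NoSQL database.

```python
import sqlite3

# SQL side: the table's columns are fixed by the schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", (1, "Ada"))

try:
    # 'nickname' is not in the schema, so SQLite refuses the insert.
    conn.execute(
        "INSERT INTO users (id, name, nickname) VALUES (?, ?, ?)",
        (2, "Bob", "bobby"),
    )
    rejected = False
except sqlite3.OperationalError:
    rejected = True

sql_row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

# Schemaless side: each record carries whatever fields it needs.
schemaless_users = []
schemaless_users.append({"id": 1, "name": "Ada"})
schemaless_users.append({"id": 2, "name": "Bob", "nickname": "bobby"})

print(rejected, sql_row_count, len(schemaless_users))
```

On the SQL side, accepting the new field would first require changing the table definition, while the schemaless side takes the new shape immediately and leaves it to the application to cope with records that differ.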
The great thing about NoSQL is that there are dozens of databases to pick from depending on what you need. While you can check out a list of NoSQL databases, we’ll mention some of the more popular ones:
MongoDB: MongoDB is a document-oriented database that is made for high volume data storage. It uses key-value pairs to organize data into documents and collections, with the latter being functionally similar to tables in a relational database. Interestingly, MongoDB gets compared to DynamoDB, the next database on our list.
DynamoDB: Similar to MongoDB, Amazon’s DynamoDB is a key-value store database that also supports documents for storing data. What might be interesting to know is that DynamoDB is reported to handle more than 10 million requests a second, so it’s a very powerful NoSQL database to work with.
Redis: Another popular key-value store database, Redis uses in-memory data storage in the form of strings, hashes, lists, sets, streams, bitmaps, and geospatial indexes. Since Redis is in-memory, it gets a performance boost compared to databases that have to go to disk for every lookup. On that note, an honorable mention goes out to AWS’ ElastiCache, a managed in-memory caching service that can run Redis for you.
Neo4j: A graph database developed by the company of the same name, Neo4j is almost completely built around the importance of how data is connected. This gives it a massive boost in performance on relationship-heavy queries compared to a lot of other graph-based databases.
FlockDB: Another graph-based database, it contrasts with Neo4j in that it’s built for rapid operations rather than multi-hop traversal. This is the database that Twitter built for its own social graph, and although the code has been open-sourced, it can be a bit rough around the edges.
Cassandra: Originally developed at Facebook and now maintained by the Apache Software Foundation, Cassandra is a wide-column DBMS that’s specifically made to run across many commodity servers with in-built redundancy. It is also essentially a mix between Amazon’s Dynamo and Google’s Bigtable, and it is open source, which is always great.
NoSQL is an interesting database management philosophy that moves away from the restrictions imposed by hardware back when SQL came about, and instead approaches databases from a modern lens. Overall, the NoSQL data models offer specialized tools for pretty much any application, although it’s important not to forget that NoSQL doesn’t necessarily stand for “No SQL”.
As such, SQL still has a place in the modern world with its own use cases that aren’t readily dealt with by using NoSQL.