At the very beginning of most development endeavors lies an important question: What database do I choose? There is such an abundance of database technologies at the moment that it’s no wonder many developers don’t have the time or energy to research new ones. If you are one of those developers and you aren’t very familiar with graph databases in general, you’ve come to the right place!
In this article, you will learn about the main differences between a graph database and a relational database, what kinds of use cases are best suited for each database type, and what their strengths and weaknesses are.
The main difference is the way relationships between entities are stored. In a graph database, relationships are stored at the individual record level, while a relational database uses predefined structures, a.k.a. table definitions.
Relational databases are faster when handling huge numbers of records because the structure of the data is known ahead of time. This also leads to a smaller memory footprint. Graph databases don’t have a predefined structure for the data, which is why each record has to be examined individually during a query to determine the structure of the data.
First things first! To decide if you need a graph database, you need to be familiar with the basic terminology. The fundamental components of a graph database are nodes, relationships, labels and properties.
In a typical social network graph, the nodes represent people in different social groups and their connections with one another. Every person is represented with a node that’s labeled as Person. These nodes contain the properties name, gender, location and email. The relationships between people in this network are of the type FRIENDS_WITH and contain a yearsOfFriendship property to specify the duration of the friendship connection. Each person is assigned a location through LIVES_IN relationships with nodes labeled Location.
While this is a very simple example, it concisely demonstrates the power and benefits of using a graph database. For example, if you wanted to add different properties to some of the nodes, you would be able to. Unlike a table, where you need to add a column for each additional attribute, here you can be much more flexible with the data structure and types. A property that was meant to be a string can be used as an integer without any constraints. To be fair, this can cause problems for you in the long run, but you can do it if need be.
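The social network described above can be sketched in plain Python to make the terminology concrete. This is only an illustration of the property graph idea, not how any particular graph database stores data. The Node and Relationship classes, the friends_of helper, the nickname and postalCode properties, and the sample people are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    label: str  # e.g. "Person" or "Location"
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node
    end: Node
    rel_type: str  # e.g. "FRIENDS_WITH" or "LIVES_IN"
    properties: dict[str, Any] = field(default_factory=dict)


# Hypothetical sample data.
zagreb = Node("Location", {"name": "Zagreb"})
ana = Node("Person", {"name": "Ana", "gender": "female",
                      "location": "Zagreb", "email": "ana@example.com"})
ivan = Node("Person", {"name": "Ivan", "gender": "male",
                       "location": "Zagreb", "email": "ivan@example.com"})

relationships = [
    Relationship(ana, ivan, "FRIENDS_WITH", {"yearsOfFriendship": 5}),
    Relationship(ana, zagreb, "LIVES_IN"),
    Relationship(ivan, zagreb, "LIVES_IN"),
]

# Each record carries its own structure: a property can be added to one
# node without touching any other node.
ana.properties["nickname"] = "Annie"

# Nothing enforces a type either: the same property can hold a string on
# one node and an integer on another.
ana.properties["postalCode"] = "10000"
ivan.properties["postalCode"] = 10000


def friends_of(person: Node, rels: list[Relationship]) -> list[str]:
    """Return the names of everyone connected to `person` by FRIENDS_WITH."""
    names = []
    for rel in rels:
        if rel.rel_type != "FRIENDS_WITH":
            continue
        if rel.start is person:
            names.append(rel.end.properties["name"])
        elif rel.end is person:
            names.append(rel.start.properties["name"])
    return names
```

Calling friends_of(ana, relationships) walks the relationship records directly, which is the kind of access pattern a graph database is built around.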
A relational database requires a predefined and carefully modeled set of tables. We create one for each entity and add the needed attributes as columns. While this is also pretty straightforward, it’s much more rigid than the graph schema and not as extendible.
For example, each person is connected to other people through friendships, and to model this relationship, we have to add another table. If there were different kinds of connections (related to, no longer friends…) we would have to change the schema accordingly. A relational database isn’t suited for this specific use case because the focus isn’t on the data itself but rather on the relationships within it.
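For comparison, here is a minimal relational sketch of the same social network using Python's built-in sqlite3 module. The table names, column names and sample rows are assumptions made for illustration. It shows the two points made above: friendships need their own table, and a new attribute requires a schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table per entity, plus an extra table to model the friendship
# relationship between two people.
conn.executescript("""
CREATE TABLE person (
    id     INTEGER PRIMARY KEY,
    name   TEXT NOT NULL,
    gender TEXT,
    email  TEXT
);
CREATE TABLE friendship (
    person_id           INTEGER NOT NULL REFERENCES person(id),
    friend_id           INTEGER NOT NULL REFERENCES person(id),
    years_of_friendship INTEGER,
    PRIMARY KEY (person_id, friend_id)
);
""")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO person (id, name, gender, email) VALUES (?, ?, ?, ?)",
    [
        (1, "Ana", "female", "ana@example.com"),
        (2, "Ivan", "male", "ivan@example.com"),
    ],
)
conn.execute(
    "INSERT INTO friendship (person_id, friend_id, years_of_friendship) "
    "VALUES (?, ?, ?)",
    (1, 2, 5),
)

# A new attribute means altering the table definition for every row.
conn.execute("ALTER TABLE person ADD COLUMN nickname TEXT")

# Reading a relationship back requires a JOIN through the friendship table.
rows = conn.execute(
    """
    SELECT friend.name
    FROM friendship
    JOIN person AS friend ON friend.id = friendship.friend_id
    WHERE friendship.person_id = ?
    """,
    (1,),
).fetchall()
friend_names = [row[0] for row in rows]

# Column names of the person table after the schema change.
person_columns = [row[1] for row in conn.execute("PRAGMA table_info(person)")]
```

A different kind of connection, such as "related to", would need either another table or an extra column on friendship, which is the schema change the paragraph above refers to.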
There are always two sides to every story and graph databases aren’t a perfect solution for every problem. Far from it. There are a lot of use cases for which you should stick with relational databases or maybe search for other alternatives aside from graph databases.
Here are three simple questions you can ask yourself to decide if there are any benefits to using a graph database.
Graph solutions are focused on highly connected data that comes with an intrinsic need for relationship analysis. If the connections within the data are not the primary focus and the data is of a transactional nature, then a graph database is probably not the best fit. Sometimes it’s just important to store the data and complex analysis isn’t needed.
In our example, if we were to store only people without their relationships, then we would end up with a sparsely connected graph. Yes, a number of simpler graphs would remain because of the connections between Person and Location nodes, but this degree of connectedness and the consistency of the data structure are well suited for a relational database.
Graph databases are optimized for data retrieval and if you choose one, then you should probably use this functionality often. If your focus is on writing to the database and you’re not concerned with analyzing the data, then a graph database wouldn’t be an appropriate solution. A good rule of thumb is, if you don’t intend to use JOIN operations in your queries, then a graph is not a must-have.
In our example, if you only store data for the sake of logging interactions and you don’t intend to analyze it later on, then a graph database isn’t particularly helpful. However, if there are numerous connections within the data being stored, then a graph might be worth considering.
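To illustrate the JOIN rule of thumb, the sketch below asks a relationship question, "who are the friends of a person's friends?", against a relational schema using Python's sqlite3 module. The tables, the sample people and the choice to store friendships in one direction only are simplifying assumptions. Each additional hop adds another self-join on the friendship table, which is the kind of query where a graph database starts to look attractive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE friendship (
    person_id INTEGER NOT NULL,
    friend_id INTEGER NOT NULL
);
-- Hypothetical data: Ana -> Ivan -> Marko -> Petra
INSERT INTO person VALUES (1, 'Ana'), (2, 'Ivan'), (3, 'Marko'), (4, 'Petra');
INSERT INTO friendship VALUES (1, 2), (2, 3), (3, 4);
""")

# Two hops means joining the friendship table to itself, then joining
# once more to person to get the name.
rows = conn.execute(
    """
    SELECT DISTINCT fof.name
    FROM friendship AS f1
    JOIN friendship AS f2 ON f2.person_id = f1.friend_id
    JOIN person AS fof ON fof.id = f2.friend_id
    WHERE f1.person_id = ?
      AND f2.friend_id != f1.person_id
    ORDER BY fof.name
    """,
    (1,),
).fetchall()
friends_of_friends = [row[0] for row in rows]

# By contrast, a write-heavy logging workload that only appends rows and
# counts them never needs a JOIN at all.
interaction_count = conn.execute("SELECT COUNT(*) FROM friendship").fetchone()[0]
```

If most of your queries look like the second one, the rule of thumb suggests a graph database is not a must-have.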
If your data model is inconsistent and demands frequent changes, then using a graph database might be the way to go. Because graph databases are more about the data itself than the schema structure, they allow a degree of flexibility.
On the other hand, there are often benefits in having a predefined and consistent table that’s easy to understand. Developers are comfortable with and used to relational databases, and that fact cannot be downplayed.
For example, if you are storing personal information such as names, dates of birth, locations… and don’t expect many new fields or a change in data types, relational databases are the go-to solution. On the other hand, a graph database could be useful if:
In our example, the attributes and relationships of a person could be set in stone due to a specific use case and no further changes may be needed.
If you need to run frequent table scans and searches for data that fits defined categories, a graph database wouldn’t be very helpful. Graph databases are well equipped to traverse relationships when you have a specific starting point or at least a set of points to start with (nodes with the same label). They are not suited for traversing the whole graph often. While it’s possible to run such queries, other storage solutions may be more optimized for such bulk scans.
If the majority of the queries in our example include searches by property values over the entire network, then a graph database wouldn’t be the right fit.
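The difference between the two access patterns can be sketched in plain Python. The adjacency dictionary, the locations dictionary and both helper functions are hypothetical and only stand in for what a database would do internally. A traversal from a known starting point touches just the nodes it can reach, while a search by property value has to look at every record.

```python
from collections import deque

# Hypothetical friendship graph stored as adjacency lists.
friends = {
    "Ana": ["Ivan"],
    "Ivan": ["Ana", "Marko"],
    "Marko": ["Ivan"],
    "Petra": ["Luka"],
    "Luka": ["Petra"],
}

# Hypothetical property values, one per person.
locations = {
    "Ana": "Zagreb",
    "Ivan": "Split",
    "Marko": "Zagreb",
    "Petra": "Zagreb",
    "Luka": "Osijek",
}


def within_hops(start: str, max_hops: int) -> set[str]:
    """Breadth-first traversal: everyone reachable from `start` in at most
    `max_hops` friendship steps. Only reachable nodes are visited."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, depth = queue.popleft()
        if depth == max_hops:
            continue
        for friend in friends[person]:
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, depth + 1))
    seen.discard(start)
    return seen


def people_in(city: str) -> list[str]:
    """Property search: has to inspect every record in the data set."""
    return sorted(name for name, loc in locations.items() if loc == city)
```

Here within_hops("Ana", 2) never looks at Petra or Luka, whereas people_in("Zagreb") examines all five people regardless of how they are connected.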
Very often, databases are used to look up information stored in key/value pairs. When you have a known key and need to retrieve the data associated with it, a graph database is not particularly useful.
For example, if the sole purpose of your database is storing a user’s personal information and retrieving it by name or ID, then refrain from using a graph. But if there were other entities involved (visited locations for example), and a large number of connections is required to map them to users, then a graph database could bring performance benefits. A good rule of thumb is, if most of your queries return a single node via a simple identifier (key), then just skip graph databases.
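As a rough picture of this access pattern, the sketch below models a key/value lookup with a plain Python dictionary. The users data and the get_user helper are hypothetical. The point is only that the query never follows a relationship, so there is nothing for a graph traversal to speed up.

```python
# Hypothetical user records keyed by ID.
users = {
    1: {"name": "Ana", "email": "ana@example.com"},
    2: {"name": "Ivan", "email": "ivan@example.com"},
}


def get_user(user_id: int):
    """Return the record stored under `user_id`, or None if it is missing."""
    return users.get(user_id)


ana_record = get_user(1)
missing_record = get_user(99)
```

Every call returns at most one record from one key, which matches the "single node via a simple identifier" case described above.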
If the entities in your model have very large attributes like BLOBs, CLOBs, long texts… then graph databases aren’t the best solution. While you can store those objects as nodes and link them to other nodes to utilize the power of traversing relationships, sometimes it just makes more sense to store them directly with the entities they are connected to.
In our example, if each person had a long biography that needed to be included in the same database, a graph wouldn’t be the answer. However, if you needed to connect these biographies to other entities in the database (for example, people that are mentioned in them), then the strengths of a graph database could outweigh the limitations.
It very much depends on your specific use case. Graph databases are a very powerful tool when it comes to handling interconnected data. If you have a hard time deciding, then go through the aforementioned requirements and check if any of them apply to your scenario.
In this article, you have gained some insights into the fundamental differences between relational and graph databases. If you are still unsure if a graph database is the right choice for your project, then simply drop us a line on our community forum, and we’ll be happy to help!
Also published at https://memgraph.com/blog/graph-database-vs-relational-database