Today there are hundreds of SQL and NoSQL databases. Some are popular, others are largely ignored. Some are user-friendly and well documented, others are hard to use. Some are open source, others are proprietary. And, perhaps most important, some are scalable, optimized, and highly available, while others are difficult to scale or maintain.
This raises a natural question: how do we choose a database? To answer it, we should first decide what we want to achieve with the database. To form a clear picture, we should answer questions like these:
Once that decision is made, we need to keep in mind what each database is able to offer. Particular features vary from one database to another, but in general there are only a few types of databases. Within each type, we can achieve mostly the same goals. Let’s look at them closely.
If you have ever worked with databases, you most likely began with this type. It is the most popular and widespread. These databases store data in relational tables whose columns have defined types. Relational tables are well suited to normalization and joins.
Advantages
Disadvantages
Examples: Oracle DB, MySQL, PostgreSQL.
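To make the relational model concrete, here is a minimal sketch using SQLite from the Python standard library. The `authors` and `books` tables and their contents are invented for illustration; the point is only to show typed columns, normalized data, and a join.

```python
import sqlite3

# In-memory database, so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Two normalized tables with typed columns (hypothetical schema).
conn.executescript("""
CREATE TABLE authors (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE books (
    id        INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    author_id INTEGER NOT NULL REFERENCES authors(id)
);
""")

conn.execute("INSERT INTO authors (id, name) VALUES (1, 'Ada')")
conn.executemany(
    "INSERT INTO books (title, author_id) VALUES (?, ?)",
    [("Notes", 1), ("Letters", 1)],
)

# The author's name is stored once; a join brings it back next to each book.
rows = conn.execute("""
    SELECT authors.name, books.title
    FROM books
    JOIN authors ON authors.id = books.author_id
    ORDER BY books.title
""").fetchall()

print(rows)
```

The author name lives in exactly one row, which is what normalization buys us; the price is the join at read time.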
If we don’t want to join several tables to retrieve the desired data, we can look at document-oriented databases. These databases store records in a JSON-like format. With this format, we can create a complex value for any key and include the whole data structure in a single record.
Advantages
Disadvantages
Examples: MongoDB
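The idea of keeping the whole data structure in one record can be sketched with plain Python and JSON. This is not the API of any particular document database; the `collection` dictionary, the `find_by_id` helper, and the order fields are all assumptions made for the example.

```python
import json

# One self-contained document: customer and items are nested inside the order,
# so no join is needed to read them back (field names are hypothetical).
order = {
    "_id": "order-1001",
    "customer": {"name": "Ada", "email": "ada@example.com"},
    "items": [
        {"sku": "A-1", "qty": 2, "price": 9.5},
        {"sku": "B-7", "qty": 1, "price": 20.0},
    ],
}

# A toy "collection": documents serialized as JSON text and stored by id.
collection = {order["_id"]: json.dumps(order)}


def find_by_id(doc_id):
    """Return the full document for the given id as a Python dict."""
    return json.loads(collection[doc_id])


doc = find_by_id("order-1001")
total = sum(item["qty"] * item["price"] for item in doc["items"])
print(doc["customer"]["name"], total)
```

A single lookup returns everything about the order, which is the trade-off this type of database makes against the normalized, join-based layout.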
Databases of this type can respond in real time when selecting and inserting particular records. Most of them keep data mainly in RAM but also offer persistent storage on HDD or SSD for some cases. Most of these databases operate on key/value records, so the values may resemble the document-oriented format. But some also operate on columns and allow secondary indexes within the same table. Using RAM makes data processing fast, but it also makes storage more volatile and more expensive.
Advantages
Disadvantages
Examples: Redis, Tarantool, Apache Ignite
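A rough sketch of the "RAM first, disk as a backup" approach might look like the following. The `InMemoryStore` class and its snapshot format are hypothetical and far simpler than what real in-memory databases do; the sketch only shows that reads and writes touch a dictionary in memory, while persistence is a separate, occasional step.

```python
import json
import os
import tempfile


class InMemoryStore:
    """Toy key/value store: all reads and writes hit a dict held in RAM."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def snapshot(self, path):
        """Write the whole dataset to disk, replacing the old snapshot atomically."""
        tmp_path = path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self._data, f)
        os.replace(tmp_path, path)

    @classmethod
    def load(cls, path):
        """Rebuild the in-memory state from a snapshot file."""
        store = cls()
        with open(path, encoding="utf-8") as f:
            store._data = json.load(f)
        return store


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "snapshot.json")
    store = InMemoryStore()
    store.set("session:42", {"user": "ada"})
    store.snapshot(path)                    # persistence is an explicit step
    restored = InMemoryStore.load(path)     # simulates a restart

print(restored.get("session:42"))
```

Anything written after the last snapshot would be lost on a crash in this sketch, which illustrates why RAM-based storage is considered more volatile.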
These databases store data as key/value records on HDD or SSD. These solutions are designed to scale well enough to manage petabytes of data across thousands of commodity servers in a distributed system. They are built on the SSTable architecture. This architecture was designed for two use cases: fast access by key and fast, highly available writes.
Advantages
Disadvantages
Examples: Cassandra, HBase
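The SSTable idea can be illustrated with a small sketch: writes go to an in-memory table, which is periodically flushed as a sorted, immutable segment, and reads check the newest data first. The `TinyLSM` class below is a hypothetical toy that keeps its segments in Python lists instead of files and skips compaction, so it should be read as an outline of the technique rather than how Cassandra or HBase actually implement it.

```python
import bisect


class TinyLSM:
    """Toy store: a mutable memtable plus sorted, immutable segments."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.segments = []  # each segment is a sorted list of (key, value); oldest first
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        # Writes only touch memory, which keeps them fast.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # A real system would write this sorted run to disk as an SSTable file.
        if self.memtable:
            self.segments.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        # Newest segment wins; binary search works because segments are sorted.
        for segment in reversed(self.segments):
            i = bisect.bisect_left(segment, (key,))
            if i < len(segment) and segment[i][0] == key:
                return segment[i][1]
        return None


db = TinyLSM(memtable_limit=2)
db.put("b", 1)
db.put("a", 2)   # memtable is full, so it is flushed as a sorted segment
db.put("a", 3)   # newer value stays in the memtable and shadows the old one

print(db.get("a"), db.get("b"), db.get("z"))
```

Segments are never modified after a flush, which is what makes writes cheap; the cost is that a read may have to look in several places.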
Sometimes we need fast access to data not by particular keys but by particular columns. In this case, we are better off giving up row-by-row inserts and moving to batch writes. Batch inserts allow columnar databases to prepare the data for fast reads by column.
Advantages
Disadvantages
Examples: Vertica, ClickHouse
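As a simplified illustration of why batches and columns go together, the sketch below stores each column as its own list and appends a whole batch at once. The `ColumnStore` class and the sample fields are invented for the example; real columnar databases add compression, sorting, and on-disk layouts that are not shown here.

```python
class ColumnStore:
    """Toy columnar table: one Python list per column."""

    def __init__(self, columns):
        self.columns = {name: [] for name in columns}

    def insert_batch(self, rows):
        # Pivot the batch from rows to columns first, so a malformed row
        # raises KeyError before any column has been modified.
        pivoted = {name: [row[name] for row in rows] for name in self.columns}
        for name, values in pivoted.items():
            self.columns[name].extend(values)

    def column_sum(self, name):
        # An aggregate reads a single column and never touches the others.
        return sum(self.columns[name])


events = ColumnStore(["user", "bytes"])
events.insert_batch([
    {"user": "ada", "bytes": 100},
    {"user": "bob", "bytes": 250},
    {"user": "ada", "bytes": 50},
])

total_bytes = events.column_sum("bytes")
print(total_bytes)
```

Summing `bytes` scans one contiguous list, whereas a row-oriented layout would have to walk every full record to pick out the same values.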
If we want to filter data by any value, or even by any word within a column, we should consider search engines. These databases index every word in the columns and support full-text search. They are perfect for storing and analyzing logs or large text values.
Advantages
Disadvantages
Examples: Elasticsearch, Apache Solr
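Indexing every word is commonly done with an inverted index, which maps each word to the set of records that contain it. The sketch below is a bare-bones version under simple assumptions: the `tokenize`, `build_index`, and `search` helpers and the sample log lines are hypothetical, and there is no ranking, stemming, or stop-word handling.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map every word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every word of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


logs = {
    1: "ERROR disk full on node-3",
    2: "INFO backup finished",
    3: "ERROR timeout on node-3",
}
index = build_index(logs)

print(search(index, "error node"))
print(search(index, "disk error"))
```

A query becomes a handful of set lookups and intersections instead of a scan over every text value, which is why this structure suits log analysis.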
Some use cases call for graph data structures, and graph databases are built around them. If your tasks require working with graphs, these specialized databases are designed to meet your needs.
Advantages
Disadvantages
Examples: Neo4j
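A typical graph task is finding how two entities are connected. The sketch below models a tiny graph as an adjacency list and uses breadth-first search to find a shortest path; the names in the graph and the `shortest_path` helper are made up for illustration, and a graph database would express the same question in its own query language.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of nodes on a shortest path, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Directed "follows" relationships between hypothetical users.
follows = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": ["erin"],
    "erin": [],
}

path = shortest_path(follows, "alice", "erin")
print(path)
```

In a relational database the same question would need one self-join per hop, which is the kind of workload graph databases are designed to avoid.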
Almost every task can be done with almost any type of database. The question is how expensive and how well optimized the result would be. Choosing the tool you are used to can reduce your time to market, but it can also cost you an enormous amount of money to maintain and expand hardware that may be used inefficiently. Always try to use a database the way it was meant to be used. Perhaps a solution that suits your needs already exists.