With over 300 databases on the market, how do you determine which is right for your specific use case or skill set? We continue to see the common debate of SQL vs. NoSQL and other database comparisons all over social media and platforms like Hackernoon. In most cases, it’s not that one database is better than the other, it’s that one is a better fit for a specific use case due to numerous factors.

Last year, our CTO Kyle Bernhardy led an awesome talk titled A Deep Dive Into Database Architectures. The talk is available to watch, but since this is such a prominent discussion topic we thought it might be helpful to summarize. This article will provide an overview of database architectures, including use cases and pros & cons for each of them.

Let’s start with general considerations when selecting a database

It’s important to understand things such as data type / structure, data volume, consistency, write & read frequency, hosting, cost, security, and integration constraints. The more you know about these factors, the easier it will be to pick the right database for your project.

You may already know that there are generally 3 database hosting options:

On-Premises
- Database fully maintained by the organization on servers running within their data center(s)
- More control, but usually more expensive and time consuming

Cloud Hosted
- Servers are maintained by cloud providers; organizations maintain the database software and operating system running on the machine
- Flexible scaling and no server upkeep, but no control over the physical server and potential network limitations

Database-as-a-Service (DBaaS)
- Database maintained by the service provider; organizations are only charged for usage of the service
- Cost effective and zero upkeep, but with data stewardship concerns and potential network limitations

Now for the part you’ve been waiting for: database architectures.

Relational Databases

We’ll start with the most commonly used.
Relational (SQL) databases, such as Oracle, MySQL, PostgreSQL, Microsoft SQL Server, and SQLite, organize data into tables with columns, each with a specified name and datatype. Additionally:

- Rows are identified with a unique attribute, or grouping of attributes, called a primary key (typically a single column)
- Relationships between tables are defined through foreign keys which reference primary keys
- Strict schema (data model) enforcement
- Data accessed via Structured Query Language (SQL)
- ACID compliance (Atomicity, Consistency, Isolation, and Durability)
- Extra features like triggers & stored procedures

On the plus side, relational databases use mature technology that is widely understood and well-documented, SQL standards are well-defined, defined constraints enforce data integrity, they avoid data duplication, and they are highly secure and ACID-compliant. However, on the negative side, SQL databases are not well suited to unstructured or semi-structured data, their tables don’t necessarily map to objects, they require complicated ETL (Extract, Transform, Load) and maintenance, they rely on row locking, and pricing for some products (Oracle, SAP) is out of reach for developers and some organizations.

Note: While some RDBMS systems can now handle JSON, they are not purpose built to do so.

There are several awesome use cases for relational databases: situations where data integrity is absolutely paramount (financial applications, defense and security, private health information), highly structured data, and automation of internal processes.

Then comes everything else…

Relational databases are the most common database in production today, but they were not designed for the scale and agility of modern applications. About 10 years ago, the NoSQL movement caught on to address these concerns and changed the database landscape forever.
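To make the relational traits listed above concrete (primary keys, foreign keys, strict schema enforcement), here is a minimal sketch using Python's built-in sqlite3 module. The `customers` and `orders` tables, their columns, and the `order_rejected` helper are hypothetical names chosen for illustration; they are not from the talk.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two related tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customers(id),"
        " total_cents INTEGER NOT NULL CHECK (total_cents >= 0))"
    )
    return conn


def order_rejected(conn: sqlite3.Connection, customer_id: int, total_cents: int) -> bool:
    """Try to insert an order; return True if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO orders (customer_id, total_cents) VALUES (?, ?)",
                (customer_id, total_cents),
            )
    except sqlite3.IntegrityError:
        return True
    return False


conn = build_demo_db()
with conn:
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")

valid_rejected = order_rejected(conn, 1, 2500)    # existing customer, valid total
orphan_rejected = order_rejected(conn, 99, 1000)  # no customer with id 99
negative_rejected = order_rejected(conn, 1, -5)   # violates the CHECK constraint

# A join follows the foreign key back to the customer row.
rows = conn.execute(
    "SELECT c.name, o.total_cents FROM customers c"
    " JOIN orders o ON o.customer_id = c.id"
).fetchall()
```

In this sketch the database itself, not the application code, refuses the orphaned and negative orders, which is the data-integrity benefit described above.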
Note:

- NoSQL has no single agreed meaning (Non-SQL, Non-relational, Not Only SQL)
- Designed to address structure, performance, data volume, and scalability
- Relational databases have since addressed many of these concerns

While databases are typically categorized as SQL or NoSQL, there are many intricacies to NoSQL databases. Let’s get into it.

Key-Value Stores

Key-value stores are often used as the underlying storage for higher level databases. For example, MongoDB uses a key-value store called WiredTiger as its default storage engine. Key-value stores, such as Redis, DynamoDB, and Cosmos DB:

- Are simple, basic, & schemaless
- Provide basic functionality for retrieving arbitrary data via a specific key
- Accept any kind of value: single values, arrays, objects, files, etc.
- Do not evaluate the data they are storing
- Use a data structure that can be referred to as a dictionary or hash table

For pros, key-value stores provide fast, low-complexity access to data, are flexible, and can scale quickly and cheaply. However, they have extremely limited functionality, cannot handle complex structures or query or search by anything other than key, do not scale well as data models grow, and require more programming overhead for complex implementations.

Example use cases for key-value stores would be embedded systems, URL shorteners, configuration data, application variables and flags for web applications, state information, and data represented by a dictionary or hash.

Document Stores

Document stores, such as MongoDB, DynamoDB, Couchbase, and Firebase, are similar to key-value stores, but the value is a document.
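The distinction just drawn can be sketched in a few lines of Python. This is an in-memory illustration only, assuming hypothetical `KeyValueStore` and `DocumentStore` classes; it is not the API of Redis, MongoDB, or any other product named here.

```python
import json


class KeyValueStore:
    """Stores opaque values; the only lookup is by exact key."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value  # the store never inspects the value

    def get(self, key: str):
        return self._data.get(key)


class DocumentStore:
    """Stores documents by key and can also filter on their fields."""

    def __init__(self):
        self._docs = {}

    def put(self, key: str, doc: dict) -> None:
        self._docs[key] = doc

    def get(self, key: str):
        return self._docs.get(key)

    def find(self, **criteria) -> list:
        """Return the sorted keys of documents whose fields match all criteria."""
        return sorted(
            key
            for key, doc in self._docs.items()
            if all(doc.get(field) == value for field, value in criteria.items())
        )


users = {
    "user:1": {"name": "Ada", "plan": "pro"},
    "user:2": {"name": "Bob", "plan": "free"},
    "user:3": {"name": "Cy", "plan": "pro", "team": "data"},  # extra field is fine
}

kv = KeyValueStore()
docs = DocumentStore()
for key, user in users.items():
    kv.put(key, json.dumps(user).encode())  # serialized blob, opaque to the store
    docs.put(key, user)

# Key-value: the application must fetch by key and decode the value itself.
ada = json.loads(kv.get("user:1"))

# Document store: the database can look inside the value.
pro_users = docs.find(plan="pro")
```

With the key-value store, finding every "pro" user would mean fetching and decoding each value in application code, which is the programming overhead mentioned in the cons above.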
In addition, document stores have the following characteristics:

- Document formats are typically JSON, BSON, or XML
- Schemaless, no data structure enforcement (documents can be different)
- Data accessed and modified via a NoSQL (or proprietary) query language
- Well-suited for unstructured and semi-structured data
- Seen as easier for development

The pros of document stores include flexibility and scalability, a schemaless design, fast writes, and a good fit for semi-structured and unstructured data. Developers also do not need to know the data structure ahead of time, and it can change over time without downtime. The cons are that many are not fully ACID compliant, querying is largely limited to within a document, relationships/cross references are not enforced, searching can be slow, joins across documents/collections in a single query are limited or unavailable, the lack of database enforcement requires developer discipline and vigilance for application-level enforcement, and they typically result in data duplication.

Great use cases for document stores are unstructured or semi-structured data, content management, rapid prototyping, and collection of high-traffic data.

Graph Databases

Graph databases, such as Neo4j, OrientDB, and TitanDB, are ideal when relationships or connections are top priority. They:

- Are based on mathematical graph theory
- Represent data as a network of related nodes, edges, and properties
- Store data items within nodes and relationships in edges that connect nodes
- Connect nodes by relationships and group them according to labels
- Facilitate data visualizations and graph analytics
- Allow each node to contain free-form data

On the plus side, graph databases have advanced features for relationship querying, traversing, and tracking, are optimized for querying related data, and avoid row locking. As for the negatives, graph databases have a large ramp-up time for developers, high overhead for simple use cases, a lack of standardization, and poor performance on aggregate queries, and devs typically need to learn a custom query language.
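As a rough illustration of the node-and-edge model and relationship traversal described above, the sketch below runs a breadth-first search over a tiny in-memory graph. The node names, the `FOLLOWS` and `PAID` relationship types, and the `reachable_within` function are assumptions made for this example; a real graph database would express the same idea in its own query language.

```python
from collections import deque

# Nodes carry a label plus free-form properties.
NODES = {
    "alice": {"label": "Person", "city": "Denver"},
    "bob": {"label": "Person"},
    "carol": {"label": "Person", "verified": True},
    "dave": {"label": "Person"},
}

# Edges are (source, relationship type, destination) triples.
EDGES = [
    ("alice", "FOLLOWS", "bob"),
    ("bob", "FOLLOWS", "carol"),
    ("carol", "FOLLOWS", "dave"),
    ("alice", "PAID", "dave"),
]


def build_adjacency(edges, rel_type):
    """Index outgoing edges of one relationship type by source node."""
    adjacency = {}
    for src, rel, dst in edges:
        if rel == rel_type:
            adjacency.setdefault(src, []).append(dst)
    return adjacency


def reachable_within(start, rel_type, max_hops):
    """Breadth-first traversal: map each reachable node to its hop count."""
    adjacency = build_adjacency(EDGES, rel_type)
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(node, []):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    del hops[start]  # the start node is not its own result
    return hops


two_hops = reachable_within("alice", "FOLLOWS", 2)
three_hops = reachable_within("alice", "FOLLOWS", 3)
```

A "who can Alice reach within N hops" question like this is awkward to express as relational joins, which is why it is the kind of query graph databases are optimized for.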
Graph databases are great for analysis of heterogeneous data points, fraud prevention, advanced enterprise operations, social networking, payment systems, and geospatial routing/visualization.

Time Series Databases

We’re almost there! Next up is time series databases, such as InfluxDB, Kdb+, and Prometheus, which are:

- Focused on datasets that change over time
- Heavily write oriented
- Designed to handle constant streams of data
- Typically append-only (no modification after ingestion)
- Equipped with rollup/aggregation/downsampling features to lower the archive data footprint

On the positive side, time series databases are designed for dealing with linear data over time, can handle high ingestion rates, and have built-in features specifically for dealing with time-based data, a schema optimized for time-series arrays, and batch delete features. As far as negatives, time series databases only deal with time-series data, do not support full SQL, have read speeds that suffer compared to writes, have no transaction capability, and are append-only (not optimized for updates).

Great use cases for time series databases are managing infrastructure, IoT sensor collection, and log monitoring and alerting.

Search Engines

Last but not least is search engines, such as Elasticsearch, Splunk, and Apache Solr, which are:

- Built for non-relational, document-based data
- Arranged and optimized for storage and rapid retrieval of data
- Able to index data across a variety of sources including file systems, intranets, document management systems, e-mail, and databases

Search engine pros include their focus on optimized searching, high scalability, a schemaless design, and advanced search options like full text search, suggestions, and complex search operations. The cons are that they are expensive, have low durability and poor security, have no transaction support, are not efficient for writing and retrieving data outside of searching, and are difficult to manage.
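One common way to achieve the rapid retrieval described above is an inverted index, which maps each term to the documents that contain it. The sketch below is a simplified, assumption-laden illustration (the `tokenize`, `build_index`, and `search` names and the sample posts are hypothetical); it is not how any specific product named here is implemented, and real engines add ranking, stemming, and much more.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: dict) -> dict:
    """Map every term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index: dict, query: str) -> list:
    """Return sorted ids of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    # Intersect the posting sets; a missing term yields an empty result.
    matches = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(matches)


posts = {
    "post-1": "Choosing a database for time series data",
    "post-2": "Graph database basics",
    "post-3": "Time series alerts with logs",
}

index = build_index(posts)
time_series_hits = search(index, "time series")
database_hits = search(index, "Database")
no_hits = search(index, "graph series")
```

Because the index is built ahead of time, each query only touches the posting sets for its terms rather than scanning every document, at the cost of extra work on every write.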
Search engines are great when search results are top priority, and for logging, product catalogs, and blogs.

Multi Model Considerations

It should be noted that many real life implementations choose “polyglot persistence,” which is the concept of using different data storage technologies to handle different data storage needs within a given software application. Many database technologies implement this within a single piece of software, referred to as multi-model or data lake technology. This can be ideal for specific use cases, but poses inherent risks such as instability, data inconsistency and corruption, high expense and resource usage, and data replication on disk and in memory.

A few examples of multi-model technologies are:

- Hadoop: Software tools running on top of multiple databases
- MongoDB: Multiple data storage options in a single database software platform
- PostgreSQL: Row-oriented, column-oriented, key-value, and document-oriented data storage options

Finally, as a database company...

It probably makes sense to discuss where HarperDB fits into the mix, if for no other reason than to highlight that these database “categories” are not all black and white, and some databases take more of a “hybrid” approach by subscribing to several methodologies. Instead of falling into a single bucket, HarperDB can be considered as a structured object store with SQL capabilities. It features:

- Built-in REST API
- NoSQL & SQL operations including joins
- Dynamic schema
- Advanced publish-subscribe data replication
- Self-service management studio
- Traditional drivers/interfaces

We built HarperDB from the ground up to expand and blend the best capabilities of SQL, NewSQL, and NoSQL products because we felt there were certain use cases that could be better served with another solution. We believe that providing developers with the ability to choose the right tool for the job empowers developers and spurs innovation.
Some examples where we feel that HarperDB is a better fit include cases where you need both NoSQL & SQL, rapid application development, hybrid cloud, integration, edge computing, distributed computing, and real-time operational analytics. Essentially, the SQL vs. NoSQL debate becomes irrelevant with HarperDB because you no longer have to choose!

Hopefully this database architecture overview is helpful in finding the right database for your use case. Please reply below with questions or comments; we would love to discuss!

Also published at https://dev.to/harperdb/database-architectures-use-cases-explained-5711
