paint-brush
Hybrid Data Models: How To Have Your JSON Cake and Eat MariaDB Tooby@grobbert
2,501 reads
2,501 reads

Hybrid Data Models: How To Have Your JSON Cake and Eat MariaDB Too

by Rob HedgpethAugust 5th, 2021
Read on Terminal Reader
Read this story w/o Javascript
tldt arrow

Too Long; Didn't Read

MariaDB is a free (yea, free) open-source database solution that you can use for anything from some lightweight tinkering to supporting fully-fledged, production-grade applications. At its core, it's a relational database management system, but diving deeper into its capabilities you’ll quickly discover that it's more than that. One of those capabilities is its ability to handle JavaScript Object Notation (JSON) formatted data, completely free and out-of-the-box. The ability to store. JSON documents within a. relational database allows you to create hybrid data models, containing both structured and semi-structured data.

Companies Mentioned

Mention Thumbnail
Mention Thumbnail

Coin Mentioned

Mention Thumbnail
featured image - Hybrid Data Models: How To Have Your JSON Cake and Eat MariaDB Too
Rob Hedgpeth HackerNoon profile picture

I’m willing to bet that if you’re reading this article you’ve at least heard of MariaDB. Heck, maybe you’ve even used it at some point. But, if not, I highly recommend that you give it a quick glance as, at the highest level, it’s a pretty solid, free (yea, free) open-source database solution that you can use for anything from some lightweight tinkering to supporting fully-fledged, production-grade applications.


To get started with MariaDB, just in case you want to follow along with the samples in this article, you can download it directly from the website, start a service using their database-as-a-service offering SkySQL or spin up a container from the official MariaDB Docker image.


In the beginning, and at its core, MariaDB is a relational database management system, or RDBMS for short, but diving deeper into its capabilities you’ll quickly discover that it's more than that.


Much, much more.


One of those capabilities is its ability to handle JavaScript Object Notation (JSON) formatted data, completely free and out-of-the-box. OK, but why is that important? Well, in the context of databases, JSON has often been thought of as something you’d use with NoSQL solutions. And that makes sense as one of the problems the “NoSQL Revolution” set out to solve was flexibility, or having the ability to create, update and remove data and the structures they're housed in without having to modify things like those pesky relational database schemas.



Structured vs. Semi-structured data


Keying off of the success that NoSQL solutions have been able to achieve by using semi-structured data in that time, over the past few years JSON integrations have made their way into the relational world. And for good reason. The ability to store JSON documents within a relational database allows you to create hybrid data models, containing both structured and semi-structured data, and enjoy all of the benefits of JSON without having to sacrifice the advantages of relational databases (e.g. SQL and all things data integrity).


Keying off of the success that NoSQL solutions have been able to achieve by using semi-structured data in that time, over the past few years JSON integrations have made their way into the relational world*.*


OK, enough of this “setting the stage” b’niss. Let’s check out some of the JSON functionality that’s available in MariaDB and how you can use it. I say some because MariaDB contains a large amount of JSON functionality. In fact, too much to cover in just a single blog post. However, we can certainly hit the high points, which should give you the foundation you need to dive even deeper.

Structured Data + Semi-Structured Data

There are a multitude of use cases where it may make sense to combine structured and semi-structured data. That’s just the world of software development for ya. However, I’ve always found it easiest to consume new technologies by focusing on a simple, (hopefully) relatable use case that you can then use to get your own creative juices flowing.


To help walk-through the JSON capabilities that are available within MariaDB I’m going to be using a hypothetical application. This application will only contain one table, called locations, that will store, yep, you guessed it, locations. Simple enough, right?


We won’t be using any kind of front-end management, but imagine that the locations could be represented on some kind of map like the following.


Using location data to place points of interest on a map

From the simplest standpoint, geographic locations, no matter the type, contain foundational information such as the name, type, longitude and latitude. But, depending on the type, each location could have different details.


Structured and semi-structured data in the same table

Table Creation

Using JSON within MariaDB is as easy as enabling the ability to store JSON data within a table. In fact, the SQL used to create a new location table should look very familiar.


CREATE TABLE locations (
    id INT NOT NULL AUTO_INCREMENT,
    name VARCHAR(100) NOT NULL,  
    type CHAR(1) NOT NULL,
    latitude DECIMAL(9,6) NOT NULL,
    longitude DECIMAL(9,6) NOT NULL,
    attr JSON, 
    PRIMARY KEY (id)
);


Note that the attr column included within the locations table is defined with a JSON data type. More specifically the column is using a JSON alias data type. That means there’s no actual JSON data type, but, instead, the JSON specified data type is converted into an existing data type within MariaDB.


…there’s no actual JSON data type, but, instead, the JSON specified data type is converted into an existing data type within MariaDB.


Taking a closer look, we can use the SHOW CREATE query to inspect the details of what’s actually been created.


SHOW CREATE TABLE locations;


Executing the previous statement will yield the following result.


CREATE TABLE locations (
    id INT NOT NULL AUTO_INCREMENT,
    name VARCHAR(100) NOT NULL,  
    type CHAR(1) NOT NULL,
    latitude DECIMAL(9,6) NOT NULL,
    longitude DECIMAL(9,6) NOT NULL,
    attr LONGTEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_bin DEFAULT NULL 
    CHECK (JSON_VALID(`attr`)
 	PRIMARY KEY (id)
);


Notice that the data type for the attr column is LONGTEXT. Beyond that you’ll see that there are a few constraints added to the field for the character set and collation. Most importantly, however, is the CHECK constraint, which indicates a function that will be executed when the data in attr has been modified, either through insertion or updating.


The JSON_VALID() function is a predefined function that serves to receive JSON data (in the form of a string) and validate whether or not it’s valid. Valid meaning properly formed JSON data.


Using the JSON_VALID to validate JSON data before storing it

Inserting Data

The JSON data that you insert is contained within quotes, just like any other string-based information you’d insert. The only difference is that the string must be valid JSON.


INSERT INTO locations (type, name, latitude, longitude, attr) VALUES 
     ('R', 'Lou Malnatis', 42.0021628, -87.7255662,
  '{"details": {"foodType": "Pizza", "menu": "https://www.loumalnatis.com/our-menu"}, 
     	"favorites": [{"description": "Pepperoni deep dish", "price": 18.75}, 
{"description": "The Lou", "price": 24.75}]}');


Note that you can specify different JSON data, with completely different structures, within the insert to the same table. Which makes sense, of course, because that’s the entire point!


INSERT INTO locations (type, name, latitude, longitude, attr) VALUES 
     ('A', 'Cloud Gate', 41.8826572, -87.6233039, 
'{"category": "Landmark", "lastVisitDate": "11/10/2019"}');


…you can specify different JSON data, with completely different structures, within the insert to the same table.

Querying JSON Data

You’ve probably picked up on it by now but managing JSON data within MariaDB really boils down to using predefined functions. For the rest of this article we’ll be taking a look at several of the functions that are available to you.

Reading Scalar Data

The JSON_VALUE() function returns a JSON scalar value from the specified path in the specified data. In the following example I’ve used the attr column as the “specified data”, but note that the JSON supplied to the function could just as well be a raw string of JSON data.


SELECT name, latitude, longitude, 
    JSON_VALUE(attr, '$.details.foodType') AS food_type
FROM locations
WHERE type = 'R';


Which, depending on the data previously inserted into the locations table, could yield a result similar to the following.


And if you’re wondering “what about handling null/non-existent values?”, because, due to the nature and, really, the purpose of semi-structured, that’s kind of the point. Yep, the JSON_VALUE() function handles that.


What about handling null/non-existent values? …Yep


This whole query then displaying the tabular results type flow is something I’ll keep going with throughout this article. Just a heads up!


You’re also not limited to using the JSON functions strictly as part of the SELECT clause. You can just as easily use them within the filtering portions.


SELECT id, name, latitude, longitude
FROM locations
WHERE type = 'S' AND
    JSON_VALUE(attr, '$.details.yearOpened') = 1924;


Reading Object Data

The JSON_QUERY() function accepts JSON data and a JSON path and returns JSON data. The difference between JSON_VALUE() and JSON_QUERY() is that JSON_QUERY() returns entire JSON object data.


SELECT name, description, 
    JSON_QUERY(attr, '$.details') AS details
FROM locations
WHERE type = 'R'


The JSON_QUERY() function can also return arrays.


SELECT name, description, 
    JSON_QUERY(attr, '$.teams') AS home_teams
FROM locations
WHERE type = 'S';


Creating Indexes

At this point you may be wondering with all this querying going on, what about being able to create (performance enhancing) indexes? Is that even possible? You’re damn right it is!


It all starts with the creation of a virtual column.


ALTER TABLE locations ADD COLUMN 
    food_type VARCHAR(25) AS (JSON_VALUE(attr, '$.details.foodType')) VIRTUAL;


Then you can use the virtual column, in combination with other virtual or persistent columns, to create a new index.


CREATE INDEX foodtypes ON locations(food_type);


Modifying JSON Data

As you know, reading data is really only half the battle. To really get the value out of being able to store JSON data within a relational database you also need to be able to modify, or write, it. Luckily, MariaDB provides a bunch of functionality for this as well.


Inserting Fields

The JSON_INSERT() function returns JSON data created by inserting one or more path/value pairs to JSON data.


UPDATE locations
SET attr = JSON_INSERT(attr,'$.nickname','The Bean')
WHERE id = 8;


Inserting Arrays

You can also create new arrays using the JSON_ARRAY() function. Then, within the JSON_INSERT() function, the new array can be inserted into the specified JSON data (in this case the attr field).


UPDATE locations
    SET attr = JSON_INSERT(attr, 
 '$.foodTypes', 
        JSON_ARRAY('Asian', 'Mexican'))
WHERE id = 1;


Adding Array Elements

Using the JSON_ARRAY_APPEND() function you can modify an existing array by adding one or more elements.


UPDATE locations
    SET attr = JSON_ARRAY_APPEND(attr, 
                '$.foodTypes', 'German’)
WHERE id = 1;


Removing Array Elements

The JSON_REMOVE() can be used to remove an array element, specified by index.


UPDATE locations
    SET attr = JSON_REMOVE(attr,       
                           '$.foodTypes[2]')
WHERE id = 1;


The JSON_REMOVE() function is so powerful that it can be used to return a resulting JSON document after removing any JSON data (i.e. array element, object, etc.) at the specified path(s) from JSON data.


Hybrid Data Querying

It may be the case that you want to create JSON data from structured data. For that you can use the JSON_OBJECT() function.


SELECT 
JSON_OBJECT('name', name, 'latitude', latitude, 'longitude', longitude) AS data
FROM locations
WHERE type = 'S';


Merging Data

You can merge the data returned from the JSON_OBJECT() function and merge it with existing JSON data by using the JSON_MERGE() function. Notice below that you can create an entirely new JSON object, using the JSON_OBJECT() function, and then use the JSON_MERGE() function to combine, or merge, it with the value of the attr field.


SELECT 
  JSON_MERGE(
    JSON_OBJECT(
        'name', name, 
        'latitude', latitude,
        'Longitude', longitude), 
    attr) AS data
FROM locations
WHERE type = 'R';


JSON to Tabular Data

In MariaDB 10.6, currently the latest version, the JSON_TABLE() function was added. This new function enables you to transform JSON data directly into tabular format, which can even be used directly within a FROM clause to join to other tables (or tabular data).


SELECT l.name, d.food_type, d.menu
FROM
	locations AS l,
JSON_TABLE(l.attr,
		 ‘$’ COLUMNS(
		        food_type VARCHAR(25) PATH ‘$.foodType’,
		        menu VARCHAR(200) PATH ‘$.menu’)
           ) AS d
WHERE id = 2;


You can find a more detailed article on the new JSON_TABLE() function here!


Enforcing Data Integrity

Lastly, I’d like to touch on the ability you have to be able to enforce data integrity within JSON data that exists within MariaDB. In more plain English that means that you have the ability to create constraints, or requirements, for the types of JSON that is allowed to exist within your tables.


Below is an example of how you can create a new CONSTRAINT, in this case named check_attr, that specifies that for each location of type ‘S’ the JSON data within it has to fit particular criteria. Namely you can control things like the data types for properties or values, whether a property must exist, and even the length of the values within a specified property. This is all accomplished using the JSON functions within MariaDB. As you can see it’s extremely flexible and powerful.


ALTER TABLE locations ADD CONSTRAINT check_attr
    CHECK(
        type != 'S' OR (type = 'S' AND 
            JSON_TYPE(JSON_QUERY(attr, '$.details')) = 'OBJECT' AND
            JSON_TYPE(JSON_QUERY(attr, '$.details.events')) = 'ARRAY' AND
            JSON_TYPE(JSON_VALUE(attr, '$.details.yearOpened')) = 'INTEGER' AND
       JSON_TYPE(JSON_VALUE(attr, '$.details.capacity')) = 'INTEGER' AND
       JSON_EXISTS(attr, '$.details.yearOpened') = 1 AND
            JSON_EXISTS(attr, '$.details.capacity') = 1 AND
            JSON_LENGTH(JSON_QUERY(attr, '$.details.events')) > 0));


Next Steps

Thanks so much for reading this article on how you can combine the power of MariaDB with the flexibility of JSON. We were really only able to scratch the surface of what’s available within MariaDB.


The full list of JSON functions currently available within MariaDB.

Ultimately, we all learn in different ways. If you’d like to learn even more about the JSON functionality that’s available, and how you can start creating hybrid data models using MariaDB, please check out the following resources I’ve also put together.



And if you have any questions, comments or suggestions please feel free to reach out to me at [email protected] or @probablyrealrob on Twitter.


Reminder: To get started with MariaDB (completely for free) you can download it directly, start a service using their database-as-a-service offering SkySQL or spin up a container from their Docker image.


Happy coding!