Lately, one of my favorite database features is the JSONB data type in PostgreSQL. To recap the SQL vs NoSQL debate: Structure is good! SQL crowd Structure is bad! NoSQL crowd With JSONB data in PostgreSQL, you can get the best of both worlds. We know that a rigid schema is not a good fit for every type of data, because not all real-world data adheres to a strict format. For example, log messages, errors, code traces, and JSON web services all return diverse families of data. And, there are plenty of good use cases for storing this kind of data in the database. But it's also true that relational technology is extremely powerful, and it it would be a shame to have to give it all up to just store a bit of unstructured data. In this post, I’m not recommending that you check out JSONB; you’ve probably already done that. I’m going to suggest two techniques that we use at to make JSONB data even more effective. Let's dig in. just AppLand Efficient querying - Indexes on JSONB You know that when you have SQL data stored in strongly typed columns, every row has a uniform structure, and data is both efficient to retrieve and easy to to index. What you may not know is that PostgreSQL can also create columns. So, if you have a common field such as within a JSONB document, you can add an index on that field. Then you can efficiently search by within the JSONB, and PostgreSQL won’t have to slog through all that JSON to find the matching names. Not only that, but you can also SELECT the from within the JSON and PostgreSQL can retrieve that data without parsing the JSON, using a technique called . indexes on JSONB name name name index-only scan Efficient retrieval - Using generated columns with JSONB Now, what if you want to be able to efficiently fetch a subset of JSONB data, without having to load and parse every object? At AppLand, we store in the database. Each AppMap has a mandatory section called , and we want to be able to efficiently pull out the metadata of selected rows without having to parse the entire JSONB column (the metadata is usually less than 1% of the total JSON object). Awesomely, since PostgreSQL 12, PostgreSQL has a feature called . AppMap data metadata generated columns If you’ve ever worked with triggers, you know that when a row is inserted, updated, or deleted, you can instruct the database to run an in-database function which can process the row in a pretty much unlimited way. A generated column is like a trigger; it runs automatically whenever a row is inserted or updated. The generated column specifically applies a function to the new row, and then the result of that function on the row. stores So, you can use generated functions to pull frequently used data out of a JSONB column, and store it on the row. Then you can write a query to access the copied/cached (“de-normalized”) data. It’s truly the best of both relational and non-relational worlds. An example Here's a concrete example of a JSONB index and generated column. For this example, I will use JSON, which represents a class dependency diagram and execution trace of a program. At the top level, the AppMap format includes , , and . AppMap metadata classMap events Let's store the AppMap data in a simple table called , and add a minimal record: appmaps CREATE TABLE appmaps ( id uuid DEFAULT gen_random_uuid() PRIMARY KEY, data jsonb ); -- data is schema-less so we can omit all fields except metadata INSERT INTO appmaps ( data ) VALUES ( '{"metadata":{"name":"login"}}'::jsonb ); Optimizing with an index When a user wants to retrieve an AppMap by name, the server needs to search . The double-arrow instructs PostgreSQL to convert the result to (from JSON), so that the query condition can be ordinary text. data->'metadata'->>'name' ->> text CREATE INDEX idx_appmaps_name ON appmaps USING BTREE ((data->'metadata'->>'name')); Note: There are you can use, with advantages and disadvantages. is good for text. different types of indexes BTREE This entire query can now be executed using just the index, using index-only scan: SELECT data->'metadata'->>'name' AS name FROM appmaps WHERE data->'metadata'->>'name' = 'login'; name ------- login (1 row) Optimizing with a generated column Let's say we want to be able to efficiently pull out the from the data column. Add the column: metadata GENERATED ALTER TABLE appmaps ADD COLUMN metadata JSONB GENERATED ALWAYS AS (data->'metadata') STORED; All rows now contain the column which holds stored, denormalized data: metadata SELECT id, metadata from appmaps; id | metadata --------------------------------------+------- c5e4cfa5-1945-4685-a50f-7e5b0edd62e5 | {"name": "login"} (1 row) Wrapping up These are a couple of techniques that we are using at AppLand to make JSON data act "relational" when we want it to. Not only does PostgreSQL provide the powerful JSONB data type, but it also provides the shortcuts and usability tricks that make JSONB truly useful. For those who are ramping up on some new-to-you backend code, and finding it hard to figure out how all the pieces fit together - s with . ee what the code really does AppMap - Google Maps for code Also published on . https://dev.to/kgilpin/effective-sql-indexing-and-denormalizing-jsonb-1bhm