The products and services which we are interacting with are evolving day by day with new sets of features and experiences. Food delivery services delivering food to every doorstep are now hosting dining services. Amazon selling books initially has now expanded to an e-commerce giant dealing with various categories of products.
Handling these business changes at a data store level can be a nightmare for software engineers if the design of the underlying model does not account for adaptability.
“Good Design is Good Business.”
Thomas Watson, IBM
As a database architect (role exists in the real world?) I have been asked to build a product model for an e-commerce application similar to Amazon helping end-users to buy different products on the same platform.
To understand the extensibility factor of the model, consider the e-com platform is initially selling books as a product and later extends to sell clothes and other product categories as well. All products may or may not have the same set of attributes. For example, the binding type (hardcover, spiral, e.t.c.) is an attribute specific to books and color variation (red, black, e.t.c.) is specific to clothing. In the future, I might encounter a product having different attributes specific to itself.
Let’s explore the different ways of designing such a model which can be extensible enough to store different products with a perpetual combination of attributes, strengths, and weaknesses.
Moving forward, I will be using product and attributes keywords a lot while explaining the approaches. To add some more context, the following are the definitions of those keywords.
attributes: Specific elements of a product such as size and color of a T-shirt are attributes.
product: Combination of all attributes will decide the product. T-shirt is a product.
Creating a single table called products
to store all the products with their attributes (size
, color
) values as columns. And having some extra reserved columns (extra_1
, …) for attributes which can arise with a new product launch and were not accounted for while designing the products
. So in the future, if a new product has material_type as an attribute, I can dedicate extra_1
to store the value of it.
Strengths
Weakness
ALTER table
).null
values. For example, a row with a book as a product will have a null
in both size
and color
column.Creating a base table called products
to store all the products and their common attributes (size
- assuming it is common for book products too) values as its columns and having different tables, book_products
and clothing_products
to only store attributes binding_type
and color, material_type
specific to respective products. A join of the base table (products
) on the inherited product table (book_products
/clothing_products
) will result in entire product detail (a book or a cloth).
Strengths
null
values can be brought down as all the specific attributes are present on different tables.
Weakness
ALTER table
, leading to a table lock on product
.Creating standalone tables for each product type (book_products
and clothing_products
) and having all the attributes (size
, binding_type
, color
) required as columns of each respective table. A new table will be created for every new product type.
The strengths and weaknesses of this approach are quite similar to Class Table Inheritance with the exception of the base class table not being present.
Creating a single table called products
to store all the products with common attributes (size
) as columns. And having a JSON column (extra
) for storing all the attributes specific to that product entry.
For example: a clothes product will have size (S/M/L) stored in the size
column and the rest of attributes as JSON data ({color: red, material_type: cotton}
) in extra
column. If the RDBMS type doesn’t support JSON datatype, then datatype such as text can be used which allows it to store large serialized objects (stringified JSON objects).
Strengths
ALTER
on the table.null
values can be brought down as all the specific attributes are present as JSON data, but JSON data takes a lot more space than normal rows and columns.
Weakness
Following this approach, I will separate out the entity (products
) and attributes (product_attributes
) as two different tables. product
will not be storing any information related to attributes. Instead, each attribute will be stored as a key (attribute
)-value (value
) entry in product_attributes
. Join of products
on product_attributes
will result in entire product details (a book or a cloth).
For example: To store a cloth product, I will be creating a product entry in products
and each attribute such as size - S, color - red and material_type - cotton as different key-value entries (rows) in product_attributes
with the entity referring to the products.id
.
Strengths
product_attributes
.product_attributes
which are required by the entity (product).
Weakness
product_attributes
, RDBMS data types and constraints rules cannot be exercised and instead have to be handled at the application level.product_attributes
leading to complex SQL operations, i.e. each attribute will lead to a join.Relational tables have a fixed set of columns and have to be decided while designing the model, whereas NoSQL databases are free of these rules and constraints. Any kind of data can be stored in the non-relational model.
Strengths
Weakness
The approaches explained in this article are some of the generic standards of handling database model flexibility. There can be several other ways as well to design models specific to every use case. This article can be thought of outlooking different ways to build basic extensible models and can be extended with tweaks specific to business requirements.
Each approach has its own strengths and weaknesses, so the best approach depends on the use case of a design problem being solved and the result of some brainstorming meetings with your fellow mates 😄.