Intro to Databases: Using Different Data Models and Representing Databases Visually

Written by lazar.gugleta | Published 2020/01/20

As you get into databases and data science, the first thing you have to master is the relationships between the entities in your database. That is important because the data you use has to be structured efficiently for everything that is later built on it.
Let’s see right away how to get an overview of the data and prepare it.
As an example, I will use a database based on FIFA World Cups from 1954–2014.
In this database, we have multiple entities, which are its most important part and which are connected to one another. This is a visual representation of what those relationships look like:
When connecting the entities, you also have to add a notation so it’s easier to recognize the relationship between them. For this database, I am going to use the Modified Chen Notation. Chen’s notation for entity-relationship modeling uses rectangles to represent entity sets and diamonds to represent relationships. Relationships are treated as first-class objects: they can have attributes and relationships of their own. If an entity set participates in a relationship set, the two are connected with a line.
Here is what that looks like:
1:1 - [1] to [1]
1:C - [1] to [0 or 1]
1:M - [1] to [at least 1]
1:MC - [1] to [arbitrarily many]
C:C - [0 or 1] to [0 or 1] (corresponds to 1:1 in Chen)
C:M - [0 or 1] to [at least 1]
C:MC - [0 or 1] to [arbitrarily many] (corresponds to 1:N in Chen)
M:M - [at least 1] to [at least 1]
M:MC - [at least 1] to [arbitrarily many]
MC:MC - [arbitrarily many] to [arbitrarily many] (corresponds to M:N in Chen)
Here C (“choice”/“can”) models 0 or 1, 1 means exactly 1, M means at least 1, and MC means arbitrarily many (0 or more).
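To make the symbols concrete, here is a minimal Python sketch that translates each Modified Chen symbol into a lower and upper bound. The names `BOUNDS` and `to_min_max` are hypothetical helpers made up for this illustration. The sketch only translates symbols into bounds; which side of a relationship each pair is written on can differ between notations, and that is not modeled here.

```python
# Hypothetical lookup table: each Modified Chen symbol as (min, max).
# None stands for "no upper bound".
BOUNDS = {
    "1": (1, 1),      # exactly 1
    "C": (0, 1),      # 0 or 1
    "M": (1, None),   # at least 1
    "MC": (0, None),  # arbitrarily many
}


def to_min_max(relationship):
    """Translate a string such as '1:MC' into a pair of '(min,max)' strings."""
    left, right = relationship.split(":")

    def fmt(symbol):
        low, high = BOUNDS[symbol]
        return f"({low},{'*' if high is None else high})"

    return fmt(left), fmt(right)


print(to_min_max("1:MC"))  # ('(1,1)', '(0,*)')
print(to_min_max("C:M"))   # ('(0,1)', '(1,*)')
```

An unknown symbol raises a `KeyError`, which is usually what you want while sketching a model: a typo in a relationship label shows up immediately.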
There are multiple ways to notate relationships, so the notation between entities can vary.
For example, you can use the (min, max) notation, which for the same database would look like this:
This is the SQL script that creates the tables for the entities above:
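As a rough idea of what such a script can look like, here is a minimal sketch run through Python’s built-in `sqlite3` module. The table and column names (`tournament`, `team`, `player` and their fields) are assumptions chosen for illustration, not the article’s actual schema.

```python
import sqlite3

# Hypothetical schema: these table and column names are assumptions for
# illustration, not the article's actual script.
SCHEMA = """
CREATE TABLE tournament (
    tournament_id INTEGER PRIMARY KEY,
    year          INTEGER NOT NULL UNIQUE,
    host_country  TEXT    NOT NULL
);

CREATE TABLE team (
    team_id INTEGER PRIMARY KEY,
    country TEXT NOT NULL UNIQUE
);

CREATE TABLE player (
    player_id INTEGER PRIMARY KEY,
    name      TEXT    NOT NULL,
    -- NOT NULL foreign key: every player belongs to exactly one team,
    -- while a team may have arbitrarily many players (1:MC).
    team_id   INTEGER NOT NULL REFERENCES team(team_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript(SCHEMA)

# List the tables that were created.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)  # ['player', 'team', 'tournament']
```

The `NOT NULL` on `player.team_id` is what turns the "exactly 1" side of the relationship into something the database enforces. Dropping it would allow a player without a team, which corresponds to C instead of 1.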
The next image shows how the entities (green) are connected through relational tables (red):
That is all for the visual representation of data. There are many more ways to do it, using different notations and relationship types.
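In SQL, a relational table of this kind is commonly written as a junction table holding one foreign key per connected entity. The sketch below assumes a hypothetical `participation` table linking hypothetical `team` and `tournament` tables (an MC:MC relationship). All names and rows are made up for illustration and are not taken from the article’s schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE team (
    team_id INTEGER PRIMARY KEY,
    country TEXT NOT NULL UNIQUE
);
CREATE TABLE tournament (
    tournament_id INTEGER PRIMARY KEY,
    year          INTEGER NOT NULL UNIQUE
);
-- Relational (junction) table: one row per team taking part in a tournament.
CREATE TABLE participation (
    team_id       INTEGER NOT NULL REFERENCES team(team_id),
    tournament_id INTEGER NOT NULL REFERENCES tournament(tournament_id),
    PRIMARY KEY (team_id, tournament_id)
);
""")

# Placeholder rows, made up purely to exercise the schema.
conn.executemany("INSERT INTO team VALUES (?, ?)",
                 [(1, "Team A"), (2, "Team B")])
conn.executemany("INSERT INTO tournament VALUES (?, ?)",
                 [(1, 2010), (2, 2014)])
conn.executemany("INSERT INTO participation VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 2)])

# Count how many tournaments each team appears in.
rows = conn.execute("""
    SELECT t.country, COUNT(*) AS appearances
    FROM participation AS p
    JOIN team AS t ON t.team_id = p.team_id
    GROUP BY t.team_id, t.country
    ORDER BY t.country
""").fetchall()
print(rows)  # [('Team A', 2), ('Team B', 1)]
```

The composite primary key stops the same team from being recorded twice for one tournament, while still letting a team appear in many tournaments and a tournament contain many teams.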
Last words of motivation
When starting out with databases, try to be very precise with your data and the relationships between its entities. Every step on the way to mastering data science has something special of its own, and it is important to start off right! Keep learning and programming!
Thanks for reading!
If you like this story, check my other stories on Medium!

Written by lazar.gugleta | Student. Tutor. Writer.
Published by HackerNoon on 2020/01/20