Joining tables is the first big learning curve after getting your head around SQL basics.
More often than not you won’t find everything you need in one table so will need to learn which join to use when.
If you’re familiar with the VLOOKUP function in Excel, you’ll be just fine.
T-SQL supports cross, inner and outer JOIN clauses, and UNION operators to combine datasets. While there are plenty of guides to JOINs that use venn diagrams to get the point across, I prefer something more visual.
Use an INNER join, or just a JOIN, when you want to find the match between two tables. You need to have a key on both tables that you join ON, and that’s where the match happens. Any results where there is not a match are discarded.
select o.order_item, i.inventory_item
from orders o
inner join inventory i
on o.order_item = i.inventory_item
Use a LEFT OUTER join when you want to find a match between two tables, but also show a NULL where there is no match from the right table. Like the INNER JOIN you need a key to join ON.
A RIGHT OUTER JOIN does the same but in reverse.
select o.order_item, i.inventory_item
from orders o
left join inventory i
on o.order_item = i.inventory_item
A CROSS JOIN joins everything with everything. There is no need to provide a key to join on and it can result in a very big data set, (and a really big image so you’ll have to use your imagination when reviewing the image below).
Proceed with caution.
select o.order_item, i.inventory_item
from orders o
cross join inventory i
While a JOIN combines rows of columns horizontally a UNION combines the results vertically. Using a UNION combines the result of two queries into one column and removes duplicates. If your query has multiple columns, they need to be in the same order to complete the UNION.
select *
from orders
union
select *
from inventory
The UNION ALL combines the results of two queries the same as a UNION but keeps the duplicates in the result.
select *
from orders
union all
select *
from inventory
The best way to get your head around JOINs is to start using them. If you aren’t working with a SQL database already, check out SQLZoo or Hacker Rank to experiment with JOINs.
Let me know what other key concepts you think would be useful for complete beginners and junior data analysts.
Originally published at dev.to.