All about SQL JOINs

Written by hackernoon-archives | Published 2018/12/22
Tech Story Tags: sql-joins | sql-server | data-science | sql | join

TLDRvia the TL;DR App

Joining tables is the first big learning curve after getting your head around SQL basics.

More often than not you won’t find everything you need in one table so will need to learn which join to use when.

If you’re familiar with the VLOOKUP function in Excel, you’ll be just fine.

T-SQL supports cross, inner and outer JOIN clauses, and UNION operators to combine datasets. While there are plenty of guides to JOINs that use venn diagrams to get the point across, I prefer something more visual.

INNER JOIN

Use an INNER join, or just a JOIN, when you want to find the match between two tables. You need to have a key on both tables that you join ON, and that’s where the match happens. Any results where there is not a match are discarded.

select o.order_item, i.inventory_item 
from orders o 
inner join inventory i 
on o.order_item = i.inventory_item

LEFT OUTER JOIN

Use a LEFT OUTER join when you want to find a match between two tables, but also show a NULL where there is no match from the right table. Like the INNER JOIN you need a key to join ON.

A RIGHT OUTER JOIN does the same but in reverse.

select o.order_item, i.inventory_item 
from orders o 
left join inventory i 
on o.order_item = i.inventory_item

CROSS JOIN

A CROSS JOIN joins everything with everything. There is no need to provide a key to join on and it can result in a very big data set, (and a really big image so you’ll have to use your imagination when reviewing the image below).

Proceed with caution.

select o.order_item, i.inventory_item 
from orders o 
cross join inventory i

UNION

While a JOIN combines rows of columns horizontally a UNION combines the results vertically. Using a UNION combines the result of two queries into one column and removes duplicates. If your query has multiple columns, they need to be in the same order to complete the UNION.

select * 
from orders 
union 
select * 
from inventory

UNION ALL

The UNION ALL combines the results of two queries the same as a UNION but keeps the duplicates in the result.

select * 
from orders 
union all 
select * 
from inventory

The best way to get your head around JOINs is to start using them. If you aren’t working with a SQL database already, check out SQLZoo or Hacker Rank to experiment with JOINs.

Let me know what other key concepts you think would be useful for complete beginners and junior data analysts.

Originally published at dev.to.


Published by HackerNoon on 2018/12/22