Why you should use SQL CTEs

Written by hackernoon-archives | Published 2018/12/28
Tech Story Tags: sql | data-science | sql-server | cte | sql-cte

TLDRvia the TL;DR App

CTEs (common table expressions) are persisted temporary data sets, that allow you to store a single query to go back to later in your script. They’re underrated compared to the subquery, that seems to be what most analysts around me use. Here’s why I prefer to use CTEs when building SQL queries.

How do they work?

In the example below I’ve used a CTE to:

  • grab all the data from the Email Delivered table, told SQL to hold that in memory,
  • then grabbed what I need from the Unsubscribe table, told SQL to hold that in memory too,
  • then joined them together with flags from the Customers table

Where you see a WITH, is the CTE starting and then I’m naming them ‘delivered’ and ‘unsubs’ before starting to tell the CTE what I want to return:

When I do the final ‘join everything together’ part I’m joining fields from the ‘delivered’ dataset such as ‘delivered.email’.

Why not just use a Subquery?

Here is an example of a Subquery. I don’t use them often because my brain doesn’t work that way. I would rather get all my datasets separately then join them all together.

The way I get my head around reading it is thinking about it from the inside out. It’s nesting everything you need together, but in my opinion, it tends to get ugly really quickly.

  • The first step is to run the query in the centre starting ‘SELECT AccountID …’ to get all orders greater than 30 from the OrderHistory table.
  • Then JOIN on the Account table to look up which Accounts were from New Zealand.
  • Then the top SELECT runs to return all the fields from the ord dataset and the three columns I want to see from the Account table.

Why are they so great?

You can use them multiple times throughout your script and they are readable, you can return what you need then reference it later.

Why not just create a table?

If you don’t have write permissions this may not be possible and if it’s only used for this query your DBA might not be thrilled with you creating one-off tables.

Are there any negatives?

CTEs don’t last forever and can only be used in the query you’re currently in, unlike temp tables or views that can survive outside the current script.

What about performance?

SQL server will always decide for you, via the query planner, the best way to execute your query. If you ask your friendly DBA which strategy to use, they will tell you ‘it depends’ because it does. The CTE is all about readability, so if it works for you give it a try.

Which do you prefer? The CTE, subquery or just creating a table?

Originally published at dev.to.


Published by HackerNoon on 2018/12/28