There are so many exciting projects out there in the Data World. Artificial Intelligence, Machine Learning, Neural Nets, Blockchain, and more are sweeping the technology industry. In order to get to the cutting-edge stuff, first and foremost, data needs to be stored, evaluated, and tested. The best place to do that is SQL (or a library that operates with SQL-like commands, see my article on Python’s Pandas library).
This series Data Mastery: SQL, will teach you the essential subjects. These are not exhaustive tutorials. Instead they are focused preparation guides — with brevity and efficiency in mind. It is meant for:
See my resource list of the books I used to prepare for my big interview
Each article will contain a brief technical explanation of the subject, an example question, and an answer. Follow up articles will contain challenging questions so you can evaluate your learning.
This series does not come with accompanying data sets. The advantage to this is when you are on the drawing board, whether in an interview or project design, you do not have test data to play with. You have to think abstract.
So far we have only covered queries that return data from the entire table. When we only need part of the table we need to employ the WHERE clause. There we will write conditions to filter out or focus on certain rows in the table. This clause will go after the FROM statement.
Let’s recall our daily_user_score table from earlier in the series:
date | userid | sessionid | score
------------------------------------------
2018–09–01 | 983489272 | 125 | 112
2018–09–01 | 234342423 | 34 | 112
2018–09–01 | 567584329 | 207 | 618
2018–09–02 | 983489272 | 126 | 410
2018–09–02 | 983489272 | 127 | 339
If we want to modify our query from the last chapter to only include a count of unique users and their average score from September 1st, 2018, it will look like this:
SELECT COUNT(DISTINCT userid) AS number_of_users,
AVG(score) AS average_score,
FROM daily_user_score
WHERE date = ‘2018–09–01’;
This query returns:
number_of_users | average_score
-------------------------------
3 | 280.66
You can include multiple conditions in the WHERE clause by using the keywords AND and OR followed by a condition.
Find the average score of userid 983489272 on September 2nd, 2018.
SELECT AVG(score) as average_score
FROM daily_user_score
WHERE date = ‘2018–09–02’
AND userid = 983489272;
This query returns:
average_score
-------------
374.5
Thanks for reading! If you have questions feel free to comment & I will try to get back to you.
Connect with me on Instagram @lauren__glass
Connect with me on LinkedIn
Check out my essentials list on Amazon
Search for me using my nametag on Instagram!