Pandas cheatsheet for SQL people (part 1) by@adilaliyev_25812

Pandas cheatsheet for SQL people (part 1)

June 6th 2018 14,464 reads
Read on Terminal Reader
react to story with heart
react to story with light
react to story with boat
react to story with money
image

Pandas library is the de-facto standard tool for data scientists, nowadays. It is used widely by many data scientists around the globe. After being familiar with it I always use it for processing table-structured data whatever project I am working on. It works fast and reliable, supports CSV, Excel, JSON and so on.

However, as a person experienced in SQL, I had some difficulties and confusion with manipulating the tables (a.k.a. DataFrames) in the beginning. Eventually, I learned more APIs and ways of doing the things properly. I believe many people who do his/her first steps on Pandas may have the same experience. Some time ago I prepared the cheatsheet using SQL queries and their analogy in Pandas. And I am happy to share it with all of you.

In this post, I am sharing the queries using β€œSELECT”. I will post other queries and their analogies to Pandas soon later. I would be happy to receive your feedback and wish-list regarding the cheatsheet.

Here we go.

We have three simple tables:

  • users
  • courses
  • attendance

And the rest of the queries will be about data manipulations on them.

Lets start from basic SELECT command.

SELECT * FROM users

SELECT * FROM users LIMIT 0,10

SELECT * FROM users WHERE email IS NULL

SELECT first_name, last_name FROM users

SELECT DISTINCT birth_year FROM users

Basic math and arithmetic operations.


SELECT AVG(points) FROM users;SELECT SUM(points) FROM users;

Conditional operations and LIKE.

SELECT * FROM users WHERE birth_year BETWEEN 1998 AND 2018

SELECT * FROM users WHERE first_name LIKE 'Ch%'

SELECT * FROM users WHERE first_name LIKE '%es'

SELECT * FROM users WHERE first_name LIKE '%on%'

SELECT first_name, last_name FROM users WHERE first_name LIKE '%on%'

I finalize this post with JOIN and ORDER BY related SQL queries. To make the code more simpler and easy to read I will assign the results into new variables and re-use them in next ones.


SELECT * FROM attendance atnLEFT JOIN users usr ON atn.user_id = usr.id

Now lets join the result above with course titles. The result it will be same as the result of following SQL command.



SELECT * FROM attendance atnLEFT JOIN users usr ON atn.user_id = usr.idLEFT JOIN courses co ON co.id = atn.course_id

The query above will return the full table/dataframe with all columns included. To select necessary columns we can use the same method as we did before.

SELECT * FROM users ORDER BY first_name, last_name

SELECT * FROM users ORDER BY first_name, last_name DESC





SELECT first_name, last_name, birth_year,points, course_name, instructor FROM attendance atnLEFT JOIN users usr ON atn.user_id = usr.idLEFT JOIN courses co ON co.id = atn.course_idORDER BY first_name, last_name

react to story with heart
react to story with light
react to story with boat
react to story with money
L O A D I N G
. . . comments & more!