14 Best Tableau Datasets for Practicing Data Visualization

Written by datasets | Published 2023/03/13

TL;DR: Tableau is a data analysis and visualization tool that enables users to connect, visualize and share data in an easy-to-understand and meaningful way. This article focuses on the 14 best Tableau datasets for practicing data visualization, essential for helping you gain valuable experience.

Data visualization has become an essential part of the modern business landscape and Tableau is a powerful tool for creating impactful visualizations.

What is "Tableau"?

Tableau is a data analysis and visualization tool that enables users to connect, visualize and share data in an easy-to-understand and meaningful way. Its user interface is generally regarded as intuitive, relying largely on drag-and-drop functionality.

While alternative visualization tools exist, and we have compiled other dataset resources such as our Power BI datasets list, this article focuses on the 14 best Tableau datasets for practicing data visualization. Working with them will help you gain valuable experience in data preparation, analysis and visualization, as well as familiarity with Tableau's rich set of features.

Ultimate List of the Best Tableau Datasets for Practicing Data Visualization

  1. Superstore
  2. World Bank Development Indicators
  3. Airbnb Listings
  4. Flight Delays and Cancellations
  5. Titanic - Machine Learning from Disaster
  6. COVID-19
  7. Spotify Tracks DB
  8. 120 Years of Olympic History: Athletes and Results
  9. NBA Players
  10. The 2014 Inc. 5000
  11. Pokemon Index
  12. Tour de France Statistics
  13. US Home Sales
  14. Global Superstore

1. Superstore

The Sample Superstore Sales dataset provides sales data for a fictional retail company, including information on products, orders and customers.

This dataset includes the following variables:

  • Order ID - A unique identifier for each order.
  • Customer ID - A unique identifier for each customer.
  • Order Date - The date of the order placement.
  • Ship Date - The date the order was shipped.
  • Ship Mode - The shipping mode for the order (e.g. standard, same-day).
  • Segment - The customer segment (e.g. Consumer, Corporate, Home Office).
  • Region - The region where the customer is located (e.g. West, Central, East).
  • Category - The category of the product purchased (e.g. Furniture, Technology, Office Supplies).
  • Sub-Category - The sub-category of the product purchased (e.g. Chairs, Desktops, Paper).
  • Product Name - The name of the product purchased.
  • Sales - The sales revenue for the product purchased.
  • Quantity - The number of units of the product purchased.
  • Discount - The discount applied to the product purchased.
  • Profit - The profit generated by the product purchased.

The dataset can be downloaded from Tableau or Kaggle.
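As an illustration of how the Sales and Profit columns relate, the sketch below computes a profit margin per Category in Python, which can be handy for sanity-checking the numbers a Tableau chart shows. The sample rows and the function name `profit_margin_by_category` are invented for this example and are not taken from the actual dataset.

```python
from collections import defaultdict

# Hypothetical sample rows using the Category, Sales and Profit columns
# described above. The numbers are invented for illustration only.
SAMPLE_ROWS = [
    {"Category": "Furniture", "Sales": 200.0, "Profit": 20.0},
    {"Category": "Furniture", "Sales": 300.0, "Profit": 30.0},
    {"Category": "Technology", "Sales": 500.0, "Profit": 100.0},
    {"Category": "Office Supplies", "Sales": 100.0, "Profit": -5.0},
]


def profit_margin_by_category(rows):
    """Return {category: total profit / total sales} for the given rows."""
    sales = defaultdict(float)
    profit = defaultdict(float)
    for row in rows:
        sales[row["Category"]] += float(row["Sales"])
        profit[row["Category"]] += float(row["Profit"])
    # Skip categories with zero total sales to avoid division by zero.
    return {cat: profit[cat] / sales[cat] for cat in sales if sales[cat] != 0}


if __name__ == "__main__":
    for category, margin in profit_margin_by_category(SAMPLE_ROWS).items():
        print(f"{category}: {margin:.1%}")
```

To run this against the real file, the rows could be read with `csv.DictReader`, assuming the CSV headers match the column names listed above.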

2. World Bank Development Indicators

This dataset contains information on GDP, life expectancy, and literacy rates for various nations throughout the world. It also includes many economic and social variables.

Some of the variables included in this dataset:

  • Gross Domestic Product (GDP)

  • Inflation

  • Unemployment rate

  • Government debt

  • Trade balance

  • Life expectancy

  • Infant mortality rate

  • Access to electricity

  • Literacy rate

  • Mobile cellular subscriptions

Note: The variables included in the dataset depend on the year and the country being analyzed.

You can download the dataset directly from the World Bank website or from Kaggle.

3. Airbnb Listings

This dataset is a collection of data on Airbnb listings, including price, amenities, type of property, number of bedrooms and location in New York City. It is commonly used for exploratory data analysis and visualization, with a focus on the distribution of listings and prices across different locations and neighbourhoods.

Some of the variables included in the dataset:

  • Id - Airbnb's unique identifier for the listing.
  • Host Id - Airbnb's unique identifier for the host.
  • Host name - The name of the host.
  • Neighbourhood Group - The neighbourhood group (e.g. Manhattan, Brooklyn).
  • Host identity verification - Whether the host's identity is verified or unconfirmed.

The dataset can be accessed directly from the Airbnb website or on Tableau.

4. Flight Delays and Cancellations

This dataset comprises data on flight numbers, airlines, departure and arrival times, and the reasons for any delays or cancellations. With it, Tableau users can analyze the frequency of delays and cancellations by airline and build interactive dashboards that identify the most common causes of flight disruptions.

It consists of the following variables:

  • Flight Duration - The length of time from departure to arrival for the flight.

  • Delay Reason - The reason for any delay in the flight. Examples may include weather, mechanical issues, or air traffic control.

  • Delay Time - The amount of time by which the flight was delayed.

  • Cancellation Reason - The reason for cancellation of the flight.  Examples may include weather, mechanical issues, or insufficient passenger demand.

  • Date of Flight - The date on which the flight took place.

  • Flight Number - A unique identifier assigned to each flight by the airline.

  • Airline Name - The name of the airline operating the flight.

  • Departure Airport - The airport from which the flight is scheduled to depart.

  • Arrival Airport - The airport at which the flight is scheduled to arrive.

  • Scheduled Departure Time - The time at which the flight was scheduled to depart, as originally planned by the airline.

  • Actual Departure Time - The actual time at which the flight departed, if different from the scheduled departure time.

  • Scheduled Arrival Time - The time at which the flight was scheduled to arrive, as initially planned by the airline.

  • Actual Arrival Time - The actual time at which the flight arrived, if different from the scheduled arrival time.

The dataset can be accessed directly on Kaggle.
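The relationship between the scheduled and actual time fields and the Delay Time field can be sketched in a few lines of Python. This is only an illustration: the timestamp format string and the function name `delay_minutes` are assumptions, and the actual files may store times differently.

```python
from datetime import datetime


def delay_minutes(scheduled, actual, fmt="%Y-%m-%d %H:%M"):
    """Minutes between a scheduled and an actual time.

    Both arguments are strings in the (assumed) format `fmt`.
    A negative result means the flight left or arrived early.
    """
    delta = datetime.strptime(actual, fmt) - datetime.strptime(scheduled, fmt)
    return delta.total_seconds() / 60


if __name__ == "__main__":
    # Invented example values, not taken from the dataset.
    print(delay_minutes("2015-01-01 09:00", "2015-01-01 09:45"))
```

Because full timestamps are compared, a flight scheduled just before midnight that departs just after it is still handled correctly.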

5. Titanic - Machine Learning from Disaster

This popular open dataset offers information on the passengers aboard the Titanic, which sank on April 15, 1912.

Some of the variables included in the dataset:

  • PassengerId - A unique identifier for each passenger.

  • Survived - Whether the passenger survived (0 = No, 1 = Yes).

  • Pclass - A passenger's class (1 = 1st, 2 = 2nd, 3 = 3rd).

  • Name - A passenger's name.

  • Sex - A passenger's gender.

  • Age - A passenger's age.

  • SibSp - The number of siblings/spouses aboard.

  • Parch - The number of parents/children aboard.

  • Ticket - The ticket number.

  • Fare - The fare paid for the ticket.

  • Cabin - The cabin number.

  • Embarked - The port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton).

You can download the dataset on Kaggle or Tableau.
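A common first chart with this dataset is survival rate by passenger class. The sketch below shows the underlying aggregation in Python using the Survived and Pclass encodings listed above. The six sample rows are made up for illustration and the function name is hypothetical.

```python
from collections import Counter

# Invented sample rows that follow the encodings described above:
# Survived is 0 or 1, Pclass is 1, 2 or 3.
SAMPLE = [
    {"Pclass": 1, "Survived": 1},
    {"Pclass": 1, "Survived": 0},
    {"Pclass": 2, "Survived": 1},
    {"Pclass": 3, "Survived": 0},
    {"Pclass": 3, "Survived": 0},
    {"Pclass": 3, "Survived": 1},
]


def survival_rate_by_class(rows):
    """Return {pclass: share of passengers in that class who survived}."""
    total = Counter()
    survived = Counter()
    for row in rows:
        pclass = int(row["Pclass"])
        total[pclass] += 1
        survived[pclass] += int(row["Survived"])
    return {pclass: survived[pclass] / total[pclass] for pclass in sorted(total)}


if __name__ == "__main__":
    for pclass, rate in survival_rate_by_class(SAMPLE).items():
        print(f"Class {pclass}: {rate:.0%}")
```

In Tableau the same figure can be obtained by averaging the Survived field, since the mean of a 0/1 column is the proportion of ones.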

6. COVID-19

The COVID-19 dataset is a collection of data related to the COVID-19 pandemic, curated and made available for analysis using Tableau.

This dataset includes a wide range of information, such as the number of confirmed cases and deaths, testing data, hospitalizations and vaccinations, for countries and regions all over the world. It is useful for creating visualizations and dashboards that track the virus's spread and its impact on populations.

Some of the variables included in the dataset:

  • Date: The date of the observation.
  • Country/Region: The name of the country or region currently observed.
  • Province/State: The name of the province or state within the country or region currently observed.
  • Latitude: The latitude of the location currently observed.
  • Longitude: The longitude of the location currently observed.
  • Confirmed cases: The total number of confirmed cases of COVID-19 in the location currently observed.
  • Deaths: The total number of deaths due to COVID-19 in the location currently observed.
  • Recovered: The total number of recovered cases of COVID-19 in the location currently observed.
  • Active cases: The total number of active cases of COVID-19 in the location currently observed.
  • Incidence rate: The number of confirmed cases per 100,000 population.

The dataset can be downloaded on Kaggle or the European Centre for Disease Prevention and Control (ECDC) website.
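The incidence rate described above is a simple ratio, sketched here in Python. Two things are assumptions rather than facts about this dataset: a population figure must come from elsewhere, since it is not among the listed variables, and active cases are computed as confirmed minus deaths minus recovered, which is a commonly used definition that this particular dataset may not follow exactly.

```python
def incidence_rate(confirmed, population, per=100_000):
    """Confirmed cases per `per` people (100,000 by default)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return confirmed / population * per


def active_cases(confirmed, deaths, recovered):
    """Assumed definition: confirmed minus deaths minus recovered."""
    return confirmed - deaths - recovered


if __name__ == "__main__":
    # Invented figures for illustration only.
    print(incidence_rate(confirmed=500, population=1_000_000))
    print(active_cases(confirmed=1000, deaths=50, recovered=700))
```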

7. Spotify Tracks DB

This dataset contains information on songs, artists and playlists from the music streaming platform, Spotify. It can be used to explore patterns in popular artists, music consumption, genres and playlists.

The Spotify Tracks DB dataset can be used to create Tableau visualizations that help users understand how people consume and interact with music on the Spotify platform.

You can download this dataset from Kaggle or request a copy of your own data from Spotify.

8. 120 Years of Olympic History: Athletes and Results

This historic dataset is a collection of data that provides information about the modern Olympic Games, which started in 1896.

It usually contains the following information:

  • Athletes: The names, nationalities, ages, heights, weights, and other personal details about the athletes who have participated in the Olympic Games.
  • Countries: The names of the countries that have participated in the Olympic Games, along with their national flags, codes, and other related information.
  • Events: The details about the various sports and events held in the Olympic Games, including the date, location, and number of participants.
  • Medals: The details about the medals awarded to the athletes who have won in the Olympic Games, including the type of medal (gold, silver, or bronze), the event they won in, and the country they represent.

The dataset covers the Games from Athens 1896 to Rio 2016 and can be downloaded from Kaggle.

9. NBA Players

The NBA Players dataset is a collection of data related to the National Basketball Association (NBA), which is a professional basketball league in North America. It consists of various information and statistics on NBA teams, players, games and seasons, including:

  • Team and player performance metrics such as points, rebounds, assists, steals, and blocks.
  • Game-specific data such as scores, win-loss records, and shooting percentages.
  • Seasonal data such as team standings, playoff brackets, and awards.

You can download this dataset from Kaggle.

10. The 2014 Inc. 5000

The 2014 Inc. 5000 dataset is a list of the 5,000 fastest-growing private companies in the United States. Inc. magazine publishes this list every year, and it includes companies from a wide range of industries and sectors. The rankings are based on the percentage revenue growth of the companies over three years.

Some of the variables included in the dataset:

  • rank - The rank of the company on the Inc. 5000 list.
  • url - The URL of the company's website.
  • company - The name of the company.
  • founded - The year the company was founded.
  • industry - The industry category of the company.
  • revenue - The company's revenue in millions of US dollars.
  • employees - The number of employees at the company.
  • state - The state where the company is headquartered.
  • city - The city where the company is headquartered.

11. Pokemon Index

The Pokemon Index dataset is a collection of information about the different species of Pokemon. It includes data such as the name, type, abilities, stats, and moves of each Pokemon. The dataset is often used by researchers, developers, and enthusiasts to study and analyze various aspects of the Pokemon franchise, such as game mechanics, strategy, and popularity.

Note: There are several versions of this dataset available, including ones that cover different regions or generations of the Pokemon games, as well as ones that include additional data such as sprite images or evolutionary trees.

12. Tour de France Statistics

The Tour de France Statistics dataset is a collection of historical data related to the Tour de France, which is an annual multiple-stage bicycle race primarily held in France. The dataset includes information about the race's stages, routes, riders, teams, classifications and results for each year of the Tour de France from its inception in 1903 to the present day.

Some of the variables included in this dataset:

  • Year: The year in which the Tour de France race took place.
  • Date: The date on which the stage was raced.
  • Start city: The city where the stage started.
  • Finish city: The city where the stage ended.
  • Total distance: The distance covered by the riders in the stage, usually measured in kilometres.
  • Winner: The name of the rider who won the stage.

13. US Home Sales

The US Home Sales, 1963-2016 dataset is a collection of data on the sales of new single-family homes in the United States, from 1963 to 2016. The data includes information such as the month and year of the sale, the number of homes sold, the median and average sales prices, and the seasonally adjusted annual sales rate.

14. Global Superstore

The Global Superstore dataset simulates the retail sales operations of a company with stores in multiple countries. It includes information about customers, orders and products, offering a large and diverse set of data for analyzing customer behaviour, product performance and sales patterns.

It includes the following variables:

  • Order ID - A unique identifier for each order.
  • Order Date - The date and time the order was placed.
  • Ship Date - The date and time the order was shipped.
  • Ship Mode - The method used to ship the order (e.g. standard, express).
  • Customer ID - A unique identifier for each customer.
  • Customer Name - The full name of the customer.
  • Segment - The customer segment such as Home Office or Corporate.
  • Country - The country where the customer resides.
  • City - The city where the customer resides.
  • State - The state where the customer resides.
  • Postal Code - The postal code of the customer's residence.
  • Region - The geographic region where the customer resides.
  • Product ID - A unique identifier for each product.
  • Category - The broad product category, such as Furniture, Office Supplies, or Technology.
  • Sub-Category - The specific product sub-category, such as Chairs, Paper, or Phones.
  • Product Name - The name of the product.
  • Sales - The total sales revenue for the product.
  • Quantity - The number of units of the product sold.
  • Discount - The discount applied to the product.
  • Profit - The total profit earned from the product.

Common Use Cases for Tableau Datasets

Superstore - This dataset can be used for analyzing sales and inventory data of a retail store, identifying popular products, and forecasting demand for products in the future.

World Bank Development Indicators - This dataset can be used for analyzing the trends in economic growth, poverty reduction, health, education, and other development issues.

Airbnb Listings - This dataset can be used for analyzing the popularity of different neighbourhoods, predicting prices, and understanding user preferences.

Flight Delays and Cancellations - This dataset can be used for identifying patterns in flight delays, predicting delays and cancellations, and improving airline operations.

Titanic - Machine Learning from Disaster - This dataset can be used for developing machine learning models to predict survival and understand the factors that influenced it.

COVID-19 - This dataset can be used for tracking the pandemic, analyzing the effectiveness of public health interventions, and forecasting future trends.

Spotify Tracks DB - This dataset can be used for analyzing musical trends, predicting popular songs, and developing recommendation systems.

120 Years of Olympic History: Athletes and Results - This dataset can be used for analyzing performance trends, identifying successful athletes and countries, and predicting future medal counts.

NBA Players - This dataset can be used for analyzing player performance, predicting outcomes of games, and understanding the economics of the NBA.

The 2014 Inc. 5000 - This dataset can be used for analyzing business trends, identifying successful companies and industries, and predicting future growth.

Pokemon Index - This dataset can be used for analyzing the popularity of different Pokemon, predicting the outcomes of battles, and developing recommendation systems.

Tour de France Statistics - This dataset can be used for analyzing performance trends, predicting outcomes of races, and understanding the economics of cycling.

US Home Sales - This dataset can be used for analyzing housing trends, predicting future prices, and understanding the real estate market.

Global Superstore - This dataset can be used for analyzing sales trends, identifying popular products and markets, and forecasting demand for products in the future.

Final Thoughts

Tableau is a valuable tool for anyone who needs to visualize and analyze data, from business analysts to data scientists.

The datasets and common use cases above will help you better understand the role Tableau plays in helping organizations make smarter, real-time decisions.

The datasets are also available for anyone to download and use freely.


More Dataset Listicles:

  1. Hugging Datasets
  2. PyTorch Datasets
  3. Power BI Datasets


Written by datasets | A library of open datasets for data analytics/machine learning compiled by HackerNoon.
Published by HackerNoon on 2023/03/13