Analyzing Montreal’s BIXI Ridership With Data And Visuals

Written by mindtradesconsulting | Published 2021/07/14


If you live or work in Montreal, you've probably passed by one of the 700 Bixi stations. And you've undoubtedly come across "Bixists," i.e., cyclists riding one of the grey bikes (or perhaps a blue bike). That's Bixi. It is a hybrid between "BIcycle" and "TaXI" to underline the concept of using a bicycle just like a taxi.
As Bixi grows in popularity, this is an excellent opportunity to analyze various aspects of Bixi data, especially in Montreal. The goal is to find results that can help improve the quality of the service: helping customers find their closest Bixi station, identifying the most and least crowded times to pick up a Bixi, and supporting any other service that benefits the customer. For this particular case study, we will use open datasets.

Datasets

The datasets we will be using in this analysis are:
  • Ridership data (OD.csv): over four million records. Each record includes the start and end station codes, the start and end date and time, the duration in seconds, and membership details.
  • Station details (Stations.csv): each station's code, name, address, coordinates, and active status.
  • Weather data (noaa-daily-weather-data.csv): daily weather information throughout the year, including date, precipitation, snow, maximum and minimum temperature, elevation, coordinates, and country code.
  • Holidays: the list of holidays and their corresponding dates for each year, so we can see what the traffic is like on those days.
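As a rough illustration of the ridership records described above, here is a minimal Python sketch that parses trips from CSV text using only the standard library. The column names (start_date, start_station_code, end_date, end_station_code, duration_sec, is_member), the timestamp format, and the two sample rows are assumptions made for this example; they are not taken from the actual files.

```python
import csv
import io

# Made-up sample in an ASSUMED OD.csv layout; the real header may differ.
SAMPLE_OD_CSV = """\
start_date,start_station_code,end_date,end_station_code,duration_sec,is_member
2017-04-15 00:00,1001,2017-04-15 00:31,1002,1841,1
2017-04-15 00:01,1002,2017-04-15 00:10,1001,553,0
"""

def load_trips(csv_file):
    """Read trips from an open CSV file and convert fields to useful types."""
    trips = []
    for row in csv.DictReader(csv_file):
        trips.append({
            "start_date": row["start_date"],
            "start_station_code": int(row["start_station_code"]),
            "end_date": row["end_date"],
            "end_station_code": int(row["end_station_code"]),
            "duration_sec": int(row["duration_sec"]),
            "is_member": row["is_member"] == "1",
        })
    return trips

trips = load_trips(io.StringIO(SAMPLE_OD_CSV))
```

With a real file, the same function could be called as `load_trips(open("OD.csv", newline=""))`, provided the header matches the assumed names.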

Data Process

After procuring all the data, we will follow these steps to complete our analysis. Here's what we will be finding out: 
  1. Find the most popular and least popular bike stations.
  2. Find the most and least popular days of the week in terms of ridership.
  3. Find the most and least popular times of the day in terms of ridership.
  4. Find the most and least popular ridership by weather conditions.
  5. Which stations exhaust their bike capacity the fastest and when? Which stations have the most unused bikes?
  6. Are there any holidays apart from the weekends when bike ridership is popular?
  7. Does it make a difference if you are a member or not?

Most and Least Popular Stations

The first task is to find out the most and least popular bike stations. Popularity is measured by the number of bikes taken and returned at a station. For simplicity, we will find four stations:
  1. Most Popular Station to Start - marked in green
  2. Least Popular Station to Start - marked in red
  3. Most Popular Station to End - marked in blue
  4. Least Popular Station to End - marked in yellow
To complete this task, we will take all the data for 2016 and 2017. Next, we will sort stations according to their popularity. We calculate popularity with a simple rule: the more times a station has a bike rented or returned, the more popular that station is. Following this step, we will find the most and least popular stations as a starting point and as an ending point. They are listed below, along with their name and code.
For 2016
  • Most Popular Station to Start Mackay /de Maisonneuve (Sud) 6100   
  • Least Popular Station to Start Place Longueuil 5003  
  • Most Popular Station to End de Maisonneuve / de Bleury 6078  
  • Least Popular Station to End Place Longueuil 5003  
For 2017
  • Most Popular Station to Start Mackay /de Maisonneuve (Sud) 6100  
  • Least Popular Station to Start Place Longueuil 5003  
  • Most Popular Station to End Berri / de Maisonneuve 6015  
  • Least Popular Station to End Place Longueuil 5003
Note: As the least popular station to start and to end was the same, a single point represents it on the map.
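The counting rule described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names and the sample trips are made up, and the article does not say which tool produced the results above.

```python
from collections import Counter

# Made-up sample trips with hypothetical station codes; not real BIXI records.
# The field names are assumptions about the ridership file layout.
trips = [
    {"start_station_code": 1, "end_station_code": 2},
    {"start_station_code": 1, "end_station_code": 3},
    {"start_station_code": 2, "end_station_code": 2},
    {"start_station_code": 4, "end_station_code": 1},
    {"start_station_code": 2, "end_station_code": 1},
    {"start_station_code": 1, "end_station_code": 2},
]

def rank_stations(trips, field):
    """Return (most_popular, least_popular) station codes for one field."""
    counts = Counter(trip[field] for trip in trips)
    ranked = counts.most_common()  # ordered from most to least frequent
    return ranked[0][0], ranked[-1][0]

most_start, least_start = rank_stations(trips, "start_station_code")
most_end, least_end = rank_stations(trips, "end_station_code")
```

Note that a station with no trips at all never appears in the counts, so a complete ranking would also need the station list from Stations.csv.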

Most and Least Popular Days

Now that we have the most and least popular stations in Montreal, we will next determine the most and least popular days of the week in terms of ridership. For this analysis, we will use the ridership data mentioned above (OD.csv) for 2016 and 2017.
To do this, we check every entry, that is, every date on which a Bixi was rented. We then map each date to its day of the week and count the rentals made on that day. As a result, we know the most popular day and the number of bikes rented on it.
For 2016
  • Wednesday 632176 Rents
  • Tuesday 627808 Rents
  • Friday 613929 Rents
  • Thursday 607976 Rents
  • Monday 560265 Rents
  • Saturday 503056 Rents
  • Sunday 454870 Rents
For 2017
  • Wednesday 636580 Rents
  • Thursday 635897 Rents
  • Tuesday 595485 Rents
  • Saturday 570472 Rents
  • Friday 553359 Rents
  • Sunday 528526 Rents
  • Monday 498403 Rents
What do these numbers show? The most rentals are made on Wednesdays, with a whopping 632176 in 2016 and 636580 in 2017. Surprisingly, Sundays see the fewest rentals in 2016. In 2017, however, Mondays see the fewest rentals, which makes sense as Monday is the beginning of the workweek.
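The day-of-week tally described above could look roughly like the following Python sketch. The timestamp format and the sample values are assumptions made for illustration, not the real data.

```python
from collections import Counter
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Made-up start timestamps; the "YYYY-MM-DD HH:MM" format is an assumption.
start_dates = [
    "2017-06-05 08:15",  # Monday
    "2017-06-07 17:30",  # Wednesday
    "2017-06-07 18:05",  # Wednesday
    "2017-06-10 11:00",  # Saturday
]

def rentals_per_weekday(start_dates):
    """Count rentals per day of the week, most popular day first."""
    counts = Counter(
        WEEKDAYS[datetime.strptime(s, "%Y-%m-%d %H:%M").weekday()]
        for s in start_dates
    )
    return counts.most_common()

by_day = rentals_per_weekday(start_dates)
```

Here `by_day[0]` holds the most popular day and its rental count, and `by_day[-1]` the least popular one among the days that appear in the data.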

Most and Least Popular Times of Day

We now know the most and least popular stations and the days of the week for ridership. The next step is to find the most and least popular times of day when a bike is rented.
For this analysis, we will again use the ridership dataset (OD.csv), whose timestamps let us group rentals into the 24 hours of the day across the whole year. The aim is to find the hour with the most rents and the hour with the fewest rents for both 2016 and 2017.
For 2016
  • Most Rents at 17:00 - 17:59 (425975 Rents)
  • Least Rents at 04:00 - 04:59 (11340 Rents)
For 2017
  • Most Rents at 17:00 - 17:59 (426821 Rents)
  • Least Rents at 04:00 - 04:59 (11478 Rents)
The analysis makes it clear that most rents were made between 5:00 PM and 5:59 PM throughout the year, which is the perfect time to go for a ride. The fewest rents were made between 4:00 AM and 4:59 AM, which is not surprising. This analysis clearly shows that the evenings are when the stations are most crowded.
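A minimal Python sketch of the hourly grouping follows. As before, the timestamp format and the sample values are assumptions for illustration only.

```python
from collections import Counter
from datetime import datetime

# Made-up start timestamps; the "YYYY-MM-DD HH:MM" format is an assumption.
start_dates = [
    "2016-07-14 17:05",
    "2016-07-14 17:48",
    "2016-07-15 04:10",
    "2016-07-15 08:30",
    "2016-07-15 08:45",
    "2016-07-16 17:20",
]

def rentals_per_hour(start_dates):
    """Count rentals in each hour of the day (0-23) across all dates."""
    return Counter(
        datetime.strptime(s, "%Y-%m-%d %H:%M").hour for s in start_dates
    )

def hour_label(hour):
    """Format an hour bucket the way the results are listed, e.g. 17:00 - 17:59."""
    return f"{hour:02d}:00 - {hour:02d}:59"

per_hour = rentals_per_hour(start_dates)
busiest_hour, busiest_count = per_hour.most_common(1)[0]
quietest_hour = min(per_hour, key=per_hour.get)
```

Hours with no rentals at all do not appear in the counter, so on sparse data the quietest hour should be checked against all 24 buckets.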

Weather Impact on Ridership

Beyond the factors above, weather is one of the primary parameters that can affect ridership. For this analysis, we will use two datasets: one with weather details (noaa-daily-weather-data.csv) and the other with ridership details (OD.csv) for the respective years. We will compare them to find any visible differences in ridership as the weather changes.
We first count the rides taken on a particular day and their duration, then check the weather conditions for that day and plot both on a graph to view any visible changes. We display the results below:
2016 and 2017:
We have four parameters shown in the graphs (for 2016 and 2017) - ridership popularity, precipitation analysis, temperature analysis, and snow analysis. When you compare all the parameters, the results are as follows:
  • Ridership is more popular at higher temperatures: think sunny spring days.
  • Certain days show the effects of precipitation, but if you love taking a ride while it showers, this time is perfect for you.
  • The number of rides drops when it snows.
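To make the comparison concrete, here is a small Python sketch that joins daily ride counts with daily weather by date and compares days with and without snow. The field names (tmax, prcp, snow) and every value are made up for illustration; the actual columns in the weather file may be named differently.

```python
from collections import Counter
from statistics import mean

# Made-up trip dates and weather rows, for illustration only.
trip_dates = [
    "2017-04-01", "2017-04-01",
    "2017-07-01", "2017-07-01", "2017-07-01",
    "2017-11-20",
]
weather = {
    "2017-04-01": {"tmax": 8.0, "prcp": 2.5, "snow": 0.0},
    "2017-07-01": {"tmax": 26.0, "prcp": 0.0, "snow": 0.0},
    "2017-11-20": {"tmax": -1.0, "prcp": 1.0, "snow": 3.0},
}

rides_per_day = Counter(trip_dates)

def mean_rides(days):
    """Average number of rides over the given dates (0.0 if there are none)."""
    days = list(days)
    if not days:
        return 0.0
    return mean(rides_per_day.get(day, 0) for day in days)

snow_days = [day for day, w in weather.items() if w["snow"] > 0]
clear_days = [day for day, w in weather.items() if w["snow"] == 0]

avg_snow = mean_rides(snow_days)
avg_clear = mean_rides(clear_days)
```

The same grouping works for temperature or precipitation by changing the condition used to split the days.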

Stations that Exhaust Fastest and Slowest

We don't want a day when you head to the station and don't find a ride. One would assume that the most popular stations exhaust their bikes the fastest and the least popular ones the slowest. While this is true in most cases, the popularity analysis is annual, so the results for individual stations may differ.
To solve this puzzle, we will first check the number of rentals and returns in a day at a particular station, then compare them with the station's maximum capacity. If the rentals exceed the maximum capacity, it is safe to say that the station exhausts its bikes the fastest. If it is the other way round, then the station exhausts its bikes the slowest.
Using the above data, the analysis leads us to the following:
For 2016
  • du Mont-Royal / Vincent-d'Indy 6306 Exhaust Rate 2.46
  • Villa Maria (Décarie / Monkland) 6101 Exhaust Rate 2.40
  • Louis-Colin / McKenna 6928 Exhaust Rate 2.17
  • Parc Jean-Brillant (Swail / Decelles) 6316 Exhaust Rate 2.12
  • de Darlington 6310 Exhaust Rate 1.98
  • Notre-Dame / Peel 6085 Exhaust Rate 0.58
  • St-Denis / de Maisonneuve 6014 Exhaust Rate 0.53
  • Viger/Jeanne Mance 6035 Exhaust Rate 0.52
  • Place-d'Armes (Viger / St-Urbain) 6032 Exhaust Rate 0.51
  • Square Victoria 6043 Exhaust Rate 0.44
For 2017
  • Villa Maria (Décarie / de Monkland) 6101 Exhaust Rate 2.54
  • Jean-Brillant / McKenna 6928 Exhaust Rate 2.43
  • du Mont-Royal / Vincent-d'Indy 6306 Exhaust Rate 2.31
  • de Darlington 6310 Exhaust Rate 1.97
  • Gatineau / Swail 6316 Exhaust Rate 1.97
  • Lucien L'Allier / St-Jacques 6096 Exhaust Rate 0.58
  • St-Urbain 6034 Exhaust Rate 0.55
  • Notre-Dame / Peel 6085 Exhaust Rate 0.55
  • Place-d'Armes (Viger / St-Urbain) 6032 Exhaust Rate 0.48
  • Square Victoria 6043 Exhaust Rate 0.45
If a station's exhaust rate is 2, the number of rents is twice the number of returns, which means that more bikes are taken than returned, and that station is likely to exhaust its bikes fast. If a station's exhaust rate is 0.5, the number of returns is twice the number of rents, which means that more bikes are returned than rented, and that station will tend to fill up to its capacity and is unlikely to run out of bikes.
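Reading the exhaust rate as rents divided by returns, as the interpretation above suggests, a minimal Python sketch could look like this. The station codes "A" and "B", the field names, and the trips are hypothetical, and the sketch leaves out the comparison with station capacity.

```python
from collections import Counter

# Made-up trips between two hypothetical stations "A" and "B".
trips = (
    [{"start_station_code": "A", "end_station_code": "B"}] * 4
    + [{"start_station_code": "B", "end_station_code": "A"}] * 2
)

def exhaust_rates(trips):
    """Return rents / returns per station, skipping stations with no returns."""
    rents = Counter(t["start_station_code"] for t in trips)
    returns = Counter(t["end_station_code"] for t in trips)
    return {
        code: rents[code] / returns[code]
        for code in rents
        if returns[code] > 0  # avoid dividing by zero
    }

rates = exhaust_rates(trips)
fastest = max(rates, key=rates.get)
slowest = min(rates, key=rates.get)
```

In this sample, station "A" has four rents and two returns, giving a rate of 2.0, while station "B" has the reverse and a rate of 0.5.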

Holidays Other than Weekends with Popular Ridership

For this particular analysis, we will be using the holiday data from 2016 and 2017, restricted to the holidays mentioned here. First, we check whether any holidays fall on weekends and remove them. Second, we take the dates of the remaining holidays, match them against the ridership data for the respective years, and find the overall ridership on each of those days.
Once these steps are completed, we can see which holidays have the most ridership and which have the least. We will look at holidays such as Victoria Day, St. Jean Baptiste Day, Canada Day, Labour Day, and Thanksgiving Day. Here are the corresponding rental counts.
For 2016
  • St. Jean Baptiste Day 20095 Rents
  • Victoria Day 19109 Rents
  • Labour Day 18534 Rents
  • Canada Day 17083 Rents
  • Thanksgiving Day 9640 Rents
For 2017
  • St. Jean Baptiste Day 25096 Rents
  • Canada Day 16162 Rents
  • Labour Day 14802 Rents
  • Easter Monday 8036 Rents
  • Victoria Day 7961 Rents
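The two steps described in this section, dropping holidays that fall on weekends and counting rentals on the remaining dates, can be sketched in Python as follows. The holiday names, dates, and trips are hypothetical placeholders, not the real holiday file.

```python
from collections import Counter
from datetime import date

# Hypothetical holidays and made-up trip dates, for illustration only.
holidays = {
    "Holiday A": date(2017, 7, 1),   # a Saturday, so it is filtered out
    "Holiday B": date(2017, 9, 4),   # a Monday
    "Holiday C": date(2017, 5, 22),  # a Monday
}
trip_dates = [
    date(2017, 7, 1), date(2017, 7, 1),
    date(2017, 9, 4), date(2017, 9, 4), date(2017, 9, 4),
    date(2017, 5, 22),
]

def weekday_holiday_rentals(holidays, trip_dates):
    """Rank holidays that fall on weekdays by the number of rentals that day."""
    rides_per_day = Counter(trip_dates)
    ranking = [
        (name, rides_per_day[day])
        for name, day in holidays.items()
        if day.weekday() < 5  # Monday=0 ... Friday=4
    ]
    return sorted(ranking, key=lambda item: item[1], reverse=True)

ranking = weekday_holiday_rentals(holidays, trip_dates)
```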

Members and Non-Members in Ridership

Does being a member or a non-member have any influence on the analysis thus far? Let's find out. We again use the ridership data mentioned in the Datasets section. We separate the members' and non-members' records and compare the number of rents for each group. The results are as follows for both years.
83% members
17% non-members
In conclusion, members account for 83% of rentals, so they played an essential part in renting the bikes and shaped the overall Bixi traffic analysis.
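The member share is a simple proportion. The Python sketch below assumes each trip carries a membership flag of 1 for members and 0 for non-members; the flag encoding and the sample values are assumptions for illustration.

```python
def member_share(flags):
    """Return the percentage of trips made by members (flag == 1)."""
    flags = list(flags)
    if not flags:
        raise ValueError("no trips to analyze")
    members = sum(1 for flag in flags if flag == 1)
    return 100 * members / len(flags)

# Made-up flags: three member trips and one non-member trip.
sample_flags = [1, 1, 1, 0]
share = member_share(sample_flags)
non_member_share = 100 - share
```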

Top 4 interesting observations

  1. Surprisingly, the most popular time for bike rentals was not in the morning but in the evening. We expected the morning rush hour to be the busiest, and our hypothesis was incorrect.
  2. We expected that Mondays might be the most popular day, with people struggling to get into the office on time after the weekend. Our hypothesis was wrong, as Wednesdays were the most popular days of the week.
  3. We expected to see significant potential for usage on statutory holidays and weekends. Our hypothesis was true. As most of the use is member-driven (83%), we think there is a significant opportunity to encourage BIXI usage for tourists and one-off usage by folks coming into the city for weekend activities, especially during the summer.
  4. We did analyze stations based on how quickly they exhaust bikes. We cannot tell whether this results in missed ridership opportunities, whether there is a response mechanism to move bikes back to a transit stop that sees riders but no bikes, or whether such a mechanism responds to unmet demand.
If you are from Montreal and have additional local insight, we look forward to hearing from you in the comments below.

Code:

How can MindTrades help?
This case study is only a starting point for this kind of in-depth analysis, insight, and solutions. For more such case studies, contact https://www.mindtrades.com.

Written by mindtradesconsulting | A digital transformation company based in the US.
Published by HackerNoon on 2021/07/14