I now want to talk about the model I discussed in the first piece in more technical terms. Better a year late than never, I suppose.

For predicting the outcome of a match I used a logistic regression model. I compared it against models based on naive Bayes, neural networks, random forest and support vector machines. Every model was cross-validated and its optimal hyperparameters were found.

The reason I stuck with a logistic regression model was that its prediction accuracy was on par with, or superior to, that of more complex solutions, and the transparency of the model means you can use it for qualitative analysis. With logistic regression you understand what the key features are and how much weight each one carries. Also, logistic regression returns probabilities that are pretty accurate, which is important for having a notion of how confident you are in your prediction.

Features

The model consists of the following features with their coefficients. Features were standardized before fitting the model:

home court advantage: 0.10218887
effective field goal percentage difference: 0.16118265
turnover percentage difference: -0.05958713
offensive rebound percentage difference: 0.07061777
free throws to field goals attempts difference: 0.03267933
distance traveled in last 7 days difference: -0.01459163
form in last seven matches difference: 0.0828436
offensive rating difference: 0.17885523
defensive rating difference: -0.33924331
effective field goal percentage difference (court*): 0.10808104
turnover percentage difference (court*): -0.09548481
offensive rebound percentage difference (court*): 0.07055131
free throws to field goals attempts difference (court*): 0.0748545
form in last seven matches difference (court*): -0.00486437
offensive rating difference (court*): 0.14822224
defensive rating difference (court*): -0.21756487

*Considering court situation means that, for example, if Team A is the host and Team B is the visitor, the effective field goal percentage difference would be A's effective field goal percentage when playing at home minus B's effective field goal percentage on the road.

The input for a given match would be the difference in each of these metrics between the two teams.

Performance

Let’s take the last Celtics ring season as an example: 2007-2008. This model would have correctly predicted 70% of the matches. Is this number good? We would obviously expect a dummy model that chooses winners randomly to be correct around 50% of the time. However, we have a better benchmark at our disposal: Vegas money lines. A model that simply predicts that Vegas’ favorite will win would have been correct 69.8% of the time. Considering this is what bookies do for a living, spending a lot of resources building models for setting the odds and leveraging the power of markets, I’d argue that performing on a par with Vegas is a great result.

Interestingly, our model picked Vegas’ favorite 87.6% of the time and the underdog the remaining 12.4%. 51% of the time, it correctly predicted that the underdog would win.
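As a rough illustration of how the coefficients listed under Features turn into a prediction, here is a minimal sketch in plain Python. It is not the code behind the article's results. The dictionary keys and the function name `win_probability` are hypothetical names chosen for this example, the intercept is not given in the article and is assumed to be 0.0, and the inputs are assumed to be already standardized, since the means and standard deviations used for scaling are not given either. The returned value is read here as the win probability of the team that comes first in each difference, which is also an assumption.

```python
import math

# Coefficients as listed in the article (fitted on standardized features).
# The key names are hypothetical shorthand for the feature names in the text.
COEFFICIENTS = {
    "home_court_advantage": 0.10218887,
    "efg_pct_diff": 0.16118265,
    "tov_pct_diff": -0.05958713,
    "orb_pct_diff": 0.07061777,
    "ft_per_fga_diff": 0.03267933,
    "distance_last_7_days_diff": -0.01459163,
    "form_last_7_diff": 0.0828436,
    "off_rating_diff": 0.17885523,
    "def_rating_diff": -0.33924331,
    "efg_pct_diff_court": 0.10808104,
    "tov_pct_diff_court": -0.09548481,
    "orb_pct_diff_court": 0.07055131,
    "ft_per_fga_diff_court": 0.0748545,
    "form_last_7_diff_court": -0.00486437,
    "off_rating_diff_court": 0.14822224,
    "def_rating_diff_court": -0.21756487,
}


def win_probability(standardized_features, intercept=0.0):
    """Return a logistic-regression win probability.

    standardized_features maps feature names to already-standardized values;
    features that are left out contribute nothing (treated as 0.0).
    The intercept is an assumption: the article does not report one.
    """
    unknown = set(standardized_features) - set(COEFFICIENTS)
    if unknown:
        raise KeyError(f"unknown features: {sorted(unknown)}")
    # Linear score: intercept plus the weighted sum of the features.
    score = intercept + sum(
        COEFFICIENTS[name] * value
        for name, value in standardized_features.items()
    )
    # The logistic (sigmoid) function maps the score to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    # Made-up standardized inputs, purely for illustration.
    example = {"home_court_advantage": 1.0, "off_rating_diff": 0.5}
    print(round(win_probability(example), 3))
```

Because the features are standardized, the size of each coefficient gives a rough sense of how strongly that feature moves the prediction, which is the transparency argument made above.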

An Attempt to Predict the NBA with a Machine Learning System Written in Python Part II