This summer, I wanted to test my operations research skills on a problem that has been troubling me. Why am I so bad at online daily fantasy baseball?
If you’re reading this, you probably know what I’m talking about. I’m assuming you’re the type who wanted to do a little research and maybe stumbled on this data science blog post looking for tips on how to build your team.
Spoiler Alert: I have bad news for you.
In this blog, you’ll learn why online fantasy sports are difficult. You’ll see that the economy created by these online game providers is efficient, and you’ll need to put in a lot of time if you want to game the system.
If you’re not familiar with anything I’m talking about, then this paragraph is for you. According to one online gambling website that will go nameless, online fantasy sports betting is a $48 billion industry. In fantasy sports, you choose players who are playing that day and see if they can make you money by performing well. In the baseball version, you choose a roster of 8 position players and 2 pitchers whose combined salary can’t exceed the league’s salary cap. Their performance that day dictates the points you score. You don’t have to do anything but pick the best players that day.
Simple, right?
Enter my fantasy team, the Chicago Red Line Hustlers. Yes, I did build a logo for my make believe team. What’s it to you?
We’re not that good. I’ve been playing for a month or so and my team has lost money. Not a lot; I’m not putting a bunch on the line. Still, I can’t quit my day job, and I want to know why.
To understand why I’m so bad at this game, I reached into my decision analytics toolkit and dusted off the old tried and true portfolio analytics tool — the Monte Carlo simulation. The simulation will help me understand the risk in personnel decisions — specifically which ten players should be picked given our constraint of being below the league salary cap.
The simulation will use real-life results for major league baseball players from 2018 to the day prior to the game we’re simulating. I achieve that by obtaining daily statistics from MLB.com’s public API for all active major leaguers. (Note: data engineering is beyond the scope of this blog.)
Once stats are collected, we must collect the eligible players involved in the fantasy game we’re interested in simulating. I achieve that by interacting with the public API of an operator of online sports gambling that specializes in daily fantasy sports. (Note: data engineering is beyond the scope of this blog.)
Once data is collected and eligible players are identified, the simulation can do an exhaustive search, modified by novel constraint propagation, to find the best-performing lineup; or a user can define a lineup and assess its performance.
The simulation’s objective is to simulate the potential outcomes of baseball games played by a lineup of 2 pitchers and 8 position players. Those outcomes are scored using the classic rules of the online provider’s daily fantasy baseball contest.
With data available, we could go about experimenting and modeling expected outcomes from our real-world data. Our goal is to identify the fantasy scoring for 8 position players (representing all 8 fielding positions in baseball) and 2 pitchers. Our model will need to address the position players and pitchers, separately.
For position players, scoring is highly skewed towards batting outcomes, so our focus will be on simulating plate appearances and the potential outcomes of each plate appearance.
We simulate only offense, and within offense only plate appearances, because a plate appearance is the only chance a position player has to impact the fantasy game. The flowchart of our simulation shows how our model will proceed to identify plate appearances and their potential outcomes.
The number of plate appearances a player gets varies from game to game and mostly depends on where he bats in the lineup. A position player at the back of the lineup tends to have fewer plate appearances than a player at the top (beginning) of the lineup. Our model has access to the number of plate appearances a player has had historically but doesn’t have access to the order in the lineup they appeared in for that game. That’s the first limitation to the accuracy of our simulation. I model plate appearances based on historical outcomes, not on actual expectations based on the batting order for the upcoming game.
To accomplish our simulation, we empirically sample plate appearances from past games by using a random number generator to select plate appearances from an ordered list of historical plate appearances for each position player.
For each game simulated, this random selection of plate appearances is performed. Because the draws come straight from the player’s own game log, the simulated plate appearances follow each player’s historical distribution over many simulated games.
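As a rough illustration of the empirical sampling described above, here is a minimal Python sketch. The function name `simulate_plate_appearances` and the sample game log are hypothetical and are not taken from the project’s code; the sketch assumes the history is simply a list of plate-appearance counts, one per past game.

```python
import random

def simulate_plate_appearances(historical_pa, n_games, rng=None):
    """Draw one plate-appearance count per simulated game from past games."""
    rng = rng or random.Random()
    if not historical_pa:
        raise ValueError("need at least one historical game to sample from")
    # Each simulated game reuses the count from a randomly chosen past game.
    return [rng.choice(historical_pa) for _ in range(n_games)]

# Hypothetical game log: plate appearances in each of a player's past games.
past_pa = [4, 5, 4, 3, 4, 5, 4, 4, 2, 5]
simulated = simulate_plate_appearances(past_pa, n_games=1000, rng=random.Random(42))
```

Sampling with replacement from the raw game log avoids fitting any distribution, at the cost of never producing a count the player has not already recorded.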
Once plate appearances are simulated, we need a model that determines which outcomes are possible, and we use a nested probability to select an outcome that represents what the player is expected to achieve. The first step in our nested, or chained, probability is sampling from the player’s ability to turn a plate appearance into an offensive outcome. This ratio can be applied to the simulated plate appearances to determine how many outcomes the player achieved in that game.
With the number of outcomes determined, we can then use the player’s historical outcomes achieved to simulate what that player is capable of. For example, if we’re simulating a player known for hitting a lot of home runs — we’d see many home runs in his historical performances.
The simulation will empirically sample from the distribution of outcomes that the player has achieved in the past. The outcome can then be scored according to the online fantasy provider’s score card and further processed to simulate additional outcomes based on the state of the game when that outcome occurred.
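The two-stage draw described above could be sketched as follows. Everything named here is an assumption made for illustration: the per-game ratio list, the outcome labels, the rounding rule, and especially the `POINTS` table, whose values are placeholders and not the provider’s actual score card.

```python
import random

# Placeholder point values (hypothetical); substitute the provider's score card.
POINTS = {"single": 3, "double": 5, "triple": 8, "home_run": 10, "walk": 2}

def simulate_outcomes(pa, game_ratios, past_outcomes, rng):
    """Return the offensive outcomes for one simulated game."""
    # Stage 1: sample a historical outcomes-per-plate-appearance ratio.
    ratio = rng.choice(game_ratios)
    # Apply it to the simulated plate appearances (never more outcomes than PAs).
    n_outcomes = min(pa, round(pa * ratio))
    # Stage 2: sample each outcome type from the player's past outcomes.
    return [rng.choice(past_outcomes) for _ in range(n_outcomes)]

def score(outcomes):
    """Sum the fantasy points for a list of outcomes."""
    return sum(POINTS[o] for o in outcomes)

rng = random.Random(7)
game_ratios = [0.0, 0.25, 0.5, 0.2, 0.4, 0.25]  # hypothetical per-game ratios
past_outcomes = (["single"] * 30 + ["double"] * 8 + ["triple"]
                 + ["home_run"] * 10 + ["walk"] * 15)
game = simulate_outcomes(4, game_ratios, past_outcomes, rng)
```

A power hitter’s `past_outcomes` list would simply contain more `"home_run"` entries, so home runs come up more often in his simulated games.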
The game state when an outcome occurs will also impact three final outcomes our simulation must model. Those three outcomes depend on events that occurred before, during, and after the outcome is achieved by the player. The final layer of our flowchart shows runs, stolen bases, and RBIs being only possible when certain outcomes occur.
In addition, the amount of those three events is also impacted by the outcome. For example, if a player hits a home run, they are awarded at least one RBI and one run. They also have no chance of achieving a stolen base because of that outcome. We model all these rules in an additional layer of nested probability, sampling empirically from past player performances to simulate the game states the player was performing in. An example of one of these outcomes is below, modeling runs.
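As a rough illustration of the runs rule (a sketch under stated assumptions, not the exact function from this project), the code below credits a run on every home run and otherwise decides whether the batter scores with a single probability. The parameter `score_rate` is hypothetical: it stands for the player’s historical share of times on base that ended in a run, which is one simple stand-in for the unobserved game state.

```python
import random

def simulate_runs(outcomes, score_rate, rng):
    """Count runs scored by the batter across one game's outcomes."""
    runs = 0
    for outcome in outcomes:
        if outcome == "home_run":
            runs += 1  # a home run always scores the batter
        elif rng.random() < score_rate:
            runs += 1  # reached base and was later driven in
    return runs

rng = random.Random(11)
runs = simulate_runs(["single", "home_run", "walk"], score_rate=0.3, rng=rng)
```

RBIs and stolen bases could follow the same pattern, with a floor of one RBI on a home run and a stolen-base chance of zero for that outcome.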
With all our functions defined, I can now simulate unlimited games for any player with enough sample performances to draw from. The result performs very well. Here’s an example of simulated Trea Turner from the LA Dodgers.
With batters accounted for in the simulation, the model must also be able to simulate pitcher outcomes. The fantasy scoring for pitchers only considers the results of plate appearances and outcomes of the batters they face.
For pitchers, the model starts with simulating innings pitched. This is important because my own research suggests that pitchers who pitch more innings tend to perform better (allowing fewer runs and hits, and improving their win percentage as the game progresses).
This observation is intuitive: a manager is less likely to pull a strong pitcher who is having a good outing in favor of another pitcher.
To accomplish our simulation, we empirically sample innings pitched from past games by using a random number generator to select from an ordered list of historical innings pitched for each pitcher. The code to achieve this is similar to the plate appearances function for batters. For each game simulated, this random selection is performed, so over many simulated games the innings pitched follow each pitcher’s historical distribution.
Pitcher outcomes are evaluated based on the number of innings pitched. As I indicated previously, I do this because player performance is different depending on how deep into the game the pitcher plays.
There are five outcomes that we care about: the number of wins, runs, hits, walks, and strikeouts. I simulate these outcomes by sampling empirically from subsets of pitcher data corresponding to the innings pitched. This process ensures we respect pitcher performance as a function of innings pitched.
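The conditional sampling described above could look roughly like this. The game-log layout, the `STATS` tuple, and the choice to bucket games by whole innings are all assumptions made for the sketch; the project may group innings differently.

```python
import random

STATS = ("win", "runs", "hits", "walks", "strikeouts")

def simulate_pitcher_game(game_log, rng):
    """Sample innings pitched, then sample each stat from similar-length outings."""
    ip = rng.choice([g["ip"] for g in game_log])
    # Keep only past games that lasted the same number of whole innings.
    bucket = [g for g in game_log if int(g["ip"]) == int(ip)]
    line = {"ip": ip}
    for stat in STATS:
        # Each stat is drawn independently from the matching subset.
        line[stat] = rng.choice(bucket)[stat]
    return line

# Hypothetical game log for one pitcher.
game_log = [
    {"ip": 7.0, "win": 1, "runs": 1, "hits": 4, "walks": 1, "strikeouts": 9},
    {"ip": 7.1, "win": 1, "runs": 2, "hits": 5, "walks": 0, "strikeouts": 8},
    {"ip": 4.2, "win": 0, "runs": 5, "hits": 8, "walks": 3, "strikeouts": 3},
    {"ip": 4.0, "win": 0, "runs": 4, "hits": 7, "walks": 2, "strikeouts": 4},
    {"ip": 6.0, "win": 1, "runs": 2, "hits": 5, "walks": 2, "strikeouts": 6},
]
line = simulate_pitcher_game(game_log, random.Random(3))
```

Drawing each stat independently within a bucket ignores correlations between stats in the same outing; sampling whole past games from the bucket would preserve them, at the cost of less variety.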
The final outcomes come from a status check after each simulated game. The model scores a complete game whenever the simulated pitcher pitches 9 complete innings. It credits the pitcher with a complete game shutout whenever the simulated pitcher throws 9 complete innings and allows no earned runs. Finally, the model credits the simulated pitcher with a no-hitter if the simulated pitcher completes all 9 innings and allows no hits.
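These three checks are simple enough to write down directly. The function name and return shape below are hypothetical; the rules mirror the description above, including the use of earned runs for the shutout test.

```python
def pitcher_bonuses(innings_pitched, earned_runs, hits):
    """Flag the bonus events for one simulated pitching line."""
    complete_game = innings_pitched >= 9
    return {
        "complete_game": complete_game,
        "shutout": complete_game and earned_runs == 0,
        "no_hitter": complete_game and hits == 0,
    }

bonuses = pitcher_bonuses(innings_pitched=9, earned_runs=0, hits=3)
```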
With all our pitching functions defined, I can now simulate unlimited games for any pitcher with sufficient sample data. The simulated data is very similar to actual MLB performances for the pitchers we’ve modeled.
The goal of this project was to assess the risk in selecting lineups for a daily fantasy sports game. To achieve this, I built a function that randomly fills out a valid lineup by spending up to the $50,000 salary cap indicated in the classic rules of a fantasy baseball provider. I did this 500 times to create 500 random, valid lineups to simulate. Each player in the valid lineup was put through their own simulation — simulating 50 games of performance each. The results were not surprising.
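One simple way to generate random valid lineups is rejection sampling: fill every roster slot at random, then keep the lineup only if it fits under the cap. The sketch below assumes a roster of two pitchers, one player at each infield position and catcher, and three outfielders; the slot labels, the player dictionary layout, and the synthetic pool are hypothetical, and the function is not the one used in this project.

```python
import random
from collections import Counter

SALARY_CAP = 50_000
# Assumed roster slots: 2 pitchers and 8 position players.
SLOTS = ["P", "P", "C", "1B", "2B", "3B", "SS", "OF", "OF", "OF"]

def random_lineup(pool, rng, max_tries=10_000):
    """Return a random lineup that fills every slot and fits under the cap."""
    by_pos = {}
    for player in pool:
        by_pos.setdefault(player["pos"], []).append(player)
    for _ in range(max_tries):
        lineup = []
        for pos, count in Counter(SLOTS).items():
            # sample() picks distinct players, so no one is used twice.
            lineup += rng.sample(by_pos.get(pos, []), count)
        if sum(p["salary"] for p in lineup) <= SALARY_CAP:
            return lineup
    raise RuntimeError("no valid lineup found; check the pool and the cap")

# Synthetic player pool with made-up salaries, for illustration only.
rng = random.Random(1)
pool = [
    {"name": f"{pos}{i}", "pos": pos, "salary": rng.randrange(2_000, 8_000, 100)}
    for pos in ("P", "C", "1B", "2B", "3B", "SS", "OF")
    for i in range(6)
]
lineups = [random_lineup(pool, rng) for _ in range(500)]
```

Each of the 500 lineups can then be scored by running every player through 50 simulated games and summing the fantasy points per game.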
It appears that baseball outcomes have a lot of randomness in them.
Here’s a summary of the simulated performance of the top 10 random lineups chosen by the ratio of points to variability — like a Sharpe Ratio in finance. These lineups represent the least risky, but most effective options.
Here’s the same visual but selecting the top-scoring valid lineups.
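The two rankings above can be sketched in a few lines. This is a simplified version under stated assumptions: the ratio is mean points divided by the sample standard deviation, with no risk-free rate subtracted as a true Sharpe Ratio would, and the lineup names and point totals are made up.

```python
import statistics

def risk_adjusted_score(points):
    """Mean fantasy points per unit of variability across simulated games."""
    spread = statistics.stdev(points)
    return statistics.mean(points) / spread if spread > 0 else float("inf")

def top_by_ratio(simulated, n=10):
    """Lineup names ranked by points-to-variability ratio, best first."""
    return sorted(simulated, key=lambda k: risk_adjusted_score(simulated[k]),
                  reverse=True)[:n]

def top_by_mean(simulated, n=10):
    """Lineup names ranked by average simulated points, best first."""
    return sorted(simulated, key=lambda k: statistics.mean(simulated[k]),
                  reverse=True)[:n]

# Hypothetical simulated point totals per lineup.
simulated = {
    "A": [100, 102, 98, 101],
    "B": [130, 60, 150, 70],
    "C": [90, 95, 85, 92],
}
steady = top_by_ratio(simulated, n=2)
highest = top_by_mean(simulated, n=2)
```

In this toy data, lineup B has the highest average but swings wildly, so it tops the mean ranking and drops out of the ratio ranking.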
These images show valid lineups with strong performance, but all these lineups have expected values that are very near each other. Through this broad search, we couldn’t find a lineup that significantly outperforms all other lineups. Our simulation was successful in showing that fantasy baseball games, without significant detailed player analysis and assessment, are simply games of chance.
In conclusion, simulation is a very useful tool for understanding the risks associated with decision-making. It has been applied successfully in a wide range of applications including product design, pilot training, queuing research, and now fantasy sports.
The results of this specific simulation show that baseball in the short term is very much random. Any player, on any given day, can have a good or bad game. Matchups can come into play, and game situations can influence a player’s decision.
We achieved the ability to model a very complex system and get very real-world results. Over the long term, I expect my simulation would generate players who perform similarly to their real-life counterparts.
The simulations can be improved. For example, we do not have data on the game state for any performance outcome. We simulate based on past achievements, but an improved model would simulate the game state as well. This would be an important next step as other player decisions and game state would impact the player’s ability to generate outcomes. In addition, the model cannot simulate matchups — a crucial topic in fantasy sports. Batters perform differently depending on the pitcher they face or the team they’re playing.
In the end, I hope to add improvements that consider matchups and incrementally sharpen my ability to assess short-term player performance.