Use Beta Distribution and Thompson Sampling to Beat The Multi-armed Bandit at the Casino

by Ryan Yu | 15m | February 29th, 2020

Too Long; Didn't Read

The Beta distribution is a family of continuous probability distributions defined on the interval [0, 1] and parametrized by two positive shape parameters, α and β, which appear as exponents of the random variable and control the shape of the distribution. We use the Beta distribution to model the simplest form of the multi-armed bandit problem, in which each outcome (reward) is binary. In the casino example, each machine pays a reward of $1 on success and $0 on failure. Our goal is to identify the machine with the highest probability of success.
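
To make the idea concrete, here is a minimal sketch of Thompson Sampling for the binary-reward casino setup described above. It is an illustration under stated assumptions, not the article's own implementation: the function name `thompson_sampling`, the `true_probs` argument (the hidden success probability of each machine), and the uniform Beta(1, 1) starting belief are all choices made for this example. Each round, the sketch draws one sample from every machine's Beta distribution, plays the machine with the largest sample, and then adds 1 to that machine's α on a success or to its β on a failure.

```python
import random


def thompson_sampling(true_probs, n_rounds, seed=0):
    """Simulate Thompson Sampling on machines with binary ($1 / $0) rewards.

    true_probs is a hypothetical list of each machine's hidden success
    probability; it is used only to simulate the payouts.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)

    # Assumption: start every machine at Beta(1, 1), i.e. a uniform belief.
    alpha = [1] * n_arms  # 1 + number of successes observed per machine
    beta = [1] * n_arms   # 1 + number of failures observed per machine
    pulls = [0] * n_arms
    total_reward = 0

    for _ in range(n_rounds):
        # Draw one plausible success rate per machine from its current Beta.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(n_arms)]
        # Play the machine whose sampled success rate is highest.
        arm = max(range(n_arms), key=lambda i: samples[i])

        # Simulate the payout: $1 on success, $0 on failure.
        reward = 1 if rng.random() < true_probs[arm] else 0

        # Update the belief for the machine that was played.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
        total_reward += reward

    return alpha, beta, pulls, total_reward


if __name__ == "__main__":
    # Three hypothetical machines; the success rates are made up for the demo.
    alpha, beta, pulls, total = thompson_sampling([0.2, 0.5, 0.7], n_rounds=1000)
    for i in range(len(pulls)):
        estimate = alpha[i] / (alpha[i] + beta[i])  # mean of Beta(alpha, beta)
        print(f"machine {i}: pulls={pulls[i]}, estimated p={estimate:.3f}")
    print("total reward:", total)
```

Sampling from each Beta, rather than always playing the machine with the highest estimated mean, is what lets the procedure keep exploring machines it is still uncertain about while gradually concentrating its plays on the ones that pay out most often.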
