Reinforcement Learning - The Value Function

Written by jingles | Published 2019/08/16
Tech Story Tags: data-science | machine-learning | programming | artificial-intelligence | reinforcement-learning | nodejs | value-function | explore-exploit

TLDR: The value function is an efficient way to determine the value of being in a state. In a game of tic-tac-toe, getting 2 Xs in a row does not win the game, hence that state carries no reward by itself. The value of a state A is the sum, over all possible next states, of the probability of reaching each next state multiplied by the reward for reaching it. In this case, state A has a chance of winning the game on the next move, so its value is greater than zero even though it pays no reward yet. State D has only 1 possible route, to state E, so the only possible outcome from D is to receive the reward.
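The calculation described in the summary can be sketched in a few lines of Python. This is only a minimal illustration of the one-step "probability times reward" sum, not the author's implementation: the function name `state_value`, the transition lists, and every probability and reward below are made-up assumptions chosen to mirror the state A and state D examples.

```python
def state_value(transitions):
    """Return the sum of probability * reward over all next states.

    `transitions` is a list of (probability, reward) pairs, one per
    possible next state. This layout is an assumption for illustration.
    """
    total_prob = sum(p for p, _ in transitions)
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("transition probabilities must sum to 1")
    return sum(p * r for p, r in transitions)


# Hypothetical state A: two possible next states, equally likely.
# One wins the game (reward 1), the other does not (reward 0).
value_a = state_value([(0.5, 1.0), (0.5, 0.0)])

# Hypothetical state D: a single route to state E, which pays reward 1.
value_d = state_value([(1.0, 1.0)])

print(value_a, value_d)
```

With these assumed numbers, state D is worth the full reward because its outcome is certain, while state A is worth only half of it because the win is one of two equally likely outcomes.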


Written by jingles | A data scientist who also enjoys developing products on the Web.
Published by HackerNoon on 2019/08/16