
Learning Policies For Learning Policies — Meta Reinforcement Learning (RL²) in Tensorflow

by Arthur Juliani | 8 min read | January 25th, 2017

Too Long; Didn't Read

Reinforcement Learning provides a framework for training agents to solve problems in the world. One limitation of these agents, however, is their inflexibility once trained. They are able to <strong><em>learn a policy</em></strong> to solve a specific problem (formalized as an <a href="https://en.wikipedia.org/wiki/Markov_decision_process" target="_blank">MDP</a>), but that learned policy is often useless in new problems, even relatively similar ones.
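
To make that inflexibility concrete, here is a minimal sketch in plain Python (not the RL² agent, and not code from this article). It assumes two hypothetical two-armed bandit tasks, `task_a` and `task_b`, whose payout probabilities are mirror images of each other. A simple epsilon-greedy learner is trained on the first task, its choice is frozen, and that frozen choice is then scored on both tasks. All names and numbers here are illustrative assumptions.

```python
import random


def train_greedy_policy(arm_probs, episodes=2000, epsilon=0.1, seed=0):
    """Estimate each arm's value by epsilon-greedy sampling, then
    return the index of the arm with the highest estimate."""
    rng = random.Random(seed)
    n_arms = len(arm_probs)
    counts = [0] * n_arms
    values = [0.0] * n_arms
    for _ in range(episodes):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental running mean of the rewards seen for this arm.
        values[arm] += (reward - values[arm]) / counts[arm]
    return max(range(n_arms), key=lambda a: values[a])


def expected_reward(policy_arm, arm_probs):
    """Expected per-step reward of always pulling one fixed arm."""
    return arm_probs[policy_arm]


# Hypothetical tasks: same structure, opposite payout probabilities.
task_a = [0.9, 0.1]
task_b = [0.1, 0.9]

fixed_arm = train_greedy_policy(task_a)  # learn on task A only
print("reward on task A:", expected_reward(fixed_arm, task_a))
print("reward on task B:", expected_reward(fixed_arm, task_b))
```

The policy learned on the first task settles on the arm that pays well there, which is exactly the arm that pays poorly on the second task. Nothing in the frozen policy lets it notice that the problem has changed, even though the two tasks differ only in which arm is the good one.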
