How Do We Teach Reinforcement Learning Agents Human Preferences?

by Language Models (dot tech), December 3rd, 2024

Too Long; Didn't Read

ICPL enhances reinforcement learning by integrating Large Language Models with human preference feedback, refining reward functions interactively. Building on works like EUREKA, it sets new standards in RL reward design efficiency.
  1. Abstract and Introduction
  2. Related Work
  3. Problem Definition
  4. Method
  5. Experiments
  6. Conclusion and References


A. Appendix

A.1 Full Prompts and A.2 ICPL Details

A.3 Baseline Details

A.4 Environment Details

A.5 Proxy Human Preference

A.6 Human-in-the-Loop Preference

3 PROBLEM DEFINITION

Our goal is to design a reward function for training reinforcement learning agents that exhibit human-preferred behaviors. In reinforcement learning, it is usually hard to design reward functions whose induced policies align well with human preferences.



Authors:

(1) Chao Yu, Tsinghua University;

(2) Hong Lu, Tsinghua University;

(3) Jiaxuan Gao, Tsinghua University;

(4) Qixin Tan, Tsinghua University;

(5) Xinting Yang, Tsinghua University;

(6) Yu Wang, with equal advising from Tsinghua University;

(7) Yi Wu, with equal advising from Tsinghua University and the Shanghai Qi Zhi Institute;

(8) Eugene Vinitsky, with equal advising from New York University ([email protected]).

This paper is available on arXiv under a CC 4.0 license.

