The Alignment Ceiling: Objective Mismatch in Reinforcement Learning from Human Feedback

by
January 16th, 2024