High-Confidence Policy Improvement from Human Feedback

Hon Tik Tse, Philip S. Thomas, Scott Niekum

Reinforcement Learning Journal, 2025