Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
Yuntao Bai, Andy Jones, Kamal Ndousse, Amanda Askell, Anna Chen, Nova DasSarma, Dawn Drain, Stanislav Fort, Deep Ganguli, Tom Henighan, Nicholas Joseph, Saurav Kadavath, Jackson Kernion, Tom Conerly, Sheer El-Showk, Nelson Elhage, Zac Hatfield-Dodds, Danny Hernandez, Tristan Hume, Scott Johnston, Shauna Kravec, Liane Lovitt, Neel Nanda, Catherine Olsson, Dario Amodei, Tom Brown, Jack Clark, Sam McCandlish, Chris Olah, Ben Mann, Jared Kaplan
CoRR (2022)