Evaluating Language-Model Agents on Realistic Autonomous Tasks

Jan 04, 2024
Megan Kinniment, Lucas Jun Koba Sato, Haoxing Du, Brian Goodrich, Max Hasin, Lawrence Chan, Luke Harold Miles, Tao R. Lin, Hjalmar Wijk, Joel Burget, Aaron Ho, Elizabeth Barnes, Paul Christiano

View paper on arXiv
