Brian Goodrich

Evaluating Language-Model Agents on Realistic Autonomous Tasks

Jan 04, 2024
Megan Kinniment, Lucas Jun Koba Sato, Haoxing Du, Brian Goodrich, Max Hasin, Lawrence Chan, Luke Harold Miles, Tao R. Lin, Hjalmar Wijk, Joel Burget, Aaron Ho, Elizabeth Barnes, Paul Christiano

Via arXiv