Alert button
Picture for Vikrant Varma

Vikrant Varma

Alert button

Challenges with unsupervised LLM knowledge discovery

Add code
Bookmark button
Alert button
Dec 18, 2023
Sebastian Farquhar, Vikrant Varma, Zachary Kenton, Johannes Gasteiger, Vladimir Mikulik, Rohin Shah

Viaarxiv icon

Explaining grokking through circuit efficiency

Add code
Bookmark button
Alert button
Sep 05, 2023
Vikrant Varma, Rohin Shah, Zachary Kenton, János Kramár, Ramana Kumar

Viaarxiv icon

Goal Misgeneralization: Why Correct Specifications Aren't Enough For Correct Goals

Add code
Bookmark button
Alert button
Oct 04, 2022
Rohin Shah, Vikrant Varma, Ramana Kumar, Mary Phuong, Victoria Krakovna, Jonathan Uesato, Zac Kenton

Figure 1 for Goal Misgeneralization: Why Correct Specifications Aren't Enough For Correct Goals
Figure 2 for Goal Misgeneralization: Why Correct Specifications Aren't Enough For Correct Goals
Figure 3 for Goal Misgeneralization: Why Correct Specifications Aren't Enough For Correct Goals
Figure 4 for Goal Misgeneralization: Why Correct Specifications Aren't Enough For Correct Goals
Viaarxiv icon

Safe Deep RL in 3D Environments using Human Feedback

Add code
Bookmark button
Alert button
Jan 21, 2022
Matthew Rahtz, Vikrant Varma, Ramana Kumar, Zachary Kenton, Shane Legg, Jan Leike

Figure 1 for Safe Deep RL in 3D Environments using Human Feedback
Figure 2 for Safe Deep RL in 3D Environments using Human Feedback
Figure 3 for Safe Deep RL in 3D Environments using Human Feedback
Figure 4 for Safe Deep RL in 3D Environments using Human Feedback
Viaarxiv icon

Imitating Interactive Intelligence

Add code
Bookmark button
Alert button
Jan 21, 2021
Josh Abramson, Arun Ahuja, Iain Barr, Arthur Brussee, Federico Carnevale, Mary Cassin, Rachita Chhaparia, Stephen Clark, Bogdan Damoc, Andrew Dudzik, Petko Georgiev, Aurelia Guy, Tim Harley, Felix Hill, Alden Hung, Zachary Kenton, Jessica Landon, Timothy Lillicrap, Kory Mathewson, Soňa Mokrá, Alistair Muldal, Adam Santoro, Nikolay Savinov, Vikrant Varma, Greg Wayne, Duncan Williams, Nathaniel Wong, Chen Yan, Rui Zhu

Figure 1 for Imitating Interactive Intelligence
Figure 2 for Imitating Interactive Intelligence
Figure 3 for Imitating Interactive Intelligence
Figure 4 for Imitating Interactive Intelligence
Viaarxiv icon