Alert button
Picture for Max Kaufmann

Max Kaufmann

Alert button

Visibility into AI Agents

Add code
Bookmark button
Alert button
Feb 04, 2024
Alan Chan, Carson Ezell, Max Kaufmann, Kevin Wei, Lewis Hammond, Herbie Bradley, Emma Bluemke, Nitarshan Rajkumar, David Krueger, Noam Kolt, Lennart Heim, Markus Anderljung

Viaarxiv icon

The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"

Add code
Bookmark button
Alert button
Sep 22, 2023
Lukas Berglund, Meg Tong, Max Kaufmann, Mikita Balesni, Asa Cooper Stickland, Tomasz Korbak, Owain Evans

Figure 1 for The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"
Figure 2 for The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"
Figure 3 for The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"
Figure 4 for The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"
Viaarxiv icon

Taken out of context: On measuring situational awareness in LLMs

Add code
Bookmark button
Alert button
Sep 01, 2023
Lukas Berglund, Asa Cooper Stickland, Mikita Balesni, Max Kaufmann, Meg Tong, Tomasz Korbak, Daniel Kokotajlo, Owain Evans

Figure 1 for Taken out of context: On measuring situational awareness in LLMs
Figure 2 for Taken out of context: On measuring situational awareness in LLMs
Figure 3 for Taken out of context: On measuring situational awareness in LLMs
Figure 4 for Taken out of context: On measuring situational awareness in LLMs
Viaarxiv icon