János Kramár

Google DeepMind

AtP*: An efficient and scalable method for localizing LLM behaviour to components

Mar 01, 2024
János Kramár, Tom Lieberum, Rohin Shah, Neel Nanda

Explaining grokking through circuit efficiency

Sep 05, 2023
Vikrant Varma, Rohin Shah, Zachary Kenton, János Kramár, Ramana Kumar

Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla

Jul 24, 2023
Tom Lieberum, Matthew Rahtz, János Kramár, Neel Nanda, Geoffrey Irving, Rohin Shah, Vladimir Mikulik

Tracr: Compiled Transformers as a Laboratory for Interpretability

Jan 12, 2023
David Lindner, János Kramár, Matthew Rahtz, Thomas McGrath, Vladimir Mikulik

Learning to Play No-Press Diplomacy with Best Response Policy Iteration

Jun 17, 2020
Thomas Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian Gemp, Thomas C. Hudson, Nicolas Porcel, Marc Lanctot, Julien Pérolat, Richard Everett, Satinder Singh, Thore Graepel, Yoram Bachrach

OpenSpiel: A Framework for Reinforcement Learning in Games

Oct 10, 2019
Marc Lanctot, Edward Lockhart, Jean-Baptiste Lespiau, Vinicius Zambaldi, Satyaki Upadhyay, Julien Pérolat, Sriram Srinivasan, Finbarr Timbers, Karl Tuyls, Shayegan Omidshafiei, Daniel Hennes, Dustin Morrill, Paul Muller, Timo Ewalds, Ryan Faulkner, János Kramár, Bart De Vylder, Brennan Saeta, James Bradbury, David Ding, Sebastian Borgeaud, Matthew Lai, Julian Schrittwieser, Thomas Anthony, Edward Hughes, Ivo Danihelka, Jonah Ryan-Davis
