Vladimir Mikulik

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023
Via arXiv

Challenges with unsupervised LLM knowledge discovery

Dec 18, 2023
Via arXiv

The Hydra Effect: Emergent Self-repair in Language Model Computations

Jul 28, 2023
Via arXiv

Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla

Jul 24, 2023
Via arXiv

Tracr: Compiled Transformers as a Laboratory for Interpretability

Jan 12, 2023
Via arXiv

Teaching language models to support answers with verified quotes

Mar 21, 2022
Via arXiv

Scaling Language Models: Methods, Analysis & Insights from Training Gopher

Dec 08, 2021
Via arXiv

Alignment of Language Agents

Mar 26, 2021
Via arXiv

Causal Analysis of Agent Behavior for AI Safety

Mar 05, 2021
Via arXiv

Algorithms for Causal Reasoning in Probability Trees

Nov 12, 2020
Via arXiv