Geoffrey Irving

Does Circuit Analysis Interpretability Scale? Evidence from Multiple Choice Capabilities in Chinchilla

Jul 24, 2023

Accelerating Large Language Model Decoding with Speculative Sampling

Feb 02, 2023

Solving math word problems with process- and outcome-based feedback

Nov 25, 2022

Fine-Tuning Language Models via Epistemic Neural Networks

Nov 03, 2022

Improving alignment of dialogue agents via targeted human judgements

Sep 28, 2022

Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models

Jun 16, 2022

Teaching language models to support answers with verified quotes

Mar 21, 2022

Uncertainty Estimation for Language Reward Models

Mar 14, 2022

Red Teaming Language Models with Language Models

Feb 07, 2022

Improving language models by retrieving from trillions of tokens

Jan 11, 2022