Laura Weidinger

The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources

Jun 26, 2024
Via arXiv

STAR: SocioTechnical Approach to Red Teaming Language Models

Jun 17, 2024
Via arXiv

Holistic Safety and Responsibility Evaluations of Advanced AI Models

Apr 22, 2024
Via arXiv

Sociotechnical Safety Evaluation of Generative AI Systems

Oct 31, 2023
Via arXiv

Improving alignment of dialogue agents via targeted human judgements

Sep 28, 2022
Via arXiv

Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models

Jun 16, 2022
Via arXiv

Scaling Language Models: Methods, Analysis & Insights from Training Gopher

Dec 08, 2021
Via arXiv

Ethical and social risks of harm from Language Models

Dec 08, 2021
Via arXiv

Alignment of Language Agents

Mar 26, 2021
Via arXiv

Modelling Cooperation in Network Games with Spatio-Temporal Complexity

Feb 13, 2021
Via arXiv