William Isaac

Generative AI Misuse: A Taxonomy of Tactics and Insights from Real-World Data

Jun 19, 2024

STAR: SocioTechnical Approach to Red Teaming Language Models

Jun 17, 2024

Holistic Safety and Responsibility Evaluations of Advanced AI Models

Apr 22, 2024

Recourse for reclamation: Chatting with generative language models

Mar 21, 2024

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023

Sociotechnical Safety Evaluation of Generative AI Systems

Oct 31, 2023

Improving alignment of dialogue agents via targeted human judgements

Sep 28, 2022

Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models

Jun 16, 2022

Scaling Language Models: Methods, Analysis & Insights from Training Gopher

Dec 08, 2021

Ethical and social risks of harm from Language Models

Dec 08, 2021