Dan Hendrycks

UC Berkeley

An Overview of Catastrophic AI Risks

Jul 11, 2023
Via arXiv

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

Jun 20, 2023
Via arXiv

Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark

Apr 06, 2023
Via arXiv

Natural Selection Favors AIs over Humans

Mar 28, 2023
Via arXiv

MAUD: An Expert-Annotated Legal NLP Dataset for Merger Agreement Understanding

Jan 06, 2023
Via arXiv

How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios

Oct 18, 2022
Via arXiv

OpenOOD: Benchmarking Generalized Out-of-Distribution Detection

Oct 13, 2022
Via arXiv

Forecasting Future World Events with Neural Networks

Jun 30, 2022
Via arXiv

X-Risk Analysis for AI Research

Jun 18, 2022
Via arXiv

Actionable Guidance for High-Consequence AI Risk Management: Towards Standards Addressing AI Catastrophic Risks

Jun 17, 2022
Via arXiv