Dan Hendrycks

UC Berkeley

An Overview of Catastrophic AI Risks

Jun 21, 2023
Dan Hendrycks, Mantas Mazeika, Thomas Woodside

Via arXiv

DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

Jun 20, 2023
Boxin Wang, Weixin Chen, Hengzhi Pei, Chulin Xie, Mintong Kang, Chenhui Zhang, Chejian Xu, Zidi Xiong, Ritik Dutta, Rylan Schaeffer, Sang T. Truong, Simran Arora, Mantas Mazeika, Dan Hendrycks, Zinan Lin, Yu Cheng, Sanmi Koyejo, Dawn Song, Bo Li

Via arXiv

Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark

Apr 06, 2023
Alexander Pan, Chan Jun Shern, Andy Zou, Nathaniel Li, Steven Basart, Thomas Woodside, Jonathan Ng, Hanlin Zhang, Scott Emmons, Dan Hendrycks

Via arXiv

Natural Selection Favors AIs over Humans

Mar 28, 2023
Dan Hendrycks

Via arXiv

MAUD: An Expert-Annotated Legal NLP Dataset for Merger Agreement Understanding

Jan 06, 2023
Steven H. Wang, Antoine Scardigli, Leonard Tang, Wei Chen, Dimitry Levkin, Anya Chen, Spencer Ball, Thomas Woodside, Oliver Zhang, Dan Hendrycks

Via arXiv

How Would The Viewer Feel? Estimating Wellbeing From Video Scenarios

Oct 18, 2022
Mantas Mazeika, Eric Tang, Andy Zou, Steven Basart, Jun Shern Chan, Dawn Song, David Forsyth, Jacob Steinhardt, Dan Hendrycks

Via arXiv

OpenOOD: Benchmarking Generalized Out-of-Distribution Detection

Oct 13, 2022
Jingkang Yang, Pengyun Wang, Dejian Zou, Zitang Zhou, Kunyuan Ding, Wenxuan Peng, Haoqi Wang, Guangyao Chen, Bo Li, Yiyou Sun, Xuefeng Du, Kaiyang Zhou, Wayne Zhang, Dan Hendrycks, Yixuan Li, Ziwei Liu

Via arXiv

Forecasting Future World Events with Neural Networks

Jun 30, 2022
Andy Zou, Tristan Xiao, Ryan Jia, Joe Kwon, Mantas Mazeika, Richard Li, Dawn Song, Jacob Steinhardt, Owain Evans, Dan Hendrycks

Via arXiv

X-Risk Analysis for AI Research

Jun 18, 2022
Dan Hendrycks, Mantas Mazeika

Via arXiv

Actionable Guidance for High-Consequence AI Risk Management: Towards Standards Addressing AI Catastrophic Risks

Jun 17, 2022
Anthony M. Barrett, Dan Hendrycks, Jessica Newman, Brandie Nonnecke

Via arXiv