Dan Hendrycks

UC Berkeley

Best Practices for Biorisk Evaluations on Open-Weight Bio-Foundation Models

Nov 03, 2025
Via arXiv

Remote Labor Index: Measuring AI Automation of Remote Work

Oct 30, 2025
Via arXiv

TextQuests: How Good are LLMs at Text-Based Video Games?

Jul 31, 2025
Via arXiv

Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition

Jul 28, 2025
Via arXiv

Chain of Thought Monitorability: A New and Fragile Opportunity for AI Safety

Jul 15, 2025
Via arXiv

The Singapore Consensus on Global AI Safety Research Priorities

Jun 25, 2025
Via arXiv

Virology Capabilities Test (VCT): A Multimodal Virology Q&A Benchmark

Apr 21, 2025
Via arXiv

MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models

Mar 19, 2025
Via arXiv

Superintelligence Strategy: Expert Version

Mar 07, 2025
Via arXiv

The MASK Benchmark: Disentangling Honesty From Accuracy in AI Systems

Mar 05, 2025
Via arXiv