Dawn Song

University of California, Berkeley

The Singapore Consensus on Global AI Safety Research Priorities

Jun 25, 2025, via arXiv

OMEGA: Can LLMs Reason Outside the Box in Math? Evaluating Exploratory, Compositional, and Transformative Generalization

Jun 23, 2025, via arXiv

AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents

Jun 17, 2025, via arXiv

Pushing the Limits of Safety: A Technical Report on the ATLAS Challenge 2025

Jun 14, 2025, via arXiv

VERINA: Benchmarking Verifiable Code Generation

May 29, 2025, via arXiv

OVERT: A Benchmark for Over-Refusal Evaluation on Text-to-Image Models

May 28, 2025, via arXiv

Learning to Reason without External Rewards

May 26, 2025, via arXiv

A Critical Evaluation of Defenses against Prompt Injection Attacks

May 23, 2025, via arXiv

SafeKey: Amplifying Aha-Moment Insights for Safety Reasoning

May 22, 2025, via arXiv

In-Context Watermarks for Large Language Models

May 22, 2025, via arXiv