Dawn Song

University of California, Berkeley

Climbing the Ladder of Reasoning: What LLMs Can-and Still Can't-Solve after SFT?

Apr 16, 2025

Progent: Programmable Privilege Control for LLM Agents

Apr 16, 2025

DataSentinel: A Game-Theoretic Detection of Prompt Injection Attacks

Apr 15, 2025

Assessing Judging Bias in Large Reasoning Models: An Empirical Study

Apr 14, 2025

Type-Constrained Code Generation with Language Models

Apr 12, 2025

Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs

Apr 07, 2025

SoK: Frontier AI's Impact on the Cybersecurity Landscape

Apr 07, 2025

An Illusion of Progress? Assessing the Current State of Web Agents

Apr 02, 2025

MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models

Mar 19, 2025

Improving LLM Safety Alignment with Dual-Objective Optimization

Mar 05, 2025