Dawn Song

University of California, Berkeley

AutoScale: Automatic Prediction of Compute-optimal Data Composition for Training LLMs

Jul 29, 2024
Via arXiv

AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases

Jul 17, 2024
Via arXiv

Re-Tuning: Overcoming the Compositionality Limits of Large Language Models with Recursive Tuning

Jul 05, 2024
Via arXiv

AI Risk Categorization Decoded (AIR 2024): From Government Regulations to Corporate Policies

Jun 25, 2024
Via arXiv

BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Language Models

Jun 24, 2024
Via arXiv

Data Shapley in One Training Run

Jun 16, 2024
Via arXiv

GuardAgent: Safeguard LLM Agents by a Guard Agent via Knowledge-Enabled Reasoning

Jun 13, 2024
Via arXiv

AI Risk Management Should Incorporate Both Safety and Security

May 29, 2024
Via arXiv

KnowHalu: Hallucination Detection via Multi-Form Knowledge Based Factual Checking

Apr 03, 2024
Via arXiv

RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content

Mar 19, 2024
Via arXiv