Daniel Kang

Teams of LLM Agents can Exploit Zero-Day Vulnerabilities
Jun 02, 2024

LLM Agents can Autonomously Exploit One-day Vulnerabilities
Apr 11, 2024

Trustless Audits without Revealing Data or Models
Apr 06, 2024

A Safe Harbor for AI Evaluation and Red Teaming
Mar 07, 2024

InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents
Mar 05, 2024

LLM Agents can Autonomously Hack Websites
Feb 16, 2024

Removing RLHF Protections in GPT-4 via Fine-Tuning
Nov 10, 2023

Identifying and Mitigating the Security Risks of Generative AI
Aug 28, 2023

Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks
Feb 11, 2023

Q-Diffusion: Quantizing Diffusion Models
Feb 10, 2023