
Nicholas Carlini


Cutting through buggy adversarial example defenses: fixing 1 line of code breaks Sabre

May 06, 2024

Forcing Diffuse Distributions out of Language Models

Apr 16, 2024

Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models

Apr 01, 2024

Diffusion Denoising as a Certified Defense against Clean-label Poisoning

Mar 18, 2024

Query-Based Adversarial Prompt Generation

Feb 19, 2024

Initialization Matters for Adversarial Transfer Learning

Dec 10, 2023

Scalable Extraction of Training Data from (Production) Language Models

Nov 28, 2023

Privacy Side Channels in Machine Learning Systems

Sep 11, 2023

Reverse-Engineering Decoding Strategies Given Blackbox Access to a Language Generation System

Sep 09, 2023

Identifying and Mitigating the Security Risks of Generative AI

Aug 28, 2023