Nicholas Carlini

Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models

Apr 01, 2024
Yuxin Wen, Leo Marchyok, Sanghyun Hong, Jonas Geiping, Tom Goldstein, Nicholas Carlini

Diffusion Denoising as a Certified Defense against Clean-label Poisoning

Mar 18, 2024
Sanghyun Hong, Nicholas Carlini, Alexey Kurakin

Query-Based Adversarial Prompt Generation

Feb 19, 2024
Jonathan Hayase, Ema Borevkovic, Nicholas Carlini, Florian Tramèr, Milad Nasr

Initialization Matters for Adversarial Transfer Learning

Dec 10, 2023
Andong Hua, Jindong Gu, Zhiyu Xue, Nicholas Carlini, Eric Wong, Yao Qin

Scalable Extraction of Training Data from (Production) Language Models

Nov 28, 2023
Milad Nasr, Nicholas Carlini, Jonathan Hayase, Matthew Jagielski, A. Feder Cooper, Daphne Ippolito, Christopher A. Choquette-Choo, Eric Wallace, Florian Tramèr, Katherine Lee

Privacy Side Channels in Machine Learning Systems

Sep 11, 2023
Edoardo Debenedetti, Giorgio Severi, Nicholas Carlini, Christopher A. Choquette-Choo, Matthew Jagielski, Milad Nasr, Eric Wallace, Florian Tramèr

Reverse-Engineering Decoding Strategies Given Blackbox Access to a Language Generation System

Sep 09, 2023
Daphne Ippolito, Nicholas Carlini, Katherine Lee, Milad Nasr, Yun William Yu

Identifying and Mitigating the Security Risks of Generative AI

Aug 28, 2023
Clark Barrett, Brad Boyd, Ellie Burzstein, Nicholas Carlini, Brad Chen, Jihye Choi, Amrita Roy Chowdhury, Mihai Christodorescu, Anupam Datta, Soheil Feizi, Kathleen Fisher, Tatsunori Hashimoto, Dan Hendrycks, Somesh Jha, Daniel Kang, Florian Kerschbaum, Eric Mitchell, John Mitchell, Zulfikar Ramzan, Khawaja Shams, Dawn Song, Ankur Taly, Diyi Yang

A LLM Assisted Exploitation of AI-Guardian

Jul 20, 2023
Nicholas Carlini

Are aligned neural networks adversarially aligned?

Jun 26, 2023
Nicholas Carlini, Milad Nasr, Christopher A. Choquette-Choo, Matthew Jagielski, Irena Gao, Anas Awadalla, Pang Wei Koh, Daphne Ippolito, Katherine Lee, Florian Tramèr, Ludwig Schmidt
