Aniruddha Saha

Generating Potent Poisons and Backdoors from Scratch with Guided Diffusion

Mar 25, 2024
Hossein Souri, Arpit Bansal, Hamid Kazemi, Liam Fowl, Aniruddha Saha, Jonas Geiping, Andrew Gordon Wilson, Rama Chellappa, Tom Goldstein, Micah Goldblum

Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text

Jan 22, 2024
Abhimanyu Hans, Avi Schwarzschild, Valeriia Cherepanova, Hamid Kazemi, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein

NEFTune: Noisy Embeddings Improve Instruction Finetuning

Oct 10, 2023
Neel Jain, Ping-yeh Chiang, Yuxin Wen, John Kirchenbauer, Hong-Min Chu, Gowthami Somepalli, Brian R. Bartoldson, Bhavya Kailkhura, Avi Schwarzschild, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein

Baseline Defenses for Adversarial Attacks Against Aligned Language Models

Sep 04, 2023
Neel Jain, Avi Schwarzschild, Yuxin Wen, Gowthami Somepalli, John Kirchenbauer, Ping-yeh Chiang, Micah Goldblum, Aniruddha Saha, Jonas Geiping, Tom Goldstein

On the Reliability of Watermarks for Large Language Models

Jun 30, 2023
John Kirchenbauer, Jonas Geiping, Yuxin Wen, Manli Shu, Khalid Saifullah, Kezhi Kong, Kasun Fernando, Aniruddha Saha, Micah Goldblum, Tom Goldstein

Bring Your Own Data! Self-Supervised Evaluation for Large Language Models

Jun 29, 2023
Neel Jain, Khalid Saifullah, Yuxin Wen, John Kirchenbauer, Manli Shu, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein

Revisiting Image Classifier Training for Improved Certified Robust Defense against Adversarial Patches

Jun 22, 2023
Aniruddha Saha, Shuhua Yu, Arash Norouzzadeh, Wan-Yi Lin, Chaithanya Kumar Mummadi
