Matthew Jagielski

Cascading Adversarial Bias from Injection to Distillation in Language Models

May 30, 2025
Via arXiv

Strong Membership Inference Attacks on Massive Datasets and (Moderately) Large Language Models

May 24, 2025
Via arXiv

Covert Attacks on Machine Learning Training in Passively Secure MPC

May 21, 2025
Via arXiv

LLMs unlock new paths to monetizing exploits

May 16, 2025
Via arXiv

Privacy Ripple Effects from Adding or Removing Personal Information in Language Model Training

Feb 21, 2025
Via arXiv

Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards

Jan 13, 2025
Via arXiv

On Evaluating the Durability of Safeguards for Open-Weight LLMs

Dec 10, 2024
Via arXiv

Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice

Dec 09, 2024
Via arXiv

The Last Iterate Advantage: Empirical Auditing and Principled Heuristic Analysis of Differentially Private SGD

Oct 10, 2024
Via arXiv

UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI

Jun 27, 2024
Via arXiv