Niloofar Mireshghallah

Position: Don't Just "Fix it in Post": A Science of AI Must Study Training Dynamics

Jun 03, 2026

Boundary-targeted Membership Inference Attacks on Safety Classifiers

May 21, 2026

SMDD-Bench: Can LLMs Solve Real-World Small Molecule Drug Design Tasks?

May 20, 2026

Alignment Whack-a-Mole: Finetuning Activates Verbatim Recall of Copyrighted Books in Large Language Models

Mar 21, 2026

Learning to Reason in 13 Parameters

Feb 04, 2026

Privasis: Synthesizing the Largest "Public" Private Dataset from Scratch

Feb 03, 2026

Memorization Dynamics in Knowledge Distillation for Language Models

Jan 21, 2026

Quantifying the Effect of Test Set Contamination on Generative Evaluations

Jan 07, 2026

Reinforcement Learning Improves Traversal of Hierarchical Knowledge in LLMs

Nov 08, 2025

Position: Privacy Is Not Just Memorization!

Oct 02, 2025