Sujay Sanghavi

Context-Free Synthetic Data Mitigates Forgetting

May 20, 2025

Asymptotically-Optimal Gaussian Bandits with Side Observations

May 15, 2025

InfoPO: On Mutual Information Maximization for Large Language Model Alignment

May 13, 2025

Geometric Median Matching for Robust k-Subset Selection from Noisy Data

Apr 03, 2025

Upweighting Easy Samples in Fine-Tuning Mitigates Forgetting

Feb 05, 2025

Learning Mixtures of Experts with EM

Nov 09, 2024

RARe: Retrieval Augmented Retrieval with In-Context Examples

Oct 26, 2024

Geometric Median (GM) Matching for Robust Data Pruning

Jun 25, 2024

DataComp-LM: In search of the next generation of training sets for language models

Jun 18, 2024

Retraining with Predicted Hard Labels Provably Increases Model Accuracy

Jun 17, 2024