Baharan Mirzasoleiman

Dataset Distillation via Knowledge Distillation: Towards Efficient Self-Supervised Pre-Training of Deep Networks

Oct 03, 2024

Memory-efficient Training of LLMs with Larger Mini-batches

Jul 28, 2024

Make the Most of Your Data: Changing the Training Data Distribution to Improve In-distribution Generalization Performance

Apr 27, 2024

Data-Efficient Contrastive Language-Image Pretraining: Prioritizing Data Quality over Quantity

Mar 20, 2024

Investigating the Benefits of Projection Head for Representation Learning

Mar 18, 2024

SmallToLarge (S2L): Scalable Data Selection for Fine-tuning Large Language Models by Summarizing Training Trajectories of Small Models

Mar 12, 2024

Inference and Interference: The Role of Clipping, Pruning and Loss Landscapes in Differentially Private Stochastic Gradient Descent

Nov 12, 2023

Data Distillation Can Be Like Vodka: Distilling More Times For Better Quality

Oct 10, 2023

Understanding the Robustness of Multi-modal Contrastive Learning to Distribution Shift

Oct 08, 2023

Better Safe than Sorry: Pre-training CLIP against Targeted Data Poisoning and Backdoor Attacks

Oct 05, 2023