Meisam Razaviyayn

Less is More: Convergence Benefits of Fewer Data Weight Updates over Longer Horizon
Feb 23, 2026

Nested Learning: The Illusion of Deep Learning Architectures
Dec 31, 2025

Sampling and Loss Weights in Multi-Domain Training
Nov 10, 2025

TNT: Improving Chunkwise Training for Test-Time Memorization
Nov 10, 2025

Memory-Efficient Differentially Private Training with Gradient Random Projection
Jun 18, 2025

ATLAS: Learning to Optimally Memorize the Context at Test Time
May 29, 2025

It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization
Apr 17, 2025

Synthetic Text Generation for Training Large Language Models via Gradient Matching
Feb 24, 2025

PiKE: Adaptive Data Mixing for Multi-Task Learning Under Low Gradient Conflicts
Feb 10, 2025

Stochastic Control for Fine-tuning Diffusion Models: Optimality, Regularity, and Convergence
Dec 24, 2024