Meisam Razaviyayn

Sampling and Loss Weights in Multi-Domain Training
Nov 10, 2025

TNT: Improving Chunkwise Training for Test-Time Memorization
Nov 10, 2025

Memory-Efficient Differentially Private Training with Gradient Random Projection
Jun 18, 2025

ATLAS: Learning to Optimally Memorize the Context at Test Time
May 29, 2025

It's All Connected: A Journey Through Test-Time Memorization, Attentional Bias, Retention, and Online Optimization
Apr 17, 2025

Synthetic Text Generation for Training Large Language Models via Gradient Matching
Feb 24, 2025

PiKE: Adaptive Data Mixing for Multi-Task Learning Under Low Gradient Conflicts
Feb 10, 2025

Stochastic Control for Fine-tuning Diffusion Models: Optimality, Regularity, and Convergence
Dec 24, 2024

A Stochastic Optimization Framework for Private and Fair Learning From Decentralized Data
Nov 12, 2024

Addax: Utilizing Zeroth-Order Gradients to Improve Memory Efficiency and Performance of SGD for Fine-Tuning Language Models
Oct 09, 2024