Mark Schmidt

Enhancing Policy Gradient with the Polyak Step-Size Adaption

Apr 11, 2024
Yunxiang Li, Rui Yuan, Chen Fan, Mark Schmidt, Samuel Horváth, Robert M. Gower, Martin Takáč

Faster Convergence of Stochastic Accelerated Gradient Descent under Interpolation

Apr 03, 2024
Aaron Mishkin, Mert Pilanci, Mark Schmidt

Heavy-Tailed Class Imbalance and Why Adam Outperforms Gradient Descent on Language Models

Feb 29, 2024
Frederik Kunstner, Robin Yadav, Alan Milligan, Mark Schmidt, Alberto Bietti

Analyzing and Improving Greedy 2-Coordinate Updates for Equality-Constrained Optimization via Steepest Descent in the 1-Norm

Jul 03, 2023
Amrutha Varshini Ramesh, Aaron Mishkin, Mark Schmidt, Yihan Zhou, Jonathan Wilder Lavington, Jennifer She

Don't be so Monotone: Relaxing Stochastic Line Search in Over-Parameterized Models

Jun 22, 2023
Leonardo Galli, Holger Rauhut, Mark Schmidt

Searching for Optimal Per-Coordinate Step-sizes with Multidimensional Backtracking

Jun 05, 2023
Frederik Kunstner, Victor S. Portella, Mark Schmidt, Nick Harvey

BiSLS/SPS: Auto-tune Step Sizes for Stable Bi-level Optimization

May 30, 2023
Chen Fan, Gaspard Choné-Ducasse, Mark Schmidt, Christos Thrampoulidis

Noise Is Not the Main Factor Behind the Gap Between SGD and Adam on Transformers, but Sign Descent Might Be

Apr 27, 2023
Frederik Kunstner, Jacques Chen, Jonathan Wilder Lavington, Mark Schmidt

Fast Convergence of Random Reshuffling under Over-Parameterization and the Polyak-Łojasiewicz Condition

Apr 02, 2023
Chen Fan, Christos Thrampoulidis, Mark Schmidt

Simplifying Momentum-based Riemannian Submanifold Optimization

Feb 20, 2023
Wu Lin, Valentin Duruisseaux, Melvin Leok, Frank Nielsen, Mohammad Emtiyaz Khan, Mark Schmidt
