Konstantin Mishchenko
SIERRA, PSL

When, Why and How Much? Adaptive Learning Rate Scheduling by Refinement
Oct 11, 2023
Aaron Defazio, Ashok Cutkosky, Harsh Mehta, Konstantin Mishchenko

Adaptive Proximal Gradient Method for Convex Optimization
Aug 04, 2023
Yura Malitsky, Konstantin Mishchenko

Prodigy: An Expeditiously Adaptive Parameter-Free Learner
Jun 09, 2023
Konstantin Mishchenko, Aaron Defazio

Partially Personalized Federated Learning: Breaking the Curse of Data Heterogeneity
May 29, 2023
Konstantin Mishchenko, Rustem Islamov, Eduard Gorbunov, Samuel Horváth

DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method
May 25, 2023
Ahmed Khaled, Konstantin Mishchenko, Chi Jin

Two Losses Are Better Than One: Faster Optimization Using a Cheaper Proxy
Feb 07, 2023
Blake Woodworth, Konstantin Mishchenko, Francis Bach

Learning-Rate-Free Learning by D-Adaptation
Jan 20, 2023
Aaron Defazio, Konstantin Mishchenko

Convergence of First-Order Algorithms for Meta-Learning with Moreau Envelopes
Jan 17, 2023
Konstantin Mishchenko, Slavomír Hanzely, Peter Richtárik

Super-Universal Regularized Newton Method
Aug 11, 2022
Nikita Doikov, Konstantin Mishchenko, Yurii Nesterov

Adaptive Learning Rates for Faster Stochastic Gradient Methods
Aug 10, 2022
Samuel Horváth, Konstantin Mishchenko, Peter Richtárik