Aleksandr Beznosikov

Gradient Clipping Improves AdaGrad when the Noise Is Heavy-Tailed
Jun 06, 2024

Local Methods with Adaptivity via Scaling
Jun 02, 2024

Sparse Concept Bottleneck Models: Gumbel Tricks in Contrastive Learning
Apr 04, 2024

Activations and Gradients Compression for Model-Parallel Training
Jan 15, 2024

Optimal Data Splitting in Distributed Optimization for Machine Learning
Jan 15, 2024

Ito Diffusion Approximation of Universal Ito Chains for Sampling, Optimization and Boosting
Oct 09, 2023

First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities
May 25, 2023

Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features
Apr 23, 2023

Similarity, Compression and Local Steps: Three Pillars of Efficient Communications for Distributed Variational Inequalities
Feb 15, 2023

SARAH-based Variance-reduced Algorithm for Stochastic Finite-sum Cocoercive Variational Inequalities
Oct 12, 2022