Kimon Antonakopoulos

Constrained Stochastic Spectral Preconditioning Converges for Nonconvex Objectives

May 12, 2026
Via arXiv

Training Neural Networks at Any Scale

Nov 14, 2025
Via arXiv

Layer-wise Quantization for Quantized Optimistic Dual Averaging

May 20, 2025
Via arXiv

Multi-Step Alignment as Markov Games: An Optimistic Online Gradient Descent Approach with Convergence Guarantees

Feb 18, 2025
Via arXiv

Training Deep Learning Models with Norm-Constrained LMOs

Feb 11, 2025
Via arXiv

Improving SAM Requires Rethinking its Optimization Formulation

Jul 17, 2024
Via arXiv

Distributed Extra-gradient with Optimal Complexity and Communication Guarantees

Aug 17, 2023
Via arXiv

Extra-Newton: A First Approach to Noise-Adaptive Accelerated Second-Order Methods

Nov 03, 2022
Via arXiv

Adaptive Stochastic Variance Reduction for Non-convex Finite-Sum Minimization

Nov 03, 2022
Via arXiv

No-Regret Learning in Games with Noisy Feedback: Faster Rates and Adaptivity via Learning Rate Separation

Jun 13, 2022
Via arXiv