Aryan Mokhtari

On the Crucial Role of Initialization for Matrix Factorization

Oct 24, 2024

Online Learning Guided Quasi-Newton Methods with Global Non-Asymptotic Convergence

Oct 03, 2024

Convergence Analysis of Adaptive Gradient Methods under Refined Smoothness and Noise Assumptions

Jun 07, 2024

Adaptive and Optimal Second-order Optimistic Methods for Minimax Optimization

Jun 04, 2024

Stochastic Newton Proximal Extragradient Method

Jun 03, 2024

In-Context Learning with Transformers: Softmax Attention Adapts to Function Lipschitzness

Feb 18, 2024

An Accelerated Gradient Method for Simple Bilevel Optimization with Convex Lower-level Problem

Feb 12, 2024

Krylov Cubic Regularized Newton: A Subspace Second-Order Method with Dimension-Free Convergence Rate

Jan 05, 2024

Projection-Free Methods for Stochastic Simple Bilevel Optimization with Convex Lower-level Problem

Aug 15, 2023

Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks

Jul 17, 2023