Yuanzhi Li

On the One-sided Convergence of Adam-type Algorithms in Non-convex Non-concave Min-max Optimization

Sep 29, 2021

Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization

Aug 25, 2021

LoRA: Low-Rank Adaptation of Large Language Models

Jun 17, 2021

Sample Efficient Reinforcement Learning In Continuous State Spaces: A Perspective Beyond Linearity

Jun 15, 2021

Toward Understanding the Feature Learning Process of Self-supervised Contrastive Learning

Jun 12, 2021

Forward Super-Resolution: How Can GANs Learn Hierarchical Generative Models for Real-World Distributions

Jun 04, 2021

Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability

Feb 26, 2021

When Is Generalizable Reinforcement Learning Tractable?

Jan 01, 2021

Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning

Dec 17, 2020

A law of robustness for two-layers neural networks

Sep 30, 2020