Zeyuan Allen-Zhu

Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws

Apr 08, 2024
Zeyuan Allen-Zhu, Yuanzhi Li

Reverse Training to Nurse the Reversal Curse

Mar 20, 2024
Olga Golovneva, Zeyuan Allen-Zhu, Jason Weston, Sainbayar Sukhbaatar

Physics of Language Models: Part 3.2, Knowledge Manipulation

Sep 25, 2023
Zeyuan Allen-Zhu, Yuanzhi Li

Physics of Language Models: Part 1, Context-Free Grammar

May 23, 2023
Zeyuan Allen-Zhu, Yuanzhi Li

LoRA: Low-Rank Adaptation of Large Language Models

Jun 17, 2021
Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Weizhu Chen

Forward Super-Resolution: How Can GANs Learn Hierarchical Generative Models for Real-World Distributions

Jun 04, 2021
Zeyuan Allen-Zhu, Yuanzhi Li

Byzantine-Resilient Non-Convex Stochastic Gradient Descent

Dec 28, 2020
Zeyuan Allen-Zhu, Faeze Ebrahimian, Jerry Li, Dan Alistarh

Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning

Dec 17, 2020
Zeyuan Allen-Zhu, Yuanzhi Li

Feature Purification: How Adversarial Training Performs Robust Deep Learning

May 20, 2020
Zeyuan Allen-Zhu, Yuanzhi Li

Backward Feature Correction: How Deep Learning Performs Deep Learning

Jan 13, 2020
Zeyuan Allen-Zhu, Yuanzhi Li
