Tianlong Chen

H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

Jul 19, 2023
Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark Barrett, Zhangyang Wang, Beidi Chen

Enhancing Adversarial Training via Reweighting Optimization Trajectory

Jul 07, 2023
Tianjin Huang, Shiwei Liu, Tianlong Chen, Meng Fang, Li Shen, Vlado Menkovski, Lu Yin, Yulong Pei, Mykola Pechenizkiy

Graph Ladling: Shockingly Simple Parallel GNN Training without Intermediate Communication

Jun 18, 2023
Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Ying Ding, Zhangyang Wang

Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models

Jun 18, 2023
Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Ying Ding, Zhangyang Wang

The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter

Jun 06, 2023
Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Zhangyang Wang

Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!

Mar 03, 2023
Shiwei Liu, Tianlong Chen, Zhenyu Zhang, Xuxi Chen, Tianjin Huang, Ajay Jaiswal, Zhangyang Wang

Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers

Mar 02, 2023
Tianlong Chen, Zhenyu Zhang, Ajay Jaiswal, Shiwei Liu, Zhangyang Wang

M-L2O: Towards Generalizable Learning-to-Optimize by Test-Time Fast Self-Adaptation

Feb 28, 2023
Junjie Yang, Xuxi Chen, Tianlong Chen, Zhangyang Wang, Yingbin Liang
