Zhifeng Chen

Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning

Jan 28, 2022
Via arXiv

GLaM: Efficient Scaling of Language Models with Mixture-of-Experts

Dec 13, 2021
Via arXiv

BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition

Oct 01, 2021
Via arXiv

Scene Transformer: A unified multi-task model for behavior prediction and planning

Jun 15, 2021
Via arXiv

GSPMD: General and Scalable Parallelization for ML Computation Graphs

May 10, 2021
Via arXiv

3D-MAN: 3D Multi-frame Attention Network for Object Detection

Mar 30, 2021
Via arXiv

Scalable Scene Flow from Point Clouds in the Real World

Mar 15, 2021
Via arXiv

Pseudo-labeling for Scalable 3D Object Detection

Mar 02, 2021
Via arXiv

Computing Cliques and Cavities in Networks

Jan 03, 2021
Via arXiv

GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding

Jun 30, 2020
Via arXiv