Yanli Zhao

Wukong: Towards a Scaling Law for Large-Scale Recommendation

Mar 08, 2024
Buyun Zhang, Liang Luo, Yuxin Chen, Jade Nie, Xi Liu, Daifeng Guo, Yanli Zhao, Shen Li, Yuchen Hao, Yantao Yao, Guna Lakshminarayanan, Ellie Dingqiao Wen, Jongsoo Park, Maxim Naumov, Wenlin Chen

Via arXiv

Disaggregated Multi-Tower: Topology-aware Modeling Technique for Efficient Large-Scale Recommendation

Mar 07, 2024
Liang Luo, Buyun Zhang, Michael Tsang, Yinbin Ma, Ching-Hsiang Chu, Yuxin Chen, Shen Li, Yuchen Hao, Yanli Zhao, Guna Lakshminarayanan, Ellie Dingqiao Wen, Jongsoo Park, Dheevatsa Mudigere, Maxim Naumov

Via arXiv

PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel

Apr 21, 2023
Yanli Zhao, Andrew Gu, Rohan Varma, Liang Luo, Chien-Chin Huang, Min Xu, Less Wright, Hamid Shojanazeri, Myle Ott, Sam Shleifer, Alban Desmaison, Can Balioglu, Bernard Nguyen, Geeta Chauhan, Yuchen Hao, Shen Li

Via arXiv

PyTorch Distributed: Experiences on Accelerating Data Parallel Training

Jun 28, 2020
Shen Li, Yanli Zhao, Rohan Varma, Omkar Salpekar, Pieter Noordhuis, Teng Li, Adam Paszke, Jeff Smith, Brian Vaughan, Pritam Damania, Soumith Chintala

Via arXiv