Louis Feng

Towards Universal Performance Modeling for Machine Learning Training on Multi-GPU Platforms

Apr 19, 2024
Zhongyi Lin, Ning Sun, Pallab Bhattacharya, Xizhou Feng, Louis Feng, John D. Owens

Via arXiv

Chakra: Advancing Performance Benchmarking and Co-design using Standardized Execution Traces

May 26, 2023
Srinivas Sridharan, Taekyung Heo, Louis Feng, Zhaodong Wang, Matt Bergeron, Wenyin Fu, Shengbao Zheng, Brian Coutinho, Saeed Rashidi, Changhai Man, Tushar Krishna

Via arXiv

Pre-train and Search: Efficient Embedding Table Sharding with Pre-trained Neural Cost Models

May 03, 2023
Daochen Zha, Louis Feng, Liang Luo, Bhargav Bhushanam, Zirui Liu, Yusuo Hu, Jade Nie, Yuzhen Huang, Yuandong Tian, Arun Kejariwal, Xia Hu

Via arXiv

Mystique: Accurate and Scalable Production AI Benchmarks Generation

Dec 16, 2022
Mingyu Liang, Wenyin Fu, Louis Feng, Zhongyi Lin, Pavani Panakanti, Srinivas Sridharan, Christina Delimitrou

Via arXiv

DreamShard: Generalizable Embedding Table Placement for Recommender Systems

Oct 05, 2022
Daochen Zha, Louis Feng, Qiaoyu Tan, Zirui Liu, Kwei-Herng Lai, Bhargav Bhushanam, Yuandong Tian, Arun Kejariwal, Xia Hu

Via arXiv

AutoShard: Automated Embedding Table Sharding for Recommender Systems

Aug 12, 2022
Daochen Zha, Louis Feng, Bhargav Bhushanam, Dhruv Choudhary, Jade Nie, Yuandong Tian, Jay Chae, Yinbin Ma, Arun Kejariwal, Xia Hu

Via arXiv

Building a Performance Model for Deep Learning Recommendation Model Training on GPUs

Jan 19, 2022
Zhongyi Lin, Louis Feng, Ehsan K. Ardestani, Jaewon Lee, John Lundell, Changkyu Kim, Arun Kejariwal, John D. Owens

Via arXiv

Alternate Model Growth and Pruning for Efficient Training of Recommendation Systems

May 04, 2021
Xiaocong Du, Bhargav Bhushanam, Jiecao Yu, Dhruv Choudhary, Tianxiang Gao, Sherman Wong, Louis Feng, Jongsoo Park, Yu Cao, Arun Kejariwal

Via arXiv