Zangwei Zheng

DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers

Mar 15, 2024
Xuanlei Zhao, Shenggan Cheng, Zangwei Zheng, Zheming Yang, Ziming Liu, Yang You

Via arXiv

Helen: Optimizing CTR Prediction Models with Frequency-wise Hessian Eigenvalue Regularization

Feb 23, 2024
Zirui Zhu, Yong Liu, Zangwei Zheng, Huifeng Guo, Yang You

Via arXiv

OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models

Jan 29, 2024
Fuzhao Xue, Zian Zheng, Yao Fu, Jinjie Ni, Zangwei Zheng, Wangchunshu Zhou, Yang You

Via arXiv

CAME: Confidence-guided Adaptive Memory Efficient Optimization

Jul 05, 2023
Yang Luo, Xiaozhe Ren, Zangwei Zheng, Zhuo Jiang, Xin Jiang, Yang You

Via arXiv

To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis

May 22, 2023
Fuzhao Xue, Yao Fu, Wangchunshu Zhou, Zangwei Zheng, Yang You

Via arXiv

Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline

May 22, 2023
Zangwei Zheng, Xiaozhe Ren, Fuzhao Xue, Yang Luo, Xin Jiang, Yang You

Via arXiv

Preventing Zero-Shot Transfer Degradation in Continual Learning of Vision-Language Models

Mar 12, 2023
Zangwei Zheng, Mingyuan Ma, Kai Wang, Ziheng Qin, Xiangyu Yue, Yang You

Via arXiv

InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning

Mar 08, 2023
Ziheng Qin, Kai Wang, Zangwei Zheng, Jianyang Gu, Xiangyu Peng, Daquan Zhou, Yang You

Via arXiv

Prompt Vision Transformer for Domain Generalization

Aug 18, 2022
Zangwei Zheng, Xiangyu Yue, Kai Wang, Yang You

Via arXiv

Deeper vs Wider: A Revisit of Transformer Configuration

May 24, 2022
Fuzhao Xue, Jianghai Chen, Aixin Sun, Xiaozhe Ren, Zangwei Zheng, Xiaoxin He, Xin Jiang, Yang You

Via arXiv