Alert button

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Jan 11, 2024
Damai Dai, Chengqi Deng, Chenggang Zhao, R. X. Xu, Huazuo Gao, Deli Chen, Jiashi Li, Wangding Zeng, Xingkai Yu, Y. Wu, Zhenda Xie, Y. K. Li, Panpan Huang, Fuli Luo, Chong Ruan, Zhifang Sui, Wenfeng Liang

Share this with someone who'll enjoy it:

View paper onarxiv icon

Share this with someone who'll enjoy it: