Weilin Zhao

MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies

Apr 09, 2024
Shengding Hu, Yuge Tu, Xu Han, Chaoqun He, Ganqu Cui, Xiang Long, Zhi Zheng, Yewei Fang, Yuxiang Huang, Weilin Zhao, Xinrong Zhang, Zheng Leng Thai, Kaihuo Zhang, Chongyi Wang, Yuan Yao, Chenyang Zhao, Jie Zhou, Jie Cai, Zhongwu Zhai, Ning Ding, Chao Jia, Guoyang Zeng, Dahai Li, Zhiyuan Liu, Maosong Sun

Mastering Text, Code and Math Simultaneously via Fusing Highly Specialized Language Models

Mar 18, 2024
Ning Ding, Yulin Chen, Ganqu Cui, Xingtai Lv, Weilin Zhao, Ruobing Xie, Bowen Zhou, Zhiyuan Liu, Maosong Sun

BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences

Mar 14, 2024
Sun Ao, Weilin Zhao, Xu Han, Cheng Yang, Zhiyuan Liu, Chuan Shi, Maosong Sun, Shengnan Wang, Teng Su

Ouroboros: Speculative Decoding with Large Model Enhanced Drafting

Feb 21, 2024
Weilin Zhao, Yuxiang Huang, Xu Han, Chaojun Xiao, Zhiyuan Liu, Maosong Sun

Unlock Predictable Scaling from Emergent Abilities

Oct 05, 2023
Shengding Hu, Xin Liu, Xu Han, Xinrong Zhang, Chaoqun He, Weilin Zhao, Yankai Lin, Ning Ding, Zebin Ou, Guoyang Zeng, Zhiyuan Liu, Maosong Sun

CPET: Effective Parameter-Efficient Tuning for Compressed Large Language Models

Jul 15, 2023
Weilin Zhao, Yuxiang Huang, Xu Han, Zhiyuan Liu, Zhengyan Zhang, Maosong Sun

OpenDelta: A Plug-and-play Library for Parameter-efficient Adaptation of Pre-trained Models

Jul 05, 2023
Shengding Hu, Ning Ding, Weilin Zhao, Xingtai Lv, Zhen Zhang, Zhiyuan Liu, Maosong Sun

Tool Learning with Foundation Models

Apr 17, 2023
Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Yufei Huang, Chaojun Xiao, Chi Han, Yi Ren Fung, Yusheng Su, Huadong Wang, Cheng Qian, Runchu Tian, Kunlun Zhu, Shihao Liang, Xingyu Shen, Bokai Xu, Zhen Zhang, Yining Ye, Bowen Li, Ziwei Tang, Jing Yi, Yuzhang Zhu, Zhenning Dai, Lan Yan, Xin Cong, Yaxi Lu, Weilin Zhao, Yuxiang Huang, Junxi Yan, Xu Han, Xian Sun, Dahai Li, Jason Phang, Cheng Yang, Tongshuang Wu, Heng Ji, Zhiyuan Liu, Maosong Sun

A Roadmap for Big Model

Apr 02, 2022
Sha Yuan, Hanyu Zhao, Shuai Zhao, Jiahong Leng, Yangxiao Liang, Xiaozhi Wang, Jifan Yu, Xin Lv, Zhou Shao, Jiaao He, Yankai Lin, Xu Han, Zhenghao Liu, Ning Ding, Yongming Rao, Yizhao Gao, Liang Zhang, Ming Ding, Cong Fang, Yisen Wang, Mingsheng Long, Jing Zhang, Yinpeng Dong, Tianyu Pang, Peng Cui, Lingxiao Huang, Zheng Liang, Huawei Shen, Hui Zhang, Quanshi Zhang, Qingxiu Dong, Zhixing Tan, Mingxuan Wang, Shuo Wang, Long Zhou, Haoran Li, Junwei Bao, Yingwei Pan, Weinan Zhang, Zhou Yu, Rui Yan, Chence Shi, Minghao Xu, Zuobai Zhang, Guoqiang Wang, Xiang Pan, Mengjie Li, Xiaoyu Chu, Zijun Yao, Fangwei Zhu, Shulin Cao, Weicheng Xue, Zixuan Ma, Zhengyan Zhang, Shengding Hu, Yujia Qin, Chaojun Xiao, Zheni Zeng, Ganqu Cui, Weize Chen, Weilin Zhao, Yuan Yao, Peng Li, Wenzhao Zheng, Wenliang Zhao, Ziyi Wang, Borui Zhang, Nanyi Fei, Anwen Hu, Zenan Ling, Haoyang Li, Boxi Cao, Xianpei Han, Weidong Zhan, Baobao Chang, Hao Sun, Jiawen Deng, Chujie Zheng, Juanzi Li, Lei Hou, Xigang Cao, Jidong Zhai, Zhiyuan Liu, Maosong Sun, Jiwen Lu, Zhiwu Lu, Qin Jin, Ruihua Song, Ji-Rong Wen, Zhouchen Lin, Liwei Wang, Hang Su, Jun Zhu, Zhifang Sui, Jiajun Zhang, Yang Liu, Xiaodong He, Minlie Huang, Jian Tang, Jie Tang

Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models

Mar 15, 2022
Ning Ding, Yujia Qin, Guang Yang, Fuchao Wei, Zonghan Yang, Yusheng Su, Shengding Hu, Yulin Chen, Chi-Min Chan, Weize Chen, Jing Yi, Weilin Zhao, Xiaozhi Wang, Zhiyuan Liu, Hai-Tao Zheng, Jianfei Chen, Yang Liu, Jie Tang, Juanzi Li, Maosong Sun
