Jiangang Luo

Layer-adaptive Expert Pruning for Pre-Training of Mixture-of-Experts Large Language Models

Jan 20, 2026
Via arXiv

Yuan3.0 Flash: An Open Multimodal Large Language Model for Enterprise Applications

Jan 05, 2026
Via arXiv

Yuan 2.0-M32: Mixture of Experts with Attention Router

May 29, 2024
Via arXiv

YUAN 2.0: A Large Language Model with Localized Filtering-based Attention

Dec 04, 2023
Via arXiv

Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning

Oct 12, 2021
Via arXiv