Yizhe Xiong

Parameter-Free Fine-tuning via Redundancy Elimination for Vision Foundation Models

Apr 11, 2025

DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs

Feb 18, 2025

Finedeep: Mitigating Sparse Activation in Dense LLMs via Multi-Layer Fine-Grained Experts

Feb 18, 2025

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Dec 30, 2024

Breaking the Stage Barrier: A Novel Single-Stage Approach to Long Context Extension for Large Language Models

Dec 10, 2024

LBPE: Long-token-first Tokenization to Improve Large Language Models

Nov 08, 2024

CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts

Oct 21, 2024

MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts

Jul 13, 2024

Scaffold-BPE: Enhancing Byte Pair Encoding with Simple and Effective Scaffold Token Removal

Apr 27, 2024

Temporal Scaling Law for Large Language Models

Apr 27, 2024