Zhenpeng Su

GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment

May 19, 2026
Via arXiv

Good Reasoning Makes Good Demonstrations: Implicit Reasoning Quality Supervision via In-Context Reinforcement Learning

Mar 10, 2026
Via arXiv

Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization

Aug 12, 2025
Via arXiv

LightRetriever: A LLM-based Hybrid Retrieval Architecture with 1000x Faster Query Inference

May 18, 2025
Via arXiv

Finedeep: Mitigating Sparse Activation in Dense LLMs via Multi-Layer Fine-Grained Experts

Feb 18, 2025
Via arXiv

DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs

Feb 18, 2025
Via arXiv

CartesianMoE: Boosting Knowledge Sharing among Experts via Cartesian Product Routing in Mixture-of-Experts

Oct 21, 2024
Via arXiv

Task-level Distributionally Robust Optimization for Large Language Model-based Dense Retrieval

Aug 20, 2024
Via arXiv

MaskMoE: Boosting Token-Level Learning via Routing Mask in Mixture-of-Experts

Jul 13, 2024
Via arXiv

Scaffold-BPE: Enhancing Byte Pair Encoding with Simple and Effective Scaffold Token Removal

Apr 27, 2024
Via arXiv