Yunhe Wang and Other Contributors

Towards Efficient Agents: A Co-Design of Inference Architecture and System

Dec 20, 2025

SCOPE: Prompt Evolution for Enhancing Agent Effectiveness

Dec 17, 2025

VersatileFFN: Achieving Parameter Efficiency in LLMs via Adaptive Wide-and-Deep Reuse

Dec 16, 2025

Revealing the Power of Post-Training for Small Language Models via Knowledge Distillation

Sep 30, 2025

Rethinking 1-bit Optimization Leveraging Pre-trained Large Language Models

Aug 09, 2025

EAQuant: Enhancing Post-Training Quantization for MoE Models via Expert-Aware Optimization

Jun 16, 2025

Pangu DeepDiver: Adaptive Search Intensity Scaling via Open-Web Reinforcement Learning

May 30, 2025

Pangu Embedded: An Efficient Dual-system LLM Reasoner with Metacognition

May 29, 2025

SlimLLM: Accurate Structured Pruning for Large Language Models

May 28, 2025

Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity

May 28, 2025