Duyu Tang

Not All Models Suit Expert Offloading: On Local Routing Consistency of Mixture-of-Expert Models

May 21, 2025
Via arXiv

Is PRM Necessary? Problem-Solving RL Implicitly Induces PRM Capability in LLMs

May 16, 2025
Via arXiv

Pangu Ultra: Pushing the Limits of Dense Large Language Models on Ascend NPUs

Apr 10, 2025
Via arXiv

Mixture of Lookup Experts

Mar 20, 2025
Via arXiv

Enhancing Non-English Capabilities of English-Centric Large Language Models through Deep Supervision Fine-Tuning

Mar 05, 2025
Via arXiv

Unshackling Context Length: An Efficient Selective Attention Approach through Query-Key Compression

Feb 20, 2025
Via arXiv

Ensuring Consistency for In-Image Translation

Dec 24, 2024
Via arXiv

Cross-Lingual Text-Rich Visual Comprehension: An Information Theory Perspective

Dec 23, 2024
Via arXiv

XTransplant: A Probe into the Upper Bound Performance of Multilingual Capability and Culture Adaptability in LLMs via Mutual Cross-lingual Feed-forward Transplantation

Dec 17, 2024
Via arXiv

ToolACE: Winning the Points of LLM Function Calling

Sep 02, 2024
Via arXiv