Lexiang Tang

Youtu-VL: Unleashing Visual Potential via Unified Vision-Language Supervision

Jan 27, 2026
Via arXiv

GIFT: Unlocking Global Optimality in Post-Training via Finite-Temperature Gibbs Initialization

Jan 14, 2026
Via arXiv

Leash: Adaptive Length Penalty and Reward Shaping for Efficient Large Reasoning Model

Dec 25, 2025
Via arXiv

Not All Tokens and Heads Are Equally Important: Dual-Level Attention Intervention for Hallucination Mitigation

Jun 14, 2025
Via arXiv

Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions

Jun 09, 2025
Via arXiv