Xiaoyu Tan

INF Technology

Youtu-Agent: Scaling Agent Productivity with Automated Generation and Hybrid Policy Optimization

Dec 31, 2025
Via arXiv

Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models

Dec 31, 2025
Via arXiv

SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents

Dec 26, 2025
Via arXiv

Learn the Ropes, Then Trust the Wins: Self-imitation with Progressive Exploration for Agentic Reinforcement Learning

Sep 26, 2025
Via arXiv

The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward

Sep 09, 2025
Via arXiv

Guess What I am Thinking: A Benchmark for Inner Thought Reasoning of Role-Playing Language Agents

Mar 11, 2025
Via arXiv

AURORA: Automated Training Framework of Universal Process Reward Models via Ensemble Prompting and Reverse Verification

Feb 17, 2025
Via arXiv

Refine Knowledge of Large Language Models via Adaptive Contrastive Learning

Feb 11, 2025
Via arXiv

SCP-116K: A High-Quality Problem-Solution Dataset and a Generalized Pipeline for Automated Extraction in the Higher Education Science Domain

Jan 26, 2025
Via arXiv

An Attentive Dual-Encoder Framework Leveraging Multimodal Visual and Semantic Information for Automatic OSAHS Diagnosis

Dec 25, 2024
Via arXiv