Wenxiang Jiao

TourPlanner: A Competitive Consensus Framework with Constraint-Gated Reinforcement Learning for Travel Planning

Jan 08, 2026

Agent2World: Learning to Generate Symbolic World Models via Adaptive Multi-Agent Feedback

Dec 26, 2025

LoopTool: Closing the Data-Training Loop for Robust LLM Tool Calls

Nov 18, 2025

Fints: Efficient Inference-Time Personalization for LLMs with Fine-Grained Instance-Tailored Steering

Oct 31, 2025

DeepCompress: A Dual Reward Strategy for Dynamically Exploring and Compressing Reasoning Chains

Oct 31, 2025

DeepAgent: A General Reasoning Agent with Scalable Toolsets

Oct 24, 2025

Curing Miracle Steps in LLM Mathematical Reasoning with Rubric Rewards

Oct 09, 2025

REA-RL: Reflection-Aware Online Reinforcement Learning for Efficient Large Reasoning Models

May 26, 2025

Towards Evaluating Proactive Risk Awareness of Multimodal Language Models

May 23, 2025

Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought

May 21, 2025