Yao Hu

Alibaba Group

RecLLM-R1: A Two-Stage Training Paradigm with Reinforcement Learning and Chain-of-Thought v1

Jun 24, 2025
Via arXiv

AgentGroupChat-V2: Divide-and-Conquer Is What LLM-Based Multi-Agent System Need

Jun 18, 2025
Via arXiv

Plan Your Travel and Travel with Your Plan: Wide-Horizon Planning and Evaluation via LLM

Jun 14, 2025
Via arXiv

Mind the Quote: Enabling Quotation-Aware Dialogue in LLMs via Plug-and-Play Modules

May 30, 2025
Via arXiv

Silencer: From Discovery to Mitigation of Self-Bias in LLM-as-Benchmark-Generator

May 27, 2025
Via arXiv

Progressive Scaling Visual Object Tracking

May 26, 2025
Via arXiv

MT$^{3}$: Scaling MLLM-based Text Image Machine Translation via Multi-Task Reinforcement Learning

May 26, 2025
Via arXiv

PaRT: Enhancing Proactive Social Chatbots with Personalized Real-Time Retrieval

Apr 29, 2025
Via arXiv

MT-R1-Zero: Advancing LLM-based Machine Translation via R1-Zero-like Reinforcement Learning

Apr 14, 2025
Via arXiv

SocioVerse: A World Model for Social Simulation Powered by LLM Agents and A Pool of 10 Million Real-World Users

Apr 14, 2025
Via arXiv