Yan Gao

UnityMAS-O: A General RL Optimization Framework for LLM-Based Multi-Agent Systems

May 26, 2026, via arXiv

Tournament-GRPO: Group-Wise Tournament Rewards for Reinforcement Learning in Open-Ended Long-Form Generation

May 26, 2026, via arXiv

Knowledge-Graph Paths as Intermediate Supervision for Self-Evolving Search Agents

May 07, 2026, via arXiv

SPARD: Self-Paced Curriculum for RL Alignment via Integrating Reward Dynamics and Data Utility

Apr 09, 2026, via arXiv

PRAISE: Prefix-Based Rollout Reuse in Agentic Search Training

Apr 04, 2026, via arXiv

PRCCF: A Persona-guided Retrieval and Causal-aware Cognitive Filtering Framework for Emotional Support Conversation

Apr 02, 2026, via arXiv

Adaptive Federated Fine-Tuning of Self-Supervised Speech Representations

Mar 23, 2026, via arXiv

Logics-Parsing-Omni Technical Report

Mar 12, 2026, via arXiv

Aligning Large Language Models with Searcher Preferences

Mar 11, 2026, via arXiv

STRUCTUREDAGENT: Planning with AND/OR Trees for Long-Horizon Web Tasks

Mar 05, 2026, via arXiv