Chenyan Xiong

Microsoft Research

Benchmark Test-Time Scaling of General LLM Agents

Feb 22, 2026
Via arXiv

Agentic Search in the Wild: Intents and Trajectory Dynamics from 14M+ Real Search Requests

Jan 24, 2026
Via arXiv

ORBIT -- Open Recommendation Benchmark for Reproducible Research with Hidden Tests

Oct 30, 2025
Via arXiv

AutoRule: Reasoning Chain-of-thought Extracted Rule-based Rewards Improve Preference Learning

Jun 18, 2025
Via arXiv

Semi-structured LLM Reasoners Can Be Rigorously Audited

May 30, 2025
Via arXiv

ConsRec: Denoising Sequential Recommendation through User-Consistent Preference Modeling

May 28, 2025
Via arXiv

FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models

May 26, 2025
Via arXiv

DeepResearchGym: A Free, Transparent, and Reproducible Evaluation Sandbox for Deep Research

May 25, 2025
Via arXiv

Aligning Web Query Generation with Ranking Objectives via Direct Preference Optimization

May 25, 2025
Via arXiv

PIP-KAG: Mitigating Knowledge Conflicts in Knowledge-Augmented Generation via Parametric Pruning

Feb 21, 2025
Via arXiv