Yubo Ma

ARES: Automated Rubric Synthesis for Scalable LLM Reinforcement Learning

May 25, 2026
Via arXiv

Unified Data Selection for LLM Reasoning

May 21, 2026
Via arXiv

On Predicting the Post-training Potential of Pre-trained LLMs

May 12, 2026
Via arXiv

Qwen-Scope: Turning Sparse Features into Development Tools for Large Language Models

May 12, 2026
Via arXiv

SkillGraph: Skill-Augmented Reinforcement Learning for Agents via Evolving Skill Graphs

May 12, 2026
Via arXiv

WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation

May 11, 2026
Via arXiv

SketchFaceGS: Real-Time Sketch-Driven Face Editing and Generation with Gaussian Splatting

Apr 21, 2026
Via arXiv

ClinConsensus: A Consensus-Based Benchmark for Evaluating Chinese Medical LLMs across Difficulty Levels

Mar 03, 2026
Via arXiv

EMemBench: Interactive Benchmarking of Episodic Memory for VLM Agents

Jan 23, 2026
Via arXiv

PLawBench: A Rubric-Based Benchmark for Evaluating LLMs in Real-World Legal Practice

Jan 23, 2026
Via arXiv