Yubo Ma

DLawBench: Evaluating LLMs Through Multi-Turn Legal Consultation

Jun 11, 2026

ARES: Automated Rubric Synthesis for Scalable LLM Reinforcement Learning

May 25, 2026

Unified Data Selection for LLM Reasoning

May 21, 2026

Qwen-Scope: Turning Sparse Features into Development Tools for Large Language Models

May 12, 2026

On Predicting the Post-training Potential of Pre-trained LLMs

May 12, 2026

SkillGraph: Skill-Augmented Reinforcement Learning for Agents via Evolving Skill Graphs

May 12, 2026

WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation

May 11, 2026

SketchFaceGS: Real-Time Sketch-Driven Face Editing and Generation with Gaussian Splatting

Apr 21, 2026

ClinConsensus: A Consensus-Based Benchmark for Evaluating Chinese Medical LLMs across Difficulty Levels

Mar 03, 2026

EMemBench: Interactive Benchmarking of Episodic Memory for VLM Agents

Jan 23, 2026