Qingyu Yin

Shopping Reasoning Bench: An Expert-Authored Benchmark for Multi-Turn Conversational Shopping Assistants

Jun 10, 2026, via arXiv

On the Geometry of On-Policy Distillation

Jun 05, 2026, via arXiv

Unlocking Latent Value: Taxonomy-Guided Recovery of High-Performing Data from Low-Tier Web Corpora

Jun 05, 2026, via arXiv

QUBRIC: Co-Designing Queries and Rubrics for RL Beyond Verifiable Rewards

Jun 02, 2026, via arXiv

Controllable and Verifiable Tool-Use Data Synthesis for Agentic Reinforcement Learning

Apr 10, 2026, via arXiv

JoyAI-LLM Flash: Advancing Mid-Scale LLMs with Token Efficiency

Apr 03, 2026, via arXiv

Training LLMs for Multi-Step Tool Orchestration with Constrained Data Synthesis and Graduated Rewards

Mar 25, 2026, via arXiv

HeaPA: Difficulty-Aware Heap Sampling and On-Policy Query Augmentation for LLM Reinforcement Learning

Jan 30, 2026, via arXiv

Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models

Jan 29, 2026, via arXiv

Finding RELIEF: Shaping Reasoning Behavior without Reasoning Supervision via Belief Engineering

Jan 20, 2026, via arXiv