Dandan Tu

Huawei Technologies Co., Ltd., Beijing, China

SeePhys Pro: Diagnosing Modality Transfer and Blind-Training Effects in Multimodal RLVR for Physics Reasoning

May 10, 2026
Via arXiv

Culture-Aware Machine Translation in Large Language Models: Benchmarking and Investigation

Apr 27, 2026
Via arXiv

Visual Preference Optimization with Rubric Rewards

Apr 14, 2026
Via arXiv

Schema-Aware Planning and Hybrid Knowledge Toolset for Reliable Knowledge Graph Triple Verification

Apr 05, 2026
Via arXiv

Not All Tokens See Equally: Perception-Grounded Policy Optimization for Large Vision-Language Models

Apr 02, 2026
Via arXiv

CoVe: Training Interactive Tool-Use Agents via Constraint-Guided Verification

Mar 02, 2026
Via arXiv

Scene-Aware Memory Discrimination: Deciding Which Personal Knowledge Stays

Feb 12, 2026
Via arXiv

CLI-Gym: Scalable CLI Task Generation via Agentic Environment Inversion

Feb 11, 2026
Via arXiv

FeatureBench: Benchmarking Agentic Coding for Complex Feature Development

Feb 11, 2026
Via arXiv

Precision over Diversity: High-Precision Reward Generalizes to Robust Instruction Following

Jan 08, 2026
Via arXiv