Yuchen Liu

M$^3$-VQA: A Benchmark for Multimodal, Multi-Entity, Multi-Hop Visual Question Answering

Apr 28, 2026
Via arXiv

When AI reviews science: Can we trust the referee?

Apr 26, 2026
Via arXiv

RoboWM-Bench: A Benchmark for Evaluating World Models in Robotic Manipulation

Apr 21, 2026
Via arXiv

Weakly-supervised Learning for Physics-informed Neural Motion Planning via Sparse Roadmap

Apr 14, 2026
Via arXiv

ProUIE: A Macro-to-Micro Progressive Learning Method for LLM-based Universal Information Extraction

Apr 12, 2026
Via arXiv

FactReview: Evidence-Grounded Reviews with Literature Positioning and Execution-Based Claim Verification

Apr 07, 2026
Via arXiv

GISTBench: Evaluating LLM User Understanding via Evidence-Based Interest Verification

Mar 31, 2026
Via arXiv

Stepwise Credit Assignment for GRPO on Flow-Matching Models

Mar 30, 2026
Via arXiv

BiPreManip: Learning Affordance-Based Bimanual Preparatory Manipulation through Anticipatory Collaboration

Mar 23, 2026
Via arXiv

GUI-CEval: A Hierarchical and Comprehensive Chinese Benchmark for Mobile GUI Agents

Mar 16, 2026
Via arXiv