Yu Wang

University of Oregon

AJ-Bench: Benchmarking Agent-as-a-Judge for Environment-Aware Evaluation

Apr 20, 2026
Via arXiv

From Reactive to Proactive: Assessing the Proactivity of Voice Agents via ProVoice-Bench

Apr 16, 2026
Via arXiv

The Fourth Challenge on Image Super-Resolution ($\times$4) at NTIRE 2026: Benchmark Results and Method Overview

Apr 16, 2026
Via arXiv

ReviewGrounder: Improving Review Substantiveness with Rubric-Guided, Tool-Integrated Agents

Apr 15, 2026
Via arXiv

Why Multimodal In-Context Learning Lags Behind? Unveiling the Inner Mechanisms and Bottlenecks

Apr 15, 2026
Via arXiv

CocoaBench: Evaluating Unified Digital Agents in the Wild

Apr 14, 2026
Via arXiv

Self-Correcting RAG: Enhancing Faithfulness via MMKP Context Selection and NLI-Guided MCTS

Apr 12, 2026
Via arXiv

The Second Challenge on Real-World Face Restoration at NTIRE 2026: Methods and Results

Apr 12, 2026
Via arXiv

RCBSF: A Multi-Agent Framework for Automated Contract Revision via Stackelberg Game

Apr 12, 2026
Via arXiv

A-MBER: Affective Memory Benchmark for Emotion Recognition

Apr 08, 2026
Via arXiv