Chaoyou Fu

SpeechParaling-Bench: A Comprehensive Benchmark for Paralinguistic-Aware Speech Generation

Apr 22, 2026
Via arXiv

Tango: Taming Visual Signals for Efficient Video Large Language Models

Apr 13, 2026
Via arXiv

ActFER: Agentic Facial Expression Recognition via Active Tool-Augmented Visual Reasoning

Apr 10, 2026
Via arXiv

Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding

Apr 06, 2026
Via arXiv

Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence?

Apr 03, 2026
Via arXiv

Benchmarking PhD-Level Coding in 3D Geometric Computer Vision

Mar 31, 2026
Via arXiv

VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding

Mar 23, 2026
Via arXiv

MAC: A Conversion Rate Prediction Benchmark Featuring Labels Under Multiple Attribution Mechanisms

Mar 02, 2026
Via arXiv

BABE: Biology Arena BEnchmark

Feb 05, 2026
Via arXiv

VITA-VLA: Efficiently Teaching Vision-Language Models to Act via Action Expert Distillation

Oct 10, 2025
Via arXiv