Wenhao Chai

Muddit: Liberating Generation Beyond Text-to-Image with a Unified Discrete Diffusion Model

May 29, 2025
Via arXiv

GAM-Agent: Game-Theoretic and Uncertainty-Aware Collaboration for Complex Visual Reasoning

May 29, 2025
Via arXiv

TEMPURA: Temporal Event Masked Prediction and Understanding for Reasoning in Action

May 02, 2025
Via arXiv

Video-MMLU: A Massive Multi-Discipline Lecture Understanding Benchmark

Apr 20, 2025
Via arXiv

Science-T2I: Addressing Scientific Illusions in Image Synthesis

Apr 17, 2025
Via arXiv

An Empirical Study of GPT-4o Image Generation Capabilities

Apr 08, 2025
Via arXiv

EMMOE: A Comprehensive Benchmark for Embodied Mobile Manipulation in Open Environments

Mar 11, 2025
Via arXiv

DiffPO: Diffusion-styled Preference Optimization for Efficient Inference-Time Alignment of Large Language Models

Mar 06, 2025
Via arXiv

Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think

Feb 27, 2025
Via arXiv

Pointmap Association and Piecewise-Plane Constraint for Consistent and Compact 3D Gaussian Segmentation Field

Feb 22, 2025
Via arXiv