
Chunhua Shen

The University of Adelaide

Revisiting Depth Representations for Feed-Forward 3D Gaussian Splatting

Jun 05, 2025
Via arXiv

Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO

May 27, 2025
Via arXiv

Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration

May 26, 2025
Via arXiv

POMATO: Marrying Pointmap Matching with Temporal Motion for Dynamic 3D Reconstruction

Apr 08, 2025
Via arXiv

Aether: Geometric-Aware Unified World Modeling

Mar 25, 2025
Via arXiv

SegAgent: Exploring Pixel Understanding Capabilities in MLLMs by Imitating Human Annotator Trajectories

Mar 11, 2025
Via arXiv

PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training

Mar 09, 2025
Via arXiv

DICEPTION: A Generalist Diffusion Model for Visual Perceptual Tasks

Feb 25, 2025
Via arXiv

Revisiting Convolution Architecture in the Realm of DNA Foundation Models

Feb 25, 2025
Via arXiv

A Physical Coherence Benchmark for Evaluating Video Generation Models via Optical Flow-guided Frame Prediction

Feb 08, 2025
Via arXiv