Min-Hung Chen

Learning Skills from Action-Free Videos

Dec 23, 2025
Via arXiv

4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation

Dec 22, 2025
Via arXiv

Zoom-Zero: Reinforced Coarse-to-Fine Video Understanding via Temporal Zoom-in

Dec 16, 2025
Via arXiv

VADER: Towards Causal Video Anomaly Understanding with Relation-Aware Large Language Models

Nov 10, 2025
Via arXiv

TC-LoRA: Temporally Modulated Conditional LoRA for Adaptive Diffusion Control

Oct 10, 2025
Via arXiv

Temporal Prompting Matters: Rethinking Referring Video Object Segmentation

Oct 08, 2025
Via arXiv

Autoregressive Universal Video Segmentation Model

Aug 26, 2025
Via arXiv

MovieCORE: COgnitive REasoning in Movies

Aug 26, 2025
Via arXiv

LongSplat: Robust Unposed 3D Gaussian Splatting for Casual Long Videos

Aug 19, 2025
Via arXiv

ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning

Jul 22, 2025
Via arXiv