Jian-Fang Hu

CoopDiff: Anticipating 3D Human-object Interactions via Contact-consistent Decoupled Diffusion

Aug 10, 2025

Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation

May 19, 2025

PVUW 2025 Challenge Report: Advances in Pixel-level Understanding of Complex Videos in the Wild

Apr 15, 2025

Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks

Apr 02, 2025

ReferDINO-Plus: 2nd Solution for 4th PVUW MeViS Challenge at CVPR 2025

Mar 30, 2025

ViSpeak: Visual Instruction Feedback in Streaming Videos

Mar 17, 2025

Progressive Human Motion Generation Based on Text and Few Motion Frames

Mar 17, 2025

AdvAD: Exploring Non-Parametric Diffusion for Imperceptible Adversarial Attacks

Mar 12, 2025

ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations

Jan 24, 2025

SAUGE: Taming SAM for Uncertainty-Aligned Multi-Granularity Edge Detection

Dec 17, 2024