Wei-Shi Zheng

Chain of Methodologies: Scaling Test Time Computation without Training

Jun 08, 2025

Reinforcing Video Reasoning with Focused Thinking

May 30, 2025

Long-RVOS: A Comprehensive Benchmark for Long-term Referring Video Object Segmentation

May 19, 2025

ActionArt: Advancing Multimodal Large Models for Fine-Grained Human-Centric Video Understanding

Apr 25, 2025

PVUW 2025 Challenge Report: Advances in Pixel-level Understanding of Complex Videos in the Wild

Apr 15, 2025

Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks

Apr 02, 2025

Decoupled Distillation to Erase: A General Unlearning Method for Any Class-centric Tasks

Mar 31, 2025

ReferDINO-Plus: 2nd Solution for 4th PVUW MeViS Challenge at CVPR 2025

Mar 30, 2025

Efficient Explicit Joint-level Interaction Modeling with Mamba for Text-guided HOI Generation

Mar 29, 2025

VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness

Mar 27, 2025