Nicu Sebe

H$_{2}$OT: Hierarchical Hourglass Tokenizer for Efficient Video Pose Transformers

Sep 08, 2025

Organ-Agents: Virtual Human Physiology Simulator via LLMs

Aug 20, 2025

Masked Clustering Prediction for Unsupervised Point Cloud Pre-training

Aug 12, 2025

Hierarchical Visual Prompt Learning for Continual Video Instance Segmentation

Aug 12, 2025

Spatial-Temporal Graph Mamba for Music-Guided Dance Video Synthesis

Jul 09, 2025

Orthogonal Projection Subspace to Aggregate Online Prior-knowledge for Continual Test-time Adaptation

Jun 23, 2025

SceneSplat++: A Large Dataset and Comprehensive Benchmark for Language Gaussian Splatting

Jun 10, 2025

When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding

Jun 05, 2025

VidText: Towards Comprehensive Evaluation for Video Text Understanding

May 28, 2025

Inverse Virtual Try-On: Generating Multi-Category Product-Style Images from Clothed Individuals

May 27, 2025