Nicu Sebe

GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models

Aug 29, 2024
Via arXiv

Global-Local Distillation Network-Based Audio-Visual Speaker Tracking with Incomplete Modalities

Aug 26, 2024
Via arXiv

PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection

Aug 26, 2024
Via arXiv

Large Language Models for Multimodal Deformable Image Registration

Aug 20, 2024
Via arXiv

ShapeSplat: A Large-scale Dataset of Gaussian Splats and Their Self-Supervised Pretraining

Aug 20, 2024
Via arXiv

When Video Coding Meets Multimodal Large Language Models: A Unified Paradigm for Video Coding

Aug 15, 2024
Via arXiv

Masked Image Modeling: A Survey

Aug 13, 2024
Via arXiv

Towards End-to-End Explainable Facial Action Unit Recognition via Vision-Language Joint Learning

Aug 01, 2024
Via arXiv

Towards Localized Fine-Grained Control for Facial Expression Generation

Jul 25, 2024
Via arXiv

Any Image Restoration with Efficient Automatic Degradation Adaptation

Jul 18, 2024
Via arXiv