Bonan Ding

Not All Modalities Are Equal: Instruction-Aware Gating for Multimodal Videos

May 25, 2026
Via arXiv

SSLFusion: Scale & Space Aligned Latent Fusion Model for Multimodal 3D Object Detection

Apr 07, 2025
Via arXiv

VFMM3D: Releasing the Potential of Image by Vision Foundation Model for Monocular 3D Object Detection

Apr 15, 2024
Via arXiv