Bernard Ghanem

Can Video Diffusion Model Reconstruct 4D Geometry?

Mar 27, 2025 (via arXiv)

BOLT: Boost Large Vision-Language Model Without Training for Long-form Video Understanding

Mar 27, 2025 (via arXiv)

Structured-Noise Masked Modeling for Video, Audio and Beyond

Mar 20, 2025 (via arXiv)

DiffCLIP: Differential Attention Meets CLIP

Mar 09, 2025 (via arXiv)

TimeLoc: A Unified End-to-End Framework for Precise Timestamp Localization in Long Videos

Mar 09, 2025 (via arXiv)

OpenTAD: A Unified Framework and Comprehensive Study of Temporal Action Detection

Feb 27, 2025 (via arXiv)

Shh, don't say that! Domain Certification in LLMs

Feb 26, 2025 (via arXiv)

Optimizing Singular Spectrum for Large Language Model Compression

Feb 20, 2025 (via arXiv)

Continuous Knowledge-Preserving Decomposition for Few-Shot Continual Learning

Jan 09, 2025 (via arXiv)

EAGLE: Enhanced Visual Grounding Minimizes Hallucinations in Instructional Multimodal Models

Jan 06, 2025 (via arXiv)