Daan de Geus

How Important are Videos for Training Video LLMs?

Jun 07, 2025
Via arXiv

DONUT: A Decoder-Only Model for Trajectory Prediction

Jun 07, 2025
Via arXiv

DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation

Mar 24, 2025
Via arXiv

Your ViT is Secretly an Image Segmentation Model

Mar 24, 2025
Via arXiv

2024 BRAVO Challenge Track 1 1st Place Report: Evaluating Robustness of Vision Foundation Models for Semantic Segmentation

Sep 25, 2024
Via arXiv

Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Sep 17, 2024
Via arXiv

Exploring the Benefits of Vision Foundation Models for Unsupervised Domain Adaptation

Jun 17, 2024
Via arXiv

ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers

Jun 14, 2024
Via arXiv

Task-aligned Part-aware Panoptic Segmentation through Joint Object-Part Representations

Jun 14, 2024
Via arXiv

How to Benchmark Vision Foundation Models for Semantic Segmentation?

Apr 18, 2024
Via arXiv