Daan de Geus

DONUT: A Decoder-Only Model for Trajectory Prediction

Jun 07, 2025

How Important are Videos for Training Video LLMs?

Jun 07, 2025

Your ViT is Secretly an Image Segmentation Model

Mar 24, 2025

DINO in the Room: Leveraging 2D Foundation Models for 3D Segmentation

Mar 24, 2025

2024 BRAVO Challenge Track 1 1st Place Report: Evaluating Robustness of Vision Foundation Models for Semantic Segmentation

Sep 25, 2024

Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think

Sep 17, 2024

Exploring the Benefits of Vision Foundation Models for Unsupervised Domain Adaptation

Jun 17, 2024

ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers

Jun 14, 2024

Task-aligned Part-aware Panoptic Segmentation through Joint Object-Part Representations

Jun 14, 2024

How to Benchmark Vision Foundation Models for Semantic Segmentation?

Apr 18, 2024