Benno Krojer

A Shortcut-aware Video-QA Benchmark for Physical Understanding via Minimal Video Pairs

Jun 11, 2025
Via arXiv

Learning Action and Reasoning-Centric Image Editing from Videos and Simulations

Jul 03, 2024
Via arXiv

Improving Automatic VQA Evaluation Using Large Language Models

Oct 04, 2023
Via arXiv

Pragmatic Inference with a CLIP Listener for Contrastive Captioning

Jun 15, 2023
Via arXiv

Are Diffusion Models Vision-And-Language Reasoners?

May 25, 2023
Via arXiv

Image Retrieval from Contextual Descriptions

Mar 29, 2022
Via arXiv