Paola Cascante-Bonilla

EgoGroups: A Benchmark For Detecting Social Groups of People in the Wild

Mar 23, 2026
Via arXiv

Do VLMs Need Vision Transformers? Evaluating State Space Models as Vision Encoders

Mar 19, 2026
Via arXiv

Can Hallucination Correction Improve Video-Language Alignment?

Feb 20, 2025
Via arXiv

Natural Language Inference Improves Compositionality in Vision-Language Models

Oct 29, 2024
Via arXiv

PropTest: Automatic Property Testing for Improved Visual Programming

Mar 25, 2024
Via arXiv

Learning from Models and Data for Visual Grounding

Mar 20, 2024
Via arXiv

Grounding Language Models for Visual Entity Recognition

Feb 28, 2024
Via arXiv

Improved Visual Grounding through Self-Consistent Explanations

Dec 07, 2023
Via arXiv

Going Beyond Nouns With Vision & Language Models Using Synthetic Data

Mar 30, 2023
Via arXiv

CODA-Prompt: COntinual Decomposed Attention-based Prompting for Rehearsal-Free Continual Learning

Nov 23, 2022
Via arXiv