Basura Fernando

CausalChaos! Dataset for Comprehensive Causal Action Question Answering Over Longer Causal Chains Grounded in Dynamic Visual Scenes

Apr 01, 2024

Zero Shot Open-ended Video Inference

Jan 23, 2024

Learning to Visually Connect Actions and their Effects

Jan 19, 2024

Motion Flow Matching for Human Motion Synthesis and Editing

Dec 14, 2023

Semi-supervised multimodal coreference resolution in image narrations

Oct 20, 2023

ClipSitu: Effectively Leveraging CLIP for Conditional Predictions in Situation Recognition

Jul 02, 2023

Revealing the Illusion of Joint Multimodal Understanding in VideoQA Models

Jun 15, 2023

Modelling Spatio-Temporal Interactions for Compositional Action Recognition

May 04, 2023

Fine-Grained Regional Prompt Tuning for Visual Abductive Reasoning

Mar 18, 2023

Who are you referring to? Weakly supervised coreference resolution with multimodal grounding

Nov 26, 2022