Sanjay Subramanian

University of Pennsylvania

Pose Priors from Language Models

May 06, 2024
Via arXiv

TraveLER: A Multi-LMM Agent Framework for Video Question-Answering

Apr 01, 2024
Via arXiv

Recursive Visual Programming

Dec 04, 2023
Via arXiv

From Wrong To Right: A Recursive Approach Towards Vision-Language Explanation

Nov 21, 2023
Via arXiv

Can Language Models Learn to Listen?

Aug 21, 2023
Via arXiv

Modular Visual Question Answering via Code Generation

Jun 08, 2023
Via arXiv

ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension

Apr 12, 2022
Via arXiv

MedICaT: A Dataset of Medical Images, Captions, and Textual References

Oct 12, 2020
Via arXiv

Latent Compositional Representations Improve Systematic Generalization in Grounded Question Answering

Jul 01, 2020
Via arXiv

Obtaining Faithful Interpretations from Compositional Neural Networks

May 02, 2020
Via arXiv