Sanjay Subramanian

University of Pennsylvania

AutoPresent: Designing Structured Visuals from Scratch

Jan 01, 2025

Using Language Models to Disambiguate Lexical Choices in Translation

Nov 08, 2024

Pose Priors from Language Models

May 06, 2024

TraveLER: A Multi-LMM Agent Framework for Video Question-Answering

Apr 01, 2024

Recursive Visual Programming

Dec 04, 2023

From Wrong To Right: A Recursive Approach Towards Vision-Language Explanation

Nov 21, 2023

Can Language Models Learn to Listen?

Aug 21, 2023

Modular Visual Question Answering via Code Generation

Jun 08, 2023

ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension

Apr 12, 2022

MedICaT: A Dataset of Medical Images, Captions, and Textual References

Oct 12, 2020