Pradyumna Narayana

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023
Via arXiv

KAFA: Rethinking Image Ad Understanding with Knowledge-Augmented Feature Adaptation of Vision-Language Models

May 28, 2023
Via arXiv

Discriminative Diffusion Models as Few-shot Vision and Language Learners

May 18, 2023
Via arXiv

MetaCLUE: Towards Comprehensive Visual Metaphors Research

Dec 19, 2022
Via arXiv

Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis

Dec 09, 2022
Via arXiv

CPL: Counterfactual Prompt Learning for Vision and Language Models

Oct 19, 2022
Via arXiv

Diagnosing Vision-and-Language Navigation: What Really Matters

Mar 30, 2021
Via arXiv

Towards Understanding Sample Variance in Visually Grounded Language Generation: Evaluations and Observations

Oct 07, 2020
Via arXiv

Leveraging Organizational Resources to Adapt Models to New Data Modalities

Aug 23, 2020
Via arXiv

Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation

Jul 01, 2020
Via arXiv