Dinesh Manocha

Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time

Jul 01, 2024
Via arXiv

Speech2UnifiedExpressions: Synchronous Synthesis of Co-Speech Affective Face and Body Expressions from Affordable Inputs

Jun 26, 2024
Via arXiv

IntCoOp: Interpretability-Aware Vision-Language Prompt Tuning

Jun 19, 2024
Via arXiv

Embodied Question Answering via Multi-LLM Systems

Jun 18, 2024
Via arXiv

GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities

Jun 17, 2024
Via arXiv

AUTOHALLUSION: Automatic Generation of Hallucination Benchmarks for Vision-Language Models

Jun 16, 2024
Via arXiv

MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models

Jun 07, 2024
Via arXiv

ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions

Jun 06, 2024
Via arXiv

LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition

Jun 06, 2024
Via arXiv

Transfer Q Star: Principled Decoding for LLM Alignment

May 30, 2024
Via arXiv