Shruti Palaskar

Multimodal Large Language Models with Fusion Low Rank Adaptation for Device Directed Speech Detection

Jun 13, 2024
Via arXiv

On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization

May 24, 2022
Via arXiv

Speech Summarization using Restricted Self-Attention

Oct 12, 2021
Via arXiv

How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language

Aug 18, 2020
Via arXiv

ASR Error Correction and Domain Adaptation Using Machine Translation

Mar 13, 2020
Via arXiv

Multimodal Abstractive Summarization for How2 Videos

Jun 19, 2019
Via arXiv

Learned In Speech Recognition: Contextual Acoustic Word Embeddings

Feb 18, 2019
Via arXiv

Learning from Multiview Correlations in Open-Domain Videos

Nov 21, 2018
Via arXiv

Multimodal Grounding for Sequence-to-Sequence Speech Recognition

Nov 09, 2018
Via arXiv

How2: A Large-scale Dataset for Multimodal Language Understanding

Nov 01, 2018
Via arXiv