Alert button
Picture for Siddharth Dalmia

Siddharth Dalmia

Alert button

Transforming LLMs into Cross-modal and Cross-lingual Retrieval Systems

Add code
Bookmark button
Alert button
Apr 04, 2024
Frank Palma Gomez, Ramon Sanabria, Yun-hsuan Sung, Daniel Cer, Siddharth Dalmia, Gustavo Hernandez Abrego

Viaarxiv icon

LLM Augmented LLMs: Expanding Capabilities through Composition

Add code
Bookmark button
Alert button
Jan 04, 2024
Rachit Bansal, Bidisha Samanta, Siddharth Dalmia, Nitish Gupta, Shikhar Vashishth, Sriram Ganapathy, Abhishek Bapna, Prateek Jain, Partha Talukdar

Viaarxiv icon

Multimodal Modeling For Spoken Language Identification

Add code
Bookmark button
Alert button
Sep 19, 2023
Shikhar Bharadwaj, Min Ma, Shikhar Vashishth, Ankur Bapna, Sriram Ganapathy, Vera Axelrod, Siddharth Dalmia, Wei Han, Yu Zhang, Daan van Esch, Sandy Ritchie, Partha Talukdar, Jason Riesa

Figure 1 for Multimodal Modeling For Spoken Language Identification
Figure 2 for Multimodal Modeling For Spoken Language Identification
Figure 3 for Multimodal Modeling For Spoken Language Identification
Figure 4 for Multimodal Modeling For Spoken Language Identification
Viaarxiv icon

ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit

Add code
Bookmark button
Alert button
Apr 11, 2023
Brian Yan, Jiatong Shi, Yun Tang, Hirofumi Inaguma, Yifan Peng, Siddharth Dalmia, Peter Polák, Patrick Fernandes, Dan Berrebbi, Tomoki Hayashi, Xiaohui Zhang, Zhaoheng Ni, Moto Hira, Soumi Maiti, Juan Pino, Shinji Watanabe

Figure 1 for ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit
Figure 2 for ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit
Figure 3 for ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit
Figure 4 for ESPnet-ST-v2: Multipurpose Spoken Language Translation Toolkit
Viaarxiv icon

Align, Write, Re-order: Explainable End-to-End Speech Translation via Operation Sequence Generation

Add code
Bookmark button
Alert button
Nov 11, 2022
Motoi Omachi, Brian Yan, Siddharth Dalmia, Yuya Fujita, Shinji Watanabe

Figure 1 for Align, Write, Re-order: Explainable End-to-End Speech Translation via Operation Sequence Generation
Figure 2 for Align, Write, Re-order: Explainable End-to-End Speech Translation via Operation Sequence Generation
Figure 3 for Align, Write, Re-order: Explainable End-to-End Speech Translation via Operation Sequence Generation
Figure 4 for Align, Write, Re-order: Explainable End-to-End Speech Translation via Operation Sequence Generation
Viaarxiv icon

A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding

Add code
Bookmark button
Alert button
Nov 10, 2022
Yifan Peng, Siddhant Arora, Yosuke Higuchi, Yushi Ueda, Sujay Kumar, Karthik Ganesan, Siddharth Dalmia, Xuankai Chang, Shinji Watanabe

Figure 1 for A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding
Figure 2 for A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding
Figure 3 for A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding
Figure 4 for A Study on the Integration of Pre-trained SSL, ASR, LM and SLU Models for Spoken Language Understanding
Viaarxiv icon

Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models

Add code
Bookmark button
Alert button
Oct 27, 2022
Siddhant Arora, Siddharth Dalmia, Brian Yan, Florian Metze, Alan W Black, Shinji Watanabe

Figure 1 for Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models
Figure 2 for Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models
Figure 3 for Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models
Figure 4 for Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models
Viaarxiv icon

CTC Alignments Improve Autoregressive Translation

Add code
Bookmark button
Alert button
Oct 11, 2022
Brian Yan, Siddharth Dalmia, Yosuke Higuchi, Graham Neubig, Florian Metze, Alan W Black, Shinji Watanabe

Figure 1 for CTC Alignments Improve Autoregressive Translation
Figure 2 for CTC Alignments Improve Autoregressive Translation
Figure 3 for CTC Alignments Improve Autoregressive Translation
Figure 4 for CTC Alignments Improve Autoregressive Translation
Viaarxiv icon

Two-Pass Low Latency End-to-End Spoken Language Understanding

Add code
Bookmark button
Alert button
Jul 14, 2022
Siddhant Arora, Siddharth Dalmia, Xuankai Chang, Brian Yan, Alan Black, Shinji Watanabe

Figure 1 for Two-Pass Low Latency End-to-End Spoken Language Understanding
Figure 2 for Two-Pass Low Latency End-to-End Spoken Language Understanding
Figure 3 for Two-Pass Low Latency End-to-End Spoken Language Understanding
Figure 4 for Two-Pass Low Latency End-to-End Spoken Language Understanding
Viaarxiv icon

Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding

Add code
Bookmark button
Alert button
Jul 06, 2022
Yifan Peng, Siddharth Dalmia, Ian Lane, Shinji Watanabe

Figure 1 for Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Figure 2 for Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Figure 3 for Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Figure 4 for Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding
Viaarxiv icon