Alert button
Picture for Somshubra Majumdar

Somshubra Majumdar

Alert button

Stateful Conformer with Cache-based Inference for Streaming Automatic Speech Recognition

Add code
Bookmark button
Alert button
Jan 11, 2024
Vahid Noroozi, Somshubra Majumdar, Ankur Kumar, Jagadeesh Balam, Boris Ginsburg

Viaarxiv icon

Stateful FastConformer with Cache-based Inference for Streaming Automatic Speech Recognition

Add code
Bookmark button
Alert button
Dec 27, 2023
Vahid Noroozi, Somshubra Majumdar, Ankur Kumar, Jagadeesh Balam, Boris Ginsburg

Viaarxiv icon

Investigating End-to-End ASR Architectures for Long Form Audio Transcription

Add code
Bookmark button
Alert button
Sep 20, 2023
Nithin Rao Koluguri, Samuel Kriman, Georgy Zelenfroind, Somshubra Majumdar, Dima Rekesh, Vahid Noroozi, Jagadeesh Balam, Boris Ginsburg

Figure 1 for Investigating End-to-End ASR Architectures for Long Form Audio Transcription
Figure 2 for Investigating End-to-End ASR Architectures for Long Form Audio Transcription
Figure 3 for Investigating End-to-End ASR Architectures for Long Form Audio Transcription
Figure 4 for Investigating End-to-End ASR Architectures for Long Form Audio Transcription
Viaarxiv icon

Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition

Add code
Bookmark button
Alert button
May 19, 2023
Dima Rekesh, Samuel Kriman, Somshubra Majumdar, Vahid Noroozi, He Huang, Oleksii Hrinchuk, Ankur Kumar, Boris Ginsburg

Figure 1 for Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition
Figure 2 for Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition
Figure 3 for Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition
Figure 4 for Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition
Viaarxiv icon

Efficient Sequence Transduction by Jointly Predicting Tokens and Durations

Add code
Bookmark button
Alert button
Apr 13, 2023
Hainan Xu, Fei Jia, Somshubra Majumdar, He Huang, Shinji Watanabe, Boris Ginsburg

Figure 1 for Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
Figure 2 for Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
Figure 3 for Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
Figure 4 for Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
Viaarxiv icon

Multi-blank Transducers for Speech Recognition

Add code
Bookmark button
Alert button
Nov 04, 2022
Hainan Xu, Fei Jia, Somshubra Majumdar, Shinji Watanabe, Boris Ginsburg

Figure 1 for Multi-blank Transducers for Speech Recognition
Figure 2 for Multi-blank Transducers for Speech Recognition
Figure 3 for Multi-blank Transducers for Speech Recognition
Figure 4 for Multi-blank Transducers for Speech Recognition
Viaarxiv icon

Damage Control During Domain Adaptation for Transducer Based Automatic Speech Recognition

Add code
Bookmark button
Alert button
Oct 06, 2022
Somshubra Majumdar, Shantanu Acharya, Vitaly Lavrukhin, Boris Ginsburg

Figure 1 for Damage Control During Domain Adaptation for Transducer Based Automatic Speech Recognition
Figure 2 for Damage Control During Domain Adaptation for Transducer Based Automatic Speech Recognition
Figure 3 for Damage Control During Domain Adaptation for Transducer Based Automatic Speech Recognition
Viaarxiv icon

CTC Variations Through New WFST Topologies

Add code
Bookmark button
Alert button
Oct 06, 2021
Aleksandr Laptev, Somshubra Majumdar, Boris Ginsburg

Figure 1 for CTC Variations Through New WFST Topologies
Figure 2 for CTC Variations Through New WFST Topologies
Figure 3 for CTC Variations Through New WFST Topologies
Figure 4 for CTC Variations Through New WFST Topologies
Viaarxiv icon