Somshubra Majumdar

Resource-Efficient Adaptation of Speech Foundation Models for Multi-Speaker ASR

Sep 02, 2024
Via arXiv

Genetic Instruct: Scaling up Synthetic Generation of Coding Instructions for Large Language Models

Jul 29, 2024
Via arXiv

Less is More: Accurate Speech Recognition & Translation without Web-Scale Data

Jun 28, 2024
Via arXiv

Instruction Data Generation and Unsupervised Adaptation for Speech Language Models

Jun 18, 2024
Via arXiv

Nemotron-4 340B Technical Report

Jun 17, 2024
Via arXiv

Stateful Conformer with Cache-based Inference for Streaming Automatic Speech Recognition

Jan 11, 2024
Via arXiv

Investigating End-to-End ASR Architectures for Long Form Audio Transcription

Sep 20, 2023
Via arXiv

Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition

May 19, 2023
Via arXiv

Efficient Sequence Transduction by Jointly Predicting Tokens and Durations

Apr 13, 2023
Via arXiv

Multi-blank Transducers for Speech Recognition

Nov 04, 2022
Via arXiv