Nithin Rao Koluguri

Investigating End-to-End ASR Architectures for Long Form Audio Transcription

Sep 20, 2023
Nithin Rao Koluguri, Samuel Kriman, Georgy Zelenfroind, Somshubra Majumdar, Dima Rekesh, Vahid Noroozi, Jagadeesh Balam, Boris Ginsburg

Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition

Sep 19, 2023
Krishna C. Puvvada, Nithin Rao Koluguri, Kunal Dhawan, Jagadeesh Balam, Boris Ginsburg

AmberNet: A Compact End-to-End Model for Spoken Language Identification

Oct 27, 2022
Fei Jia, Nithin Rao Koluguri, Jagadeesh Balam, Boris Ginsburg

Multi-scale Speaker Diarization with Dynamic Scale Weighting

Mar 30, 2022
Tae Jin Park, Nithin Rao Koluguri, Jagadeesh Balam, Boris Ginsburg

TitaNet: Neural Model for speaker representation with 1D Depth-wise separable convolutions and global context

Oct 08, 2021
Nithin Rao Koluguri, Taejin Park, Boris Ginsburg
