Yatharth Saraf

Accent-Robust Automatic Speech Recognition Using Supervised and Unsupervised Wav2vec Embeddings

Oct 08, 2021
Jialu Li, Vimal Manohar, Pooja Chitkara, Andros Tjandra, Michael Picheny, Frank Zhang, Xiaohui Zhang, Yatharth Saraf

Improved Language Identification Through Cross-Lingual Self-Supervised Learning

Aug 04, 2021
Andros Tjandra, Diptanu Gon Choudhury, Frank Zhang, Kritika Singh, Alexis Conneau, Alexei Baevski, Assaf Sela, Yatharth Saraf, Michael Auli

On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models

Jul 09, 2021
Xiaohui Zhang, Vimal Manohar, David Zhang, Frank Zhang, Yangyang Shi, Nayan Singhal, Julian Chan, Fuchun Peng, Yatharth Saraf, Mike Seltzer

Kaizen: Continuously improving teacher using Exponential Moving Average for semi-supervised speech recognition

Jun 14, 2021
Vimal Manohar, Tatiana Likhomanenko, Qiantong Xu, Wei-Ning Hsu, Ronan Collobert, Yatharth Saraf, Geoffrey Zweig, Abdelrahman Mohamed

Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion

Apr 05, 2021
Duc Le, Mahaveer Jain, Gil Keren, Suyoun Kim, Yangyang Shi, Jay Mahadeokar, Julian Chan, Yuan Shangguan, Christian Fuegen, Ozlem Kalinli, Yatharth Saraf, Michael L. Seltzer

A Multi-View Approach To Audio-Visual Speaker Verification

Feb 11, 2021
Leda Sarı, Kritika Singh, Jiatong Zhou, Lorenzo Torresani, Nayan Singhal, Yatharth Saraf

Improving RNN Transducer Based ASR with Auxiliary Tasks

Nov 09, 2020
Chunxi Liu, Frank Zhang, Duc Le, Suyoun Kim, Yatharth Saraf, Geoffrey Zweig

Contextual RNN-T For Open Domain ASR

Jun 04, 2020
Mahaveer Jain, Gil Keren, Jay Mahadeokar, Yatharth Saraf
