Hasim Sak

Contrastive Siamese Network for Semi-supervised Speech Recognition

May 27, 2022
Soheil Khorram, Jaeyoung Kim, Anshuman Tripathi, Han Lu, Qian Zhang, Hasim Sak

Via arXiv

Turn-to-Diarize: Online Speaker Diarization Constrained by Transformer Transducer Speaker Turn Detection

Oct 05, 2021
Wei Xia, Han Lu, Quan Wang, Anshuman Tripathi, Yiling Huang, Ignacio Lopez Moreno, Hasim Sak

Via arXiv

Reducing Streaming ASR Model Delay with Self Alignment

May 06, 2021
Jaeyoung Kim, Han Lu, Anshuman Tripathi, Qian Zhang, Hasim Sak

Via arXiv

Transformer Transducer: One Model Unifying Streaming and Non-streaming Speech Recognition

Oct 07, 2020
Anshuman Tripathi, Jaeyoung Kim, Qian Zhang, Han Lu, Hasim Sak

Via arXiv

A Density Ratio Approach to Language Model Fusion in End-To-End Automatic Speech Recognition

Feb 28, 2020
Erik McDermott, Hasim Sak, Ehsan Variani

Via arXiv

Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss

Feb 14, 2020
Qian Zhang, Han Lu, Hasim Sak, Anshuman Tripathi, Erik McDermott, Stephen Koo, Shankar Kumar

Via arXiv

Adversarial Training for Multilingual Acoustic Modeling

Jun 17, 2019
Ke Hu, Hasim Sak, Hank Liao

Via arXiv

Large-Scale Visual Speech Recognition

Oct 01, 2018
Brendan Shillingford, Yannis Assael, Matthew W. Hoffman, Thomas Paine, Cían Hughes, Utsav Prabhu, Hank Liao, Hasim Sak, Kanishka Rao, Lorrayne Bennett, Marie Mulville, Ben Coppin, Ben Laurie, Andrew Senior, Nando de Freitas

Via arXiv