Streaming parallel transducer beam search with fast-slow cascaded encoders



Jay Mahadeokar, Yangyang Shi, Ke Li, Duc Le, Jiedan Zhu, Vikas Chandra, Ozlem Kalinli, Michael L Seltzer

* 5 pages, 2 figures, Interspeech 2022 submission 

TorchAudio: Building Blocks for Audio and Speech Processing



Yao-Yuan Yang, Moto Hira, Zhaoheng Ni, Anjali Chourdia, Artyom Astafurov, Caroline Chen, Ching-Feng Yeh, Christian Puhrsch, David Pollack, Dmitriy Genzel, Donny Greenberg, Edward Z. Yang, Jason Lian, Jay Mahadeokar, Jeff Hwang, Ji Chen, Peter Goldsborough, Prabhat Roy, Sean Narenthiran, Shinji Watanabe, Soumith Chintala, Vincent Quenneville-Bélair, Yangyang Shi

* Submitted to ICASSP 2022 

Streaming Transformer Transducer Based Speech Recognition Using Non-Causal Convolution



Yangyang Shi, Chunyang Wu, Dilin Wang, Alex Xiao, Jay Mahadeokar, Xiaohui Zhang, Chunxi Liu, Ke Li, Yuan Shangguan, Varun Nagaraja, Ozlem Kalinli, Mike Seltzer

* 5 pages, 3 figures, submitted to ICASSP 2022

Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study



Dawei Liang, Yangyang Shi, Yun Wang, Nayan Singhal, Alex Xiao, Jonathan Shaw, Edison Thomaz, Ozlem Kalinli, Mike Seltzer

* Submitted to ICASSP 2022 

Collaborative Training of Acoustic Encoders for Speech Recognition



Varun Nagaraja, Yangyang Shi, Ganesh Venkatesh, Ozlem Kalinli, Michael L. Seltzer, Vikas Chandra

* INTERSPEECH 2021 

On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models



Xiaohui Zhang, Vimal Manohar, David Zhang, Frank Zhang, Yangyang Shi, Nayan Singhal, Julian Chan, Fuchun Peng, Yatharth Saraf, Mike Seltzer

* Submitted to ASRU 2021

Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios



Jay Mahadeokar, Yangyang Shi, Yuan Shangguan, Chunyang Wu, Alex Xiao, Hang Su, Duc Le, Ozlem Kalinli, Christian Fuegen, Michael L. Seltzer

* Submitted to Interspeech 2021 (under review) 

Dissecting User-Perceived Latency of On-Device E2E Speech Recognition



Yuan Shangguan, Rohit Prabhavalkar, Hang Su, Jay Mahadeokar, Yangyang Shi, Jiatong Zhou, Chunyang Wu, Duc Le, Ozlem Kalinli, Christian Fuegen, Michael L. Seltzer

* Submitted to Interspeech 2021 

Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion



Duc Le, Mahaveer Jain, Gil Keren, Suyoun Kim, Yangyang Shi, Jay Mahadeokar, Julian Chan, Yuan Shangguan, Christian Fuegen, Ozlem Kalinli, Yatharth Saraf, Michael L. Seltzer

* Submitted to INTERSPEECH 2021 
