Yangyang Shi

Biased Self-supervised learning for ASR

Nov 04, 2022
Florian L. Kreyssig, Yangyang Shi, Jinxi Guo, Leda Sari, Abdelrahman Mohamed, Philip C. Woodland

SCA: Streaming Cross-attention Alignment for Echo Cancellation

Nov 01, 2022
Yang Liu, Yangyang Shi, Yun Li, Kaustubh Kalgaonkar, Sriram Srinivasan, Xin Lei

Learning a Dual-Mode Speech Recognition Model via Self-Pruning

Jul 25, 2022
Chunxi Liu, Yuan Shangguan, Haichuan Yang, Yangyang Shi, Raghuraman Krishnamoorthi, Ozlem Kalinli

Streaming parallel transducer beam search with fast-slow cascaded encoders

Mar 29, 2022
Jay Mahadeokar, Yangyang Shi, Ke Li, Duc Le, Jiedan Zhu, Vikas Chandra, Ozlem Kalinli, Michael L. Seltzer

TorchAudio: Building Blocks for Audio and Speech Processing

Oct 28, 2021
Yao-Yuan Yang, Moto Hira, Zhaoheng Ni, Anjali Chourdia, Artyom Astafurov, Caroline Chen, Ching-Feng Yeh, Christian Puhrsch, David Pollack, Dmitriy Genzel, Donny Greenberg, Edward Z. Yang, Jason Lian, Jay Mahadeokar, Jeff Hwang, Ji Chen, Peter Goldsborough, Prabhat Roy, Sean Narenthiran, Shinji Watanabe, Soumith Chintala, Vincent Quenneville-Bélair, Yangyang Shi

Streaming Transformer Transducer Based Speech Recognition Using Non-Causal Convolution

Oct 07, 2021
Yangyang Shi, Chunyang Wu, Dilin Wang, Alex Xiao, Jay Mahadeokar, Xiaohui Zhang, Chunxi Liu, Ke Li, Yuan Shangguan, Varun Nagaraja, Ozlem Kalinli, Mike Seltzer

Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study

Oct 07, 2021
Dawei Liang, Yangyang Shi, Yun Wang, Nayan Singhal, Alex Xiao, Jonathan Shaw, Edison Thomaz, Ozlem Kalinli, Mike Seltzer

Collaborative Training of Acoustic Encoders for Speech Recognition

Jul 13, 2021
Varun Nagaraja, Yangyang Shi, Ganesh Venkatesh, Ozlem Kalinli, Michael L. Seltzer, Vikas Chandra

On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models

Jul 09, 2021
Xiaohui Zhang, Vimal Manohar, David Zhang, Frank Zhang, Yangyang Shi, Nayan Singhal, Julian Chan, Fuchun Peng, Yatharth Saraf, Mike Seltzer
