Zelin Wu

High-precision Voice Search Query Correction via Retrievable Speech-text Embeddings

Jan 08, 2024
Christopher Li, Gary Wang, Kyle Kastner, Heng Su, Allen Chen, Andrew Rosenberg, Zhehuai Chen, Zelin Wu, Leonid Velikovich, Pat Rondon, Diamantino Caseiro, Petar Aleksic

SLM: Bridge the thin gap between speech and text foundation models

Sep 30, 2023
Mingqiu Wang, Wei Han, Izhak Shafran, Zelin Wu, Chung-Cheng Chiu, Yuan Cao, Yongqiang Wang, Nanxin Chen, Yu Zhang, Hagen Soltau, Paul Rubenstein, Lukas Zilka, Dian Yu, Zhong Meng, Golan Pundak, Nikhil Siddhartha, Johan Schalkwyk, Yonghui Wu

Contextual Biasing with the Knuth-Morris-Pratt Matching Algorithm

Sep 29, 2023
Weiran Wang, Zelin Wu, Diamantino Caseiro, Tsendsuren Munkhdalai, Khe Chai Sim, Pat Rondon, Golan Pundak, Gan Song, Rohit Prabhavalkar, Zhong Meng, Ding Zhao, Tara Sainath, Pedro Moreno Mengibar

A Deliberation-based Joint Acoustic and Text Decoder

Mar 23, 2023
Sepand Mavandadi, Tara N. Sainath, Ke Hu, Zelin Wu

Streaming Intended Query Detection using E2E Modeling for Continued Conversation

Aug 29, 2022
Shuo-yiin Chang, Guru Prakash, Zelin Wu, Qiao Liang, Tara N. Sainath, Bo Li, Adam Stambler, Shyam Upadhyay, Manaal Faruqui, Trevor Strohman

Speech Recognition with Augmented Synthesized Speech

Sep 25, 2019
Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Ye Jia, Pedro Moreno, Yonghui Wu, Zelin Wu

Improving Performance of End-to-End ASR on Numeric Sequences

Jul 01, 2019
Cal Peyser, Hao Zhang, Tara N. Sainath, Zelin Wu

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

Feb 21, 2019
Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel A. U. Bacchiani, Thomas B. Jablin, Rob Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon

VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking

Oct 27, 2018
Quan Wang, Hannah Muckenhirn, Kevin Wilson, Prashant Sridhar, Zelin Wu, John Hershey, Rif A. Saurous, Ron J. Weiss, Ye Jia, Ignacio Lopez Moreno
