Alert button
Picture for Zhehuai Chen

Zhehuai Chen

Alert button

Transducers with Pronunciation-aware Embeddings for Automatic Speech Recognition

Add code
Bookmark button
Alert button
Apr 04, 2024
Hainan Xu, Zhehuai Chen, Fei Jia, Boris Ginsburg

Viaarxiv icon

GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators

Add code
Bookmark button
Alert button
Feb 10, 2024
Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Dong Zhang, Zhehuai Chen, Eng Siong Chng

Viaarxiv icon

High-precision Voice Search Query Correction via Retrievable Speech-text Embedings

Add code
Bookmark button
Alert button
Jan 08, 2024
Christopher Li, Gary Wang, Kyle Kastner, Heng Su, Allen Chen, Andrew Rosenberg, Zhehuai Chen, Zelin Wu, Leonid Velikovich, Pat Rondon, Diamantino Caseiro, Petar Aleksic

Viaarxiv icon

SALM: Speech-augmented Language Model with In-context Learning for Speech Recognition and Translation

Add code
Bookmark button
Alert button
Oct 13, 2023
Zhehuai Chen, He Huang, Andrei Andrusenko, Oleksii Hrinchuk, Krishna C. Puvvada, Jason Li, Subhankar Ghosh, Jagadeesh Balam, Boris Ginsburg

Viaarxiv icon

Using Text Injection to Improve Recognition of Personal Identifiers in Speech

Add code
Bookmark button
Alert button
Aug 14, 2023
Yochai Blau, Rohan Agrawal, Lior Madmony, Gary Wang, Andrew Rosenberg, Zhehuai Chen, Zorik Gekhman, Genady Beryozkin, Parisa Haghani, Bhuvana Ramabhadran

Viaarxiv icon

Understanding Shared Speech-Text Representations

Add code
Bookmark button
Alert button
Apr 27, 2023
Gary Wang, Kyle Kastner, Ankur Bapna, Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang

Figure 1 for Understanding Shared Speech-Text Representations
Figure 2 for Understanding Shared Speech-Text Representations
Figure 3 for Understanding Shared Speech-Text Representations
Figure 4 for Understanding Shared Speech-Text Representations
Viaarxiv icon

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

Add code
Bookmark button
Alert button
Mar 03, 2023
Yu Zhang, Wei Han, James Qin, Yongqiang Wang, Ankur Bapna, Zhehuai Chen, Nanxin Chen, Bo Li, Vera Axelrod, Gary Wang, Zhong Meng, Ke Hu, Andrew Rosenberg, Rohit Prabhavalkar, Daniel S. Park, Parisa Haghani, Jason Riesa, Ginger Perng, Hagen Soltau, Trevor Strohman, Bhuvana Ramabhadran, Tara Sainath, Pedro Moreno, Chung-Cheng Chiu, Johan Schalkwyk, Françoise Beaufays, Yonghui Wu

Figure 1 for Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Figure 2 for Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Figure 3 for Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Figure 4 for Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
Viaarxiv icon

Accelerating RNN-T Training and Inference Using CTC guidance

Add code
Bookmark button
Alert button
Oct 29, 2022
Yongqiang Wang, Zhehuai Chen, Chengjian Zheng, Yu Zhang, Wei Han, Parisa Haghani

Figure 1 for Accelerating RNN-T Training and Inference Using CTC guidance
Figure 2 for Accelerating RNN-T Training and Inference Using CTC guidance
Figure 3 for Accelerating RNN-T Training and Inference Using CTC guidance
Figure 4 for Accelerating RNN-T Training and Inference Using CTC guidance
Viaarxiv icon

Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech

Add code
Bookmark button
Alert button
Oct 27, 2022
Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran

Figure 1 for Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech
Figure 2 for Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech
Figure 3 for Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech
Figure 4 for Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech
Viaarxiv icon