Alert button
Picture for Jinxi Guo

Jinxi Guo

Alert button

Effective internal language model training and fusion for factorized transducer model

Add code
Bookmark button
Alert button
Apr 02, 2024
Jinxi Guo, Niko Moritz, Yingyi Ma, Frank Seide, Chunyang Wu, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer

Viaarxiv icon

Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model

Add code
Bookmark button
Alert button
Sep 22, 2023
Jiamin Xie, Ke Li, Jinxi Guo, Andros Tjandra, Yuan Shangguan, Leda Sari, Chunyang Wu, Junteng Jia, Jay Mahadeokar, Ozlem Kalinli

Figure 1 for Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
Figure 2 for Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
Figure 3 for Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
Figure 4 for Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
Viaarxiv icon

Prompting Large Language Models with Speech Recognition Abilities

Add code
Bookmark button
Alert button
Jul 21, 2023
Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Junteng Jia, Yuan Shangguan, Ke Li, Jinxi Guo, Wenhan Xiong, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer

Figure 1 for Prompting Large Language Models with Speech Recognition Abilities
Figure 2 for Prompting Large Language Models with Speech Recognition Abilities
Figure 3 for Prompting Large Language Models with Speech Recognition Abilities
Figure 4 for Prompting Large Language Models with Speech Recognition Abilities
Viaarxiv icon

Improving Fast-slow Encoder based Transducer with Streaming Deliberation

Add code
Bookmark button
Alert button
Dec 15, 2022
Ke Li, Jay Mahadeokar, Jinxi Guo, Yangyang Shi, Gil Keren, Ozlem Kalinli, Michael L. Seltzer, Duc Le

Figure 1 for Improving Fast-slow Encoder based Transducer with Streaming Deliberation
Figure 2 for Improving Fast-slow Encoder based Transducer with Streaming Deliberation
Figure 3 for Improving Fast-slow Encoder based Transducer with Streaming Deliberation
Figure 4 for Improving Fast-slow Encoder based Transducer with Streaming Deliberation
Viaarxiv icon

Biased Self-supervised learning for ASR

Add code
Bookmark button
Alert button
Nov 04, 2022
Florian L. Kreyssig, Yangyang Shi, Jinxi Guo, Leda Sari, Abdelrahman Mohamed, Philip C. Woodland

Figure 1 for Biased Self-supervised learning for ASR
Figure 2 for Biased Self-supervised learning for ASR
Figure 3 for Biased Self-supervised learning for ASR
Viaarxiv icon

VADOI:Voice-Activity-Detection Overlapping Inference For End-to-end Long-form Speech Recognition

Add code
Bookmark button
Alert button
Feb 22, 2022
Jinhan Wang, Xiaosu Tong, Jinxi Guo, Di He, Roland Maas

Figure 1 for VADOI:Voice-Activity-Detection Overlapping Inference For End-to-end Long-form Speech Recognition
Figure 2 for VADOI:Voice-Activity-Detection Overlapping Inference For End-to-end Long-form Speech Recognition
Figure 3 for VADOI:Voice-Activity-Detection Overlapping Inference For End-to-end Long-form Speech Recognition
Figure 4 for VADOI:Voice-Activity-Detection Overlapping Inference For End-to-end Long-form Speech Recognition
Viaarxiv icon

REDAT: Accent-Invariant Representation for End-to-End ASR by Domain Adversarial Training with Relabeling

Add code
Bookmark button
Alert button
Dec 14, 2020
Hu Hu, Xuesong Yang, Zeynab Raeesy, Jinxi Guo, Gokce Keskin, Harish Arsikere, Ariya Rastrow, Andreas Stolcke, Roland Maas

Figure 1 for REDAT: Accent-Invariant Representation for End-to-End ASR by Domain Adversarial Training with Relabeling
Figure 2 for REDAT: Accent-Invariant Representation for End-to-End ASR by Domain Adversarial Training with Relabeling
Viaarxiv icon

Variable frame rate-based data augmentation to handle speaking-style variability for automatic speaker verification

Add code
Bookmark button
Alert button
Aug 08, 2020
Amber Afshan, Jinxi Guo, Soo Jin Park, Vijay Ravi, Alan McCree, Abeer Alwan

Figure 1 for Variable frame rate-based data augmentation to handle speaking-style variability for automatic speaker verification
Figure 2 for Variable frame rate-based data augmentation to handle speaking-style variability for automatic speaker verification
Figure 3 for Variable frame rate-based data augmentation to handle speaking-style variability for automatic speaker verification
Viaarxiv icon

Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition

Add code
Bookmark button
Alert button
Jul 27, 2020
Jinxi Guo, Gautam Tiwari, Jasha Droppo, Maarten Van Segbroeck, Che-Wei Huang, Andreas Stolcke, Roland Maas

Figure 1 for Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition
Figure 2 for Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition
Figure 3 for Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition
Figure 4 for Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition
Viaarxiv icon

Singing voice conversion with non-parallel data

Add code
Bookmark button
Alert button
Mar 11, 2019
Xin Chen, Wei Chu, Jinxi Guo, Ning Xu

Figure 1 for Singing voice conversion with non-parallel data
Figure 2 for Singing voice conversion with non-parallel data
Figure 3 for Singing voice conversion with non-parallel data
Figure 4 for Singing voice conversion with non-parallel data
Viaarxiv icon