Alert button
Picture for Yuan Shangguan

Yuan Shangguan

Alert button

Not All Weights Are Created Equal: Enhancing Energy Efficiency in On-Device Streaming Speech Recognition

Feb 20, 2024
Yang Li, Yuan Shangguan, Yuhao Wang, Liangzhen Lai, Ernie Chang, Changsheng Zhao, Yangyang Shi, Vikas Chandra

Viaarxiv icon

Towards General-Purpose Speech Abilities for Large Language Models Using Unpaired Data

Nov 12, 2023
Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Junteng Jia, Yuan Shangguan, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer

Viaarxiv icon

Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model

Sep 22, 2023
Jiamin Xie, Ke Li, Jinxi Guo, Andros Tjandra, Yuan Shangguan, Leda Sari, Chunyang Wu, Junteng Jia, Jay Mahadeokar, Ozlem Kalinli

Figure 1 for Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
Figure 2 for Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
Figure 3 for Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
Figure 4 for Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
Viaarxiv icon

Folding Attention: Memory and Power Optimization for On-Device Transformer-based Streaming Speech Recognition

Sep 21, 2023
Yang Li, Liangzhen Lai, Yuan Shangguan, Forrest N. Iandola, Ernie Chang, Yangyang Shi, Vikas Chandra

Figure 1 for Folding Attention: Memory and Power Optimization for On-Device Transformer-based Streaming Speech Recognition
Figure 2 for Folding Attention: Memory and Power Optimization for On-Device Transformer-based Streaming Speech Recognition
Figure 3 for Folding Attention: Memory and Power Optimization for On-Device Transformer-based Streaming Speech Recognition
Figure 4 for Folding Attention: Memory and Power Optimization for On-Device Transformer-based Streaming Speech Recognition
Viaarxiv icon

TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models

Sep 05, 2023
Yuan Shangguan, Haichuan Yang, Danni Li, Chunyang Wu, Yassir Fathullah, Dilin Wang, Ayushi Dalmia, Raghuraman Krishnamoorthi, Ozlem Kalinli, Junteng Jia, Jay Mahadeokar, Xin Lei, Mike Seltzer, Vikas Chandra

Figure 1 for TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models
Figure 2 for TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models
Figure 3 for TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models
Figure 4 for TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models
Viaarxiv icon

Prompting Large Language Models with Speech Recognition Abilities

Jul 21, 2023
Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Junteng Jia, Yuan Shangguan, Ke Li, Jinxi Guo, Wenhan Xiong, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer

Figure 1 for Prompting Large Language Models with Speech Recognition Abilities
Figure 2 for Prompting Large Language Models with Speech Recognition Abilities
Figure 3 for Prompting Large Language Models with Speech Recognition Abilities
Figure 4 for Prompting Large Language Models with Speech Recognition Abilities
Viaarxiv icon

Towards Selection of Text-to-speech Data to Augment ASR Training

May 30, 2023
Shuo Liu, Leda Sarı, Chunyang Wu, Gil Keren, Yuan Shangguan, Jay Mahadeokar, Ozlem Kalinli

Figure 1 for Towards Selection of Text-to-speech Data to Augment ASR Training
Figure 2 for Towards Selection of Text-to-speech Data to Augment ASR Training
Figure 3 for Towards Selection of Text-to-speech Data to Augment ASR Training
Figure 4 for Towards Selection of Text-to-speech Data to Augment ASR Training
Viaarxiv icon

Multi-Head State Space Model for Speech Recognition

May 25, 2023
Yassir Fathullah, Chunyang Wu, Yuan Shangguan, Junteng Jia, Wenhan Xiong, Jay Mahadeokar, Chunxi Liu, Yangyang Shi, Ozlem Kalinli, Mike Seltzer, Mark J. F. Gales

Figure 1 for Multi-Head State Space Model for Speech Recognition
Figure 2 for Multi-Head State Space Model for Speech Recognition
Figure 3 for Multi-Head State Space Model for Speech Recognition
Figure 4 for Multi-Head State Space Model for Speech Recognition
Viaarxiv icon