
Mike Seltzer


Effective internal language model training and fusion for factorized transducer model

Apr 02, 2024
Jinxi Guo, Niko Moritz, Yingyi Ma, Frank Seide, Chunyang Wu, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer


Towards General-Purpose Speech Abilities for Large Language Models Using Unpaired Data

Nov 12, 2023
Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Junteng Jia, Yuan Shangguan, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer


TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models

Sep 05, 2023
Yuan Shangguan, Haichuan Yang, Danni Li, Chunyang Wu, Yassir Fathullah, Dilin Wang, Ayushi Dalmia, Raghuraman Krishnamoorthi, Ozlem Kalinli, Junteng Jia, Jay Mahadeokar, Xin Lei, Mike Seltzer, Vikas Chandra


Prompting Large Language Models with Speech Recognition Abilities

Jul 21, 2023
Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Junteng Jia, Yuan Shangguan, Ke Li, Jinxi Guo, Wenhan Xiong, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer


Multi-Head State Space Model for Speech Recognition

May 25, 2023
Yassir Fathullah, Chunyang Wu, Yuan Shangguan, Junteng Jia, Wenhan Xiong, Jay Mahadeokar, Chunxi Liu, Yangyang Shi, Ozlem Kalinli, Mike Seltzer, Mark J. F. Gales


Dynamic Speech Endpoint Detection with Regression Targets

Oct 25, 2022
Dawei Liang, Hang Su, Tarun Singh, Jay Mahadeokar, Shanil Puri, Jiedan Zhu, Edison Thomaz, Mike Seltzer


Streaming Transformer Transducer Based Speech Recognition Using Non-Causal Convolution

Oct 07, 2021
Yangyang Shi, Chunyang Wu, Dilin Wang, Alex Xiao, Jay Mahadeokar, Xiaohui Zhang, Chunxi Liu, Ke Li, Yuan Shangguan, Varun Nagaraja, Ozlem Kalinli, Mike Seltzer


Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study

Oct 07, 2021
Dawei Liang, Yangyang Shi, Yun Wang, Nayan Singhal, Alex Xiao, Jonathan Shaw, Edison Thomaz, Ozlem Kalinli, Mike Seltzer


On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models

Jul 09, 2021
Xiaohui Zhang, Vimal Manohar, David Zhang, Frank Zhang, Yangyang Shi, Nayan Singhal, Julian Chan, Fuchun Peng, Yatharth Saraf, Mike Seltzer
