Brian Kingsbury

Exploring the limits of decoder-only models trained on public speech recognition corpora

Jan 31, 2024
Ankit Gupta, George Saon, Brian Kingsbury

Via arXiv

Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel Optimization

Jan 13, 2024
A F M Saif, Xiaodong Cui, Han Shen, Songtao Lu, Brian Kingsbury, Tianyi Chen

Via arXiv

Soft Random Sampling: A Theoretical and Empirical Analysis

Nov 24, 2023
Xiaodong Cui, Ashish Mittal, Songtao Lu, Wei Zhang, George Saon, Brian Kingsbury

Via arXiv

Semi-Autoregressive Streaming ASR With Label Context

Sep 19, 2023
Siddhant Arora, George Saon, Shinji Watanabe, Brian Kingsbury

Via arXiv

Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages

May 21, 2023
Andrew Rouditchenko, Sameer Khurana, Samuel Thomas, Rogerio Feris, Leonid Karlinsky, Hilde Kuehne, David Harwath, Brian Kingsbury, James Glass

Via arXiv

High-Dimensional Smoothed Entropy Estimation via Dimensionality Reduction

May 11, 2023
Kristjan Greenewald, Brian Kingsbury, Yuancheng Yu

Via arXiv

C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval

Oct 07, 2022
Andrew Rouditchenko, Yung-Sung Chuang, Nina Shvetsova, Samuel Thomas, Rogerio Feris, Brian Kingsbury, Leonid Karlinsky, David Harwath, Hilde Kuehne, James Glass

Via arXiv

VQ-T: RNN Transducers using Vector-Quantized Prediction Network States

Aug 03, 2022
Jiatong Shi, George Saon, David Haws, Shinji Watanabe, Brian Kingsbury

Via arXiv

Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization

Jun 16, 2022
Andrea Fasoli, Chia-Yu Chen, Mauricio Serrano, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Kailash Gopalakrishnan

Via arXiv