Kartik Audhkhasi
O-1: Self-training with Oracle and 1-best Hypothesis

Aug 14, 2023
Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Kartik Audhkhasi

Large-scale Language Model Rescoring on Long-form Data

Jun 13, 2023
Tongzhou Chen, Cyril Allauzen, Yinghui Huang, Daniel Park, David Rybach, W. Ronny Huang, Rodrigo Cabrera, Kartik Audhkhasi, Bhuvana Ramabhadran, Pedro J. Moreno, Michael Riley

Robust Knowledge Distillation from RNN-T Models With Noisy Training Labels Using Full-Sum Loss

Mar 10, 2023
Mohammad Zeineldeen, Kartik Audhkhasi, Murali Karthick Baskar, Bhuvana Ramabhadran

Modular Hybrid Autoregressive Transducer

Oct 31, 2022
Zhong Meng, Tongzhou Chen, Rohit Prabhavalkar, Yu Zhang, Gary Wang, Kartik Audhkhasi, Jesse Emond, Trevor Strohman, Bhuvana Ramabhadran, W. Ronny Huang, Ehsan Variani, Yinghui Huang, Pedro J. Moreno

Analysis of Self-Attention Head Diversity for Conformer-based Automatic Speech Recognition

Sep 13, 2022
Kartik Audhkhasi, Yinghui Huang, Bhuvana Ramabhadran, Pedro J. Moreno

Leveraging Unpaired Text Data for Training End-to-End Speech-to-Intent Systems

Oct 08, 2020
Yinghui Huang, Hong-Kwang Kuo, Samuel Thomas, Zvi Kons, Kartik Audhkhasi, Brian Kingsbury, Ron Hoory, Michael Picheny

End-to-End Spoken Language Understanding Without Full Transcripts

Sep 30, 2020
Hong-Kwang J. Kuo, Zoltán Tüske, Samuel Thomas, Yinghui Huang, Kartik Audhkhasi, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory, Luis Lastras

AVLnet: Learning Audio-Visual Language Representations from Instructional Videos

Jun 16, 2020
Andrew Rouditchenko, Angie Boggust, David Harwath, Dhiraj Joshi, Samuel Thomas, Kartik Audhkhasi, Rogerio Feris, Brian Kingsbury, Michael Picheny, Antonio Torralba, James Glass

Single headed attention based sequence-to-sequence model for state-of-the-art results on Switchboard-300

Jan 20, 2020
Zoltán Tüske, George Saon, Kartik Audhkhasi, Brian Kingsbury

Challenging the Boundaries of Speech Recognition: The MALACH Corpus

Aug 09, 2019
Michael Picheny, Zoltán Tüske, Brian Kingsbury, Kartik Audhkhasi, Xiaodong Cui, George Saon
