Picture for Hainan Xu

Hainan Xu

Romanization Encoding For Multilingual ASR

Add code
Jul 05, 2024
Viaarxiv icon

Label-Looping: Highly Efficient Decoding for Transducers

Add code
Jun 10, 2024
Viaarxiv icon

Speed of Light Exact Greedy Decoding for RNN-T Speech Recognition Models on GPU

Add code
Jun 06, 2024
Viaarxiv icon

Transducers with Pronunciation-aware Embeddings for Automatic Speech Recognition

Add code
Apr 04, 2024
Viaarxiv icon

TDT-KWS: Fast And Accurate Keyword Spotting Using Token-and-duration Transducer

Add code
Mar 20, 2024
Figure 1 for TDT-KWS: Fast And Accurate Keyword Spotting Using Token-and-duration Transducer
Figure 2 for TDT-KWS: Fast And Accurate Keyword Spotting Using Token-and-duration Transducer
Figure 3 for TDT-KWS: Fast And Accurate Keyword Spotting Using Token-and-duration Transducer
Figure 4 for TDT-KWS: Fast And Accurate Keyword Spotting Using Token-and-duration Transducer
Viaarxiv icon

Learning from Flawed Data: Weakly Supervised Automatic Speech Recognition

Add code
Sep 26, 2023
Figure 1 for Learning from Flawed Data: Weakly Supervised Automatic Speech Recognition
Figure 2 for Learning from Flawed Data: Weakly Supervised Automatic Speech Recognition
Figure 3 for Learning from Flawed Data: Weakly Supervised Automatic Speech Recognition
Figure 4 for Learning from Flawed Data: Weakly Supervised Automatic Speech Recognition
Viaarxiv icon

Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts

Add code
Jun 01, 2023
Figure 1 for Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts
Figure 2 for Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts
Figure 3 for Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts
Figure 4 for Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts
Viaarxiv icon

Efficient Sequence Transduction by Jointly Predicting Tokens and Durations

Add code
Apr 13, 2023
Figure 1 for Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
Figure 2 for Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
Figure 3 for Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
Figure 4 for Efficient Sequence Transduction by Jointly Predicting Tokens and Durations
Viaarxiv icon

Multi-blank Transducers for Speech Recognition

Add code
Nov 04, 2022
Figure 1 for Multi-blank Transducers for Speech Recognition
Figure 2 for Multi-blank Transducers for Speech Recognition
Figure 3 for Multi-blank Transducers for Speech Recognition
Figure 4 for Multi-blank Transducers for Speech Recognition
Viaarxiv icon

An Asynchronous WFST-Based Decoder For Automatic Speech Recognition

Add code
Mar 16, 2021
Figure 1 for An Asynchronous WFST-Based Decoder For Automatic Speech Recognition
Figure 2 for An Asynchronous WFST-Based Decoder For Automatic Speech Recognition
Figure 3 for An Asynchronous WFST-Based Decoder For Automatic Speech Recognition
Figure 4 for An Asynchronous WFST-Based Decoder For Automatic Speech Recognition
Viaarxiv icon