
Xiaodong Cui

Diagonal State Space Augmented Transformers for Speech Recognition

Feb 27, 2023

Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization

Jun 16, 2022

Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing

Mar 29, 2022

Loss Landscape Dependent Self-Adjusting Learning Rates in Decentralized Stochastic Gradient Descent

Dec 02, 2021

Asynchronous Decentralized Distributed Training of Acoustic Models

Oct 21, 2021

4-bit Quantization of LSTM-based Speech Recognition Models

Aug 27, 2021

Reducing Exposure Bias in Training Recurrent Neural Network Transducers

Aug 24, 2021

On Sample Based Explanation Methods for NLP: Efficiency, Faithfulness, and Semantic Evaluation

Jun 09, 2021

ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training

Apr 21, 2021

Federated Acoustic Modeling For Automatic Speech Recognition

Feb 08, 2021