"speech": models, code, and papers

USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models

Dec 13, 2023
Shaojin Ding, David Qiu, David Rim, Yanzhang He, Oleg Rybakov, Bo Li, Rohit Prabhavalkar, Weiran Wang, Tara N. Sainath, Shivani Agrawal, Zhonglin Han, Jian Li, Amir Yazdanbakhsh

Toward A Reinforcement-Learning-Based System for Adjusting Medication to Minimize Speech Disfluency

Dec 12, 2023
Pavlos Constas, Vikram Rawal, Matthew Honorio Oliveira, Andreas Constas, Aditya Khan, Kaison Cheung, Najma Sultani, Carrie Chen, Micol Altomare, Michael Akzam, Jiacheng Chen, Vhea He, Lauren Altomare, Heraa Murqi, Asad Khan, Nimit Amikumar Bhanshali, Youssef Rachad, Michael Guerzhoy

Study of cognitive component of auditory attention to natural speech events

Dec 19, 2023
Nhan D. T. Nguyen, Kaare Mikkelsen, Preben Kidmose

StyleCap: Automatic Speaking-Style Captioning from Speech Based on Speech and Language Self-supervised Learning Models

Nov 28, 2023
Kazuki Yamauchi, Yusuke Ijima, Yuki Saito

A Deep Representation Learning-based Speech Enhancement Method Using Complex Convolution Recurrent Variational Autoencoder

Dec 15, 2023
Yang Xiang, Jingguang Tian, Xinhui Hu, Xinkang Xu, ZhaoHui Yin

Distributed Speech Dereverberation Using Weighted Prediction Error

Dec 05, 2023
Ziye Yang, Mengfei Zhang, Jie Chen

ROSE: A Recognition-Oriented Speech Enhancement Framework in Air Traffic Control Using Multi-Objective Learning

Dec 11, 2023
Xincheng Yu, Dongyue Guo, Jianwei Zhang, Yi Lin

On the Parameter Estimation of Sinusoidal Models for Speech and Audio Signals

Jan 02, 2024
George P. Kafentzis

An audio-quality-based multi-strategy approach for target speaker extraction in the MISP 2023 Challenge

Jan 08, 2024
Runduo Han, Xiaopeng Yan, Weiming Xu, Pengcheng Guo, Jiayao Sun, He Wang, Quan Lu, Ning Jiang, Lei Xie

Frame-level emotional state alignment method for speech emotion recognition

Dec 27, 2023
Qifei Li, Yingming Gao, Cong Wang, Yayue Deng, Jinlong Xue, Yichen Han, Ya Li