"speech recognition": models, code, and papers

AutoPrep: An Automatic Preprocessing Framework for In-the-Wild Speech Data

Sep 25, 2023
Jianwei Yu, Hangting Chen, Yanyao Bian, Xiang Li, Yi Luo, Jinchuan Tian, Mengyang Liu, Jiayi Jiang, Shuai Wang

Open-vocabulary Keyword-spotting with Adaptive Instance Normalization

Sep 13, 2023
Aviv Navon, Aviv Shamsian, Neta Glazer, Gill Hetz, Joseph Keshet

Speech and Noise Dual-Stream Spectrogram Refine Network with Speech Distortion Loss for Robust Speech Recognition

May 30, 2023
Haoyu Lu, Nan Li, Tongtong Song, Longbiao Wang, Jianwu Dang, Xiaobao Wang, Shiliang Zhang

Corpus Synthesis for Zero-shot ASR Domain Adaptation using Large Language Models

Sep 18, 2023
Hsuan Su, Ting-Yao Hu, Hema Swetha Koppula, Raviteja Vemulapalli, Jen-Hao Rick Chang, Karren Yang, Gautam Varma Mantena, Oncel Tuzel

CoMFLP: Correlation Measure based Fast Search on ASR Layer Pruning

Sep 21, 2023
Wei Liu, Zhiyuan Peng, Tan Lee

A Multiscale Autoencoder (MSAE) Framework for End-to-End Neural Network Speech Enhancement

Sep 21, 2023
Bengt J. Borgstrom, Michael S. Brandstein

Mapping AI Arguments in Journalism Studies

Sep 03, 2023
Gregory Gondwe

Improving the Gap in Visual Speech Recognition Between Normal and Silent Speech Based on Metric Learning

May 23, 2023
Sara Kashiwagi, Keitaro Tanaka, Qi Feng, Shigeo Morishima

Continuous Modeling of the Denoising Process for Speech Enhancement Based on Deep Learning

Sep 17, 2023
Zilu Guo, Jun Du, Chin-Hui Lee

TranUSR: Phoneme-to-word Transcoder Based Unified Speech Representation Learning for Cross-lingual Speech Recognition

May 23, 2023
Hongfei Xue, Qijie Shao, Peikun Chen, Pengcheng Guo, Lei Xie, Jie Liu
