"speech recognition": models, code, and papers

An ASR-free Fluency Scoring Approach with Self-Supervised Learning

Mar 13, 2023
Wei Liu, Kaiqi Fu, Xiaohai Tian, Shuju Shi, Wei Li, Zejun Ma, Tan Lee

(4 figures)

A Dataset for Speech Emotion Recognition in Greek Theatrical Plays

Mar 27, 2022
Maria Moutti, Sofia Eleftheriou, Panagiotis Koromilas, Theodoros Giannakopoulos

(3 figures)

A review of on-device fully neural end-to-end automatic speech recognition algorithms

Dec 19, 2020
Chanwoo Kim, Dhananjaya Gowda, Dongsoo Lee, Jiyeon Kim, Ankur Kumar, Sungsoo Kim, Abhinav Garg, Changwoo Han

(4 figures)

Attention-based Contextual Language Model Adaptation for Speech Recognition

Jun 02, 2021
Richard Diehl Martinez, Scott Novotney, Ivan Bulyko, Ariya Rastrow, Andreas Stolcke, Ankur Gandhe

(4 figures)

Speech Recognition by Simply Fine-tuning BERT

Jan 30, 2021
Wen-Chin Huang, Chia-Hua Wu, Shang-Bao Luo, Kuan-Yu Chen, Hsin-Min Wang, Tomoki Toda

(4 figures)

Investigating Self-supervised Pretraining Frameworks for Pathological Speech Recognition

Mar 30, 2022
Lester Phillip Violeta, Wen-Chin Huang, Tomoki Toda

(4 figures)

Exploiting Single-Channel Speech For Multi-channel End-to-end Speech Recognition

Jul 06, 2021
Keyu An, Zhijian Ou

(4 figures)

QuickVC: Any-to-many Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion

Feb 23, 2023
Houjian Guo, Chaoran Liu, Carlos Toshinori Ishi, Hiroshi Ishiguro

(4 figures)

Speaker conditioning of acoustic models using affine transformation for multi-speaker speech recognition

Oct 30, 2021
Midia Yousefi, John H. L. Hansen

(4 figures)