"speech recognition": models, code, and papers

Efficient Keyword Spotting through long-range interactions with Temporal Lambda Networks

Apr 16, 2021
Biel Tura, Ferran Diego, Carlos Segura, Jordi Luque

Via arXiv

Lexical Access Model for Italian -- Modeling human speech processing: identification of words in running speech toward lexical access based on the detection of landmarks and other acoustic cues to features

Jun 24, 2021
Maria-Gabriella Di Benedetto, Stefanie Shattuck-Hufnagel, Jeung-Yoon Choi, Luca De Nardis, Javier Arango, Ian Chan, Alec DeCaprio

Via arXiv

Deep Neural Networks for Automatic Speech Processing: A Survey from Large Corpora to Limited Data

Mar 09, 2020
Vincent Roger, Jérôme Farinas, Julien Pinquier

Via arXiv

BRDS: An FPGA-based LSTM Accelerator with Row-Balanced Dual-Ratio Sparsification

Jan 07, 2021
Seyed Abolfazl Ghasemzadeh, Erfan Bank Tavakoli, Mehdi Kamal, Ali Afzali-Kusha, Massoud Pedram

Via arXiv

Speech2Slot: An End-to-End Knowledge-based Slot Filling from Speech

May 10, 2021
Pengwei Wang, Xin Ye, Xiaohuan Zhou, Jinghui Xie, Hao Wang

Via arXiv

WHALETRANS: E2E WHisper to nAturaL spEech conversion using modified TRANSformer network

Apr 20, 2020
Abhishek Niranjan, Mukesh Sharma, Sai Bharath Chandra Gutha, M Ali Basha Shaik

Via arXiv

Persian phonemes recognition using PPNet

Dec 17, 2018
Saber Malekzadeh, Mohammad Hossein Gholizadeh, Seyed Naser Razavi

Via arXiv

Bridging the Gap Between Clean Data Training and Real-World Inference for Spoken Language Understanding

Apr 13, 2021
Di Wu, Yiren Chen, Liang Ding, Dacheng Tao

Via arXiv

HLT-NUS SUBMISSION FOR 2020 NIST Conversational Telephone Speech SRE

Nov 12, 2021
Rohan Kumar Das, Ruijie Tao, Haizhou Li

Via arXiv

Experiments of ASR-based mispronunciation detection for children and adult English learners

Apr 13, 2021
Nina Hosseini-Kivanani, Roberto Gretter, Marco Matassoni, Giuseppe Daniele Falavigna

Via arXiv