"speech recognition": models, code, and papers

On the Compression of Recurrent Neural Networks with an Application to LVCSR acoustic modeling for Embedded Speech Recognition

May 02, 2016
Rohit Prabhavalkar, Ouais Alsharif, Antoine Bruguier, Ian McGraw

DeepTalk: Vocal Style Encoding for Speaker Recognition and Speech Synthesis

Dec 09, 2020
Anurag Chowdhury, Arun Ross, Prabu David

An Online Multilingual Hate Speech Recognition System

Nov 24, 2020
Neeraj Vashistha, Arkaitz Zubiaga

Prediction of Listener Perception of Argumentative Speech in a Crowdsourced Data Using (Psycho-)Linguistic and Fluency Features

Nov 13, 2021
Yu Qiao, Sourabh Zanwar, Rishab Bhattacharyya, Daniel Wiechmann, Wei Zhou, Elma Kerz, Ralf Schlüter

Bootstrap an end-to-end ASR system by multilingual training, transfer learning, text-to-text mapping and synthetic audio

Nov 25, 2020
Manuel Giollo, Deniz Gunceler, Yulan Liu, Daniel Willett

SD-QA: Spoken Dialectal Question Answering for the Real World

Sep 24, 2021
Fahim Faisal, Sharlina Keshava, Md Mahfuz ibn Alam, Antonios Anastasopoulos

Common Voice: A Massively-Multilingual Speech Corpus

Dec 13, 2019
Rosana Ardila, Megan Branson, Kelly Davis, Michael Henretty, Michael Kohler, Josh Meyer, Reuben Morais, Lindsay Saunders, Francis M. Tyers, Gregor Weber

GIST-AiTeR System for the Diarization Task of the 2022 VoxCeleb Speaker Recognition Challenge

Sep 21, 2022
Dongkeon Park, Yechan Yu, Kyeong Wan Park, Ji Won Kim, Hong Kook Kim

Acoustic-to-Word Models with Conversational Context Information

May 21, 2019
Suyoun Kim, Florian Metze
