"speech": models, code, and papers

LPC Augment: An LPC-Based ASR Data Augmentation Algorithm for Low and Zero-Resource Children's Dialects

Feb 22, 2022
Alexander Johnson, Ruchao Fan, Robin Morris, Abeer Alwan

Via arXiv

WLASL-LEX: a Dataset for Recognising Phonological Properties in American Sign Language

Mar 11, 2022
Federico Tavella, Viktor Schlegel, Marta Romeo, Aphrodite Galata, Angelo Cangelosi

Via arXiv

Deep Text-to-Speech System with Seq2Seq Model

Mar 11, 2019
Gary Wang

Via arXiv

Audio visual character profiles for detecting background characters in entertainment media

Mar 21, 2022
Rahul Sharma, Shrikanth Narayanan

Via arXiv

PHO-LID: A Unified Model Incorporating Acoustic-Phonetic and Phonotactic Information for Language Identification

Mar 31, 2022
Hexin Liu, Leibny Paola Garcia Perera, Andy W. H. Khong, Suzy J. Styles, Sanjeev Khudanpur

Via arXiv

Towards an Efficient Voice Identification Using Wav2Vec2.0 and HuBERT Based on the Quran Reciters Dataset

Nov 11, 2021
Aly Moustafa, Salah A. Aly

Via arXiv

Automatic Identification and Classification of Bragging in Social Media

Mar 11, 2022
Mali Jin, Daniel Preoţiuc-Pietro, A. Seza Doğruöz, Nikolaos Aletras

Via arXiv

Towards a Competitive End-to-End Speech Recognition for CHiME-6 Dinner Party Transcription

Apr 22, 2020
Andrei Andrusenko, Aleksandr Laptev, Ivan Medennikov

Via arXiv

ADEPT: A Dataset for Evaluating Prosody Transfer

Jun 15, 2021
Alexandra Torresquintero, Tian Huey Teh, Christopher G. R. Wallis, Marlene Staib, Devang S Ram Mohan, Vivian Hu, Lorenzo Foglianti, Jiameng Gao, Simon King

Via arXiv

L3DAS22 Challenge: Learning 3D Audio Sources in a Real Office Environment

Feb 21, 2022
Eric Guizzo, Christian Marinoni, Marco Pennese, Xinlei Ren, Xiguang Zheng, Chen Zhang, Bruno Masiero, Aurelio Uncini, Danilo Comminiello

Via arXiv