"speech": models, code, and papers

Listen only to me! How well can target speech extraction handle false alarms?

Apr 11, 2022
Marc Delcroix, Keisuke Kinoshita, Tsubasa Ochiai, Katerina Zmolikova, Hiroshi Sato, Tomohiro Nakatani

Alignment Entropy Regularization

Dec 22, 2022
Ehsan Variani, Ke Wu, David Rybach, Cyril Allauzen, Michael Riley

Korean Tokenization for Beam Search Rescoring in Speech Recognition

Feb 22, 2022
Kyuhong Shim, Hyewon Bae, Wonyong Sung

VoiceFixer: Toward General Speech Restoration with Neural Vocoder

Oct 05, 2021
Haohe Liu, Qiuqiang Kong, Qiao Tian, Yan Zhao, DeLiang Wang, Chuanzeng Huang, Yuxuan Wang

Overlapping Word Removal is All You Need: Revisiting Data Imbalance in Hope Speech Detection

Apr 12, 2022
Hariharan RamakrishnaIyer LekshmiAmmal, Manikandan Ravikiran, Gayathri Nisha, Navyasree Balamuralidhar, Adithya Madhusoodanan, Anand Kumar Madasamy, Bharathi Raja Chakravarthi

AERO: Audio Super Resolution in the Spectral Domain

Nov 22, 2022
Moshe Mandel, Or Tal, Yossi Adi

EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers

Mar 31, 2022
Yushi Ueda, Soumi Maiti, Shinji Watanabe, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Yong Xu

Voice Filter: Few-shot text-to-speech speaker adaptation using voice conversion as a post-processing module

Feb 16, 2022
Adam Gabryś, Goeric Huybrechts, Manuel Sam Ribeiro, Chung-Ming Chien, Julian Roth, Giulia Comini, Roberto Barra-Chicote, Bartek Perz, Jaime Lorenzo-Trueba

On-device neural speech synthesis

Sep 17, 2021
Sivanand Achanta, Albert Antony, Ladan Golipour, Jiangchuan Li, Tuomo Raitio, Ramya Rasipuram, Francesco Rossi, Jennifer Shi, Jaimin Upadhyay, David Winarsky, Hepeng Zhang

Arabic Text-To-Speech (TTS) Data Preparation

Apr 07, 2022
Hala Al Masri, Muhy Eddin Za'ter
