"speech": models, code, and papers

PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-to-Speech Using Natural Language Descriptions

Sep 15, 2023
Reo Shimizu, Ryuichi Yamamoto, Masaya Kawamura, Yuma Shirahata, Hironori Doi, Tatsuya Komatsu, Kentaro Tachibana

Towards an Interpretable Representation of Speaker Identity via Perceptual Voice Qualities

Oct 04, 2023
Robin Netzorg, Bohan Yu, Andrea Guzman, Peter Wu, Luna McNulty, Gopala Anumanchipalli

Written and spoken corpus of real and fake social media postings about COVID-19

Oct 06, 2023
Ng Bee Chin, Ng Zhi Ee Nicole, Kyla Kwan, Lee Yong Han Dylann, Liu Fang, Xu Hong

Federated Representation Learning for Automatic Speech Recognition

Aug 07, 2023
Guruprasad V Ramesh, Gopinath Chennupati, Milind Rao, Anit Kumar Sahu, Ariya Rastrow, Jasha Droppo

A Fused Deep Denoising Sound Coding Strategy for Bilateral Cochlear Implants

Oct 02, 2023
Tom Gajecki, Waldo Nogueira

Many-to-Many Spoken Language Translation via Unified Speech and Text Representation Learning with Unit-to-Unit Translation

Aug 03, 2023
Minsu Kim, Jeongsoo Choi, Dahun Kim, Yong Man Ro

PuoBERTa: Training and evaluation of a curated language model for Setswana

Oct 24, 2023
Vukosi Marivate, Moseli Mots'Oehli, Valencia Wagner, Richard Lastrucci, Isheanesu Dzingirai

Detecting Deepfakes Without Seeing Any

Nov 02, 2023
Tal Reiss, Bar Cavia, Yedid Hoshen

M3-AUDIODEC: Multi-channel multi-speaker multi-spatial audio codec

Sep 23, 2023
Anton Ratnarajah, Shi-Xiong Zhang, Dong Yu

End-to-End Evaluation for Low-Latency Simultaneous Speech Translation

Aug 07, 2023
Christian Huber, Tu Anh Dinh, Carlos Mullov, Ngoc Quan Pham, Thai Binh Nguyen, Fabian Retkowski, Stefan Constantin, Enes Yavuz Ugan, Danni Liu, Zhaolin Li, Sai Koneru, Jan Niehues, Alexander Waibel
