"speech": models, code, and papers

Improved Speech Reconstruction from Silent Video

Aug 29, 2017
Ariel Ephrat, Tavi Halperin, Shmuel Peleg

Via arXiv

A Machine Learning Approach to Detect Suicidal Ideation in US Veterans Based on Acoustic and Linguistic Features of Speech

Sep 14, 2020
Vaibhav Sourirajan, Anas Belouali, Mary Ann Dutton, Matthew Reinhard, Jyotishman Pathak

Via arXiv

FBWave: Efficient and Scalable Neural Vocoders for Streaming Text-To-Speech on the Edge

Nov 25, 2020
Bichen Wu, Qing He, Peizhao Zhang, Thilo Koehler, Kurt Keutzer, Peter Vajda

Via arXiv

Improving the fusion of acoustic and text representations in RNN-T

Jan 25, 2022
Chao Zhang, Bo Li, Zhiyun Lu, Tara N. Sainath, Shuo-yiin Chang

Via arXiv

Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning

Jul 24, 2019
Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Zhifeng Chen, RJ Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran

Via arXiv

Pretraining Strategies, Waveform Model Choice, and Acoustic Configurations for Multi-Speaker End-to-End Speech Synthesis

Nov 10, 2020
Erica Cooper, Xin Wang, Yi Zhao, Yusuke Yasuda, Junichi Yamagishi

Via arXiv

GC-TTS: Few-shot Speaker Adaptation with Geometric Constraints

Aug 16, 2021
Ji-Hoon Kim, Sang-Hoon Lee, Ji-Hyun Lee, Hong-Gyu Jung, Seong-Whan Lee

Via arXiv

Beyond $L_p$ clipping: Equalization-based Psychoacoustic Attacks against ASRs

Oct 25, 2021
Hadi Abdullah, Muhammad Sajidur Rahman, Christian Peeters, Cassidy Gibson, Washington Garcia, Vincent Bindschaedler, Thomas Shrimpton, Patrick Traynor

Via arXiv

The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes

Jun 08, 2020
Douwe Kiela, Hamed Firooz, Aravind Mohan, Vedanuj Goswami, Amanpreet Singh, Pratik Ringshia, Davide Testuggine

Via arXiv

Neural Simultaneous Speech Translation Using Alignment-Based Chunking

May 29, 2020
Patrick Wilken, Tamer Alkhouli, Evgeny Matusov, Pavel Golik

Via arXiv