"speech": models, code, and papers

Using Rater and System Metadata to Explain Variance in the VoiceMOS Challenge 2022 Dataset

Sep 14, 2022
Michael Chinen, Jan Skoglund, Chandan K A Reddy, Alessandro Ragano, Andrew Hines

Via arXiv

Textless Speech Emotion Conversion using Decomposed and Discrete Representations

Nov 14, 2021
Felix Kreuk, Adam Polyak, Jade Copet, Eugene Kharitonov, Tu-Anh Nguyen, Morgane Rivière, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux, Yossi Adi

Via arXiv

Visual Speech Recognition for Multiple Languages in the Wild

Feb 26, 2022
Pingchuan Ma, Stavros Petridis, Maja Pantic

Via arXiv

Improving performance of real-time full-band blind packet-loss concealment with predictive network

Nov 10, 2022
Viet-Anh Nguyen, Anh H. T. Nguyen, Andy W. H. Khong

Via arXiv

Improving Fast-slow Encoder based Transducer with Streaming Deliberation

Dec 15, 2022
Ke Li, Jay Mahadeokar, Jinxi Guo, Yangyang Shi, Gil Keren, Ozlem Kalinli, Michael L. Seltzer, Duc Le

Via arXiv

BD-SHS: A Benchmark Dataset for Learning to Detect Online Bangla Hate Speech in Different Social Contexts

Jun 01, 2022
Nauros Romim, Mosahed Ahmed, Md. Saiful Islam, Arnab Sen Sharma, Hriteshwar Talukder, Mohammad Ruhul Amin

Via arXiv

Comparing Supervised Models And Learned Speech Representations For Classifying Intelligibility Of Disordered Speech On Selected Phrases

Jul 08, 2021
Subhashini Venugopalan, Joel Shor, Manoj Plakal, Jimmy Tobin, Katrin Tomanek, Jordan R. Green, Michael P. Brenner

Via arXiv

Does Simultaneous Speech Translation need Simultaneous Models?

Apr 08, 2022
Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

Via arXiv

Streaming Punctuation for Long-form Dictation with Transformers

Oct 11, 2022
Piyush Behre, Sharman Tan, Padma Varadharajan, Shuangyu Chang

Via arXiv

DAL: Feature Learning from Overt Speech to Decode Imagined Speech-based EEG Signals with Convolutional Autoencoder

Jul 15, 2021
Dae-Hyeok Lee, Sung-Jin Kim, Seong-Whan Lee

Via arXiv