
"speech": models, code, and papers

SpeechYOLO: Detection and Localization of Speech Objects

Apr 14, 2019
Yael Segal, Tzeviya Sylvia Fuchs, Joseph Keshet


PSO-Convolutional Neural Networks with Heterogeneous Learning Rate

May 20, 2022
Nguyen Huu Phong, Augusto Santos, Bernardete Ribeiro


Complex Cepstrum-based Decomposition of Speech for Glottal Source Estimation

Dec 29, 2019
Thomas Drugman, Baris Bozkurt, Thierry Dutoit


Emotion Intensity and its Control for Emotional Voice Conversion

Jan 10, 2022
Kun Zhou, Berrak Sisman, Rajib Rana, Björn W. Schuller, Haizhou Li


The Vicomtech Spoofing-Aware Biometric System for the SASV Challenge

Apr 04, 2022
Juan M. Martín-Doñas, Iván G. Torre, Aitor Álvarez, Joaquin Arellano


A Comparative Study on Transformer vs RNN in Speech Applications

Sep 28, 2019
Shigeki Karita, Nanxin Chen, Tomoki Hayashi, Takaaki Hori, Hirofumi Inaguma, Ziyan Jiang, Masao Someki, Nelson Enrique Yalta Soplin, Ryuichi Yamamoto, Xiaofei Wang, Shinji Watanabe, Takenori Yoshimura, Wangyou Zhang


Shallow Fusion of Weighted Finite-State Transducer and Language Model for Text Normalization

Mar 29, 2022
Evelina Bakhturina, Yang Zhang, Boris Ginsburg


tPLCnet: Real-time Deep Packet Loss Concealment in the Time Domain Using a Short Temporal Context

Apr 04, 2022
Nils L. Westhausen, Bernd T. Meyer


Low-Latency Sequence-to-Sequence Speech Recognition and Translation by Partial Hypothesis Selection

May 22, 2020
Danni Liu, Gerasimos Spanakis, Jan Niehues


Speech Prediction in Silent Videos using Variational Autoencoders

Nov 14, 2020
Ravindra Yadav, Ashish Sardana, Vinay P Namboodiri, Rajesh M Hegde
