"speech": models, code, and papers

Burst2Vec: An Adversarial Multi-Task Approach for Predicting Emotion, Age, and Origin from Vocal Bursts

Jun 24, 2022
Atijit Anuchitanukul, Lucia Specia

Via arXiv

Extending Text-to-Speech Synthesis with Articulatory Movement Prediction using Ultrasound Tongue Imaging

Jul 12, 2021
Tamás Gábor Csapó

Via arXiv

Learning to Count Words in Fluent Speech enables Online Speech Recognition

Jun 11, 2020
George Sterpu, Christian Saam, Naomi Harte

Via arXiv

Bi-LSTM Scoring Based Similarity Measurement with Agglomerative Hierarchical Clustering (AHC) for Speaker Diarization

May 19, 2022
Siddharth S. Nijhawan, Homayoon Beigi

Via arXiv

Multiclass ASMA vs Targeted PGD Attack in Image Segmentation

Aug 03, 2022
Johnson Vo, Jiabao Xie, Sahil Patel

Via arXiv

Conditional independence for pretext task selection in Self-supervised speech representation learning

Apr 15, 2021
Salah Zaiem, Titouan Parcollet, Slim Essid

Via arXiv

Understanding the Tradeoffs in Client-Side Privacy for Speech Recognition

Jan 22, 2021
Peter Wu, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency

Via arXiv

Augmenting Part-of-speech Tagging with Syntactic Information for Vietnamese and Chinese

Feb 24, 2021
Duc-Vu Nguyen, Kiet Van Nguyen, Ngan Luu-Thuy Nguyen

Via arXiv

Multilingual Speech Recognition using Knowledge Transfer across Learning Processes

Oct 15, 2021
Rimita Lahiri, Kenichi Kumatani, Eric Sun, Yao Qian

Via arXiv

Nonlinear Vectorial Prediction with Neural Nets

Apr 04, 2022
Marcos Faundez-Zanuy

Via arXiv