"speech": models, code, and papers

Learning Speaker Embedding from Text-to-Speech

Oct 21, 2020
Jaejin Cho, Piotr Zelasko, Jesus Villalba, Shinji Watanabe, Najim Dehak

Via arXiv

Simplified Self-Attention for Transformer-based End-to-End Speech Recognition

May 21, 2020
Haoneng Luo, Shiliang Zhang, Ming Lei, Lei Xie

Via arXiv

Advanced Rich Transcription System for Estonian Speech

Jan 11, 2019
Tanel Alumäe, Ottokar Tilk, Asadullah

Via arXiv

Towards Interpretable and Transferable Speech Emotion Recognition: Latent Representation Based Analysis of Features, Methods and Corpora

May 05, 2021
Sneha Das, Nicole Nadine Lønfeldt, Anne Katrine Pagsberg, Line H. Clemmensen

Via arXiv

End-to-End Speech-Translation with Knowledge Distillation: FBK@IWSLT2020

Jun 04, 2020
Marco Gaido, Mattia Antonino Di Gangi, Matteo Negri, Marco Turchi

Via arXiv

Streaming parallel transducer beam search with fast-slow cascaded encoders

Mar 29, 2022
Jay Mahadeokar, Yangyang Shi, Ke Li, Duc Le, Jiedan Zhu, Vikas Chandra, Ozlem Kalinli, Michael L Seltzer

Via arXiv

BiosecurID: a multimodal biometric database

Nov 02, 2021
Julian Fierrez, Javier Galbally, Javier Ortega-Garcia, Manuel R Freire, Fernando Alonso-Fernandez, Daniel Ramos, Doroteo Torre Toledano, Joaquin Gonzalez-Rodriguez, Juan A Siguenza, Javier Garrido-Salas, E Anguiano, Guillermo Gonzalez-de-Rivera, Ricardo Ribalda, Marcos Faundez-Zanuy, JA Ortega, Valentín Cardeñoso-Payo, A Viloria, Carlos E Vivaracho, Q Isaac Moro, Juan J Igarza, J Sanchez, Inmaculada Hernaez, Carlos Orrite-Urunuela, Francisco Martinez-Contreras, Juan José Gracia-Roche

Via arXiv

Data Processing for Optimizing Naturalness of Vietnamese Text-to-speech System

Apr 20, 2020
Viet Lam Phung, Phan Huy Kinh, Anh Tuan Dinh, Quoc Bao Nguyen

Via arXiv

Large-Scale Visual Speech Recognition

Oct 01, 2018
Brendan Shillingford, Yannis Assael, Matthew W. Hoffman, Thomas Paine, Cían Hughes, Utsav Prabhu, Hank Liao, Hasim Sak, Kanishka Rao, Lorrayne Bennett, Marie Mulville, Ben Coppin, Ben Laurie, Andrew Senior, Nando de Freitas

Via arXiv

Slangvolution: A Causal Analysis of Semantic Change and Frequency Dynamics in Slang

Mar 09, 2022
Daphna Keidar, Andreas Opedal, Zhijing Jin, Mrinmaya Sachan

Via arXiv