"speech": models, code, and papers

ZR-2021VG: Zero-Resource Speech Challenge, Visually-Grounded Language Modelling track, 2021 edition

Jul 14, 2021
Afra Alishahi, Grzegorz Chrupała, Alejandrina Cristia, Emmanuel Dupoux, Bertrand Higy, Marvin Lavechin, Okko Räsänen, Chen Yu

Stacked Acoustic-and-Textual Encoding: Integrating the Pre-trained Models into Speech Translation Encoders

May 12, 2021
Chen Xu, Bojie Hu, Yanyang Li, Yuhao Zhang, Shen Huang, Qi Ju, Tong Xiao, Jingbo Zhu

Gender Bias in Word Embeddings: A Comprehensive Analysis of Frequency, Syntax, and Semantics

Jun 07, 2022
Aylin Caliskan, Pimparkar Parth Ajay, Tessa Charlesworth, Robert Wolfe, Mahzarin R. Banaji

MFFCN: Multi-layer Feature Fusion Convolution Network for Audio-visual Speech Enhancement

Feb 04, 2021
Xinmeng Xu, Yang Wang, Dongxiang Xu, Yiyuan Peng, Cong Zhang, Jie Jia, Yang Wang, Binbin Chen

Machine Learning based COVID-19 Detection from Smartphone Recordings: Cough, Breath and Speech

Apr 12, 2021
Madhurananda Pahar, Thomas Niesler

FlexLip: A Controllable Text-to-Lip System

Jun 07, 2022
Dan Oneata, Beata Lorincz, Adriana Stan, Horia Cucu

Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training

Mar 02, 2022
Ramon Sanabria, Wei-Ning Hsu, Alexei Baevski, Michael Auli

Consistent Training and Decoding For End-to-end Speech Recognition Using Lattice-free MMI

Dec 05, 2021
Jinchuan Tian, Jianwei Yu, Chao Weng, Shi-Xiong Zhang, Dan Su, Dong Yu, Yuexian Zou

Model Blending for Text Classification

Aug 05, 2022
Ramit Pahwa

Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation

Apr 22, 2022
Detai Xin, Shinnosuke Takamichi, Takuma Okamoto, Hisashi Kawai, Hiroshi Saruwatari
