"speech": models, code, and papers

Cross-Lingual Machine Speech Chain for Javanese, Sundanese, Balinese, and Bataks Speech Recognition and Synthesis

Nov 04, 2020
Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

Deformable TDNN with adaptive receptive fields for speech recognition

Apr 30, 2021
Keyu An, Yi Zhang, Zhijian Ou

Semantic Communication Systems for Speech Transmission

Feb 24, 2021
Zhenzi Weng, Zhijin Qin

Audio Input Generates Continuous Frames to Synthesize Facial Video Using Generative Adversarial Networks

Jul 18, 2022
Hanhaodi Zhang

LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech

Apr 23, 2021
Solène Evain, Ha Nguyen, Hang Le, Marcely Zanon Boito, Salima Mdhaffar, Sina Alisamir, Ziyi Tong, Natalia Tomashenko, Marco Dinarelli, Titouan Parcollet, Alexandre Allauzen, Yannick Estève, Benjamin Lecouteux, François Portet, Solange Rossato, Fabien Ringeval, Didier Schwab, Laurent Besacier

DeepHate: Hate Speech Detection via Multi-Faceted Text Representations

Mar 14, 2021
Rui Cao, Roy Ka-Wei Lee, Tuan-Anh Hoang

Star DGT: a Robust Gabor Transform for Speech Denoising

Apr 29, 2021
Vasiliki Kouni, Holger Rauhut

Speak Like a Dog: Human to Non-human creature Voice Conversion

Jun 09, 2022
Kohei Suzuki, Shoki Sakamoto, Tadahiro Taniguchi, Hirokazu Kameoka

Lombard Effect for Bilingual Speakers in Cantonese and English: importance of spectro-temporal features

Apr 14, 2022
Maximilian Karl Scharf, Sabine Hochmuth, Lena L. N. Wong, Birger Kollmeier, Anna Warzybok

IMS' Systems for the IWSLT 2021 Low-Resource Speech Translation Task

Jun 30, 2021
Pavel Denisov, Manuel Mager, Ngoc Thang Vu
