"speech": models, code, and papers

Transformer-based Cascaded Multimodal Speech Translation

Oct 29, 2019
Zixiu Wu, Ozan Caglayan, Julia Ive, Josiah Wang, Lucia Specia

Via arXiv

Kurdish (Sorani) Speech to Text: Presenting an Experimental Dataset

Dec 02, 2019
Akam Qader, Hossein Hassani

Via arXiv

Unsupervised vs. transfer learning for multimodal one-shot matching of speech and images

Aug 14, 2020
Leanne Nortje, Herman Kamper

Via arXiv

Sequence-To-Sequence Voice Conversion using F0 and Time Conditioning and Adversarial Learning

Oct 07, 2021
Frederik Bous, Laurent Benaroya, Nicolas Obin, Axel Roebel

Via arXiv

Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders

Oct 25, 2019
Andy T. Liu, Shu-wen Yang, Po-Han Chi, Po-chun Hsu, Hung-yi Lee

Via arXiv

Lite Audio-Visual Speech Enhancement

May 24, 2020
Shang-Yi Chuang, Yu Tsao, Chen-Chou Lo, Hsin-Min Wang

Via arXiv

LSTM-LM with Long-Term History for First-Pass Decoding in Conversational Speech Recognition

Oct 21, 2020
Xie Chen, Sarangarajan Parthasarathy, William Gale, Shuangyu Chang, Michael Zeng

Via arXiv

Delta Keyword Transformer: Bringing Transformers to the Edge through Dynamically Pruned Multi-Head Self-Attention

Mar 20, 2022
Zuzana Jelčicová, Marian Verhelst

Via arXiv

Self-supervised curriculum learning for speaker verification

Apr 05, 2022
Hee-Soo Heo, Jee-weon Jung, Jingu Kang, Youngki Kwon, You Jin Kim, Bong-Jin Lee, Joon Son Chung

Via arXiv

Is 42 the Answer to Everything in Subtitling-oriented Speech Translation?

Jun 01, 2020
Alina Karakanta, Matteo Negri, Marco Turchi

Via arXiv