"speech": models, code, and papers

Unsupervised feature learning for speech using correspondence and Siamese networks

Mar 28, 2020
Petri-Johan Last, Herman A. Engelbrecht, Herman Kamper

Via arXiv

Towards localisation of keywords in speech using weak supervision

Dec 14, 2020
Kayode Olaleye, Benjamin van Niekerk, Herman Kamper

Via arXiv

Recurrent Neural Network Transducer for Audio-Visual Speech Recognition

Nov 08, 2019
Takaki Makino, Hank Liao, Yannis Assael, Brendan Shillingford, Basilio Garcia, Otavio Braga, Olivier Siohan

Via arXiv

Speech enhancement with variational autoencoders and alpha-stable distributions

Feb 08, 2019
Simon Leglaive, Umut Simsekli, Antoine Liutkus, Laurent Girin, Radu Horaud

Via arXiv

Deep Iterative Phase Retrieval for Ptychography

Feb 17, 2022
Simon Welker, Tal Peer, Henry N. Chapman, Timo Gerkmann

Via arXiv

A Hierarchical Model for Spoken Language Recognition

Jan 04, 2022
Luciana Ferrer, Diego Castan, Mitchell McLaren, Aaron Lawson

Via arXiv

AISHELL-1: An Open-Source Mandarin Speech Corpus and A Speech Recognition Baseline

Sep 16, 2017
Hui Bu, Jiayu Du, Xingyu Na, Bengu Wu, Hao Zheng

Via arXiv

End-to-End Multi-speaker Speech Recognition with Transformer

Feb 13, 2020
Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe

Via arXiv

TalkTive: A Conversational Agent Using Backchannels to Engage Older Adults in Neurocognitive Disorders Screening

Feb 16, 2022
Zijian Ding, Jiawen Kang, Tinky Oi Ting HO, Ka Ho Wong, Helene H. Fung, Helen Meng, Xiaojuan Ma

Via arXiv

WaveNODE: A Continuous Normalizing Flow for Speech Synthesis

Jun 09, 2020
Hyeongju Kim, Hyeonseung Lee, Woo Hyun Kang, Sung Jun Cheon, Byoung Jin Choi, Nam Soo Kim

Via arXiv