"speech": models, code, and papers

End-to-End Multi-speaker Speech Recognition with Transformer

Feb 13, 2020
Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux, Shinji Watanabe

wav2letter++: The Fastest Open-source Speech Recognition System

Dec 18, 2018
Vineel Pratap, Awni Hannun, Qiantong Xu, Jeff Cai, Jacob Kahn, Gabriel Synnaeve, Vitaliy Liptchinsky, Ronan Collobert

WaveNODE: A Continuous Normalizing Flow for Speech Synthesis

Jun 09, 2020
Hyeongju Kim, Hyeonseung Lee, Woo Hyun Kang, Sung Jun Cheon, Byoung Jin Choi, Nam Soo Kim

Deep Iterative Phase Retrieval for Ptychography

Feb 17, 2022
Simon Welker, Tal Peer, Henry N. Chapman, Timo Gerkmann

A Hierarchical Model for Spoken Language Recognition

Jan 04, 2022
Luciana Ferrer, Diego Castan, Mitchell McLaren, Aaron Lawson

End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation

Nov 26, 2019
Yi Luo, Zhuo Chen, Nima Mesgarani, Takuya Yoshioka

Parallel Composition of Weighted Finite-State Transducers

Oct 06, 2021
Shubho Sengupta, Vineel Pratap, Awni Hannun

TalkTive: A Conversational Agent Using Backchannels to Engage Older Adults in Neurocognitive Disorders Screening

Feb 16, 2022
Zijian Ding, Jiawen Kang, Tinky Oi Ting HO, Ka Ho Wong, Helene H. Fung, Helen Meng, Xiaojuan Ma

The Zero Resource Speech Challenge 2019: TTS without T

Apr 25, 2019
Ewan Dunbar, Robin Algayres, Julien Karadayi, Mathieu Bernard, Juan Benjumea, Xuan-Nga Cao, Lucie Miskic, Charlotte Dugrain, Lucas Ondel, Alan W. Black, Laurent Besacier, Sakriani Sakti, Emmanuel Dupoux
