"speech": models, code, and papers

Phase Aware Speech Enhancement using Realisation of Complex-valued LSTM

Oct 27, 2020
Raktim Gautam Goswami, Sivaganesh Andhavarapu, K Sri Rama Murty

Generalized RNN beamformer for target speech separation

Jan 04, 2021
Yong Xu, Zhuohuang Zhang, Meng Yu, Shi-Xiong Zhang, Lianwu Chen, Dong Yu

ATCSpeechNet: A multilingual end-to-end speech recognition framework for air traffic control systems

Feb 17, 2021
Yi Lin, Bo Yang, Linchao Li, Dongyue Guo, Jianwei Zhang, Hu Chen, Yi Zhang

Unsupervised Speech Decomposition via Triple Information Bottleneck

Apr 29, 2020
Kaizhi Qian, Yang Zhang, Shiyu Chang, David Cox, Mark Hasegawa-Johnson

The Geometry of Multilingual Language Model Representations

May 22, 2022
Tyler A. Chang, Zhuowen Tu, Benjamin K. Bergen

Perceptimatic: A human speech perception benchmark for unsupervised subword modelling

Oct 12, 2020
Juliette Millet, Ewan Dunbar

Text-to-Speech Synthesis Techniques for MIDI-to-Audio Synthesis

Apr 28, 2021
Erica Cooper, Xin Wang, Junichi Yamagishi

Using Affect as a Communication Modality to Improve Human-Robot Communication in Robot-Assisted Search and Rescue Scenarios

Aug 20, 2022
Sami Alperen Akgun, Moojan Ghafurian, Mark Crowley, Kerstin Dautenhahn

Maximum Phase Modeling for Sparse Linear Prediction of Speech

Jun 07, 2020
Thomas Drugman

WeNet: Production First and Production Ready End-to-End Speech Recognition Toolkit

Feb 02, 2021
Binbin Zhang, Di Wu, Chao Yang, Xiaoyu Chen, Zhendong Peng, Xiangming Wang, Zhuoyuan Yao, Xiong Wang, Fan Yu, Lei Xie, Xin Lei
