"speech": models, code, and papers

Quantitative phase and absorption contrast imaging

Mar 23, 2022
Miguel Moscoso, Alexei Novikov, George Papanicolaou, Chrysoula Tsogka

A Context-Aware Feature Fusion Framework for Punctuation Restoration

Mar 23, 2022
Yangjun Wu, Kebin Fang, Yao Zhao

Speech Enhancement with Zero-Shot Model Selection

Dec 17, 2020
Ryandhimas E. Zezario, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao

FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition

May 09, 2021
Yichong Leng, Xu Tan, Linchen Zhu, Jin Xu, Renqian Luo, Linquan Liu, Tao Qin, Xiang-Yang Li, Ed Lin, Tie-Yan Liu

Deep Learning based Multi-Source Localization with Source Splitting and its Effectiveness in Multi-Talker Speech Recognition

Feb 16, 2021
Aswin Shanmugam Subramanian, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu

SVSNet: An End-to-end Speaker Voice Similarity Assessment Model

Jul 20, 2021
Cheng-Hung Hu, Yu-Huai Peng, Junichi Yamagishi, Yu Tsao, Hsin-Min Wang

Computational bioacoustics with deep learning: a review and roadmap

Dec 13, 2021
Dan Stowell

Rhythm Zone Theory: Speech Rhythms are Physical after all

Mar 12, 2019
Dafydd Gibbon, Xuewei Lin

Detecting Emotion Carriers by Combining Acoustic and Lexical Representations

Dec 13, 2021
Sebastian P. Bayerl, Aniruddha Tammewar, Korbinian Riedhammer, Giuseppe Riccardi

Does Audio Deepfake Detection Generalize?

Mar 31, 2022
Nicolas M. Müller, Pavel Czempin, Franziska Dieckmann, Adam Froghyar, Konstantin Böttinger
