"speech": models, code, and papers

LC4SV: A Denoising Framework Learning to Compensate for Unseen Speaker Verification Models

Nov 28, 2023
Chi-Chang Lee, Hong-Wei Chen, Chu-Song Chen, Hsin-Min Wang, Tsung-Te Liu, Yu Tsao

Via arXiv

Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency

Nov 05, 2023
Sungho Jeon, Ching-Feng Yeh, Hakan Inan, Wei-Ning Hsu, Rashi Rungta, Yashar Mehdad, Daniel Bikel

Via arXiv

Self Generated Wargame AI: Double Layer Agent Task Planning Based on Large Language Model

Dec 02, 2023
Y. Sun, C. Yu, J. Zhao, W. Wang, X. Zhou

Via arXiv

A Multiscale Autoencoder (MSAE) Framework for End-to-End Neural Network Speech Enhancement

Sep 21, 2023
Bengt J. Borgstrom, Michael S. Brandstein

Via arXiv

FluentEditor: Text-based Speech Editing by Considering Acoustic and Prosody Consistency

Sep 22, 2023
Rui Liu, Jiatian Xi, Ziyue Jiang, Haizhou Li

Via arXiv

Acoustic and linguistic representations for speech continuous emotion recognition in call center conversations

Oct 06, 2023
Manon Macary, Marie Tahon, Yannick Estève, Daniel Luzzati

Via arXiv

Learning from Flawed Data: Weakly Supervised Automatic Speech Recognition

Sep 26, 2023
Dongji Gao, Hainan Xu, Desh Raj, Leibny Paola Garcia Perera, Daniel Povey, Sanjeev Khudanpur

Via arXiv

Average Token Delay: A Duration-aware Latency Metric for Simultaneous Translation

Nov 27, 2023
Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura

Via arXiv

Overview of the VLSP 2022 -- Abmusu Shared Task: A Data Challenge for Vietnamese Abstractive Multi-document Summarization

Nov 27, 2023
Mai-Vu Tran, Hoang-Quynh Le, Duy-Cat Can, Quoc-An Nguyen

Via arXiv

Hate speech detection in algerian dialect using deep learning

Sep 20, 2023
Dihia Lanasri, Juan Olano, Sifal Klioui, Sin Liang Lee, Lamia Sekkai

Via arXiv