"speech": models, code, and papers

Machine Perceptual Quality: Evaluating the Impact of Severe Lossy Compression on Audio and Image Models

Jan 15, 2024
Dan Jacobellis, Daniel Cummings, Neeraja J. Yadwadkar

Accent-VITS: accent transfer for end-to-end TTS

Dec 29, 2023
Linhan Ma, Yongmao Zhang, Xinfa Zhu, Yi Lei, Ziqian Ning, Pengcheng Zhu, Lei Xie

Chain of Generation: Multi-Modal Gesture Synthesis via Cascaded Conditional Control

Dec 26, 2023
Zunnan Xu, Yachao Zhang, Sicheng Yang, Ronghui Li, Xiu Li

Multi-objective Non-intrusive Hearing-aid Speech Assessment Model

Nov 15, 2023
Hsin-Tien Chiang, Szu-Wei Fu, Hsin-Min Wang, Yu Tsao, John H. L. Hansen

How does end-to-end speech recognition training impact speech enhancement artifacts?

Nov 20, 2023
Kazuma Iwamoto, Tsubasa Ochiai, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri

Synthetic Data Generation Techniques for Developing AI-based Speech Assessments for Parkinson's Disease (A Comparative Study)

Dec 04, 2023
Mahboobeh Parsapoor

Sparsity-Driven EEG Channel Selection for Brain-Assisted Speech Enhancement

Nov 23, 2023
Jie Zhang, Qing-Tian Xu, Zhen-Hua Ling

Singer Identity Representation Learning using Self-Supervised Techniques

Jan 10, 2024
Bernardo Torres, Stefan Lattner, Gaël Richard

MuTox: Universal MUltilingual Audio-based TOXicity Dataset and Zero-shot Detector

Jan 10, 2024
Marta R. Costa-jussà, Mariano Coria Meglioli, Pierre Andrews, David Dale, Prangthip Hansanti, Elahe Kalbassi, Alex Mourachko, Christophe Ropers, Carleigh Wood

MossFormer2: Combining Transformer and RNN-Free Recurrent Network for Enhanced Time-Domain Monaural Speech Separation

Dec 19, 2023
Shengkui Zhao, Yukun Ma, Chongjia Ni, Chong Zhang, Hao Wang, Trung Hieu Nguyen, Kun Zhou, Jiaqi Yip, Dianwen Ng, Bin Ma
