"speech": models, code, and papers

EmphAssess: a Prosodic Benchmark on Assessing Emphasis Transfer in Speech-to-Speech Models

Dec 21, 2023
Maureen de Seyssel, Antony D'Avirro, Adina Williams, Emmanuel Dupoux

DenseFormer: Enhancing Information Flow in Transformers via Depth Weighted Averaging

Feb 04, 2024
Matteo Pagliardini, Amirkeivan Mohtashami, Francois Fleuret, Martin Jaggi

Multichannel blind speech source separation with a disjoint constraint source model

Jan 03, 2024
Jianyu Wang, Shanzheng Guan

Comparison of parameters of vowel sounds of Russian and English languages

Jan 26, 2024
V. I. Fedoseev, A. A. Konev, A. Yu. Yakimuk

Freetalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness

Jan 07, 2024
Sicheng Yang, Zunnan Xu, Haiwei Xue, Yongkang Cheng, Shaoli Huang, Mingming Gong, Zhiyong Wu

Absolute convergence and error thresholds in non-active adaptive sampling

Feb 04, 2024
Manuel Vilares Ferro, Victor M. Darriba Bilbao, Jesús Vilares Ferro

Enhance Reasoning for Large Language Models in the Game Werewolf

Feb 04, 2024
Shuang Wu, Liwen Zhu, Tao Yang, Shiwei Xu, Qiang Fu, Yang Wei, Haobo Fu

Decoupled Spatial and Temporal Processing for Resource Efficient Multichannel Speech Enhancement

Jan 15, 2024
Ashutosh Pandey, Buye Xu

Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel Optimization

Jan 13, 2024
A F M Saif, Xiaodong Cui, Han Shen, Songtao Lu, Brian Kingsbury, Tianyi Chen

Hyperbolic Distance-Based Speech Separation

Jan 07, 2024
Darius Petermann, Minje Kim