Hye-Jin Shim

Uni-VERSA: Versatile Speech Assessment with a Unified Network

May 27, 2025

Jiatong Shi, Hye-Jin Shim, Shinji Watanabe

Abstract: Subjective listening tests remain the gold standard for speech quality assessment, but they are costly, variable, and difficult to scale. In contrast, existing objective metrics, such as PESQ, F0 correlation, and DNSMOS, typically capture only specific aspects of speech quality. To address these limitations, we introduce Uni-VERSA, a unified network that simultaneously predicts various objective metrics, encompassing naturalness, intelligibility, speaker characteristics, prosody, and noise, for a comprehensive evaluation of speech signals. We formalize its framework, evaluation protocol, and applications in speech enhancement, synthesis, and quality control. A benchmark based on the URGENT24 challenge, along with a baseline leveraging self-supervised representations, demonstrates that Uni-VERSA provides a viable alternative to single-aspect evaluation methods. Moreover, it aligns closely with human perception, making it a promising approach for future speech quality assessment.

* Accepted by Interspeech

Replay spoofing detection system for automatic speaker verification using multi-task learning of noise classes

Oct 25, 2018

Hye-Jin Shim, Jee-weon Jung, Hee-Soo Heo, Sunghyun Yoon, Ha-Jin Yu

Abstract: In this paper, we propose a replay attack spoofing detection system for automatic speaker verification using multi-task learning of noise classes. We define the noise caused by a replay attack as replay noise. We explore the effectiveness of training a deep neural network simultaneously for replay attack spoofing detection and replay noise classification. The multi-task learning classifies the noise of playback devices, recording environments, and recording devices in addition to performing spoofing detection. Each of the three types of noise classes also includes a genuine class. Experimental results on the ASVspoof2017 datasets show that the proposed system achieves a 30% relative improvement on the evaluation set.

* 5 pages, accepted by Technologies and Applications of Artificial Intelligence (TAAI)
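
The multi-task objective described in the abstract above can be illustrated with a minimal sketch. It assumes one classification head per task (spoofing detection plus the three replay noise types), a cross-entropy loss per head, and an equally weighted sum of the four losses. The head names, class counts, logit values, and equal weighting are hypothetical choices for illustration; the abstract does not specify the architecture or the loss weighting.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability assigned to the target class index."""
    return -math.log(softmax(logits)[target])


def multitask_loss(head_logits, targets, weights=None):
    """Weighted sum of per-head cross-entropy losses.

    Equal weights are an assumption made for this sketch.
    """
    if weights is None:
        weights = {name: 1.0 for name in head_logits}
    return sum(
        weights[name] * cross_entropy(head_logits[name], targets[name])
        for name in head_logits
    )


# Hypothetical outputs of four heads for one genuine utterance.
# In each noise head, class 0 stands for "genuine", mirroring the
# abstract's note that every noise type also includes a genuine class.
head_logits = {
    "spoof": [2.0, -1.0],                    # genuine vs. replay
    "playback_device": [1.5, 0.2, -0.4],     # class count is illustrative
    "recording_environment": [1.0, 0.3, 0.1],
    "recording_device": [0.8, -0.2, 0.0],
}
targets = {name: 0 for name in head_logits}  # genuine in every task

loss = multitask_loss(head_logits, targets)
print(f"combined multi-task loss: {loss:.4f}")
```

In a trained system the four heads would share a common network body, so gradients from the noise classification tasks also shape the features used for spoofing detection.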
