Yu-Wen Chen

From Who Said What to Who They Are: Modular Training-free Identity-Aware LLM Refinement of Speaker Diarization

Sep 18, 2025
Via arXiv

Read to Hear: A Zero-Shot Pronunciation Assessment Using Textual Descriptions and LLMs

Sep 17, 2025
Via arXiv

Bridging the Gap: Integrating Pre-trained Speech Enhancement and Recognition Models for Robust Speech Recognition

Jun 18, 2024
Via arXiv

Exploring Robustness in Doctor-Patient Conversation Summarization: An Analysis of Out-of-Domain SOAP Notes

Jun 05, 2024
Via arXiv

A Study on Incorporating Whisper for Robust Speech Assessment

Sep 22, 2023
Via arXiv

Noise robust speech emotion recognition with signal-to-noise ratio adapting speech enhancement

Sep 03, 2023
Via arXiv

MultiPA: a multi-task speech pronunciation assessment system for a closed and open response scenario

Aug 24, 2023
Via arXiv

Investigation of Factorized Optical Flows as Mid-Level Representations

Mar 10, 2022
Via arXiv

InQSS: a speech intelligibility assessment model using a multi-task learning network

Nov 04, 2021
Via arXiv

The AS-NU System for the M2VoC Challenge

Apr 07, 2021
Via arXiv