Alert button
Picture for Hsin-Min Wang

Hsin-Min Wang

Alert button

SpeechCLIP+: Self-supervised multi-task representation learning for speech via CLIP and speech-image data

Feb 10, 2024
Hsuan-Fu Wang, Yi-Jen Shih, Heng-Jui Chang, Layne Berry, Puyuan Peng, Hung-yi Lee, Hsin-Min Wang, David Harwath

Viaarxiv icon

HAAQI-Net: A non-intrusive neural music quality assessment model for hearing aids

Jan 02, 2024
Dyah A. M. G. Wisnu, Epri Pratiwi, Stefano Rini, Ryandhimas E. Zezario, Hsin-Min Wang, Yu Tsao

Viaarxiv icon

LC4SV: A Denoising Framework Learning to Compensate for Unseen Speaker Verification Models

Nov 28, 2023
Chi-Chang Lee, Hong-Wei Chen, Chu-Song Chen, Hsin-Min Wang, Tsung-Te Liu, Yu Tsao

Figure 1 for LC4SV: A Denoising Framework Learning to Compensate for Unseen Speaker Verification Models
Figure 2 for LC4SV: A Denoising Framework Learning to Compensate for Unseen Speaker Verification Models
Figure 3 for LC4SV: A Denoising Framework Learning to Compensate for Unseen Speaker Verification Models
Figure 4 for LC4SV: A Denoising Framework Learning to Compensate for Unseen Speaker Verification Models
Viaarxiv icon

D4AM: A General Denoising Framework for Downstream Acoustic Models

Nov 28, 2023
Chi-Chang Lee, Yu Tsao, Hsin-Min Wang, Chu-Song Chen

Viaarxiv icon

Multi-objective Non-intrusive Hearing-aid Speech Assessment Model

Nov 15, 2023
Hsin-Tien Chiang, Szu-Wei Fu, Hsin-Min Wang, Yu Tsao, John H. L. Hansen

Figure 1 for Multi-objective Non-intrusive Hearing-aid Speech Assessment Model
Figure 2 for Multi-objective Non-intrusive Hearing-aid Speech Assessment Model
Figure 3 for Multi-objective Non-intrusive Hearing-aid Speech Assessment Model
Figure 4 for Multi-objective Non-intrusive Hearing-aid Speech Assessment Model
Viaarxiv icon

AV-Lip-Sync+: Leveraging AV-HuBERT to Exploit Multimodal Inconsistency for Video Deepfake Detection

Nov 05, 2023
Sahibzada Adil Shahzad, Ammarah Hashmi, Yan-Tsung Peng, Yu Tsao, Hsin-Min Wang

Viaarxiv icon

AVTENet: Audio-Visual Transformer-based Ensemble Network Exploiting Multiple Experts for Video Deepfake Detection

Oct 19, 2023
Ammarah Hashmi, Sahibzada Adil Shahzad, Chia-Wen Lin, Yu Tsao, Hsin-Min Wang

Viaarxiv icon

The VoiceMOS Challenge 2023: Zero-shot Subjective Speech Quality Prediction for Multiple Domains

Oct 07, 2023
Erica Cooper, Wen-Chin Huang, Yu Tsao, Hsin-Min Wang, Tomoki Toda, Junichi Yamagishi

Figure 1 for The VoiceMOS Challenge 2023: Zero-shot Subjective Speech Quality Prediction for Multiple Domains
Figure 2 for The VoiceMOS Challenge 2023: Zero-shot Subjective Speech Quality Prediction for Multiple Domains
Figure 3 for The VoiceMOS Challenge 2023: Zero-shot Subjective Speech Quality Prediction for Multiple Domains
Figure 4 for The VoiceMOS Challenge 2023: Zero-shot Subjective Speech Quality Prediction for Multiple Domains
Viaarxiv icon

A Study on Incorporating Whisper for Robust Speech Assessment

Sep 22, 2023
Ryandhimas E. Zezario, Yu-Wen Chen, Yu Tsao, Szu-Wei Fu, Hsin-Min Wang, Chiou-Shann Fuh

Figure 1 for A Study on Incorporating Whisper for Robust Speech Assessment
Figure 2 for A Study on Incorporating Whisper for Robust Speech Assessment
Figure 3 for A Study on Incorporating Whisper for Robust Speech Assessment
Figure 4 for A Study on Incorporating Whisper for Robust Speech Assessment
Viaarxiv icon