Picture for Hsin-Min Wang

Hsin-Min Wang

Towards Robust Assessment of Pathological Voices via Combined Low-Level Descriptors and Foundation Model Representations

Add code
May 30, 2025
Viaarxiv icon

Towards Robust Automated Perceptual Voice Quality Assessment with Speech Foundation Models

Add code
May 28, 2025
Viaarxiv icon

MSECG: Incorporating Mamba for Robust and Efficient ECG Super-Resolution

Add code
Dec 06, 2024
Viaarxiv icon

How Good is ChatGPT at Audiovisual Deepfake Detection: A Comparative Study of ChatGPT, AI Models and Human Perception

Add code
Nov 14, 2024
Figure 1 for How Good is ChatGPT at Audiovisual Deepfake Detection: A Comparative Study of ChatGPT, AI Models and Human Perception
Figure 2 for How Good is ChatGPT at Audiovisual Deepfake Detection: A Comparative Study of ChatGPT, AI Models and Human Perception
Figure 3 for How Good is ChatGPT at Audiovisual Deepfake Detection: A Comparative Study of ChatGPT, AI Models and Human Perception
Figure 4 for How Good is ChatGPT at Audiovisual Deepfake Detection: A Comparative Study of ChatGPT, AI Models and Human Perception
Viaarxiv icon

Understanding Audiovisual Deepfake Detection: Techniques, Challenges, Human Factors and Perceptual Insights

Add code
Nov 12, 2024
Figure 1 for Understanding Audiovisual Deepfake Detection: Techniques, Challenges, Human Factors and Perceptual Insights
Figure 2 for Understanding Audiovisual Deepfake Detection: Techniques, Challenges, Human Factors and Perceptual Insights
Figure 3 for Understanding Audiovisual Deepfake Detection: Techniques, Challenges, Human Factors and Perceptual Insights
Figure 4 for Understanding Audiovisual Deepfake Detection: Techniques, Challenges, Human Factors and Perceptual Insights
Viaarxiv icon

Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing

Add code
Sep 22, 2024
Figure 1 for Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing
Figure 2 for Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing
Figure 3 for Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing
Figure 4 for Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing
Viaarxiv icon

A Study on Zero-shot Non-intrusive Speech Assessment using Large Language Models

Add code
Sep 16, 2024
Viaarxiv icon

Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement

Add code
Sep 16, 2024
Figure 1 for Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement
Figure 2 for Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement
Figure 3 for Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement
Figure 4 for Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement
Viaarxiv icon

Exploring the Impact of Data Quantity on ASR in Extremely Low-resource Languages

Add code
Sep 13, 2024
Viaarxiv icon

The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction

Add code
Sep 11, 2024
Figure 1 for The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction
Figure 2 for The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction
Figure 3 for The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction
Figure 4 for The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction
Viaarxiv icon