Wei-Ping Huang

Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation

Jul 13, 2024

Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech

Jun 16, 2024

Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models

Jun 12, 2024

Maximizing Data Efficiency for Cross-Lingual TTS Adaptation by Self-Supervised Representation Mixing and Embedding Initialization

Jan 23, 2024

Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond

Oct 09, 2023

Why We Should Report the Details in Subjective Evaluation of TTS More Rigorously

Jun 03, 2023

On the Utility of Self-supervised Models for Prosody-related Tasks

Oct 13, 2022

Few-Shot Cross-Lingual TTS Using Transferable Phoneme Embedding

Jun 27, 2022