Wei-Ping Huang

How Contrastive Decoding Enhances Large Audio Language Models?

Mar 10, 2026

DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment

Jul 03, 2025

Speech-FT: A Fine-tuning Strategy for Enhancing Speech Representation Models Without Compromising Generalization Ability

Feb 18, 2025

Building a Taiwanese Mandarin Spoken Language Model: A First Attempt

Nov 11, 2024

Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation

Jul 13, 2024

Continual Test-time Adaptation for End-to-end Speech Recognition on Noisy Speech

Jun 16, 2024

Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models

Jun 12, 2024

Maximizing Data Efficiency for Cross-Lingual TTS Adaptation by Self-Supervised Representation Mixing and Embedding Initialization

Jan 23, 2024

Findings of the 2023 ML-SUPERB Challenge: Pre-Training and Evaluation over More Languages and Beyond

Oct 09, 2023

Why We Should Report the Details in Subjective Evaluation of TTS More Rigorously

Jun 03, 2023