Chao-Han Huck Yang

From Descriptive Richness to Bias: Unveiling the Dark Side of Generative Image Caption Enrichment

Jun 20, 2024
Via arXiv

Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models

May 23, 2024

An Investigation of Incorporating Mamba for Speech Enhancement

May 10, 2024

Bayesian Example Selection Improves In-Context Learning for Speech, Text, and Visual Modalities

Apr 23, 2024

GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators

Feb 10, 2024

It's Never Too Late: Fusing Acoustic Information into Large Language Models for Automatic Speech Recognition

Feb 08, 2024

Investigating Training Strategies and Model Robustness of Low-Rank Adaptation for Language Modeling in Speech Recognition

Jan 19, 2024

Large Language Models are Efficient Learners of Noise-Robust Speech Recognition

Jan 19, 2024

Paralinguistics-Enhanced Large Language Modeling of Spoken Dialogue

Jan 17, 2024

Multimodal Attention Merging for Improved Speech Recognition and Audio Event Classification

Dec 22, 2023