Chao-Han Huck Yang

Random-Matrix-Induced Simplicity Bias in Over-parameterized Variational Quantum Circuits

Jan 05, 2026

Long Grounded Thoughts: Distilling Compositional Visual Reasoning Chains at Scale

Nov 07, 2025

Bias in Gender Bias Benchmarks: How Spurious Features Distort Evaluation

Sep 09, 2025

WoW-Bench: Evaluating Fine-Grained Acoustic Perception in Audio-Language Models via Marine Mammal Vocalizations

Aug 28, 2025

Test-Time Scaling Strategies for Generative Retrieval in Multimodal Conversational Recommendations

Aug 25, 2025

DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment

Jul 03, 2025

Multi-Domain Audio Question Answering Toward Acoustic Content Reasoning in The DCASE 2025 Challenge

May 12, 2025

Plan2Align: Predictive Planning Based Test-Time Preference Alignment in Paragraph-Level Machine Translation

Feb 28, 2025

ESPnet-SpeechLM: An Open Speech Language Model Toolkit

Feb 21, 2025

Audio Large Language Models Can Be Descriptive Speech Quality Evaluators

Jan 27, 2025