Chih-Kai Yang

How Auditory Knowledge in LLM Backbones Shapes Audio Language Models: A Holistic Evaluation

Mar 19, 2026

Nudging Hidden States: Training-Free Model Steering for Chain-of-Thought Reasoning in Large Audio-Language Models

Mar 15, 2026

MUGEN: Evaluating and Improving Multi-audio Understanding of Large Audio-Language Models

Mar 10, 2026

DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment

Jul 03, 2025

AudioLens: A Closer Look at Auditory Attribute Perception of Large Audio-Language Models

Jun 05, 2025

Analyzing Mitigation Strategies for Catastrophic Forgetting in End-to-End Training of Spoken Language Models

May 23, 2025

Towards Holistic Evaluation of Large Audio-Language Models: A Comprehensive Survey

May 21, 2025

SAKURA: On the Multi-hop Reasoning of Large Audio-Language Models Based on Speech and Audio Information

May 19, 2025

Building a Taiwanese Mandarin Spoken Language Model: A First Attempt

Nov 11, 2024

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Nov 08, 2024