Hung-yi Lee

TiCo: Time-Controllable Training for Spoken Dialogue Models

Mar 23, 2026
Via arXiv

TaigiSpeech: A Low-Resource Real-World Speech Intent Dataset and Preliminary Results with Scalable Data Mining In-the-Wild

Mar 23, 2026
Via arXiv

The Binding Effect: Analyzing How Multi-Dimensional Cues Form Gender Bias in Instruction TTS

Mar 21, 2026
Via arXiv

How Auditory Knowledge in LLM Backbones Shapes Audio Language Models: A Holistic Evaluation

Mar 19, 2026
Via arXiv

Nudging Hidden States: Training-Free Model Steering for Chain-of-Thought Reasoning in Large Audio-Language Models

Mar 15, 2026
Via arXiv

Causal Tracing of Audio-Text Fusion in Large Audio Language Models

Mar 14, 2026
Via arXiv

TASTE-Streaming: Towards Streamable Text-Aligned Speech Tokenization and Embedding for Spoken Language Modeling

Mar 12, 2026
Via arXiv

MOS-Bias: From Hidden Gender Bias to Gender-Aware Speech Quality Assessment

Mar 11, 2026
Via arXiv

MUGEN: Evaluating and Improving Multi-audio Understanding of Large Audio-Language Models

Mar 10, 2026
Via arXiv

How Contrastive Decoding Enhances Large Audio Language Models?

Mar 10, 2026
Via arXiv