Ke-Han Lu

Reducing Object Hallucination in Large Audio-Language Models via Audio-Aware Decoding

Jun 08, 2025
Via arXiv

Speech-IFEval: Evaluating Instruction-Following and Quantifying Catastrophic Forgetting in Speech-Aware Language Models

May 25, 2025
Via arXiv

Analyzing Mitigation Strategies for Catastrophic Forgetting in End-to-End Training of Spoken Language Models

May 23, 2025
Via arXiv

Building a Taiwanese Mandarin Spoken Language Model: A First Attempt

Nov 11, 2024
Via arXiv

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Nov 08, 2024
Via arXiv

Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data

Sep 30, 2024
Via arXiv

Codec-SUPERB @ SLT 2024: A lightweight benchmark for neural audio codec models

Sep 21, 2024
Via arXiv

SpeechCaps: Advancing Instruction-Based Universal Speech Models with Multi-Talker Speaking Style Captioning

Aug 25, 2024
Via arXiv

Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation

Jul 13, 2024
Via arXiv

Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models

Jul 09, 2024
Via arXiv