Ke-Han Lu

SpeechCaps: Advancing Instruction-Based Universal Speech Models with Multi-Talker Speaking Style Captioning

Aug 25, 2024
Via arXiv

Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation

Jul 13, 2024
Via arXiv

Listen and Speak Fairly: A Study on Semantic Gender Bias in Speech Integrated Large Language Models

Jul 09, 2024
Via arXiv

DeSTA: Enhancing Speech Language Models through Descriptive Speech-Text Alignment

Jun 27, 2024
Via arXiv

Investigating Zero-Shot Generalizability on Mandarin-English Code-Switched ASR and Speech-to-text Translation of Recent Foundation Models with Self-Supervision and Weak Supervision

Dec 30, 2023
Via arXiv

HypR: A comprehensive study for ASR hypothesis revising with a reference corpus

Sep 19, 2023
Via arXiv

Dynamic-SUPERB: Towards A Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech

Sep 18, 2023
Via arXiv

A context-aware knowledge transferring strategy for CTC-based ASR

Oct 12, 2022
Via arXiv

A Transformer-based Cross-modal Fusion Model with Adversarial Training for VQA Challenge 2021

Jun 24, 2021
Via arXiv