Shih-Lun Wu

Stemphonic: All-at-once Flexible Multi-stem Music Generation

Feb 10, 2026
Via arXiv

MIDI-LLM: Adapting Large Language Models for Text-to-MIDI Music Generation

Nov 06, 2025
Via arXiv

Hookpad Aria: A Copilot for Songwriters

Feb 12, 2025
Via arXiv

Foundation Models for Music: A Survey

Aug 27, 2024
Via arXiv

Audio Prompt Adapter: Unleashing Music Editing Abilities for Text-to-Music with Lightweight Finetuning

Jul 24, 2024
Via arXiv

Music ControlNet: Multiple Time-varying Controls for Music Generation

Nov 13, 2023
Via arXiv

Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation

Sep 29, 2023
Via arXiv

Listener Model for the PhotoBook Referential Game with CLIPScores as Implicit Reference Chain

Jun 16, 2023
Via arXiv

Tensor decomposition for minimization of E2E SLU model toward on-device processing

Jun 02, 2023
Via arXiv

The Pipeline System of ASR and NLU with MLM-based Data Augmentation toward STOP Low-resource Challenge

May 11, 2023
Via arXiv