Dorien Herremans

Singapore University of Technology and Design

MIRFLEX: Music Information Retrieval Feature Library for Extraction

Nov 01, 2024

DART: Disentanglement of Accent and Speaker Representation in Multispeaker Text-to-Speech

Oct 17, 2024

Leveraging LLM Embeddings for Cross Dataset Label Alignment and Zero Shot Music Emotion Prediction

Oct 15, 2024

Prevailing Research Areas for Music AI in the Era of Foundation Models

Sep 14, 2024

PRESENT: Zero-Shot Text-to-Prosody Control

Aug 13, 2024

BandControlNet: Parallel Transformers-based Steerable Popular Music Generation with Fine-Grained Spatiotemporal Features

Jul 15, 2024

DisfluencySpeech -- Single-Speaker Conversational Speech Dataset with Paralanguage

Jun 13, 2024

Are we there yet? A brief survey of Music Emotion Prediction Datasets, Models and Outstanding Challenges

Jun 13, 2024

MidiCaps -- A large-scale MIDI dataset with text captions

Jun 04, 2024

Accent Conversion in Text-To-Speech Using Multi-Level VAE and Adversarial Training

Jun 03, 2024