Berrak Sisman

We Need Variations in Speech Synthesis: Sub-center Modelling for Speaker Embeddings

Jul 05, 2024

Towards Naturalistic Voice Conversion: NaturalVoices Dataset with an Automatic Processing Pipeline

Jun 06, 2024

Style Mixture of Experts for Expressive Text-To-Speech Synthesis

Jun 05, 2024

Accent Conversion in Text-To-Speech Using Multi-Level VAE and Adversarial Training

Jun 03, 2024

Exploring speech style spaces with language models: Emotional TTS without emotion labels

May 18, 2024

Converting Anyone's Voice: End-to-End Expressive Voice Conversion with a Conditional Diffusion Model

May 02, 2024

emoDARTS: Joint Optimisation of CNN & Sequential Neural Network Architectures for Superior Speech Emotion Recognition

Mar 21, 2024

Revealing Emotional Clusters in Speaker Embeddings: A Contrastive Learning Strategy for Speech Emotion Recognition

Jan 19, 2024

High-Quality Automatic Voice Over with Accurate Alignment: Supervision through Self-Supervised Discrete Speech Units

Jun 29, 2023

Improving Speech Emotion Recognition Performance using Differentiable Architecture Search

May 23, 2023