Berrak Sisman

Can Emotion Fool Anti-spoofing?

May 29, 2025
Via arXiv

EmotionRankCLAP: Bridging Natural Language Speaking Styles and Ordinal Speech Emotion via Rank-N-Contrast

May 29, 2025
Via arXiv

Towards Emotionally Consistent Text-Based Speech Editing: Introducing EmoCorrector and The ECD-TSE Dataset

May 24, 2025
Via arXiv

DART: Disentanglement of Accent and Speaker Representation in Multispeaker Text-to-Speech

Oct 17, 2024
Via arXiv

Discrete Unit based Masking for Improving Disentanglement in Voice Conversion

Sep 17, 2024
Via arXiv

SelectTTS: Synthesizing Anyone's Voice via Discrete Unit-Based Frame Selection

Aug 30, 2024
Via arXiv

PRESENT: Zero-Shot Text-to-Prosody Control

Aug 13, 2024
Via arXiv

We Need Variations in Speech Synthesis: Sub-center Modelling for Speaker Embeddings

Jul 05, 2024
Via arXiv

Towards Naturalistic Voice Conversion: NaturalVoices Dataset with an Automatic Processing Pipeline

Jun 06, 2024
Via arXiv

Style Mixture of Experts for Expressive Text-To-Speech Synthesis

Jun 05, 2024
Via arXiv