Korin Richmond

CSTR (Centre for Speech Technology Research)

Segmentation-Variant Codebooks for Preservation of Paralinguistic and Prosodic Information

May 21, 2025

Pairwise Evaluation of Accent Similarity in Speech Synthesis

May 20, 2025

Revisiting Acoustic Similarity in Emotional Speech and Music via Self-Supervised Representations

Sep 26, 2024

Cross-lingual Speech Emotion Recognition: Humans vs. Self-Supervised Models

Sep 25, 2024

Acquiring Pronunciation Knowledge from Transcribed Speech Audio via Multi-task Learning

Sep 15, 2024

AccentBox: Towards High-Fidelity Zero-Shot Accent Generation

Sep 13, 2024

An Initial Investigation of Language Adaptation for TTS Systems under Low-resource Scenarios

Jun 13, 2024

ZMM-TTS: Zero-shot Multilingual and Multispeaker Speech Synthesis Conditioned on Self-supervised Discrete Speech Representations

Dec 22, 2023

Predicting pairwise preferences between TTS audio stimuli using parallel ratings data and anti-symmetric twin neural networks

Sep 22, 2022

Automatic audiovisual synchronisation for ultrasound tongue imaging

May 31, 2021