Picture for Ilia Kulikov

Ilia Kulikov

Textless Acoustic Model with Self-Supervised Distillation for Noise-Robust Expressive Speech-to-Speech Translation

Add code
Jun 04, 2024
Viaarxiv icon

An Empirical Study of Speech Language Models for Prompt-Conditioned Speech Synthesis

Mar 19, 2024
Figure 1 for An Empirical Study of Speech Language Models for Prompt-Conditioned Speech Synthesis
Figure 2 for An Empirical Study of Speech Language Models for Prompt-Conditioned Speech Synthesis
Figure 3 for An Empirical Study of Speech Language Models for Prompt-Conditioned Speech Synthesis
Figure 4 for An Empirical Study of Speech Language Models for Prompt-Conditioned Speech Synthesis
Viaarxiv icon

MSLM-S2ST: A Multitask Speech Language Model for Textless Speech-to-Speech Translation with Speaker Style Preservation

Mar 19, 2024
Figure 1 for MSLM-S2ST: A Multitask Speech Language Model for Textless Speech-to-Speech Translation with Speaker Style Preservation
Figure 2 for MSLM-S2ST: A Multitask Speech Language Model for Textless Speech-to-Speech Translation with Speaker Style Preservation
Figure 3 for MSLM-S2ST: A Multitask Speech Language Model for Textless Speech-to-Speech Translation with Speaker Style Preservation
Figure 4 for MSLM-S2ST: A Multitask Speech Language Model for Textless Speech-to-Speech Translation with Speaker Style Preservation
Viaarxiv icon

Seamless: Multilingual Expressive and Streaming Speech Translation

Add code
Dec 08, 2023
Figure 1 for Seamless: Multilingual Expressive and Streaming Speech Translation
Figure 2 for Seamless: Multilingual Expressive and Streaming Speech Translation
Figure 3 for Seamless: Multilingual Expressive and Streaming Speech Translation
Figure 4 for Seamless: Multilingual Expressive and Streaming Speech Translation
Viaarxiv icon

Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction

Add code
Oct 04, 2023
Viaarxiv icon

SeamlessM4T-Massively Multilingual & Multimodal Machine Translation

Add code
Aug 23, 2023
Figure 1 for SeamlessM4T-Massively Multilingual & Multimodal Machine Translation
Figure 2 for SeamlessM4T-Massively Multilingual & Multimodal Machine Translation
Figure 3 for SeamlessM4T-Massively Multilingual & Multimodal Machine Translation
Figure 4 for SeamlessM4T-Massively Multilingual & Multimodal Machine Translation
Viaarxiv icon

UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units

Add code
Dec 15, 2022
Figure 1 for UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units
Figure 2 for UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units
Figure 3 for UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units
Figure 4 for UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units
Viaarxiv icon

Improving Speech-to-Speech Translation Through Unlabeled Text

Add code
Oct 26, 2022
Figure 1 for Improving Speech-to-Speech Translation Through Unlabeled Text
Figure 2 for Improving Speech-to-Speech Translation Through Unlabeled Text
Figure 3 for Improving Speech-to-Speech Translation Through Unlabeled Text
Figure 4 for Improving Speech-to-Speech Translation Through Unlabeled Text
Viaarxiv icon

Simple and Effective Unsupervised Speech Translation

Add code
Oct 18, 2022
Figure 1 for Simple and Effective Unsupervised Speech Translation
Figure 2 for Simple and Effective Unsupervised Speech Translation
Figure 3 for Simple and Effective Unsupervised Speech Translation
Figure 4 for Simple and Effective Unsupervised Speech Translation
Viaarxiv icon

Uncertainty Determines the Adequacy of the Mode and the Tractability of Decoding in Sequence-to-Sequence Models

Apr 01, 2022
Figure 1 for Uncertainty Determines the Adequacy of the Mode and the Tractability of Decoding in Sequence-to-Sequence Models
Figure 2 for Uncertainty Determines the Adequacy of the Mode and the Tractability of Decoding in Sequence-to-Sequence Models
Figure 3 for Uncertainty Determines the Adequacy of the Mode and the Tractability of Decoding in Sequence-to-Sequence Models
Figure 4 for Uncertainty Determines the Adequacy of the Mode and the Tractability of Decoding in Sequence-to-Sequence Models
Viaarxiv icon