Picture for Xutai Ma

Xutai Ma

MMM: Multi-Layer Multi-Residual Multi-Stream Discrete Speech Representation from Self-supervised Learning Model

Add code
Jun 14, 2024
Viaarxiv icon

Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation

Add code
Jun 06, 2024
Figure 1 for Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation
Figure 2 for Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation
Figure 3 for Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation
Figure 4 for Evaluating the IWSLT2023 Speech Translation Tasks: Human Annotations, Automatic Metrics, and Segmentation
Viaarxiv icon

Seamless: Multilingual Expressive and Streaming Speech Translation

Add code
Dec 08, 2023
Figure 1 for Seamless: Multilingual Expressive and Streaming Speech Translation
Figure 2 for Seamless: Multilingual Expressive and Streaming Speech Translation
Figure 3 for Seamless: Multilingual Expressive and Streaming Speech Translation
Figure 4 for Seamless: Multilingual Expressive and Streaming Speech Translation
Viaarxiv icon

Efficient Monotonic Multihead Attention

Add code
Dec 07, 2023
Figure 1 for Efficient Monotonic Multihead Attention
Figure 2 for Efficient Monotonic Multihead Attention
Figure 3 for Efficient Monotonic Multihead Attention
Figure 4 for Efficient Monotonic Multihead Attention
Viaarxiv icon

Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction

Add code
Oct 04, 2023
Viaarxiv icon

SeamlessM4T-Massively Multilingual & Multimodal Machine Translation

Add code
Aug 23, 2023
Figure 1 for SeamlessM4T-Massively Multilingual & Multimodal Machine Translation
Figure 2 for SeamlessM4T-Massively Multilingual & Multimodal Machine Translation
Figure 3 for SeamlessM4T-Massively Multilingual & Multimodal Machine Translation
Figure 4 for SeamlessM4T-Massively Multilingual & Multimodal Machine Translation
Viaarxiv icon

Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks

Add code
May 04, 2023
Figure 1 for Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks
Figure 2 for Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks
Figure 3 for Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks
Figure 4 for Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks
Viaarxiv icon

Direct simultaneous speech to speech translation

Add code
Oct 15, 2021
Figure 1 for Direct simultaneous speech to speech translation
Figure 2 for Direct simultaneous speech to speech translation
Viaarxiv icon

Incremental Speech Synthesis For Speech-To-Speech Translation

Add code
Oct 15, 2021
Figure 1 for Incremental Speech Synthesis For Speech-To-Speech Translation
Figure 2 for Incremental Speech Synthesis For Speech-To-Speech Translation
Figure 3 for Incremental Speech Synthesis For Speech-To-Speech Translation
Figure 4 for Incremental Speech Synthesis For Speech-To-Speech Translation
Viaarxiv icon

Direct speech-to-speech translation with discrete units

Add code
Jul 12, 2021
Figure 1 for Direct speech-to-speech translation with discrete units
Figure 2 for Direct speech-to-speech translation with discrete units
Figure 3 for Direct speech-to-speech translation with discrete units
Figure 4 for Direct speech-to-speech translation with discrete units
Viaarxiv icon