Picture for Pavel Denisov

Pavel Denisov

Teaching a Multilingual Large Language Model to Understand Multilingual Speech via Multi-Instructional Training

Add code
Apr 16, 2024
Figure 1 for Teaching a Multilingual Large Language Model to Understand Multilingual Speech via Multi-Instructional Training
Figure 2 for Teaching a Multilingual Large Language Model to Understand Multilingual Speech via Multi-Instructional Training
Figure 3 for Teaching a Multilingual Large Language Model to Understand Multilingual Speech via Multi-Instructional Training
Figure 4 for Teaching a Multilingual Large Language Model to Understand Multilingual Speech via Multi-Instructional Training
Viaarxiv icon

The IMS Toucan System for the Blizzard Challenge 2023

Add code
Oct 26, 2023
Viaarxiv icon

Leveraging Multilingual Self-Supervised Pretrained Models for Sequence-to-Sequence End-to-End Spoken Language Understanding

Add code
Oct 09, 2023
Viaarxiv icon

Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study

Add code
Sep 27, 2023
Figure 1 for Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study
Figure 2 for Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study
Figure 3 for Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study
Figure 4 for Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study
Viaarxiv icon

Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy

Add code
Oct 20, 2022
Figure 1 for Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy
Figure 2 for Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy
Figure 3 for Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy
Figure 4 for Anonymizing Speech with Generative Adversarial Networks to Preserve Speaker Privacy
Viaarxiv icon

Speaker Anonymization with Phonetic Intermediate Representations

Add code
Jul 11, 2022
Figure 1 for Speaker Anonymization with Phonetic Intermediate Representations
Figure 2 for Speaker Anonymization with Phonetic Intermediate Representations
Figure 3 for Speaker Anonymization with Phonetic Intermediate Representations
Figure 4 for Speaker Anonymization with Phonetic Intermediate Representations
Viaarxiv icon

ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet

Add code
Nov 29, 2021
Figure 1 for ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet
Figure 2 for ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet
Figure 3 for ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet
Figure 4 for ESPnet-SLU: Advancing Spoken Language Understanding through ESPnet
Viaarxiv icon

Investigations on Speech Recognition Systems for Low-Resource Dialectal Arabic-English Code-Switching Speech

Add code
Aug 29, 2021
Figure 1 for Investigations on Speech Recognition Systems for Low-Resource Dialectal Arabic-English Code-Switching Speech
Figure 2 for Investigations on Speech Recognition Systems for Low-Resource Dialectal Arabic-English Code-Switching Speech
Figure 3 for Investigations on Speech Recognition Systems for Low-Resource Dialectal Arabic-English Code-Switching Speech
Figure 4 for Investigations on Speech Recognition Systems for Low-Resource Dialectal Arabic-English Code-Switching Speech
Viaarxiv icon

IMS' Systems for the IWSLT 2021 Low-Resource Speech Translation Task

Add code
Jun 30, 2021
Figure 1 for IMS' Systems for the IWSLT 2021 Low-Resource Speech Translation Task
Figure 2 for IMS' Systems for the IWSLT 2021 Low-Resource Speech Translation Task
Figure 3 for IMS' Systems for the IWSLT 2021 Low-Resource Speech Translation Task
Figure 4 for IMS' Systems for the IWSLT 2021 Low-Resource Speech Translation Task
Viaarxiv icon

Pretrained Semantic Speech Embeddings for End-to-End Spoken Language Understanding via Cross-Modal Teacher-Student Learning

Add code
Jul 03, 2020
Figure 1 for Pretrained Semantic Speech Embeddings for End-to-End Spoken Language Understanding via Cross-Modal Teacher-Student Learning
Figure 2 for Pretrained Semantic Speech Embeddings for End-to-End Spoken Language Understanding via Cross-Modal Teacher-Student Learning
Figure 3 for Pretrained Semantic Speech Embeddings for End-to-End Spoken Language Understanding via Cross-Modal Teacher-Student Learning
Figure 4 for Pretrained Semantic Speech Embeddings for End-to-End Spoken Language Understanding via Cross-Modal Teacher-Student Learning
Viaarxiv icon