"speech": models, code, and papers

Predictive Neural Speech Coding

Jul 18, 2022
Xue Jiang, Xiulian Peng, Huaying Xue, Yuan Zhang, Yan Lu

Via arXiv

A Planning-Based Explainable Collaborative Dialogue System

Mar 02, 2023
Philip R. Cohen, Lucian Galescu

Via arXiv

Investigating Content-Aware Neural Text-To-Speech MOS Prediction Using Prosodic and Linguistic Features

Nov 01, 2022
Alexandra Vioni, Georgia Maniati, Nikolaos Ellinas, June Sig Sung, Inchul Hwang, Aimilios Chalamandaris, Pirros Tsiakoulis

Via arXiv

Learning utterance-level representations through token-level acoustic latents prediction for Expressive Speech Synthesis

Nov 01, 2022
Karolos Nikitaras, Konstantinos Klapsas, Nikolaos Ellinas, Georgia Maniati, June Sig Sung, Inchul Hwang, Spyros Raptis, Aimilios Chalamandaris, Pirros Tsiakoulis

Via arXiv

Applying wav2vec2 for Speech Recognition on Bengali Common Voices Dataset

Sep 11, 2022
H. A. Z. Sameen Shahgir, Khondker Salman Sayeed, Tanjeem Azwad Zaman

Via arXiv

Automated detection of pronunciation errors in non-native English speech employing deep learning

Sep 13, 2022
Daniel Korzekwa

Via arXiv

OLISIA: a Cascade System for Spoken Dialogue State Tracking

Apr 20, 2023
Léo Jacqmin, Lucas Druart, Valentin Vielzeuf, Lina Maria Rojas-Barahona, Yannick Estève, Benoît Favre

Via arXiv

Application of Knowledge Distillation to Multi-task Speech Representation Learning

Oct 29, 2022
Mine Kerpicci, Van Nguyen, Shuhua Zhang, Erik Visser

Via arXiv

Multi-View Attention Transfer for Efficient Speech Enhancement

Aug 22, 2022
Wooseok Shin, Hyun Joon Park, Jin Sob Kim, Byung Hoon Lee, Sung Won Han

Via arXiv

Leveraging Pseudo-labeled Data to Improve Direct Speech-to-Speech Translation

May 18, 2022
Qianqian Dong, Fengpeng Yue, Tom Ko, Mingxuan Wang, Qibing Bai, Yu Zhang

Via arXiv