"speech": models, code, and papers

Learning Speech-driven 3D Conversational Gestures from Video

Feb 13, 2021
Ikhsanul Habibie, Weipeng Xu, Dushyant Mehta, Lingjie Liu, Hans-Peter Seidel, Gerard Pons-Moll, Mohamed Elgharib, Christian Theobalt

Via arXiv

AVQVC: One-shot Voice Conversion by Vector Quantization with applying contrastive learning

Feb 21, 2022
Huaizhen Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao

Via arXiv

Improved Processing of Ultrasound Tongue Videos by Combining ConvLSTM and 3D Convolutional Networks

Jun 26, 2022
Amin Honarmandi Shandiz, Laszlo Toth

Via arXiv

Incremental Text-to-Speech Synthesis Using Pseudo Lookahead with Large Pretrained Language Model

Dec 23, 2020
Takaaki Saeki, Shinnosuke Takamichi, Hiroshi Saruwatari

Via arXiv

Training end-to-end speech-to-text models on mobile phones

Dec 07, 2021
Zitha S, Raghavendra Rao Suresh, Pooja Rao, T. V. Prabhakar

Via arXiv

Two Streams and Two Resolution Spectrograms Model for End-to-end Automatic Speech Recognition

Aug 18, 2021
Jin Li, Xurong Xie, Nan Yan, Lan Wang

Via arXiv

Public Wisdom Matters! Discourse-Aware Hyperbolic Fourier Co-Attention for Social-Text Classification

Sep 15, 2022
Karish Grover, S. M. Phaneendra Angara, Md. Shad Akhtar, Tanmoy Chakraborty

Via arXiv

Unsupervised Speech Segmentation and Variable Rate Representation Learning using Segmental Contrastive Predictive Coding

Oct 08, 2021
Saurabhchand Bhati, Jesús Villalba, Piotr Żelasko, Laureano Moro-Velazquez, Najim Dehak

Via arXiv

KOLD: Korean Offensive Language Dataset

May 23, 2022
Younghoon Jeong, Juhyun Oh, Jaimeen Ahn, Jongwon Lee, Jihyung Moon, Sungjoon Park, Alice Oh

Via arXiv

Consonant-Vowel Transition Models Based on Deep Learning for Objective Evaluation of Articulation

Mar 18, 2022
Vikram C. Mathad, Julie M. Liss, Kathy Chapman, Nancy Scherer, Visar Berisha

Via arXiv