"speech": models, code, and papers

Sparsely Shared LoRA on Whisper for Child Speech Recognition

Sep 21, 2023
Wei Liu, Ying Qin, Zhiyuan Peng, Tan Lee

Speech-Gesture GAN: Gesture Generation for Robots and Embodied Agents

Sep 17, 2023
Carson Yu Liu, Gelareh Mohammadi, Yang Song, Wafa Johal

Fast Word Error Rate Estimation Using Self-Supervised Representations For Speech And Text

Oct 12, 2023
Chanho Park, Chengsong Lu, Mingjie Chen, Thomas Hain

Influence Scores at Scale for Efficient Language Data Sampling

Nov 27, 2023
Nikhil Anand, Joshua Tan, Maria Minakova

GRASS: Unified Generation Model for Speech-to-Semantic Tasks

Sep 11, 2023
Aobo Xia, Shuyu Lei, Yushu Yang, Xiang Guo, Hua Chai

Enhancing Code-switching Speech Recognition with Interactive Language Biases

Sep 29, 2023
Hexin Liu, Leibny Paola Garcia, Xiangyu Zhang, Andy W. H. Khong, Sanjeev Khudanpur

FaceDiffuser: Speech-Driven 3D Facial Animation Synthesis Using Diffusion

Sep 20, 2023
Stefan Stan, Kazi Injamamul Haque, Zerrin Yumak

DiffPoseTalk: Speech-Driven Stylistic 3D Facial Animation and Head Pose Generation via Diffusion Models

Sep 30, 2023
Zhiyao Sun, Tian Lv, Sheng Ye, Matthieu Gaetan Lin, Jenny Sheng, Yu-Hui Wen, Minjing Yu, Yong-jin Liu

Unsupervised Accent Adaptation Through Masked Language Model Correction Of Discrete Self-Supervised Speech Units

Sep 25, 2023
Jakob Poncelet, Hugo Van hamme

Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity Vocoder

Nov 25, 2023
Yicheng Gu, Xueyao Zhang, Liumeng Xue, Zhizheng Wu
