"speech": models, code, and papers

Adaptive Knowledge Distillation between Text and Speech Pre-trained Models

Mar 07, 2023
Jinjie Ni, Yukun Ma, Wen Wang, Qian Chen, Dianwen Ng, Han Lei, Trung Hieu Nguyen, Chong Zhang, Bin Ma, Erik Cambria

Flexible Keyword Spotting based on Homogeneous Audio-Text Embedding

Aug 12, 2023
Kumari Nishu, Minsik Cho, Paul Dixon, Devang Naik

Text-to-Video: a Two-stage Framework for Zero-shot Identity-agnostic Talking-head Generation

Aug 12, 2023
Zhichao Wang, Mengyu Dai, Keld Lundgaard

On Data Sampling Strategies for Training Neural Network Speech Separation Models

Apr 14, 2023
William Ravenscroft, Stefan Goetze, Thomas Hain

Multi-task learning of speech and speaker recognition

Feb 24, 2023
Nik Vaessen, David A. van Leeuwen

Dialogue Systems Can Generate Appropriate Responses without the Use of Question Marks? -- Investigation of the Effects of Question Marks on Dialogue Systems

Aug 07, 2023
Tomoya Mizumoto, Takato Yamazaki, Katsumasa Yoshikawa, Masaya Ohagi, Toshiki Kawamoto, Toshinori Sato

Towards Real-Time Single-Channel Speech Separation in Noisy and Reverberant Environments

Mar 14, 2023
Julian Neri, Sebastian Braun

SpeechFormer++: A Hierarchical Efficient Framework for Paralinguistic Speech Processing

Feb 27, 2023
Weidong Chen, Xiaofen Xing, Xiangmin Xu, Jianxin Pang, Lan Du

Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis

Apr 26, 2023
Ye-Xin Lu, Yang Ai, Zhen-Hua Ling

Learning Landmarks Motion from Speech for Speaker-Agnostic 3D Talking Heads Generation

Jun 02, 2023
Federico Nocentini, Claudio Ferrari, Stefano Berretti
