"speech": models, code, and papers

MIMO-DBnet: Multi-channel Input and Multiple Outputs DOA-aware Beamforming Network for Speech Separation

Dec 07, 2022
Yanjie Fu, Haoran Yin, Meng Ge, Longbiao Wang, Gaoyan Zhang, Jianwu Dang, Chengyun Deng, Fei Wang

Via arXiv

THLNet: two-stage heterogeneous lightweight network for monaural speech enhancement

Jan 19, 2023
Feng Dang, Qi Hu, Pengyuan Zhang

Via arXiv

Automatic Tuning of Loss Trade-offs without Hyper-parameter Search in End-to-End Zero-Shot Speech Synthesis

May 26, 2023
Seongyeon Park, Bohyung Kim, Tae-hyun Oh

Via arXiv

SPACEx: Speech-driven Portrait Animation with Controllable Expression

Nov 17, 2022
Siddharth Gururani, Arun Mallya, Ting-Chun Wang, Rafael Valle, Ming-Yu Liu

Via arXiv

A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers

Apr 16, 2023
Juan Zuluaga-Gomez, Amrutha Prasad, Iuliia Nigmatulina, Petr Motlicek, Matthias Kleinert

Via arXiv

How "open" are the conversations with open-domain chatbots? A proposal for Speech Event based evaluation

Nov 24, 2022
A. Seza Doğruöz, Gabriel Skantze

Via arXiv

Cross-Modal Mutual Learning for Cued Speech Recognition

Dec 02, 2022
Lei Liu, Li Liu

Via arXiv

Neural Speech Phase Prediction based on Parallel Estimation Architecture and Anti-Wrapping Losses

Nov 29, 2022
Yang Ai, Zhen-Hua Ling

Via arXiv

A Study on the Integration of Pipeline and E2E SLU systems for Spoken Semantic Parsing toward STOP Quality Challenge

May 06, 2023
Siddhant Arora, Hayato Futami, Shih-Lun Wu, Jessica Huynh, Yifan Peng, Yosuke Kashiwagi, Emiru Tsunoo, Brian Yan, Shinji Watanabe

Via arXiv

SHINE: Syntax-augmented Hierarchical Interactive Encoder for Zero-shot Cross-lingual Information Extraction

May 21, 2023
Jun-Yu Ma, Jia-Chen Gu, Zhen-Hua Ling, Quan Liu, Cong Liu, Guoping Hu

Via arXiv