"speech": models, code, and papers

Direction of Arrival Estimation of Noisy Speech Using Convolutional Recurrent Neural Networks with Higher-Order Ambisonics Signals

Mar 01, 2021
Nils Poschadel, Robert Hupke, Stephan Preihs, Jürgen Peissig

Via arXiv

Enrollment-less training for personalized voice activity detection

Jun 23, 2021
Naoki Makishima, Mana Ihori, Tomohiro Tanaka, Akihiko Takashima, Shota Orihashi, Ryo Masumura

Via arXiv

Deep Factorization for Speech Signal

Jun 25, 2017
Dong Wang, Lantian Li, Ying Shi, Yixiang Chen, Zhiyuan Tang

Via arXiv

Mixture factorized auto-encoder for unsupervised hierarchical deep factorization of speech signal

Oct 30, 2019
Zhiyuan Peng, Siyuan Feng, Tan Lee

Via arXiv

WaveNODE: A Continuous Normalizing Flow for Speech Synthesis

Jul 02, 2020
Hyeongju Kim, Hyeonseung Lee, Woo Hyun Kang, Sung Jun Cheon, Byoung Jin Choi, Nam Soo Kim

Via arXiv

GPT-D: Inducing Dementia-related Linguistic Anomalies by Deliberate Degradation of Artificial Neural Language Models

Mar 25, 2022
Changye Li, David Knopman, Weizhe Xu, Trevor Cohen, Serguei Pakhomov

Via arXiv

High Performance Sequence-to-Sequence Model for Streaming Speech Recognition

Mar 22, 2020
Thai-Son Nguyen, Ngoc-Quan Pham, Sebastian Stueker, Alex Waibel

Via arXiv

Speaker-Targeted Audio-Visual Models for Speech Recognition in Cocktail-Party Environments

Jun 13, 2019
Guan-Lin Chao, William Chan, Ian Lane

Via arXiv

A wearable sensor vest for social humanoid robots with GPGPU, IoT, and modular software architecture

Jan 06, 2022
Mohsen Jafarzadeh, Stephen Brooks, Shimeng Yu, Balakrishnan Prabhakaran, Yonas Tadesse

Via arXiv

Monaural Multi-Talker Speech Recognition using Factorial Speech Processing Models

Oct 05, 2016
Mahdi Khademian, Mohammad Mehdi Homayounpour

Via arXiv