"speech": models, code, and papers

Improving And Analyzing Neural Speaker Embeddings for ASR

Jan 11, 2023
Christoph Lüscher, Jingjing Xu, Mohammad Zeineldeen, Ralf Schlüter, Hermann Ney

Arabic Text-To-Speech (TTS) Data Preparation

Apr 07, 2022
Hala Al Masri, Muhy Eddin Za'ter

A Comparison of Temporal Encoders for Neuromorphic Keyword Spotting with Few Neurons

Jan 24, 2023
Mattias Nilsson, Ton Juny Pina, Lyes Khacef, Foteini Liwicki, Elisabetta Chicca, Fredrik Sandin

On Out-of-Distribution Detection for Audio with Deep Nearest Neighbors

Oct 27, 2022
Zaharah Bukhsh, Aaqib Saeed

A review of discourse and conversation impairments in patients with dementia

Nov 15, 2022
Charalambos Themistocleous

Pronunciation Modeling of Foreign Words for Mandarin ASR by Considering the Effect of Language Transfer

Oct 07, 2022
Lei Wang, Rong Tong

Speech intelligibility of simulated hearing loss sounds and its prediction using the Gammachirp Envelope Similarity Index (GESI)

Jun 14, 2022
Toshio Irino, Honoka Tamaru, Ayako Yamamoto

Dual-Path Style Learning for End-to-End Noise-Robust Speech Recognition

Mar 28, 2022
Yuchen Hu, Nana Hou, Chen Chen, Eng Siong Chng

Beyond Statistical Similarity: Rethinking Metrics for Deep Generative Models in Engineering Design

Feb 06, 2023
Lyle Regenwetter, Akash Srivastava, Dan Gutfreund, Faez Ahmed

VCSE: Time-Domain Visual-Contextual Speaker Extraction Network

Oct 09, 2022
Junjie Li, Meng Ge, Zexu Pan, Longbiao Wang, Jianwu Dang
