"speech recognition": models, code, and papers

A Decidability-Based Loss Function

Sep 12, 2021
Pedro Silva, Gladston Moreira, Vander Freitas, Rodrigo Silva, David Menotti, Eduardo Luz

Loss Landscape Dependent Self-Adjusting Learning Rates in Decentralized Stochastic Gradient Descent

Dec 02, 2021
Wei Zhang, Mingrui Liu, Yu Feng, Xiaodong Cui, Brian Kingsbury, Yuhai Tu

What does a network layer hear? Analyzing hidden representations of end-to-end ASR through speech synthesis

Nov 04, 2019
Chung-Yi Li, Pei-Chieh Yuan, Hung-Yi Lee

Multi-talker ASR for an unknown number of sources: Joint training of source counting, separation and ASR

Jun 04, 2020
Thilo von Neumann, Christoph Boeddeker, Lukas Drude, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani, Reinhold Haeb-Umbach

Binary classification of spoken words with passive elastic metastructures

Nov 14, 2021
Tena Dubček, Daniel Moreno-Garcia, Thomas Haag, Henrik R. Thomsen, Theodor S. Becker, Christoph Bärlocher, Fredrik Andersson, Sebastian D. Huber, Dirk-Jan van Manen, Luis Guillermo Villanueva, Johan O. A. Robertsson, Marc Serra-Garcia

Warped Language Models for Noise Robust Language Understanding

Nov 03, 2020
Mahdi Namazifar, Gokhan Tur, Dilek Hakkani Tür

When Can Self-Attention Be Replaced by Feed Forward Layers?

May 28, 2020
Shucong Zhang, Erfan Loweimi, Peter Bell, Steve Renals

Cross-Attention End-to-End ASR for Two-Party Conversations

Jul 24, 2019
Suyoun Kim, Siddharth Dalmia, Florian Metze

BSTC: A Large-Scale Chinese-English Speech Translation Dataset

Apr 27, 2021
Ruiqing Zhang, Xiyang Wang, Chuanqiang Zhang, Zhongjun He, Hua Wu, Zhi Li, Haifeng Wang, Ying Chen, Qinfei Li

speechocean762: An Open-Source Non-native English Speech Corpus For Pronunciation Assessment

Apr 03, 2021
Junbo Zhang, Zhiwen Zhang, Yongqing Wang, Zhiyong Yan, Qiong Song, Yukai Huang, Ke Li, Daniel Povey, Yujun Wang
