"speech": models, code, and papers

The Double Helix inside the NLP Transformer

Jun 23, 2023
Jason H. J. Lu, Qingzhen Guo

Implementing contextual biasing in GPU decoder for online ASR

Jun 23, 2023
Iuliia Nigmatulina, Srikanth Madikeri, Esaú Villatoro-Tello, Petr Motliček, Juan Zuluaga-Gomez, Karthik Pandia, Aravind Ganapathiraju

VivesDebate-Speech: A Corpus of Spoken Argumentation to Leverage Audio Features for Argument Mining

Feb 24, 2023
Ramon Ruiz-Dolz, Javier Iranzo-Sánchez

SpeeChain: A Speech Toolkit for Large-Scale Machine Speech Chain

Jan 08, 2023
Heli Qi, Sashi Novitasari, Andros Tjandra, Sakriani Sakti, Satoshi Nakamura

OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset

Jan 16, 2023
Jeongkyun Park, Jung-Wook Hwang, Kwanghee Choi, Seung-Hyun Lee, Jun Hwan Ahn, Rae-Hong Park, Hyung-Min Park

Retraining-free Customized ASR for Enharmonic Words Based on a Named-Entity-Aware Model and Phoneme Similarity Estimation

May 29, 2023
Yui Sudo, Kazuya Hata, Kazuhiro Nakadai

Eeg2vec: Self-Supervised Electroencephalographic Representation Learning

May 23, 2023
Qiushi Zhu, Xiaoying Zhao, Jie Zhang, Yu Gu, Chao Weng, Yuchen Hu

Improved Decoding of Attentional Selection in Multi-Talker Environments with Self-Supervised Learned Speech Representation

Feb 11, 2023
Cong Han, Vishal Choudhari, Yinghao Aaron Li, Nima Mesgarani

RobustDistiller: Compressing Universal Speech Representations for Enhanced Environment Robustness

Feb 18, 2023
Heitor R. Guimarães, Arthur Pimentel, Anderson R. Avila, Mehdi Rezagholizadeh, Boxing Chen, Tiago H. Falk

An ASR-Based Tutor for Learning to Read: How to Optimize Feedback to First Graders

Jun 07, 2023
Yu Bai, Cristian Tejedor-Garcia, Ferdy Hubers, Catia Cucchiarini, Helmer Strik
