"speech": models, code, and papers

TRESTLE: Toolkit for Reproducible Execution of Speech, Text and Language Experiments

Feb 14, 2023
Changye Li, Trevor Cohen, Martin Michalowski, Serguei Pakhomov

Via arXiv

Dual-path Attention is All You Need for Audio-Visual Speech Extraction

Jul 09, 2022
Zhongweiyang Xu, Xulin Fan, Mark Hasegawa-Johnson

Via arXiv

Exploration of A Self-Supervised Speech Model: A Study on Emotional Corpora

Oct 05, 2022
Yuanchao Li, Yumnah Mohamied, Peter Bell, Catherine Lai

Via arXiv

Leveraging Speech Separation for Conversational Telephone Speaker Diarization

Apr 05, 2022
Giovanni Morrone, Samuele Cornell, Desh Raj, Enrico Zovato, Alessio Brutti, Stefano Squartini

Via arXiv

Efficiency 360: Efficient Vision Transformers

Feb 23, 2023
Badri N. Patro, Vijay Srinivas Agneeswaran

Via arXiv

Knowledge-Based Counterfactual Queries for Visual Question Answering

Mar 05, 2023
Theodoti Stoikou, Maria Lymperaiou, Giorgos Stamou

Via arXiv

Unsupervised Model-based speaker adaptation of end-to-end lattice-free MMI model for speech recognition

Nov 17, 2022
Xurong Xie, Xunying Liu, Hui Chen, Hongan Wang

Via arXiv

A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition

Apr 05, 2022
Ye-Qian Du, Jie Zhang, Qiu-Shi Zhu, Li-Rong Dai, Ming-Hui Wu, Xin Fang, Zhou-Wang Yang

Via arXiv

Audio-Visual Activity Guided Cross-Modal Identity Association for Active Speaker Detection

Dec 01, 2022
Rahul Sharma, Shrikanth Narayanan

Via arXiv

Tandem Multitask Training of Speaker Diarisation and Speech Recognition for Meeting Transcription

Jul 08, 2022
Xianrui Zheng, Chao Zhang, Philip C. Woodland

Via arXiv