"speech": models, code, and papers

TL;DW? Summarizing Instructional Videos with Task Relevance & Cross-Modal Saliency

Aug 14, 2022
Medhini Narasimhan, Arsha Nagrani, Chen Sun, Michael Rubinstein, Trevor Darrell, Anna Rohrbach, Cordelia Schmid

Via arXiv

AsNER -- Annotated Dataset and Baseline for Assamese Named Entity recognition

Jul 07, 2022
Dhrubajyoti Pathak, Sukumar Nandi, Priyankoo Sarmah

Via arXiv

Multi-Modal Detection of Alzheimer's Disease from Speech and Text

Nov 30, 2020
Amish Mittal, Sourav Sahoo, Arnhav Datar, Juned Kadiwala, Hrithwik Shalu, Jimson Mathew

Via arXiv

Training Neural Speech Recognition Systems with Synthetic Speech Augmentation

Nov 02, 2018
Jason Li, Ravi Gadde, Boris Ginsburg, Vitaly Lavrukhin

Via arXiv

TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation

Mar 31, 2021
Helin Wang, Bo Wu, Lianwu Chen, Meng Yu, Jianwei Yu, Yong Xu, Shi-Xiong Zhang, Chao Weng, Dan Su, Dong Yu

Via arXiv

Multi-Task Learning with Sentiment, Emotion, and Target Detection to Recognize Hate Speech and Offensive Language

Sep 21, 2021
Flor Miriam Plaza-del-Arco, Sercan Halat, Sebastian Padó, Roman Klinger

Via arXiv

Improving speech recognition models with small samples for air traffic control systems

Feb 16, 2021
Yi Lin, Qin Li, Bo Yang, Zhen Yan, Huachun Tan, Zhengmao Chen

Via arXiv

Text-Guided Synthesis of Artistic Images with Retrieval-Augmented Diffusion Models

Jul 26, 2022
Robin Rombach, Andreas Blattmann, Björn Ommer

Via arXiv

Endpoint Detection for Streaming End-to-End Multi-talker ASR

Jan 24, 2022
Liang Lu, Jinyu Li, Yifan Gong

Via arXiv

Fusing Wav2vec2.0 and BERT into End-to-end Model for Low-resource Speech Recognition

Jan 17, 2021
Cheng Yi, Shiyu Zhou, Bo Xu

Via arXiv