"speech": models, code, and papers

BASEN: Time-Domain Brain-Assisted Speech Enhancement Network with Convolutional Cross Attention in Multi-talker Conditions

May 17, 2023
Jie Zhang, Qing-Tian Xu, Qiu-Shi Zhu, Zhen-Hua Ling

Via arXiv

ApproBiVT: Lead ASR Models to Generalize Better Using Approximated Bias-Variance Tradeoff Guided Early Stopping and Checkpoint Averaging

Aug 05, 2023
Fangyuan Wang, Ming Hao, Yuhai Shi, Bo Xu

Via arXiv

Some voices are too common: Building fair speech recognition systems using the Common Voice dataset

Jun 01, 2023
Lucas Maison, Yannick Estève

Via arXiv

LAHM : Large Annotated Dataset for Multi-Domain and Multilingual Hate Speech Identification

Apr 03, 2023
Ankit Yadav, Shubham Chandel, Sushant Chatufale, Anil Bandhakavi

Via arXiv

Identity Construction in a Misogynist Incels Forum

Jul 09, 2023
Michael Miller Yoder, Chloe Perry, David West Brown, Kathleen M. Carley, Meredith L. Pruden

Via arXiv

Speaker Diarization of Scripted Audiovisual Content

Aug 04, 2023
Yogesh Virkar, Brian Thompson, Rohit Paturi, Sundararajan Srinivasan, Marcello Federico

Via arXiv

A Critical Review of Physics-Informed Machine Learning Applications in Subsurface Energy Systems

Aug 06, 2023
Abdeldjalil Latrach, Mohamed Lamine Malki, Misael Morales, Mohamed Mehana, Minou Rabiei

Via arXiv

Knowledge-Aware Audio-Grounded Generative Slot Filling for Limited Annotated Data

Jul 04, 2023
Guangzhi Sun, Chao Zhang, Ivan Vulić, Paweł Budzianowski, Philip C. Woodland

Via arXiv

FPGA Resource-aware Structured Pruning for Real-Time Neural Networks

Aug 09, 2023
Benjamin Ramhorst, George A. Constantinides, Vladimir Loncar

Via arXiv

InterFormer: Interactive Local and Global Features Fusion for Automatic Speech Recognition

May 24, 2023
Zhi-Hao Lai

Via arXiv