"speech": models, code, and papers

Leveraging Label Information for Multimodal Emotion Recognition

Sep 05, 2023
Peiying Wang, Sunlu Zeng, Junqing Chen, Lu Fan, Meng Chen, Youzheng Wu, Xiaodong He

Via arXiv

Cross-modal Alignment with Optimal Transport for CTC-based ASR

Sep 24, 2023
Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai

Via arXiv

Translatotron 3: Speech to Speech Translation with Monolingual Data

Jun 01, 2023
Eliya Nachmani, Alon Levkovitch, Yifan Ding, Chulayuth Asawaroengchai, Heiga Zen, Michelle Tadmor Ramanovich

Via arXiv

Knowledge Distilled Ensemble Model for sEMG-based Silent Speech Interface

Aug 07, 2023
Wenqiang Lai, Qihan Yang, Ye Mao, Endong Sun, Jiangnan Ye

Via arXiv

Soft Convex Quantization: Revisiting Vector Quantization with Convex Optimization

Oct 04, 2023
Tanmay Gautam, Reid Pryzant, Ziyi Yang, Chenguang Zhu, Somayeh Sojoudi

Via arXiv

Mitigating Bias in Conversations: A Hate Speech Classifier and Debiaser with Prompts

Jul 14, 2023
Shaina Raza, Chen Ding, Deval Pandya

Via arXiv

Multi-Channel MOSRA: Mean Opinion Score and Room Acoustics Estimation Using Simulated Data and a Teacher Model

Sep 21, 2023
Jozef Coldenhoff, Andrew Harper, Paul Kendrick, Tijana Stojkovic, Milos Cernak

Via arXiv

PCNN: A Lightweight Parallel Conformer Neural Network for Efficient Monaural Speech Enhancement

Jul 28, 2023
Xinmeng Xu, Weiping Tu, Yuhong Yang

Via arXiv

Learning Multilingual Expressive Speech Representation for Prosody Prediction without Parallel Data

Jun 29, 2023
Jarod Duret, Titouan Parcollet, Yannick Estève

Via arXiv

Segment-Level Vectorized Beam Search Based on Partially Autoregressive Inference

Oct 01, 2023
Masao Someki, Nicholas Eng, Yosuke Higuchi, Shinji Watanabe

Via arXiv