"speech": models, code, and papers

GEmo-CLAP: Gender-Attribute-Enhanced Contrastive Language-Audio Pretraining for Speech Emotion Recognition

Jun 13, 2023
Yu Pan, Yanni Hu, Yuguang Yang, Jixun Yao, Wen Fei, Lei Ma, Heng Lu

Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias

Jun 06, 2023
Ziyue Jiang, Yi Ren, Zhenhui Ye, Jinglin Liu, Chen Zhang, Qian Yang, Shengpeng Ji, Rongjie Huang, Chunfeng Wang, Xiang Yin, Zejun Ma, Zhou Zhao

Diffusion-based Signal Refiner for Speech Separation

May 12, 2023
Masato Hirano, Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Yuki Mitsufuji

Inter-connection: Effective Connection between Pre-trained Encoder and Decoder for Speech Translation

May 26, 2023
Yuta Nishikawa, Satoshi Nakamura

What do self-supervised speech models know about words?

Jun 30, 2023
Ankita Pasad, Chung-Ming Chien, Shane Settle, Karen Livescu

Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model

May 26, 2023
Xiang Li, Songxiang Liu, Max W. Y. Lam, Zhiyong Wu, Chao Weng, Helen Meng

SPEECH: Structured Prediction with Energy-Based Event-Centric Hyperspheres

May 23, 2023
Shumin Deng, Shengyu Mao, Ningyu Zhang, Bryan Hooi

DEPAC: a Corpus for Depression and Anxiety Detection from Speech

Jun 20, 2023
Mashrura Tasnim, Malikeh Ehghaghi, Brian Diep, Jekaterina Novikova

Rule By Example: Harnessing Logical Rules for Explainable Hate Speech Detection

Jul 24, 2023
Christopher Clarke, Matthew Hall, Gaurav Mittal, Ye Yu, Sandra Sajeev, Jason Mars, Mei Chen

Improving Speech Translation Accuracy and Time Efficiency with Fine-tuned wav2vec 2.0-based Speech Segmentation

Apr 25, 2023
Ryo Fukuda, Katsuhito Sudoh, Satoshi Nakamura
