"speech": models, code, and papers

A Small and Fast BERT for Chinese Medical Punctuation Restoration

Aug 24, 2023
Tongtao Ling, Chen Liao, Zhipeng Yu, Lei Chen, Shilei Huang, Yi Liu

DGSD: Dynamical Graph Self-Distillation for EEG-Based Auditory Spatial Attention Detection

Sep 07, 2023
Cunhang Fan, Hongyu Zhang, Wei Huang, Jun Xue, Jianhua Tao, Jiangyan Yi, Zhao Lv, Xiaopei Wu

Transforming the Embeddings: A Lightweight Technique for Speech Emotion Recognition Tasks

May 29, 2023
Orchid Chetia Phukan, Arun Balaji Buduru, Rajesh Sharma

PromptStyle: Controllable Style Transfer for Text-to-Speech with Natural Language Descriptions

Jun 01, 2023
Guanghou Liu, Yongmao Zhang, Yi Lei, Yunlin Chen, Rui Wang, Zhifei Li, Lei Xie

Spoken Language Identification System for English-Mandarin Code-Switching Child-Directed Speech

Jun 01, 2023
Shashi Kant Gupta, Sushant Hiray, Prashant Kukde

Enhancing Gappy Speech Audio Signals with Generative Adversarial Networks

May 09, 2023
Deniss Strods, Alan F. Smeaton

Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation

May 19, 2023
Kangwook Jang, Sungnyun Kim, Se-Young Yun, Hoirin Kim

ReZero: Region-customizable Sound Extraction

Aug 31, 2023
Rongzhi Gu, Yi Luo

TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models

Sep 05, 2023
Yuan Shangguan, Haichuan Yang, Danni Li, Chunyang Wu, Yassir Fathullah, Dilin Wang, Ayushi Dalmia, Raghuraman Krishnamoorthi, Ozlem Kalinli, Junteng Jia, Jay Mahadeokar, Xin Lei, Mike Seltzer, Vikas Chandra

Making More of Little Data: Improving Low-Resource Automatic Speech Recognition Using Data Augmentation

May 19, 2023
Martijn Bartelds, Nay San, Bradley McDonnell, Dan Jurafsky, Martijn Wieling
