"speech": models, code, and papers

GestSync: Determining who is speaking without a talking head

Oct 08, 2023
Sindhu B Hegde, Andrew Zisserman

Haha-Pod: An Attempt for Laughter-based Non-Verbal Speaker Verification

Sep 25, 2023
Yuke Lin, Xiaoyi Qin, Ning Jiang, Guoqing Zhao, Ming Li

VoiceLens: Controllable Speaker Generation and Editing with Flow

Sep 25, 2023
Yao Shi, Ming Li

Using Text Injection to Improve Recognition of Personal Identifiers in Speech

Aug 14, 2023
Yochai Blau, Rohan Agrawal, Lior Madmony, Gary Wang, Andrew Rosenberg, Zhehuai Chen, Zorik Gekhman, Genady Beryozkin, Parisa Haghani, Bhuvana Ramabhadran

Text Injection for Capitalization and Turn-Taking Prediction in Speech Models

Aug 14, 2023
Shaan Bijwadia, Shuo-yiin Chang, Weiran Wang, Zhong Meng, Hao Zhang, Tara N. Sainath

Leveraging Large Language Models for Exploiting ASR Uncertainty

Sep 12, 2023
Pranay Dighe, Yi Su, Shangshang Zheng, Yunshu Liu, Vineet Garg, Xiaochuan Niu, Ahmed Tewfik

PCNN: A Lightweight Parallel Conformer Neural Network for Efficient Monaural Speech Enhancement

Jul 28, 2023
Xinmeng Xu, Weiping Tu, Yuhong Yang

Robust Hate Speech Detection in Social Media: A Cross-Dataset Empirical Evaluation

Jul 04, 2023
Dimosthenis Antypas, Jose Camacho-Collados

Massive End-to-end Models for Short Search Queries

Sep 22, 2023
Weiran Wang, Rohit Prabhavalkar, Dongseong Hwang, Qiujia Li, Khe Chai Sim, Bo Li, James Qin, Xingyu Cai, Adam Stooke, Zhong Meng, CJ Zheng, Yanzhang He, Tara Sainath, Pedro Moreno Mengibar

Attentive Multi-Layer Perceptron for Non-autoregressive Generation

Oct 14, 2023
Shuyang Jiang, Jun Zhang, Jiangtao Feng, Lin Zheng, Lingpeng Kong
