"speech": models, code, and papers

Collaborative Watermarking for Adversarial Speech Synthesis

Sep 26, 2023
Lauri Juvela, Xin Wang

Via arXiv

Instruction-Following Speech Recognition

Sep 18, 2023
Cheng-I Jeff Lai, Zhiyun Lu, Liangliang Cao, Ruoming Pang

Via arXiv

Deep Beamforming for Speech Enhancement and Speaker Localization with an Array Response-Aware Loss Function

Oct 22, 2023
Hsinyu Chang, Yicheng Hsu, Mingsian R. Bai

Via arXiv

Homophone Disambiguation Reveals Patterns of Context Mixing in Speech Transformers

Oct 15, 2023
Hosein Mohebbi, Grzegorz Chrupała, Willem Zuidema, Afra Alishahi

Via arXiv

LSTM-CNN Network for Audio Signature Analysis in Noisy Environments

Dec 12, 2023
Praveen Damacharla, Hamid Rajabalipanah, Mohammad Hosein Fakheri

Via arXiv

APNet2: High-quality and High-efficiency Neural Vocoder with Direct Prediction of Amplitude and Phase Spectra

Nov 20, 2023
Hui-Peng Du, Ye-Xin Lu, Yang Ai, Zhen-Hua Ling

Via arXiv

VoxtLM: unified decoder-only models for consolidating speech recognition/synthesis and speech/text continuation tasks

Sep 18, 2023
Soumi Maiti, Yifan Peng, Shukjae Choi, Jee-weon Jung, Xuankai Chang, Shinji Watanabe

Via arXiv

Detecting value-expressive text posts in Russian social media

Dec 14, 2023
Maria Milkova, Maksim Rudnev, Lidia Okolskaya

Via arXiv

Large Language Models for Autonomous Driving: Real-World Experiments

Dec 14, 2023
Can Cui, Zichong Yang, Yupeng Zhou, Yunsheng Ma, Juanwu Lu, Ziran Wang

Via arXiv

Do self-supervised speech and language models extract similar representations as human brain?

Oct 07, 2023
Peili Chen, Linyang He, Li Fu, Lu Fan, Edward F. Chang, Yuanning Li

Via arXiv