"speech": models, code, and papers

Exploiting Symmetric Temporally Sparse BPTT for Efficient RNN Training

Dec 14, 2023
Xi Chen, Chang Gao, Zuowen Wang, Longbiao Cheng, Sheng Zhou, Shih-Chii Liu, Tobi Delbruck

Analysis of Speech Separation Performance Degradation on Emotional Speech Mixtures

Sep 14, 2023
Jia Qi Yip, Dianwen Ng, Bin Ma, Chng Eng Siong

Thech. Report: Genuinization of Speech waveform PMF for speaker detection spoofing and countermeasures

Oct 09, 2023
Itshak Lapidot, Jean-Francois Bonastre

A Single Speech Enhancement Model Unifying Dereverberation, Denoising, Speaker Counting, Separation, and Extraction

Oct 12, 2023
Kohei Saijo, Wangyou Zhang, Zhong-Qiu Wang, Shinji Watanabe, Tetsunori Kobayashi, Tetsuji Ogawa

Batched Low-Rank Adaptation of Foundation Models

Dec 09, 2023
Yeming Wen, Swarat Chaudhuri

Deep Beamforming for Speech Enhancement and Speaker Localization with an Array Response-Aware Loss Function

Oct 19, 2023
Hsinyu Chang, Yicheng Hsu, Mingsian R. Bai

Spatial HuBERT: Self-supervised Spatial Speech Representation Learning for a Single Talker from Multi-channel Audio

Oct 17, 2023
Antoni Dimitriadis, Siqi Pan, Vidhyasaharan Sethu, Beena Ahmed

MUST&P-SRL: Multi-lingual and Unified Syllabification in Text and Phonetic Domains for Speech Representation Learning

Oct 17, 2023
Noé Tits

RepCodec: A Speech Representation Codec for Speech Tokenization

Aug 31, 2023
Zhichao Huang, Chutong Meng, Tom Ko

Efficient Representation of the Activation Space in Deep Neural Networks

Dec 13, 2023
Tanya Akumu, Celia Cintas, Girmaw Abebe Tadesse, Adebayo Oshingbesan, Skyler Speakman, Edward McFowland III
