"speech": models, code, and papers

Modelling prospective memory and resilient situated communications via Wizard of Oz

Nov 09, 2023
Yanzhe Li, Frank Broz, Mark Neerincx

Investigating Weight-Perturbed Deep Neural Networks With Application in Iris Presentation Attack Detection

Nov 21, 2023
Renu Sharma, Redwan Sony, Arun Ross

Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition

Sep 19, 2023
Krishna C. Puvvada, Nithin Rao Koluguri, Kunal Dhawan, Jagadeesh Balam, Boris Ginsburg

CleanUNet 2: A Hybrid Speech Denoising Model on Waveform and Spectrogram

Sep 12, 2023
Zhifeng Kong, Wei Ping, Ambrish Dantrey, Bryan Catanzaro

Folding Attention: Memory and Power Optimization for On-Device Transformer-based Streaming Speech Recognition

Sep 14, 2023
Yang Li, Liangzhen Lai, Yuan Shangguan, Forrest N. Iandola, Ernie Chang, Yangyang Shi, Vikas Chandra

Brain-Driven Representation Learning Based on Diffusion Model

Nov 14, 2023
Soowon Kim, Seo-Hyun Lee, Young-Eun Lee, Ji-Won Lee, Ji-Ha Park, Seong-Whan Lee

SALMONN: Towards Generic Hearing Abilities for Large Language Models

Oct 20, 2023
Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang

Prompting and Adapter Tuning for Self-supervised Encoder-Decoder Speech Model

Oct 04, 2023
Kai-Wei Chang, Ming-Hsin Chen, Yun-Ping Lin, Jing Neng Hsu, Paul Kuo-Ming Huang, Chien-yu Huang, Shang-Wen Li, Hung-yi Lee

MSAC: Multiple Speech Attribute Control Method for Speech Emotion Recognition

Aug 08, 2023
Yu Pan

IruMozhi: Automatically classifying diglossia in Tamil

Nov 13, 2023
Kabilan Prasanna, Aryaman Arora
