"speech": models, code, and papers

Leveraging Redundancy in Multiple Audio Signals for Far-Field Speech Recognition

Mar 01, 2023
Feng-Ju Chang, Anastasios Alexandridis, Rupak Vignesh Swaminathan, Martin Radfar, Harish Mallidi, Maurizio Omologo, Athanasios Mouchtaris, Brian King, Roland Maas

Improved DeepFake Detection Using Whisper Features

Jun 02, 2023
Piotr Kawa, Marcin Plata, Michał Czuba, Piotr Szymański, Piotr Syga

Efficient Spoken Language Recognition via Multilabel Classification

Jun 02, 2023
Oriol Nieto, Zeyu Jin, Franck Dernoncourt, Justin Salamon

Can Contextual Biasing Remain Effective with Whisper and GPT-2?

Jun 02, 2023
Guangzhi Sun, Xianrui Zheng, Chao Zhang, Philip C. Woodland

Context-aware Fine-tuning of Self-supervised Speech Models

Dec 16, 2022
Suwon Shon, Felix Wu, Kwangyoun Kim, Prashant Sridhar, Karen Livescu, Shinji Watanabe

Stabilising and accelerating light gated recurrent units for automatic speech recognition

Feb 16, 2023
Adel Moumen, Titouan Parcollet

Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning

Apr 12, 2023
Nikhil Singh, Chih-Wei Wu, Iroro Orife, Mahdi Kalayeh

Human-in-the-Loop Hate Speech Classification in a Multilingual Context

Dec 05, 2022
Ana Kotarcic, Dominik Hangartner, Fabrizio Gilardi, Selina Kurer, Karsten Donnay

EarSpy: Spying Caller Speech and Identity through Tiny Vibrations of Smartphone Ear Speakers

Dec 23, 2022
Ahmed Tanvir Mahdad, Cong Shi, Zhengkun Ye, Tianming Zhao, Yan Wang, Yingying Chen, Nitesh Saxena

High Fidelity Speech Enhancement with Band-split RNN

Dec 01, 2022
Jianwei Yu, Yi Luo, Hangting Chen, Rongzhi Gu, Chao Weng
