"speech recognition": models, code, and papers

Differentially Private Speaker Anonymization

Feb 23, 2022
Ali Shahin Shamsabadi, Brij Mohan Lal Srivastava, Aurélien Bellet, Nathalie Vauquier, Emmanuel Vincent, Mohamed Maouche, Marc Tommasi, Nicolas Papernot

Via arXiv

Effect of noise suppression losses on speech distortion and ASR performance

Nov 23, 2021
Sebastian Braun, Hannes Gamper

Via arXiv

UniSpeech: Unified Speech Representation Learning with Labeled and Unlabeled Data

Jan 19, 2021
Chengyi Wang, Yu Wu, Yao Qian, Kenichi Kumatani, Shujie Liu, Furu Wei, Michael Zeng, Xuedong Huang

Via arXiv

Hear "No Evil", See "Kenansville": Efficient and Transferable Black-Box Attacks on Speech Recognition and Voice Identification Systems

Oct 11, 2019
Hadi Abdullah, Muhammad Sajidur Rahman, Washington Garcia, Logan Blue, Kevin Warren, Anurag Swarnim Yadav, Tom Shrimpton, Patrick Traynor

Via arXiv

VoiceMoji: A Novel On-Device Pipeline for Seamless Emoji Insertion in Dictation

Dec 22, 2021
Sumit Kumar, Harichandana B S S, Himanshu Arora

Via arXiv

FAST-RIR: Fast neural diffuse room impulse response generator

Oct 07, 2021
Anton Ratnarajah, Shi-Xiong Zhang, Meng Yu, Zhenyu Tang, Dinesh Manocha, Dong Yu

Via arXiv

SpeechT5: Unified-Modal Encoder-Decoder Pre-training for Spoken Language Processing

Oct 14, 2021
Junyi Ao, Rui Wang, Long Zhou, Shujie Liu, Shuo Ren, Yu Wu, Tom Ko, Qing Li, Yu Zhang, Zhihua Wei, Yao Qian, Jinyu Li, Furu Wei

Via arXiv

End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results

Dec 04, 2014
Jan Chorowski, Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio

Via arXiv

Learning of Frequency-Time Attention Mechanism for Automatic Modulation Recognition

Nov 05, 2021
Shangao Lin, Yuan Zeng, Yi Gong

Via arXiv

Learning a Neural Diff for Speech Models

Aug 17, 2021
Jonathan Macoskey, Grant P. Strimel, Ariya Rastrow

Via arXiv