"speech recognition": models, code, and papers

Whisper-KDQ: A Lightweight Whisper via Guided Knowledge Distillation and Quantization for Efficient ASR

May 18, 2023
Hang Shao, Wei Wang, Bei Liu, Xun Gong, Haoyu Wang, Yanmin Qian

Via arXiv

Pretraining Conformer with ASR or ASV for Anti-Spoofing Countermeasure

Jul 04, 2023
Yikang Wang, Hiromitsu Nishizaki, Ming Li

Via arXiv

Modeling Spoken Information Queries for Virtual Assistants: Open Problems, Challenges and Opportunities

Apr 25, 2023
Christophe Van Gysel

Via arXiv

An Empirical Study and Improvement for Speech Emotion Recognition

Apr 08, 2023
Zhen Wu, Yizhe Lu, Xinyu Dai

Via arXiv

Accelerating Transducers through Adjacent Token Merging

Jun 28, 2023
Yuang Li, Yu Wu, Jinyu Li, Shujie Liu

Via arXiv

Towards Improved Room Impulse Response Estimation for Speech Recognition

Nov 08, 2022
Anton Ratnarajah, Ishwarya Ananthabhotla, Vamsi Krishna Ithapu, Pablo Hoffmann, Dinesh Manocha, Paul Calamia

Via arXiv

Speech-dependent Modeling of Own Voice Transfer Characteristics for In-ear Microphones in Hearables

Sep 15, 2023
Mattes Ohlenbusch, Christian Rollwage, Simon Doclo

Via arXiv

TEVR: Improving Speech Recognition by Token Entropy Variance Reduction

Jun 25, 2022
Hajo Nils Krabbenhöft, Erhardt Barth

Via arXiv

A vector quantized masked autoencoder for speech emotion recognition

Apr 21, 2023
Samir Sadok, Simon Leglaive, Renaud Séguier

Via arXiv

Generating gender-ambiguous voices for privacy-preserving speech recognition

Jul 03, 2022
Dimitrios Stoidis, Andrea Cavallaro

Via arXiv