"speech recognition": models, code, and papers

Speaker-Adapted End-to-End Visual Speech Recognition for Continuous Spanish

Nov 21, 2023
David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos

Phoneme-Based Proactive Anti-Eavesdropping with Controlled Recording Privilege

Jan 28, 2024
Peng Huang, Yao Wei, Peng Cheng, Zhongjie Ba, Li Lu, Feng Lin, Yang Wang, Kui Ren

On Speaker Attribution with SURT

Jan 28, 2024
Desh Raj, Matthew Wiesner, Matthew Maciejewski, Leibny Paola Garcia-Perera, Daniel Povey, Sanjeev Khudanpur

Soft-Weighted CrossEntropy Loss for Continous Alzheimer's Disease Detection

Feb 19, 2024
Xiaohui Zhang, Wenjie Fu, Mangui Liang

USM-Lite: Quantization and Sparsity Aware Fine-tuning for Speech Recognition with Universal Speech Models

Dec 13, 2023
Shaojin Ding, Qiu David, David Rim, Yanzhang He, Oleg Rybakov, Bo Li, Rohit Prabhavalkar, Weiran Wang, Tara N. Sainath, Shivani Agrawal, Zhonglin Han, Jian Li, Amir Yazdanbakhsh

Testing Speech Emotion Recognition Machine Learning Models

Dec 11, 2023
Anna Derington, Hagen Wierstorf, Ali Özkil, Florian Eyben, Felix Burkhardt, Björn W. Schuller

Prosody in Cascade and Direct Speech-to-Text Translation: a case study on Korean Wh-Phrases

Feb 01, 2024
Giulio Zhou, Tsz Kin Lam, Alexandra Birch, Barry Haddow

CNN architecture extraction on edge GPU

Jan 24, 2024
Peter Horvath, Lukasz Chmielewski, Leo Weissbart, Lejla Batina, Yuval Yarom

Locality enhanced dynamic biasing and sampling strategies for contextual ASR

Jan 23, 2024
Md Asif Jalal, Pablo Peso Parada, George Pavlidis, Vasileios Moschopoulos, Karthikeyan Saravanan, Chrysovalantis-Giorgos Kontoulis, Jisi Zhang, Anastasios Drosou, Gil Ho Lee, Jungin Lee, Seokyeong Jung

GhostVec: A New Threat to Speaker Privacy of End-to-End Speech Recognition System

Nov 17, 2023
Xiaojiao Chen, Sheng Li, Jiyi Li, Hao Huang, Yang Cao, Liang He
