Speech recognition


Speech recognition is the task of identifying spoken words by analyzing the speaker's voice and language, and transcribing those words accurately as text.

Scaling On-Device GPU Inference for Large Generative Models

May 01, 2025
Via arXiv

PLAICraft: Large-Scale Time-Aligned Vision-Speech-Action Dataset for Embodied AI

May 19, 2025
Via arXiv

Retrieval-Enhanced Few-Shot Prompting for Speech Event Extraction

Apr 30, 2025
Via arXiv

Improving Pretrained YAMNet for Enhanced Speech Command Detection via Transfer Learning

Apr 26, 2025
Via arXiv

TinyML for Speech Recognition

Apr 22, 2025
Via arXiv

Chinese-LiPS: A Chinese audio-visual speech recognition dataset with Lip-reading and Presentation Slides

Apr 21, 2025
Via arXiv

A Comprehensive Part-of-Speech Tagging to Standardize Central-Kurdish Language: A Research Guide for Kurdish Natural Language Processing Tasks

Apr 28, 2025
Via arXiv

Kimi-Audio Technical Report

Apr 25, 2025
Via arXiv

Advancing Arabic Speech Recognition Through Large-Scale Weakly Supervised Learning

Apr 16, 2025
Via arXiv

StableQuant: Layer Adaptive Post-Training Quantization for Speech Foundation Models

Apr 21, 2025
Via arXiv