Speech recognition


Speech recognition is the task of identifying words spoken aloud by analyzing the voice and the language, and transcribing those words accurately as text.

Audio-to-Audio Emotion Conversion With Pitch And Duration Style Transfer

May 23, 2025
Via arXiv

SepALM: Audio Language Models Are Error Correctors for Robust Speech Separation

May 06, 2025
Via arXiv

VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model

May 06, 2025
Via arXiv

Conversational Recommendation System using NLP and Sentiment Analysis

May 17, 2025
Via arXiv

Analysis of ABC Frontend Audio Systems for the NIST-SRE24

May 21, 2025
Via arXiv

Voila: Voice-Language Foundation Models for Real-Time Autonomous Interaction and Voice Role-Play

May 05, 2025
Via arXiv

Transforming faces into video stories -- VideoFace2.0

May 04, 2025
Via arXiv

A Comparative Analysis of Static Word Embeddings for Hungarian

May 12, 2025
Via arXiv

Scaling On-Device GPU Inference for Large Generative Models

May 01, 2025
Via arXiv

A Synergistic Framework of Nonlinear Acoustic Computing and Reinforcement Learning for Real-World Human-Robot Interaction

May 04, 2025
Via arXiv