speech


Token-based Attractors and Cross-attention in Spoof Diarization

Sep 16, 2025
Via arXiv

MSR-Codec: A Low-Bitrate Multi-Stream Residual Codec for High-Fidelity Speech Generation with Information Disentanglement

Sep 16, 2025
Via arXiv

Traces Propagation: Memory-Efficient and Scalable Forward-Only Learning in Spiking Neural Networks

Sep 16, 2025
Via arXiv

The CCF AATC 2025: Speech Restoration Challenge

Sep 16, 2025
Via arXiv

A Lightweight Pipeline for Noisy Speech Voice Cloning and Accurate Lip Sync Synthesis

Sep 16, 2025
Via arXiv

PAC: Pronunciation-Aware Contextualized Large Language Model-based Automatic Speech Recognition

Sep 16, 2025
Via arXiv

High-Energy Concentration for Federated Learning in Frequency Domain

Sep 16, 2025
Via arXiv

Multi-Modal Embedding-based Target Speaker Enhancement

Sep 16, 2025
Via arXiv

FunAudio-ASR Technical Report

Sep 15, 2025
Via arXiv

Learning to Generate Pointing Gestures in Situated Embodied Conversational Agents

Sep 15, 2025
Via arXiv