speech


RealClass: A Framework for Classroom Speech Simulation with Public Datasets and Game Engines

Oct 01, 2025
Via arXiv

UniverSR: Unified and Versatile Audio Super-Resolution via Vocoder-Free Flow Matching

Oct 01, 2025
Via arXiv

From Scores to Preferences: Redefining MOS Benchmarking for Speech Quality Reward Modeling

Oct 01, 2025
Via arXiv

When Silence Matters: The Impact of Irrelevant Audio on Text Reasoning in Large Audio-Language Models

Oct 01, 2025
Via arXiv

PodEval: A Multimodal Evaluation Framework for Podcast Audio Generation

Oct 01, 2025
Via arXiv

BatonVoice: An Operationalist Framework for Enhancing Controllable Speech Synthesis with Linguistic Intelligence from LLMs

Sep 30, 2025
Via arXiv

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Sep 30, 2025
Via arXiv

Voice Evaluation of Reasoning Ability: Diagnosing the Modality-Induced Performance Gap

Sep 30, 2025
Via arXiv

Convergence and Divergence of Language Models under Different Random Seeds

Sep 30, 2025
Via arXiv

Scaling Spoken Language Models with Syllabic Speech Tokenization

Sep 30, 2025
Via arXiv