speech


PodEval: A Multimodal Evaluation Framework for Podcast Audio Generation

Oct 01, 2025
Via arXiv

BatonVoice: An Operationalist Framework for Enhancing Controllable Speech Synthesis with Linguistic Intelligence from LLMs

Sep 30, 2025
Via arXiv

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Sep 30, 2025
Via arXiv

Voice Evaluation of Reasoning Ability: Diagnosing the Modality-Induced Performance Gap

Sep 30, 2025
Via arXiv

Convergence and Divergence of Language Models under Different Random Seeds

Sep 30, 2025
Via arXiv

Scaling Spoken Language Models with Syllabic Speech Tokenization

Sep 30, 2025
Via arXiv

The Unheard Alternative: Contrastive Explanations for Speech-to-Text Models

Sep 30, 2025
Via arXiv

Conversational Implicatures: Modelling Relevance Theory Probabilistically

Sep 26, 2025
Via arXiv

Bridging Fairness and Explainability: Can Input-Based Explanations Promote Fairness in Hate Speech Detection?

Sep 26, 2025
Via arXiv

FLEXI: Benchmarking Full-duplex Human-LLM Speech Interaction

Sep 26, 2025
Via arXiv