Audio Synthesis


Language Model Based Text-to-Audio Generation: Anti-Causally Aligned Collaborative Residual Transformers

Oct 06, 2025
Via arXiv

TalkCuts: A Large-Scale Dataset for Multi-Shot Human Speech Video Generation

Oct 08, 2025
Via arXiv

Speak, Edit, Repeat: High-Fidelity Voice Editing and Zero-Shot TTS with Cross-Attentive Mamba

Oct 06, 2025
Via arXiv

Pitch-Conditioned Instrument Sound Synthesis From an Interactive Timbre Latent Space

Oct 05, 2025
Via arXiv

Video Object Segmentation-Aware Audio Generation

Sep 30, 2025
Via arXiv

Beyond Video-to-SFX: Video to Audio Synthesis with Environmentally Aware Speech

Sep 19, 2025
Via arXiv

Comprehend and Talk: Text to Speech Synthesis via Dual Language Modeling

Sep 26, 2025
Via arXiv

Lightweight Implicit Neural Network for Binaural Audio Synthesis

Sep 17, 2025
Via arXiv

DAIEN-TTS: Disentangled Audio Infilling for Environment-Aware Text-to-Speech Synthesis

Sep 18, 2025
Via arXiv

SynParaSpeech: Automated Synthesis of Paralinguistic Datasets for Speech Generation and Understanding

Sep 18, 2025
Via arXiv