Ruiqi Li

Answer Presence Drives RAG Rewriting Gains

Jun 04, 2026

Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer

May 29, 2026

SwanVoice: Expressive Long-Form Zero-Shot Speech Synthesis for Both Monologue and Dialogue

May 29, 2026

Comprehensive Benchmarking of Long-Form Speech Generation in Diverse Scenarios

May 27, 2026

ComPrivDet: Efficient Privacy Object Detection in Compressed Domains Through Inference Reuse

Apr 04, 2026

ALIVE: Animate Your World with Lifelike Audio-Video Generation

Feb 09, 2026

MCP-ITP: An Automated Framework for Implicit Tool Poisoning in MCP

Jan 12, 2026

STARS: A Unified Framework for Singing Transcription, Alignment, and Refined Style Annotation

Jul 09, 2025

Versatile Framework for Song Generation with Prompt-based Control

Apr 29, 2025

Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis

Feb 26, 2025