Tianrui Wang

Speech Meets ELF: Audio Conditional Continuous-Target Diffusion for Speech Recognition and Translation

Jun 09, 2026
Via arXiv

Audio Imitator: Controlling Timbre and Tempo in Video2Audio Synthesis with Audio Reference

Jun 05, 2026
Via arXiv

MMAE: A Massive Multitask Audio Editing Benchmark

Jun 05, 2026
Via arXiv

Separate First, Fuse Later: Mitigating Cross-Modal Interference in Audio-Visual LLMs Reasoning with Modality-Specific Chain-of-Thought

May 11, 2026
Via arXiv

Evaluating the Expressive Appropriateness of Speech in Rich Contexts

May 10, 2026
Via arXiv

WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling

May 07, 2026
Via arXiv

VocalParse: Towards Unified and Scalable Singing Voice Transcription with Large Audio Language Models

May 06, 2026
Via arXiv

UniSonate: A Unified Model for Speech, Music, and Sound Effect Generation with Text Instructions

Apr 24, 2026
Via arXiv

X-VC: Zero-shot Streaming Voice Conversion in Codec Space

Apr 14, 2026
Via arXiv

MSR-HuBERT: Self-supervised Pre-training for Adaptation to Multiple Sampling Rates

Mar 24, 2026
Via arXiv