Tao Jin

University of Science and Technology of China

Speech Token Prediction via Compressed-to-fine Language Modeling for Speech Generation

May 30, 2025
Via arXiv

IRBridge: Solving Image Restoration Bridge with Pre-trained Generative Diffusion Models

May 30, 2025
Via arXiv

TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis

May 20, 2025
Via arXiv

Observe-R1: Unlocking Reasoning Abilities of MLLMs with Dynamic Progressive Reinforcement Learning

May 18, 2025
Via arXiv

T2A-Feedback: Improving Basic Capabilities of Text-to-Audio Generation via Fine-grained AI Feedback

May 15, 2025
Via arXiv

Diff-Prompt: Diffusion-Driven Prompt Generator with Mask Supervision

Apr 30, 2025
Via arXiv

ISDrama: Immersive Spatial Drama Generation through Multimodal Prompting

Apr 29, 2025
Via arXiv

Unleashing the Power of Natural Audio Featuring Multiple Sound Sources

Apr 24, 2025
Via arXiv

ConceptGuard: Continual Personalized Text-to-Image Generation with Forgetting and Confusion Mitigation

Mar 13, 2025
Via arXiv

Smoothing the Shift: Towards Stable Test-Time Adaptation under Complex Multimodal Noises

Mar 04, 2025
Via arXiv