Picture for Shengpeng Ji

Shengpeng Ji

TAP: Parameter-efficient Task-Aware Prompting for Adverse Weather Removal

Add code
Aug 11, 2025
Viaarxiv icon

IRBridge: Solving Image Restoration Bridge with Pre-trained Generative Diffusion Models

Add code
May 30, 2025
Viaarxiv icon

T2A-Feedback: Improving Basic Capabilities of Text-to-Audio Generation via Fine-grained AI Feedback

Add code
May 15, 2025
Viaarxiv icon

WavReward: Spoken Dialogue Models With Generalist Reward Evaluators

Add code
May 14, 2025
Viaarxiv icon

Astrea: A MOE-based Visual Understanding Model with Progressive Alignment

Add code
Mar 12, 2025
Viaarxiv icon

InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training

Add code
Mar 04, 2025
Figure 1 for InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training
Figure 2 for InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training
Figure 3 for InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training
Figure 4 for InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training
Viaarxiv icon

UniCodec: Unified Audio Codec with Single Domain-Adaptive Codebook

Add code
Feb 27, 2025
Viaarxiv icon

Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis

Add code
Feb 26, 2025
Viaarxiv icon

WavRAG: Audio-Integrated Retrieval Augmented Generation for Spoken Dialogue Models

Add code
Feb 20, 2025
Viaarxiv icon

Enhancing Expressive Voice Conversion with Discrete Pitch-Conditioned Flow Matching Model

Add code
Feb 08, 2025
Figure 1 for Enhancing Expressive Voice Conversion with Discrete Pitch-Conditioned Flow Matching Model
Figure 2 for Enhancing Expressive Voice Conversion with Discrete Pitch-Conditioned Flow Matching Model
Figure 3 for Enhancing Expressive Voice Conversion with Discrete Pitch-Conditioned Flow Matching Model
Figure 4 for Enhancing Expressive Voice Conversion with Discrete Pitch-Conditioned Flow Matching Model
Viaarxiv icon