Jun Du

REST: Diffusion-based Real-time End-to-end Streaming Talking Head Generation via ID-Context Caching and Asynchronous Streaming Distillation

Dec 12, 2025
Via arXiv

Revolutionizing Mixed Precision Quantization: Towards Training-free Automatic Proxy Discovery via Large Language Models

Dec 08, 2025
Via arXiv

From Structure to Detail: Hierarchical Distillation for Efficient Diffusion Model

Nov 12, 2025
Via arXiv

VSE-MOT: Multi-Object Tracking in Low-Quality Video Scenes Guided by Visual Semantic Enhancement

Sep 17, 2025
Via arXiv

Improving Anomalous Sound Detection with Attribute-aware Representation from Domain-adaptive Pre-training

Sep 16, 2025
Via arXiv

MEAN-RIR: Multi-Modal Environment-Aware Network for Robust Room Impulse Response Estimation

Sep 05, 2025
Via arXiv

EGGCodec: A Robust Neural Encodec Framework for EGG Reconstruction and F0 Extraction

Aug 12, 2025
Via arXiv

M3SD: Multi-modal, Multi-scenario and Multi-language Speaker Diarization Dataset

Jun 17, 2025
Via arXiv

Exploring Speaker Diarization with Mixture of Experts

Jun 17, 2025
Via arXiv

Multi-Domain Audio Question Answering Toward Acoustic Content Reasoning in The DCASE 2025 Challenge

May 12, 2025
Via arXiv