Wenming Yang

Focus When Necessary: Adaptive Routing and Collaborative Grounding for Training-Free Visual Grounding

Jun 15, 2026
Via arXiv

EgoTactile: Learning Grasp Pressure for Everyday Objects from Egocentric Video

Jun 08, 2026
Via arXiv

EgoPressDiff: Multimodal Video Diffusion for Egocentric UV-Domain Hand-Pressure Estimation

Jun 05, 2026
Via arXiv

Beyond Skeletons: Learning Animation Directly from Driving Videos with Same2X Training Strategy

Jun 05, 2026
Via arXiv

AVBench: Human-Aligned and Automated Evaluation Benchmark for Audio-Video Generative Models

May 23, 2026
Via arXiv

Divide and Conquer: Decoupled Representation Alignment for Multimodal World Models

May 03, 2026
Via arXiv

CoRe-ECG: Advancing Self-Supervised Representation Learning for 12-Lead ECG via Contrastive and Reconstructive Synergy

Apr 13, 2026
Via arXiv

TGM-VLA: Task-Guided Mixup for Sampling-Efficient and Robust Robotic Manipulation

Feb 28, 2026
Via arXiv

TLDiffGAN: A Latent Diffusion-GAN Framework with Temporal Information Fusion for Anomalous Sound Detection

Feb 01, 2026
Via arXiv

VividVoice: A Unified Framework for Scene-Aware Visually-Driven Speech Synthesis

Feb 01, 2026
Via arXiv