Le Wang

Xi'an Jiaotong University

Plan First, Judge Later, Run Better: A DMAIC-Inspired Agentic System for Industrial Anomaly Detection

Jun 03, 2026

Efficient Camera Pose Augmentation for View Generalization in Robotic Policy Learning

Mar 31, 2026

Forgetting Similar Samples: Can Machine Unlearning Do it Better?

Jan 11, 2026

RoboSafe: Safeguarding Embodied Agents via Executable Safety Logic

Dec 24, 2025

UniLayDiff: A Unified Diffusion Transformer for Content-Aware Layout Generation

Dec 09, 2025

Probing Latent Knowledge Conflict for Faithful Retrieval-Augmented Generation

Oct 14, 2025

SAMPO: Scale-wise Autoregression with Motion PrOmpt for generative world models

Sep 19, 2025

AudioGen-Omni: A Unified Multimodal Diffusion Transformer for Video-Synchronized Audio, Speech, and Song Generation

Aug 01, 2025

Kling-Foley: Multimodal Diffusion Transformer for High-Quality Video-to-Audio Generation

Jun 24, 2025

AGENTSAFE: Benchmarking the Safety of Embodied Agents on Hazardous Instructions

Jun 17, 2025