Jiwen Lu

More Than Sum of Its Parts: Deciphering Intent Shifts in Multimodal Hate Speech Detection

Mar 22, 2026

Measuring 3D Spatial Geometric Consistency in Dynamic Generated Videos

Mar 19, 2026

DriveTok: 3D Driving Scene Tokenization for Unified Multi-View Reconstruction and Understanding

Mar 19, 2026

AdaZoom-GUI: Adaptive Zoom-based GUI Grounding with Instruction Refinement

Mar 18, 2026

WaterVIB: Learning Minimal Sufficient Watermark Representations via Variational Information Bottleneck

Feb 25, 2026

Moaw: Unleashing Motion Awareness for Video Diffusion Models

Jan 19, 2026

CLAP: Contrastive Latent Action Pretraining for Learning Vision-Language-Action Models from Human Videos

Jan 07, 2026

NeXT-IMDL: Build Benchmark for NeXT-Generation Image Manipulation Detection & Localization

Dec 29, 2025

SFTok: Bridging the Performance Gap in Discrete Tokenizers

Dec 18, 2025

DVGT: Driving Visual Geometry Transformer

Dec 18, 2025