Xinggang Wang

Cross-Layer Attentive Feature Upsampling for Low-latency Semantic Segmentation

Jan 03, 2026
Via arXiv

DriveLaW: Unifying Planning and Video Generation in a Latent Driving World

Dec 31, 2025
Via arXiv

DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models

Dec 24, 2025
Via arXiv

A DeepSeek-Powered AI System for Automated Chest Radiograph Interpretation in Clinical Practice

Dec 23, 2025
Via arXiv

DeltaMIL: Gated Memory Integration for Efficient and Discriminative Whole Slide Image Analysis

Dec 22, 2025
Via arXiv

SuperCLIP: CLIP with Simple Classification Supervision

Dec 16, 2025
Via arXiv

Towards Scalable Pre-training of Visual Tokenizers for Generation

Dec 15, 2025
Via arXiv

InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models

Dec 09, 2025
Via arXiv

DiffusionDriveV2: Reinforcement Learning-Constrained Truncated Diffusion Modeling in End-to-End Autonomous Driving

Dec 08, 2025
Via arXiv

Phys-Liquid: A Physics-Informed Dataset for Estimating 3D Geometry and Volume of Transparent Deformable Liquids

Nov 14, 2025
Via arXiv