Jianke Zhu

Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems

Dec 30, 2025
Via arXiv

Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future

Dec 18, 2025
Via arXiv

RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning

Oct 02, 2025
Via arXiv

MambaMap: Online Vectorized HD Map Construction using State Space Model

Jul 27, 2025
Via arXiv

SAM4D: Segment Anything in Camera and LiDAR Streams

Jun 26, 2025
Via arXiv

OmniAvatar: Efficient Audio-Driven Avatar Video Generation with Adaptive Body Animation

Jun 23, 2025
Via arXiv

PixelThink: Towards Efficient Chain-of-Pixel Reasoning

May 29, 2025
Via arXiv

Can MLLMs Guide Me Home? A Benchmark Study on Fine-Grained Visual Reasoning from Transit Maps

May 24, 2025
Via arXiv

DREAM: Disentangling Risks to Enhance Safety Alignment in Multimodal Large Language Models

Apr 25, 2025
Via arXiv

PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning

Apr 22, 2025
Via arXiv