Jianbing Shen

From Human Intention to Action Prediction: A Comprehensive Benchmark for Intention-driven End-to-End Autonomous Driving

Dec 13, 2025

TransBridge: Boost 3D Object Detection by Scene-Level Completion with Transformer Decoder

Dec 12, 2025

Sim4Seg: Boosting Multimodal Multi-disease Medical Diagnosis Segmentation with Region-Aware Vision-Language Similarity Masks

Nov 10, 2025

Semantic Causality-Aware Vision-Based 3D Occupancy Prediction

Sep 10, 2025

RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark towards General Grasping

Jul 31, 2025

MAM: Modular Multi-Agent Framework for Multi-Modal Medical Diagnosis via Role-Specialized Collaboration

Jun 24, 2025

Self-Rewarding Large Vision-Language Models for Optimizing Prompts in Text-to-Image Generation

May 22, 2025

Towards Better Cephalometric Landmark Detection with Diffusion Data Generation

May 09, 2025

Geometry-aware Temporal Aggregation Network for Monocular 3D Lane Detection

Apr 29, 2025

Rethinking Temporal Fusion with a Unified Gradient Descent View for 3D Semantic Occupancy Prediction

Apr 18, 2025