Shanmin Pang

Knowledge-Refined Dual Context-Aware Network for Partially Relevant Video Retrieval

Mar 25, 2026

PersonalQ: Select, Quantize, and Serve Personalized Diffusion Models for Efficient Inference

Mar 24, 2026

A Human-Oriented Cooperative Driving Approach: Integrating Driving Intention, State, and Conflict

Dec 29, 2025

PhysPatch: A Physically Realizable and Transferable Adversarial Patch Attack for Multimodal Large Language Models-based Autonomous Driving Systems

Aug 07, 2025

Accelerate 3D Object Detection Models via Zero-Shot Attention Key Pruning

Mar 11, 2025

Redundant Queries in DETR-Based 3D Detection Methods: Unnecessary and Prunable

Dec 03, 2024

HeightMapNet: Explicit Height Modeling for End-to-End HD Map Learning

Nov 03, 2024

EmoAttack: Emotion-to-Image Diffusion Models for Emotional Backdoor Generation

Jun 22, 2024

Efficiently Adversarial Examples Generation for Visual-Language Models under Targeted Transfer Scenarios using Diffusion Models

Apr 18, 2024

Sparse Semantic Map-Based Monocular Localization in Traffic Scenes Using Learned 2D-3D Point-Line Correspondences

Oct 10, 2022