Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yishu Li

UniAda: Universal Adaptive Multi-objective Adversarial Attack for End-to-End Autonomous Driving Systems

Apr 25, 2026

Jingyu Zhang, Jacky Wai Keung, Yan Xiao, Yihan Liao, Yishu Li, Xiaoxue Ma

Abstract:Adversarial attacks play a pivotal role in testing and improving the reliability of deep learning (DL) systems. Existing literature has demonstrated that subtle perturbations to the input can elicit erroneous outcomes, thereby substantially compromising the security of DL systems. This has emerged as a critical concern in the development of DL-based safety-critical systems like Autonomous Driving Systems (ADSs). The focus of existing adversarial attack methods on End-to-End (E2E) ADSs has predominantly centered on misbehaviors of steering angle, which overlooks speed-related controls or imperceptible perturbations. To address these challenges, we introduce UniAda, a multi-objective white-box attack technique with a core function that revolves around crafting an image-agnostic adversarial perturbation capable of simultaneously influencing both steering and speed controls. UniAda capitalizes on an intricately designed multi-objective optimization function with the Adaptive Weighting Scheme (AWS), enabling the concurrent optimization of diverse objectives. Validated with both simulated and real-world driving data, UniAda outperforms five benchmarks across two metrics, inducing steering and speed deviations from 3.54 degrees to 29 degrees and 11 km per hour to 22 km per hour on average. This systematic approach establishes UniAda as a proven technique for adversarial attacks on modern DL-based E2E ADSs.

* Published at IEEE Transactions on Reliability journal (2023)

Via

Access Paper or Ask Questions

MonoPlane: Exploiting Monocular Geometric Cues for Generalizable 3D Plane Reconstruction

Nov 02, 2024

Wang Zhao, Jiachen Liu, Sheng Zhang, Yishu Li, Sili Chen, Sharon X Huang, Yong-Jin Liu, Hengkai Guo

Figure 1 for MonoPlane: Exploiting Monocular Geometric Cues for Generalizable 3D Plane Reconstruction

Figure 2 for MonoPlane: Exploiting Monocular Geometric Cues for Generalizable 3D Plane Reconstruction

Figure 3 for MonoPlane: Exploiting Monocular Geometric Cues for Generalizable 3D Plane Reconstruction

Figure 4 for MonoPlane: Exploiting Monocular Geometric Cues for Generalizable 3D Plane Reconstruction

Abstract:This paper presents a generalizable 3D plane detection and reconstruction framework named MonoPlane. Unlike previous robust estimator-based works (which require multiple images or RGB-D input) and learning-based works (which suffer from domain shift), MonoPlane combines the best of two worlds and establishes a plane reconstruction pipeline based on monocular geometric cues, resulting in accurate, robust and scalable 3D plane detection and reconstruction in the wild. Specifically, we first leverage large-scale pre-trained neural networks to obtain the depth and surface normals from a single image. These monocular geometric cues are then incorporated into a proximity-guided RANSAC framework to sequentially fit each plane instance. We exploit effective 3D point proximity and model such proximity via a graph within RANSAC to guide the plane fitting from noisy monocular depths, followed by image-level multi-plane joint optimization to improve the consistency among all plane instances. We further design a simple but effective pipeline to extend this single-view solution to sparse-view 3D plane reconstruction. Extensive experiments on a list of datasets demonstrate our superior zero-shot generalizability over baselines, achieving state-of-the-art plane reconstruction performance in a transferring setting. Our code is available at https://github.com/thuzhaowang/MonoPlane .

* IROS 2024 (oral)

Via

Access Paper or Ask Questions

FlowBotHD: History-Aware Diffuser Handling Ambiguities in Articulated Objects Manipulation

Oct 09, 2024

Yishu Li, Wen Hui Leng, Yiming Fang, Ben Eisner, David Held

Figure 1 for FlowBotHD: History-Aware Diffuser Handling Ambiguities in Articulated Objects Manipulation

Figure 2 for FlowBotHD: History-Aware Diffuser Handling Ambiguities in Articulated Objects Manipulation

Figure 3 for FlowBotHD: History-Aware Diffuser Handling Ambiguities in Articulated Objects Manipulation

Figure 4 for FlowBotHD: History-Aware Diffuser Handling Ambiguities in Articulated Objects Manipulation

Abstract:We introduce a novel approach to manipulate articulated objects with ambiguities, such as opening a door, in which multi-modality and occlusions create ambiguities about the opening side and direction. Multi-modality occurs when the method to open a fully closed door (push, pull, slide) is uncertain, or the side from which it should be opened is uncertain. Occlusions further obscure the door's shape from certain angles, creating further ambiguities during the occlusion. To tackle these challenges, we propose a history-aware diffusion network that models the multi-modal distribution of the articulated object and uses history to disambiguate actions and make stable predictions under occlusions. Experiments and analysis demonstrate the state-of-art performance of our method and specifically improvements in ambiguity-caused failure modes. Our project website is available at https://flowbothd.github.io/.

* Accepted to CoRL 2024

Via

Access Paper or Ask Questions