Bo Yang

Exploring Adversarial Robustness of LiDAR-Camera Fusion Model in Autonomous Driving

Dec 03, 2023
Bo Yang, Xiaoyu Ji, Xiaoyu Ji, Xiaoyu Ji, Xiaoyu Ji

Our study assesses the adversarial robustness of LiDAR-camera fusion models in 3D object detection. We introduce an attack technique that, by simply adding a limited number of physically constrained adversarial points above a car, can make the car undetectable by the fusion model. Experimental results reveal that even without changes to the image data channel, the fusion model can be deceived solely by manipulating the LiDAR data channel. This finding raises safety concerns in the field of autonomous driving. Further, we explore how the quantity of adversarial points, the distance between the front-near car and the LiDAR-equipped car, and various angular factors affect the attack success rate. We believe our research can contribute to the understanding of multi-sensor robustness, offering insights and guidance to enhance the safety of autonomous driving.

RayDF: Neural Ray-surface Distance Fields with Multi-view Consistency

Oct 30, 2023
Zhuoman Liu, Bo Yang

In this paper, we study the problem of continuous 3D shape representations. The majority of existing successful methods are coordinate-based implicit neural representations. However, they are inefficient to render novel views or recover explicit surface points. A few works start to formulate 3D shapes as ray-based neural functions, but the learned structures are inferior due to the lack of multi-view geometry consistency. To tackle these challenges, we propose a new framework called RayDF. It consists of three major components: 1) the simple ray-surface distance field, 2) the novel dual-ray visibility classifier, and 3) a multi-view consistency optimization module to drive the learned ray-surface distances to be multi-view geometry consistent. We extensively evaluate our method on three public datasets, demonstrating remarkable performance in 3D surface point reconstruction on both synthetic and challenging real-world 3D scenes, clearly surpassing existing coordinate-based and ray-based baselines. Most notably, our method achieves a 1000x faster speed than coordinate-based methods to render an 800x800 depth image, showing the superiority of our method for 3D shape representation. Our code and data are available at https://github.com/vLAR-group/RayDF

* NeurIPS 2023. Code and data are available at: https://github.com/vLAR-group/RayDF 
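As a rough illustration of the ray-based formulation described in the abstract, the sketch below maps a ray (origin and unit direction) to its distance to the surface, recovers the surface point directly from that distance, and checks that two rays from different viewpoints agree on the same point. It is not the RayDF implementation: an analytic unit sphere stands in for the learned network, and the names `ray_sphere_distance`, `surface_point`, and `normalize` are hypothetical.

```python
import math


def normalize(v):
    """Scale a 3D vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)


def ray_sphere_distance(origin, direction, radius=1.0):
    """Stand-in for a learned ray-surface distance field (assumption).

    Returns the distance along a unit-direction ray to the first hit on a
    sphere centred at the world origin, or None if the ray misses.
    """
    b = sum(o * d for o, d in zip(origin, direction))
    c = sum(o * o for o in origin) - radius * radius
    disc = b * b - c
    if disc < 0:
        return None
    t = -b - math.sqrt(disc)  # nearer of the two intersections
    return t if t >= 0 else None


def surface_point(origin, direction, distance):
    """Recover the explicit surface point in one step: p = o + t * d."""
    return tuple(o + distance * d for o, d in zip(origin, direction))


# View 1: a ray looking straight at the sphere.
o1, d1 = (0.0, 0.0, -3.0), (0.0, 0.0, 1.0)
t1 = ray_sphere_distance(o1, d1)
p1 = surface_point(o1, d1, t1)

# View 2: a second ray aimed at the same surface point from elsewhere.
o2 = (0.0, -2.0, -2.0)
d2 = normalize(tuple(p - o for p, o in zip(p1, o2)))
t2 = ray_sphere_distance(o2, d2)
p2 = surface_point(o2, d2, t2)

# Multi-view consistency: both rays should land on the same point.
consistent = all(abs(a - b) < 1e-9 for a, b in zip(p1, p2))
```

Because the surface point follows from a single distance query per ray, no marching along the ray is needed, which is the intuition behind the rendering speed the abstract reports.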

Learning Continuous Network Emerging Dynamics from Scarce Observations via Data-Adaptive Stochastic Processes

Oct 25, 2023
Jiaxu Cui, Bingyi Sun, Jiming Liu, Bo Yang

Learning network dynamics from the empirical structure and spatio-temporal observation data is crucial to revealing the interaction mechanisms of complex networks in a wide range of domains. However, most existing methods only aim at learning network dynamic behaviors generated by a specific ordinary differential equation instance, which makes them ineffective for new ones, and they generally require dense observations. The observed data, especially from network emerging dynamics, are usually difficult to obtain, which complicates model learning. Therefore, how to learn accurate network dynamics from sparse, irregularly-sampled, partial, and noisy observations remains a fundamental challenge. We introduce Neural ODE Processes for Network Dynamics (NDP4ND), a new class of stochastic processes governed by stochastic data-adaptive network dynamics, to overcome the challenge and learn continuous network dynamics from scarce observations. Intensive experiments conducted on various network dynamics in ecological population evolution, phototaxis movement, brain activity, epidemic spreading, and real-world empirical systems demonstrate that the proposed method has excellent data adaptability and computational efficiency, and can adapt to unseen network emerging dynamics, producing accurate interpolation and extrapolation while reducing the ratio of required observation data to only about 6% and improving the learning speed for new dynamics by three orders of magnitude.

* preprint 

Learning Generalizable Agents via Saliency-Guided Features Decorrelation

Oct 08, 2023
Sili Huang, Yanchao Sun, Jifeng Hu, Siyuan Guo, Hechang Chen, Yi Chang, Lichao Sun, Bo Yang

In visual-based Reinforcement Learning (RL), agents often struggle to generalize well to environmental variations in the state space that were not observed during training. The variations can arise in both task-irrelevant features, such as background noise, and task-relevant features, such as robot configurations, that are related to the optimal decisions. To achieve generalization in both situations, agents are required to accurately understand the impact of changed features on the decisions, i.e., establishing the true associations between changed features and decisions in the policy model. However, due to the inherent correlations among features in the state space, the associations between features and decisions become entangled, making it difficult for the policy to distinguish them. To this end, we propose Saliency-Guided Features Decorrelation (SGFD) to eliminate these correlations through sample reweighting. Concretely, SGFD consists of two core techniques: Random Fourier Functions (RFF) and the saliency map. RFF is utilized to estimate the complex non-linear correlations in high-dimensional images, while the saliency map is designed to identify the changed features. Under the guidance of the saliency map, SGFD employs sample reweighting to minimize the estimated correlations related to changed features, thereby achieving decorrelation in visual RL tasks. Our experimental results demonstrate that SGFD can generalize well on a wide range of test environments and significantly outperforms state-of-the-art methods in handling both task-irrelevant variations and task-relevant variations.

Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT-4

Oct 06, 2023
Jiaxian Guo, Bo Yang, Paul Yoo, Bill Yuchen Lin, Yusuke Iwasawa, Yutaka Matsuo

Unlike perfect information games, where all elements are known to every player, imperfect information games emulate the real-world complexities of decision-making under uncertain or incomplete information. GPT-4, the recent breakthrough in large language models (LLMs) trained on massive passive data, is notable for its knowledge retrieval and reasoning abilities. This paper delves into the applicability of GPT-4's learned knowledge for imperfect information games. To achieve this, we introduce Suspicion-Agent, an innovative agent that leverages GPT-4's capabilities for performing in imperfect information games. With proper prompt engineering to achieve different functions, Suspicion-Agent based on GPT-4 demonstrates remarkable adaptability across a range of imperfect information card games. Importantly, GPT-4 displays a strong high-order theory of mind (ToM) capacity, meaning it can understand others and intentionally impact others' behavior. Leveraging this, we design a planning strategy that enables GPT-4 to competently play against different opponents, adapting its gameplay style as needed, while requiring only the game rules and descriptions of observations as input. In the experiments, we qualitatively showcase the capabilities of Suspicion-Agent across three different imperfect information games and then quantitatively evaluate it in Leduc Hold'em. The results show that Suspicion-Agent can potentially outperform traditional algorithms designed for imperfect information games, without any specialized training or examples. In order to encourage and foster deeper insights within the community, we make our game-related data publicly available.

Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT4

Sep 29, 2023
Jiaxian Guo, Bo Yang, Paul Yoo, Yuchen Lin, Yusuke Iwasawa, Yutaka Matsuo

Unlike perfect information games, where all elements are known to every player, imperfect information games emulate the real-world complexities of decision-making under uncertain or incomplete information. GPT-4, the recent breakthrough in large language models (LLMs) trained on massive passive data, is notable for its knowledge retrieval and reasoning abilities. This paper delves into the applicability of GPT-4's learned knowledge for imperfect information games. To achieve this, we introduce Suspicion-Agent, an innovative agent that leverages GPT-4's capabilities for performing in imperfect information games. With proper prompt engineering to achieve different functions, Suspicion-Agent based on GPT-4 demonstrates remarkable adaptability across a range of imperfect information card games. Importantly, GPT-4 displays a strong high-order theory of mind (ToM) capacity, meaning it can understand others and intentionally impact others' behavior. Leveraging this, we design a planning strategy that enables GPT-4 to competently play against different opponents, adapting its gameplay style as needed, while requiring only the game rules and descriptions of observations as input. In the experiments, we qualitatively showcase the capabilities of Suspicion-Agent across three different imperfect information games and then quantitatively evaluate it in Leduc Hold'em. The results show that Suspicion-Agent can potentially outperform traditional algorithms designed for imperfect information games, without any specialized training or examples. In order to encourage and foster deeper insights within the community, we make our game-related data publicly available.

A Novel Neural-symbolic System under Statistical Relational Learning

Sep 16, 2023
Dongran Yu, Xueyan Liu, Shirui Pan, Anchen Li, Bo Yang

A key objective in the field of artificial intelligence is to develop cognitive models that can exhibit human-like intellectual capabilities. One promising approach to achieving this is through neural-symbolic systems, which combine the strengths of deep learning and symbolic reasoning. However, current approaches in this area have been limited in how they combine the two, as well as in generalization and interpretability. To address these limitations, we propose a general bi-level probabilistic graphical reasoning framework called GBPGR. This framework leverages statistical relational learning to effectively integrate deep learning models and symbolic reasoning in a mutually beneficial manner. In GBPGR, the results of symbolic reasoning are utilized to refine and correct the predictions made by the deep learning models. At the same time, the deep learning models assist in enhancing the efficiency of the symbolic reasoning process. Through extensive experiments, we demonstrate that our approach achieves high performance and exhibits effective generalization in both transductive and inductive tasks.

Massive Access of Static and Mobile Users via Reconfigurable Intelligent Surfaces: Protocol Design and Performance Analysis

Sep 12, 2023
Xuelin Cao, Bo Yang, Chongwen Huang, George C. Alexandropoulos, Chau Yuen, Zhu Han, H. Vincent Poor, Lajos Hanzo

The envisioned wireless networks of the future entail the provisioning of massive numbers of connections, heterogeneous data traffic, ultra-high spectral efficiency, and low latency services. This vision is spurring research activities focused on defining a next generation multiple access (NGMA) protocol that can accommodate massive numbers of users in different resource blocks, thereby achieving higher spectral efficiency and increased connectivity compared to conventional multiple access schemes. In this article, we present a multiple access scheme for NGMA in wireless communication systems assisted by multiple reconfigurable intelligent surfaces (RISs). In this regard, considering the practical scenario of static users operating together with mobile ones, we first study the interplay of the design of NGMA schemes and RIS phase configuration in terms of efficiency and complexity. Based on this, we then propose a multiple access framework for RIS-assisted communication systems, and we also design a medium access control (MAC) protocol incorporating RISs. In addition, we give a detailed performance analysis of the designed RIS-assisted MAC protocol. Our extensive simulation results demonstrate that the proposed MAC design outperforms the benchmarks in terms of system throughput and access fairness, and also reveal a trade-off between system throughput and fairness.

A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading

Sep 02, 2023
Ruihuai Liang, Bo Yang, Zhiwen Yu, Xuelin Cao, Derrick Wing Kwan Ng, Chau Yuen

Computation offloading has become a popular solution to support computationally intensive and latency-sensitive applications by transferring computing tasks to mobile edge servers (MESs) for execution, which is known as mobile/multi-access edge computing (MEC). To improve the MEC performance, it is required to design an optimal offloading strategy that includes the offloading decision (i.e., whether to offload or not) and the computational resource allocation of MEC. The design can be formulated as a mixed-integer nonlinear programming (MINLP) problem, which is generally NP-hard; an effective solution can be obtained by performing online inference through a well-trained deep neural network (DNN) model. However, when the system environments change dynamically, the DNN model may lose efficacy due to the drift of input parameters, thereby decreasing the generalization ability of the DNN model. To address this unique challenge, in this paper, we propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs). Specifically, the shared backbone is kept fixed while the PHs are trained, and the inferred results are ensembled, thereby significantly reducing the required training overhead and improving the inference performance. As a result, the joint optimization problem for offloading decision and resource allocation can be efficiently solved even in a time-varying wireless environment. Experimental results show that the proposed MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
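The shared-backbone, multi-head structure described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the MEMTL model: the backbone is a fixed toy function, each prediction head is a small linear map, and the ensemble is a plain average of head outputs (the abstract does not specify the ensembling rule). All names (`backbone`, `make_head`, `ensemble_predict`) are hypothetical.

```python
def backbone(x):
    """Stand-in for the frozen shared feature extractor (assumption)."""
    return [x[0] + x[1], x[0] - x[1]]


def make_head(weights, bias):
    """Build one prediction head: a linear map over the shared features."""
    def head(features):
        return sum(w * f for w, f in zip(weights, features)) + bias
    return head


def ensemble_predict(x, heads):
    """Run the backbone once, evaluate every head, and average the results.

    Averaging is an assumed ensembling rule chosen for simplicity.
    """
    features = backbone(x)  # computed once and shared by all heads
    outputs = [head(features) for head in heads]
    return sum(outputs) / len(outputs)


# Two toy heads with different parameters over the same features.
heads = [make_head([1.0, 0.0], 0.0), make_head([0.0, 1.0], 1.0)]
prediction = ensemble_predict([2.0, 1.0], heads)
```

Since only the heads carry trainable parameters in this layout, adding or retraining a head leaves the backbone untouched, which reflects the reduced training overhead the abstract describes.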

Condition-Adaptive Graph Convolution Learning for Skeleton-Based Gait Recognition

Aug 13, 2023
Xiaohu Huang, Xinggang Wang, Zhidianqiu Jin, Bo Yang, Botao He, Bin Feng, Wenyu Liu

Graph convolutional networks have been widely applied in skeleton-based gait recognition. A key challenge in this task is to distinguish the individual walking styles of different subjects across various views. Existing state-of-the-art methods employ uniform convolutions to extract features from diverse sequences and ignore the effects of viewpoint changes. To overcome these limitations, we propose a condition-adaptive graph (CAG) convolution network that can dynamically adapt to the specific attributes of each skeleton sequence and the corresponding view angle. In contrast to using fixed weights for all joints and sequences, we introduce a joint-specific filter learning (JSFL) module in the CAG method, which produces sequence-adaptive filters at the joint level. The adaptive filters capture fine-grained patterns that are unique to each joint, enabling the extraction of diverse spatial-temporal information about body parts. Additionally, we design a view-adaptive topology learning (VATL) module that generates adaptive graph topologies. These graph topologies are used to correlate the joints adaptively according to the specific view conditions. Thus, CAG can simultaneously adjust to various walking styles and viewpoints. Experiments on the two most widely used datasets (i.e., CASIA-B and OU-MVLP) show that CAG surpasses all previous skeleton-based methods. Moreover, the recognition performance can be enhanced by simply combining CAG with appearance-based methods, demonstrating the ability of CAG to provide useful complementary information. The source code will be available at https://github.com/OliverHxh/CAG.

* Accepted by TIP journal 