Autonomous driving technologies are expected to not only improve mobility and road safety but also bring energy efficiency benefits. In the foreseeable future, autonomous vehicles (AVs) will operate on roads shared with human-driven vehicles. To maintain safety and liveness while simultaneously minimizing energy consumption, the AV planning and decision-making process should account for interactions between the autonomous ego vehicle and surrounding human-driven vehicles. In this chapter, we describe a framework for developing energy-efficient autonomous driving policies on shared roads by exploiting human-driver behavior modeling based on cognitive hierarchy theory and reinforcement learning.
In the deployment of deep neural models, how to effectively and automatically find feasible deep models under diverse design objectives is fundamental. Most existing neural architecture search (NAS) methods utilize surrogates to predict the detailed performance (e.g., accuracy and model size) of a candidate architecture during the search, which however is complicated and inefficient. In contrast, we aim to learn an efficient Pareto classifier to simplify the search process of NAS by transforming the complex multi-objective NAS task into a simple Pareto-dominance classification task. To this end, we propose a classification-wise Pareto evolution approach for one-shot NAS, where an online classifier is trained to predict the dominance relationship between the candidate and constructed reference architectures, instead of using surrogates to fit the objective functions. The main contribution of this study is to change supernet adaption into a Pareto classifier. Besides, we design two adaptive schemes to select the reference set of architectures for constructing classification boundary and regulate the rate of positive samples over negative ones, respectively. We compare the proposed evolution approach with state-of-the-art approaches on widely-used benchmark datasets, and experimental results indicate that the proposed approach outperforms other approaches and have found a number of neural architectures with different model sizes ranging from 2M to 6M under diverse objectives and constraints.
The identification of protein-ligand interaction plays a key role in biochemical research and drug discovery. Although deep learning has recently shown great promise in discovering new drugs, there remains a gap between deep learning-based and experimental approaches. Here we propose a novel framework, named AIMEE, integrating AI Model and Enzymology Experiments, to identify inhibitors against 3CL protease of SARS-CoV-2, which has taken a significant toll on people across the globe. From a bioactive chemical library, we have conducted two rounds of experiments and identified six novel inhibitors with a hit rate of 29.41%, and four of them showed an IC50 value less than 3 {\mu}M. Moreover, we explored the interpretability of the central model in AIMEE, mapping the deep learning extracted features to domain knowledge of chemical properties. Based on this knowledge, a commercially available compound was selected and proven to be an activity-based probe of 3CLpro. This work highlights the great potential of combining deep learning models and biochemical experiments for intelligent iteration and expanding the boundaries of drug discovery.
Reinforcement Learning (RL) is essentially a trial-and-error learning procedure which may cause unsafe behavior during the exploration-and-exploitation process. This hinders the applications of RL to real-world control problems, especially to those for safety-critical systems. In this paper, we introduce a framework for safe RL that is based on integration of an RL algorithm with an add-on safety supervision module, called the Robust Action Governor (RAG), which exploits set-theoretic techniques and online optimization to manage safety-related requirements during learning. We illustrate this proposed safe RL framework through an application to automotive adaptive cruise control.
This paper proposes a learning reference governor (LRG) approach to enforce state and control constraints in systems for which an accurate model is unavailable; and this approach enables the reference governor to gradually improve command tracking performance through learning while enforcing the constraints during learning and after learning is completed. The learning can be performed either on a black-box type model of the system or directly on the hardware. After introducing the LRG algorithm and outlining its theoretical properties, this paper investigates LRG application to fuel truck rollover avoidance. Through simulations based on a fuel truck model that accounts for liquid fuel sloshing effects, we show that the proposed LRG can effectively protect fuel trucks from rollover accidents under various operating conditions.
Non-geometric mobility hazards such as rover slippage and sinkage posing great challenges to costly planetary missions are closely related to the mechanical properties of terrain. In-situ proprioceptive processes for rovers to estimate terrain mechanical properties need to experience different slip as well as sinkage and are helpless to untraversed regions. This paper proposes to predict terrain mechanical properties with vision in the distance, which expands the sensing range to the whole view and can partly halt potential slippage and sinkage hazards in the planning stage. A semantic-based method is designed to predict bearing and shearing properties of terrain in two stages connected with semantic clues. The former segmentation phase segments terrain with a light-weighted network promising to be applied onboard with competitive 93% accuracy and high recall rate over 96%, while the latter inference phase predicts terrain properties in a quantitative manner based on human-like inference principles. The prediction results in several test routes are 12.5% and 10.8% in full-scale error and help to plan appropriate strategies to avoid suffering non-geometric hazards.
In the design optimization of metal forming, it is increasingly significant to use surrogate models to analyse the finite element analysis (FEA) simulations. However, traditional surrogate models using scalar based machine learning methods (SBMLMs) fall in short of accuracy and generalizability. This is because SBMLMs fail to harness the location information of the simulations. To overcome these shortcomings, image based machine learning methods (IBMLMs) are leveraged in this paper. The underlying theory of location information, which supports the advantages of IBMLM, is qualitatively interpreted. Based on this theory, a Res-SE-U-Net IBMLM surrogate model is developed and compared with a multi-layer perceptron (MLP) as a referencing SBMLM surrogate model. It is demonstrated that the IBMLM model is advantageous over the MLP SBMLM model in accuracy, generalizability, robustness, and informativeness. This paper presents a promising methodology of leveraging IBMLMs in surrogate models to make maximum use of info from FEA results. Future prospective studies that inspired by this paper are also discussed.
To maximize cumulative user engagement (e.g. cumulative clicks) in sequential recommendation, it is often needed to tradeoff two potentially conflicting objectives, that is, pursuing higher immediate user engagement (e.g., click-through rate) and encouraging user browsing (i.e., more items exposured). Existing works often study these two tasks separately, thus tend to result in sub-optimal results. In this paper, we study this problem from an online optimization perspective, and propose a flexible and practical framework to explicitly tradeoff longer user browsing length and high immediate user engagement. Specifically, by considering items as actions, user's requests as states and user leaving as an absorbing state, we formulate each user's behavior as a personalized Markov decision process (MDP), and the problem of maximizing cumulative user engagement is reduced to a stochastic shortest path (SSP) problem. Meanwhile, with immediate user engagement and quit probability estimation, it is shown that the SSP problem can be efficiently solved via dynamic programming. Experiments on real-world datasets demonstrate the effectiveness of the proposed approach. Moreover, this approach is deployed at a large E-commerce platform, achieved over 7% improvement of cumulative clicks.
We propose a game theoretic approach to address the problem of searching for available parking spots in a parking lot and picking the ``optimal'' one to park. The approach exploits limited information provided by the parking lot, i.e., its layout and the current number of cars in it. Considering the fact that such information is or can be easily made available for many structured parking lots, the proposed approach can be applicable without requiring major updates to existing parking facilities. For large parking lots, a sampling-based strategy is integrated with the proposed approach to overcome the associated computational challenge. The proposed approach is compared against a state-of-the-art heuristic-based parking spot search strategy in the literature through simulation studies and demonstrates its advantage in terms of achieving lower cost function values.
Liquid State Machine (LSM), also known as the recurrent version of Spiking Neural Networks (SNN), has attracted great research interests thanks to its high computational power, biological plausibility from the brain, simple structure and low training complexity. By exploring the design space in network architectures and parameters, recent works have demonstrated great potential for improving the accuracy of LSM model with low complexity. However, these works are based on manually-defined network architectures or predefined parameters. Considering the diversity and uniqueness of brain structure, the design of LSM model should be explored in the largest search space possible. In this paper, we propose a Neural Architecture Search (NAS) based framework to explore both architecture and parameter design space for automatic dataset-oriented LSM model. To handle the exponentially-increased design space, we adopt a three-step search for LSM, including multi-liquid architecture search, variation on the number of neurons and parameters search such as percentage connectivity and excitatory neuron ratio within each liquid. Besides, we propose to use Simulated Annealing (SA) algorithm to implement the three-step heuristic search. Three datasets, including image dataset of MNIST and NMNIST and speech dataset of FSDD, are used to test the effectiveness of our proposed framework. Simulation results show that our proposed framework can produce the dataset-oriented optimal LSM models with high accuracy and low complexity. The best classification accuracy on the three datasets is 93.2%, 92.5% and 84% respectively with only 1000 spiking neurons, and the network connections can be averagely reduced by 61.4% compared with a single LSM. Moreover, we find that the total quantity of neurons in optimal LSM models on three datasets can be further reduced by 20% with only about 0.5% accuracy loss.