Inverted landing is a challenging feat to perform in aerial robots, especially without external positioning. However, it is routinely performed by biological fliers such as bees, flies, and bats. Our previous observations of landing behaviors in flies suggest an open-loop causal relationship between their putative visual cues and the kinematics of the aerial maneuvers executed. For example, the degree of rotational maneuver (therefore the body inversion prior to touchdown) and the amount of leg-assisted body swing both depend on the flies' initial body states while approaching the ceiling. In this work, by using a physics-based simulation with experimental validation, we systematically investigated how optimized inverted landing maneuvers depend on the initial approach velocities with varied magnitude and direction. This was done by analyzing the putative visual cues (that can be derived from onboard measurements) during optimal maneuvering trajectories. We identified a three-dimensional policy region, from which a mapping to a global inverted landing policy can be developed without the use of external positioning data. In addition, we also investigated the effects of an array of landing gear designs on the optimized landing performance and identified their advantages and disadvantages. The above results have been partially validated using limited experimental testing and will continue to inform and guide our future experiments, for example by applying the calculated global policy.
Few-shot relation extraction (FSRE) focuses on recognizing novel relations by learning with merely a handful of annotated instances. Meta-learning has been widely adopted for such a task, which trains on randomly generated few-shot tasks to learn generic data representations. Despite impressive results achieved, existing models still perform suboptimally when handling hard FSRE tasks, where the relations are fine-grained and similar to each other. We argue this is largely because existing models do not distinguish hard tasks from easy ones in the learning process. In this paper, we introduce a novel approach based on contrastive learning that learns better representations by exploiting relation label information. We further design a method that allows the model to adaptively learn how to focus on hard tasks. Experiments on two standard datasets demonstrate the effectiveness of our method.
Despite the success of neural dialogue systems in achieving high leaderboard performance, they cannot meet users' requirements in practice because of their poor reasoning skills. The underlying reason is that most neural dialogue models capture only syntactic and semantic information and fail to model the logical consistency between the dialogue history and the generated response. Recently, a new multi-turn dialogue reasoning task has been proposed to facilitate dialogue reasoning research. However, this task is challenging because there are only slight differences between the illogical response and the dialogue history. How to address this challenge effectively remains an open question. This paper proposes a Fine-grained Comparison Model (FCM) to tackle this problem. Inspired by human behavior in reading comprehension, a comparison mechanism is proposed to focus on the fine-grained differences in the representation of each response candidate. Specifically, each candidate representation is compared with the whole history to obtain a history consistency representation. Furthermore, the consistency signals between each candidate and the speaker's own history are considered to drive the model to prefer a candidate that is logically consistent with the speaker's history. Finally, the above consistency representations are employed to output a ranking list of the candidate responses for multi-turn dialogue reasoning. Experimental results on two public dialogue datasets show that our method obtains higher ranking scores than the baseline models.
In this paper, we propose a new reinforcement learning (RL) algorithm, called encoding distributional soft actor-critic (E-DSAC), for decision-making in autonomous driving. Unlike existing RL-based decision-making methods, E-DSAC is suitable for situations where the number of surrounding vehicles is variable and eliminates the requirement for manually pre-designed sorting rules, resulting in higher policy performance and generality. We first develop an encoding distributional policy iteration (DPI) framework by embedding a permutation-invariant module, which employs a feature neural network (NN) to encode the indicators of each vehicle, in the distributional RL framework. The proposed DPI framework is proven to have important properties in terms of convergence and global optimality. Next, based on the encoding DPI framework, we propose the E-DSAC algorithm by adding the gradient-based update rule of the feature NN to the policy evaluation process of the DSAC algorithm. Then, a multi-lane driving task and the corresponding reward function are designed to verify the effectiveness of the proposed algorithm. Results show that the policy learned by E-DSAC can realize efficient, smooth, and relatively safe autonomous driving in the designed scenario. Moreover, the final policy performance of E-DSAC is about three times that of DSAC. Its effectiveness has also been verified in real vehicle experiments.
In this paper, we propose a new state representation method, called encoding sum and concatenation (ESC), for decision-making in autonomous driving. Unlike existing state representation methods, ESC is applicable to a variable number of surrounding vehicles and eliminates the need for manually pre-designed sorting rules, leading to higher representation ability and generality. The proposed ESC method introduces a representation neural network (NN) to encode each surrounding vehicle into an encoding vector, and then sums these vectors to obtain the representation vector of the set of surrounding vehicles. By concatenating the set representation with other variables, such as indicators of the ego vehicle and road, we obtain a fixed-dimensional and permutation-invariant state representation. We further prove that the proposed ESC representation is injective if the output dimension of the representation NN is greater than the number of variables of all surrounding vehicles. This means that by taking the ESC representation as the policy input, we can find a nearly optimal representation NN and policy NN by simultaneously optimizing them using gradient-based updating. Experiments demonstrate that, compared with the fixed-permutation representation method, the proposed method improves the representation ability for the surrounding vehicles and reduces the corresponding approximation error by 62.2%.
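The sum-and-concatenate construction described above can be illustrated with a minimal sketch. It assumes a hypothetical one-layer `tanh` encoder with arbitrary, untrained weights in place of the representation NN; the function names `encode_vehicle` and `esc_state` and the feature layout are illustrative and are not taken from the paper.

```python
import math
from typing import List, Sequence


def encode_vehicle(features: Sequence[float],
                   weights: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical one-layer encoder: tanh(W @ features).

    Stands in for the representation NN; the real network is learned.
    """
    return [math.tanh(sum(w * x for w, x in zip(row, features)))
            for row in weights]


def esc_state(ego_and_road: Sequence[float],
              vehicles: Sequence[Sequence[float]],
              weights: Sequence[Sequence[float]]) -> List[float]:
    """Sum the per-vehicle encodings, then concatenate with other variables."""
    summed = [0.0] * len(weights)  # one slot per encoder output dimension
    for vehicle in vehicles:
        encoding = encode_vehicle(vehicle, weights)
        summed = [s + e for s, e in zip(summed, encoding)]
    # The output length depends only on len(ego_and_road) and len(weights),
    # not on how many surrounding vehicles are present.
    return list(ego_and_road) + summed


# Toy data (assumed): 2 features per vehicle, 3-dimensional encoding.
W = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6]]
ego = [1.0, 2.0]
state_two = esc_state(ego, [[1.0, 0.5], [-2.0, 3.0]], W)
state_swapped = esc_state(ego, [[-2.0, 3.0], [1.0, 0.5]], W)
```

Because summation ignores ordering, `state_two` and `state_swapped` agree, and the state keeps the same length for any number of vehicles. The toy encoder dimension is chosen for readability and is not meant to satisfy the injectivity condition stated in the abstract.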
Background and Objective: The disease caused by the novel coronavirus is known as COVID-19. It began to spread at the end of 2019 and has since spread across the world. As of October 2020, it had infected around 37 million people and claimed about 1 million lives. We propose a deep learning model that can help radiologists and clinicians use chest X-rays to diagnose COVID-19 cases and show the diagnostic features of pneumonia. Methods: The approach in this study is as follows: 1) We propose a data augmentation method to increase the diversity of the dataset, thereby improving the generalization performance of the model. 2) Our deep convolutional neural network model, DPN-SE, adds a self-attention mechanism to the DPN network, which substantially improves the performance of the network. 3) We use the LIME interpretability library to mark the feature regions on the X-ray images, which helps doctors diagnose COVID-19 more quickly. Results: Each network model was trained both with and without data augmentation, and the two sets of results were compared. Among the 10 network models with different structures, 7 improved after using data augmentation, with an average improvement of 1% in recognition accuracy. The DPN-SE network achieves accuracy and recall of 93% and 98%, respectively, on the four-class task (COVID-19 vs. bacterial pneumonia vs. viral pneumonia vs. normal). Compared with the original DPN, accuracy improves by 2%. Conclusion: The data augmentation method we used is effective on a small dataset, showing that a reasonable data augmentation method can improve recognition accuracy without changing the sample size or the model structure. Overall, the proposed method and model can be a useful tool for clinical radiologists.
Question Answering (QA) models over Knowledge Bases (KBs) are capable of providing more precise answers by utilizing relation information among entities. Although effective, most of these models solely rely on fixed relation representations to obtain answers for different question-related KB subgraphs. Hence, the rich structured information of these subgraphs may be overlooked by the relation representation vectors. Meanwhile, the direction information of reasoning, which has been proven effective for the answer prediction on graphs, has not been fully explored in existing work. To address these challenges, we propose a novel neural model, Relation-updated Direction-guided Answer Selector (RDAS), which converts relations in each subgraph to additional nodes to learn structure information. Additionally, we utilize direction information to enhance the reasoning ability. Experimental results show that our model yields substantial improvements on two widely used datasets.
Decision and control are two of the core functionalities of high-level automated vehicles. Current mainstream methods, such as functionality decomposition or end-to-end reinforcement learning (RL), suffer from either high time complexity or poor interpretability and limited safety performance in complex real-world autonomous driving tasks. In this paper, we present an interpretable and efficient decision and control framework for automated vehicles, which decomposes the driving task into multi-path planning and optimal tracking, structured hierarchically. First, multi-path planning generates several candidate paths considering only static constraints. Then, optimal tracking is designed to track the optimal path while considering the dynamic obstacles. To that end, in theory, we formulate a constrained optimal control problem (OCP) for each candidate path, optimize them separately, and choose the one with the best tracking performance to follow. More importantly, we propose a model-based RL algorithm, which serves as an approximate constrained OCP solver, to offload the heavy computation through a paradigm of offline training and online application. Specifically, the OCPs for all paths are considered together to construct a multi-task RL problem, which is then solved offline by our algorithm to obtain value and policy networks, used for real-time online path selection and tracking, respectively. We verify our framework both in simulation and in the real world. Results show that our method has better online computing efficiency and driving performance, including traffic efficiency and safety, than baseline methods. In addition, it yields great interpretability and adaptability across different driving tasks. The real road test also suggests that it is applicable in complicated traffic scenarios even without tuning.
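The online stage described above, where a value network selects a path and a policy network tracks it, can be sketched roughly as follows. This is only an illustration under stated assumptions: `toy_value` and `toy_policy` are hypothetical hand-written stand-ins for the trained networks, the state is reduced to an (x, y) position, and the convention that a lower value means better tracking is assumed rather than taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, float]        # assumed: ego (x, y) position
Path = Sequence[Tuple[float, float]]  # assumed: list of (x, y) waypoints


def toy_value(state: State, path: Path) -> float:
    """Stand-in for the value network: summed squared lateral offset."""
    return sum((state[1] - py) ** 2 for _, py in path)


def toy_policy(state: State, path: Path) -> float:
    """Stand-in for the policy network: proportional steering command."""
    return 0.5 * (path[0][1] - state[1])


def select_and_track(state: State,
                     candidate_paths: List[Path],
                     value_fn: Callable[[State, Path], float],
                     policy_fn: Callable[[State, Path], float]
                     ) -> Tuple[int, float]:
    """Pick the path with the lowest predicted cost, then compute an action."""
    costs = [value_fn(state, path) for path in candidate_paths]
    best = min(range(len(costs)), key=costs.__getitem__)
    action = policy_fn(state, candidate_paths[best])
    return best, action


# Two candidate lane-centre paths (assumed geometry, lanes 3.5 m apart).
paths = [[(1.0, 0.0), (2.0, 0.0)], [(1.0, 3.5), (2.0, 3.5)]]
best_index, steering = select_and_track((0.0, 0.2), paths,
                                        toy_value, toy_policy)
```

The point of the split is that both calls are cheap forward evaluations at run time; the expensive constrained optimization has already been done offline when the two functions were trained.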
Flying animals resort to fast, large-degree-of-freedom motion of flapping wings (i.e., their aerodynamic surfaces), a key feature that distinguishes them from rotary or fixed-winged robotic fliers with relatively limited motion of aerodynamic surfaces. However, it is well known that flapping-wing aerodynamics are characterised by highly unsteady and three-dimensional flows that are difficult to model or control. Accurate aerodynamic force predictions often rely on high-fidelity and expensive computational or experimental methods. Here, we developed a computationally efficient model that can accurately predict aerodynamic forces generated by 548 different flapping-wing motions, surpassing the predictive accuracy and generality of the existing quasi-steady models. Specifically, we trained a state-space model that dynamically mapped wing motion kinematics to aerodynamic forces and moments measured from a dynamically scaled robotic wing. This predictive model used as few as 12 states to successfully capture the unsteady and nonlinear fluid effects pertinent to force generation without explicit information about fluid flows. We also provided a comprehensive assessment of the control authority of key wing kinematic variables and their linear predictability of aerodynamic forces. We found that instantaneous aerodynamic forces/moments were largely predictable from the wing motion history within a half stroke cycle. Furthermore, the angle of attack, normal acceleration, and pitching motion had the strongest and most immediate effects on aerodynamic force/moment generation. Our results show that flapping flight offers inherently high force control authority and predictability, which are key to the development of agile and stable aerial fliers.
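To illustrate how a state-space model lets the output depend on motion history rather than only on the current input, here is a minimal sketch. It assumes a discrete-time linear form, x[k+1] = A x[k] + B u[k] and y[k] = C x[k]; the abstract does not specify the model structure, and the matrices below are toy values with a single state, not the 12-state model identified from the robotic wing.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def mat_vec(m: Matrix, v: Sequence[float]) -> List[float]:
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def simulate(A: Matrix, B: Matrix, C: Matrix,
             x0: Sequence[float],
             inputs: Sequence[Sequence[float]]) -> List[List[float]]:
    """Run x[k+1] = A x[k] + B u[k], y[k] = C x[k] over an input sequence.

    u[k] plays the role of wing kinematics and y[k] the role of
    aerodynamic forces/moments (an assumed reading of the abstract).
    """
    x = list(x0)
    outputs = []
    for u in inputs:
        outputs.append(mat_vec(C, x))  # output from the current state
        ax, bu = mat_vec(A, x), mat_vec(B, u)
        x = [p + q for p, q in zip(ax, bu)]  # state carries the history
    return outputs


# Toy single-state system: an input pulse followed by zero input.
forces = simulate(A=[[0.5]], B=[[1.0]], C=[[2.0]],
                  x0=[0.0], inputs=[[1.0], [0.0], [0.0]])
# forces == [[0.0], [2.0], [1.0]]
```

The output stays nonzero after the input returns to zero and then decays, which is the sense in which the state summarizes recent motion history.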