Event extraction (EE) is a crucial task aiming at extracting events from texts, which includes two subtasks: event detection (ED) and event argument extraction (EAE). In this paper, we check the reliability of EE evaluations and identify three major pitfalls: (1) The data preprocessing discrepancy makes the evaluation results on the same dataset not directly comparable, but the data preprocessing details are not widely noted and specified in papers. (2) The output space discrepancy of different model paradigms makes different-paradigm EE models lack grounds for comparison and also leads to unclear mapping issues between predictions and annotations. (3) The absence of pipeline evaluation of many EAE-only works makes them hard to be directly compared with EE works and may not well reflect the model performance in real-world pipeline scenarios. We demonstrate the significant influence of these pitfalls through comprehensive meta-analyses of recent papers and empirical experiments. To avoid these pitfalls, we suggest a series of remedies, including specifying data preprocessing, standardizing outputs, and providing pipeline evaluation results. To help implement these remedies, we develop a consistent evaluation framework OMNIEVENT, which can be obtained from https://github.com/THU-KEG/OmniEvent.
Based on the variational method, we propose a novel paradigm that provides a unified framework of training neural operators and solving partial differential equations (PDEs) with the variational form, which we refer to as the variational operator learning (VOL). We first derive the functional approximation of the system from the node solution prediction given by neural operators, and then conduct the variational operation by automatic differentiation, constructing a forward-backward propagation loop to derive the residual of the linear system. One or several update steps of the steepest decent method (SD) and the conjugate gradient method (CG) are provided in every iteration as a cheap yet effective update for training the neural operators. Experimental results show the proposed VOL can learn a variety of solution operators in PDEs of the steady heat transfer and the variable stiffness elasticity with satisfactory results and small error. The proposed VOL achieves nearly label-free training. Only five to ten labels are used for the output distribution-shift session in all experiments. Generalization benefits of the VOL are investigated and discussed.
Environment perception including detection, classification, tracking, and motion prediction are key enablers for automated driving systems and intelligent transportation applications. Fueled by the advances in sensing technologies and machine learning techniques, LiDAR-based sensing systems have become a promising solution. The current challenges of this solution are how to effectively combine different perception tasks into a single backbone and how to efficiently learn the spatiotemporal features directly from point cloud sequences. In this research, we propose a novel spatiotemporal attention network based on a transformer self-attention mechanism for joint semantic segmentation and motion prediction within a point cloud at the voxel level. The network is trained to simultaneously outputs the voxel level class and predicted motion by learning directly from a sequence of point cloud datasets. The proposed backbone includes both a temporal attention module (TAM) and a spatial attention module (SAM) to learn and extract the complex spatiotemporal features. This approach has been evaluated with the nuScenes dataset, and promising performance has been achieved.
Taking advantage of both vehicle-to-everything (V2X) communication and automated driving technology, connected and automated vehicles are quickly becoming one of the transformative solutions to many transportation problems. However, in a mixed traffic environment at signalized intersections, it is still a challenging task to improve overall throughput and energy efficiency considering the complexity and uncertainty in the traffic system. In this study, we proposed a hybrid reinforcement learning (HRL) framework which combines the rule-based strategy and the deep reinforcement learning (deep RL) to support connected eco-driving at signalized intersections in mixed traffic. Vision-perceptive methods are integrated with vehicle-to-infrastructure (V2I) communications to achieve higher mobility and energy efficiency in mixed connected traffic. The HRL framework has three components: a rule-based driving manager that operates the collaboration between the rule-based policies and the RL policy; a multi-stream neural network that extracts the hidden features of vision and V2I information; and a deep RL-based policy network that generate both longitudinal and lateral eco-driving actions. In order to evaluate our approach, we developed a Unity-based simulator and designed a mixed-traffic intersection scenario. Moreover, several baselines were implemented to compare with our new design, and numerical experiments were conducted to test the performance of the HRL model. The experiments show that our HRL method can reduce energy consumption by 12.70% and save 11.75% travel time when compared with a state-of-the-art model-based Eco-Driving approach.
This paper presented a deep reinforcement learning method named Double Deep Q-networks to design an end-to-end vision-based adaptive cruise control (ACC) system. A simulation environment of a highway scene was set up in Unity, which is a game engine that provided both physical models of vehicles and feature data for training and testing. Well-designed reward functions associated with the following distance and throttle/brake force were implemented in the reinforcement learning model for both internal combustion engine (ICE) vehicles and electric vehicles (EV) to perform adaptive cruise control. The gap statistics and total energy consumption are evaluated for different vehicle types to explore the relationship between reward functions and powertrain characteristics. Compared with the traditional radar-based ACC systems or human-in-the-loop simulation, the proposed vision-based ACC system can generate either a better gap regulated trajectory or a smoother speed trajectory depending on the preset reward function. The proposed system can be well adaptive to different speed trajectories of the preceding vehicle and operated in real-time.
Accurate lane localization and lane change detection are crucial in advanced driver assistance systems and autonomous driving systems for safer and more efficient trajectory planning. Conventional localization devices such as Global Positioning System only provide road-level resolution for car navigation, which is incompetent to assist in lane-level decision making. The state of art technique for lane localization is to use Light Detection and Ranging sensors to correct the global localization error and achieve centimeter-level accuracy, but the real-time implementation and popularization for LiDAR is still limited by its computational burden and current cost. As a cost-effective alternative, vision-based lane change detection has been highly regarded for affordable autonomous vehicles to support lane-level localization. A deep learning-based computer vision system is developed to detect the lane change behavior using the images captured by a front-view camera mounted on the vehicle and data from the inertial measurement unit for highway driving. Testing results on real-world driving data have shown that the proposed method is robust with real-time working ability and could achieve around 87% lane change detection accuracy. Compared to the average human reaction to visual stimuli, the proposed computer vision system works 9 times faster, which makes it capable of helping make life-saving decisions in time.