Assembly sequence planning (ASP) is an essential process in modern manufacturing. It has been proven to be NP-complete, so solving it effectively and efficiently remains a challenge for researchers in the field. In this paper, we present a graph-transformer-based framework for the ASP problem, which is trained and demonstrated on a self-collected ASP database of LEGO models. Each LEGO model is abstracted into a heterogeneous graph structure after a thorough analysis of the original structure and feature extraction. The ground-truth assembly sequence is first generated by brute-force search and then adjusted manually to align with rational human habits. Based on this self-collected ASP dataset, we propose a heterogeneous graph-transformer framework to learn the latent rules for assembly planning. We evaluated the proposed framework in a series of experiments. The results show that the similarity between the predicted and ground-truth sequences can reach 0.44, a medium correlation as measured by Kendall's $\tau$. We also compared the effects of node features and edge features, and generated a feasible and reasonable assembly sequence as a benchmark for further research. Our dataset and code are available at https://github.com/AIR-DISCOVER/ICRA\_ASP.
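As a rough illustration of how Kendall's $\tau$ scores the agreement between a predicted and a ground-truth assembly sequence, the sketch below counts concordant and discordant pairs of parts. It assumes both sequences contain the same parts with no repeats; the part names and the `kendall_tau` helper are hypothetical and are not taken from the released code.

```python
from itertools import combinations


def kendall_tau(predicted, ground_truth):
    """Kendall's tau between two orderings of the same items (no ties)."""
    if sorted(predicted) != sorted(ground_truth):
        raise ValueError("both sequences must contain the same parts")
    if len(predicted) < 2:
        raise ValueError("need at least two parts to compare orderings")

    # Position of each part in the ground-truth sequence.
    rank = {part: i for i, part in enumerate(ground_truth)}
    # Rewrite the predicted sequence as ground-truth positions.
    ranks = [rank[part] for part in predicted]

    concordant = 0
    discordant = 0
    for i, j in combinations(range(len(ranks)), 2):
        if ranks[i] < ranks[j]:
            concordant += 1  # pair appears in the same relative order
        else:
            discordant += 1  # pair appears in the opposite order

    n_pairs = len(ranks) * (len(ranks) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical part names, for illustration only.
ground_truth = ["base", "wall", "window", "roof", "chimney"]
predicted = ["base", "window", "wall", "roof", "chimney"]
score = kendall_tau(predicted, ground_truth)
```

A value of 1 means the two orderings are identical, -1 means one is the reverse of the other, and values near 0 indicate little agreement.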
Although the use of multiple Unmanned Aerial Vehicles (UAVs) has great potential for fast autonomous exploration, it has received far too little attention. In this paper, we present RACER, a RApid Collaborative ExploRation approach using a fleet of decentralized UAVs. To dispatch the UAVs effectively, a pairwise interaction based on an online hgrid space decomposition is used. It ensures that all UAVs simultaneously explore distinct regions, using only asynchronous and limited communication. Further, we optimize the coverage paths of unknown space and balance the workloads assigned to each UAV with a Capacitated Vehicle Routing Problem (CVRP) formulation. Given the task allocation, each UAV constantly updates its coverage path and incrementally extracts crucial information to support exploration planning. A hierarchical planner finds exploration paths, refines local viewpoints, and generates minimum-time trajectories in sequence to explore the unknown space agilely and safely. The proposed approach is evaluated extensively, showing high exploration efficiency, scalability, and robustness to limited communication. Furthermore, for the first time, we achieve fully decentralized collaborative exploration with multiple UAVs in the real world. We will release our implementation as an open-source package.
Convolutional neural networks can achieve remarkable performance in semantic segmentation tasks. However, such neural network approaches rely heavily on costly pixel-level annotation. Semi-supervised learning is a promising solution to this issue, but its performance still falls far behind its fully supervised counterpart. This work proposes a cross-teacher training (CTT) framework with three modules that significantly improves traditional semi-supervised learning approaches. The core is a cross-teacher module, which simultaneously reduces the coupling among peer networks and the error accumulation between teacher and student networks. In addition, we propose two complementary contrastive learning modules. The high-level module transfers high-quality knowledge from labeled data to unlabeled data and promotes separation between classes in feature space. The low-level module encourages low-quality features to learn from the high-quality features among peer networks. In experiments, the cross-teacher module significantly improves the performance of traditional student-teacher approaches, and our framework outperforms state-of-the-art methods on benchmark datasets. Our source code of CTT will be released.
To address the poor detection accuracy, slow speed, and difficult deployment of existing object detection algorithms in complex fire scenarios, this paper proposes Light-YOLOv5, a lightweight fire detection algorithm that balances speed and accuracy. First, the last layer of the backbone network is replaced with a SepViT Block to strengthen the backbone's connection to global information; second, a Light-BiFPN neck network is designed to lighten the model while improving feature extraction; third, the Global Attention Mechanism (GAM) is fused into the network to make the model focus more on global dimensional features; finally, we use the Mish activation function and SIoU loss to speed up convergence and improve accuracy at the same time. The experimental results show that, compared to the original algorithm, Light-YOLOv5 improves mAP by 3.3%, reduces the number of parameters by 27.1%, decreases the computation by 19.1%, and achieves 91.1 FPS. Even compared to the latest YOLOv7-tiny, the mAP of Light-YOLOv5 is 6.8% higher, which shows the effectiveness of the algorithm.
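For readers unfamiliar with the Mish activation mentioned above, the following is a minimal scalar sketch of its standard definition, mish(x) = x * tanh(softplus(x)). It is only an illustration of the function itself; how Light-YOLOv5 wires it into the network is not shown here, and the helper names are our own.

```python
import math


def softplus(x):
    # Numerically stable form of log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    # Mish activation: x * tanh(softplus(x)).
    # Smooth and non-monotonic; close to x for large positive x
    # and close to 0 for large negative x.
    return x * math.tanh(softplus(x))


samples = [mish(v) for v in (-20.0, 0.0, 1.0, 20.0)]
```

Unlike ReLU, Mish lets small negative values pass through slightly attenuated rather than clipping them to zero.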
Span-based models for the joint entity and relation extraction task have been studied exhaustively. However, these models sample a large number of negative entities and negative relations during training, which are essential but result in grossly imbalanced data distributions and in turn cause suboptimal model performance. To address these issues, we propose a two-phase paradigm for span-based joint entity and relation extraction, which involves classifying the entities and relations in the first phase, and predicting the types of these entities and relations in the second phase. The two-phase paradigm enables our model to significantly reduce the data distribution gap, including the gap between negative entities and other entities, as well as the gap between negative relations and other relations. In addition, we make the first attempt at combining entity type and entity distance as global features, which has proven effective, especially for relation extraction. Experimental results on several datasets demonstrate that the span-based joint extraction model augmented with the two-phase paradigm and the global features consistently outperforms previous state-of-the-art span-based models for the joint extraction task, establishing a new standard benchmark. Qualitative and quantitative analyses further validate the effectiveness of the proposed paradigm and the global features.
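The imbalance described above can be seen in a toy example: enumerating every candidate span in a short sentence yields far more negatives than gold entities. The sketch below also shows one possible reading of the two-phase split, with a binary entity-or-not label in phase one and type labels only for the surviving spans in phase two. The sentence, the gold annotations, the width limit, and all function names are hypothetical and are not taken from the paper.

```python
def enumerate_spans(tokens, max_width):
    """All contiguous spans (start, end), end exclusive, up to max_width tokens."""
    n = len(tokens)
    return [
        (start, end)
        for start in range(n)
        for end in range(start + 1, min(start + max_width, n) + 1)
    ]


def two_phase_labels(tokens, gold_entities, max_width):
    """Build labels for a binary first phase and a typed second phase."""
    spans = enumerate_spans(tokens, max_width)
    # Phase 1: is this span an entity at all? (binary decision)
    phase_one = [(span, span in gold_entities) for span in spans]
    # Phase 2: which type? Only spans kept by phase 1 are typed.
    phase_two = [(span, gold_entities[span]) for span, keep in phase_one if keep]
    return phase_one, phase_two


# Hypothetical sentence and gold annotations, for illustration only.
tokens = "Marie Curie was born in Warsaw".split()
gold_entities = {(0, 2): "PER", (5, 6): "LOC"}

phase_one, phase_two = two_phase_labels(tokens, gold_entities, max_width=4)
num_negative = sum(1 for _, keep in phase_one if not keep)
num_positive = len(phase_two)
```

In this toy case the typed classifier in the second phase never sees the negative spans, which is the sense in which the type distribution becomes less skewed.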
Antenna selection can address the cost and complexity issues in massive multiple-input multiple-output (MIMO) channels. The sum-rate capacity of a multiuser massive MIMO uplink channel is characterized under Nakagami fading. A mathematically tractable upper bound on the sum-rate capacity is derived for the considered system. Moreover, for a sufficiently large number of base station (BS) antennas, a deterministic equivalent (DE) of the sum-rate bound is derived. Based on this DE, the sum-rate capacity is shown to grow double logarithmically with the number of BS antennas. The validity of the analytical result is confirmed by numerical experiments.
Data-driven discovery of PDEs has made tremendous progress recently, and many canonical PDEs have been discovered successfully as proofs of concept. However, determining the most appropriate PDE without prior references remains challenging in practical applications. In this work, a physics-informed information criterion (PIC) is proposed to measure the parsimony and precision of the discovered PDE synthetically. The proposed PIC achieves state-of-the-art robustness to highly noisy and sparse data on seven canonical PDEs from different physical scenes, which confirms its ability to handle difficult situations. The PIC is also employed to discover unrevealed macroscale governing equations from microscopic simulation data in an actual physical scene. The results show that the discovered macroscale PDE is precise and parsimonious, and satisfies underlying symmetries, which facilitates understanding and simulation of the physical process. The proposed PIC enables practical applications of PDE discovery to unrevealed governing equations in broader physical scenes.
Modeling the evolution of user preference is essential in recommender systems. Recently, dynamic graph-based methods have been studied and have achieved state-of-the-art (SOTA) results for recommendation, the majority of which focus on users' stable long-term preference. However, in real-world scenarios, users' short-term preference evolves dynamically over time. Although there exist sequential methods that attempt to capture it, how to model the evolution of short-term preference with dynamic graph-based methods has not yet been well addressed. In particular: 1) existing methods do not explicitly encode and capture the evolution of short-term preference as sequential methods do; 2) simply using the last few interactions is not enough to model the changing trend. In this paper, we propose Long Short-Term Preference Modeling for Continuous-Time Sequential Recommendation (LSTSR) to capture the evolution of short-term preference on dynamic graphs. Specifically, we explicitly encode short-term preference and optimize it via a memory mechanism, which has three key operations: Message, Aggregate, and Update. Our memory mechanism can not only store one-hop information, but can also be triggered by new interactions online. Extensive experiments conducted on five public datasets show that LSTSR consistently outperforms many state-of-the-art recommendation methods across various lines.
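To make the three named operations concrete, here is a deliberately simple sketch of a per-user memory that is refreshed whenever new interactions arrive. Only the operation names Message, Aggregate, and Update come from the abstract; the time-decayed message, the mean aggregation, the blending rate, and every function name below are assumptions chosen for illustration and are not LSTSR's actual formulation.

```python
def message(user_memory, item_embedding, time_delta):
    # Assumed message: offset of the item from the current memory,
    # down-weighted for interactions that happened longer ago.
    decay = 1.0 / (1.0 + time_delta)
    return [decay * (i - m) for m, i in zip(user_memory, item_embedding)]


def aggregate(messages):
    # Assumed aggregation: element-wise mean over all messages in the batch.
    count = len(messages)
    return [sum(values) / count for values in zip(*messages)]


def update(user_memory, aggregated, rate=0.5):
    # Assumed update: move the memory part of the way toward the aggregate.
    return [m + rate * a for m, a in zip(user_memory, aggregated)]


def process_interactions(user_memory, interactions, rate=0.5):
    """Apply Message -> Aggregate -> Update for one batch of interactions."""
    if not interactions:
        return list(user_memory)  # nothing new: memory is unchanged
    messages = [message(user_memory, item, dt) for item, dt in interactions]
    return update(user_memory, aggregate(messages), rate)


# Two hypothetical interactions: (item embedding, time since interaction).
memory = [0.0, 0.0]
interactions = [([1.0, 0.0], 0.0), ([0.0, 1.0], 1.0)]
new_memory = process_interactions(memory, interactions)
```

Because the memory is only touched when interactions arrive, this kind of update can run online as events stream in.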
Existing intelligent driving technology often struggles to balance smooth driving and fast obstacle avoidance, especially when the vehicle is in a non-structural environment, and is prone to instability in emergency situations. Therefore, this study proposes an autonomous obstacle avoidance control strategy, based on an Attention-long short-term memory (Attention-LSTM) deep learning model and the idea of humanoid driving, that can effectively guarantee vehicle stability. First, we designed autonomous obstacle avoidance control rules to guarantee the safety of unmanned vehicles. Second, we improved the autonomous obstacle avoidance control strategy by combining it with the stability analysis of special vehicles. Third, we constructed a deep learning obstacle avoidance control system through experiments, and the average relative error of this system was 15%. Finally, the stability and accuracy of this control strategy were verified numerically and experimentally. The method proposed in this study can ensure that the unmanned vehicle successfully avoids obstacles while driving smoothly.
This paper considers a lens antenna array-assisted millimeter wave (mmWave) multiuser multiple-input multiple-output (MU-MIMO) system. The base station's beam selection matrix and the user terminals' phase-only beamformers are jointly designed with the aim of maximizing the uplink sum rate. To deal with the formulated mixed-integer optimization problem, a penalty dual decomposition (PDD)-based iterative algorithm is developed by capitalizing on the weighted minimum mean square error (WMMSE), block coordinate descent (BCD), and minorization-maximization (MM) techniques. Moreover, a low-complexity sequential optimization (SO)-based algorithm is proposed at the cost of a slight loss in sum-rate performance. Numerical results demonstrate that the proposed methods can achieve higher sum rates than state-of-the-art methods.