In response to the existing object detection algorithms are applied to complex fire scenarios with poor detection accuracy, slow speed and difficult deployment., this paper proposes a lightweight fire detection algorithm of Light-YOLOv5 that achieves a balance of speed and accuracy. First, the last layer of backbone network is replaced with SepViT Block to enhance the contact of backbone network to global information; second, a Light-BiFPN neck network is designed to lighten the model while improving the feature extraction; third, Global Attention Mechanism (GAM) is fused into the network to make the model more focused on global dimensional features; finally, we use the Mish activation function and SIoU loss to increase the convergence speed and improve the accuracy at the same time. The experimental results show that Light-YOLOv5 improves mAP by 3.3% compared to the original algorithm, reduces the number of parameters by 27.1%, decreases the computation by 19.1%, achieves FPS of 91.1. Even compared to the latest YOLOv7-tiny, the mAP of Light-YOLOv5 is 6.8% higher, which shows the effectiveness of the algorithm.
An exhaustive study has been conducted to investigate span-based models for the joint entity and relation extraction task. However, these models sample a large number of negative entities and negative relations during the model training, which are essential but result in grossly imbalanced data distributions and in turn cause suboptimal model performance. In order to address the above issues, we propose a two-phase paradigm for the span-based joint entity and relation extraction, which involves classifying the entities and relations in the first phase, and predicting the types of these entities and relations in the second phase. The two-phase paradigm enables our model to significantly reduce the data distribution gap, including the gap between negative entities and other entities, as well as the gap between negative relations and other relations. In addition, we make the first attempt at combining entity type and entity distance as global features, which has proven effective, especially for the relation extraction. Experimental results on several datasets demonstrate that the spanbased joint extraction model augmented with the two-phase paradigm and the global features consistently outperforms previous state-of-the-art span-based models for the joint extraction task, establishing a new standard benchmark. Qualitative and quantitative analyses further validate the effectiveness the proposed paradigm and the global features.
Antenna selection is capable of handling the cost and complexity issues in massive multiple-input multiple-output (MIMO) channels. The sum-rate capacity of a multiuser massive MIMO uplink channel is characterized under the Nakagami fading. A mathematically tractable sum-rate capacity upper bound is derived for the considered system. Moreover, for a sufficiently large base station (BS) antenna number, a deterministic equivalent (DE) of the sum-rate bound is derived. Based on this DE, the sum-rate capacity is shown to grow double logarithmically with the number of BS antennas. The validity of the analytical result is confirmed by numerical experiments.
Data-driven discovery of PDEs has made tremendous progress recently, and many canonical PDEs have been discovered successfully for proof-of-concept. However, determining the most proper PDE without prior references remains challenging in terms of practical applications. In this work, a physics-informed information criterion (PIC) is proposed to measure the parsimony and precision of the discovered PDE synthetically. The proposed PIC achieves state-of-the-art robustness to highly noisy and sparse data on seven canonical PDEs from different physical scenes, which confirms its ability to handle difficult situations. The PIC is also employed to discover unrevealed macroscale governing equations from microscopic simulation data in an actual physical scene. The results show that the discovered macroscale PDE is precise and parsimonious, and satisfies underlying symmetries, which facilitates understanding and simulation of the physical process. The proposition of PIC enables practical applications of PDE discovery in discovering unrevealed governing equations in broader physical scenes.
Modeling the evolution of user preference is essential in recommender systems. Recently, dynamic graph-based methods have been studied and achieved SOTA for recommendation, majority of which focus on user's stable long-term preference. However, in real-world scenario, user's short-term preference evolves over time dynamically. Although there exists sequential methods that attempt to capture it, how to model the evolution of short-term preference with dynamic graph-based methods has not been well-addressed yet. In particular: 1) existing methods do not explicitly encode and capture the evolution of short-term preference as sequential methods do; 2) simply using last few interactions is not enough for modeling the changing trend. In this paper, we propose Long Short-Term Preference Modeling for Continuous-Time Sequential Recommendation (LSTSR) to capture the evolution of short-term preference under dynamic graph. Specifically, we explicitly encode short-term preference and optimize it via memory mechanism, which has three key operations: Message, Aggregate and Update. Our memory mechanism can not only store one-hop information, but also trigger with new interactions online. Extensive experiments conducted on five public datasets show that LSTSR consistently outperforms many state-of-the-art recommendation methods across various lines.
Existing intelligent driving technology often has a problem in balancing smooth driving and fast obstacle avoidance, especially when the vehicle is in a non-structural environment, and is prone to instability in emergency situations. Therefore, this study proposed an autonomous obstacle avoidance control strategy that can effectively guarantee vehicle stability based on Attention-long short-term memory (Attention-LSTM) deep learning model with the idea of humanoid driving. First, we designed the autonomous obstacle avoidance control rules to guarantee the safety of unmanned vehicles. Second, we improved the autonomous obstacle avoidance control strategy combined with the stability analysis of special vehicles. Third, we constructed a deep learning obstacle avoidance control through experiments, and the average relative error of this system was 15%. Finally, the stability and accuracy of this control strategy were verified numerically and experimentally. The method proposed in this study can ensure that the unmanned vehicle can successfully avoid the obstacles while driving smoothly.
This paper considers a lens antenna array-assisted millimeter wave (mmWave) multiuser multiple-input multiple-output (MU-MIMO) system. The base station's beam selection matrix and user terminals' phase-only beamformers are jointly designed with the aim of maximizing the uplink sum rate. In order to deal with the formulated mixed-integer optimization problem, a penalty dual decomposition (PDD)-based iterative algorithm is developed via capitalizing on the weighted minimum mean square error (WMMSE), block coordinate descent (BCD), and minorization-maximization (MM) techniques. Moreover, a low-complexity sequential optimization (SO)-based algorithm is proposed at the cost of a slight sum rate performance loss. Numerical results demonstrate that the proposed methods can achieve higher sum rates than state-of-the-art methods.
Constructing high-quality character image datasets is challenging because real-world images are often affected by image degradation. There are limitations when applying current image restoration methods to such real-world character images, since (i) the categories of noise in character images are different from those in general images; (ii) real-world character images usually contain more complex image degradation, e.g., mixed noise at different noise levels. To address these problems, we propose a real-world character restoration network (RCRN) to effectively restore degraded character images, where character skeleton information and scale-ensemble feature extraction are utilized to obtain better restoration performance. The proposed method consists of a skeleton extractor (SENet) and a character image restorer (CiRNet). SENet aims to preserve the structural consistency of the character and normalize complex noise. Then, CiRNet reconstructs clean images from degraded character images and their skeletons. Due to the lack of benchmarks for real-world character image restoration, we constructed a dataset containing 1,606 character images with real-world degradation to evaluate the validity of the proposed method. The experimental results demonstrate that RCRN outperforms state-of-the-art methods quantitatively and qualitatively.
Degraded images commonly exist in the general sources of character images, leading to unsatisfactory character recognition results. Existing methods have dedicated efforts to restoring degraded character images. However, the denoising results obtained by these methods do not appear to improve character recognition performance. This is mainly because current methods only focus on pixel-level information and ignore critical features of a character, such as its glyph, resulting in character-glyph damage during the denoising process. In this paper, we introduce a novel generic framework based on glyph fusion and attention mechanisms, i.e., CharFormer, for precisely recovering character images without changing their inherent glyphs. Unlike existing frameworks, CharFormer introduces a parallel target task for capturing additional information and injecting it into the image denoising backbone, which will maintain the consistency of character glyphs during character image denoising. Moreover, we utilize attention-based networks for global-local feature interaction, which will help to deal with blind denoising and enhance denoising performance. We compare CharFormer with state-of-the-art methods on multiple datasets. The experimental results show the superiority of CharFormer quantitatively and qualitatively.