The mathematical formula is the human language to describe nature and is the essence of scientific research. Finding mathematical formulas from observational data is a major demand of scientific research and a major challenge of artificial intelligence. This area is called symbolic regression. Originally symbolic regression was often formulated as a combinatorial optimization problem and solved using GP or reinforcement learning algorithms. These two kinds of algorithms have strong noise robustness ability and good Versatility. However, inference time usually takes a long time, so the search efficiency is relatively low. Later, based on large-scale pre-training data proposed, such methods use a large number of synthetic data points and expression pairs to train a Generative Pre-Trained Transformer(GPT). Then this GPT can only need to perform one forward propagation to obtain the results, the advantage is that the inference speed is very fast. However, its performance is very dependent on the training data and performs poorly on data outside the training set, which leads to poor noise robustness and Versatility of such methods. So, can we combine the advantages of the above two categories of SR algorithms? In this paper, we propose \textbf{FormulaGPT}, which trains a GPT using massive sparse reward learning histories of reinforcement learning-based SR algorithms as training data. After training, the SR algorithm based on reinforcement learning is distilled into a Transformer. When new test data comes, FormulaGPT can directly generate a "reinforcement learning process" and automatically update the learning policy in context. Tested on more than ten datasets including SRBench, formulaGPT achieves the state-of-the-art performance in fitting ability compared with four baselines. In addition, it achieves satisfactory results in noise robustness, versatility, and inference efficiency.
Mathematical formulas are the crystallization of human wisdom in exploring the laws of nature for thousands of years. Describing the complex laws of nature with a concise mathematical formula is a constant pursuit of scientists and a great challenge for artificial intelligence. This field is called symbolic regression. Symbolic regression was originally formulated as a combinatorial optimization problem, and GP and reinforcement learning algorithms were used to solve it. However, GP is sensitive to hyperparameters, and these two types of algorithms are inefficient. To solve this problem, researchers treat the mapping from data to expressions as a translation problem. And the corresponding large-scale pre-trained model is introduced. However, the data and expression skeletons do not have very clear word correspondences as the two languages do. Instead, they are more like two modalities (e.g., image and text). Therefore, in this paper, we proposed MMSR. The SR problem is solved as a pure multimodal problem, and contrastive learning is also introduced in the training process for modal alignment to facilitate later modal feature fusion. It is worth noting that in order to better promote the modal feature fusion, we adopt the strategy of training contrastive learning loss and other losses at the same time, which only needs one-step training, instead of training contrastive learning loss first and then training other losses. Because our experiments prove training together can make the feature extraction module and feature fusion module running-in better. Experimental results show that compared with multiple large-scale pre-training baselines, MMSR achieves the most advanced results on multiple mainstream datasets including SRBench.
Finding a concise and interpretable mathematical formula that accurately describes the relationship between each variable and the predicted value in the data is a crucial task in scientific research, as well as a significant challenge in artificial intelligence. This problem is referred to as symbolic regression, which is an NP-hard problem. In the previous year, a novel symbolic regression methodology utilizing Monte Carlo Tree Search (MCTS) was advanced, achieving state-of-the-art results on a diverse range of datasets. although this algorithm has shown considerable improvement in recovering target expressions compared to previous methods, the lack of guidance during the MCTS process severely hampers its search efficiency. Recently, some algorithms have added a pre-trained policy network to guide the search of MCTS, but the pre-trained policy network generalizes poorly. To optimize the trade-off between efficiency and versatility, we introduce SR-GPT, a novel algorithm for symbolic regression that integrates Monte Carlo Tree Search (MCTS) with a Generative Pre-Trained Transformer (GPT). By using GPT to guide the MCTS, the search efficiency of MCTS is significantly improved. Next, we utilize the MCTS results to further refine the GPT, enhancing its capabilities and providing more accurate guidance for the MCTS. MCTS and GPT are coupled together and optimize each other until the target expression is successfully determined. We conducted extensive evaluations of SR-GPT using 222 expressions sourced from over 10 different symbolic regression datasets. The experimental results demonstrate that SR-GPT outperforms existing state-of-the-art algorithms in accurately recovering symbolic expressions both with and without added noise.
Symbolic regression aims to derive interpretable symbolic expressions from data in order to better understand and interpret data. %which plays an important role in knowledge discovery and interpretable machine learning. In this study, a symbolic network called PruneSymNet is proposed for symbolic regression. This is a novel neural network whose activation function consists of common elementary functions and operators. The whole network is differentiable and can be trained by gradient descent method. Each subnetwork in the network corresponds to an expression, and our goal is to extract such subnetworks to get the desired symbolic expression. Therefore, a greedy pruning algorithm is proposed to prune the network into a subnetwork while ensuring the accuracy of data fitting. The proposed greedy pruning algorithm preserves the edge with the least loss in each pruning, but greedy algorithm often can not get the optimal solution. In order to alleviate this problem, we combine beam search during pruning to obtain multiple candidate expressions each time, and finally select the expression with the smallest loss as the final result. It was tested on the public data set and compared with the current popular algorithms. The results showed that the proposed algorithm had better accuracy.
Artificial neural networks (ANNs) have permeated various disciplinary domains, ranging from bioinformatics to financial analytics, where their application has become an indispensable facet of contemporary scientific research endeavors. However, the inherent limitations of traditional neural networks arise due to their relatively fixed network structures and activation functions. 1, The type of activation function is single and relatively fixed, which leads to poor "unit representation ability" of the network, and it is often used to solve simple problems with very complex networks; 2, the network structure is not adaptive, it is easy to cause network structure redundant or insufficient. To address the aforementioned issues, this study proposes a novel neural network called X-Net. By utilizing our designed Alternating Backpropagation mechanism, X-Net dynamically selects appropriate activation functions based on derivative information during training to enhance the network's representation capability for specific tasks. Simultaneously, it accurately adjusts the network structure at the neuron level to accommodate tasks of varying complexities and reduce computational costs. Through a series of experiments, we demonstrate the dual advantages of X-Net in terms of reducing model size and improving representation power. Specifically, in terms of the number of parameters, X-Net is only 3$\%$ of baselines on average, and only 1.4$\%$ under some tasks. In terms of representation ability, X-Net can achieve an average $R^2$=0.985 on the fitting task by only optimizing the activation function without introducing any parameters. Finally, we also tested the ability of X-Net to help scientific discovery on data from multiple disciplines such as society, energy, environment, and aerospace, and achieved concise and good results.
Mathematical formulas serve as the means of communication between humans and nature, encapsulating the operational laws governing natural phenomena. The concise formulation of these laws is a crucial objective in scientific research and an important challenge for artificial intelligence (AI). While traditional artificial neural networks (MLP) excel at data fitting, they often yield uninterpretable black box results that hinder our understanding of the relationship between variables x and predicted values y. Moreover, the fixed network architecture in MLP often gives rise to redundancy in both network structure and parameters. To address these issues, we propose MetaSymNet, a novel neural network that dynamically adjusts its structure in real-time, allowing for both expansion and contraction. This adaptive network employs the PANGU meta function as its activation function, which is a unique type capable of evolving into various basic functions during training to compose mathematical formulas tailored to specific needs. We then evolve the neural network into a concise, interpretable mathematical expression. To evaluate MetaSymNet's performance, we compare it with four state-of-the-art symbolic regression algorithms across more than 10 public datasets comprising 222 formulas. Our experimental results demonstrate that our algorithm outperforms others consistently regardless of noise presence or absence. Furthermore, we assess MetaSymNet against MLP and SVM regarding their fitting ability and extrapolation capability, these are two essential aspects of machine learning algorithms. The findings reveal that our algorithm excels in both areas. Finally, we compared MetaSymNet with MLP using iterative pruning in network structure complexity. The results show that MetaSymNet's network structure complexity is obviously less than MLP under the same goodness of fit.
Multi-agent path finding (MAPF) is an abstract model for the navigation of multiple robots in warehouse automation, where multiple robots plan collision-free paths from the start to goal positions. Reinforcement learning (RL) has been employed to develop partially observable distributed MAPF policies that can be scaled to any number of agents. However, RL-based MAPF policies often get agents stuck in deadlock due to warehouse automation's dense and structured obstacles. This paper proposes a novel hybrid MAPF policy, RDE, based on switching among the RL-based MAPF policy, the Distance heat map (DHM)-based policy and the Escape policy. The RL-based policy is used for coordination among agents. In contrast, when no other agents are in the agent's field of view, it can get the next action by querying the DHM. The escape policy that randomly selects valid actions can help agents escape the deadlock. We conduct simulations on warehouse-like structured grid maps using state-of-the-art RL-based MAPF policies (DHC and DCC), which show that RDE can significantly improve their performance.
Benefiting from the rapid development of deep learning, 2D and 3D computer vision applications are deployed in many safe-critical systems, such as autopilot and identity authentication. However, deep learning models are not trustworthy enough because of their limited robustness against adversarial attacks. The physically realizable adversarial attacks further pose fatal threats to the application and human safety. Lots of papers have emerged to investigate the robustness and safety of deep learning models against adversarial attacks. To lead to trustworthy AI, we first construct a general threat model from different perspectives and then comprehensively review the latest progress of both 2D and 3D adversarial attacks. We extend the concept of adversarial examples beyond imperceptive perturbations and collate over 170 papers to give an overview of deep learning model robustness against various adversarial attacks. To the best of our knowledge, we are the first to systematically investigate adversarial attacks for 3D models, a flourishing field applied to many real-world applications. In addition, we examine physical adversarial attacks that lead to safety violations. Last but not least, we summarize present popular topics, give insights on challenges, and shed light on future research on trustworthy AI.