Alert button
Picture for Xiang Gao

Xiang Gao

Alert button

Machine Learning Driven Sensitivity Analysis of E3SM Land Model Parameters for Wetland Methane Emissions

Dec 05, 2023
Sandeep Chinta, Xiang Gao, Qing Zhu

Methane (CH4) is the second most critical greenhouse gas after carbon dioxide, contributing to 16-25% of the observed atmospheric warming. Wetlands are the primary natural source of methane emissions globally. However, wetland methane emission estimates from biogeochemistry models contain considerable uncertainty. One of the main sources of this uncertainty arises from the numerous uncertain model parameters within various physical, biological, and chemical processes that influence methane production, oxidation, and transport. Sensitivity Analysis (SA) can help identify critical parameters for methane emission and achieve reduced biases and uncertainties in future projections. This study performs SA for 19 selected parameters responsible for critical biogeochemical processes in the methane module of the Energy Exascale Earth System Model (E3SM) land model (ELM). The impact of these parameters on various CH4 fluxes is examined at 14 FLUXNET- CH4 sites with diverse vegetation types. Given the extensive number of model simulations needed for global variance-based SA, we employ a machine learning (ML) algorithm to emulate the complex behavior of ELM methane biogeochemistry. ML enables the computational time to be shortened significantly from 6 CPU hours to 0.72 milliseconds, achieving reduced computational costs. We found that parameters linked to CH4 production and diffusion generally present the highest sensitivities despite apparent seasonal variation. Comparing simulated emissions from perturbed parameter sets against FLUXNET-CH4 observations revealed that better performances can be achieved at each site compared to the default parameter values. This presents a scope for further improving simulated emissions using parameter calibration with advanced optimization techniques like Bayesian optimization.

* 24 pages, 9 figures and 2 tables 
Viaarxiv icon

Structured Neural Networks for Density Estimation and Causal Inference

Nov 03, 2023
Asic Q. Chen, Ruian Shi, Xiang Gao, Ricardo Baptista, Rahul G. Krishnan

Injecting structure into neural networks enables learning functions that satisfy invariances with respect to subsets of inputs. For instance, when learning generative models using neural networks, it is advantageous to encode the conditional independence structure of observed variables, often in the form of Bayesian networks. We propose the Structured Neural Network (StrNN), which injects structure through masking pathways in a neural network. The masks are designed via a novel relationship we explore between neural network architectures and binary matrix factorization, to ensure that the desired independencies are respected. We devise and study practical algorithms for this otherwise NP-hard design problem based on novel objectives that control the model architecture. We demonstrate the utility of StrNN in three applications: (1) binary and Gaussian density estimation with StrNN, (2) real-valued density estimation with Structured Autoregressive Flows (StrAFs) and Structured Continuous Normalizing Flows (StrCNF), and (3) interventional and counterfactual analysis with StrAFs for causal inference. Our work opens up new avenues for learning neural networks that enable data-efficient generative modeling and the use of normalizing flows for causal effect estimation.

* 10 pages with 5 figures, to be published in Neural Information Processing Systems 2023 
Viaarxiv icon

Colmap-PCD: An Open-source Tool for Fine Image-to-point cloud Registration

Oct 09, 2023
Chunge Bai, Ruijie Fu, Xiang Gao

State-of-the-art techniques for monocular camera reconstruction predominantly rely on the Structure from Motion (SfM) pipeline. However, such methods often yield reconstruction outcomes that lack crucial scale information, and over time, accumulation of images leads to inevitable drift issues. In contrast, mapping methods based on LiDAR scans are popular in large-scale urban scene reconstruction due to their precise distance measurements, a capability fundamentally absent in visual-based approaches. Researchers have made attempts to utilize concurrent LiDAR and camera measurements in pursuit of precise scaling and color details within mapping outcomes. However, the outcomes are subject to extrinsic calibration and time synchronization precision. In this paper, we propose a novel cost-effective reconstruction pipeline that utilizes a pre-established LiDAR map as a fixed constraint to effectively address the inherent scale challenges present in monocular camera reconstruction. To our knowledge, our method is the first to register images onto the point cloud map without requiring synchronous capture of camera and LiDAR data, granting us the flexibility to manage reconstruction detail levels across various areas of interest. To facilitate further research in this domain, we have released Colmap-PCD${^{3}}$, an open-source tool leveraging the Colmap algorithm, that enables precise fine-scale registration of images to the point cloud map.

Viaarxiv icon

Incremental Rotation Averaging Revisited and More: A New Rotation Averaging Benchmark

Sep 29, 2023
Xiang Gao, Hainan Cui, Shuhan Shen

In order to further advance the accuracy and robustness of the incremental parameter estimation-based rotation averaging methods, in this paper, a new member of the Incremental Rotation Averaging (IRA) family is introduced, which is termed as IRAv4. As the most significant feature of the IRAv4, a task-specific connected dominating set is extracted to serve as a more reliable and accurate reference for rotation global alignment. In addition, to further address the limitations of the existing rotation averaging benchmark of relying on the slightly outdated Bundler camera calibration results as ground truths and focusing solely on rotation estimation accuracy, this paper presents a new COLMAP-based rotation averaging benchmark that incorporates a cross check between COLMAP and Bundler, and employ the accuracy of both rotation and downstream location estimation as evaluation metrics, which is desired to provide a more reliable and comprehensive evaluation tool for the rotation averaging research. Comprehensive comparisons between the proposed IRAv4 and other mainstream rotation averaging methods on this new benchmark demonstrate the effectiveness of our proposed approach.

* Submitted to IEEE Transactions 
Viaarxiv icon

Sensing User's Activity, Channel, and Location with Near-Field Extra-Large-Scale MIMO

Jul 20, 2023
Li Qiao, Anwen Liao, Zhuoran Li, Hua Wang, Zhen Gao, Xiang Gao, Yu Su, Pei Xiao, Li You, Derrick Wing Kwan Ng

Figure 1 for Sensing User's Activity, Channel, and Location with Near-Field Extra-Large-Scale MIMO
Figure 2 for Sensing User's Activity, Channel, and Location with Near-Field Extra-Large-Scale MIMO
Figure 3 for Sensing User's Activity, Channel, and Location with Near-Field Extra-Large-Scale MIMO
Figure 4 for Sensing User's Activity, Channel, and Location with Near-Field Extra-Large-Scale MIMO

This paper proposes a grant-free massive access scheme based on the millimeter wave (mmWave) extra-large-scale multiple-input multiple-output (XL-MIMO) to support massive Internet-of-Things (IoT) devices with low latency, high data rate, and high localization accuracy in the upcoming sixth-generation (6G) networks. The XL-MIMO consists of multiple antenna subarrays that are widely spaced over the service area to ensure line-of-sight (LoS) transmissions. First, we establish the XL-MIMO-based massive access model considering the near-field spatial non-stationary (SNS) property. Then, by exploiting the block sparsity of subarrays and the SNS property, we propose a structured block orthogonal matching pursuit algorithm for efficient active user detection (AUD) and channel estimation (CE). Furthermore, different sensing matrices are applied in different pilot subcarriers for exploiting the diversity gains. Additionally, a multi-subarray collaborative localization algorithm is designed for localization. In particular, the angle of arrival (AoA) and time difference of arrival (TDoA) of the LoS links between active users and related subarrays are extracted from the estimated XL-MIMO channels, and then the coordinates of active users are acquired by jointly utilizing the AoAs and TDoAs. Simulation results show that the proposed algorithms outperform existing algorithms in terms of AUD and CE performance and can achieve centimeter-level localization accuracy.

* Submitted to IEEE Transactions on Communications, Major revision. Codes will be open to all on https://gaozhen16.github.io/ soon 
Viaarxiv icon

Modularizing while Training: a New Paradigm for Modularizing DNN Models

Jun 15, 2023
Binhang Qi, Hailong Sun, Hongyu Zhang, Ruobing Zhao, Xiang Gao

Figure 1 for Modularizing while Training: a New Paradigm for Modularizing DNN Models
Figure 2 for Modularizing while Training: a New Paradigm for Modularizing DNN Models
Figure 3 for Modularizing while Training: a New Paradigm for Modularizing DNN Models
Figure 4 for Modularizing while Training: a New Paradigm for Modularizing DNN Models

Deep neural network (DNN) models have become increasingly crucial components in intelligent software systems. However, training a DNN model is typically expensive in terms of both time and money. To address this issue, researchers have recently focused on reusing existing DNN models - borrowing the idea of code reuse in software engineering. However, reusing an entire model could cause extra overhead or inherits the weakness from the undesired functionalities. Hence, existing work proposes to decompose an already trained model into modules, i.e., modularizing-after-training, and enable module reuse. Since trained models are not built for modularization, modularizing-after-training incurs huge overhead and model accuracy loss. In this paper, we propose a novel approach that incorporates modularization into the model training process, i.e., modularizing-while-training (MwT). We train a model to be structurally modular through two loss functions that optimize intra-module cohesion and inter-module coupling. We have implemented the proposed approach for modularizing Convolutional Neural Network (CNN) models in this work. The evaluation results on representative models demonstrate that MwT outperforms the state-of-the-art approach. Specifically, the accuracy loss caused by MwT is only 1.13 percentage points, which is 1.76 percentage points less than that of the baseline. The kernel retention rate of the modules generated by MwT is only 14.58%, with a reduction of 74.31% over the state-of-the-art approach. Furthermore, the total time cost required for training and modularizing is only 108 minutes, half of the baseline.

* Accepted at ICSE'24 
Viaarxiv icon

Open Set Relation Extraction via Unknown-Aware Training

Jun 08, 2023
Jun Zhao, Xin Zhao, Wenyu Zhan, Qi Zhang, Tao Gui, Zhongyu Wei, Yunwen Chen, Xiang Gao, Xuanjing Huang

Figure 1 for Open Set Relation Extraction via Unknown-Aware Training
Figure 2 for Open Set Relation Extraction via Unknown-Aware Training
Figure 3 for Open Set Relation Extraction via Unknown-Aware Training
Figure 4 for Open Set Relation Extraction via Unknown-Aware Training

The existing supervised relation extraction methods have achieved impressive performance in a closed-set setting, where the relations during both training and testing remain the same. In a more realistic open-set setting, unknown relations may appear in the test set. Due to the lack of supervision signals from unknown relations, a well-performing closed-set relation extractor can still confidently misclassify them into known relations. In this paper, we propose an unknown-aware training method, regularizing the model by dynamically synthesizing negative instances. To facilitate a compact decision boundary, ``difficult'' negative instances are necessary. Inspired by text adversarial attacks, we adaptively apply small but critical perturbations to original training instances and thus synthesizing negative instances that are more likely to be mistaken by the model as known relations. Experimental results show that this method achieves SOTA unknown relation detection without compromising the classification of known relations.

* Accepted by ACL2023 
Viaarxiv icon

Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model

May 22, 2023
Xiao Wang, Weikang Zhou, Qi Zhang, Jie Zhou, Songyang Gao, Junzhe Wang, Menghan Zhang, Xiang Gao, Yunwen Chen, Tao Gui

Figure 1 for Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model
Figure 2 for Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model
Figure 3 for Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model
Figure 4 for Farewell to Aimless Large-scale Pretraining: Influential Subset Selection for Language Model

Pretrained language models have achieved remarkable success in various natural language processing tasks. However, pretraining has recently shifted toward larger models and larger data, and this has resulted in significant computational and energy costs. In this paper, we propose Influence Subset Selection (ISS) for language model, which explicitly utilizes end-task knowledge to select a tiny subset of the pretraining corpus. Specifically, the ISS selects the samples that will provide the most positive influence on the performance of the end-task. Furthermore, we design a gradient matching based influence estimation method, which can drastically reduce the computation time of influence. With only 0.45% of the data and a three-orders-of-magnitude lower computational cost, ISS outperformed pretrained models (e.g., RoBERTa) on eight datasets covering four domains.

* Accepted by ACL2023 
Viaarxiv icon

Reusing Deep Neural Network Models through Model Re-engineering

Apr 01, 2023
Binhang Qi, Hailong Sun, Xiang Gao, Hongyu Zhang, Zhaotian Li, Xudong Liu

Figure 1 for Reusing Deep Neural Network Models through Model Re-engineering
Figure 2 for Reusing Deep Neural Network Models through Model Re-engineering
Figure 3 for Reusing Deep Neural Network Models through Model Re-engineering
Figure 4 for Reusing Deep Neural Network Models through Model Re-engineering

Training deep neural network (DNN) models, which has become an important task in today's software development, is often costly in terms of computational resources and time. With the inspiration of software reuse, building DNN models through reusing existing ones has gained increasing attention recently. Prior approaches to DNN model reuse have two main limitations: 1) reusing the entire model, while only a small part of the model's functionalities (labels) are required, would cause much overhead (e.g., computational and time costs for inference), and 2) model reuse would inherit the defects and weaknesses of the reused model, and hence put the new system under threats of security attack. To solve the above problem, we propose SeaM, a tool that re-engineers a trained DNN model to improve its reusability. Specifically, given a target problem and a trained model, SeaM utilizes a gradient-based search method to search for the model's weights that are relevant to the target problem. The re-engineered model that only retains the relevant weights is then reused to solve the target problem. Evaluation results on widely-used models show that the re-engineered models produced by SeaM only contain 10.11% weights of the original models, resulting 42.41% reduction in terms of inference time. For the target problem, the re-engineered models even outperform the original models in classification accuracy by 5.85%. Moreover, reusing the re-engineered models inherits an average of 57% fewer defects than reusing the entire model. We believe our approach to reducing reuse overhead and defect inheritance is one important step forward for practical model reuse.

* Accepted by ICSE'23 
Viaarxiv icon

Learning Regularized Positional Encoding for Molecular Prediction

Nov 23, 2022
Xiang Gao, Weihao Gao, Wenzhi Xiao, Zhirui Wang, Chong Wang, Liang Xiang

Figure 1 for Learning Regularized Positional Encoding for Molecular Prediction
Figure 2 for Learning Regularized Positional Encoding for Molecular Prediction
Figure 3 for Learning Regularized Positional Encoding for Molecular Prediction
Figure 4 for Learning Regularized Positional Encoding for Molecular Prediction

Machine learning has become a promising approach for molecular modeling. Positional quantities, such as interatomic distances and bond angles, play a crucial role in molecule physics. The existing works rely on careful manual design of their representation. To model the complex nonlinearity in predicting molecular properties in an more end-to-end approach, we propose to encode the positional quantities with a learnable embedding that is continuous and differentiable. A regularization technique is employed to encourage embedding smoothness along the physical dimension. We experiment with a variety of molecular property and force field prediction tasks. Improved performance is observed for three different model architectures after plugging in the proposed positional encoding method. In addition, the learned positional encoding allows easier physics-based interpretation. We observe that tasks of similar physics have the similar learned positional encoding.

* AI4Science Workshop at NeurIPS 2022 
Viaarxiv icon