Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ziheng Sun

GraphProp: Training the Graph Foundation Models using Graph Properties

Aug 06, 2025

Ziheng Sun, Qi Feng, Lehao Lin, Chris Ding, Jicong Fan

Abstract:This work focuses on training graph foundation models (GFMs) that have strong generalization ability in graph-level tasks such as graph classification. Effective GFM training requires capturing information consistent across different domains. We discover that graph structures provide more consistent cross-domain information compared to node features and graph labels. However, traditional GFMs primarily focus on transferring node features from various domains into a unified representation space but often lack structural cross-domain generalization. To address this, we introduce GraphProp, which emphasizes structural generalization. The training process of GraphProp consists of two main phases. First, we train a structural GFM by predicting graph invariants. Since graph invariants are properties of graphs that depend only on the abstract structure, not on particular labellings or drawings of the graph, this structural GFM has a strong ability to capture the abstract structural information and provide discriminative graph representations comparable across diverse domains. In the second phase, we use the representations given by the structural GFM as positional encodings to train a comprehensive GFM. This phase utilizes domain-specific node attributes and graph labels to further improve cross-domain node feature generalization. Our experiments demonstrate that GraphProp significantly outperforms the competitors in supervised learning and few-shot learning, especially in handling graphs without node attributes.

Via

Access Paper or Ask Questions

Learnable Kernel Density Estimation for Graphs

May 27, 2025

Xudong Wang, Ziheng Sun, Chris Ding, Jicong Fan

Abstract:This work proposes a framework LGKDE that learns kernel density estimation for graphs. The key challenge in graph density estimation lies in effectively capturing both structural patterns and semantic variations while maintaining theoretical guarantees. Combining graph kernels and kernel density estimation (KDE) is a standard approach to graph density estimation, but has unsatisfactory performance due to the handcrafted and fixed features of kernels. Our method LGKDE leverages graph neural networks to represent each graph as a discrete distribution and utilizes maximum mean discrepancy to learn the graph metric for multi-scale KDE, where all parameters are learned by maximizing the density of graphs relative to the density of their well-designed perturbed counterparts. The perturbations are conducted on both node features and graph spectra, which helps better characterize the boundary of normal density regions. Theoretically, we establish consistency and convergence guarantees for LGKDE, including bounds on the mean integrated squared error, robustness, and complexity. We validate LGKDE by demonstrating its effectiveness in recovering the underlying density of synthetic graph distributions and applying it to graph anomaly detection across diverse benchmark datasets. Extensive empirical evaluation shows that LGKDE demonstrates superior performance compared to state-of-the-art baselines on most benchmark datasets.

* Under Review

Via

Access Paper or Ask Questions

Back To The Future: A Hybrid Transformer-XGBoost Model for Action-oriented Future-proofing Nowcasting

Dec 21, 2024

Ziheng Sun

Abstract:Inspired by the iconic movie Back to the Future, this paper explores an innovative adaptive nowcasting approach that reimagines the relationship between present actions and future outcomes. In the movie, characters travel through time to manipulate past events, aiming to create a better future. Analogously, our framework employs predictive insights about the future to inform and adjust present conditions. This dual-stage model integrates the forecasting power of Transformers (future visionary) with the interpretability and efficiency of XGBoost (decision maker), enabling a seamless loop of future prediction and present adaptation. Through experimentation with meteorological datasets, we demonstrate the framework's advantage in achieving more accurate forecasting while guiding actionable interventions for real-time applications.

Via

Access Paper or Ask Questions

K-means Derived Unsupervised Feature Selection using Improved ADMM

Nov 19, 2024

Ziheng Sun, Chris Ding, Jicong Fan

Figure 1 for K-means Derived Unsupervised Feature Selection using Improved ADMM

Figure 2 for K-means Derived Unsupervised Feature Selection using Improved ADMM

Figure 3 for K-means Derived Unsupervised Feature Selection using Improved ADMM

Figure 4 for K-means Derived Unsupervised Feature Selection using Improved ADMM

Abstract:Feature selection is important for high-dimensional data analysis and is non-trivial in unsupervised learning problems such as dimensionality reduction and clustering. The goal of unsupervised feature selection is finding a subset of features such that the data points from different clusters are well separated. This paper presents a novel method called K-means Derived Unsupervised Feature Selection (K-means UFS). Unlike most existing spectral analysis based unsupervised feature selection methods, we select features using the objective of K-means. We develop an alternating direction method of multipliers (ADMM) to solve the NP-hard optimization problem of our K-means UFS model. Extensive experiments on real datasets show that our K-means UFS is more effective than the baselines in selecting features for clustering.

Via

Access Paper or Ask Questions