Xinping Yi

Invariant Correlation of Representation with Label

Jul 01, 2024

Gaojie Jin, Ronghui Mu, Xinping Yi, Xiaowei Huang, Lijun Zhang

Abstract: The Invariant Risk Minimization (IRM) approach aims to address the challenge of domain generalization by training a feature representation that remains invariant across multiple environments. However, in noisy environments, IRM-related techniques such as IRMv1 and VREx may be unable to achieve the optimal IRM solution, primarily due to erroneous optimization directions. To address this issue, we introduce ICorr (an abbreviation for Invariant Correlation), a novel approach designed to surmount the above challenge in noisy settings. Additionally, we dig into a case study to analyze why previous methods may lose ground while ICorr can succeed. Through a theoretical lens, particularly from a causality perspective, we illustrate that the invariant correlation of representation with label is a necessary condition for the optimal invariant predictor in noisy environments, whereas the optimization motivations for other methods may not be. Furthermore, we empirically demonstrate the effectiveness of ICorr by comparing it with other domain generalization methods on various noisy datasets.
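
The idea of keeping the correlation between representation and label invariant across environments can be illustrated with a small NumPy sketch. This is not the authors' objective: the function names, the use of a scalar representation, and the choice of the variance of per-environment Pearson correlations as a penalty are all assumptions made for illustration.

```python
import numpy as np

def correlation(rep, label):
    """Pearson correlation between a scalar representation and a label.

    Assumes both inputs are 1-D and non-constant (illustrative helper).
    """
    rep = rep - rep.mean()
    label = label - label.mean()
    denom = np.sqrt((rep ** 2).sum() * (label ** 2).sum())
    return float((rep * label).sum() / denom)

def correlation_variance_penalty(reps, labels):
    """Variance of the representation-label correlation across environments.

    reps and labels are lists with one 1-D array per environment.
    A value of zero means the correlation is identical in every environment.
    This penalty is a hypothetical stand-in, not the loss from the paper.
    """
    corrs = np.array([correlation(r, y) for r, y in zip(reps, labels)])
    return float(corrs.var()), corrs
```

In a training loop, such a penalty would typically be added to the average risk with a weighting coefficient; the weighting scheme is left out here because the abstract does not specify it.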

Towards Unified AI Models for MU-MIMO Communications: A Tensor Equivariance Framework

Jun 13, 2024

Yafei Wang, Hongwei Hou, Xinping Yi, Wenjin Wang, Shi Jin

Abstract: In this paper, we propose a unified framework based on equivariance for the design of artificial intelligence (AI)-assisted technologies in multi-user multiple-input-multiple-output (MU-MIMO) systems. We first provide definitions of multidimensional equivariance, high-order equivariance, and multidimensional invariance (referred to collectively as tensor equivariance). On this basis, by investigating the design of precoding and user scheduling, which are key techniques in MU-MIMO systems, we reveal the tensor equivariance of the mappings from channel information to optimal precoding tensors, precoding auxiliary tensors, and scheduling indicators, respectively. To model mappings with tensor equivariance, we propose a series of plug-and-play tensor equivariant neural network (TENN) modules, where the computation involving intricate parameter sharing patterns is transformed into concise tensor operations. Building upon TENN modules, we propose a unified tensor equivariance framework that is applicable to various communication tasks, based on which we easily accomplish the design of corresponding AI-assisted precoding and user scheduling schemes. Simulation results demonstrate that the constructed precoding and user scheduling methods achieve near-optimal performance while exhibiting significantly lower computational complexity and generalization to inputs with varying sizes across multiple dimensions. This validates the superiority of TENN modules and the unified framework.

* This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
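
As a rough illustration of what equivariance means for a mapping over users, the sketch below builds a generic permutation-equivariant linear layer (in the DeepSets style) and can be used to check that permuting the input rows permutes the output rows in the same way. It is not one of the TENN modules from the paper; the layer form, the function name, and the weight names are assumptions.

```python
import numpy as np

def equivariant_layer(x, w_self, w_mean):
    """Permutation-equivariant linear layer over the rows of x.

    x has shape (num_users, features). Each row is transformed by w_self,
    and a shared term computed from the row mean is added to every row.
    Because the parameters do not depend on num_users, the same weights
    apply to inputs with any number of rows.
    """
    return x @ w_self + x.mean(axis=0, keepdims=True) @ w_mean
```

Reordering the users before the layer gives the same result as reordering them after it, which is the one-dimensional special case of the equivariance property the abstract describes.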

Continuous Geometry-Aware Graph Diffusion via Hyperbolic Neural PDE

Jun 03, 2024

Jiaxu Liu, Xinping Yi, Sihao Wu, Xiangyu Yin, Tianle Zhang, Xiaowei Huang, Jin Shi

Abstract: While the Hyperbolic Graph Neural Network (HGNN) has recently emerged as a powerful tool for dealing with hierarchical graph data, limitations in scalability and efficiency hinder it from generalizing to deep models. In this paper, by envisioning depth as a continuous-time embedding evolution, we decouple the HGNN and reframe the information propagation as a partial differential equation, letting node-wise attention undertake the role of diffusivity within the Hyperbolic Neural PDE (HPDE). By introducing theoretical principles, e.g., field and flow, gradient, divergence, and diffusivity on a non-Euclidean manifold for HPDE integration, we discuss both implicit and explicit discretization schemes to formulate numerical HPDE solvers. Further, we propose the Hyperbolic Graph Diffusion Equation (HGDE), a flexible vector flow function that can be integrated to obtain expressive hyperbolic node embeddings. By analyzing potential energy decay of embeddings, we demonstrate that HGDE is capable of modeling both low- and high-order proximity with the benefit of local-global diffusivity functions. Experiments on node classification, link prediction, and image-text classification tasks verify the superiority of the proposed method, which consistently outperforms various competitive models by a significant margin.

* The short version of this work will appear in the Proceedings of the 2024 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2024)
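
The difference between explicit and implicit discretization of a graph diffusion equation can be sketched in the ordinary Euclidean setting. The code below is only a flat-space stand-in for the hyperbolic solvers the abstract mentions: it uses the plain heat equation dX/dt = -L X with an unweighted Laplacian and no attention-based diffusivity, and all function names are assumptions.

```python
import numpy as np

def graph_laplacian(adj):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    return np.diag(adj.sum(axis=1)) - adj

def explicit_step(x, lap, tau):
    """Forward Euler: X_next = X - tau * L X (stable only for small tau)."""
    return x - tau * (lap @ x)

def implicit_step(x, lap, tau):
    """Backward Euler: solve (I + tau * L) X_next = X."""
    n = lap.shape[0]
    return np.linalg.solve(np.eye(n) + tau * lap, x)

def dirichlet_energy(x, lap):
    """Sum of squared feature differences over edges, trace(X^T L X)."""
    return float(np.trace(x.T @ lap @ x))
```

In this Euclidean case both steps reduce the Dirichlet energy of the embeddings, which gives a simple picture of the energy-decay analysis the abstract refers to; the explicit step needs a step size below 2 divided by the largest Laplacian eigenvalue, while the implicit step is stable for any positive step size at the cost of a linear solve.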

Tiny Refinements Elicit Resilience: Toward Efficient Prefix-Model Against LLM Red-Teaming

May 21, 2024

Jiaxu Liu, Xiangyu Yin, Sihao Wu, Jianhong Wang, Meng Fang, Xinping Yi, Xiaowei Huang

Abstract: With the proliferation of red-teaming strategies for Large Language Models (LLMs), the deficiency in the literature about improving the safety and robustness of LLM defense strategies is becoming increasingly pronounced. This paper introduces the LLM-based sentinel model as a plug-and-play prefix module designed to reconstruct the input prompt with just a few (<30) additional tokens, effectively reducing toxicity in responses from target LLMs. The sentinel model naturally overcomes the parameter inefficiency and limited model accessibility for fine-tuning large target models. We employ an interleaved training regimen using Proximal Policy Optimization (PPO) to optimize both red team and sentinel models dynamically, incorporating a value head-sharing mechanism inspired by the multi-agent centralized critic to manage the complex interplay between agents. Our extensive experiments across text-to-text and text-to-image tasks demonstrate the effectiveness of our approach in mitigating toxic outputs, even when dealing with larger models like Llama-2, GPT-3.5, and Stable-Diffusion, highlighting the potential of our framework in enhancing safety and robustness in various applications.

* Preprint, 10 pages main with 10 pages appendix

Direct Training Needs Regularisation: Anytime Optimal Inference Spiking Neural Network

Apr 15, 2024

Dengyu Wu, Yi Qi, Kaiwen Cai, Gaojie Jin, Xinping Yi, Xiaowei Huang

Abstract: The Spiking Neural Network (SNN) is acknowledged as the next generation of Artificial Neural Network (ANN) and holds great promise in effectively processing spatial-temporal information. However, the choice of timestep becomes crucial as it significantly impacts the accuracy of the neural network training. Specifically, a smaller timestep indicates better performance in efficient computing, resulting in reduced latency and fewer operations. At the same time, using a small timestep may lead to low accuracy due to insufficient information presentation with few spikes. This observation motivates us to develop an SNN that is more reliable for adaptive timestep by introducing a novel regularisation technique, namely Spatial-Temporal Regulariser (STR). Our approach regulates the ratio between the strength of spikes and membrane potential at each timestep. This effectively balances spatial and temporal performance during training, ultimately resulting in an Anytime Optimal Inference (AOI) SNN. Through extensive experiments on frame-based and event-based datasets, our method, in combination with cutoff based on softmax output, achieves state-of-the-art performance in terms of both latency and accuracy. Notably, with STR and cutoff, the SNN achieves 2.14 to 2.89 times faster inference compared to the pre-configured timestep, with a near-zero accuracy drop of 0.50% to 0.64% on the event-based datasets. Code available: https://github.com/Dengyu-Wu/AOI-SNN-Regularisation
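
A cutoff based on softmax output can be pictured as an early-exit loop over timesteps. The sketch below is an assumption about how such a rule might look, not the implementation in the linked repository: it averages the per-timestep output logits, applies a softmax, and stops once the top class probability reaches a confidence threshold. The function names and the default threshold are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def anytime_predict(step_logits, threshold=0.9):
    """Return (predicted_class, timesteps_used) with a confidence cutoff.

    step_logits is a non-empty sequence of 1-D arrays, one per timestep.
    Inference stops at the first timestep where the softmax of the running
    mean of the logits has a maximum probability of at least `threshold`;
    if that never happens, all timesteps are used.
    """
    total = np.zeros_like(step_logits[0], dtype=float)
    for t, logits in enumerate(step_logits, start=1):
        total += logits
        probs = softmax(total / t)
        if probs.max() >= threshold:
            break
    return int(probs.argmax()), t
```

Lowering the threshold trades accuracy for latency: fewer timesteps are consumed on easy inputs, while hard inputs still run for the full pre-configured length.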

Robust Symbol-Level Precoding for Massive MIMO Communication Under Channel Aging

Feb 07, 2024

Yafei Wang, Xinping Yi, Hongwei Hou, Wenjin Wang, Shi Jin

Abstract: This paper investigates the robust design of symbol-level precoding (SLP) for multiuser multiple-input multiple-output (MIMO) downlink transmission with imperfect channel state information (CSI) caused by channel aging. By utilizing the a posteriori channel model based on the widely adopted jointly correlated channel model, the imperfect CSI is modeled as the statistical CSI incorporating the channel mean and channel variance information with spatial correlation. With the signal model in the presence of channel aging, we formulate the signal-to-interference-plus-noise ratio (SINR) balancing and minimum mean square error (MMSE) problems for robust SLP design. The former aims to maximize the minimum SINR across users, while the latter minimizes the mean square error between the received signal and the target constellation point. In massive MIMO scenarios, the increase in the number of antennas poses a computational complexity challenge, limiting the deployment of SLP schemes. To address this challenge, we simplify the objective function of the SINR balancing problem and further derive a closed-form SLP scheme. In addition, by approximating the matrix involved in the computation, we modify the proposed algorithm and develop an MMSE-based SLP scheme with lower computational complexity. Simulation results confirm the superiority of the proposed schemes over the state-of-the-art SLP schemes.

Rethinking Spectral Graph Neural Networks with Spatially Adaptive Filtering

Jan 30, 2024

Jingwei Guo, Kaizhu Huang, Xinping Yi, Zixian Su, Rui Zhang

Abstract: Whilst spectral Graph Neural Networks (GNNs) are theoretically well-founded in the spectral domain, their practical reliance on polynomial approximation implies a profound linkage to the spatial domain. As previous studies rarely examine spectral GNNs from the spatial perspective, their spatial-domain interpretability remains elusive, e.g., what information is essentially encoded by spectral GNNs in the spatial domain? In this paper, to answer this question, we establish a theoretical connection between spectral filtering and spatial aggregation, unveiling an intrinsic interaction: spectral filtering implicitly leads the original graph to an adapted new graph, explicitly computed for spatial aggregation. Both theoretical and empirical investigations reveal that the adapted new graph not only exhibits non-locality but also accommodates signed edge weights to reflect label consistency among nodes. These findings thus highlight the interpretable role of spectral GNNs in the spatial domain and inspire us to rethink graph spectral filters beyond the fixed-order polynomials, which neglect global information. Built upon the theoretical findings, we revisit the state-of-the-art spectral GNNs and propose a novel Spatially Adaptive Filtering (SAF) framework, which leverages the adapted new graph by spectral filtering for an auxiliary non-local aggregation. Notably, our proposed SAF comprehensively models both node similarity and dissimilarity from a global perspective, therefore alleviating persistent deficiencies of GNNs related to long-range dependencies and graph heterophily. Extensive experiments over 13 node classification benchmarks demonstrate the superiority of our proposed framework over the state-of-the-art models.

Graph Neural Networks with Diverse Spectral Filtering

Dec 14, 2023

Jingwei Guo, Kaizhu Huang, Xinping Yi, Rui Zhang

Abstract: Spectral Graph Neural Networks (GNNs) have achieved tremendous success in graph machine learning, with polynomial filters applied for graph convolutions, where all nodes share identical filter weights to mine their local contexts. Despite this success, existing spectral GNNs usually fail to deal with complex networks (e.g., WWW) because such a homogeneous spectral filtering setting ignores the regional heterogeneity typically seen in real-world networks. To tackle this issue, we propose a novel diverse spectral filtering (DSF) framework, which automatically learns node-specific filter weights to exploit the varying local structure properly. In particular, the diverse filter weights consist of two components -- a global one shared among all nodes, and a local one that varies along network edges to reflect node differences arising from distinct graph parts -- to balance between local and global information. As such, not only can the global graph characteristics be captured, but the diverse local patterns can also be mined with awareness of different node positions. Interestingly, we formulate a novel optimization problem to assist in learning diverse filters, which also enables us to enhance any spectral GNN with our DSF framework. We showcase the proposed framework on three state-of-the-art spectral GNNs: GPR-GNN, BernNet, and JacobiConv. Extensive experiments over 10 benchmark datasets demonstrate that our framework can consistently boost model performance by up to 4.92% in node classification tasks, producing diverse filters with enhanced interpretability. Code is available at https://github.com/jingweio/DSF.

* Accepted by Proceedings of the ACM Web Conference 2023 (WWW '23)
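
The contrast between a homogeneous polynomial filter and node-specific filter weights can be sketched with NumPy. This is a simplified illustration, not the DSF implementation from the linked repository: the monomial basis, the function names, and the additive split of the weights into a global and a local part are assumptions.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetrically normalized adjacency D^{-1/2} A D^{-1/2}."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    return d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

def polynomial_filter(adj_norm, x, weights):
    """Apply sum_k W_k * (A_norm^k X).

    weights has shape (K+1,) for a filter shared by all nodes, or
    (K+1, num_nodes) for node-specific weights, where column i holds
    the polynomial coefficients used at node i.
    """
    weights = np.asarray(weights, dtype=float)
    out = np.zeros_like(x, dtype=float)
    z = x.astype(float)
    for w_k in weights:
        # Broadcast per-node coefficients over the feature dimension.
        scale = w_k[:, None] if np.ndim(w_k) == 1 else w_k
        out += scale * z
        z = adj_norm @ z
    return out
```

Under these assumptions, node-specific weights could be formed as `global_w[:, None] + local_w`, with `global_w` of shape (K+1,) and `local_w` of shape (K+1, num_nodes); when `local_w` is zero the filter reduces to the shared one.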

Beam-Delay Domain Channel Estimation for mmWave XL-MIMO Systems

Dec 10, 2023

Hongwei Hou, Xuan He, Tianhao Fang, Xinping Yi, Wenjin Wang, Shi Jin

Abstract: This paper investigates the uplink channel estimation of the millimeter-wave (mmWave) extremely large-scale multiple-input-multiple-output (XL-MIMO) communication system in the beam-delay domain, taking into account the near-field and beam-squint effects due to the transmission bandwidth and array aperture growth. Specifically, we model the sparsity in the delay domain to explore inter-subcarrier correlations and propose the beam-delay domain sparse representation of spatial-frequency domain channels. The independent and non-identically distributed Bernoulli-Gaussian models with unknown prior hyperparameters are employed to capture the sparsity in the beam-delay domain, posing a challenge for channel estimation. Under the constrained Bethe free energy minimization framework, we design different structures on the beliefs to develop hybrid message passing (HMP) algorithms, thus achieving efficient joint estimation of beam-delay domain channel and prior hyperparameters. To further improve the model accuracy, the multidimensional grid point perturbation (MDGPP)-based representation is presented, which assigns individual perturbation parameters to each multidimensional discrete grid. By treating the MDGPP parameters as unknown hyperparameters, we propose the two-stage HMP algorithm for MDGPP-based channel estimation, where the output of the initial estimation stage is pruned for the refinement stage to reduce computational complexity. Numerical simulations demonstrate the significant superiority of the proposed algorithms over benchmarks with both near-field and beam-squint effects.

* This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Soft Demodulator for Symbol-Level Precoding in Coded Multiuser MISO Systems

Oct 16, 2023

Yafei Wang, Hongwei Hou, Wenjin Wang, Xinping Yi, Shi Jin

Abstract: In this paper, we consider symbol-level precoding (SLP) in channel-coded multiuser multi-input single-output (MISO) systems. It is observed that the received SLP signals do not always follow a Gaussian distribution, rendering the conventional soft demodulation with the Gaussian assumption unsuitable for the coded SLP systems. This calls for novel soft demodulator designs for non-Gaussian distributed SLP signals with accurate log-likelihood ratio (LLR) calculation. To this end, we first investigate the non-Gaussian characteristics of both phase-shift keying (PSK) and quadrature amplitude modulation (QAM) received signals with existing SLP schemes and categorize the signals into two distinct types. The first type exhibits an approximate-Gaussian distribution with the outliers extending along the constructive interference region (CIR). In contrast, the second type follows a distribution that significantly deviates from the Gaussian distribution. To obtain accurate LLRs, we propose the modified Gaussian soft demodulator and Gaussian mixture model (GMM) soft demodulators to deal with the two types of signals, respectively. Subsequently, to further reduce the computational complexity and pilot overhead, we put forward a novel neural soft demodulator, named pilot feature extraction network (PFEN), leveraging the transformer mechanism in deep learning. Simulation results show that the proposed soft demodulators dramatically improve the throughput of existing SLPs for both PSK and QAM transmission in coded systems.
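
For context, the conventional soft demodulator that the abstract argues is unsuitable for SLP signals computes bit LLRs under a Gaussian noise assumption. The sketch below shows that baseline only, not the modified Gaussian, GMM, or PFEN demodulators from the paper. The function names, the bit-labelling convention (LLR positive when bit 0 is more likely), and the circularly symmetric complex noise model with total variance `noise_var` are assumptions.

```python
import numpy as np

def logsumexp(a):
    """Stable log(sum(exp(a))) for a 1-D array."""
    m = a.max()
    return float(m + np.log(np.exp(a - m).sum()))

def gaussian_llr(y, points, bits, noise_var):
    """Exact per-bit LLRs for one received sample under Gaussian noise.

    y         : complex received sample
    points    : complex constellation points, shape (M,)
    bits      : 0/1 labels of each point, shape (M, bits_per_symbol)
    noise_var : variance of the complex Gaussian noise

    Returns log p(y | bit = 0) - log p(y | bit = 1) for each bit position,
    assuming equiprobable symbols.
    """
    metric = -np.abs(y - points) ** 2 / noise_var
    llrs = []
    for b in range(bits.shape[1]):
        llrs.append(logsumexp(metric[bits[:, b] == 0]) - logsumexp(metric[bits[:, b] == 1]))
    return np.array(llrs)
```

For BPSK with points +1 and -1 this reduces to 4 Re(y) / noise_var, which makes clear how strongly the LLR depends on the assumed noise distribution; when the received SLP samples are not Gaussian around the constellation points, these values no longer reflect the true likelihoods.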
