Two-timescale stochastic approximation (TTSA) is among the most general frameworks for iterative stochastic algorithms. This includes well-known stochastic optimization methods such as SGD variants and those designed for bilevel or minimax problems, as well as reinforcement learning like the family of gradient-based temporal difference (GTD) algorithms. In this paper, we conduct an in-depth asymptotic analysis of TTSA under controlled Markovian noise via central limit theorem (CLT), uncovering the coupled dynamics of TTSA influenced by the underlying Markov chain, which has not been addressed by previous CLT results of TTSA only with Martingale difference noise. Building upon our CLT, we expand its application horizon of efficient sampling strategies from vanilla SGD to a wider TTSA context in distributed learning, thus broadening the scope of Hu et al. (2022). In addition, we leverage our CLT result to deduce the statistical properties of GTD algorithms with nonlinear function approximation using Markovian samples and show their identical asymptotic performance, a perspective not evident from current finite-time bounds.
Graph contrastive learning is a general learning paradigm excelling at capturing invariant information from diverse perturbations in graphs. Recent works focus on exploring the structural rationale from graphs, thereby increasing the discriminability of the invariant information. However, such methods may incur in the mis-learning of graph models towards the interpretability of graphs, and thus the learned noisy and task-agnostic information interferes with the prediction of graphs. To this end, with the purpose of exploring the intrinsic rationale of graphs, we accordingly propose to capture the dimensional rationale from graphs, which has not received sufficient attention in the literature. The conducted exploratory experiments attest to the feasibility of the aforementioned roadmap. To elucidate the innate mechanism behind the performance improvement arising from the dimensional rationale, we rethink the dimensional rationale in graph contrastive learning from a causal perspective and further formalize the causality among the variables in the pre-training stage to build the corresponding structural causal model. On the basis of the understanding of the structural causal model, we propose the dimensional rationale-aware graph contrastive learning approach, which introduces a learnable dimensional rationale acquiring network and a redundancy reduction constraint. The learnable dimensional rationale acquiring network is updated by leveraging a bi-level meta-learning technique, and the redundancy reduction constraint disentangles the redundant features through a decorrelation process during learning. Empirically, compared with state-of-the-art methods, our method can yield significant performance boosts on various benchmarks with respect to discriminability and transferability. The code implementation of our method is available at https://github.com/ByronJi/DRGCL.
Post-training quantization (PTQ) has driven attention to producing efficient large language models (LLMs) with ultra-low costs. Since hand-craft quantization parameters lead to low performance in low-bit quantization, recent methods optimize the quantization parameters through block-wise reconstruction between the floating-point and quantized models. However, these methods suffer from two challenges: accumulated errors from independent one-by-one block quantization and reconstruction difficulties from extreme weight and activation outliers. To address these two challenges, we propose CBQ, a cross-block reconstruction-based PTQ method for LLMs. To reduce error accumulation, we introduce a cross-block dependency with the aid of a homologous reconstruction scheme to build the long-range dependency between adjacent multi-blocks with overlapping. To reduce reconstruction difficulty, we design a coarse-to-fine pre-processing (CFP) to truncate weight outliers and dynamically scale activation outliers before optimization, and an adaptive rounding scheme, called LoRA-Rounding, with two low-rank learnable matrixes to further rectify weight quantization errors. Extensive experiments demonstrate that: (1) CBQ pushes both activation and weight quantization to low-bit settings W4A4, W4A8, and W2A16. (2) CBQ achieves better performance than the existing state-of-the-art methods on various LLMs and benchmark datasets.
The misuse of AI imagery can have harmful societal effects, prompting the creation of detectors to combat issues like the spread of fake news. Existing methods can effectively detect images generated by seen generators, but it is challenging to detect those generated by unseen generators. They do not concentrate on amplifying the output discrepancy when detectors process real versus fake images. This results in a close output distribution of real and fake samples, increasing classification difficulty in detecting unseen generators. This paper addresses the unseen-generator detection problem by considering this task from the perspective of anomaly detection and proposes an adversarial teacher-student discrepancy-aware framework. Our method encourages smaller output discrepancies between the student and the teacher models for real images while aiming for larger discrepancies for fake images. We employ adversarial learning to train a feature augmenter, which promotes smaller discrepancies between teacher and student networks when the inputs are fake images. Our method has achieved state-of-the-art on public benchmarks, and the visualization results show that a large output discrepancy is maintained when faced with various types of generators.
Intelligent Reflecting Surface (IRS) utilizes low-cost, passive reflecting elements to enhance the passive beam gain, improve Wireless Energy Transfer (WET) efficiency, and enable its deployment for numerous Internet of Things (IoT) devices. However, the increasing number of IRS elements presents considerable channel estimation challenges. This is due to the lack of active Radio Frequency (RF) chains in an IRS, while pilot overhead becomes intolerable. To address this issue, we propose a Channel State Information (CSI)-free scheme that maximizes received energy in a specific direction and covers the entire space through phased beam rotation. Furthermore, we take into account the impact of an imperfect IRS and meticulously design the active precoder and IRS reflecting phase shift to mitigate its effects. Our proposed technique does not alter the existing IRS hardware architecture, allowing for easy implementation in the current system, and enabling access or removal of any Energy Receivers (ERs) without additional cost. Numerical results illustrate the efficacy of our CSI-free scheme in facilitating large-scale IRS without compromising performance due to excessive pilot overhead. Furthermore, our scheme outperforms the CSI-based counterpart in scenarios involving large-scale ERs, making it a promising solution in the era of IoT.
Fluid antenna multiple access (FAMA) is capable of exploiting the high spatial diversity of wireless channels to mitigate multi-user interference via flexible port switching, which achieves a better performance than traditional multi-input-multi-output (MIMO) systems. Moreover, integrated data and energy transfer (IDET) is able to provide both the wireless data transfer (WDT) and wireless energy transfer (WET) services towards low-power devices. In this paper, a FAMA assisted IDET system is studied, where $N$ access points (APs) provide dedicated IDET services towards $N$ user equipments (UEs). Each UE is equipped with a single fluid antenna. The performance of WDT and WET , \textit{i.e.}, the WDT outage probability, the WET outage probability, the reliable throughput and the average energy harvesting amount, are analysed theoretically by using time switching (TS) between WDT and WET. Numerical results validate our theoretical analysis, which reveals that the number of UEs and TS ratio should be optimized to achieve a trade-off between the WDT and WET performance. Moreover, FAMA assisted IDET achieves a better performance in terms of both WDT and WET than traditional MIMO with the same antenna size.
Point cloud datasets often suffer from inadequate sample sizes in comparison to image datasets, making data augmentation challenging. While traditional methods, like rigid transformations and scaling, have limited potential in increasing dataset diversity due to their constraints on altering individual sample shapes, we introduce the Biharmonic Augmentation (BA) method. BA is a novel and efficient data augmentation technique that diversifies point cloud data by imposing smooth non-rigid deformations on existing 3D structures. This approach calculates biharmonic coordinates for the deformation function and learns diverse deformation prototypes. Utilizing a CoefNet, our method predicts coefficients to amalgamate these prototypes, ensuring comprehensive deformation. Moreover, we present AdvTune, an advanced online augmentation system that integrates adversarial training. This system synergistically refines the CoefNet and the classification network, facilitating the automated creation of adaptive shape deformations contingent on the learner status. Comprehensive experimental analysis validates the superiority of Biharmonic Augmentation, showcasing notable performance improvements over prevailing point cloud augmentation techniques across varied network designs.
Spatio-temporal prediction is crucial in numerous real-world applications, including traffic forecasting and crime prediction, which aim to improve public transportation and safety management. Many state-of-the-art models demonstrate the strong capability of spatio-temporal graph neural networks (STGNN) to capture complex spatio-temporal correlations. However, despite their effectiveness, existing approaches do not adequately address several key challenges. Data quality issues, such as data scarcity and sparsity, lead to data noise and a lack of supervised signals, which significantly limit the performance of STGNN. Although recent STGNN models with contrastive learning aim to address these challenges, most of them use pre-defined augmentation strategies that heavily depend on manual design and cannot be customized for different Spatio-Temporal Graph (STG) scenarios. To tackle these challenges, we propose a new spatio-temporal contrastive learning (CL4ST) framework to encode robust and generalizable STG representations via the STG augmentation paradigm. Specifically, we design the meta view generator to automatically construct node and edge augmentation views for each disentangled spatial and temporal graph in a data-driven manner. The meta view generator employs meta networks with parameterized generative model to customize the augmentations for each input. This personalizes the augmentation strategies for every STG and endows the learning framework with spatio-temporal-aware information. Additionally, we integrate a unified spatio-temporal graph attention network with the proposed meta view generator and two-branch graph contrastive learning paradigms. Extensive experiments demonstrate that our CL4ST significantly improves performance over various state-of-the-art baselines in traffic and crime prediction.
In this paper, we propose a green beamforming design for the integrated sensing and communication (ISAC) system, using beam-matching error to assess radar performance. The beam-matching error metric, which considers the mean square error between the desired and designed beam patterns, provides a more practical evaluation approach. To tackle the non-convex challenge inherent in beamforming design, we apply semidefinite relaxation (SDR) to address the rank-one relaxation issue, followed by the iterative rank minimization algorithm (IRM) for rank-one recovery. The simulation results showcase the effectiveness of our proposed optimal beamforming design, emphasizing the exceptional performance of the radar component in sensing tasks.
This paper proposes a novel non-orthogonal multiple access (NOMA)-assisted orthogonal time-frequency space (OTFS)-integrated sensing and communication (ISAC) network, which uses unmanned aerial vehicles (UAVs) as air base stations to support multiple users. By employing ISAC, the UAV extracts position and velocity information from the user's echo signals, and non-orthogonal power allocation is conducted to achieve a superior achievable rate. A 3D motion prediction topology is used to guide the NOMA transmission for multiple users, and a robust power allocation solution is proposed under perfect and imperfect channel estimation for Maxi-min Fairness (MMF) and Maximum sum-Rate (SR) problems. Simulation results demonstrate the superiority of the proposed NOMA-assisted OTFS-ISAC system over other systems in terms of achievable rate under both perfect and imperfect channel conditions with the aid of 3D motion prediction topology.