Large language models (LLMs) have been widely studied for their ability to store and utilize positive knowledge. However, negative knowledge, such as "lions don't live in the ocean", is also ubiquitous in the world but rarely mentioned explicitly in text. What do LLMs know about negative knowledge? This work examines the ability of LLMs to handle negative commonsense knowledge. We design a constrained keywords-to-sentence generation task (CG) and a Boolean question-answering task (QA) to probe LLMs. Our experiments reveal that LLMs frequently fail to generate valid sentences grounded in negative commonsense knowledge, yet they can correctly answer polar yes-or-no questions. We term this phenomenon the belief conflict of LLMs. Our further analysis shows that statistical shortcuts and negation reporting bias from language modeling pre-training cause this conflict.
Conventional machine learning (ML) and deep learning approaches need to share customers' sensitive information with an external credit bureau to build a prediction model, which opens the door to privacy leakage. This leakage risk poses an enormous challenge to cooperation among financial companies. Federated learning is a machine learning setting that can protect data privacy, but high communication cost is often the bottleneck of federated systems, especially for large neural networks. Limiting the number and size of communications is necessary for the practical training of large neural structures. Gradient sparsification, which transmits only significant gradients and accumulates insignificant gradients locally, has received increasing attention as a method to reduce communication cost. However, the secure aggregation framework cannot directly use gradient sparsification. This article proposes two sparsification methods to reduce communication cost in federated learning. The first is a time-varying hierarchical sparsification method for model parameter updates, which addresses the problem of maintaining model accuracy under high-ratio sparsification and can significantly reduce the cost of a single communication. The second applies sparsification to the secure aggregation framework: we sparsify the encryption mask matrix to reduce communication cost while protecting privacy. Experiments show that, under different non-IID settings, our method reduces the upload communication cost to about 2.9% to 18.9% of that of the conventional federated learning algorithm when the sparse rate is 0.01.
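The idea of sending only significant gradients while accumulating the rest locally can be illustrated with a minimal sketch. This is generic top-k sparsification with a local residual, not the article's time-varying hierarchical scheme or its sparsified secure-aggregation mask; the function name `sparsify_with_residual` and its parameters are assumptions made for illustration.

```python
import numpy as np

def sparsify_with_residual(grad, residual, rate=0.01):
    """Keep the largest-magnitude entries of (grad + residual); hold back the rest.

    Hypothetical helper: `rate` is the fraction of entries that is transmitted.
    """
    acc = grad + residual                       # add what was held back earlier
    k = max(1, int(np.ceil(rate * acc.size)))   # number of entries to transmit
    flat = np.abs(acc).ravel()
    top = np.argpartition(flat, -k)[-k:]        # indices of the k largest magnitudes
    mask = np.zeros(acc.size, dtype=bool)
    mask[top] = True
    mask = mask.reshape(acc.shape)
    sparse_update = np.where(mask, acc, 0.0)    # sent to the server
    new_residual = np.where(mask, 0.0, acc)     # accumulated locally for next round
    return sparse_update, new_residual
```

A client would call this once per round, carrying `new_residual` forward, so that small gradients are delayed rather than discarded.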
The terahertz (THz) band is expected to be one of the key enabling technologies of sixth-generation (6G) wireless networks because of its abundant available bandwidth and very narrow beamwidth. Due to high-frequency operation, electrically large array apertures are employed, and the signal wavefront becomes spherical in the near-field. Therefore, a near-field signal model should be considered for channel acquisition in THz systems. Unlike prior works, which mostly ignore the impact of near-field beam-split (NB) and consider either a narrowband scenario or far-field models, this paper introduces both a model-based and a model-free technique for wideband THz channel estimation in the presence of NB. The model-based approach is based on the orthogonal matching pursuit (OMP) algorithm, for which we design an NB-aware dictionary. The key idea is to exploit the angular and range deviations due to NB. We then employ the OMP algorithm, which accounts for these deviations and thereby mitigates the effect of NB. We further introduce a federated learning (FL)-based approach as a model-free solution for channel estimation in a multi-user scenario to reduce complexity and training overhead. Through numerical simulations, we demonstrate the effectiveness of the proposed channel estimation techniques for wideband THz systems in comparison with existing state-of-the-art techniques.
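For readers unfamiliar with OMP, the following is a minimal sketch of the generic greedy algorithm only. The NB-aware dictionary design is the paper's contribution and is not reproduced here; the dictionary `D`, the function name `omp`, and the sparsity level `n_nonzero` are placeholders assumed for illustration.

```python
import numpy as np

def omp(D, y, n_nonzero):
    """Greedy OMP: approximate y by D @ x with at most n_nonzero nonzeros in x."""
    residual = y.copy()
    support = []
    x = np.zeros(D.shape[1], dtype=complex)
    coef = np.zeros(0, dtype=complex)
    for _ in range(n_nonzero):
        # Select the atom most correlated with the current residual.
        corr = np.abs(D.conj().T @ residual)
        if support:
            corr[support] = 0.0                 # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        # Re-fit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x[support] = coef
    return x, support
```

In a channel-estimation setting, each column of `D` would correspond to one candidate path parameter (for example an angle and range pair), and the recovered support would indicate which candidates are active.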
There is an increasing demand for interpretation of model predictions, especially in high-risk applications. Various visualization approaches have been proposed to estimate the part of the input that is relevant to a specific model prediction. However, most approaches require details of the model structure and parameters to obtain the visualization results, and in general much effort is required to adapt each approach to multiple types of tasks, particularly when the model backbone and input format change across tasks. In this study, a simple yet effective visualization framework called PAMI is proposed, based on the observation that deep learning models often aggregate features from local regions for model predictions. The basic idea is to mask the majority of the input and use the corresponding model output as the relative contribution of the preserved input part to the original model prediction. Since only a set of model outputs is collected and aggregated for each input, PAMI does not require any model details and can be applied to various prediction tasks with different model backbones and input formats. Extensive experiments on multiple tasks confirm that the proposed method finds class-specific input regions more precisely than existing visualization approaches, including when applied to different model backbones and input formats. The source code will be released publicly.
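The masking idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the PAMI implementation: the abstract does not specify how preserved regions are chosen or how outputs are aggregated, so the square patches, the constant `baseline` fill, and the names `masked_contribution_map` and `model_fn` are all hypothetical. The model is treated as a black box that returns a scalar score for the class of interest.

```python
import numpy as np

def masked_contribution_map(model_fn, x, patch=4, baseline=0.0):
    """Score each patch by the model output when only that patch is preserved.

    model_fn: hypothetical black-box callable mapping a 2-D array to a scalar score.
    """
    h, w = x.shape
    heat = np.zeros((h, w))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            masked = np.full_like(x, baseline)                       # mask everything
            masked[i:i + patch, j:j + patch] = x[i:i + patch, j:j + patch]  # keep one patch
            heat[i:i + patch, j:j + patch] = model_fn(masked)        # output as contribution
    return heat
```

Because only `model_fn` outputs are used, the same loop would work for any backbone that exposes a prediction score, which is the property the abstract emphasizes.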
This correspondence investigates a reconfigurable intelligent surface (RIS)-assisted wireless communication system with security threats. The RIS is deployed to improve the secrecy outage probability (SOP) of the data sent to a legitimate user. By deriving the distributions of the received signal-to-noise ratios (SNRs) at the legitimate user and the eavesdropper, we formulate, in a closed-form expression, a tight bound for the SOP under the constraint of discrete phase control at the RIS. The SOP is characterized as a function of the number of antenna elements, $N$, and the number of discrete phase choices, $2^b$. It is revealed that the performance loss in terms of SOP due to discrete phase control is negligible for large $N$ when $b\!\geq\!3$. In addition, we explicitly quantify this SOP loss when binary phase shifts with $b\!=\!1$ are utilized. It is identified that increasing the number of RIS antenna elements by a factor of $1.6$ achieves the same SOP with binary phase shifts as an RIS with ideal continuous phase shifts. Numerical simulations are conducted to verify the accuracy of these theoretical observations.
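A back-of-envelope check gives some intuition for the factor of $1.6$. It is not the correspondence's SOP derivation. It assumes that the residual phase error of a $b$-bit quantizer is uniform on $[-\pi/2^b, \pi/2^b]$, that each element's average coherent amplitude is therefore scaled by $\sin(\pi/2^b)/(\pi/2^b)$, and that the combined amplitude grows linearly with $N$ for large $N$. The function names are hypothetical.

```python
import math

def amplitude_factor(b):
    """Average coherent amplitude per element with b-bit phase quantization.

    Assumes the phase error is uniform on [-pi/2**b, pi/2**b].
    """
    delta = math.pi / 2 ** b
    return math.sin(delta) / delta

def element_scaling(b):
    """Factor by which N must grow to offset the amplitude loss (large-N assumption)."""
    return 1.0 / amplitude_factor(b)

for b in (1, 2, 3):
    print(b, round(amplitude_factor(b), 4), round(element_scaling(b), 4))
```

Under these assumptions the scaling for $b=1$ is $\pi/2 \approx 1.57$, in line with the reported factor of $1.6$, and the loss for $b=3$ is under 3% in amplitude, consistent with it being described as negligible.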
Cloth-changing person re-identification (CC-ReID), which aims to match person identities under clothing changes, has emerged as a research topic in recent years. However, typical biometrics-based CC-ReID methods often require cumbersome pose or body part estimators to learn cloth-irrelevant features from human biometric traits, which incurs high computational costs. In addition, performance is significantly limited by the resolution degradation of surveillance images. To address these limitations, we propose an effective Identity-Sensitive Knowledge Propagation framework (DeSKPro) for CC-ReID. Specifically, a Cloth-irrelevant Spatial Attention module is introduced to eliminate the distraction of clothing appearance by acquiring knowledge from the human parsing module. To mitigate the resolution degradation issue and mine identity-sensitive cues from human faces, we propose to restore the missing facial details using prior facial knowledge, which is then propagated to a smaller network. After training, the extra computations for human parsing or face restoration are no longer required. Extensive experiments show that our framework outperforms state-of-the-art methods by a large margin. Our code is available at https://github.com/KimbingNg/DeskPro.
Owing to its ultra-wide bandwidth and pencil beamforming, the terahertz (THz) band has been envisioned as one of the key enabling technologies for sixth-generation networks. However, the acquisition of the THz channel entails several unique challenges, such as severe path loss and beam-split. Prior works usually employ ultra-massive arrays and additional hardware components composed of time-delayers to compensate for these losses. In order to provide a cost-effective solution, this paper introduces a sparse-Bayesian-learning (SBL) technique for joint channel and beam-split estimation. Specifically, we first model the beam-split as an array perturbation, inspired by array signal processing. Next, a low-complexity approach is developed by exploiting the line-of-sight-dominant feature of the THz channel to reduce the computational complexity of the proposed SBL technique for channel estimation (SBCE). Additionally, based on federated learning, we implement a model-free counterpart to the proposed model-based SBCE solution. Furthermore, we examine the near-field considerations of the THz channel and introduce the range-dependent near-field beam-split. The theoretical performance bounds, i.e., Cram\'er-Rao lower bounds, are derived for near- and far-field parameters, e.g., user directions, ranges, and beam-split, and several numerical experiments are conducted. Numerical simulations demonstrate that SBCE outperforms the existing approaches and exhibits lower hardware cost.
This paper addresses two major challenges in terahertz (THz) channel estimation: the beam-split phenomenon, i.e., beam misalignment caused by frequency-independent analog beamformers, and the computational complexity caused by the ultra-massive number of antennas used to compensate for propagation losses. Data-driven techniques are known to mitigate the complexity of this problem but usually require the transmission of datasets from the users to a central server, entailing huge communication overhead. In this work, we employ federated learning (FL), wherein the users transmit only the model parameters instead of the whole dataset, for THz channel estimation to improve communication efficiency. In order to accurately estimate the channel despite beam-split, we propose a beamspace support alignment technique that requires no additional hardware. Compared to previous works, our method provides higher channel estimation accuracy as well as approximately $68$ times lower communication overhead.
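The beam-split effect described above can be reproduced numerically with a small sketch. It assumes a far-field uniform linear array with half-wavelength spacing at the carrier and is not the paper's beamspace support alignment technique; the function names, array size, and frequencies are hypothetical values chosen for illustration.

```python
import numpy as np

def steering(n_ant, sin_theta, f, f_c):
    """ULA steering vector at frequency f, with half-wavelength spacing at carrier f_c."""
    n = np.arange(n_ant)
    return np.exp(-1j * np.pi * (f / f_c) * n * sin_theta) / np.sqrt(n_ant)

def beam_peak(n_ant, sin_target, f, f_c, grid=2001):
    """Sine of the direction where carrier-designed weights peak at frequency f."""
    w = steering(n_ant, sin_target, f_c, f_c)   # frequency-independent analog weights
    candidates = np.linspace(-1.0, 1.0, grid)
    gains = [abs(np.vdot(w, steering(n_ant, s, f, f_c))) for s in candidates]
    return candidates[int(np.argmax(gains))]

f_c = 300e9                                     # assumed carrier frequency
for f in (285e9, 300e9, 315e9):                 # assumed subcarrier frequencies
    print(f / 1e9, beam_peak(64, 0.5, f, f_c))
```

In this model the beam peak at frequency $f$ moves to $\sin\theta = (f_c/f)\sin\theta_0$, so subcarriers away from the carrier look in slightly different directions even though the weights never change.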
Analog/mixed-signal circuit design is one of the most complex and time-consuming stages in the whole chip design process. Due to various process, voltage, and temperature (PVT) variations from chip manufacturing, analog circuits inevitably suffer from performance degradation. Although there has been plenty of work on automating analog circuit design under the typical condition, limited research has been done on exploring robust designs under real and unpredictable silicon variations. Automatic analog design against variations requires prohibitive computation and time costs. To address this challenge, we present RobustAnalog, a robust circuit design framework that incorporates variation information into the optimization process. Specifically, circuit optimizations under different variations are treated as a set of tasks. Similarities among tasks are leveraged and competition among them is alleviated to realize sample-efficient multi-task training. Moreover, RobustAnalog prunes the task space according to the current performance in each iteration, leading to a further reduction in simulation cost. In this way, RobustAnalog can rapidly produce a set of circuit parameters that satisfies diverse constraints (e.g., gain, bandwidth, and noise) across variations. We compare RobustAnalog with Bayesian optimization, an evolutionary algorithm, and Deep Deterministic Policy Gradient (DDPG) and demonstrate that RobustAnalog can reduce the required optimization time by a factor of 14 to 30. Therefore, our study provides a feasible method to handle various real silicon conditions.