Abstract:Identifying the effects of causes and the causes of effects is vital in virtually every scientific field. Often, however, the needed probabilities are not fully identifiable from the available data sources. This paper shows how partial identifiability is still possible for several probabilities of causation. We term this epsilon-identifiability and demonstrate its usefulness in cases where the behavior of certain subpopulations can be restricted to narrow bounds. In particular, we show how unidentifiable causal effects and counterfactual probabilities can be narrowly bounded when such allowances are made. Such allowances are often easy to measure and reasonable to assume. Finally, epsilon-identifiability is applied to the unit selection problem.
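A worked illustration of the idea (our own formulation, with the epsilon-constraint chosen for concreteness rather than taken from the paper): with experimental data alone, the probability of necessity and sufficiency (PNS) is only bounded by
\[
\max\{0,\; P(y_x)-P(y_{x'})\} \;\le\; \mathrm{PNS} \;\le\; \min\{P(y_x),\; P(y'_{x'})\}.
\]
Since \(P(y_x)-P(y_{x'}) = \mathrm{PNS} - P(y'_x, y_{x'})\), restricting the "defier" subpopulation to \(P(y'_x, y_{x'}) \le \epsilon\) immediately narrows PNS to an interval of width \(\epsilon\):
\[
P(y_x)-P(y_{x'}) \;\le\; \mathrm{PNS} \;\le\; P(y_x)-P(y_{x'})+\epsilon.
\]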
Abstract:This paper tackles the issues of unavailable or insufficient clinical ultrasound (US) data and meaningful annotations, to enable bone segmentation and registration for US-guided spinal surgery. Although US is not a standard paradigm for spinal surgery, the scarcity of intra-operative clinical US data is an insurmountable bottleneck for training a neural network. Moreover, due to the characteristics of US imaging, it is difficult to annotate bone surfaces clearly, which causes the trained neural network to lose attention to fine details. Hence, we propose an in silico bone US simulation framework that synthesizes realistic US images from diagnostic CT volumes. Afterward, using these simulated bone US images, we train a lightweight vision transformer model that achieves accurate, on-the-fly bone segmentation for spinal sonography. In the validation experiments, realistic US simulation was derived from diagnostic spinal CT volumes to facilitate a radiation-free US-guided pedicle screw placement procedure. When employed for training the bone segmentation task, the Chamfer distance achieves 0.599 mm; when applied to CT-US registration, the associated bone segmentation accuracy achieves 0.93 in Dice, and the registration accuracy based on the segmented point cloud is 0.13 to 3.37 mm in a complication-free manner. Because bone US images exhibit strong echoes at medium interfaces, a model that relies only on small-neighborhood information may be unable to distinguish thin interfaces from bone surfaces. To overcome this shortcoming, we propose a Long-range Contrast Learning Module that fully explores the long-range contrast between candidate pixels and their surroundings.
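A minimal sketch of the two evaluation metrics quoted above (Dice overlap for segmentation masks and symmetric Chamfer distance for point clouds); array shapes and units are assumptions, not taken from the paper:

import numpy as np
from scipy.spatial import cKDTree

def dice(pred, gt):
    # Dice coefficient between two binary segmentation masks.
    pred, gt = pred.astype(bool), gt.astype(bool)
    return 2.0 * np.logical_and(pred, gt).sum() / (pred.sum() + gt.sum())

def chamfer(pts_a, pts_b):
    # Symmetric Chamfer distance between (N,3) and (M,3) point clouds,
    # in the same units as the inputs (mm if the points are in mm).
    d_ab = cKDTree(pts_b).query(pts_a)[0]  # nearest-neighbor distances a -> b
    d_ba = cKDTree(pts_a).query(pts_b)[0]  # nearest-neighbor distances b -> a
    return 0.5 * (d_ab.mean() + d_ba.mean())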
Abstract:Symbol-level precoding (SLP) manipulates the transmitted signals to accurately exploit multi-user interference (MUI) in the multi-user downlink. This ensures that all resulting interference contributes to correct detection, the so-called constructive interference (CI). Its performance superiority comes at the cost of solving a nonlinear optimization problem on a symbol-by-symbol basis, whose complexity becomes prohibitive in realistic wireless communication systems. In this paper, we investigate low-complexity SLP algorithms for both phase-shift keying (PSK) and quadrature amplitude modulation (QAM). Specifically, we first prove that the max-min SINR balancing (SB) SLP problem for PSK signaling is not separable, in contrast to the power minimization (PM) SLP problem, and accordingly existing decomposition methods are not applicable. Next, we establish an explicit duality between the PM-SLP and SB-SLP problems for PSK modulation. The proposed duality facilitates obtaining the solution to the SB-SLP given the solution to the PM-SLP, without the need for a one-dimensional search, and vice versa. We then propose a closed-form power scaling algorithm that solves the SB-SLP via the PM-SLP, taking advantage of the separability of the PM-SLP. For QAM modulation, we convert the PM-SLP problem into a separable equivalent optimization problem and decompose the new problem into several simple parallel subproblems with closed-form solutions, leveraging the proximal Jacobian alternating direction method of multipliers (PJ-ADMM). We further prove that the proposed duality generalizes to the multi-level modulation case, based on which a power scaling parallel inverse-free algorithm is also proposed to solve the SB-SLP for QAM signaling. Numerical results show that the proposed algorithms offer optimal performance with lower complexity than the state of the art.
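In schematic form (our notation, not necessarily the paper's), the two designs differ only in where the margin appears, which is what makes the duality and the power scaling natural: with \(\mathcal{D}_k(\cdot)\) the CI region of user \(k\)'s PSK symbol,
\[
\text{PM-SLP:}\quad \min_{\mathbf{x}}\ \|\mathbf{x}\|_2^2 \quad \text{s.t.}\quad \mathbf{h}_k^{T}\mathbf{x} \in \mathcal{D}_k\big(\sigma\sqrt{\gamma_k}\big),\ \forall k,
\]
\[
\text{SB-SLP:}\quad \max_{\mathbf{x},\,t}\ t \quad \text{s.t.}\quad \mathbf{h}_k^{T}\mathbf{x} \in \mathcal{D}_k\big(\sigma\sqrt{t}\big),\ \forall k,\quad \|\mathbf{x}\|_2^2 \le P.
\]
Intuitively, solving the PM-SLP at the SINR targets and then rescaling the transmit power to meet the budget \(P\) yields the SB-SLP solution, which is what the closed-form power scaling above exploits.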
Abstract:Molecular property calculations are the bedrock of chemical physics. High-fidelity \textit{ab initio} modeling techniques for computing molecular properties can be prohibitively expensive, which motivates the development of machine-learning models that make the same predictions more efficiently. Training graph neural networks over large molecular databases introduces unique computational challenges, such as the need to process millions of small graphs of variable size and to support communication patterns distinct from those of learning over large graphs such as social networks. This paper demonstrates a novel hardware-software co-design approach to scale up the training of graph neural networks for molecular property prediction. We introduce an algorithm to coalesce batches of molecular graphs into fixed-size packs, eliminating the redundant computation and memory associated with alternative padding techniques and improving throughput by minimizing communication. We demonstrate the effectiveness of our co-design approach by providing an implementation of a well-established molecular property prediction model on Graphcore Intelligence Processing Units (IPUs). We evaluate the training performance on multiple molecular graph databases with varying graph counts, sizes, and sparsity. We demonstrate that such a co-design approach can reduce the training time of molecular property prediction models from days to less than two hours, opening new possibilities for AI-driven scientific discovery.
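The packing step can be sketched as classic first-fit-decreasing bin packing over node counts (a generic sketch under stated assumptions; the paper's exact packing policy and capacity definition may differ):

def pack_graphs(node_counts, capacity):
    # Greedily coalesce graphs into fixed-capacity packs (first-fit-decreasing):
    # place each graph, largest first, into the first pack with room for it.
    packs, loads = [], []
    for i in sorted(range(len(node_counts)), key=lambda i: -node_counts[i]):
        n = node_counts[i]
        for p in range(len(packs)):
            if loads[p] + n <= capacity:
                packs[p].append(i)
                loads[p] += n
                break
        else:
            packs.append([i])  # no pack fits: open a new one
            loads.append(n)
    return packs  # lists of graph indices; each pack is padded once, to `capacity`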
Abstract:Discriminative unsupervised learning methods such as contrastive learning have demonstrated the ability to learn generalized visual representations on centralized data. It is nonetheless challenging to adapt such methods to a distributed system with unlabeled, private, and heterogeneous client data arising from differing user styles and preferences. Federated learning enables multiple clients to collectively learn a global model without breaching the privacy of local clients. Another direction of federated learning studies personalized methods that address local heterogeneity. However, work addressing both generalization and personalization without labels in a decentralized setting remains scarce. In this work, we propose a novel method, FedStyle, to learn a more generalized global model by infusing local style information with local content information for contrastive learning, and to learn more personalized local models by incorporating local style information for downstream tasks. The style information is extracted by contrasting original local data with strongly augmented local data (Sobel-filtered images). Through extensive experiments with linear evaluations in both IID and non-IID settings, we demonstrate that FedStyle outperforms both generalization and personalization baselines in a stylized decentralized setting. Through comprehensive ablations, we demonstrate that our design of style infusion and stylized personalization improves performance significantly.
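A minimal sketch of the strong augmentation used for style extraction (Sobel filtering of a grayscale image; the normalization is our assumption):

import numpy as np
from scipy import ndimage

def sobel_view(img):
    # Edge-only view of a 2-D grayscale image: preserves content (shapes, edges)
    # while stripping local style such as intensity and texture statistics.
    gx = ndimage.sobel(img, axis=0)
    gy = ndimage.sobel(img, axis=1)
    mag = np.hypot(gx, gy)           # gradient magnitude
    return mag / (mag.max() + 1e-8)  # normalize to [0, 1]

Contrasting an original image with its Sobel view is how the abstract describes isolating the style information that FedStyle then infuses into the global and personalized models.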
Abstract:Probabilities of causation play a crucial role in modern decision-making. Pearl defined three binary probabilities of causation: the probability of necessity and sufficiency (PNS), the probability of sufficiency (PS), and the probability of necessity (PN). These probabilities were then bounded by Tian and Pearl using a combination of experimental and observational data. However, observational data are not always available in practice; in such cases, Tian and Pearl's theorem provides valid but less informative bounds using pure experimental data. In this paper, we discuss the conditions under which observational data are worth considering to improve the quality of the bounds. More specifically, we define the expected improvement of the bounds by assuming the observational distributions are uniformly distributed on their feasible intervals. We further apply the proposed theorems to the unit selection problem defined by Li and Pearl.
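For reference, the two sets of PNS bounds being compared are Tian and Pearl's. With experimental data alone,
\[
\max\{0,\ P(y_x)-P(y_{x'})\} \;\le\; \mathrm{PNS} \;\le\; \min\{P(y_x),\ P(y'_{x'})\},
\]
while adding observational data contributes two further terms on each side:
\[
\max\left\{0,\ P(y_x)-P(y_{x'}),\ P(y)-P(y_{x'}),\ P(y_x)-P(y)\right\} \;\le\; \mathrm{PNS},
\]
\[
\mathrm{PNS} \;\le\; \min\left\{P(y_x),\ P(y'_{x'}),\ P(x,y)+P(x',y'),\ P(y_x)-P(y_{x'})+P(x,y')+P(x',y)\right\}.
\]
The expected improvement studied here is the expected narrowing of this interval when the observational terms are treated as uniform over their feasible ranges.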
Abstract:This paper deals with the problem of learning the probabilities of causation for subpopulations given finite population data. The tight bounds of three basic probabilities of causation, the probability of necessity and sufficiency (PNS), the probability of sufficiency (PS), and the probability of necessity (PN), were derived by Tian and Pearl. However, obtaining the bounds for each subpopulation requires the experimental and observational distributions of that subpopulation, which are usually impractical to estimate from finite population data. We propose a machine learning model that learns the bounds of the probabilities of causation for subpopulations given finite population data. We further show, through a simulation study, that the model is able to learn the bounds of PNS for 32768 subpopulations while knowing only roughly 500 of them from the finite population data.
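A minimal sketch of the learning setup under stated assumptions: 15 binary characteristics (so 2^15 = 32768 subpopulation cells), Tian-Pearl bounds computable for roughly 500 of them, and a multi-output regressor interpolating the rest. All names and the choice of model are hypothetical, not the paper's:

import numpy as np
from itertools import product
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X_known = rng.integers(0, 2, size=(500, 15))  # cells whose bounds are estimable
y_known = rng.random((500, 2))                # placeholder (lower, upper) PNS bounds
y_known.sort(axis=1)                          # ensure lower <= upper

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_known, y_known)                   # multi-output regression on the bounds

all_cells = np.array(list(product([0, 1], repeat=15)))  # all 32768 subpopulations
pred_bounds = model.predict(all_cells)                  # shape (32768, 2)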
Abstract:The unit selection problem is to identify a group of individuals who are most likely to exhibit a desired mode of behavior, for example, selecting individuals who would respond one way if incentivized and a different way if not. The unit selection problem consists of evaluation and search subproblems. Li and Pearl defined the "benefit function" to evaluate the average payoff of selecting an individual with given characteristics. The search subproblem is then to design an algorithm that identifies the characteristics maximizing this benefit function. The hardness of the search subproblem arises from the large number of characteristics available for each individual and the sparsity of the data in each cell of characteristics. In this paper, we present a machine learning framework that uses the bounds of the benefit function, which are estimable from finite population data, to learn the bounds of the benefit function for each cell of characteristics. We can thereby readily obtain the characteristics that maximize the benefit function.
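For context, Li and Pearl's benefit function weights the four response types, compliers, always-takers, never-takers, and defiers, with payoffs \(\beta, \gamma, \theta, \delta\); the evaluation subproblem scores a cell of characteristics \(c\) as
\[
f(c) \;=\; \beta\,P(y_x, y'_{x'} \mid c) \;+\; \gamma\,P(y_x, y_{x'} \mid c) \;+\; \theta\,P(y'_x, y'_{x'} \mid c) \;+\; \delta\,P(y'_x, y_{x'} \mid c),
\]
and the search subproblem is \(\operatorname{argmax}_c f(c)\). Since the four counterfactual probabilities are generally only boundable rather than identifiable, the framework above learns bounds on \(f(c)\) for every cell.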
Abstract:Non-orthogonal multiple access (NOMA) is a powerful transmission technique that enhances the spectral efficiency of communication links, and is being investigated for 5G standards and beyond. A major drawback of NOMA is the need to apply successive interference cancellation (SIC) at the receiver on a symbol-by-symbol basis, which limits its practicality. To circumvent this, in this paper a novel constructive multiple access (CoMA) scheme is proposed and investigated. CoMA aligns the signals superimposed on different users constructively with the signal of interest. Since the superimposed signal aligns with the data signal, there is no need to remove it at the receiver using SIC, so the SIC component can be removed from the receiver side. To provide a comprehensive investigation and comparison, different optimization problems for user-pairing NOMA multiple-input single-output (MISO) systems are considered. First, an optimal precoder that minimizes the total transmission power for CoMA subject to a quality-of-service constraint is obtained and compared to conventional NOMA. Then, a precoder that minimizes the CoMA symbol error rate (SER) subject to a power constraint is investigated. Furthermore, the computational complexity of CoMA is considered and compared with the conventional NOMA scheme in terms of the total number of complex operations. The results in this paper prove the superiority of the proposed CoMA scheme over the conventional NOMA technique, and demonstrate that CoMA is an attractive solution for user-pairing NOMA MISO systems with a small number of BS antennas, while circumventing the receive SIC complexity.
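In schematic form (our notation, assumed rather than taken from the paper), the first design problem is a QoS-constrained power minimization in which the superimposed signal must fall in the constructive region of each user's symbol, so no SIC is needed:
\[
\min_{\mathbf{w}_1,\mathbf{w}_2}\ \|\mathbf{w}_1\|^2 + \|\mathbf{w}_2\|^2
\quad \text{s.t.}\quad
\mathbf{h}_k^{H}\big(\mathbf{w}_1 s_1 + \mathbf{w}_2 s_2\big) \in \mathcal{C}_k(s_k, \gamma_k),\ \ k = 1, 2,
\]
where \(\mathcal{C}_k(s_k,\gamma_k)\) denotes the constructive-interference region that guarantees user \(k\)'s quality-of-service target \(\gamma_k\).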
Abstract:Technology videos contain rich multi-modal information. In cross-modal information search, the data features of different modalities cannot be compared directly, so the semantic gap between modalities is a key problem to be solved. To address this problem, this paper proposes a novel Feature Fusion based Adversarial Cross-modal Retrieval method (FFACR) to achieve text-to-video matching, ranking, and searching. The proposed method uses the framework of adversarial learning, with a video multi-modal feature fusion network and a feature mapping network as the generator and a modality discrimination network as the discriminator. Multi-modal features of videos are obtained by the feature fusion network. The feature mapping network projects multi-modal features into the same semantic space based on semantics and similarity. The modality discrimination network determines the original modality of the features. The generator and discriminator are trained alternately, so that the features produced by the mapping network are semantically consistent with the original data while modality-specific cues are eliminated; finally, similarity in the semantic space is used to rank and obtain the search results. Experimental results demonstrate that the proposed method performs better in text-to-video search than existing methods, and validate its effectiveness on a self-built dataset of technology videos.
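A minimal sketch of the alternating adversarial training described above, with all dimensions and loss weights hypothetical (the generator is the fusion-plus-mapping path; the discriminator predicts the source modality):

import torch
import torch.nn as nn

D = 256  # assumed shared semantic space dimension
map_video = nn.Sequential(nn.Linear(1024, D), nn.ReLU(), nn.Linear(D, D))  # after fusion
map_text  = nn.Sequential(nn.Linear(768, D), nn.ReLU(), nn.Linear(D, D))
disc      = nn.Sequential(nn.Linear(D, 64), nn.ReLU(), nn.Linear(64, 1))   # 1=video, 0=text

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(list(map_video.parameters()) + list(map_text.parameters()), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)

def train_step(video_feat, text_feat):  # matched (video, text) pairs
    v, t = map_video(video_feat), map_text(text_feat)
    # Discriminator step: learn to identify the original modality of each embedding.
    d_loss = bce(disc(v.detach()), torch.ones(len(v), 1)) + \
             bce(disc(t.detach()), torch.zeros(len(t), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator step: fool the discriminator (erase modality cues) while keeping
    # matched pairs close in the shared space (a stand-in for the semantic loss).
    g_loss = bce(disc(v), torch.zeros(len(v), 1)) + bce(disc(t), torch.ones(len(t), 1)) \
             + nn.functional.mse_loss(v, t)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()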