Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jun Wang

IBM T. J. Watson Research Center

Enhancing the Rationale-Input Alignment for Self-explaining Rationalization

Dec 15, 2023

Wei Liu, Haozhao Wang, Jun Wang, Zhiying Deng, YuanKai Zhang, Cheng Wang, Ruixuan Li

Figure 1 for Enhancing the Rationale-Input Alignment for Self-explaining Rationalization

Figure 2 for Enhancing the Rationale-Input Alignment for Self-explaining Rationalization

Figure 3 for Enhancing the Rationale-Input Alignment for Self-explaining Rationalization

Figure 4 for Enhancing the Rationale-Input Alignment for Self-explaining Rationalization

Abstract:Rationalization empowers deep learning models with self-explaining capabilities through a cooperative game, where a generator selects a semantically consistent subset of the input as a rationale, and a subsequent predictor makes predictions based on the selected rationale. In this paper, we discover that rationalization is prone to a problem named \emph{rationale shift}, which arises from the algorithmic bias of the cooperative game. Rationale shift refers to a situation where the semantics of the selected rationale may deviate from the original input, but the predictor still produces accurate predictions based on the deviation, resulting in a compromised generator with misleading feedback. To address this issue, we first demonstrate the importance of the alignment between the rationale and the full input through both empirical observations and theoretical analysis. Subsequently, we introduce a novel approach called DAR (\textbf{D}iscriminatively \textbf{A}ligned \textbf{R}ationalization), which utilizes an auxiliary module pretrained on the full input to discriminatively align the selected rationale and the original input. We theoretically illustrate how DAR accomplishes the desired alignment, thereby overcoming the rationale shift problem. The experiments on two widely used real-world benchmarks show that the proposed method significantly improves the explanation quality (measured by the overlap between the model-selected explanation and the human-annotated rationale) as compared to state-of-the-art techniques. Additionally, results on two synthetic settings further validate the effectiveness of DAR in addressing the rationale shift problem.

* Accept at ICDE 2024

Via

Access Paper or Ask Questions

Multi-granularity Causal Structure Learning

Dec 12, 2023

Jiaxuan Liang, Jun Wang, Guoxian Yu, Shuyin Xia, Guoyin Wang

Figure 1 for Multi-granularity Causal Structure Learning

Figure 2 for Multi-granularity Causal Structure Learning

Figure 3 for Multi-granularity Causal Structure Learning

Figure 4 for Multi-granularity Causal Structure Learning

Abstract:Unveil, model, and comprehend the causal mechanisms underpinning natural phenomena stand as fundamental endeavors across myriad scientific disciplines. Meanwhile, new knowledge emerges when discovering causal relationships from data. Existing causal learning algorithms predominantly focus on the isolated effects of variables, overlook the intricate interplay of multiple variables and their collective behavioral patterns. Furthermore, the ubiquity of high-dimensional data exacts a substantial temporal cost for causal algorithms. In this paper, we develop a novel method called MgCSL (Multi-granularity Causal Structure Learning), which first leverages sparse auto-encoder to explore coarse-graining strategies and causal abstractions from micro-variables to macro-ones. MgCSL then takes multi-granularity variables as inputs to train multilayer perceptrons and to delve the causality between variables. To enhance the efficacy on high-dimensional data, MgCSL introduces a simplified acyclicity constraint to adeptly search the directed acyclic graph among variables. Experimental results show that MgCSL outperforms competitive baselines, and finds out explainable causal connections on fMRI datasets.

* Accepted by the Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI2024)

Via

Access Paper or Ask Questions

Supervised Contrastive Learning for Fine-grained Chromosome Recognition

Dec 12, 2023

Ruijia Chang, Suncheng Xiang, Chengyu Zhou, Kui Su, Dahong Qian, Jun Wang

Abstract:Chromosome recognition is an essential task in karyotyping, which plays a vital role in birth defect diagnosis and biomedical research. However, existing classification methods face significant challenges due to the inter-class similarity and intra-class variation of chromosomes. To address this issue, we propose a supervised contrastive learning strategy that is tailored to train model-agnostic deep networks for reliable chromosome classification. This method enables extracting fine-grained chromosomal embeddings in latent space. These embeddings effectively expand inter-class boundaries and reduce intra-class variations, enhancing their distinctiveness in predicting chromosome types. On top of two large-scale chromosome datasets, we comprehensively validate the power of our contrastive learning strategy in boosting cutting-edge deep networks such as Transformers and ResNets. Extensive results demonstrate that it can significantly improve models' generalization performance, with an accuracy improvement up to +4.5%. Codes and pretrained models will be released upon acceptance of this work.

Via

Access Paper or Ask Questions

Federated Causality Learning with Explainable Adaptive Optimization

Dec 09, 2023

Dezhi Yang, Xintong He, Jun Wang, Guoxian Yu, Carlotta Domeniconi, Jinglin Zhang

Figure 1 for Federated Causality Learning with Explainable Adaptive Optimization

Figure 2 for Federated Causality Learning with Explainable Adaptive Optimization

Figure 3 for Federated Causality Learning with Explainable Adaptive Optimization

Figure 4 for Federated Causality Learning with Explainable Adaptive Optimization

Abstract:Discovering the causality from observational data is a crucial task in various scientific domains. With increasing awareness of privacy, data are not allowed to be exposed, and it is very hard to learn causal graphs from dispersed data, since these data may have different distributions. In this paper, we propose a federated causal discovery strategy (FedCausal) to learn the unified global causal graph from decentralized heterogeneous data. We design a global optimization formula to naturally aggregate the causal graphs from client data and constrain the acyclicity of the global graph without exposing local data. Unlike other federated causal learning algorithms, FedCausal unifies the local and global optimizations into a complete directed acyclic graph (DAG) learning process with a flexible optimization objective. We prove that this optimization objective has a high interpretability and can adaptively handle homogeneous and heterogeneous data. Experimental results on synthetic and real datasets show that FedCausal can effectively deal with non-independently and identically distributed (non-iid) data and has a superior performance.

* Accepted by the Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI2024)

Via

Access Paper or Ask Questions

Multi-dimensional Fair Federated Learning

Dec 09, 2023

Cong Su, Guoxian Yu, Jun Wang, Hui Li, Qingzhong Li, Han Yu

Figure 1 for Multi-dimensional Fair Federated Learning

Figure 2 for Multi-dimensional Fair Federated Learning

Figure 3 for Multi-dimensional Fair Federated Learning

Figure 4 for Multi-dimensional Fair Federated Learning

Abstract:Federated learning (FL) has emerged as a promising collaborative and secure paradigm for training a model from decentralized data without compromising privacy. Group fairness and client fairness are two dimensions of fairness that are important for FL. Standard FL can result in disproportionate disadvantages for certain clients, and it still faces the challenge of treating different groups equitably in a population. The problem of privately training fair FL models without compromising the generalization capability of disadvantaged clients remains open. In this paper, we propose a method, called mFairFL, to address this problem and achieve group fairness and client fairness simultaneously. mFairFL leverages differential multipliers to construct an optimization objective for empirical risk minimization with fairness constraints. Before aggregating locally trained models, it first detects conflicts among their gradients, and then iteratively curates the direction and magnitude of gradients to mitigate these conflicts. Theoretical analysis proves mFairFL facilitates the fairness in model development. The experimental evaluations based on three benchmark datasets show significant advantages of mFairFL compared to seven state-of-the-art baselines.

* Accepted by the Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI2024)

Via

Access Paper or Ask Questions

Design and trajectory tracking control of CuRobot: A Cubic Reversible Robot

Nov 28, 2023

Kai Yang, Jiahui Wang, Yuchen Weng, Baolei Wu, Fuqiang Li, Jihong Zhu, Jun Wang

Figure 1 for Design and trajectory tracking control of CuRobot: A Cubic Reversible Robot

Figure 2 for Design and trajectory tracking control of CuRobot: A Cubic Reversible Robot

Figure 3 for Design and trajectory tracking control of CuRobot: A Cubic Reversible Robot

Figure 4 for Design and trajectory tracking control of CuRobot: A Cubic Reversible Robot

Abstract:In field environments, numerous robots necessitate manual intervention for restoration of functionality post a turnover, resulting in diminished operational efficiency. This study presents an innovative design solution for a reversible omnidirectional mobile robot denoted as CuRobot, featuring a cube structure, thereby facilitating uninterrupted omnidirectional movement even in the event of flipping. The incorporation of eight conical wheels at the cube vertices ensures consistent omnidirectional motion no matter which face of the cube contacts the ground. Additionally, a kinematic model is formulated for CuRobot, accompanied by the development of a trajectory tracking controller utilizing model predictive control. Through simulation experiments, the correlation between trajectory tracking accuracy and the robot's motion direction is examined. Furthermore, the robot's proficiency in omnidirectional mobility and sustained movement post-flipping is substantiated via both simulation and prototype experiments. This design reduces the inefficiencies associated with manual intervention, thereby increasing the operational robustness of robots in field environments.

Via

Access Paper or Ask Questions

Single-cell Multi-view Clustering via Community Detection with Unknown Number of Clusters

Nov 28, 2023

Dayu Hu, Zhibin Dong, Ke Liang, Jun Wang, Siwei Wang, Xinwang Liu

Figure 1 for Single-cell Multi-view Clustering via Community Detection with Unknown Number of Clusters

Figure 2 for Single-cell Multi-view Clustering via Community Detection with Unknown Number of Clusters

Figure 3 for Single-cell Multi-view Clustering via Community Detection with Unknown Number of Clusters

Figure 4 for Single-cell Multi-view Clustering via Community Detection with Unknown Number of Clusters

Abstract:Single-cell multi-view clustering enables the exploration of cellular heterogeneity within the same cell from different views. Despite the development of several multi-view clustering methods, two primary challenges persist. Firstly, most existing methods treat the information from both single-cell RNA (scRNA) and single-cell Assay of Transposase Accessible Chromatin (scATAC) views as equally significant, overlooking the substantial disparity in data richness between the two views. This oversight frequently leads to a degradation in overall performance. Additionally, the majority of clustering methods necessitate manual specification of the number of clusters by users. However, for biologists dealing with cell data, precisely determining the number of distinct cell types poses a formidable challenge. To this end, we introduce scUNC, an innovative multi-view clustering approach tailored for single-cell data, which seamlessly integrates information from different views without the need for a predefined number of clusters. The scUNC method comprises several steps: initially, it employs a cross-view fusion network to create an effective embedding, which is then utilized to generate initial clusters via community detection. Subsequently, the clusters are automatically merged and optimized until no further clusters can be merged. We conducted a comprehensive evaluation of scUNC using three distinct single-cell datasets. The results underscored that scUNC outperforms the other baseline methods.

Via

Access Paper or Ask Questions

Mission-driven Exploration for Accelerated Deep Reinforcement Learning with Temporal Logic Task Specifications

Nov 28, 2023

Jun Wang, Hosein Hasanbeig, Kaiyuan Tan, Zihe Sun, Yiannis Kantaros

Figure 1 for Mission-driven Exploration for Accelerated Deep Reinforcement Learning with Temporal Logic Task Specifications

Figure 2 for Mission-driven Exploration for Accelerated Deep Reinforcement Learning with Temporal Logic Task Specifications

Figure 3 for Mission-driven Exploration for Accelerated Deep Reinforcement Learning with Temporal Logic Task Specifications

Abstract:This paper addresses the problem of designing optimal control policies for mobile robots with mission and safety requirements specified using Linear Temporal Logic (LTL). We consider robots with unknown stochastic dynamics operating in environments with unknown geometric structure. The robots are equipped with sensors allowing them to detect obstacles. Our goal is to synthesize a control policy that maximizes the probability of satisfying an LTL-encoded task in the presence of motion and environmental uncertainty. Several deep reinforcement learning (DRL) algorithms have been proposed recently to address similar problems. A common limitation in related works is that of slow learning performance. In order to address this issue, we propose a novel DRL algorithm, which has the capability to learn control policies at a notably faster rate compared to similar methods. Its sample efficiency is due to a mission-driven exploration strategy that prioritizes exploration towards directions that may contribute to mission accomplishment. Identifying these directions relies on an automaton representation of the LTL task as well as a learned neural network that (partially) models the unknown system dynamics. We provide comparative experiments demonstrating the efficiency of our algorithm on robot navigation tasks in unknown environments.

Via

Access Paper or Ask Questions

Verified Compositional Neuro-Symbolic Control for Stochastic Systems with Temporal Logic Tasks

Nov 22, 2023

Jun Wang, Haojun Chen, Zihe Sun, Yiannis Kantaros

Abstract:Several methods have been proposed recently to learn neural network (NN) controllers for autonomous agents, with unknown and stochastic dynamics, tasked with complex missions captured by Linear Temporal Logic (LTL). Due to the sample-inefficiency of the majority of these works, compositional learning methods have been proposed decomposing the LTL specification into smaller sub-tasks. Then, separate controllers are learned and composed to satisfy the original task. A key challenge within these approaches is that they often lack safety guarantees or the provided guarantees are impractical. This paper aims to address this challenge. Particularly, we consider autonomous systems with unknown and stochastic dynamics and LTL-encoded tasks. We assume that the system is equipped with a finite set of base skills modeled by trained NN feedback controllers. Our goal is to check if there exists a temporal composition of the trained NN controllers - and if so, to compute it - that will yield a composite system behavior that satisfies the assigned LTL task with probability one. We propose a new approach that relies on a novel integration of automata theory and data-driven reachability analysis tools for NN-controlled stochastic systems. The resulting neuro-symbolic controller allows the agent to generate safe behaviors for unseen complex temporal logic tasks in a zero-shot fashion by leveraging its base skills. We show correctness of the proposed method and we provide conditions under which it is complete. To the best of our knowledge, this is the first work that designs verified temporal compositions of NN controllers for unknown and stochastic systems. Finally, we provide extensive numerical simulations and hardware experiments on robot navigation tasks to demonstrate the proposed method.

* arXiv admin note: substantial text overlap with arXiv:2209.06130

Via

Access Paper or Ask Questions

Channel Capacity and Bounds In Mixed Gaussian-Impulsive Noise

Nov 15, 2023

Tianfu Qi, Jun Wang, Qihang Peng, Xiaoping Li, Xiaonan Chen

Figure 1 for Channel Capacity and Bounds In Mixed Gaussian-Impulsive Noise

Figure 2 for Channel Capacity and Bounds In Mixed Gaussian-Impulsive Noise

Figure 3 for Channel Capacity and Bounds In Mixed Gaussian-Impulsive Noise

Figure 4 for Channel Capacity and Bounds In Mixed Gaussian-Impulsive Noise

Abstract:Communication systems suffer from the mixed noise consisting of both non-Gaussian impulsive noise (IN) and white Gaussian noise (WGN) in many practical applications. However, there is little literature about the channel capacity under mixed noise. In this paper, we prove the existence of the capacity under p-th moment constraint and show that there are only finite mass points in the capacity-achieving distribution. Moreover, we provide lower and upper capacity bounds with closed forms. It is shown that the lower bounds can degenerate to the well-known Shannon formula under special scenarios. In addition, the capacity for specific modulations and the corresponding lower bounds are discussed. Numerical results reveal that the capacity decreases when the impulsiveness of the mixed noise becomes dominant and the obtained capacity bounds are shown to be very tight.

Via

Access Paper or Ask Questions