
Yixin Wang


Imputing Brain Measurements Across Data Sets via Graph Neural Networks

Aug 19, 2023
Yixin Wang, Wei Peng, Susan F. Tapert, Qingyu Zhao, Kilian M. Pohl

Figures 1-4 for Imputing Brain Measurements Across Data Sets via Graph Neural Networks

Publicly available data sets of structural MRIs might not contain specific measurements of brain Regions of Interest (ROIs) that are important for training machine learning models. For example, the curvature scores computed by FreeSurfer are not released by the Adolescent Brain Cognitive Development (ABCD) Study. One could address this issue by simply reapplying FreeSurfer to the data set, but this approach is generally computationally and labor intensive (e.g., requiring quality control). An alternative is to impute the missing measurements via a deep learning approach; however, state-of-the-art imputation methods are designed to estimate randomly missing values rather than entire measurements. We therefore propose to re-frame the imputation problem as a prediction task on another (public) data set that contains the missing measurements and shares some ROI measurements with the data set of interest. A deep learning model is trained to predict the missing measurements from the shared ones and is then applied to the other data sets. Our proposed algorithm models the dependencies between ROI measurements via a graph neural network (GNN) and accounts for demographic differences in brain measurements (e.g., sex) by feeding the graph encoding into a parallel architecture, which simultaneously optimizes a graph decoder to impute values and a classifier to predict demographic factors. We test the approach, called Demographic Aware Graph-based Imputation (DAGI), on imputing the missing FreeSurfer measurements of ABCD (N=3760) by training the predictor on the measurements publicly released by the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA, N=540)...
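The imputation-as-prediction idea above can be sketched in a few lines. Everything here is hypothetical (toy graph, toy weights, one round of mean message passing, a scalar linear readout), not the authors' DAGI implementation: ROIs are graph nodes, neighbor aggregation mixes the shared measurements, and a readout predicts the missing measurement per ROI.

```python
# Minimal sketch of imputing a missing ROI measurement from shared ones via
# message passing on an ROI graph. All names and numbers are hypothetical.

def message_pass(features, adjacency):
    """One round of mean aggregation over each node's neighbors plus itself."""
    out = []
    for i, f in enumerate(features):
        neigh = [features[j] for j in adjacency[i]] + [f]
        out.append(sum(neigh) / len(neigh))
    return out

def impute(shared, adjacency, weight, bias):
    """Predict a missing measurement per ROI from the shared measurements."""
    hidden = message_pass(shared, adjacency)
    return [weight * h + bias for h in hidden]

# Toy graph with 3 ROIs: ROIs 0 and 1 are connected, ROI 2 is isolated.
adjacency = {0: [1], 1: [0], 2: []}
shared = [1.0, 3.0, 5.0]
predicted = impute(shared, adjacency, weight=0.5, bias=0.1)
```

In the actual approach, the readout weights would be trained on a data set (such as NCANDA) where the target measurements are observed, then applied to a data set (such as ABCD) where they are missing.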

* Accepted at the 6th workshop on PRedictive Intelligence in Medicine (PRIME 2023) - MICCAI 2023 

The Multi-modality Cell Segmentation Challenge: Towards Universal Solutions

Aug 10, 2023
Jun Ma, Ronald Xie, Shamini Ayyadhury, Cheng Ge, Anubha Gupta, Ritu Gupta, Song Gu, Yao Zhang, Gihun Lee, Joonkee Kim, Wei Lou, Haofeng Li, Eric Upschulte, Timo Dickscheid, José Guilherme de Almeida, Yixin Wang, Lin Han, Xin Yang, Marco Labagnara, Sahand Jamal Rahi, Carly Kempster, Alice Pollitt, Leon Espinosa, Tâm Mignot, Jan Moritz Middeke, Jan-Niklas Eckardt, Wangkai Li, Zhaoyang Li, Xiaochen Cai, Bizhe Bai, Noah F. Greenwald, David Van Valen, Erin Weisbart, Beth A. Cimini, Zhuoshi Li, Chao Zuo, Oscar Brück, Gary D. Bader, Bo Wang

Figures 1-4 for The Multi-modality Cell Segmentation Challenge: Towards Universal Solutions

Cell segmentation is a critical step for quantitative single-cell analysis in microscopy images. Existing cell segmentation methods are often tailored to specific modalities or require manual intervention to specify hyperparameters in different experimental settings. Here, we present a multi-modality cell segmentation benchmark comprising over 1500 labeled images derived from more than 50 diverse biological experiments. The top participants developed a Transformer-based deep-learning algorithm that not only outperforms existing methods but can also be applied to diverse microscopy images across imaging platforms and tissue types without manual parameter adjustment. This benchmark and the improved algorithm offer promising avenues for more accurate and versatile cell analysis in microscopy imaging.

* NeurIPS22 Cell Segmentation Challenge: https://neurips22-cellseg.grand-challenge.org/ 

Rethinking Medical Report Generation: Disease Revealing Enhancement with Knowledge Graph

Jul 24, 2023
Yixin Wang, Zihao Lin, Haoyu Dong

Figures 1-4 for Rethinking Medical Report Generation: Disease Revealing Enhancement with Knowledge Graph

Knowledge Graphs (KGs) play a crucial role in Medical Report Generation (MRG) because they reveal the relations among diseases and can thus be used to guide the generation process. However, constructing a comprehensive KG is labor-intensive, and its application to the MRG process is under-explored. In this study, we establish a complete KG on chest X-ray imaging that includes 137 types of diseases and abnormalities. Based on this KG, we find that current MRG data sets exhibit a long-tailed problem in disease distribution. To mitigate this problem, we introduce a novel augmentation strategy that enhances the representation of disease types in the tail end of the distribution. We further design a two-stage MRG approach in which a classifier is first trained to detect whether the input images exhibit any abnormalities. The classified images are then independently fed into two transformer-based generators, a "disease-specific generator" and a "disease-free generator", to produce the corresponding reports. To strengthen the clinical evaluation of whether generated reports correctly describe the diseases appearing in the input image, we propose diverse sensitivity (DS), a new metric that checks whether the generated diseases match the ground truth and measures the diversity of all generated diseases. Results show that the proposed two-stage generation framework and augmentation strategies improve DS by a considerable margin, indicating a notable reduction in the long-tailed problem associated with under-represented diseases.
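The two ingredients of the DS metric described above (matching generated diseases against the ground truth, and counting the diversity of generated diseases) can be illustrated with a toy check. This is a hypothetical simplification for illustration; the paper's exact DS formula may differ.

```python
# Toy sketch in the spirit of the diverse sensitivity (DS) idea: which
# ground-truth diseases does the generated report mention, and how many
# distinct diseases does it generate? Disease names are hypothetical.

def disease_overlap(generated, ground_truth):
    gen, gt = set(generated), set(ground_truth)
    recall = len(gen & gt) / len(gt) if gt else 1.0
    diversity = len(gen)
    return recall, diversity

recall, diversity = disease_overlap(
    generated=["effusion", "cardiomegaly"],
    ground_truth=["effusion", "pneumonia"])
# One of the two ground-truth diseases is matched; two distinct diseases generated.
```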


Bidirectional Attention as a Mixture of Continuous Word Experts

Jul 08, 2023
Kevin Christian Wibisono, Yixin Wang

Figures 1-3 for Bidirectional Attention as a Mixture of Continuous Word Experts

Bidirectional attention, composed of self-attention with positional encodings and the masked language model (MLM) objective, has emerged as a key component of modern large language models (LLMs). Despite its empirical success, few studies have examined its statistical underpinnings: What statistical model is bidirectional attention implicitly fitting? What sets it apart from its non-attention predecessors? We explore these questions in this paper. The key observation is that fitting a single-layer single-head bidirectional attention, upon reparameterization, is equivalent to fitting a continuous bag of words (CBOW) model with mixture-of-experts (MoE) weights. Further, bidirectional attention with multiple heads and multiple layers is equivalent to stacked MoEs and a mixture of MoEs, respectively. This statistical viewpoint reveals the distinct use of MoE in bidirectional attention, which aligns with its practical effectiveness in handling heterogeneous data. It also suggests an immediate extension to categorical tabular data, if we view each word location in a sentence as a tabular feature. Across empirical studies, we find that this extension outperforms existing tabular extensions of transformers in out-of-distribution (OOD) generalization. Finally, this statistical perspective of bidirectional attention enables us to theoretically characterize when linear word analogies are present in its word embeddings. These analyses show that bidirectional attention can require much stronger assumptions to exhibit linear word analogies than its non-attention predecessors.
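The key observation above can be seen concretely in a toy, unbatched attention head with no learned projections (all numbers hypothetical): the softmax of query-key scores yields mixture weights over positions, so the output is a convex, input-dependent combination of context-word vectors, i.e. a CBOW-style average with mixture-of-experts weights.

```python
import math

def softmax(xs):
    """Numerically stable softmax of a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention_output(query, keys, values):
    """Mixture weights = softmax(query . key); output = weighted mean of values."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

# Identical keys give uniform weights, recovering a plain CBOW average.
out = attention_output(query=[1.0, 0.0],
                       keys=[[1.0, 0.0], [1.0, 0.0]],
                       values=[[2.0, 0.0], [4.0, 0.0]])
```

When the keys differ, the weights depend on the query, which is exactly the input-dependent expert weighting that distinguishes attention from a fixed CBOW average.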

* 31 pages 

On the Identifiability of Markov Switching Models

May 26, 2023
Carles Balsells-Rodas, Yixin Wang, Yingzhen Li

Figures 1-4 for On the Identifiability of Markov Switching Models

Identifiability of latent variable models has recently gained interest for its applications to interpretability and out-of-distribution generalisation. In this work, we study the identifiability of Markov Switching Models as a first step towards extending recent results to sequential latent variable models. We present identifiability conditions under first-order Markov dependency structures, parametrising the transition distribution via non-linear Gaussians. Our experiments showcase the applicability of our approach to regime-dependent causal discovery and high-dimensional time series segmentation.
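The generative structure being studied, a discrete regime following a first-order Markov chain with regime-dependent Gaussian transitions for the observations, can be sketched as follows. All parameters are hypothetical, and the mean function here is linear for brevity where the paper uses non-linear Gaussians.

```python
import random

# Toy generative sketch of a Markov switching model: the regime is a
# 2-state Markov chain, and each observation is Gaussian with a
# regime-dependent mean of the previous observation.

def sample_msm(T, stay_prob, mean_fn, sigma, seed=0):
    rng = random.Random(seed)
    state, x = 0, 0.0
    states, xs = [], []
    for _ in range(T):
        if rng.random() > stay_prob[state]:  # switch regime with prob 1 - stay
            state = 1 - state
        x = mean_fn(state, x) + rng.gauss(0.0, sigma)
        states.append(state)
        xs.append(x)
    return states, xs

states, xs = sample_msm(
    T=50,
    stay_prob=[0.9, 0.8],                                  # sticky 2-regime chain
    mean_fn=lambda s, prev: 0.5 * prev + (1.0 if s else -1.0),
    sigma=0.1)
```

Identifiability asks when the regimes and transition parameters can be recovered from observations like `xs` alone, without access to `states`.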


Delayed and Indirect Impacts of Link Recommendations

Mar 17, 2023
Han Zhang, Shangen Lu, Yixin Wang, Mihaela Curmei

Figures 1-4 for Delayed and Indirect Impacts of Link Recommendations

The impacts of link recommendations on social networks are challenging to evaluate, and so far they have been studied only in limited settings. Observational studies are restricted in the kinds of causal questions they can answer, and naive A/B tests often lead to biased evaluations due to unaccounted network interference. Furthermore, evaluations in simulation settings are often limited to static network models that do not account for the potential feedback loops between link recommendation and organic network evolution. To address these limitations, we study the impacts of recommendations on social networks in dynamic settings. Adopting a simulation-based approach, we consider an explicit dynamic formation model, an extension of the celebrated Jackson-Rogers model, and investigate how link recommendations affect network evolution over time. Empirically, we find that link recommendations have surprising delayed and indirect effects on the structural properties of networks. Specifically, link recommendations can have considerably different impacts in the immediate term and in the long term: for instance, friend-of-friend recommendations can decrease degree inequality immediately but make the degree distribution substantially more unequal in the long run. Moreover, the effects of recommendations can persist in networks, in part through their indirect impacts on natural dynamics, even after recommendations are turned off. In counterfactual simulations, we show that removing the indirect effects of link recommendations can make the network trend faster toward what it would have been under natural growth dynamics.
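The friend-of-friend recommendation step discussed above admits a compact sketch. This toy function is not the paper's simulator; it only illustrates the candidate-generation rule on a hypothetical undirected graph.

```python
# Candidate links for a node under friend-of-friend recommendation:
# neighbors of the node's neighbors that it is not already connected to.

def friend_of_friend_candidates(graph, node):
    direct = graph[node]
    candidates = set()
    for friend in direct:
        candidates |= graph[friend]
    return candidates - direct - {node}

# Toy undirected graph as an adjacency-set dict.
graph = {0: {1, 2}, 1: {0, 3}, 2: {0}, 3: {1}}
recommended = friend_of_friend_candidates(graph, 0)
# Node 3 is a neighbor of node 0's friend 1, so it becomes a candidate.
```

In a dynamic simulation, accepted candidates would be added as edges each round, feeding back into the organic growth process.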


Clarifying Trust of Materials Property Predictions using Neural Networks with Distribution-Specific Uncertainty Quantification

Feb 06, 2023
Cameron Gruich, Varun Madhavan, Yixin Wang, Bryan Goldsmith

Figures 1-4 for Clarifying Trust of Materials Property Predictions using Neural Networks with Distribution-Specific Uncertainty Quantification

For high-throughput catalyst discovery, it is critical that machine learning (ML) model predictions be trustworthy. Uncertainty quantification (UQ) methods allow estimation of the trustworthiness of an ML model, but these methods have not been well explored in the field of heterogeneous catalysis. Herein, we investigate different UQ methods applied to a crystal graph convolutional neural network (CGCNN) for predicting adsorption energies of molecules on alloys from the Open Catalyst 2020 (OC20) dataset, the largest existing heterogeneous catalyst dataset. We apply three UQ methods to the adsorption energy predictions: k-fold ensembling, Monte Carlo dropout, and evidential regression. The effectiveness of each UQ method is assessed in terms of accuracy, sharpness, dispersion, calibration, and tightness. Evidential regression is demonstrated to be a powerful approach for rapidly obtaining tunable, competitively trustworthy UQ estimates for heterogeneous catalysis applications when using neural networks. Recalibration of model uncertainties is shown to be essential in practical catalyst screening applications.
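Of the three UQ methods listed, ensembling is the simplest to sketch: the mean of member predictions is the point estimate, and their spread is the uncertainty. The "models" below are hypothetical stand-ins (e.g. networks trained on different folds), not CGCNN members.

```python
import math

# Minimal ensemble-based uncertainty quantification: predict with every
# member, report the mean and the standard deviation across members.

def ensemble_predict(models, x):
    preds = [m(x) for m in models]
    mean = sum(preds) / len(preds)
    var = sum((p - mean) ** 2 for p in preds) / len(preds)
    return mean, math.sqrt(var)

# Three hypothetical members that nearly agree near x = 1.
models = [lambda x: 2.0 * x, lambda x: 2.1 * x, lambda x: 1.9 * x]
mean, std = ensemble_predict(models, 1.0)
```

Calibration then asks whether these reported standard deviations match the observed error distribution; recalibration rescales them when they do not.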

* 28 pages, 16 figures (8 main text, 8 SI), submitted to Machine Learning: Science & Technology journal (MLST, IOP) 

On Learning Necessary and Sufficient Causal Graphs

Jan 29, 2023
Hengrui Cai, Yixin Wang, Michael Jordan, Rui Song

Figures 1-4 for On Learning Necessary and Sufficient Causal Graphs

The causal revolution has spurred interest in understanding complex relationships in various fields. Most existing methods aim to discover causal relationships among all variables in a large-scale complex graph. In practice, however, only a small number of variables in the graph are relevant to the outcomes of interest. As a result, causal estimation with the full causal graph, especially given limited data, can lead to many falsely discovered spurious variables that are highly correlated with, but have no causal impact on, the target outcome. In this paper, we propose to learn a class of necessary and sufficient causal graphs (NSCG) that contain only the causally relevant variables for an outcome of interest, which we term causal features. The key idea is to use probabilities of causation to systematically evaluate the importance of features in the causal graph, allowing us to identify a subgraph relevant to the outcome of interest. To learn an NSCG from data, we develop a score-based necessary and sufficient causal structural learning (NSCSL) algorithm by establishing theoretical relationships between probabilities of causation and the causal effects of features. Across empirical studies of simulated and real data, we show that the proposed NSCSL algorithm outperforms existing algorithms and can reveal important yeast genes for target heritable traits of interest.
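To make the "probabilities of causation" idea concrete: under the standard monotonicity assumption, the probability of necessity and sufficiency (PNS) is identified as a difference of interventional outcome probabilities, a classical identification result. The numbers below are hypothetical, and the paper's NSCSL scoring is more elaborate than this one-liner.

```python
# Under monotonicity, PNS = P(y | do(x)) - P(y | do(x')): a feature scores
# highly when intervening on it shifts the outcome probability a lot.

def pns_monotone(p_y_do_x, p_y_do_not_x):
    """Probability of necessity and sufficiency under monotonicity."""
    return p_y_do_x - p_y_do_not_x

# Hypothetical interventional probabilities for one candidate feature.
score = pns_monotone(p_y_do_x=0.8, p_y_do_not_x=0.3)
```

Features with scores near zero would be candidates for pruning from the causal graph, since they are neither necessary nor sufficient for the outcome.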


Team Resilience under Shock: An Empirical Analysis of GitHub Repositories during Early COVID-19 Pandemic

Jan 29, 2023
Xuan Lu, Wei Ai, Yixin Wang, Qiaozhu Mei

Figures 1-4 for Team Resilience under Shock: An Empirical Analysis of GitHub Repositories during Early COVID-19 Pandemic

While many organizations have shifted to working remotely during the COVID-19 pandemic, how the remote workforce and remote teams are influenced by and would respond to this and future shocks remains largely unknown. Software developers have relied on remote collaboration since long before the pandemic, working in virtual teams (GitHub repositories). The dynamics of these repositories through the pandemic therefore provide a unique opportunity to understand how remote teams react under shock, and this work presents a systematic analysis. We measure the overall effect of the early pandemic on public GitHub repositories by comparing their sizes and productivity with the counterfactual outcomes forecasted as if there were no pandemic. We find that the productivity level and the number of active members of these teams varied significantly during different periods of the pandemic. We then conduct a finer-grained investigation of the heterogeneous effects of the shock on individual teams, and find that a team's resilience is highly correlated with certain properties of the team before the pandemic. Through a bootstrapped regression analysis, we reveal which types of teams are robust or fragile to the shock.
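The counterfactual comparison described above can be sketched with a toy forecast: fit a trend to pre-shock activity, extrapolate it as the "no pandemic" outcome, and read the effect as observed minus forecast. The linear trend and all numbers here are hypothetical simplifications of the paper's forecasting setup.

```python
# Fit a least-squares line to pre-shock activity, extrapolate it past the
# shock, and compare against the observed value.

def linear_fit(ys):
    """Least-squares slope and intercept of y against x = 0, 1, ..., n-1."""
    n = len(ys)
    xm = (n - 1) / 2
    ym = sum(ys) / n
    num = sum((x - xm) * (y - ym) for x, y in enumerate(ys))
    den = sum((x - xm) ** 2 for x in range(n))
    slope = num / den
    return slope, ym - slope * xm

pre_shock = [10.0, 12.0, 14.0, 16.0]   # e.g. weekly commits before the shock
slope, intercept = linear_fit(pre_shock)
forecast = slope * 5 + intercept       # counterfactual activity at week 5
effect = 11.0 - forecast               # observed minus counterfactual
```

A negative `effect` indicates the team fell below its pre-shock trajectory; aggregating such effects across repositories yields the overall shock estimate.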

* 12 pages, 4 figures. To be published in the 17th International AAAI Conference on Web and Social Media (ICWSM) 

Posterior Collapse and Latent Variable Non-identifiability

Jan 02, 2023
Yixin Wang, David M. Blei, John P. Cunningham

Figures 1-4 for Posterior Collapse and Latent Variable Non-identifiability

Variational autoencoders model high-dimensional data by positing low-dimensional latent variables that are mapped through a flexible distribution parametrized by a neural network. Unfortunately, variational autoencoders often suffer from posterior collapse: the posterior of the latent variables is equal to its prior, rendering the variational autoencoder useless as a means to produce meaningful representations. Existing approaches to posterior collapse often attribute it to the use of neural networks or optimization issues due to variational approximation. In this paper, we consider posterior collapse as a problem of latent variable non-identifiability. We prove that the posterior collapses if and only if the latent variables are non-identifiable in the generative model. This fact implies that posterior collapse is not a phenomenon specific to the use of flexible distributions or approximate inference. Rather, it can occur in classical probabilistic models even with exact inference, which we also demonstrate. Based on these results, we propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility. This model class resolves the problem of latent variable non-identifiability by leveraging bijective Brenier maps and parameterizing them with input convex neural networks, without special variational inference objectives or optimization tricks. Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.
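The definition of posterior collapse above can be checked directly for a Gaussian posterior using the standard VAE KL formula (toy numbers, not the paper's experiments): collapse means q(z|x) = N(mu, sigma^2) coincides with the prior N(0, 1), so the KL term is zero for every input.

```python
import math

# Closed-form KL divergence between a scalar Gaussian posterior and the
# standard normal prior; zero iff the posterior equals the prior.

def kl_to_standard_normal(mu, sigma):
    """KL( N(mu, sigma^2) || N(0, 1) ) for a scalar latent variable."""
    return 0.5 * (mu ** 2 + sigma ** 2 - 1.0) - math.log(sigma)

collapsed = kl_to_standard_normal(0.0, 1.0)    # posterior equals the prior
informative = kl_to_standard_normal(1.5, 0.3)  # posterior carries signal
```

Monitoring this per-dimension KL during training is a common way to detect which latent dimensions have collapsed.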

* 19 pages, 4 figures; NeurIPS 2021 