Seong Jae Hwang

Evidence-empowered Transfer Learning for Alzheimer's Disease

Mar 03, 2023
Kai Tzu-iunn Ong, Hana Kim, Minjin Kim, Jinseong Jang, Beomseok Sohn, Yoon Seong Choi, Dosik Hwang, Seong Jae Hwang, Jinyoung Yeo

Transfer learning has been widely utilized to mitigate the data scarcity problem in the field of Alzheimer's disease (AD). Conventional transfer learning relies on re-using models trained on AD-irrelevant tasks such as natural image classification. However, it often leads to negative transfer due to the discrepancy between the non-medical source and target medical domains. To address this, we present evidence-empowered transfer learning for AD diagnosis. Unlike conventional approaches, we leverage an AD-relevant auxiliary task, namely morphological change prediction, without requiring additional MRI data. In this auxiliary task, the diagnosis model learns the evidential and transferable knowledge from morphological features in MRI scans. Experimental results demonstrate that our framework is not only effective in improving detection performance regardless of model capacity, but also more data-efficient and faithful.

* Accepted to IEEE International Symposium on Biomedical Imaging (ISBI) 2023 
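The auxiliary-task setup can be sketched as a two-head training objective. The sketch below is a hypothetical simplification, not the paper's implementation: the head names, the MSE form of the auxiliary term, and the weighting `lam` are all assumptions.

```python
import numpy as np

def joint_loss(diag_logits, diag_label, morph_pred, morph_target, lam=0.5):
    """Diagnosis cross-entropy plus an auxiliary morphological-change term."""
    # softmax cross-entropy for the AD diagnosis head
    z = diag_logits - diag_logits.max()
    log_p = z - np.log(np.exp(z).sum())
    ce = -log_p[diag_label]
    # mean-squared error for the auxiliary morphological change prediction head
    aux = np.mean((morph_pred - morph_target) ** 2)
    return float(ce + lam * aux)
```

With `lam = 0` (or a perfect auxiliary prediction) this reduces to plain diagnosis cross-entropy; the auxiliary term is what injects the morphological evidence during training.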
PAC-Bayesian Domain Adaptation Bounds for Multiclass Learners

Jul 12, 2022
Anthony Sicilia, Katherine Atwell, Malihe Alikhani, Seong Jae Hwang

Multiclass neural networks are a common tool in modern unsupervised domain adaptation, yet an appropriate theoretical description for their non-uniform sample complexity is lacking in the adaptation literature. To fill this gap, we propose the first PAC-Bayesian adaptation bounds for multiclass learners. We facilitate practical use of our bounds by also proposing the first approximation techniques for the multiclass distribution divergences we consider. For divergences dependent on a Gibbs predictor, we propose additional PAC-Bayesian adaptation bounds which remove the need for inefficient Monte-Carlo estimation. Empirically, we test the efficacy of our proposed approximation techniques as well as some novel design-concepts which we include in our bounds. Finally, we apply our bounds to analyze a common adaptation algorithm that uses neural networks.
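For context, a classical single-domain PAC-Bayes bound (Maurer's kl form), of the kind these adaptation bounds generalize to the multiclass domain-adaptation setting:

```latex
% With prior \pi, posterior \rho, m i.i.d. samples, and confidence 1-\delta,
% the empirical Gibbs risk \hat{R}_S(\rho) bounds the true risk R_D(\rho):
\mathrm{kl}\!\left(\hat{R}_S(\rho)\,\middle\|\,R_D(\rho)\right)
  \le \frac{\mathrm{KL}(\rho\,\|\,\pi) + \ln\frac{2\sqrt{m}}{\delta}}{m}
```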


Test-time Fourier Style Calibration for Domain Generalization

May 18, 2022
Xingchen Zhao, Chang Liu, Anthony Sicilia, Seong Jae Hwang, Yun Fu

Generalizing machine learning models trained on a collection of source domains to unknown target domains is challenging. While many domain generalization (DG) methods have achieved promising results, they primarily rely on the source domains at train-time without exploiting the target domains at test-time. Thus, these methods can still overfit to the source domains and perform poorly on the target domains. Driven by the observation that domains are strongly related to styles, we argue that reducing the gap between source and target styles can boost models' generalizability. To solve the dilemma of having no access to the target domain during training, we introduce Test-time Fourier Style Calibration (TF-Cal) for calibrating the target domain style on the fly during testing. To access styles, we apply a Fourier transformation to decompose features into amplitude (style) features and phase (semantic) features. Furthermore, we present an effective technique to Augment Amplitude Features (AAF) to complement TF-Cal. Extensive experiments on several popular DG benchmarks and a medical image segmentation dataset demonstrate that our method outperforms state-of-the-art methods.

* 31st International Joint Conference on Artificial Intelligence (IJCAI) 2022 
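The amplitude/phase decomposition behind TF-Cal can be sketched with a plain FFT. This is a minimal illustration assuming a simple linear interpolation of amplitudes; the function names, `alpha` parameter, and the choice of source amplitude statistic are assumptions, not the paper's exact calibration rule.

```python
import numpy as np

def decompose(feat):
    """Split a 2-D feature map into amplitude (style) and phase (semantics)."""
    f = np.fft.fft2(feat)
    return np.abs(f), np.angle(f)

def tf_cal(target_feat, source_amp, alpha=0.5):
    """Calibrate the target style toward a source amplitude statistic."""
    amp, phase = decompose(target_feat)
    calibrated = (1 - alpha) * amp + alpha * source_amp
    # recombine calibrated amplitude with the original (semantic) phase
    return np.real(np.fft.ifft2(calibrated * np.exp(1j * phase)))
```

Because only the amplitude is altered, the phase, and hence the semantic content, of the target feature is preserved.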

The Change that Matters in Discourse Parsing: Estimating the Impact of Domain Shift on Parser Error

Mar 21, 2022
Katherine Atwell, Anthony Sicilia, Seong Jae Hwang, Malihe Alikhani

Discourse analysis allows us to attain inferences of a text document that extend beyond the sentence level. The current performance of discourse models is very low on texts outside of the training distribution's coverage, diminishing the practical utility of existing models. There is a need for a measure that can inform us to what extent our model generalizes from the training to the test sample when these samples may be drawn from distinct distributions. While this can be estimated via distribution shift, we argue that this does not directly correlate with change in the observed error of a classifier (i.e., error-gap). Thus, we propose to use a statistic from the theoretical domain adaptation literature which can be directly tied to error-gap. We study the bias of this statistic as an estimator of error-gap both theoretically and through a large-scale empirical study of over 2400 experiments on 6 discourse datasets from domains including, but not limited to, news, biomedical texts, TED talks, Reddit posts, and fiction. Our results not only motivate our proposal and help us to understand its limitations, but also provide insight on the properties of discourse models and datasets which improve performance in domain adaptation. For instance, we find that non-news datasets are slightly easier to transfer to than news datasets when the training and test sets are very different. Our code and an associated Python package are available to allow practitioners to make more informed model and dataset choices.
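As a rough illustration, the error-gap and a simple disagreement statistic (one common ingredient of adaptation-theoretic divergences) can be computed as follows. This is a generic sketch, not the specific statistic studied in the paper.

```python
import numpy as np

def error_gap(preds_src, labels_src, preds_tgt, labels_tgt):
    """Absolute change in classifier error between source and target samples."""
    err_s = np.mean(preds_src != labels_src)
    err_t = np.mean(preds_tgt != labels_tgt)
    return float(abs(err_t - err_s))

def disagreement(h1_preds, h2_preds):
    """Rate at which two hypotheses disagree on the same (unlabeled) inputs."""
    return float(np.mean(h1_preds != h2_preds))
```

The appeal of disagreement-style statistics is that they need no target labels, yet, in theory, they can be tied to the (unobservable) error-gap.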


Point Cloud Augmentation with Weighted Local Transformations

Oct 11, 2021
Sihyeon Kim, Sanghyeok Lee, Dasol Hwang, Jaewon Lee, Seong Jae Hwang, Hyunwoo J. Kim

Despite the extensive usage of point clouds in 3D vision, relatively limited data are available for training deep neural networks. Although data augmentation is a standard approach to compensate for the scarcity of data, it has been less explored in the point cloud literature. In this paper, we propose a simple and effective augmentation method called PointWOLF for point cloud augmentation. The proposed method produces smoothly varying non-rigid deformations by locally weighted transformations centered at multiple anchor points. The smooth deformations allow diverse and realistic augmentations. Furthermore, to minimize the manual effort of searching for optimal augmentation hyperparameters, we present AugTune, which generates augmented samples of desired difficulty by producing targeted confidence scores. Our experiments show our framework consistently improves the performance for both shape classification and part segmentation tasks. Particularly, with PointNet++, PointWOLF achieves state-of-the-art 89.7% accuracy on shape classification with the real-world ScanObjectNN dataset.

* 9 pages, Accepted to ICCV 2021 
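The locally weighted transformation idea can be sketched as follows: sample anchor points, draw one local transform per anchor (here simplified to a translation; PointWOLF uses richer transforms), and blend them per point with distance-based Gaussian weights. Parameter names and defaults are illustrative assumptions.

```python
import numpy as np

def pointwolf_sketch(points, n_anchors=4, sigma=0.5, shift_scale=0.1, seed=0):
    """Smooth non-rigid deformation of an (N, 3) point cloud via
    locally weighted per-anchor translations."""
    rng = np.random.default_rng(seed)
    anchors = points[rng.choice(len(points), n_anchors, replace=False)]
    shifts = rng.normal(0.0, shift_scale, (n_anchors, 3))  # local transforms
    # Gaussian locality weights: nearer anchors influence a point more
    d2 = ((points[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma ** 2))
    w /= w.sum(axis=1, keepdims=True)
    return points + w @ shifts
```

Because the weights vary smoothly with position, nearby points receive similar displacements, which is what keeps the deformation realistic rather than noisy.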

PAC Bayesian Performance Guarantees for Deep (Stochastic) Networks in Medical Imaging

Apr 12, 2021
Anthony Sicilia, Xingchen Zhao, Anastasia Sosnovskikh, Seong Jae Hwang

Application of deep neural networks to medical imaging tasks has in some sense become commonplace. Still, a "thorn in the side" of the deep learning movement is the argument that deep networks are somehow prone to overfitting and are thus unable to generalize well when datasets are small. The claim is not baseless and likely stems from the observation that PAC bounds on generalization error are usually so large for deep networks that they are vacuous (i.e., logically meaningless). Contrary to this, recent advances using the PAC-Bayesian framework have instead shown non-vacuous bounds on generalization error for large (stochastic) networks and standard datasets (e.g., MNIST and CIFAR-10). We apply these techniques to a much smaller medical imaging dataset (the ISIC 2018 challenge set). Further, we consider generalization of deep networks on segmentation tasks, which has not commonly been done using the PAC-Bayesian framework. Importantly, we observe that the resultant bounds are also non-vacuous despite the sharp reduction in sample size. In total, our results demonstrate the applicability of PAC-Bayesian bounds for deep stochastic networks in the medical imaging domain.
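To see what "non-vacuous" means numerically, the standard kl-form PAC-Bayes bound can be inverted by bisection: given an empirical risk, a KL complexity term, and a sample size, it yields an upper bound on true risk, which is vacuous whenever it reaches 1. This is a generic sketch of that routine, not the paper's code; the example numbers are illustrative.

```python
import math

def kl_bern(q, p):
    """Binary KL divergence kl(q || p), with a small epsilon for stability."""
    eps = 1e-12
    return (q * math.log((q + eps) / (p + eps))
            + (1 - q) * math.log((1 - q + eps) / (1 - p + eps)))

def pac_bayes_upper(emp_risk, kl_term, m, delta=0.05):
    """Largest p with kl(emp_risk || p) <= (KL + ln(2*sqrt(m)/delta)) / m."""
    rhs = (kl_term + math.log(2 * math.sqrt(m) / delta)) / m
    lo, hi = emp_risk, 1.0
    for _ in range(60):  # bisection: kl(q || p) is increasing in p for p > q
        mid = (lo + hi) / 2
        if kl_bern(emp_risk, mid) > rhs:
            hi = mid
        else:
            lo = mid
    return lo
```

With a small sample (large `rhs`) the bound saturates near 1 and says nothing; a non-vacuous result keeps the bound well below chance-level error despite limited data.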


Multi-Domain Learning by Meta-Learning: Taking Optimal Steps in Multi-Domain Loss Landscapes by Inner-Loop Learning

Feb 25, 2021
Anthony Sicilia, Xingchen Zhao, Davneet Minhas, Erin O'Connor, Howard Aizenstein, William Klunk, Dana Tudorascu, Seong Jae Hwang

We consider a model-agnostic solution to the problem of Multi-Domain Learning (MDL) for multi-modal applications. Many existing MDL techniques are model-dependent solutions which explicitly require nontrivial architectural changes to construct domain-specific modules. Thus, properly applying these MDL techniques to new problems with well-established models, e.g., U-Net for semantic segmentation, may demand various low-level implementation efforts. In this paper, given emerging multi-modal data (e.g., various structural neuroimaging modalities), we aim to enable MDL purely algorithmically so that widely used neural networks can trivially achieve MDL in a model-independent manner. To this end, we consider a weighted loss function and extend it to an effective procedure by employing techniques from the recently active area of learning-to-learn (meta-learning). Specifically, we take inner-loop gradient steps to dynamically estimate posterior distributions over the hyperparameters of our loss function. Thus, our method is model-agnostic, requiring no additional model parameters and no network architecture changes; instead, only a few efficient algorithmic modifications are needed to improve performance in MDL. We demonstrate our solution on a well-suited problem in medical imaging, specifically, the automatic segmentation of white matter hyperintensity (WMH). We use two neuroimaging modalities (T1-MR and FLAIR) with complementary information suited to our problem.

* IEEE International Symposium on Biomedical Imaging 2021 
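The weighted multi-domain loss and an inner-loop update on its weights can be sketched as below. This is a hypothetical simplification: the paper estimates posterior distributions over the loss-weight hyperparameters, whereas this sketch takes a single deterministic gradient step on softmax-parameterized weights.

```python
import numpy as np

def weighted_mdl_loss(domain_losses, log_w):
    """Multi-domain loss: softmax-weighted sum of per-domain losses."""
    w = np.exp(log_w) / np.exp(log_w).sum()
    return float(w @ domain_losses)

def inner_loop_step(domain_losses, log_w, lr=0.1):
    """One inner-loop gradient step on the loss-weight hyperparameters.
    For L = sum_i w_i * l_i with w = softmax(log_w), the exact gradient is
    dL/dlog_w_j = w_j * (l_j - L)."""
    w = np.exp(log_w) / np.exp(log_w).sum()
    grad = w * (domain_losses - w @ domain_losses)
    return log_w - lr * grad
```

Note that the step shifts weight away from high-loss domains; the meta-learned weighting is what lets a stock architecture adapt across domains without new modules.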

Robust White Matter Hyperintensity Segmentation on Unseen Domain

Feb 17, 2021
Xingchen Zhao, Anthony Sicilia, Davneet Minhas, Erin O'Connor, Howard Aizenstein, William Klunk, Dana Tudorascu, Seong Jae Hwang

Typical machine learning frameworks heavily rely on an underlying assumption that training and test data follow the same distribution. In medical imaging, which has increasingly begun acquiring datasets from multiple sites or scanners, this identical-distribution assumption often fails to hold due to systematic variability induced by site- or scanner-dependent factors. Therefore, we cannot simply expect a model trained on a given dataset to consistently work well, or generalize, on a dataset from another distribution. In this work, we address this problem, investigating the application of machine learning models to unseen medical imaging data. Specifically, we consider the challenging case of Domain Generalization (DG) where we train a model without any knowledge about the testing distribution. That is, we train on samples from a set of distributions (sources) and test on samples from a new, unseen distribution (target). We focus on the task of white matter hyperintensity (WMH) prediction using the multi-site WMH Segmentation Challenge dataset and our local in-house dataset. We identify how two mechanically distinct DG approaches, namely domain adversarial learning and mix-up, have theoretical synergy. Then, we show drastic improvements of WMH prediction on an unseen target domain.

* IEEE International Symposium on Biomedical Imaging 2021 
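Of the two DG ingredients above, mix-up is the simpler to sketch: convexly blend two training examples and their labels with a Beta-drawn coefficient. This is the standard mix-up recipe, not the paper's full training pipeline; the `alpha` default is illustrative.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, seed=None):
    """Blend two training examples (and their labels) with a Beta(alpha, alpha) weight."""
    lam = np.random.default_rng(seed).beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2
```

Training on such interpolated samples encourages linear behavior between examples, which, intuitively, populates the space between source domains and is one reason mix-up helps generalization to unseen targets.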