Gopinath Chennupati

Decoy Selection for Protein Structure Prediction Via Extreme Gradient Boosting and Ranking

Oct 03, 2020
Nasrin Akhter, Gopinath Chennupati, Hristo Djidjev, Amarda Shehu

Figures 1-4 for Decoy Selection for Protein Structure Prediction Via Extreme Gradient Boosting and Ranking

Identifying one or more biologically active (native) decoys among millions of non-native decoys is a major challenge in computational structural biology. The extreme imbalance between positive and negative samples (native and non-native decoys) in a decoy set makes the problem even harder. Consensus methods show varied success in decoy selection, hampered by the cost of clustering large decoy sets and by decoy sets that exhibit little structural similarity. Recent investigations into energy landscape-based decoy selection approaches show promise; however, a lack of generalization across varied test cases remains a bottleneck for these methods. We propose a novel decoy selection method, ML-Select, a machine learning framework that exploits the energy landscape associated with the structure space probed by template-free decoy generation. The proposed method outperforms both clustering- and energy-ranking-based methods while performing consistently across varied test cases. Moreover, ML-Select shows promising results even for decoy sets consisting mostly of low-quality decoys. ML-Select is thus a useful method for decoy selection, and this work suggests further research into more effective ways of adopting machine learning frameworks to achieve robust decoy selection in template-free protein structure prediction.
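
As a minimal illustration of the selection task, the sketch below ranks a synthetic, heavily imbalanced decoy set by a score and measures the purity of the top selections. The features, their distributions, and the fixed linear scoring rule are all hypothetical stand-ins; in ML-Select the score would come from a trained gradient-boosted ranking model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical decoy set: roughly 1% near-native, 99% non-native.
n = 10_000
is_native = rng.random(n) < 0.01
# Hypothetical per-decoy features: lower energy and a larger basin-size
# estimate tend to indicate near-native decoys (illustrative only).
energy = rng.normal(loc=np.where(is_native, -50.0, -30.0), scale=5.0)
basin = rng.normal(loc=np.where(is_native, 0.8, 0.3), scale=0.1)

# Stand-in scoring rule: a fixed linear combination of the features.
score = -energy + 20.0 * basin

# Rank all decoys by score and keep the top k.
k = 100
top = np.argsort(score)[::-1][:k]

# Purity: the fraction of selected decoys that are near-native -- the
# usual way to evaluate decoy selection under extreme class imbalance.
purity = is_native[top].mean()
print(f"purity of top-{k}: {purity:.2f}")
```

Ranking by a per-decoy score, rather than classifying each decoy independently, sidesteps part of the imbalance problem: only the relative ordering of decoys matters for selecting the top k.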

* Accepted for BMC Bioinformatics 

Why I'm not Answering: Understanding Determinants of Classification of an Abstaining Classifier for Cancer Pathology Reports

Sep 24, 2020
Sayera Dhaubhadel, Jamaludin Mohd-Yusof, Kumkum Ganguly, Gopinath Chennupati, Sunil Thulasidasan, Nicolas Hengartner, Brent J. Mumphrey, Eric B. Durban, Jennifer A. Doherty, Mireille Lemieux, Noah Schaefferkoetter, Georgia Tourassi, Linda Coyle, Lynne Penberthy, Benjamin McMahon, Tanmoy Bhattacharya

Figures 1-4 for Why I'm not Answering: Understanding Determinants of Classification of an Abstaining Classifier for Cancer Pathology Reports

Safe deployment of deep learning systems in critical real-world applications requires models to make few mistakes, and only under predictable circumstances. Development of such a model is not yet possible in general. In this work, we address this problem with an abstaining classifier tuned to have $>$95% accuracy, and identify the determinants of abstention with LIME (the Local Interpretable Model-agnostic Explanations method). Essentially, we train our model to learn the attributes of pathology reports that are likely to lead to incorrect classifications, albeit at the cost of reduced sensitivity. We demonstrate our method in a multitask setting, classifying cancer pathology reports from the NCI SEER cancer registries on six tasks of greatest importance. For these tasks, we reduce the classification error rate by factors of 2-5 by abstaining on 25-45% of the reports. For the specific case of cancer site, we identify metastasis and reports involving lymph nodes as responsible for many of the classification mistakes, and show that the extent and types of mistakes vary systematically with cancer site (e.g., breast, lung, and prostate). When combining across three of the tasks, our model classifies 50% of the reports with an accuracy greater than 95% on three of the six tasks and greater than 85% on all six tasks for the retained samples. Using this information, we expect to define workflows that incorporate machine learning only where it is sufficiently robust and accurate, reserving human attention for the areas where it is required.
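
The accuracy/coverage trade-off behind abstention can be illustrated with a simple confidence threshold on synthetic softmax outputs. This is a generic sketch, not the paper's trained abstention mechanism; all numbers below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic 3-class problem: raw logits plus true labels.
n, k = 5000, 3
logits = rng.normal(size=(n, k))
labels = rng.integers(0, k, size=n)
# Make most examples easy: boost the true-class logit for 80% of them.
easy = rng.random(n) < 0.8
logits[easy, labels[easy]] += 4.0

# Softmax probabilities (numerically stabilized).
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)

pred = probs.argmax(axis=1)
conf = probs.max(axis=1)

# Abstain whenever confidence falls below a threshold; accuracy is
# then measured only on the retained (answered) samples.
tau = 0.9
answered = conf >= tau
retained_acc = (pred[answered] == labels[answered]).mean()
overall_acc = (pred == labels).mean()
abstain_rate = 1.0 - answered.mean()
print(f"overall={overall_acc:.3f} retained={retained_acc:.3f} "
      f"abstained={abstain_rate:.1%}")
```

Raising the threshold trades coverage for accuracy on the retained set, which is the same trade-off the abstaining classifier learns end-to-end.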

Why I'm not Answering: An Abstention-Based Approach to Classify Cancer Pathology Reports

Sep 17, 2020
Sayera Dhaubhadel, Jamaludin Mohd-Yusof, Kumkum Ganguly, Gopinath Chennupati, Sunil Thulasidasan, Nicolas Hengartner, Brent J. Mumphrey, Eric B. Durban, Jennifer A. Doherty, Mireille Lemieux, Noah Schaefferkoetter, Georgia Tourassi, Linda Coyle, Lynne Penberthy, Benjamin McMahon, Tanmoy Bhattacharya

Figures 1-4 for Why I'm not Answering: An Abstention-Based Approach to Classify Cancer Pathology Reports

Why I'm not Answering

Sep 10, 2020
Sayera Dhaubhadel, Jamaludin Mohd-Yusof, Kumkum Ganguly, Gopinath Chennupati, Sunil Thulasidasan, Nicolas Hengartner, Brent J. Mumphrey, Eric B. Durban, Jennifer A. Doherty, Mireille Lemieux, Noah Schaefferkoetter, Georgia Tourassi, Linda Coyle, Lynne Penberthy, Benjamin McMahon, Tanmoy Bhattacharya

Figures 1-4 for Why I'm not Answering

Distributed Non-Negative Tensor Train Decomposition

Aug 04, 2020
Manish Bhattarai, Gopinath Chennupati, Erik Skau, Raviteja Vangara, Hristo Djidjev, Boian Alexandrov

Figures 1-4 for Distributed Non-Negative Tensor Train Decomposition

The era of exascale computing opens new avenues for innovations and discoveries in many scientific, engineering, and commercial fields. However, with the exaflops also comes the extra-large, high-dimensional data generated by high-performance computing. High-dimensional data are represented as multidimensional arrays, also known as tensors. The presence of latent (not directly observable) structures in a tensor allows a unique representation and compression of the data by classical tensor factorization techniques. However, classical tensor methods are not always stable, or their memory requirements can grow exponentially, which makes them unsuitable for high-dimensional tensors. Tensor train (TT) is a state-of-the-art tensor network introduced for the factorization of high-dimensional tensors. TT transforms the initial high-dimensional tensor into a network of three-dimensional tensors that requires only linear storage. Much real-world data, such as density, temperature, population, and probability, is non-negative, and algorithms that preserve non-negativity are preferred for ease of interpretation. Here, we introduce a distributed non-negative tensor-train decomposition and demonstrate its scalability and compression on synthetic and real-world big datasets.
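
For illustration, the sketch below implements the classical serial TT-SVD, which decomposes a dense tensor into a chain of three-dimensional cores via sequential SVDs. Note this plain variant neither preserves non-negativity nor runs distributed, unlike the method introduced in the paper.

```python
import numpy as np

def tt_svd(T, max_rank):
    """Decompose a dense tensor into TT cores via sequential SVDs."""
    shape = T.shape
    cores, r = [], 1
    M = T.reshape(r * shape[0], -1)
    for k in range(len(shape) - 1):
        U, S, Vt = np.linalg.svd(M, full_matrices=False)
        rk = min(max_rank, len(S))
        cores.append(U[:, :rk].reshape(r, shape[k], rk))
        # Carry the remainder forward, refolded for the next unfolding.
        M = (S[:rk, None] * Vt[:rk]).reshape(rk * shape[k + 1], -1)
        r = rk
    cores.append(M.reshape(r, shape[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into a full tensor."""
    out = cores[0]
    for c in cores[1:]:
        out = np.tensordot(out, c, axes=([-1], [0]))
    return out.squeeze(axis=(0, -1))

rng = np.random.default_rng(2)
# Build a tensor with true TT-ranks <= 2, so a rank-2 TT is exact.
g1 = rng.random((1, 4, 2))
g2 = rng.random((2, 5, 2))
g3 = rng.random((2, 6, 1))
T = tt_reconstruct([g1, g2, g3])

cores = tt_svd(T, max_rank=2)
err = np.linalg.norm(tt_reconstruct(cores) - T) / np.linalg.norm(T)
print(f"relative reconstruction error: {err:.2e}")
```

The linear storage claim is visible in the core shapes: a d-way tensor with mode sizes n and TT-ranks r is stored in roughly d·n·r² numbers instead of n^d.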

* Accepted to IEEE-HPEC 2020 

On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks

May 27, 2019
Sunil Thulasidasan, Gopinath Chennupati, Jeff Bilmes, Tanmoy Bhattacharya, Sarah Michalak

Figures 1-4 for On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks

Mixup~\cite{zhang2017mixup} is a recently proposed method for training deep neural networks in which additional samples are generated during training by convexly combining random pairs of images and their associated labels. While simple to implement, it has been shown to be a surprisingly effective method of data augmentation for image classification: DNNs trained with mixup show noticeable gains in classification performance on a number of image classification benchmarks. In this work, we discuss a hitherto untouched aspect of mixup training -- the calibration and predictive uncertainty of models trained with mixup. We find that DNNs trained with mixup are significantly better calibrated -- i.e., the predicted softmax scores are much better indicators of the actual likelihood of a correct prediction -- than DNNs trained in the regular fashion. We conduct experiments on a number of image classification architectures and datasets -- including large-scale datasets like ImageNet -- and find this to be the case. Additionally, we find that merely mixing features does not result in the same calibration benefit, and that the label smoothing in mixup training plays a significant role in improving calibration. Finally, we also observe that mixup-trained DNNs are less prone to over-confident predictions on out-of-distribution and random-noise data. We conclude that the typical overconfidence seen in neural networks, even on in-distribution data, is likely a consequence of training with hard labels, suggesting that mixup training be employed for classification tasks where predictive uncertainty is a significant concern.
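
The mixup augmentation step itself is only a few lines; a numpy sketch (training loop and calibration measurement omitted):

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.2, rng=None):
    """Convexly combine a batch with a shuffled copy of itself."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)        # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))      # random pairing of samples
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix

rng = np.random.default_rng(3)
x = rng.random((8, 32, 32, 3))          # toy image batch in [0, 1)
labels = rng.integers(0, 10, size=8)
y = np.eye(10)[labels]                  # one-hot labels

x_mix, y_mix = mixup_batch(x, y, alpha=0.2, rng=rng)
# Mixed labels are soft: rows still sum to 1 but are no longer one-hot.
print(y_mix.sum(axis=1))
```

The soft labels produced by the convex combination are the "label smoothing" component the abstract identifies as the main driver of the calibration benefit.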

Combating Label Noise in Deep Learning Using Abstention

May 27, 2019
Sunil Thulasidasan, Tanmoy Bhattacharya, Jeff Bilmes, Gopinath Chennupati, Jamal Mohd-Yusof

Figures 1-4 for Combating Label Noise in Deep Learning Using Abstention

We introduce a novel method to combat label noise when training deep neural networks for classification. We propose a loss function that permits abstention during training, thereby allowing the DNN to abstain on confusing samples while continuing to learn and improve classification performance on the non-abstained samples. We show how such a deep abstaining classifier (DAC) can be used for robust learning in the presence of different types of label noise. In the case of structured or systematic label noise -- where noisy training labels or confusing examples are correlated with underlying features of the data -- training with abstention enables representation learning for features that are associated with unreliable labels. In the case of unstructured (arbitrary) label noise, abstention during training enables the DAC to be used as an effective data cleaner by identifying samples that are likely to have label noise. We provide analytical results on the behavior of the loss function that enable dynamic adaptation of abstention rates based on learning progress during training. We demonstrate the utility of the deep abstaining classifier for various image classification tasks under different types of label noise; in the case of arbitrary label noise, we show significant improvements over previously published results on multiple image benchmarks.
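
The abstention loss can be sketched as follows: cross-entropy over the real classes, rescaled by the non-abstention mass, plus an alpha-weighted penalty on abstaining. Treat the exact form and constants here as an illustrative reconstruction rather than the reference implementation.

```python
import numpy as np

def dac_loss(logits, labels, alpha=1.0):
    """Abstention loss: the extra (last) output class means 'abstain'.

    Per sample: (1 - p_abs) * cross-entropy over the real classes
    (renormalized to exclude abstention mass), plus an alpha-weighted
    penalty -log(1 - p_abs) for putting mass on abstention.
    """
    z = logits - logits.max(axis=1, keepdims=True)   # stable softmax
    p = np.exp(z)
    p /= p.sum(axis=1, keepdims=True)
    p_abs = p[:, -1]                                 # abstention prob
    p_true = p[np.arange(len(labels)), labels]
    eps = 1e-12
    ce = -np.log(p_true / (1.0 - p_abs + eps) + eps)  # renormalized CE
    penalty = -np.log(1.0 - p_abs + eps)
    return np.mean((1.0 - p_abs) * ce + alpha * penalty)

# Toy check: 4 samples, 3 real classes + 1 abstention output.
logits = np.array([[4.0, 0.0, 0.0, 0.0],   # confident, correct
                   [0.0, 4.0, 0.0, 0.0],   # confident, wrong
                   [0.0, 0.0, 0.0, 4.0],   # abstains
                   [1.0, 1.0, 1.0, 1.0]])  # uncertain
labels = np.array([0, 0, 0, 0])
print(f"alpha=1.0: {dac_loss(logits, labels, alpha=1.0):.3f}")
```

As the abstention mass goes to zero the loss reduces to ordinary cross-entropy, while alpha sets the price of abstaining, which is what allows the abstention rate to be adapted during training.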

* ICML 2019 

eAnt-Miner: An Ensemble Ant-Miner to Improve the ACO Classification

Sep 09, 2014
Gopinath Chennupati

Figures 1-4 for eAnt-Miner: An Ensemble Ant-Miner to Improve the ACO Classification

Ant Colony Optimization (ACO) has been applied in supervised learning to induce classification rules as well as decision trees, known as Ant-Miners. Although these are competitive classifiers, their stability is an important concern owing to their stochastic nature. In this paper, to address this issue, a well-established machine learning technique, the ensemble of classifiers, is applied, with an ACO classifier used as the base classifier of the ensemble. The main trade-off is that predictions in the new approach are determined by a group of discovered models rather than by a single model. In essence, we prepare multiple models from bootstrap samples of the training data (sampled with replacement), then aggregate them into a single model to classify unseen data points. The main objective of this new approach is to increase the stability of the Ant-Miner results, thereby improving the performance of ACO classification. We found that the ensemble Ant-Miners significantly improved stability by reducing the classification error on unseen data.
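
The ensembling scheme described above is standard bagging; the sketch below demonstrates it with a trivial nearest-centroid learner standing in for the Ant-Miner base classifier.

```python
import numpy as np

def fit_centroid(X, y):
    """Tiny stand-in base learner: one centroid per class."""
    classes = np.unique(y)
    return classes, np.array([X[y == c].mean(axis=0) for c in classes])

def predict_centroid(model, X):
    classes, centroids = model
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[d.argmin(axis=1)]

def bagging_predict(X_train, y_train, X_test, n_models=15, rng=None):
    """Bagging: fit one base model per bootstrap sample of the training
    set, then aggregate test predictions by majority vote."""
    if rng is None:
        rng = np.random.default_rng()
    votes = []
    for _ in range(n_models):
        # Bootstrap sample: draw with replacement.
        idx = rng.integers(0, len(X_train), size=len(X_train))
        model = fit_centroid(X_train[idx], y_train[idx])
        votes.append(predict_centroid(model, X_test))
    votes = np.stack(votes)              # shape: (n_models, n_test)
    # Majority vote across the ensemble for each test point.
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

rng = np.random.default_rng(4)
# Two well-separated Gaussian blobs.
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(6, 1, (50, 2))])
y = np.repeat([0, 1], 50)
pred = bagging_predict(X, y, X, n_models=15, rng=rng)
print(f"training accuracy: {(pred == y).mean():.2f}")
```

Because each base model sees a different bootstrap sample, the vote averages out the run-to-run variance of a stochastic learner, which is exactly the stability argument made for the ensemble Ant-Miner.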

* 13 pages, 2 figures, 6 tables 