
Prathosh AP

DeGPR: Deep Guided Posterior Regularization for Multi-Class Cell Detection and Counting

Apr 03, 2023
Aayush Kumar Tyagi, Chirag Mohapatra, Prasenjit Das, Govind Makharia, Lalita Mehra, Prathosh AP, Mausam

Multi-class cell detection and counting is an essential task for many pathological diagnoses. Manual counting is tedious and often leads to inter-observer variations among pathologists. While multiple general-purpose, deep learning-based object detection and counting methods exist, they may not readily transfer to detecting and counting cells in medical images, due to limited data, the presence of tiny overlapping objects, multiple cell types, severe class imbalance, minute differences in the size/shape of cells, etc. In response, we propose Deep Guided Posterior Regularization (DeGPR), which assists an object detector by guiding it to exploit discriminative features among cells. The features may be pathologist-provided or inferred directly from visual data. We validate our model on two publicly available datasets (CoNSeP and MoNuSAC) and on MuCeD, a novel dataset that we contribute. MuCeD consists of 55 biopsy images of the human duodenum for predicting celiac disease. We perform extensive experimentation with three object detection baselines on three datasets to show that DeGPR is model-agnostic and consistently improves baselines, obtaining up to 9% (absolute) mAP gains.
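
The guidance idea can be illustrated as an auxiliary loss term added to the detector objective. A minimal sketch, assuming the discriminative feature is simply the per-class cell proportion and the guidance is a pathologist-provided prior ratio (the function name and the exact KL form of the penalty are illustrative assumptions, not the paper's formulation):

```python
import math

def posterior_regularizer(pred_counts, prior_ratios):
    """Penalize KL divergence between the detector's predicted per-class
    cell proportions and guiding class-ratio priors (e.g. supplied by a
    pathologist). Intended as an auxiliary term added to the detector loss."""
    total = sum(pred_counts)
    pred = [c / total for c in pred_counts]
    return sum(p * math.log(p / q)
               for p, q in zip(pred, prior_ratios) if p > 0)
```

A detector predicting 30 cells of one type and 10 of another against a 50/50 prior incurs a positive penalty, while matching proportions incur none.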

Discovering mesoscopic descriptions of collective movement with neural stochastic modelling

Mar 17, 2023
Utkarsh Pratiush, Arshed Nabeel, Vishwesha Guttal, Prathosh AP

Collective motion is a ubiquitous phenomenon in nature, inspiring engineers, physicists and mathematicians to develop mathematical models and bio-inspired designs. Collective motion at small to medium group sizes (~10-1000 individuals, also called the 'mesoscale') can show nontrivial features due to stochasticity. Therefore, characterizing both the deterministic and stochastic aspects of the dynamics is crucial in the study of mesoscale collective phenomena. Here, we use a physics-inspired, neural-network based approach to characterize the stochastic group dynamics of interacting individuals, through a stochastic differential equation (SDE) that governs the collective dynamics of the group. We apply this technique to both synthetic and real-world datasets, and identify the deterministic and stochastic aspects of the dynamics using drift and diffusion fields, enabling us to make novel inferences about the nature of order in these systems.
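
The drift/diffusion decomposition of an SDE can be estimated from data via the conditional moments of increments; in the paper a neural network plays the regression role, but a simple binned estimator on a simulated Ornstein-Uhlenbeck process (a stand-in for a mesoscale order parameter; all parameter values here are illustrative) conveys the idea:

```python
import math
import random

def simulate_ou(theta=1.0, sigma=0.5, dt=0.01, n=200000, seed=0):
    """Euler-Maruyama simulation of an Ornstein-Uhlenbeck process,
    dx = -theta * x dt + sigma dW, used as synthetic order-parameter data."""
    rng = random.Random(seed)
    x, xs = 0.0, []
    for _ in range(n):
        xs.append(x)
        x += -theta * x * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return xs

def estimate_drift_diffusion(xs, dt, x0, width=0.1):
    """Conditional-moment estimates at state x0:
    drift f(x0) ~ E[dx | x ~ x0] / dt, diffusion g^2(x0) ~ E[dx^2 | x ~ x0] / dt.
    A neural regressor would replace this binning in the paper's approach."""
    num_f = num_g = count = 0.0
    for a, b in zip(xs, xs[1:]):
        if abs(a - x0) < width:
            d = b - a
            num_f += d
            num_g += d * d
            count += 1
    return num_f / (count * dt), num_g / (count * dt)
```

For the simulated process the estimates should recover f(0.5) near -theta * 0.5 = -0.5 and g^2 near sigma^2 = 0.25.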

ALM-KD: Knowledge Distillation with noisy labels via adaptive loss mixing

Feb 07, 2022
Durga Sivasubramanian, Pradeep Shenoy, Prathosh AP, Ganesh Ramakrishnan

Knowledge distillation (KD) is a technique in which the outputs of a pretrained model, often known as the teacher model, are used to train a student model in a supervised setting. Since the teacher's outputs form a richer distribution over labels, they should improve the student model's performance relative to training with the usual hard labels. However, the label distribution imposed by the logits of the teacher network may not always be informative and may lead to poor student performance. We tackle this problem via an adaptive loss mixing scheme during KD. Specifically, our method learns an instance-specific convex combination of the teacher-matching and label supervision objectives, using meta learning on a validation metric to signal to the student 'how much' of KD is to be used. Through a range of experiments on controlled synthetic data and real-world datasets, we demonstrate performance gains obtained using our approach in the standard KD setting as well as in multi-teacher and self-distillation settings.
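
The instance-specific mixing can be sketched as a convex combination of the two objectives. A minimal sketch, assuming the mixing weight alpha has already been produced by the meta-learned gating (the temperature value and function names are illustrative, not the paper's exact formulation):

```python
import math

def softmax(logits, temperature=1.0):
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Standard hard-label supervision term."""
    return -math.log(probs[label])

def kl_divergence(p, q):
    """Teacher-matching term: KL(teacher || student) on softened outputs."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def alm_kd_loss(student_logits, teacher_logits, label, alpha, temperature=4.0):
    """Instance-specific convex mix of teacher matching and label supervision.
    alpha is assumed to come from meta learning on a validation signal;
    here it is simply passed in."""
    ce = cross_entropy(softmax(student_logits), label)
    kd = kl_divergence(softmax(teacher_logits, temperature),
                       softmax(student_logits, temperature))
    return alpha * kd + (1.0 - alpha) * ce
```

Setting alpha to 0 recovers plain hard-label training and alpha to 1 recovers pure distillation; the method interpolates per instance.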

ScRAE: Deterministic Regularized Autoencoders with Flexible Priors for Clustering Single-cell Gene Expression Data

Jul 16, 2021
Arnab Kumar Mondal, Himanshu Asnani, Parag Singla, Prathosh AP

Clustering single-cell RNA sequencing (scRNA-seq) data poses statistical and computational challenges due to its high dimensionality and data sparsity, the latter also known as 'dropout' events. Recently, Regularized Auto-Encoder (RAE) based deep neural network models have achieved remarkable success in learning robust low-dimensional representations. The basic idea in RAEs is to learn a non-linear mapping from the high-dimensional data space to a low-dimensional latent space and vice-versa, while simultaneously imposing a distributional prior on the latent space, which brings in a regularization effect. This paper argues that RAEs, in their naive formulation, suffer from the well-known bias-variance trade-off: a simple AE without latent regularization over-fits the data, while a very strong prior leads to under-representation and thus poor clustering. To address these issues, we propose a modified RAE framework (called scRAE) for effective clustering of single-cell RNA sequencing data. scRAE consists of a deterministic AE with a flexibly learnable prior generator network, which is jointly trained with the AE. This enables scRAE to trade off better between bias and variance in the latent space. We demonstrate the efficacy of the proposed method through extensive experimentation on several real-world single-cell gene expression datasets.
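
The flexible-prior idea can be caricatured as a penalty pulling each latent code toward the nearest point proposed by the prior generator. A toy sketch in which a fixed list of proposals stands in for the jointly trained generator network (purely illustrative, not the paper's objective):

```python
def flexible_prior_penalty(latents, prior_points):
    """Pull each latent code toward its nearest prior proposal (squared
    distance). Many well-placed proposals regularize lightly (low bias);
    a single rigid proposal regularizes hard (high bias, low variance) —
    the trade-off scRAE tunes by learning the proposals jointly with the AE."""
    penalty = 0.0
    for z in latents:
        penalty += min(sum((a - b) ** 2 for a, b in zip(z, p))
                       for p in prior_points)
    return penalty / len(latents)
```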

* IEEE/ACM Transactions on Computational Biology and Bioinformatics 

Contrastive Semi-Supervised Learning for 2D Medical Image Segmentation

Jul 10, 2021
Prashant Pandey, Ajey Pai, Nisarg Bhatt, Prasenjit Das, Govind Makharia, Prathosh AP, Mausam

Contrastive Learning (CL) is a recent representation learning approach which encourages inter-class separability and intra-class compactness in learned image representations. Since medical images often contain multiple semantic classes within a single image, using CL to learn representations of local features (as opposed to global ones) is important. In this work, we present a novel semi-supervised 2D medical segmentation solution that applies CL on image patches, instead of full images. These patches are meaningfully constructed using the semantic information of different classes, obtained via pseudo-labeling. We also propose a novel consistency regularization (CR) scheme that works in synergy with CL. It addresses the problem of confirmation bias and encourages better clustering in the feature space. We evaluate our method on four public medical segmentation datasets and a novel histopathology dataset that we introduce. Our method obtains consistent improvements over state-of-the-art semi-supervised segmentation approaches on all datasets.
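
The patch-level contrastive objective can be sketched with a generic InfoNCE-style loss over patch embeddings (cosine similarity and the temperature value are standard CL choices assumed here, not necessarily the paper's exact formulation):

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss: pull the anchor patch embedding toward
    its positive (same pseudo-class) and away from negatives (other classes)."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)

    pos = math.exp(cos(anchor, positive) / temperature)
    neg = sum(math.exp(cos(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))
```

The loss is small when the positive pair is aligned and the negatives are not, which is exactly the inter-class separability / intra-class compactness the abstract describes.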

* MICCAI 2021 

Domain Generalization via Inference-time Label-Preserving Target Projections

Mar 01, 2021
Prashant Pandey, Mrigank Raman, Sumanth Varambally, Prathosh AP

Generalizing machine learning models trained on a set of source domains to unseen target domains with different statistics is a challenging problem. While many approaches have been proposed to solve it, they only utilize source data during training and do not take advantage of the fact that a single target example is available at the time of inference. Motivated by this, we propose a method that effectively uses the target sample during inference, beyond mere classification. Our method has three components: (i) a label-preserving feature (metric) transformation on source data such that the source samples are clustered in accordance with their class, irrespective of their domain; (ii) a generative model trained on these features; (iii) a label-preserving projection of the target point onto the source-feature manifold during inference, obtained by solving an optimization problem over the input space of the generative model using the learned metric. Finally, the projected target is fed to the classifier. Since the projected target feature comes from the source manifold and, by design, has the same label as the real target, the classifier is expected to perform better on it than on the true target. We demonstrate that our method outperforms state-of-the-art Domain Generalization methods on multiple datasets and tasks.
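
Component (iii), projecting the target onto the source-feature manifold through the generative model, can be sketched as latent-space optimization. A toy sketch with a numerical gradient and a plain squared-error metric (the paper uses a learned label-preserving metric; the linear generator in the test below is hypothetical):

```python
def project_to_source_manifold(target, generator, dim, steps=500, lr=0.1):
    """Search the generator's latent space for the point whose output is
    nearest the target feature; the generated output is the 'projection'
    of the target onto the source-feature manifold."""
    z = [0.0] * dim
    eps = 1e-4
    for _ in range(steps):
        base = sum((o - t) ** 2 for o, t in zip(generator(z), target))
        # forward-difference numerical gradient of ||G(z) - target||^2
        grad = []
        for i in range(dim):
            z[i] += eps
            pert = sum((o - t) ** 2 for o, t in zip(generator(z), target))
            z[i] -= eps
            grad.append((pert - base) / eps)
        z = [zi - lr * g for zi, g in zip(z, grad)]
    return generator(z)
```

With a generator spanning a plane in 3D, an off-manifold target is driven to its closest on-manifold point, which is then handed to the classifier.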

* CVPR 2021 

A Variational Information Bottleneck Based Method to Compress Sequential Networks for Human Action Recognition

Oct 03, 2020
Ayush Srivastava, Oshin Dutta, Prathosh AP, Sumeet Agarwal, Jigyasa Gupta

In the last few years, compression of deep neural networks has become an important strand of machine learning and computer vision research. Deep models demand sizeable computation and storage when used, for instance, for Human Action Recognition (HAR) from videos, making them unsuitable for deployment on edge devices. In this paper, we address this issue and propose a method to effectively compress Recurrent Neural Networks (RNNs) such as Gated Recurrent Units (GRUs) and Long Short-Term Memory units (LSTMs) that are used for HAR. We use a Variational Information Bottleneck (VIB) theory-based pruning approach to limit the information flow through the sequential cells of RNNs to a small subset. Further, we combine our pruning method with a specific group-lasso regularization technique that significantly improves compression. The proposed techniques reduce model parameters and the memory footprint of latent representations, with little or no reduction in validation accuracy, while increasing inference speed several-fold. We perform experiments on three widely used action recognition datasets, viz. UCF11, HMDB51, and UCF101, to validate our approach. We show that, for action recognition on UCF11, our method achieves over 70 times greater compression than the nearest competitor with comparable accuracy.
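
After training, the VIB-style pruning step amounts to thresholding a learned per-unit noise-to-signal statistic and dropping the corresponding rows of the weight matrix. A minimal sketch (the `log_alpha` name and the threshold of 3.0 follow common VIB-pruning practice but are assumptions here, not taken from the paper):

```python
def prune_hidden_units(weight_rows, log_alpha, threshold=3.0):
    """Keep only hidden units whose VIB noise-to-signal statistic (log_alpha)
    is below the threshold; a high log_alpha means the bottleneck passes
    almost no information through that unit, so its row can be removed."""
    kept = [i for i, a in enumerate(log_alpha) if a < threshold]
    pruned = [weight_rows[i] for i in kept]
    return pruned, kept
```

The compression ratio is then the ratio of original to surviving rows, and the smaller matrix is what speeds up inference on edge devices.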

* Submitted to WACV 

Discrepancy Minimization in Domain Generalization with Generative Nearest Neighbors

Jul 28, 2020
Prashant Pandey, Mrigank Raman, Sumanth Varambally, Prathosh AP

Domain generalization (DG) deals with the problem of domain shift, where a machine learning model trained on multiple source domains fails to generalize well on a target domain with different statistics. Multiple approaches attempt to solve this by learning domain-invariant representations across the source domains, but these fail to guarantee generalization on the shifted target domain. We propose a Generative Nearest Neighbor based Discrepancy Minimization (GNNDM) method which provides a theoretical guarantee that is upper bounded by the error in the labeling process of the target. We employ a Domain Discrepancy Minimization Network (DDMN) that learns domain-agnostic features to produce a single source domain while preserving the class labels of the data points. Features extracted from this source domain are modeled using a generative model whose latent space is used as a sampler to retrieve the nearest neighbors of the target data points. Unlike existing approaches, the proposed method does not require access to domain labels (a more realistic scenario). Empirically, we show the efficacy of our method on two datasets: PACS and VLCS. Through extensive experimentation, we demonstrate that the proposed method outperforms several state-of-the-art DG methods.
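
Using the generative model's latent space as a sampler for nearest-neighbor retrieval can be sketched as: draw latent codes, decode them, and keep the generated feature closest to the target. A toy sketch (the Gaussian latent prior, sample count, and function names are illustrative assumptions):

```python
import random

def nearest_neighbor_via_sampler(target, generator, latent_dim,
                                 n_samples=2000, seed=0):
    """Draw latent codes from a standard Gaussian, decode each through the
    generative model, and return the generated feature nearest (squared
    error) to the target data point."""
    rng = random.Random(seed)
    best, best_dist = None, float("inf")
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        feat = generator(z)
        dist = sum((f - t) ** 2 for f, t in zip(feat, target))
        if dist < best_dist:
            best, best_dist = feat, dist
    return best
```

The retrieved neighbor lies on the learned source-feature manifold by construction, which is what lets the downstream classifier treat it as a surrogate for the target.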

Unsupervised Domain Adaptation for Semantic Segmentation of NIR Images through Generative Latent Search

Jul 17, 2020
Prashant Pandey, Aayush Kumar Tyagi, Sameer Ambekar, Prathosh AP

Segmentation of the pixels corresponding to human skin is an essential first step in multiple applications ranging from surveillance to heart-rate estimation via remote photoplethysmography. However, the existing literature considers the problem only in the visible range of the EM spectrum, which limits its utility in low-light or no-light settings, where the criticality of the application is higher. To alleviate this, we consider the problem of skin segmentation from near-infrared (NIR) images. However, deep learning based state-of-the-art segmentation techniques demand large amounts of labelled data, which is unavailable for this problem. We therefore cast skin segmentation as target-independent Unsupervised Domain Adaptation (UDA), using data from the red channel of visible-range images to develop a skin segmentation algorithm for NIR images. We propose a method for target-independent segmentation in which the 'nearest clone' of a target image in the source domain is searched for and used as a proxy in the segmentation network trained only on the source domain. We prove the existence of the 'nearest clone' and propose a method to find it via an optimization algorithm over the latent space of a deep generative model based on variational inference. We demonstrate the efficacy of the proposed method for NIR skin segmentation over state-of-the-art UDA segmentation methods on two newly created skin segmentation datasets in the NIR domain, despite not having access to target NIR data. Additionally, we report state-of-the-art results for adaptation from Synthia to Cityscapes, a popular setting in unsupervised domain adaptation for semantic segmentation. The code and datasets are available at https://github.com/ambekarsameer96/GLSS.

* ECCV 2020 [Spotlight] 