Gautam Shroff

Graph Neural Networks for Leveraging Industrial Equipment Structure: An application to Remaining Useful Life Estimation

Jun 30, 2020

Jyoti Narwariya, Pankaj Malhotra, Vishnu TV, Lovekesh Vig, Gautam Shroff

Abstract: Automated equipment health monitoring from streaming multi-sensor time-series data can be used to enable condition-based maintenance, avoid sudden catastrophic failures, and ensure high operational availability. We note that most complex machinery has a well-documented and readily accessible underlying structure capturing the inter-dependencies between sub-systems or modules. Deep learning models such as those based on recurrent neural networks (RNNs) or convolutional neural networks (CNNs) fail to explicitly incorporate this potentially rich source of domain knowledge into the learning procedure. In this work, we propose to capture the structure of complex equipment in the form of a graph, and use graph neural networks (GNNs) to model multi-sensor time-series data. Using remaining useful life (RUL) estimation as an application task, we evaluate the advantage of incorporating the graph structure via GNNs on the publicly available turbofan engine benchmark dataset. We observe that the proposed GNN-based RUL estimation model compares favorably to several strong baselines from the literature such as those based on RNNs and CNNs. Additionally, we observe that the learned network is able to focus on the module (node) with impending failure through a simple attention mechanism, potentially paving the way for actionable diagnosis.

* Accepted at AAAI workshop DLGMA'20
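
The abstract above mentions a simple attention mechanism that focuses on the module (node) with impending failure. The sketch below shows one common way such a mechanism can work: dot-product scores passed through a softmax, then a weighted sum of node embeddings. The abstract does not specify the scoring function, so the dot-product form, the module names, the embeddings, and the query vector are all illustrative assumptions, not the paper's model.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(node_embeddings, query):
    """Score each node by its dot product with `query` (an assumed scoring
    function), then return the attention weights and the weighted sum."""
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in node_embeddings]
    weights = softmax(scores)
    pooled = [
        sum(w * h[d] for w, h in zip(weights, node_embeddings))
        for d in range(len(query))
    ]
    return weights, pooled

# Hypothetical modules and per-module embeddings (not from the paper).
modules = ["fan", "compressor", "turbine"]
embeddings = [[0.1, 0.0], [0.2, 0.1], [1.5, 1.2]]
query = [1.0, 1.0]

weights, pooled = attention_pool(embeddings, query)
most_attended = modules[weights.index(max(weights))]
print(most_attended, [round(w, 3) for w in weights])
```

Reading off the largest weight is what would make such a model useful for diagnosis: it names the module that contributed most to the pooled representation.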

MultiMBNN: Matched and Balanced Causal Inference with Neural Networks

Apr 29, 2020

Ankit Sharma, Garima Gupta, Ranjitha Prasad, Arnab Chatterjee, Lovekesh Vig, Gautam Shroff

Abstract: Causal inference (CI) in observational studies has received a lot of attention in healthcare, education, ad attribution, policy evaluation, etc. Confounding is a typical hazard, where the context affects both the treatment assignment and the response. In a multiple-treatment scenario, we propose the neural-network-based MultiMBNN, where we overcome confounding by employing generalized propensity score based matching and learning balanced representations. We benchmark the performance on synthetic and real-world datasets using PEHE (precision in estimation of heterogeneous effect) and mean absolute percentage error over ATE (average treatment effect) as metrics. MultiMBNN outperforms state-of-the-art algorithms for CI such as TARNet and Perfect Match (PM).

* 6 journal pages; to be published in ESANN 2020
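
The two metrics named in the abstract can be sketched as follows. This is a minimal illustration for a single treatment-versus-control comparison, assuming the true individual effects are known (as they are only in synthetic or semi-synthetic benchmarks). The abstract does not say how the metrics are aggregated across multiple treatments, and the numbers are made up for the example.

```python
import math

def root_pehe(true_effects, estimated_effects):
    """Square root of the mean squared error between true and estimated
    individual treatment effects (PEHE is often reported in this form)."""
    n = len(true_effects)
    squared_errors = ((e - t) ** 2 for t, e in zip(true_effects, estimated_effects))
    return math.sqrt(sum(squared_errors) / n)

def average_treatment_effect(effects):
    """ATE: the mean of the individual treatment effects."""
    return sum(effects) / len(effects)

def ate_mape(true_effects, estimated_effects):
    """Absolute percentage error of the estimated ATE against the true ATE."""
    true_ate = average_treatment_effect(true_effects)
    est_ate = average_treatment_effect(estimated_effects)
    return abs(est_ate - true_ate) / abs(true_ate) * 100.0

# Hypothetical individual effects for four units.
true_tau = [1.0, 2.0, 3.0, 2.0]
est_tau = [1.5, 2.5, 3.0, 2.0]

print(root_pehe(true_tau, est_tau))
print(ate_mape(true_tau, est_tau))
```

PEHE penalizes errors on each individual, while the ATE error only compares population averages, so a model can score well on one and poorly on the other.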

Privacy Guidelines for Contact Tracing Applications

Apr 28, 2020

Manish Shukla, Rajan M A, Sachin Lodha, Gautam Shroff, Ramesh Raskar

Abstract: Contact tracing is a very powerful method for implementing and enforcing social distancing to avoid the spread of infectious diseases. The traditional approach to contact tracing is time-consuming, manpower-intensive, dangerous, and prone to error due to fatigue or lack of skill. As a result, mobile-based applications for contact tracing have emerged. These applications primarily utilize a combination of GPS-based absolute location and Bluetooth-based relative location sent from the user's smartphone to infer various insights. These applications have eased the task of contact tracing; however, they also have severe implications for users' privacy, for example mass surveillance, personal information leakage, and the revealing of users' behavioral patterns. This impact on users' privacy leads to a trust deficit in these applications, and hence defeats their purpose. In this work we discuss the various scenarios that a contact tracing application should be able to handle. We highlight the privacy handling of some of the prominent contact tracing applications. Additionally, we describe the various threat actors who can disrupt an application's working, misuse end users' data, or hamper its mass adoption. Finally, we present privacy guidelines for contact tracing applications from different stakeholders' perspectives. To the best of our knowledge, this is the first generic work which provides privacy guidelines for contact tracing applications.

* 10 pages, 0 images

MetaCI: Meta-Learning for Causal Inference in a Heterogeneous Population

Dec 09, 2019

Ankit Sharma, Garima Gupta, Ranjitha Prasad, Arnab Chatterjee, Lovekesh Vig, Gautam Shroff

Abstract: Performing inference on data obtained through observational studies is becoming extremely relevant due to the widespread availability of data in fields such as healthcare, education, retail, etc. Furthermore, this data is accrued from multiple homogeneous subgroups of a heterogeneous population, and hence, generalizing the inference mechanism over such data is essential. We propose the MetaCI framework with the goal of answering counterfactual questions in the context of causal inference (CI), where the factual observations are obtained from several homogeneous subgroups. While the CI network is designed to generalize from the factual to the counterfactual distribution in order to tackle covariate shift, MetaCI employs the meta-learning paradigm to tackle the shift in data distributions between the training and test phases due to the presence of heterogeneity in the population, and due to drifts in the target distribution, also known as concept shift. We benchmark the performance of the MetaCI algorithm using the mean absolute percentage error over the average treatment effect as the metric, and demonstrate that meta initialization yields significant gains over randomly initialized networks and other methods.

* 10 pages, NeurIPS 2019 CausalML workshop

Meta-Learning for Few-Shot Time Series Classification

Sep 25, 2019

Jyoti Narwariya, Pankaj Malhotra, Lovekesh Vig, Gautam Shroff, Vishnu TV

Abstract: Deep neural networks (DNNs) have achieved state-of-the-art results on time series classification (TSC) tasks. In this work, we focus on leveraging DNNs in the often-encountered practical scenario where access to labeled training data is difficult, and where DNNs would be prone to overfitting. We leverage recent advancements in gradient-based meta-learning, and propose an approach to train a residual neural network with convolutional layers as a meta-learning agent for few-shot TSC. The network is trained on a diverse set of few-shot tasks sampled from various domains (e.g. healthcare, activity recognition, etc.) such that it can solve a target task from another domain using only a small number of training samples from the target task. Most existing meta-learning approaches are limited in practice as they assume a fixed number of target classes across tasks. To overcome this limitation and train a common agent across domains, each having a different number of target classes, we utilize a triplet-loss based learning procedure that does not require any constraints to be enforced on the number of classes for the few-shot TSC tasks. To the best of our knowledge, we are the first to use meta-learning based pre-training for TSC. Our approach sets a new benchmark for few-shot TSC, outperforming several strong baselines on few-shot tasks sampled from 41 datasets in the UCR TSC Archive. We observe that pre-training under the meta-learning paradigm allows the network to quickly adapt to new unseen tasks with a small number of labeled instances.
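
To illustrate why a triplet loss places no constraint on the number of classes, here is a minimal sketch of the standard hinge-style triplet loss on embedding vectors. It only compares an anchor with one same-class and one different-class example, so no fixed-size output layer is involved. The Euclidean distance, the margin value, and the example vectors are assumptions; the abstract does not give the exact variant used in the paper.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: zero once the negative is at least
    `margin` farther from the anchor than the positive is."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Hypothetical 2-D embeddings of three time series.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]        # same class as the anchor
easy_negative = [3.0, 4.0]   # different class, already far away
hard_negative = [0.0, 1.5]   # different class, still too close

print(triplet_loss(anchor, positive, easy_negative))
print(triplet_loss(anchor, positive, hard_negative))
```

Only the hard negative produces a non-zero loss, which is the signal that pushes embeddings of different classes apart during training.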

NISER: Normalized Item and Session Representations with Graph Neural Networks

Sep 13, 2019

Priyanka Gupta, Diksha Garg, Pankaj Malhotra, Lovekesh Vig, Gautam Shroff

Abstract: The goal of session-based recommendation (SR) models is to utilize the information from past actions (e.g. item/product clicks) in a session to recommend items that a user is likely to click next. Recently it has been shown that the sequence of item interactions in a session can be modeled as graph-structured data to better account for complex item transitions. Graph neural networks (GNNs) can learn useful representations for such session-graphs, and have been shown to improve over sequential models such as recurrent neural networks [14]. However, we note that these GNN-based recommendation models suffer from popularity bias: the models are biased towards recommending popular items, and fail to recommend relevant long-tail items (less popular or less frequent items). Therefore, these models perform poorly for the less popular new items arriving daily in a practical online setting. We demonstrate that this issue is, in part, related to the magnitude or norm of the learned item and session-graph representations (embedding vectors). We propose a training procedure that mitigates this issue by using normalized representations. The models using normalized item and session-graph representations perform significantly better: i. for the less popular long-tail items in the offline setting, and ii. for the less popular newly introduced items in the online setting. Furthermore, our approach significantly improves upon the existing state of the art on three benchmark datasets.
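
The link between embedding norm and popularity bias can be seen in a small sketch. With raw dot-product scoring, an item with a large-norm embedding can outrank a better-aligned item; after L2 normalization the score depends only on direction (cosine similarity). The `scale` factor applied before the softmax, the item names, and the vectors are illustrative assumptions, not values from the paper.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalized_scores(session_vec, item_vecs, scale=10.0):
    """Softmax over scaled cosine similarities between session and items.
    `scale` is an assumed temperature-like factor."""
    s = l2_normalize(session_vec)
    logits = [scale * dot(s, l2_normalize(v)) for v in item_vecs]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical embeddings: the popular item has a much larger norm.
session = [1.0, 0.0]
items = {"popular": [5.0, 5.0], "long_tail": [0.3, 0.0]}

raw = {name: dot(session, vec) for name, vec in items.items()}
probs = dict(zip(items, normalized_scores(session, list(items.values()))))
print(raw)
print(probs)
```

Here the raw scores favor the popular item purely because of its norm, while the normalized scores favor the long-tail item whose direction matches the session.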

Meta-Learning for Black-box Optimization

Jul 16, 2019

Vishnu TV, Pankaj Malhotra, Jyoti Narwariya, Lovekesh Vig, Gautam Shroff

Abstract: Recently, neural networks trained as optimizers under the "learning to learn" or meta-learning framework have been shown to be effective for a broad range of optimization tasks including derivative-free black-box function optimization. Recurrent neural networks (RNNs) trained to optimize a diverse set of synthetic non-convex differentiable functions via gradient descent have been effective at optimizing derivative-free black-box functions. In this work, we propose RNN-Opt: an approach for learning RNN-based optimizers for optimizing real-parameter single-objective continuous functions under limited budget constraints. Existing approaches utilize an observed improvement based meta-learning loss function for training such models. We propose training RNN-Opt by using synthetic non-convex functions with known (approximate) optimal values by directly using discounted regret as our meta-learning loss function. We hypothesize that a regret-based loss function mimics typical testing scenarios, and would therefore lead to better optimizers compared to optimizers trained only to propose queries that improve over previous queries. Further, RNN-Opt incorporates simple yet effective enhancements during training and inference procedures to deal with the following practical challenges: i) Unknown range of possible values for the black-box function to be optimized, and ii) Practical and domain-knowledge based constraints on the input parameters. We demonstrate the efficacy of RNN-Opt in comparison to existing methods on several synthetic as well as standard benchmark black-box functions along with an anonymized industrial constrained optimization problem.
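
A regret-based loss compares the best value found so far with the known optimum at every step of the query budget. The sketch below shows one plausible form for a maximization problem. The discount schedule (later steps weighted more heavily via `discount ** (horizon - t)`), the default discount value, and the example query values are assumptions made for illustration; the abstract does not give the exact formula used by RNN-Opt.

```python
def discounted_regret(observed_values, optimal_value, discount=0.9):
    """Sum over steps of discounted regret, where the regret at step t is the
    gap between the known optimum and the best value observed up to t.
    The weighting scheme is an assumed one, chosen for illustration."""
    horizon = len(observed_values)
    best_so_far = float("-inf")
    loss = 0.0
    for t, value in enumerate(observed_values, start=1):
        best_so_far = max(best_so_far, value)
        regret = max(0.0, optimal_value - best_so_far)
        loss += discount ** (horizon - t) * regret
    return loss

# Hypothetical function values returned by four successive queries.
slow_run = [0.2, 0.5, 0.4, 0.9]
fast_run = [0.9, 0.9, 0.9, 0.9]

print(discounted_regret(slow_run, optimal_value=1.0, discount=0.5))
print(discounted_regret(fast_run, optimal_value=1.0, discount=0.5))
```

Unlike a loss that only rewards improving on the previous query, this quantity stays positive until the optimizer actually gets close to the optimum, which is the behavior the abstract argues for.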

One-shot Information Extraction from Document Images using Neuro-Deductive Program Synthesis

Jun 06, 2019

Vishal Sunder, Ashwin Srinivasan, Lovekesh Vig, Gautam Shroff, Rohit Rahul

Abstract: Our interest in this paper is in meeting a rapidly growing industrial demand for information extraction from images of documents such as invoices, bills, receipts, etc. In practice, users are able to provide only a very small number of example images labeled with the information that needs to be extracted. We adopt a novel two-level neuro-deductive approach where (a) we use pre-trained deep neural networks to populate a relational database with facts about each document-image; and (b) we use a form of deductive reasoning, related to meta-interpretive learning of transition systems, to learn extraction programs: given task-specific transitions defined using the entities and relations identified by the neural detectors and a small number of instances (usually 1, sometimes 2) of images and the desired outputs, a resource-bounded meta-interpreter constructs proofs for the instance(s) via logical deduction; a set of logic programs that extract each desired entity is easily synthesized from such proofs. In most cases a single training example together with a noisy clone of itself suffices to learn a program-set that generalizes well on test documents, at which time the value of each entity is determined by a majority vote across its program-set. We demonstrate our two-level neuro-deductive approach on publicly available datasets ("Patent" and "Doctor's Bills") and also describe its use in a real-life industrial problem.

* 11 pages, appears in the 13th International Workshop on Neural-Symbolic Learning and Reasoning at IJCAI 2019
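
The majority vote across a program-set, described in the abstract, can be sketched in a few lines. Each synthesized program is represented here simply by the value it extracted (or `None` if it failed on the document). The example values and the tie-breaking behavior (first value encountered wins) are assumptions for illustration, not details from the paper.

```python
from collections import Counter

def majority_vote(program_outputs):
    """Return the most frequent non-None value among the outputs of the
    programs in a program-set, or None if every program failed."""
    votes = Counter(value for value in program_outputs if value is not None)
    if not votes:
        return None
    value, _count = votes.most_common(1)[0]
    return value

# Hypothetical outputs of four extraction programs for an invoice number.
outputs = ["INV-001", "INV-001", "1NV-001", None]
print(majority_vote(outputs))
```

Voting makes the final answer robust to individual programs that fail or latch onto the wrong region of a test document.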

Fast Online "Next Best Offers" using Deep Learning

May 31, 2019

Rekha Singhal, Gautam Shroff, Mukund Kumar, Sharod Roy, Sanket Kadarkar, Rupinder Virk, Siddharth Verma, Vartika Tiwari

Abstract: In this paper, we present iPrescribe, a scalable low-latency architecture for recommending 'next-best-offers' in an online setting. The paper presents the design of iPrescribe and compares its performance for implementations using different real-time streaming technology stacks. iPrescribe uses an ensemble of deep learning and machine learning algorithms for prediction. We describe the scalable real-time streaming technology stack and optimized machine-learning implementations to achieve a 90th percentile recommendation latency of 38 milliseconds. Optimizations include a novel mechanism to deploy recurrent Long Short Term Memory (LSTM) deep learning networks efficiently.

* 7 Pages, Accepted in COMAD-CODS 2019
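
For readers unfamiliar with the latency metric quoted in the abstract, a 90th percentile latency is the value below which 90% of requests complete. The sketch below computes it with the nearest-rank method. The choice of method and the sample latencies are assumptions; the abstract does not say how the percentile was measured.

```python
import math

def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    `pct` percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical per-request latencies in milliseconds (not measured data).
latencies_ms = [12, 15, 18, 20, 22, 25, 30, 35, 40, 120]

p90 = percentile_nearest_rank(latencies_ms, 90)
mean = sum(latencies_ms) / len(latencies_ms)
print(p90, mean)
```

A tail percentile is usually preferred over the mean for online systems because a single slow outlier (120 ms here) shifts the mean but leaves the 90th percentile unchanged.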

ConvTimeNet: A Pre-trained Deep Convolutional Neural Network for Time Series Classification

May 02, 2019

Kathan Kashiparekh, Jyoti Narwariya, Pankaj Malhotra, Lovekesh Vig, Gautam Shroff

Abstract: Training deep neural networks often requires careful hyper-parameter tuning and significant computational resources. In this paper, we propose ConvTimeNet (CTN): an off-the-shelf deep convolutional neural network (CNN) trained on diverse univariate time series classification (TSC) source tasks. Once trained, CTN can be easily adapted to new TSC target tasks via a small amount of fine-tuning using labeled instances from the target tasks. We note that the length of convolutional filters is a key aspect when building a pre-trained model that can generalize to time series of different lengths across datasets. To achieve this, we incorporate filters of multiple lengths in all convolutional layers of CTN to capture temporal features at multiple time scales. We consider all 65 datasets with time series of lengths up to 512 points from the UCR TSC Benchmark for training and testing transferability of CTN: we train CTN on a randomly chosen subset of 24 datasets using a multi-head approach with a different softmax layer for each training dataset, and study generalizability and transferability of the learned filters on the remaining 41 TSC datasets. We observe significant gains in classification accuracy as well as computational efficiency when using pre-trained CTN as a starting point for subsequent task-specific fine-tuning compared to existing state-of-the-art TSC approaches. We also provide qualitative insights into the working of CTN by: i) analyzing the activations and filters of the first convolution layer, suggesting the filters in CTN are generically useful, ii) analyzing the impact of the design decision to incorporate filters of multiple lengths, and iii) finding regions of time series that affect the final classification decision via occlusion sensitivity analysis.

* Accepted at IJCNN 2019
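
The idea of using filters of multiple lengths in one convolutional layer can be sketched with plain 1-D convolutions. Each filter is applied with zero padding so that its output has the same length as the input, which lets the feature maps from short and long filters be stacked together. The filter lengths (2, 3, 5), the fixed averaging weights, and the input series are assumptions for illustration; in CTN the filters are learned and the abstract does not list their lengths.

```python
def conv1d_same(series, kernel):
    """Slide `kernel` over `series` (cross-correlation) with zero padding
    so the output has the same length as the input."""
    k = len(kernel)
    left = (k - 1) // 2
    padded = [0.0] * left + list(series) + [0.0] * (k - 1 - left)
    return [
        sum(padded[i + j] * kernel[j] for j in range(k))
        for i in range(len(series))
    ]

def multi_scale_features(series, kernels):
    """Apply every kernel to the same input and stack the feature maps."""
    return [conv1d_same(series, kernel) for kernel in kernels]

# Hypothetical input and averaging filters of three different lengths.
series = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
kernels = [[0.5, 0.5], [1 / 3] * 3, [0.2] * 5]

features = multi_scale_features(series, kernels)
for kernel, feature_map in zip(kernels, features):
    print(len(kernel), [round(x, 2) for x in feature_map])
```

Short filters respond to local changes while longer ones summarize wider windows, which is how a single layer can capture temporal features at several time scales.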
