Prashant Gupta

Enhash: A Fast Streaming Algorithm For Concept Drift Detection

Nov 07, 2020
Aashi Jindal, Prashant Gupta, Debarka Sengupta, Jayadeva


We propose Enhash, a fast ensemble learner that detects concept drift in a data stream. A stream may contain abrupt, gradual, virtual, or recurring drift, or a mixture of several drift types. Enhash employs a projection hash to insert each incoming sample. We show empirically that the proposed method is competitive with existing ensemble learners while taking much less time, and that it has moderate resource requirements. Performance comparisons were carried out on 6 artificial and 4 real data sets containing various types of drift.
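The abstract's key ingredient is a projection hash used to insert each incoming sample into the ensemble. The paper's exact construction is not reproduced here; the minimal sketch below only illustrates the general idea such methods build on. The names (ProjectionHasher, DriftMonitor), the sign-of-random-projection hash, and the two-window total-variation test are illustrative assumptions, not Enhash itself.

```python
import numpy as np
from collections import deque

class ProjectionHasher:
    """Sign-of-random-projection hash: maps a d-dimensional sample to one of 2**n_bits buckets."""
    def __init__(self, dim, n_bits=6, seed=0):
        rng = np.random.default_rng(seed)
        self.planes = rng.standard_normal((n_bits, dim))    # random hyperplanes
        self.n_buckets = 2 ** n_bits

    def bucket(self, x):
        bits = (self.planes @ np.asarray(x)) > 0            # which side of each hyperplane
        return int(bits.astype(int) @ (1 << np.arange(bits.size)))  # binary code -> bucket id


class DriftMonitor:
    """Compares bucket frequencies in a reference window and a recent window;
    flags drift when the two empirical distributions diverge."""
    def __init__(self, hasher, window=200, threshold=0.3):
        self.h, self.window, self.threshold = hasher, window, threshold
        self.ref = deque(maxlen=window)
        self.recent = deque(maxlen=window)

    def update(self, x):
        b = self.h.bucket(x)
        if len(self.ref) < self.window:
            self.ref.append(b)                               # fill the reference window first
            return False
        self.recent.append(b)
        if len(self.recent) < self.window:
            return False
        p = np.bincount(list(self.ref), minlength=self.h.n_buckets) / self.window
        q = np.bincount(list(self.recent), minlength=self.h.n_buckets) / self.window
        return bool(0.5 * np.abs(p - q).sum() > self.threshold)  # total-variation distance test
```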


A Weighted Mutual k-Nearest Neighbour for Classification Mining

May 14, 2020
Joydip Dhar, Ashaya Shukla, Mukul Kumar, Prashant Gupta


kNN is a very effective instance-based learning method and is easy to implement. Because data are heterogeneous, noise from many possible sources is widespread, especially in large-scale databases. To eliminate noise and the effect of pseudo neighbours, we propose a new learning algorithm that detects anomalies and removes pseudo neighbours from the dataset, yielding comparatively better results. The algorithm also minimizes the influence of distant neighbours. A certainty measure is introduced for the experimental evaluation. The advantage of combining mutual neighbours with distance-weighted voting is that the dataset is refined after anomaly removal, and the weighting gives closer neighbours greater influence on the vote. Finally, the performance of the proposed algorithm is evaluated.

* 5 pages, 1 figure, 5 tables 
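As a rough illustration of the two ideas named in the abstract, mutual neighbours and distance-weighted voting, here is a minimal sketch. The function name, the fallback to plain kNN when no mutual neighbour survives, and the omission of the anomaly-removal and certainty-measure steps are assumptions, not the authors' exact algorithm.

```python
import numpy as np
from collections import defaultdict

def weighted_mutual_knn_predict(X_train, y_train, x_query, k=5, eps=1e-9):
    """Classify x_query by distance-weighted voting over *mutual* k-nearest neighbours.

    X_train: (n, d) array with n > k; y_train: length-n label array.
    A training point is kept only if the query would also rank among that
    point's k nearest neighbours; non-mutual ("pseudo") neighbours are discarded.
    """
    d_q = np.linalg.norm(X_train - x_query, axis=1)
    knn_q = np.argsort(d_q)[:k]                      # query's k nearest training points

    votes = defaultdict(float)
    for i in knn_q:
        d_i = np.linalg.norm(X_train - X_train[i], axis=1)
        d_i[i] = np.inf                              # exclude the point itself
        d_iq = np.linalg.norm(X_train[i] - x_query)
        # i is a mutual neighbour if the query beats i's k-th closest training point
        if d_iq <= np.sort(d_i)[k - 1]:
            votes[y_train[i]] += 1.0 / (d_q[i] + eps)    # closer neighbours weigh more

    if not votes:                                    # no mutual neighbours: fall back to plain kNN
        for i in knn_q:
            votes[y_train[i]] += 1.0 / (d_q[i] + eps)
    return max(votes, key=votes.get)
```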

Guided Random Forest and its application to data approximation

Sep 02, 2019
Prashant Gupta, Aashi Jindal, Jayadeva, Debarka Sengupta


We present a new way of constructing an ensemble classifier, named the Guided Random Forest (GRAF) in the sequel. GRAF extends the idea of building oblique decision trees with localized partitioning to obtain a global partitioning. We show that global partitioning bridges the gap between decision trees and boosting algorithms, and we empirically demonstrate that it reduces the generalization error bound. Experiments on 115 benchmark datasets show that GRAF yields comparable or better results on a majority of them. We also present a new way of approximating datasets in the framework of random forests.
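GRAF is built from oblique decision trees, i.e. trees that split on hyperplanes rather than on single features. The sketch below shows only that basic building block: picking an oblique split from random directions with a Gini criterion. The random-direction search and the impurity criterion are illustrative assumptions; GRAF's guided, globally coordinated partitioning is not reproduced here.

```python
import numpy as np

def oblique_split(X, y, n_trials=20, rng=None):
    """Choose an oblique split w.x <= t for numpy arrays X (n, d) and labels y (n,)
    by sampling random directions and keeping the one with lowest weighted Gini impurity."""
    if rng is None:
        rng = np.random.default_rng(0)

    def gini(labels):
        if labels.size == 0:
            return 0.0
        _, counts = np.unique(labels, return_counts=True)
        p = counts / labels.size
        return 1.0 - np.sum(p ** 2)

    best = None
    for _ in range(n_trials):
        w = rng.standard_normal(X.shape[1])          # random projection direction
        proj = X @ w
        t = rng.uniform(proj.min(), proj.max())      # random threshold along that direction
        left = proj <= t
        score = left.mean() * gini(y[left]) + (~left).mean() * gini(y[~left])
        if best is None or score < best[0]:
            best = (score, w, t)
    return best[1], best[2]                          # hyperplane normal and threshold
```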


Continuous Toolpath Planning in Additive Manufacturing

Aug 19, 2019
Prashant Gupta, Bala Krishnamoorthy


We develop a framework that creates a new polygonal mesh representation of the 3D domain of a layer-by-layer 3D printing job, on which we identify single, continuous tool paths covering each connected piece of the domain in every layer. We present a tool path algorithm that traverses each such continuous tool path with no crossovers. The key construction at the heart of our framework is a novel Euler transformation that we introduced recently in a separate manuscript. The Euler transformation converts a 2-dimensional cell complex K into a new 2-complex K^ such that every vertex in the 1-skeleton G^ of K^ has degree 4. Hence G^ is Eulerian, and an Eulerian tour can be followed to print all edges in a continuous fashion without stops. We start with a mesh K of the union of polygons obtained by projecting all layers to the plane, and first compute its Euler transformation K^. In the slicing step, we clip K^ at each layer i using its polygon to obtain K^_i. We then patch K^_i by adding edges so that any odd-degree nodes created by slicing regain even degree. In place of any segments left out, we print extra support edges to ensure there are no unsupported edges in the next layer above; these support edges maintain the Euler nature of K^_i. Finally, we describe a tree-based search algorithm that builds the continuous tool path by traversing "concentric" cycles in the Euler complex. Our algorithm produces a tool path that avoids material collisions and crossovers, and can be printed continuously irrespective of the complex geometry or topology of the domain (e.g., holes).

* A couple sections from arXiv:1812.02412 are included here for the sake of completeness 
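Because every vertex of the transformed 1-skeleton has degree 4, the graph is Eulerian, so a closed walk covering every edge exactly once exists and can be printed without lifting the tool. The sketch below uses Hierholzer's textbook algorithm to extract such a walk; it is not the paper's tree-based search over "concentric" cycles, and the edge-list input format is an assumption.

```python
from collections import defaultdict

def eulerian_circuit(edges, start):
    """Hierholzer's algorithm: return a closed walk that uses every edge exactly once.

    Assumes the edge list describes a connected graph whose vertices all have
    even degree (e.g. the degree-4 1-skeleton produced by the Euler
    transformation), which guarantees such a circuit exists.
    """
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    circuit, stack = [], [start]
    while stack:
        v = stack[-1]
        if adj[v]:
            u = adj[v].pop()              # follow an unused edge v-u
            adj[u].remove(v)              # ...and mark it used from the other side
            stack.append(u)
        else:
            circuit.append(stack.pop())   # dead end: back up and record the vertex
    return circuit[::-1]                  # tour vertices in printing order

# 4-regular toy example: the complete graph K5 (every vertex has degree 4).
# tour = eulerian_circuit([(0,1),(1,2),(2,0),(0,3),(3,4),(4,0),(1,3),(3,2),(2,4),(4,1)], 0)
```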

Pentagon at MEDIQA 2019: Multi-task Learning for Filtering and Re-ranking Answers using Language Inference and Question Entailment

Jul 01, 2019
Hemant Pugaliya, Karan Saxena, Shefali Garg, Sheetal Shalini, Prashant Gupta, Eric Nyberg, Teruko Mitamura


Parallel deep learning architectures like fine-tuned BERT and MT-DNN have quickly become the state of the art, surpassing previous deep and shallow learning methods by a large margin. More recently, models pre-trained on large related datasets have performed well on many downstream tasks after merely fine-tuning on domain-specific datasets. However, using such powerful models on non-trivial tasks, such as ranking and large-document classification, remains a challenge due to the input size limitations of parallel architectures and extremely small datasets that are insufficient for fine-tuning. In this work, we introduce an end-to-end system, trained in a multi-task setting, to filter and re-rank answers in the medical domain. We use task-specific pre-trained models as deep feature extractors. Our model achieves the highest Spearman's Rho of 0.338 and Mean Reciprocal Rank of 0.9622 on the ACL-BioNLP MEDIQA Question Answering shared task.
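The two reported numbers are standard ranking metrics. Below is a minimal sketch of how Spearman's Rho and Mean Reciprocal Rank are typically computed for re-ranked answers; the function names and the per-question relevance-list format are assumptions, not the official shared-task evaluation script.

```python
import numpy as np
from scipy.stats import spearmanr

def mean_reciprocal_rank(ranked_relevance):
    """MRR over questions: ranked_relevance[q] is the list of 0/1 relevance labels
    of that question's answers, in the order the system ranked them."""
    rr = []
    for labels in ranked_relevance:
        hits = np.flatnonzero(labels)                 # positions of relevant answers
        rr.append(1.0 / (hits[0] + 1) if hits.size else 0.0)
    return float(np.mean(rr))

def spearman_rho(predicted_scores, reference_scores):
    """Rank correlation between the system's answer scores and the reference scores."""
    rho, _ = spearmanr(predicted_scores, reference_scores)
    return rho

# e.g. mean_reciprocal_rank([[0, 1, 0], [1, 0]]) == (1/2 + 1/1) / 2 == 0.75
```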
