Researchers around the world are showing keen interest in data mining. In particular, mining image data [13] is an essential capability, since image data plays a vital role in many domains: marketing in business, surgery in hospitals, construction in engineering, publication on the Web, and so on. A related area within image mining is Content-Based Image Retrieval (CBIR), which performs retrieval based on similarity defined over extracted features and is therefore more objective. A drawback of CBIR is that only the features of the query image are considered. Hence, a new technique, image retrieval based on optimum clusters, is proposed to improve user interaction with image retrieval systems by fully exploiting the similarity information. The index is created by describing the images according to their color characteristics, using compact feature vectors that represent typical color distributions [12].
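As a rough illustration of a compact color feature vector, the sketch below quantizes RGB pixels into a small normalized histogram and compares two histograms by intersection. The bin count, the function names, and the use of histogram intersection as the similarity measure are assumptions made for this example; the abstract does not specify them, and the clustering step itself is not shown.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Return a normalized, quantized RGB histogram (hypothetical feature vector)."""
    if not pixels:
        raise ValueError("image has no pixels")
    n = bins_per_channel
    hist = [0.0] * (n ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..n-1.
        rb, gb, bb = r * n // 256, g * n // 256, b * n // 256
        hist[(rb * n + gb) * n + bb] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist]


def histogram_intersection(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity in [0, 1]; 1.0 means identical quantized color distributions."""
    return sum(min(x, y) for x, y in zip(a, b))
```

With the default of 4 bins per channel, every image is described by 64 numbers regardless of its size, which is what makes such vectors cheap to index and to group into clusters.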
We consider the problem of accurately and efficiently querying a remote server to retrieve information about images captured by a mobile device. In addition to reduced transmission overhead and computational complexity, the retrieval protocol should be robust to variations in the image acquisition process, such as translation, rotation, scaling, and sensor-related differences. We propose to extract scale-invariant image features and then perform clustering to reduce the number of features needed for image matching. Principal Component Analysis (PCA) and Non-negative Matrix Factorization (NMF) are investigated as candidate clustering approaches. The image matching complexity at the database server is quadratic in the (small) number of clusters, not in the (very large) number of image features. We employ an image-dependent information content metric to approximate the model order, i.e., the number of clusters, needed for accurate matching, which is preferable to setting the model order using trial and error. We show how to combine the hypotheses provided by PCA and NMF factor loadings, thereby obtaining more accurate retrieval than using either approach alone. In experiments on a database of urban images, we obtain a top-1 retrieval accuracy of 89% and a top-3 accuracy of 92.5%.
This work proposes a novel SLAM framework for stereo and visual inertial odometry estimation. It builds an efficient and robust parametrization of co-planar points and lines which leverages specific geometric constraints to improve camera pose optimization in terms of both efficiency and accuracy. The pipeline consists of extracting 2D points and lines, predicting planar regions and filtering the outliers via RANSAC. Our parametrization scheme then represents co-planar points and lines as their 2D image coordinates and parameters of planes. We demonstrate the effectiveness of the proposed method by comparing it to traditional parametrizations in a novel Monte-Carlo simulation set. Further, the whole stereo SLAM and VIO system is compared with state-of-the-art methods on the public real-world dataset EuRoC. Our method shows better results in terms of accuracy and efficiency than the state-of-the-art. The code is released at https://github.com/LiXin97/Co-Planar-Parametrization.
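To see why 2D image coordinates plus plane parameters can stand in for a 3D point, the sketch below back-projects a pixel onto a known plane. It assumes a pinhole camera with intrinsics fx, fy, cx, cy and a plane written as normal · X + offset = 0 in the camera frame; these conventions and the function name are illustrative assumptions, not the formulation used in the released code.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def backproject_to_plane(
    u: float, v: float,
    fx: float, fy: float, cx: float, cy: float,
    normal: Vec3, offset: float,
) -> Vec3:
    """Intersect the viewing ray of pixel (u, v) with the plane normal . X + offset = 0."""
    # Viewing ray through the pixel, scaled so that its z component is 1.
    ray = ((u - cx) / fx, (v - cy) / fy, 1.0)
    denom = sum(n * r for n, r in zip(normal, ray))
    if abs(denom) < 1e-12:
        raise ValueError("viewing ray is parallel to the plane")
    depth = -offset / denom
    if depth <= 0.0:
        raise ValueError("intersection lies behind the camera")
    return (depth * ray[0], depth * ray[1], depth * ray[2])
```

Under these assumptions, every point observed on the same plane shares the plane's parameters, so the optimizer does not need a separate 3D position for each point.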
Within the domain of Computational Fluid Dynamics, Direct Numerical Simulation (DNS) is used to obtain highly accurate numerical solutions for fluid flows. However, this approach to numerically solving the Navier-Stokes equations is extremely computationally expensive, mostly because it requires greatly refined grids. Large Eddy Simulation (LES) presents a more computationally efficient approach for solving fluid flows on lower-resolution (LR) grids but results in an overall reduction in solution fidelity. In this paper, we introduce a novel deep learning framework, SR-DNS Net, which aims to mitigate this inherent trade-off between solution fidelity and computational complexity by leveraging deep learning techniques used in image super-resolution. Using our model, we aim to learn the mapping from a coarser LR solution to a refined high-resolution (HR) DNS solution so as to eliminate the need for DNS simulations on highly refined grids. Our model efficiently reconstructs the high-fidelity DNS data from the LES-like low-resolution solutions while yielding good reconstruction metrics. Thus, our implementation improves the solution accuracy of LR solutions while incurring only the marginal increase in computational cost required to deploy the trained deep learning model.
We propose a novel solution for the cashier problem. Current cashier systems and Point of Sale (POS) terminals can be inefficient, cumbersome and time-consuming for users. There is a need for a solution built on modern technology and ubiquitous computing resources. We present I-POST (Intelligent Point of Sale and Transaction), a software system that uses smart devices, mobile phones and state-of-the-art machine learning algorithms to process user transactions in an automated, real-time manner. I-POST is an automated checkout system that allows users to walk into a store, collect their items and exit the store. There is no need to stand and wait in a queue. The system uses object detection and facial recognition algorithms to authenticate the client and determine the state of each object. At the point of exit, the classifier sends the data to the backend server, which executes the payment. The system uses a Convolutional Neural Network (CNN) for image recognition and processing. A CNN is a supervised learning model that has found major application in pattern recognition problems. The current implementation uses two classifiers that work together to authenticate the user and track the items. The model accuracy for object recognition is 97%, with a loss of 9.3%. We expect that such systems can bring efficiency to the market and have the potential for broad and diverse applications.
Multi-spectral satellite imaging sensors acquire images in various spectral bands such as red (R), green (G), blue (B), and near-infrared (N). Thanks to the unique spectroscopic property of each spectral band with respect to the objects on the ground, multi-spectral satellite imagery can be used for various geological survey applications. Unfortunately, image artifacts caused by imaging sensor noise often degrade scene quality and negatively affect applications of satellite imagery. Recently, deep learning approaches have been extensively explored for the removal of noise in satellite imagery. Most deep learning denoising methods, however, follow a supervised learning scheme, which requires matched pairs of noisy and clean images that are difficult to collect in real situations. In this paper, we propose a novel unsupervised multispectral denoising method for satellite imagery using a wavelet subband cycle-consistent adversarial network (WavCycleGAN). The proposed method is based on an unsupervised learning scheme that uses adversarial loss and cycle-consistency loss to overcome the lack of paired data. Moreover, in contrast to the standard image-domain cycleGAN, we introduce a wavelet subband domain learning scheme for effective denoising without sacrificing high-frequency components such as edges and detail information. Experimental results for the removal of vertical stripe and wave noise in satellite imaging sensors demonstrate that the proposed method effectively removes noise and preserves important high-frequency features of satellite images.
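To make the idea of wavelet subbands concrete, the sketch below performs a one-level 2D Haar decomposition and its inverse on a small grayscale image stored as nested lists. The choice of the Haar wavelet, the subband names, and the function names are assumptions for illustration only; the abstract does not state which wavelet or how many levels WavCycleGAN uses, and no network or loss is shown.

```python
from typing import Dict, List

Image = List[List[float]]
BAND_NAMES = ("approx", "row_diff", "col_diff", "diag")


def haar_decompose(image: Image) -> Dict[str, Image]:
    """One-level orthonormal 2D Haar transform over non-overlapping 2x2 blocks."""
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    bands: Dict[str, Image] = {name: [] for name in BAND_NAMES}
    for i in range(0, rows, 2):
        line: Dict[str, List[float]] = {name: [] for name in BAND_NAMES}
        for j in range(0, cols, 2):
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            line["approx"].append((a + b + c + d) / 2)    # low-frequency content
            line["row_diff"].append((a + b - c - d) / 2)  # change between rows
            line["col_diff"].append((a - b + c - d) / 2)  # change between columns
            line["diag"].append((a - b - c + d) / 2)      # diagonal detail
        for name in BAND_NAMES:
            bands[name].append(line[name])
    return bands


def haar_reconstruct(bands: Dict[str, Image]) -> Image:
    """Invert haar_decompose exactly (up to floating-point rounding)."""
    image: Image = []
    for i in range(len(bands["approx"])):
        top: List[float] = []
        bottom: List[float] = []
        for j in range(len(bands["approx"][i])):
            ll, rd = bands["approx"][i][j], bands["row_diff"][i][j]
            cd, dd = bands["col_diff"][i][j], bands["diag"][i][j]
            top += [(ll + rd + cd + dd) / 2, (ll + rd - cd - dd) / 2]
            bottom += [(ll - rd + cd - dd) / 2, (ll - rd - cd + dd) / 2]
        image += [top, bottom]
    return image
```

In this toy setting, a pattern of alternating bright and dark columns puts all of its detail energy in the `col_diff` band, which hints at why vertical stripe noise can be handled in the subband domain while the other bands are left largely untouched.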
The connection between visual input and tactile sensing is critical for object manipulation tasks such as grasping and pushing. In this work, we introduce the challenging task of estimating a set of tactile physical properties from visual information. We aim to build a model that learns the complex mapping between visual information and tactile physical properties. We construct a first of its kind image-tactile dataset with over 400 multiview image sequences and the corresponding tactile properties. A total of fifteen tactile physical properties across categories including friction, compliance, adhesion, texture, and thermal conductance are measured and then estimated by our models. We develop a cross-modal framework comprised of an adversarial objective and a novel visuo-tactile joint classification loss. Additionally, we develop a neural architecture search framework capable of selecting optimal combinations of viewing angles for estimating a given physical property.
Image classification has advanced significantly in recent years with the availability of large-scale image sets. However, fine-grained classification remains a major challenge due to the annotation cost of large numbers of fine-grained categories. This project shows that compelling classification performance can be achieved on such categories even without labeled training data. Given image and class embeddings, we learn a compatibility function such that matching embeddings are assigned a higher score than mismatching ones; zero-shot classification of an image proceeds by finding the label yielding the highest joint compatibility score. We use state-of-the-art image features and focus on different supervised attributes and unsupervised output embeddings either derived from hierarchies or learned from unlabeled text corpora. We establish a substantially improved state-of-the-art on the Animals with Attributes and Caltech-UCSD Birds datasets. Most encouragingly, we demonstrate that purely unsupervised output embeddings (learned from Wikipedia and improved with fine-grained text) achieve compelling results, even outperforming the previous supervised state-of-the-art. By combining different output embeddings, we further improve results.
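The zero-shot prediction rule described above can be sketched as follows. The bilinear form of the compatibility function, the matrix name `W`, and the toy embeddings are assumptions chosen for illustration; the abstract states only that a compatibility function is learned and that the label with the highest score is selected. Learning `W` is not shown.

```python
from typing import Dict, List, Sequence

Matrix = List[List[float]]


def compatibility(image_emb: Sequence[float], W: Matrix, class_emb: Sequence[float]) -> float:
    """Assumed bilinear score: image_emb^T W class_emb."""
    if len(W) != len(image_emb) or any(len(row) != len(class_emb) for row in W):
        raise ValueError("W must have shape (len(image_emb), len(class_emb))")
    return sum(
        image_emb[i] * W[i][j] * class_emb[j]
        for i in range(len(image_emb))
        for j in range(len(class_emb))
    )


def predict_label(image_emb: Sequence[float], W: Matrix, class_embs: Dict[str, Sequence[float]]) -> str:
    """Return the label whose output embedding is most compatible with the image."""
    return max(class_embs, key=lambda label: compatibility(image_emb, W, class_embs[label]))
```

Because a class enters only through its output embedding, a class that never appeared in training can be scored as soon as an embedding for it is available, for example from attributes or from text.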
Numerous models of quantum neural networks have been applied to varied problems such as image classification and pattern recognition. Quantum-inspired algorithms have been relevant for quite a while. More recently, in the NISQ era, hybrid quantum-classical models have shown promising results. Multi-feature regression is a common problem in classical machine learning. Hence, we present a comparative analysis of continuous-variable quantum neural networks (variational circuits) and the quantum backpropagating multilayer perceptron (QBMLP). We have chosen the contemporary problem of predicting the rise in COVID-19 cases in India and the USA. We provide a statistical comparison between the two models, both of which perform better than classical artificial neural networks.
Encoded (or ciphered) manuscripts are a special type of historical document that contains encrypted text. The automatic recognition of such documents is challenging because: 1) the cipher alphabet changes from one document to another, 2) there is a lack of annotated corpora for training and 3) touching symbols make symbol segmentation difficult and complex. To overcome these difficulties, we propose a novel method for handwritten cipher recognition based on few-shot object detection. Our method first detects all symbols of a given alphabet in a line image, and then a decoding step maps the symbol similarity scores to the final sequence of transcribed symbols. By training on synthetic data, we show that the proposed architecture is able to recognize handwritten ciphers with unseen alphabets. In addition, if a few labeled pages with the same alphabet are used for fine-tuning, our method surpasses existing unsupervised and supervised HTR methods for cipher recognition.
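One simple way to turn scored symbol detections into a transcription is sketched below: keep the highest-scoring detections, suppress overlapping ones, and read the survivors from left to right. This greedy procedure, the `Detection` fields, and the two thresholds are hypothetical and chosen only to illustrate what a decoding step might look like; the abstract does not describe the decoding algorithm actually used.

```python
from typing import List, NamedTuple, Sequence


class Detection(NamedTuple):
    x_start: float  # left edge of the detected symbol in the line image
    x_end: float    # right edge of the detected symbol
    symbol: str     # alphabet symbol the detector matched
    score: float    # similarity score, higher is better


def overlap_ratio(a: Detection, b: Detection) -> float:
    """Horizontal overlap divided by the width of the narrower detection."""
    inter = min(a.x_end, b.x_end) - max(a.x_start, b.x_start)
    if inter <= 0:
        return 0.0
    return inter / min(a.x_end - a.x_start, b.x_end - b.x_start)


def decode_line(
    detections: Sequence[Detection],
    min_score: float = 0.5,
    max_overlap: float = 0.5,
) -> List[str]:
    """Greedy decoding: best scores first, drop overlaps, then sort left to right."""
    kept: List[Detection] = []
    for det in sorted(detections, key=lambda d: d.score, reverse=True):
        if det.score < min_score:
            break  # all remaining detections score even lower
        if all(overlap_ratio(det, k) <= max_overlap for k in kept):
            kept.append(det)
    return [d.symbol for d in sorted(kept, key=lambda d: d.x_start)]
```

A greedy rule like this is easy to reason about, but it resolves each conflict locally; a decoder that scores whole sequences could behave differently on touching symbols.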