Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Topic": models, code, and papers

More Diverse Means Better: Multimodal Deep Learning Meets Remote Sensing Imagery Classification

Aug 12, 2020
Danfeng Hong, Lianru Gao, Naoto Yokoya, Jing Yao, Jocelyn Chanussot, Qian Du, Bing Zhang

Classification and identification of the materials lying over or beneath the Earth's surface have long been a fundamental but challenging research topic in geoscience and remote sensing (RS) and have garnered a growing concern owing to the recent advancements of deep learning techniques. Although deep networks have been successfully applied in single-modality-dominated classification tasks, yet their performance inevitably meets the bottleneck in complex scenes that need to be finely classified, due to the limitation of information diversity. In this work, we provide a baseline solution to the aforementioned difficulty by developing a general multimodal deep learning (MDL) framework. In particular, we also investigate a special case of multi-modality learning (MML) -- cross-modality learning (CML) that exists widely in RS image classification applications. By focusing on "what", "where", and "how" to fuse, we show different fusion strategies as well as how to train deep networks and build the network architecture. Specifically, five fusion architectures are introduced and developed, further being unified in our MDL framework. More significantly, our framework is not only limited to pixel-wise classification tasks but also applicable to spatial information modeling with convolutional neural networks (CNNs). To validate the effectiveness and superiority of the MDL framework, extensive experiments related to the settings of MML and CML are conducted on two different multimodal RS datasets. Furthermore, the codes and datasets will be available at, contributing to the RS community.

* IEEE Transactions on Geoscience and Remote Sensing, 2020 

  Access Paper or Ask Questions

ExchNet: A Unified Hashing Network for Large-Scale Fine-Grained Image Retrieval

Aug 04, 2020
Quan Cui, Qing-Yuan Jiang, Xiu-Shen Wei, Wu-Jun Li, Osamu Yoshie

Retrieving content relevant images from a large-scale fine-grained dataset could suffer from intolerably slow query speed and highly redundant storage cost, due to high-dimensional real-valued embeddings which aim to distinguish subtle visual differences of fine-grained objects. In this paper, we study the novel fine-grained hashing topic to generate compact binary codes for fine-grained images, leveraging the search and storage efficiency of hash learning to alleviate the aforementioned problems. Specifically, we propose a unified end-to-end trainable network, termed as ExchNet. Based on attention mechanisms and proposed attention constraints, it can firstly obtain both local and global features to represent object parts and whole fine-grained objects, respectively. Furthermore, to ensure the discriminative ability and semantic meaning's consistency of these part-level features across images, we design a local feature alignment approach by performing a feature exchanging operation. Later, an alternative learning algorithm is employed to optimize the whole ExchNet and then generate the final binary hash codes. Validated by extensive experiments, our proposal consistently outperforms state-of-the-art generic hashing methods on five fine-grained datasets, which shows our effectiveness. Moreover, compared with other approximate nearest neighbor methods, ExchNet achieves the best speed-up and storage reduction, revealing its efficiency and practicality.

* Accepted by ECCV2020 

  Access Paper or Ask Questions

Spectral Superresolution of Multispectral Imagery with Joint Sparse and Low-Rank Learning

Jul 28, 2020
Lianru Gao, Danfeng Hong, Jing Yao, Bing Zhang, Paolo Gamba, Jocelyn Chanussot

Extensive attention has been widely paid to enhance the spatial resolution of hyperspectral (HS) images with the aid of multispectral (MS) images in remote sensing. However, the ability in the fusion of HS and MS images remains to be improved, particularly in large-scale scenes, due to the limited acquisition of HS images. Alternatively, we super-resolve MS images in the spectral domain by the means of partially overlapped HS images, yielding a novel and promising topic: spectral superresolution (SSR) of MS imagery. This is challenging and less investigated task due to its high ill-posedness in inverse imaging. To this end, we develop a simple but effective method, called joint sparse and low-rank learning (J-SLoL), to spectrally enhance MS images by jointly learning low-rank HS-MS dictionary pairs from overlapped regions. J-SLoL infers and recovers the unknown hyperspectral signals over a larger coverage by sparse coding on the learned dictionary pair. Furthermore, we validate the SSR performance on three HS-MS datasets (two for classification and one for unmixing) in terms of reconstruction, classification, and unmixing by comparing with several existing state-of-the-art baselines, showing the effectiveness and superiority of the proposed J-SLoL algorithm. Furthermore, the codes and datasets will be available at:\_TGRS\_J-SLoL, contributing to the RS community.

* IEEE Transactions on Geoscience and Remote Sensing, 2020 

  Access Paper or Ask Questions

Online Human Activity Recognition Employing Hierarchical Hidden Markov Models

Mar 12, 2019
Parviz Asghari, Elnaz Soelimani, Ehsan Nazerfard

In the last few years there has been a growing interest in Human Activity Recognition~(HAR) topic. Sensor-based HAR approaches, in particular, has been gaining more popularity owing to their privacy preserving nature. Furthermore, due to the widespread accessibility of the internet, a broad range of streaming-based applications such as online HAR, has emerged over the past decades. However, proposing sufficiently robust online activity recognition approach in smart environment setting is still considered as a remarkable challenge. This paper presents a novel online application of Hierarchical Hidden Markov Model in order to detect the current activity on the live streaming of sensor events. Our method consists of two phases. In the first phase, data stream is segmented based on the beginning and ending of the activity patterns. Also, on-going activity is reported with every receiving observation. This phase is implemented using Hierarchical Hidden Markov models. The second phase is devoted to the correction of the provided label for the segmented data stream based on statistical features. The proposed model can also discover the activities that happen during another activity - so-called interrupted activities. After detecting the activity pane, the predicted label will be corrected utilizing statistical features such as time of day at which the activity happened and the duration of the activity. We validated our proposed method by testing it against two different smart home datasets and demonstrated its effectiveness, which is competing with the state-of-the-art methods.

* 9 pages, 9 figures, 7 tables 

  Access Paper or Ask Questions

GANE: A Generative Adversarial Network Embedding

May 21, 2018
Huiting Hong, Xin Li, Mingzhong Wang

Network embedding has become a hot research topic recently which can provide low-dimensional feature representations for many machine learning applications. Current work focuses on either (1) whether the embedding is designed as an unsupervised learning task by explicitly preserving the structural connectivity in the network, or (2) whether the embedding is a by-product during the supervised learning of a specific discriminative task in a deep neural network. In this paper, we focus on bridging the gap of the two lines of the research. We propose to adapt the Generative Adversarial model to perform network embedding, in which the generator is trying to generate vertex pairs, while the discriminator tries to distinguish the generated vertex pairs from real connections (edges) in the network. Wasserstein-1 distance is adopted to train the generator to gain better stability. We develop three variations of models, including GANE which applies cosine similarity, GANE-O1 which preserves the first-order proximity, and GANE-O2 which tries to preserves the second-order proximity of the network in the low-dimensional embedded vector space. We later prove that GANE-O2 has the same objective function as GANE-O1 when negative sampling is applied to simplify the training process in GANE-O2. Experiments with real-world network datasets demonstrate that our models constantly outperform state-of-the-art solutions with significant improvements on precision in link prediction, as well as on visualizations and accuracy in clustering tasks.

  Access Paper or Ask Questions

Recommender System for News Articles using Supervised Learning

Jul 03, 2017
Akshay Kumar Chaturvedi, Filipa Peleja, Ana Freire

In the last decade we have observed a mass increase of information, in particular information that is shared through smartphones. Consequently, the amount of information that is available does not allow the average user to be aware of all his options. In this context, recommender systems use a number of techniques to help a user find the desired product. Hence, nowadays recommender systems play an important role. Recommender Systems' aim to identify products that best fits user preferences. These techniques are advantageous to both users and vendors, as it enables the user to rapidly find what he needs and the vendors to promote their products and sales. As the industry became aware of the gains that could be accomplished by using these algorithms, also a very interesting problem for many researchers, recommender systems became a very active area since the mid 90's. Having in mind that this is an ongoing problem the present thesis intends to observe the value of using a recommender algorithm to find users likes by observing her domain preferences. In a balanced probabilistic method, this thesis will show how news topics can be used to recommend news articles. In this thesis, we used different machine learning methods to determine the user ratings for an article. To tackle this problem, supervised learning methods such as linear regression, Naive Bayes and logistic regression are used. All the aforementioned models have a different nature which has an impact on the solution of the given problem. Furthermore, number of experiments are presented and discussed to identify the feature set that fits best to the problem.

* 36 pages 

  Access Paper or Ask Questions

MPI-FAUN: An MPI-Based Framework for Alternating-Updating Nonnegative Matrix Factorization

Sep 28, 2016
Ramakrishnan Kannan, Grey Ballard, Haesun Park

Non-negative matrix factorization (NMF) is the problem of determining two non-negative low rank factors $W$ and $H$, for the given input matrix $A$, such that $A \approx W H$. NMF is a useful tool for many applications in different domains such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient parallel algorithms to solve the problem for big data sets. The main contribution of this work is a new, high-performance parallel computational framework for a broad class of NMF algorithms that iteratively solves alternating non-negative least squares (NLS) subproblems for $W$ and $H$. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). The framework is flexible and able to leverage a variety of NMF and NLS algorithms, including Multiplicative Update, Hierarchical Alternating Least Squares, and Block Principal Pivoting. Our implementation allows us to benchmark and compare different algorithms on massive dense and sparse data matrices of size that spans for few hundreds of millions to billions. We demonstrate the scalability of our algorithm and compare it with baseline implementations, showing significant performance improvements. The code and the datasets used for conducting the experiments are available online.

* arXiv admin note: text overlap with arXiv:1509.09313 

  Access Paper or Ask Questions

Flexible High-dimensional Classification Machines and Their Asymptotic Properties

Oct 11, 2013
Xingye Qiao, Lingsong Zhang

Classification is an important topic in statistics and machine learning with great potential in many real applications. In this paper, we investigate two popular large margin classification methods, Support Vector Machine (SVM) and Distance Weighted Discrimination (DWD), under two contexts: the high-dimensional, low-sample size data and the imbalanced data. A unified family of classification machines, the FLexible Assortment MachinE (FLAME) is proposed, within which DWD and SVM are special cases. The FLAME family helps to identify the similarities and differences between SVM and DWD. It is well known that many classifiers overfit the data in the high-dimensional setting; and others are sensitive to the imbalanced data, that is, the class with a larger sample size overly influences the classifier and pushes the decision boundary towards the minority class. SVM is resistant to the imbalanced data issue, but it overfits high-dimensional data sets by showing the undesired data-piling phenomena. The DWD method was proposed to improve SVM in the high-dimensional setting, but its decision boundary is sensitive to the imbalanced ratio of sample sizes. Our FLAME family helps to understand an intrinsic connection between SVM and DWD, and improves both methods by providing a better trade-off between sensitivity to the imbalanced data and overfitting the high-dimensional data. Several asymptotic properties of the FLAME classifiers are studied. Simulations and real data applications are investigated to illustrate the usefulness of the FLAME classifiers.

* 49 pages, 11 figures 

  Access Paper or Ask Questions

Atomic Filter: a Weak Form of Shift Operator for Graph Signals

Apr 01, 2022
Lihua Yang, Qing Zhang, Qian Zhang, Chao Huang

The shift operation plays a crucial role in the classical signal processing. It is the generator of all the filters and the basic operation for time-frequency analysis, such as windowed Fourier transform and wavelet transform. With the rapid development of internet technology and big data science, a large amount of data are expressed as signals defined on graphs. In order to establish the theory of filtering, windowed Fourier transform and wavelet transform in the setting of graph signals, we need to extend the shift operation of classical signals to graph signals. It is a fundamental problem since the vertex set of a graph is usually not a vector space and the addition operation cannot be defined on the vertex set of the graph. In this paper, based on our understanding on the core role of shift operation in classical signal processing we propose the concept of atomic filters, which can be viewed as a weak form of the shift operator for graph signals. Then, we study the conditions such that an atomic filter is norm-preserving, periodic, or real-preserving. The property of real-preserving holds naturally in the classical signal processing, but no the research has been reported on this topic in the graph signal setting. With these conditions we propose the concept of normal atomic filters for graph signals, which degenerates into the classical shift operator under mild conditions if the graph is circulant. Typical examples of graphs that have or have not normal atomic filters are given. Finally, as an application, atomic filters are utilized to construct time-frequency atoms which constitute a frame of the graph signal space.

  Access Paper or Ask Questions

Temporal Aggregation for Adaptive RGBT Tracking

Jan 29, 2022
Zhangyong Tang, Tianyang Xu, Xiao-Jun Wu

Visual object tracking with RGB and thermal infrared (TIR) spectra available, shorted in RGBT tracking, is a novel and challenging research topic which draws increasing attention nowadays. In this paper, we propose an RGBT tracker which takes spatio-temporal clues into account for robust appearance model learning, and simultaneously, constructs an adaptive fusion sub-network for cross-modal interactions. Unlike most existing RGBT trackers that implement object tracking tasks with only spatial information included, temporal information is further considered in this method. Specifically, different from traditional Siamese trackers, which only obtain one search image during the process of picking up template-search image pairs, an extra search sample adjacent to the original one is selected to predict the temporal transformation, resulting in improved robustness of tracking performance.As for multi-modal tracking, constrained to the limited RGBT datasets, the adaptive fusion sub-network is appended to our method at the decision level to reflect the complementary characteristics contained in two modalities. To design a thermal infrared assisted RGB tracker, the outputs of the classification head from the TIR modality are taken into consideration before the residual connection from the RGB modality. Extensive experimental results on three challenging datasets, i.e. VOT-RGBT2019, GTOT and RGBT210, verify the effectiveness of our method. Code will be shared at \textcolor{blue}{\emph{}}.

* 12 pages, 10 figures 

  Access Paper or Ask Questions