Dimension reduction and visualization of high-dimensional data have become very important research topics because of the rapid growth of large databases in data science. In this paper, we propose using a generalized sigmoid function to model the distance similarity in both high- and low-dimensional spaces. In particular, the parameter b is introduced to the generalized sigmoid function in low-dimensional space, so that we can adjust the heaviness of the function tail by changing the value of b. Using both simulated and real-world data sets, we show that our proposed method can generate visualization results comparable to those of uniform manifold approximation and projection (UMAP), which is a newly developed manifold learning technique with fast running speed, better global structure, and scalability to massive data sets. In addition, according to the purpose of the study and the data structure, we can decrease or increase the value of b to either reveal the finer cluster structure of the data or maintain the neighborhood continuity of the embedding for better visualization. Finally, we use domain knowledge to demonstrate that the finer subclusters revealed with small values of b are meaningful.
Learning interpretable and disentangled representations of data is a key topic in machine learning research. Variational Autoencoder (VAE) is a scalable method for learning directed latent variable models of complex data. It employs a clear and interpretable objective that can be easily optimized. However, this objective does not provide an explicit measure for the quality of latent variable representations which may result in their poor quality. We propose Variational Mutual Information Maximization Framework for VAE to address this issue. In comparison to other methods, it provides an explicit objective that maximizes lower bound on mutual information between latent codes and observations. The objective acts as a regularizer that forces VAE to not ignore the latent variable and allows one to select particular components of it to be most informative with respect to the observations. On top of that, the proposed framework provides a way to evaluate mutual information between latent codes and observations for a fixed VAE model. We have conducted our experiments on VAE models with Gaussian and joint Gaussian and discrete latent variables. Our results illustrate that the proposed approach strengthens relationships between latent codes and observations and improves learned representations.
Chinese poetry is an important part of worldwide culture, and classical and modern sub-branches are quite different. The former is a unique genre and has strict constraints, while the latter is very flexible in length, optional to have rhymes, and similar to modern poetry in other languages. Thus, it requires more to control the coherence and improve the novelty. In this paper, we propose a generate-retrieve-then-refine paradigm to jointly improve the coherence and novelty. In the first stage, a draft is generated given keywords (i.e., topics) only. The second stage produces a "refining vector" from retrieval lines. At last, we take into consideration both the draft and the "refining vector" to generate a new poem. The draft provides future sentence-level information for a line to be generated. Meanwhile, the "refining vector" points out the direction of refinement based on impressive words detection mechanism which can learn good patterns from references and then create new ones via insertion operation. Experimental results on a collected large-scale modern Chinese poetry dataset show that our proposed approach can not only generate more coherent poems, but also improve the diversity and novelty.
Data-driven approaches, most prominently deep learning, have become powerful tools for prediction in many domains. A natural question to ask is whether data-driven methods could also be used for numerical weather prediction. First studies show promise but the lack of a common dataset and evaluation metrics make inter-comparison between studies difficult. Here we present a benchmark dataset for data-driven medium-range weather forecasting, a topic of high scientific interest for atmospheric and computer scientists alike. We provide data derived from the ERA5 archive that has been processed to facilitate the use in machine learning models. We propose a simple and clear evaluation metric which will enable a direct comparison between different methods. Further, we provide baseline scores from simple linear regression techniques, deep learning models as well as purely physical forecasting models. All data is publicly available at https://mediatum.ub.tum.de/1524895 and the companion code is reproducible with tutorials for getting started. We hope that this dataset will accelerate research in data-driven weather forecasting.
Multi-view clustering aims at integrating complementary information from multiple heterogeneous views to improve clustering results. Existing multi-view clustering solutions can only output a single clustering of the data. Due to their multiplicity, multi-view data, can have different groupings that are reasonable and interesting from different perspectives. However, how to find multiple, meaningful, and diverse clustering results from multi-view data is still a rarely studied and challenging topic in multi-view clustering and multiple clusterings. In this paper, we introduce a deep matrix factorization based solution (DMClusts) to discover multiple clusterings. DMClusts gradually factorizes multi-view data matrices into representational subspaces layer-by-layer and generates one clustering in each layer. To enforce the diversity between generated clusterings, it minimizes a new redundancy quantification term derived from the proximity between samples in these subspaces. We further introduce an iterative optimization procedure to simultaneously seek multiple clusterings with quality and diversity. Experimental results on benchmark datasets confirm that DMClusts outperforms state-of-the-art multiple clustering solutions.
Discovering human mobility patterns with geo-location data collected from smartphone users has been a hot research topic in recent years. In this paper, we attempt to discover daily mobile patterns based on GPS data. We view this problem from a probabilistic perspective in order to explore more information from the original GPS data compared to other conventional methods. A non-parameter Bayesian modeling method, Infinite Gaussian Mixture Model, is used to estimate the probability density for the daily mobility. Then, we use Kullback-Leibler divergence as the metrics to measure the similarity of different probability distributions. And combining Infinite Gaussian Mixture Model and Kullback-Leibler divergence, we derived an automatic clustering algorithm to discover mobility patterns for each individual user without setting the number of clusters in advance. In the experiments, the effectiveness of our method is validated on the real user data collected from different users. The results show that the IGMM-based algorithm outperforms the GMM-based algorithm. We also test our methods on the dataset with different lengths to discover the minimum data length for discovering mobility patterns.
Pedestrian attribute recognition has been an emerging research topic in the area of video surveillance. To predict the existence of a particular attribute, it is demanded to localize the regions related to the attribute. However, in this task, the region annotations are not available. How to carve out these attribute-related regions remains challenging. Existing methods applied attribute-agnostic visual attention or heuristic body-part localization mechanisms to enhance the local feature representations, while neglecting to employ attributes to define local feature areas. We propose a flexible Attribute Localization Module (ALM) to adaptively discover the most discriminative regions and learns the regional features for each attribute at multiple levels. Moreover, a feature pyramid architecture is also introduced to enhance the attribute-specific localization at low-levels with high-level semantic guidance. The proposed framework does not require additional region annotations and can be trained end-to-end with multi-level deep supervision. Extensive experiments show that the proposed method achieves state-of-the-art results on three pedestrian attribute datasets, including PETA, RAP, and PA-100K.
Detecting manipulated facial images and videos is an increasingly important topic in digital media forensics. As advanced synthetic face generation and manipulation methods become available, new types of fake face representations are being created and raise significant concerns for their implications in social media. Hence, it is crucial to detect the manipulated face image and locate manipulated facial regions. Instead of simply using a multi-task learning approach to simultaneously detect manipulated images and predict the manipulated mask (regions), we propose to utilize the attention mechanism to process and improve the feature maps of the classifier model. The learned attention maps highlight the informative regions to further improve the binary classification power, and also visualize the manipulated regions. In addition, to enable our study of manipulated facial images detection and localization, we have collected the first database which contains numerous types of facial forgeries. With this dataset, we perform a thorough analysis of data-driven fake face detection. We demonstrate that the use of an attention mechanism improves manipulated facial region localization and fake detection.
Dimension Estimation (DE) and Dimension Reduction (DR) are two closely related topics, but with quite different goals. In DE, one attempts to estimate the intrinsic dimensionality or number of latent variables in a set of measurements of a random vector. However, in DR, one attempts to project a random vector, either linearly or non-linearly, to a lower dimensional space that preserves the information contained in the original higher dimensional space. Of course, these two ideas are quite closely linked since, for example, doing DR to a dimension smaller than suggested by DE will likely lead to information loss. Accordingly, in this paper we will focus on a particular class of deep neural networks called autoencoders which are used extensively for DR but are less well studied for DE. We show that several important questions arise when using autoencoders for DE, above and beyond those that arise for more classic DR/DE techniques such as Principal Component Analysis. We address autoencoder architectural choices and regularization techniques that allow one to transform autoencoder latent layer representations into estimates of intrinsic dimension.
With the development of research on memristor, memristive neural networks (MNNs) have become a hot research topic recently. Because memristor can mimic the spike timing-dependent plasticity (STDP), the research on STDP based MNNs is rapidly increasing. However, although state-of-the-art works on STDP based MNNs have many applications such as pattern recognition, STDP mechanism brings relatively complex hardware framework and low processing speed, which block MNNs' hardware realization. A non-STDP based unsupervised MNN is constructed in this paper. Through the comparison with STDP method on the basis of two common structures including feedforward and crossbar, non-STDP based MNNs not only remain the same advantages as STDP based MNNs including high accuracy and convergence speed in pattern recognition, but also better hardware performance as few hardware resources and higher processing speed. By virtue of the combination of memristive character and simple mechanism, non-STDP based MNNs have better hardware compatibility, which may give a new viewpoint for memristive neural networks' engineering applications.