Understanding how brain functions has been an intriguing topic for years. With the recent progress on collecting massive data and developing advanced technology, people have become interested in addressing the challenge of decoding brain wave data into meaningful mind states, with many machine learning models and algorithms being revisited and developed, especially the ones that handle time series data because of the nature of brain waves. However, many of these time series models, like HMM with hidden state in discrete space or State Space Model with hidden state in continuous space, only work with one source of data and cannot handle different sources of information simultaneously. In this paper, we propose an extension of State Space Model to work with different sources of information together with its learning and inference algorithms. We apply this model to decode the mind state of students during lectures based on their brain waves and reach a significant better results compared to traditional methods.
The task of text segmentation may be undertaken at many levels in text analysis---paragraphs, sentences, words, or even letters. Here, we focus on a relatively fine scale of segmentation, hypothesizing it to be in accord with a stochastic model of language generation, as the smallest scale where independent units of meaning are produced. Our goals in this letter include the development of methods for the segmentation of these minimal independent units, which produce feature-representations of texts that align with the independence assumption of the bag-of-terms model, commonly used for prediction and classification in computational text analysis. We also propose the measurement of texts' association (with respect to realized segmentations) to the model of language generation. We find (1) that our segmentations of phrases exhibit much better associations to the generation model than words and (2), that texts which are well fit are generally topically homogeneous. Because our generative model produces Zipf's law, our study further suggests that Zipf's law may be a consequence of homogeneity in language production.
We develop a fully discriminative learning approach for supervised Latent Dirichlet Allocation (LDA) model using Back Propagation (i.e., BP-sLDA), which maximizes the posterior probability of the prediction variable given the input document. Different from traditional variational learning or Gibbs sampling approaches, the proposed learning method applies (i) the mirror descent algorithm for maximum a posterior inference and (ii) back propagation over a deep architecture together with stochastic gradient/mirror descent for model parameter estimation, leading to scalable and end-to-end discriminative learning of the model. As a byproduct, we also apply this technique to develop a new learning method for the traditional unsupervised LDA model (i.e., BP-LDA). Experimental results on three real-world regression and classification tasks show that the proposed methods significantly outperform the previous supervised topic models, neural networks, and is on par with deep neural networks.
Tensor CANDECOMP/PARAFAC (CP) decomposition has wide applications in statistical learning of latent variable models and in data mining. In this paper, we propose fast and randomized tensor CP decomposition algorithms based on sketching. We build on the idea of count sketches, but introduce many novel ideas which are unique to tensors. We develop novel methods for randomized computation of tensor contractions via FFTs, without explicitly forming the tensors. Such tensor contractions are encountered in decomposition methods such as tensor power iterations and alternating least squares. We also design novel colliding hashes for symmetric tensors to further save time in computing the sketches. We then combine these sketching ideas with existing whitening and tensor power iterative techniques to obtain the fastest algorithm on both sparse and dense tensors. The quality of approximation under our method does not depend on properties such as sparsity, uniformity of elements, etc. We apply the method for topic modeling and obtain competitive results.
If people with high risk of suicide can be identified through social media like microblog, it is possible to implement an active intervention system to save their lives. Based on this motivation, the current study administered the Suicide Probability Scale(SPS) to 1041 weibo users at Sina Weibo, which is a leading microblog service provider in China. Two NLP (Natural Language Processing) methods, the Chinese edition of Linguistic Inquiry and Word Count (LIWC) lexicon and Latent Dirichlet Allocation (LDA), are used to extract linguistic features from the Sina Weibo data. We trained predicting models by machine learning algorithm based on these two types of features, to estimate suicide probability based on linguistic features. The experiment results indicate that LDA can find topics that relate to suicide probability, and improve the performance of prediction. Our study adds value in prediction of suicidal probability of social network users with their behaviors.
The study of opinions, their formation and change, is one of the defining topics addressed by social psychology, but in recent years other disciplines, like computer science and complexity, have tried to deal with this issue. Despite the flourishing of different models and theories in both fields, several key questions still remain unanswered. The understanding of how opinions change and the way they are affected by social influence are challenging issues requiring a thorough analysis of opinion per se but also of the way in which they travel between agents' minds and are modulated by these exchanges. To account for the two-faceted nature of opinions, which are mental entities undergoing complex social processes, we outline a preliminary model in which a cognitive theory of opinions is put forward and it is paired with a formal description of them and of their spreading among minds. Furthermore, investigating social influence also implies the necessity to account for the way in which people change their minds, as a consequence of interacting with other people, and the need to explain the higher or lower persistence of such changes.
In manifold learning, algorithms based on graph Laplacians constructed from data have received considerable attention both in practical applications and theoretical analysis. In particular, the convergence of graph Laplacians obtained from sampled data to certain continuous operators has become an active research topic recently. Most of the existing work has been done under the assumption that the data is sampled from a manifold without boundary or that the functions of interests are evaluated at a point away from the boundary. However, the question of boundary behavior is of considerable practical and theoretical interest. In this paper we provide an analysis of the behavior of graph Laplacians at a point near or on the boundary, discuss their convergence rates and their implications and provide some numerical results. It turns out that while points near the boundary occupy only a small part of the total volume of a manifold, the behavior of graph Laplacian there has different scaling properties from its behavior elsewhere on the manifold, with global effects on the whole manifold, an observation with potentially important implications for the general problem of learning on manifolds.
We present a new method for segmenting, and a new user interface for indexing and visualizing, the semantic content of extended instructional videos. Given a series of key frames from the video, we generate a condensed view of the data by clustering frames according to media type and visual similarities. Using various visual filters, key frames are first assigned a media type (board, class, computer, illustration, podium, and sheet). Key frames of media type board and sheet are then clustered based on contents via an algorithm with near-linear cost. A novel user interface, the result of two user studies, displays related topics using icons linked topologically, allowing users to quickly locate semantically related portions of the video. We analyze the accuracy of the segmentation tool on 17 instructional videos, each of which is from 75 to 150 minutes in duration (a total of 40 hours); the classification accuracy exceeds 96%.
Recent machine learning algorithms such as neural networks can classify objects and actions in video frames with high accuracy. Here, I discuss a classification of objects based on basal dynamic patterns referencing one tradition, the link between rabbit, toad, and the Moon, which can be seen in several cultures. In order for them to be classified into one class, a basic pattern of behavior (cyclic appearance and disappearance) works as a feature point. A static character such as the shape and time scale of the behavior are not essential for this classification. In cognitive semantics, image schemas are introduced to describe basal patterns of events. If learning of these image schemas is attained, a machine may be able to categorize rabbit, toad, and the Moon as the same class. For learning, video frames that show boundary boxes or segmentation may be helpful. Although this discussion is preliminary and many tasks remain to be solved, the classification based on basal behaviors can be an important topic for cognitive processes and computer science.
The Donate Speech campaign has so far succeeded in gathering approximately 3600 hours of ordinary, colloquial Finnish speech into the Lahjoita puhetta (Donate Speech) corpus. The corpus includes over twenty thousand speakers from all the regions of Finland and from all age brackets. The primary goals of the collection were to create a representative, large-scale resource to study spontaneous spoken Finnish and to accelerate the development of language technology and speech-based services. In this paper, we present the collection process and the collected corpus, and showcase its versatility through multiple use cases. The evaluated use cases include: automatic speech recognition of spontaneous speech, detection of age, gender, dialect and topic and metadata analysis. We provide benchmarks for the use cases, as well down loadable, trained baseline systems with open-source code for reproducibility. One further use case is to verify the metadata and transcripts given in this corpus itself, and to suggest artificial metadata and transcripts for the part of the corpus where it is missing.