The increased popularity of different text representations has brought many improvements in Natural Language Processing (NLP) tasks. Without the need for supervised data, embeddings trained on large corpora provide meaningful relations that can be used in different NLP tasks. Even though training these vectors is relatively easy with recent methods, the information gained from the data depends heavily on the structure of the corpus language. Since the most widely researched languages have a similar morphological structure, problems that occur in morphologically rich languages are largely disregarded in studies. For morphologically rich languages, context-free word vectors ignore the morphological structure of the language. In this study, we prepared texts in morphologically different forms in a morphologically rich language, Turkish, and compared the results on different intrinsic and extrinsic tasks. To see the effect of morphological structure, we trained a word2vec model on texts in which lemmas and suffixes are treated differently. We also trained the subword model fastText and compared the embeddings on word analogy, text classification, sentiment analysis, and language modeling tasks.
Many real-world phenomena are observed at multiple resolutions. Predictive models of these phenomena typically consider different resolutions separately. This approach might be limiting in applications where predictions are desired at fine resolutions but available training data is scarce. In this paper, we propose classification algorithms that leverage supervision from coarser resolutions to help train models on finer resolutions. The different resolutions are modeled as different views of the data in a multi-view framework that exploits the complementarity of features across views to improve models on both views. Unlike traditional multi-view learning problems, the key challenge in our case is that there is no one-to-one correspondence between instances across different views, which requires explicit modeling of the correspondence of instances across resolutions. We propose to use the features of instances at different resolutions to learn the correspondence between instances across resolutions using an attention mechanism. Experiments on the real-world application of mapping urban areas using satellite observations and on sentiment classification of text data show the effectiveness of the proposed methods.
Text classification has been one of the major problems in natural language processing. With the advent of deep learning, the convolutional neural network (CNN) has become a popular solution to this task. However, CNNs, which were first proposed for images, face many crucial challenges in the context of text processing, namely in their elementary blocks: convolution filters and max pooling. These challenges have largely been overlooked by most existing CNN models proposed for text classification. In this paper, we present an experimental study on the fundamental blocks of CNNs in text categorization. Based on this critique, we propose the Sequential Convolutional Attentive Recurrent Network (SCARN). The proposed SCARN model exploits the advantages of both recurrent and convolutional structures more efficiently than previously proposed recurrent convolutional models. We test our model on different text classification datasets across tasks such as sentiment analysis and question classification. Extensive experiments establish that SCARN outperforms other recurrent convolutional architectures with significantly fewer parameters. Furthermore, SCARN achieves better performance than various equally large deep CNN and LSTM architectures.
Stock market prediction is one of the most attractive research topics, since successful prediction of the market's future movement leads to significant profit. Traditional short-term stock market predictions are usually based on the analysis of historical market data, such as stock prices, moving averages or daily returns. However, financial news also contains useful information on public companies and the market. Existing methods in the finance literature exploit sentiment signal features, which are limited because they do not consider factors such as events and the news context. We address this issue by leveraging deep neural models to extract rich semantic features from news text. In particular, a bidirectional LSTM is used to encode the news text and capture the context information, and a self-attention mechanism is applied to distribute attention over the most relevant words, news and days. In predicting directional changes in both the Standard & Poor's 500 index and individual companies' stock prices, we show that this technique is competitive with other state-of-the-art approaches, demonstrating the effectiveness of recent NLP technology advances for computational finance.
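The attention step described in the abstract above can be illustrated with a minimal sketch of attention pooling over a sequence of encoder states. This is not the authors' implementation: the additive scoring form, the parameter names `W`, `b`, `v`, and the random stand-in for bidirectional LSTM outputs are all assumptions made for illustration.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(H, W, b, v):
    """Pool a sequence of hidden states into one vector.

    H: (T, d) hidden states, e.g. one per word (assumed encoder outputs).
    W, b, v: hypothetical attention parameters of shapes (d, a), (a,), (a,).
    Returns the pooled vector (d,) and the attention weights (T,).
    """
    u = np.tanh(H @ W + b)   # (T, a) projected states
    scores = u @ v           # (T,) one relevance score per position
    alpha = softmax(scores)  # weights are positive and sum to 1
    return alpha @ H, alpha  # weighted sum of hidden states

rng = np.random.default_rng(0)
T, d, a = 5, 8, 4
H = rng.standard_normal((T, d))  # stand-in for bidirectional LSTM outputs
W = rng.standard_normal((d, a))
b = np.zeros(a)
v = rng.standard_normal(a)

pooled, alpha = attention_pool(H, W, b, v)
```

Under the same assumptions, the pooling could be applied once per level (words within a news item, news items within a day, days within a window), with separate parameters at each level.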
Computational modeling of human multimodal language is an emerging research area in natural language processing spanning the language, visual and acoustic modalities. Comprehending multimodal language requires modeling not only the interactions within each modality (intra-modal interactions) but more importantly the interactions between modalities (cross-modal interactions). In this paper, we propose the Recurrent Multistage Fusion Network (RMFN) which decomposes the fusion problem into multiple stages, each of them focused on a subset of multimodal signals for specialized, effective fusion. Cross-modal interactions are modeled using this multistage fusion approach which builds upon intermediate representations of previous stages. Temporal and intra-modal interactions are modeled by integrating our proposed fusion approach with a system of recurrent neural networks. The RMFN displays state-of-the-art performance in modeling human multimodal language across three public datasets relating to multimodal sentiment analysis, emotion recognition, and speaker traits recognition. We provide visualizations to show that each stage of fusion focuses on a different subset of multimodal signals, learning increasingly discriminative multimodal representations.
As the popularity of social media platforms continues to rise, an ever-increasing amount of human communication and self-expression takes place online. Most recent research has focused on mining social media for public user opinion about external entities, such as product reviews or sentiment towards political news. However, less attention has been paid to analyzing users' internalized thoughts and emotions from a mental health perspective. In this paper, we quantify the semantic difference between public Tweets and private mental health journals used in online cognitive behavioral therapy. We use deep transfer learning techniques to analyze the semantic gap between the two domains. We show that for the task of emotional valence prediction, social media can be successfully harnessed to create more accurate, robust, and personalized mental health models. Our results suggest that the semantic gap between public and private self-expression is small, and that utilizing the abundance of available social media is one way to overcome the small sample sizes of mental health data, which are commonly limited by availability and privacy concerns.
Word2Vec is a widely used algorithm for extracting low-dimensional vector representations of words. It has recently generated considerable excitement in the machine learning and natural language processing (NLP) communities due to its exceptional performance in many NLP applications such as named entity recognition, sentiment analysis, machine translation and question answering. State-of-the-art algorithms, including those by Mikolov et al., have been parallelized for multi-core CPU architectures but are based on vector-vector operations that are memory-bandwidth intensive and do not efficiently use computational resources. In this paper, we improve reuse of various data structures in the algorithm through the use of minibatching, which allows us to express the problem using matrix multiply operations. We also explore different techniques to distribute word2vec computation across nodes in a compute cluster, and demonstrate good strong scalability up to 32 nodes. In combination, these techniques allow us to scale up the computation near linearly across cores and nodes and to process hundreds of millions of words per second, which is, to the best of our knowledge, the fastest word2vec implementation.
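The idea of replacing vector-vector operations with a matrix multiply can be sketched as follows. This is only an illustration under stated assumptions, not the implementation from the paper: the table sizes, the word indices, and the choice to share one set of output words (a target plus negative samples) across the whole minibatch are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 1000, 64                    # hypothetical sizes
W_in = rng.standard_normal((vocab_size, dim))   # input word vectors
W_out = rng.standard_normal((vocab_size, dim))  # output word vectors

# Assumed minibatch: several input words scored against one shared set
# of output words (a target plus negative samples).
input_ids = np.array([3, 17, 42, 99])
sample_ids = np.array([7, 250, 511, 600, 873, 901])

def scores_vector_vector(W_in, W_out, input_ids, sample_ids):
    """One dot product per (input word, output word) pair."""
    out = np.empty((len(input_ids), len(sample_ids)))
    for i, c in enumerate(input_ids):
        for j, s in enumerate(sample_ids):
            out[i, j] = W_in[c] @ W_out[s]
    return out

def scores_matrix(W_in, W_out, input_ids, sample_ids):
    """The same scores as a single matrix multiply over the minibatch."""
    return W_in[input_ids] @ W_out[sample_ids].T

pairwise = scores_vector_vector(W_in, W_out, input_ids, sample_ids)
batched = scores_matrix(W_in, W_out, input_ids, sample_ids)
```

Both functions produce the same score matrix. The batched form gathers each row of the two tables once and reuses it across the minibatch, which is the kind of data reuse the abstract refers to.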
The bag-of-words (BOW) model is the common approach for classifying documents, where words are used as features for training a classifier. This generally involves a huge number of features. Some techniques, such as Latent Semantic Analysis (LSA) or Latent Dirichlet Allocation (LDA), have been designed to summarize documents in a lower dimension with the least semantic information loss. Some semantic information is nevertheless always lost, since only words are considered. Instead, we aim to use information coming from n-grams to overcome this limitation, while remaining in a low-dimensional space. Many approaches, such as the Skip-gram model, provide good word vector representations very quickly. We propose to average these representations to obtain representations of n-grams. All n-grams are thus embedded in the same semantic space. K-means clustering can then group them into semantic concepts. The number of features is therefore dramatically reduced, and documents can be represented as a bag of semantic concepts. We show that this model outperforms LSA and LDA on a sentiment classification task, and yields results similar to those of a traditional BOW model with far fewer features.
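The pipeline described in the abstract above (average word vectors into n-gram vectors, cluster them with K-means, count concepts per document) can be sketched as follows. This is a toy illustration, not the authors' code: the two-dimensional word vectors, the tiny corpus, the bigram limit, and the plain NumPy K-means are all assumptions made to keep the example self-contained.

```python
import numpy as np

def extract_ngrams(tokens, max_n=2):
    """All contiguous n-grams of length 1..max_n."""
    return [tuple(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]

def ngram_vector(ngram, word_vecs):
    """Average the word vectors of the tokens in an n-gram."""
    return np.mean([word_vecs[w] for w in ngram], axis=0)

def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's algorithm; returns the k centroids."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids

def bag_of_concepts(tokens, word_vecs, centroids, max_n=2):
    """Count how many of a document's n-grams fall into each concept."""
    counts = np.zeros(len(centroids))
    for ng in extract_ngrams(tokens, max_n):
        v = ngram_vector(ng, word_vecs)
        counts[np.linalg.norm(centroids - v, axis=1).argmin()] += 1
    return counts

# Hypothetical toy embeddings standing in for Skip-gram vectors.
word_vecs = {
    "good": np.array([1.0, 0.0]), "great": np.array([0.9, 0.1]),
    "bad": np.array([-1.0, 0.0]), "awful": np.array([-0.9, -0.1]),
}
docs = [["good", "great"], ["bad", "awful"]]

all_ngrams = [ng for doc in docs for ng in extract_ngrams(doc)]
points = np.array([ngram_vector(ng, word_vecs) for ng in all_ngrams])
centroids = kmeans(points, k=2)
features = [bag_of_concepts(doc, word_vecs, centroids) for doc in docs]
```

Each document is then a vector of length `k` rather than a vector over the full vocabulary, which is where the reduction in the number of features comes from. The resulting counts could be fed to any standard classifier.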
Software quality in use comprises quality from the user's perspective. It has gained importance in e-government applications, mobile-based applications, embedded systems, and even business process development. Users' decisions on software acquisitions are often ad hoc or based on preference, owing to the difficulty of quantitatively measuring software quality in use. But why is quality-in-use measurement difficult? Although there are many software quality models, to the authors' knowledge no works survey the challenges related to software quality-in-use measurement. This article has two main contributions: 1) it identifies and explains major issues and challenges in measuring software quality in use in the context of the ISO SQuaRE series and related software quality models, and highlights open research areas; and 2) it sheds light on a research direction that can be used to predict software quality in use. In short, the quality-in-use measurement issues are related to the complexity of the current standard models and the limitations and incompleteness of the customized software quality models. A sentiment analysis of software reviews is proposed to deal with these issues.