"Sentiment": models, code, and papers

Transfer Learning with Dynamic Distribution Adaptation

Sep 17, 2019
Jindong Wang, Yiqiang Chen, Wenjie Feng, Han Yu, Meiyu Huang, Qiang Yang

Transfer learning aims to learn robust classifiers for the target domain by leveraging knowledge from a source domain. Since the source and the target domains are usually from different distributions, existing methods mainly focus on adapting the cross-domain marginal or conditional distributions. However, in real applications, the marginal and conditional distributions usually have different contributions to the domain discrepancy. Existing methods fail to quantitatively evaluate the different importance of these two distributions, which will result in unsatisfactory transfer performance. In this paper, we propose a novel concept called Dynamic Distribution Adaptation (DDA), which is capable of quantitatively evaluating the relative importance of each distribution. DDA can be easily incorporated into the framework of structural risk minimization to solve transfer learning problems. On the basis of DDA, we propose two novel learning algorithms: (1) Manifold Dynamic Distribution Adaptation (MDDA) for traditional transfer learning, and (2) Dynamic Distribution Adaptation Network (DDAN) for deep transfer learning. Extensive experiments demonstrate that MDDA and DDAN significantly improve the transfer learning performance and set up a strong baseline over the latest deep and adversarial methods on digit recognition, sentiment analysis, and image classification. More importantly, it is shown that marginal and conditional distributions have different contributions to the domain divergence, and our DDA is able to provide good quantitative evaluation of their relative importance, which leads to better performance. We believe this observation can be helpful for future research in transfer learning.

* ACM Transactions on Intelligent Systems and Technology (ACM TIST) 2019 
* Accepted to ACM Transactions on Intelligent Systems and Technology (ACM TIST) 2019, 25 pages. arXiv admin note: text overlap with arXiv:1807.07258 
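
The abstract above describes weighing marginal against conditional distribution adaptation. A minimal sketch of that idea follows, assuming a balance factor `mu` in [0, 1], a linear-kernel MMD (squared distance between feature means), and pseudo-labels for the target domain. The function names and the exact blending formula are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def linear_mmd(a, b):
    """Squared distance between feature means (MMD with a linear kernel)."""
    diff = a.mean(axis=0) - b.mean(axis=0)
    return float(diff @ diff)

def dynamic_discrepancy(xs, ys, xt, yt_pseudo, mu):
    """Blend marginal and class-conditional discrepancy with weight mu in [0, 1]."""
    marginal = linear_mmd(xs, xt)
    conditional = 0.0
    for c in np.unique(ys):
        s_c, t_c = xs[ys == c], xt[yt_pseudo == c]
        if len(s_c) and len(t_c):  # skip classes missing from either domain
            conditional += linear_mmd(s_c, t_c)
    return (1.0 - mu) * marginal + mu * conditional

# Hypothetical toy data: the target is the source shifted by a constant.
xs = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]])
ys = np.array([0, 0, 1, 1])
xt = xs + 1.0
yt_pseudo = ys.copy()
```

With `mu = 0` only the marginal term counts, and with `mu = 1` only the class-conditional terms count; intermediate values trade one off against the other.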

Intrinsically Sparse Long Short-Term Memory Networks

Jan 26, 2019
Shiwei Liu, Decebal Constantin Mocanu, Mykola Pechenizkiy

Long Short-Term Memory (LSTM) has achieved state-of-the-art performance on a wide range of tasks. Its outstanding performance is guaranteed by the long-term memory ability which matches the sequential data perfectly and the gating structure controlling the information flow. However, LSTMs are prone to be memory-bandwidth limited in realistic applications and need an unbearable period of training and inference time as the model size is ever-increasing. To tackle this problem, various efficient model compression methods have been proposed. Most of them need a big and expensive pre-trained model, which is a nightmare for resource-limited devices where the memory budget is strictly limited. To remedy this situation, in this paper, we incorporate the Sparse Evolutionary Training (SET) procedure into LSTM, proposing a novel model dubbed SET-LSTM. Rather than starting with a fully-connected architecture, SET-LSTM has a sparse topology and dramatically fewer parameters in both phases, training and inference. Considering the specific architecture of LSTMs, we replace the LSTM cells and embedding layers with sparse structures and, further on, use an evolutionary strategy to adapt the sparse connectivity to the data. Additionally, we find that SET-LSTM can provide many different good combinations of sparse connectivity to substitute the overparameterized optimization problem of dense neural networks. Evaluated on four sentiment analysis classification datasets, the results demonstrate that our proposed model is able to achieve usually better performance than its fully connected counterpart while having less than 4% of its parameters.

* 9 pages, 8 figures and 4 tables 
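
The evolutionary adaptation of sparse connectivity mentioned in the abstract can be pictured as a prune-and-regrow step. The sketch below assumes the common SET recipe (drop a fraction `zeta` of the smallest-magnitude weights, then add the same number of random new connections); `set_evolve`, `zeta`, and the initialisation scale are hypothetical choices, and applying the step to LSTM gates and embeddings is not shown.

```python
import numpy as np

def set_evolve(weights, mask, zeta, rng):
    """One prune-and-regrow step on a sparse layer.

    weights: 2-D float array, zero wherever mask is False.
    mask:    boolean array marking existing connections.
    zeta:    fraction of existing connections to replace.
    """
    weights, mask = weights.copy(), mask.copy()
    flat_w, flat_m = weights.reshape(-1), mask.reshape(-1)  # views into the copies
    active = np.flatnonzero(flat_m)
    inactive = np.flatnonzero(~flat_m)
    n_prune = int(zeta * active.size)
    n_grow = min(n_prune, inactive.size)
    # Prune the connections whose weights are closest to zero.
    weakest = active[np.argsort(np.abs(flat_w[active]))[:n_prune]]
    flat_w[weakest] = 0.0
    flat_m[weakest] = False
    # Regrow connections at random positions that were empty before pruning.
    grown = rng.choice(inactive, size=n_grow, replace=False)
    flat_w[grown] = rng.normal(0.0, 0.01, size=n_grow)
    flat_m[grown] = True
    return weights, mask

# Hypothetical 10x10 layer with roughly 20% of connections present.
rng = np.random.default_rng(0)
mask = rng.random((10, 10)) < 0.2
weights = rng.normal(size=(10, 10)) * mask
new_weights, new_mask = set_evolve(weights, mask, 0.3, rng)
```

The number of connections stays constant across the step, so the parameter budget is fixed while the topology changes.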

Robust Task Clustering for Deep Many-Task Learning

May 18, 2018
Mo Yu, Xiaoxiao Guo, Jinfeng Yi, Shiyu Chang, Saloni Potdar, Gerald Tesauro, Haoyu Wang, Bowen Zhou

We investigate task clustering for deep-learning based multi-task and few-shot learning in a many-task setting. We propose a new method to measure task similarities with cross-task transfer performance matrix for the deep learning scenario. Although this matrix provides us critical information regarding similarity between tasks, its asymmetric property and unreliable performance scores can affect conventional clustering methods adversely. Additionally, the uncertain task-pairs, i.e., the ones with extremely asymmetric transfer scores, may collectively mislead clustering algorithms to output an inaccurate task-partition. To overcome these limitations, we propose a novel task-clustering algorithm by using the matrix completion technique. The proposed algorithm constructs a partially-observed similarity matrix based on the certainty of cluster membership of the task-pairs. We then use a matrix completion algorithm to complete the similarity matrix. Our theoretical analysis shows that under mild constraints, the proposed algorithm will perfectly recover the underlying "true" similarity matrix with a high probability. Our results show that the new task clustering method can discover task clusters for training flexible and superior neural network models in a multi-task learning setup for sentiment classification and dialog intent classification tasks. Our task clustering approach also extends metric-based few-shot learning methods to adapt multiple metrics, which demonstrates empirical advantages when the tasks are diverse.
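
To make the partially-observed similarity matrix concrete, here is a rough sketch under stated assumptions: a task pair is marked similar when transfer works well in both directions, dissimilar when it fails in both, and left unobserved otherwise. The thresholds `high` and `low` and the function name are hypothetical, and the matrix completion step that fills in the unobserved entries is not shown.

```python
import numpy as np

def partial_similarity(transfer, high, low):
    """Build a partially observed task-similarity matrix.

    transfer[i, j]: score of a model trained on task i, evaluated on task j.
    Entries are 1.0 (confidently similar), 0.0 (confidently dissimilar),
    or NaN (uncertain, left for a matrix completion algorithm to fill in).
    """
    forward, backward = transfer, transfer.T
    sim = np.full(transfer.shape, np.nan)
    sim[(forward >= high) & (backward >= high)] = 1.0
    sim[(forward <= low) & (backward <= low)] = 0.0
    np.fill_diagonal(sim, 1.0)  # every task is similar to itself
    return sim

# Hypothetical transfer scores for three tasks.
transfer = np.array([[0.90, 0.80, 0.20],
                     [0.85, 0.90, 0.70],
                     [0.10, 0.30, 0.90]])
sim = partial_similarity(transfer, high=0.75, low=0.35)
```

Here tasks 0 and 1 transfer well both ways, tasks 0 and 2 transfer poorly both ways, and the asymmetric pair (1, 2) stays unobserved.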

Latent Relational Metric Learning via Memory-based Attention for Collaborative Ranking

Feb 13, 2018
Yi Tay, Anh Tuan Luu, Siu Cheung Hui

This paper proposes a new neural architecture for collaborative ranking with implicit feedback. Our model, LRML (Latent Relational Metric Learning), is a novel metric learning approach for recommendation. More specifically, instead of simple push-pull mechanisms between user and item pairs, we propose to learn latent relations that describe each user-item interaction. This helps to alleviate the potential geometric inflexibility of existing metric learning approaches. This enables not only better performance but also a greater extent of modeling capability, allowing our model to scale to a larger number of interactions. In order to do so, we employ an augmented memory module and learn to attend over these memory blocks to construct latent relations. The memory-based attention module is controlled by the user-item interaction, making the learned relation vector specific to each user-item pair. Hence, this can be interpreted as learning an exclusive and optimal relational translation for each user-item interaction. The proposed architecture demonstrates state-of-the-art performance across multiple recommendation benchmarks. LRML outperforms other metric learning models by 6%-7.5% in terms of Hits@10 and nDCG@10 on large datasets such as Netflix and MovieLens20M. Moreover, qualitative studies also demonstrate evidence that our proposed model is able to infer and encode explicit sentiment, temporal and attribute information despite being only trained on implicit feedback. As such, this ascertains the ability of LRML to uncover hidden relational structure within implicit datasets.

* WWW 2018 
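
The memory-based attention described above can be sketched as follows. This assumes the user-item interaction is an element-wise product, attention is a softmax over key vectors, and the score is a squared translation distance; the names `keys`, `memory`, `latent_relation`, and `lrml_score` are illustrative and not taken from the authors' code.

```python
import numpy as np

def latent_relation(user, item, keys, memory):
    """Attend over memory slots, controlled by the user-item interaction."""
    joint = user * item                      # interaction vector (assumed form)
    logits = keys @ joint                    # one attention logit per memory slot
    weights = np.exp(logits - logits.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ memory, weights         # relation vector, attention weights

def lrml_score(user, item, relation):
    """Squared distance after translating the user by the relation vector."""
    return float(np.sum((user + relation - item) ** 2))

# Hypothetical embeddings: 8 dimensions, 4 memory slots.
rng = np.random.default_rng(0)
dim, slots = 8, 4
user, item = rng.normal(size=dim), rng.normal(size=dim)
keys, memory = rng.normal(size=(slots, dim)), rng.normal(size=(slots, dim))
relation, attention = latent_relation(user, item, keys, memory)
score = lrml_score(user, item, relation)
```

A lower score means the item sits closer to the translated user, so each pair gets its own relation vector instead of sharing a single push-pull geometry.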

How Have We Reacted To The COVID-19 Pandemic? Analyzing Changing Indian Emotions Through The Lens of Twitter

Aug 20, 2020
Rajdeep Mukherjee, Sriyash Poddar, Atharva Naik, Soham Dasgupta

Since its outbreak, the ongoing COVID-19 pandemic has caused unprecedented losses to human lives and economies around the world. As of 18th July 2020, the World Health Organization (WHO) has reported more than 13 million confirmed cases including close to 600,000 deaths across 216 countries and territories. Despite several government measures, India has gradually moved up the ranks to become the third worst-hit nation by the pandemic after the US and Brazil, thus causing widespread anxiety and fear among her citizens. As the majority of the world's population continues to remain confined to their homes, more and more people have started relying on social media platforms such as Twitter for expressing their feelings and attitudes towards various aspects of the pandemic. With rising concerns of mental well-being, it becomes imperative to analyze the dynamics of public affect in order to anticipate any potential threats and take precautionary measures. Since affective states of the human mind are more nuanced than meager binary sentiments, here we propose a deep learning-based system to identify people's emotions from their tweets. We achieve competitive results on two benchmark datasets for multi-label emotion classification. We then use our system to analyze the evolution of emotional responses among Indians as the pandemic continues to spread its wings. We also study the development of salient factors contributing towards the changes in attitudes over time. Finally, we discuss directions to further improve our work and hope that our analysis can aid in better public health monitoring.

* 4 pages, submitted to CODS-COMAD 2021 

StackGenVis: Alignment of Data, Algorithms, and Models for Stacking Ensemble Learning Using Performance Metrics

May 04, 2020
Angelos Chatzimparmpas, Rafael M. Martins, Kostiantyn Kucher, Andreas Kerren

In machine learning (ML), ensemble methods such as bagging, boosting, and stacking are widely established approaches that regularly achieve top-notch predictive performance. Stacking (also called "stacked generalization") is an ensemble method that combines heterogeneous base models, arranged in at least one layer, and then employs another metamodel to summarize the predictions of those models. Although it may be a highly effective approach for increasing the predictive performance of ML, generating a stack of models from scratch can be a cumbersome trial-and-error process. This challenge stems from the enormous space of available solutions, with different sets of data instances and features that could be used for training, several algorithms to choose from, and instantiations of these algorithms (i.e., models) that perform differently according to diverse metrics. In this work, we present a knowledge generation model, which supports ensemble learning with the use of visualization, and a visual analytics system for stacked generalization. Our system, StackGenVis, assists users in dynamically managing data instances, selecting the most important features for a given data set, and choosing a set of top-performing and diverse algorithms. As a consequence, our proposed tool helps users to decide between distinct models and to reduce the complexity of the resulting stack by removing overpromising and underperforming models. The applicability and effectiveness of StackGenVis are demonstrated with two use cases: a real-world healthcare data set and a collection of data related to sentiment/stance detection in texts. Finally, the tool has been evaluated through interviews with three ML experts.

* This manuscript is currently under review 

Snippext: Semi-supervised Opinion Mining with Augmented Data

Feb 07, 2020
Zhengjie Miao, Yuliang Li, Xiaolan Wang, Wang-Chiew Tan

Online services are interested in solutions to opinion mining, which is the problem of extracting aspects, opinions, and sentiments from text. One method to mine opinions is to leverage the recent success of pre-trained language models which can be fine-tuned to obtain high-quality extractions from reviews. However, fine-tuning language models still requires a non-trivial amount of training data. In this paper, we study the problem of how to significantly reduce the amount of labeled training data required in fine-tuning language models for opinion mining. We describe Snippext, an opinion mining system developed over a language model that is fine-tuned through semi-supervised learning with augmented data. A novelty of Snippext is its clever use of a two-prong approach to achieve state-of-the-art (SOTA) performance with little labeled training data through: (1) data augmentation to automatically generate more labeled training data from existing ones, and (2) a semi-supervised learning technique to leverage the massive amount of unlabeled data in addition to the (limited amount of) labeled data. We show with extensive experiments that Snippext performs comparably and can even exceed previous SOTA results on several opinion mining tasks with only half the training data required. Furthermore, it achieves new SOTA results when all training data are leveraged. By comparison to a baseline pipeline, we found that Snippext extracts significantly more fine-grained opinions which enable new opportunities of downstream applications.

* Accepted to WWW 2020 

Micro-expression detection in long videos using optical flow and recurrent neural networks

Mar 26, 2019
Michiel Verburg, Vlado Menkovski

Facial micro-expressions are subtle and involuntary expressions that can reveal concealed emotions. Micro-expressions are an invaluable source of information in application domains such as lie detection, mental health, sentiment analysis and more. One of the biggest challenges in this field of research is the small amount of available spontaneous micro-expression data. However, spontaneous data collection is burdened by time-consuming and expensive annotation. Hence, methods are needed which can reduce the amount of data that annotators have to review. This paper presents a novel micro-expression spotting method using a recurrent neural network (RNN) on optical flow features. We extract Histogram of Oriented Optical Flow (HOOF) features to encode the temporal changes in selected face regions. Finally, the RNN spots short intervals which are likely to contain occurrences of relevant facial micro-movements. The proposed method is evaluated on the SAMM database. Any chance of subject bias is eliminated by training the RNN using Leave-One-Subject-Out cross-validation. Comparing the spotted intervals with the labeled data shows that the method produced 1569 false positives while obtaining a recall of 0.4654. The initial results show that the proposed method would reduce the video length by a factor of 3.5, while still retaining almost half of the relevant micro-movements. Lastly, as the model gets more data, it becomes better at detecting intervals, which makes the proposed method suitable for supporting the annotation process.

* 6 pages, 6 figures and 1 table 
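
As a rough illustration of the HOOF features mentioned above, the sketch below bins optical-flow vectors by direction, weights each vector by its magnitude, and normalises the histogram. It is a simplified variant that bins over the full angle range; the bin count, the function name, and the input layout (height, width, 2) are assumptions, and computing the flow itself and the RNN are not shown.

```python
import numpy as np

def hoof(flow, n_bins=8):
    """Magnitude-weighted, normalised histogram of flow directions.

    flow: array of shape (H, W, 2) holding horizontal and vertical
          displacement per pixel (hypothetical layout).
    """
    u, v = flow[..., 0].ravel(), flow[..., 1].ravel()
    angle = np.arctan2(v, u)        # direction in [-pi, pi]
    magnitude = np.hypot(u, v)      # vector length used as the bin weight
    hist, _ = np.histogram(angle, bins=n_bins,
                           range=(-np.pi, np.pi), weights=magnitude)
    total = hist.sum()
    return hist / total if total > 0 else hist

# Hypothetical face region where every pixel moves right and slightly along +v.
flow = np.zeros((4, 4, 2))
flow[..., 0] = 1.0
flow[..., 1] = 0.5
features = hoof(flow)
```

Computing one such histogram per frame and face region yields a sequence of fixed-length vectors that a recurrent model can consume.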

Active Learning for Crowd-Sourced Databases

Dec 20, 2014
Barzan Mozafari, Purnamrita Sarkar, Michael J. Franklin, Michael I. Jordan, Samuel Madden

Crowd-sourcing has become a popular means of acquiring labeled data for a wide variety of tasks where humans are more accurate than computers, e.g., labeling images, matching objects, or analyzing sentiment. However, relying solely on the crowd is often impractical even for data sets with thousands of items, due to time and cost constraints of acquiring human input (which cost pennies and minutes per label). In this paper, we propose algorithms for integrating machine learning into crowd-sourced databases, with the goal of allowing crowd-sourcing applications to scale, i.e., to handle larger datasets at lower costs. The key observation is that, in many of the above tasks, humans and machine learning algorithms can be complementary, as humans are often more accurate but slow and expensive, while algorithms are usually less accurate, but faster and cheaper. Based on this observation, we present two new active learning algorithms to combine humans and algorithms together in a crowd-sourced database. Our algorithms are based on the theory of non-parametric bootstrap, which makes our results applicable to a broad class of machine learning models. Our results, on three real-life datasets collected with Amazon's Mechanical Turk, and on 15 well-known UCI data sets, show that our methods on average ask humans to label one to two orders of magnitude fewer items to achieve the same accuracy as a baseline that labels random images, and two to eight times fewer questions than previous active learning schemes.

* A shorter version of this manuscript has been published in Proceedings of Very Large Data Bases 2015, entitled "Scaling Up Crowd-Sourcing to Very Large Datasets: A Case for Active Learning" 
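
One way to picture bootstrap-based active learning is to train the same model on several resamples of the labeled data and send the items on which those models disagree to the crowd. The sketch below assumes binary labels and uses a nearest-centroid classifier as a stand-in model; the classifier, the function names, and the disagreement score `p * (1 - p)` are illustrative assumptions, not the algorithms from the paper.

```python
import numpy as np

def fit_centroids(x, y):
    """Stand-in classifier: one mean vector per class."""
    return {c: x[y == c].mean(axis=0) for c in np.unique(y)}

def predict(centroids, x):
    """Assign each row of x to the class with the nearest centroid."""
    classes = np.array(sorted(centroids))
    dist = np.stack([np.linalg.norm(x - centroids[c], axis=1) for c in classes],
                    axis=1)
    return classes[dist.argmin(axis=1)]

def bootstrap_uncertainty(x_lab, y_lab, x_unl, n_boot, rng):
    """Disagreement of bootstrap models on each unlabeled item (labels 0/1)."""
    votes = np.zeros((n_boot, len(x_unl)))
    for b in range(n_boot):
        idx = rng.integers(0, len(x_lab), size=len(x_lab))  # resample with replacement
        votes[b] = predict(fit_centroids(x_lab[idx], y_lab[idx]), x_unl)
    p = votes.mean(axis=0)   # fraction of models voting for class 1
    return p * (1.0 - p)     # 0 when unanimous, at most 0.25

# Hypothetical data: two well-separated clusters and three unlabeled items.
rng = np.random.default_rng(0)
x_lab = np.vstack([rng.normal(0.0, 0.5, size=(20, 2)),
                   rng.normal(4.0, 0.5, size=(20, 2))])
y_lab = np.array([0] * 20 + [1] * 20)
x_unl = np.array([[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]])
uncertainty = bootstrap_uncertainty(x_lab, y_lab, x_unl, n_boot=50, rng=rng)
to_label = int(np.argmax(uncertainty))  # item to send to the crowd next
```

Because the score depends only on repeated training and prediction, the same loop works with any classifier, which matches the abstract's point that the bootstrap makes the approach applicable to a broad class of models.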

Happiness is assortative in online social networks

Mar 03, 2011
Johan Bollen, Bruno Goncalves, Guangchen Ruan, Huina Mao

Social networks tend to disproportionally favor connections between individuals with either similar or dissimilar characteristics. This propensity, referred to as assortative mixing or homophily, is expressed as the correlation between attribute values of nearest neighbour vertices in a graph. Recent results indicate that beyond demographic features such as age, sex and race, even psychological states such as "loneliness" can be assortative in a social network. In spite of the increasing societal importance of online social networks it is unknown whether assortative mixing of psychological states takes place in situations where social ties are mediated solely by online networking services in the absence of physical contact. Here, we show that general happiness or Subjective Well-Being (SWB) of Twitter users, as measured from a 6 month record of their individual tweets, is indeed assortative across the Twitter social network. To our knowledge this is the first result that shows assortative mixing in online networks at the level of SWB. Our results imply that online social networks may be equally subject to the social mechanisms that cause assortative mixing in real social networks and that such assortative mixing takes place at the level of SWB. Given the increasing prevalence of online social networks, their propensity to connect users with similar levels of SWB may be an important instrument in better understanding how both positive and negative sentiments spread through online social ties. Future research may focus on how event-specific mood states can propagate and influence user behavior in "real life".

* Artificial Life 17(3), 237-251 (2011) 
* 17 pages, 9 figures 
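
The abstract defines assortativity as the correlation between attribute values of neighbouring vertices. A minimal sketch of that definition, assuming an undirected edge list and one numeric score per user (the data and the function name are hypothetical, and measuring SWB from tweets is not shown):

```python
import numpy as np

def attribute_assortativity(edges, values):
    """Pearson correlation of an attribute across the two ends of each edge.

    Each undirected edge is counted in both directions so that the
    measure does not depend on how the endpoints are ordered.
    """
    src = np.array([values[a] for a, b in edges] + [values[b] for a, b in edges])
    dst = np.array([values[b] for a, b in edges] + [values[a] for a, b in edges])
    return float(np.corrcoef(src, dst)[0, 1])

# Hypothetical well-being scores for four users.
swb = [1.0, 1.0, 5.0, 5.0]
like_with_like = attribute_assortativity([(0, 1), (2, 3)], swb)
opposites = attribute_assortativity([(0, 2), (1, 3)], swb)
```

Values near +1 indicate that connected users have similar scores, values near -1 that they have opposite scores, and values near 0 that ties are unrelated to the attribute.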
