Gerard de Melo

R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Modeling

Jul 02, 2021
Xiang Hu, Haitao Mi, Zujie Wen, Yafang Wang, Yi Su, Jing Zheng, Gerard de Melo

Human language understanding operates at multiple levels of granularity (e.g., words, phrases, and sentences) with increasing levels of abstraction that can be hierarchically combined. However, existing deep models with stacked layers do not explicitly model any sort of hierarchical process. This paper proposes a recursive Transformer model based on differentiable CKY-style binary trees to emulate the composition process. We extend the bidirectional language model pre-training objective to this architecture, attempting to predict each word given its left and right abstraction nodes. To scale up our approach, we also introduce an efficient pruned tree induction algorithm to enable encoding in just a linear number of composition steps. Experimental results on language modeling and unsupervised parsing show the effectiveness of our approach.
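
To make the idea of a differentiable CKY-style chart more concrete, here is a minimal sketch in plain Python. It is not the authors' model: `compose` and `score` are hypothetical toy stand-ins for learned components, and no pruning is applied, so the chart uses a cubic rather than linear number of composition steps.

```python
import math

def compose(left, right):
    """Toy stand-in for a learned composition function: element-wise mean."""
    return [(a + b) / 2.0 for a, b in zip(left, right)]

def score(left, right):
    """Toy stand-in for a learned split score: dot product of the children."""
    return sum(a * b for a, b in zip(left, right))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def soft_cky(leaves):
    """Fill a CKY-style chart; chart[i][j] represents the span leaves[i..j].

    Each cell is a softmax-weighted sum over all split points, which keeps
    the choice of tree structure soft (and hence differentiable in a real
    autograd framework). Returns the root vector and the number of
    pairwise compositions performed.
    """
    n = len(leaves)
    dim = len(leaves[0])
    chart = [[None] * n for _ in range(n)]
    for i, vec in enumerate(leaves):
        chart[i][i] = list(vec)
    steps = 0
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width - 1
            candidates, scores = [], []
            for k in range(i, j):  # split into [i..k] and [k+1..j]
                left, right = chart[i][k], chart[k + 1][j]
                candidates.append(compose(left, right))
                scores.append(score(left, right))
                steps += 1
            weights = softmax(scores)
            chart[i][j] = [
                sum(w * c[d] for w, c in zip(weights, candidates))
                for d in range(dim)
            ]
    return chart[0][n - 1], steps
```

For four leaves this unpruned chart performs 10 compositions, and the count grows cubically with sentence length, which is the cost that a pruned induction algorithm such as the one described in the abstract aims to avoid.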

* To be published in the proceedings of ACL-IJCNLP 2021

Guilt by Association: Emotion Intensities in Lexical Representations

Apr 18, 2021
Shahab Raji, Gerard de Melo

What do word vector representations reveal about the emotions associated with words? In this study, we consider the task of estimating word-level emotion intensity scores for specific emotions, exploring unsupervised, supervised, and finally a self-supervised method of extracting emotional associations from word vector representations. Overall, we find that word vectors carry substantial potential for inducing fine-grained emotion intensity scores, showing a far higher correlation with human ground truth ratings than achieved by state-of-the-art emotion lexicons.

Context-Aware Interaction Network for Question Matching

Apr 17, 2021
Zhe Hu, Zuohui Fu, Yu Yin, Gerard de Melo, Cheng Peng

Impressive milestones have been achieved in text matching by adopting a cross-attention mechanism to capture pertinent semantic connections between two sentences. However, these cross-attention mechanisms focus on word-level links between the two inputs, neglecting the importance of contextual information. We propose a context-aware interaction network (COIN) to properly align two sequences and infer their semantic relationship. Specifically, each interaction block includes (1) a context-aware cross-attention mechanism to effectively integrate contextual information, and (2) a gate fusion layer to flexibly interpolate aligned representations. We apply multiple stacked interaction blocks to produce alignments at different levels and gradually refine the attention results. Experiments on two question matching datasets and detailed analyses confirm the effectiveness of our model.
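
As a rough illustration of what a gate that interpolates two representations can look like, here is a minimal sketch in plain Python. The abstract does not give the exact parameterization, so the per-dimension sigmoid gate and the parameter names `w_orig`, `w_aligned`, and `bias` are assumptions made for this example only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gate_fusion(original, aligned, w_orig, w_aligned, bias):
    """Interpolate two equally sized vectors with a per-dimension gate.

    For each dimension, g = sigmoid(w_orig * x + w_aligned * a + bias) and
    the fused value is g * x + (1 - g) * a, so g near 1 keeps the original
    representation and g near 0 keeps the aligned one.
    """
    fused = []
    for x, a, wx, wa, b in zip(original, aligned, w_orig, w_aligned, bias):
        g = sigmoid(wx * x + wa * a + b)
        fused.append(g * x + (1.0 - g) * a)
    return fused
```

With all weights and biases set to zero the gate is exactly 0.5 and the output is the plain average of the two inputs; a trained gate would instead shift this balance per dimension.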

Faithfully Explainable Recommendation via Neural Logic Reasoning

Apr 16, 2021
Yaxin Zhu, Yikun Xian, Zuohui Fu, Gerard de Melo, Yongfeng Zhang

Knowledge graphs (KGs) have become increasingly important to endow modern recommender systems with the ability to generate traceable reasoning paths to explain the recommendation process. However, prior research rarely considers the faithfulness of the derived explanations to justify the decision making process. To the best of our knowledge, this is the first work that models and evaluates faithfully explainable recommendation under the framework of KG reasoning. Specifically, we propose neural logic reasoning for explainable recommendation (LOGER) by drawing on interpretable logical rules to guide the path reasoning process for explanation generation. We experiment on three large-scale datasets in the e-commerce domain, demonstrating the effectiveness of our method in delivering high-quality recommendations as well as ascertaining the faithfulness of the derived explanation.

* Accepted in NAACL 2021

Fast and Effective Biomedical Entity Linking Using a Dual Encoder

Mar 08, 2021
Rajarshi Bhowmik, Karl Stratos, Gerard de Melo

Biomedical entity linking is the task of identifying mentions of biomedical concepts in text documents and mapping them to canonical entities in a target thesaurus. Recent advancements in entity linking using BERT-based models follow a retrieve-and-rerank paradigm, where the candidate entities are first selected using a retriever model, and then the retrieved candidates are ranked by a reranker model. While this paradigm produces state-of-the-art results, such models are slow at both training and test time, as they can process only one mention at a time. To mitigate these issues, we propose a BERT-based dual encoder model that resolves multiple mentions in a document in one shot. We show that our proposed model is multiple times faster than existing BERT-based models while being competitive in accuracy for biomedical entity linking. Additionally, we modify our dual encoder model for end-to-end biomedical entity linking that performs both mention span detection and entity disambiguation, and it outperforms two recently proposed models.
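
The scoring step of a dual encoder can be sketched in a few lines of plain Python. This is only an illustration of the general technique, not the paper's implementation: the vectors are assumed to come from some mention encoder and entity encoder that are not shown, and the entity ids `ENT_A` and `ENT_B` are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def link_mentions(mention_vecs, entity_vecs):
    """Return, for each mention vector, the id of the highest-scoring entity.

    mention_vecs: list of vectors, one per mention in the document.
    entity_vecs:  dict mapping entity id -> precomputed entity vector.
    Because the two sides are encoded independently, the entity vectors can
    be computed once and reused for every mention.
    """
    links = []
    for m in mention_vecs:
        best_id = max(entity_vecs, key=lambda eid: dot(m, entity_vecs[eid]))
        links.append(best_id)
    return links

# Toy usage with hand-written two-dimensional vectors.
entities = {"ENT_A": [0.9, 0.1], "ENT_B": [0.2, 0.8]}
mentions = [[1.0, 0.0], [0.0, 1.0]]
links = link_mentions(mentions, entities)
```

In practice the loop would be a single matrix multiplication between a mention matrix and an entity matrix, which is what allows all mentions in a document to be resolved together.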

Assessing Emoji Use in Modern Text Processing Tools

Jan 02, 2021
Abu Awal Md Shoeb, Gerard de Melo

Emojis have become ubiquitous in digital communication, due to their visual appeal as well as their ability to vividly convey human emotion, among other factors. The growing prominence of emojis in social media and other instant messaging also leads to an increased need for systems and tools to operate on text containing emojis. In this study, we assess this support by considering test sets of tweets with emojis, based on which we perform a series of experiments investigating the ability of prominent NLP and text processing tools to adequately process them. In particular, we consider tokenization, part-of-speech tagging, as well as sentiment analysis. Our findings show that many tools still have notable shortcomings when operating on text containing emojis.
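
One kind of shortcoming can be reproduced with Python's standard `re` module alone. The example below is a small illustration and is not taken from the paper's test sets; the tweet text and the helper `tokenize_keep_symbols` are made up for this sketch.

```python
import re

tweet = "Great game tonight 😍🔥"

# A common quick tokenizer: runs of word characters only.
# Emojis are not word characters, so this pattern silently drops them.
word_tokens = re.findall(r"\w+", tweet)

def tokenize_keep_symbols(text):
    """Split text into runs of word characters or single non-space symbols."""
    return re.findall(r"\w+|[^\w\s]", text)

emoji_aware_tokens = tokenize_keep_symbols(tweet)
```

Here `word_tokens` contains only the three words, while `emoji_aware_tokens` also keeps each emoji as its own token. The second pattern is still simplistic: emojis built from several code points (for example, sequences joined with a zero-width joiner) would be split into pieces, so a production tokenizer would need dedicated handling.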

Interactive Question Clarification in Dialogue via Reinforcement Learning

Dec 17, 2020
Xiang Hu, Zujie Wen, Yafang Wang, Xiaolong Li, Gerard de Melo

Coping with ambiguous questions has been a perennial problem in real-world dialogue systems. Although clarification by asking questions is a common form of human interaction, it is hard to define appropriate questions to elicit more specific intents from a user. In this work, we propose a reinforcement model to clarify ambiguous questions by suggesting refinements of the original query. We first formulate a collection partitioning problem to select a set of labels enabling us to distinguish potential unambiguous intents. We list the chosen labels as intent phrases to the user for further confirmation. The selected label along with the original user query then serves as a refined query, for which a suitable response can more easily be identified. The model is trained using reinforcement learning with a deep policy network. We evaluate our model based on real-world user clicks and demonstrate significant improvements across several different experiments.

* COLING industry track

Cross-Domain Learning for Classifying Propaganda in Online Contents

Nov 22, 2020
Liqiang Wang, Xiaoyu Shen, Gerard de Melo, Gerhard Weikum

As news and social media exhibit an increasing amount of manipulative polarized content, detecting such propaganda has received attention as a new task for content analysis. Prior work has focused on supervised learning with training data from the same domain. However, as propaganda can be subtle and keeps evolving, manual identification and proper labeling are very demanding. As a consequence, training data is a major bottleneck. In this paper, we tackle this bottleneck and present an approach to leverage cross-domain learning, based on labeled documents and sentences from news and tweets, as well as political speeches with a clear difference in their degrees of being propagandistic. We devise informative features and build various classifiers for propaganda labeling, using cross-domain learning. Our experiments demonstrate the usefulness of this approach, and identify difficulties and limitations in various configurations of sources and targets for the transfer step. We further analyze the influence of various features, and characterize salient indicators of propaganda.

* TTO 2020

Cross-Domain Learning for Classifying Propaganda in Online Contents

Nov 13, 2020
Liqiang Wang, Xiaoyu Shen, Gerard de Melo, Gerhard Weikum

* TTO 2020

GitEvolve: Predicting the Evolution of GitHub Repositories

Oct 09, 2020
Honglu Zhou, Hareesh Ravi, Carlos M. Muniz, Vahid Azizi, Linda Ness, Gerard de Melo, Mubbasir Kapadia

Software development is becoming increasingly open and collaborative with the advent of platforms such as GitHub. Given its crucial role, there is a need to better understand and model the dynamics of GitHub as a social platform. Previous work has mostly considered the dynamics of traditional social networking sites like Twitter and Facebook. We propose GitEvolve, a system to predict the evolution of GitHub repositories and the different ways by which users interact with them. To this end, we develop an end-to-end multi-task sequential deep neural network that, given some seed events, simultaneously predicts which user group will interact next with a given repository, what the type of the interaction is, and when it happens. To facilitate learning, we use graph-based representation learning to encode relationships between repositories. We map users to groups by modelling common interests to better predict popularity and to generalize to unseen users during inference. We introduce an artificial event type to better model varying levels of activity of repositories in the dataset. The proposed multi-task architecture is generic and can be extended to model information diffusion in other social networks. In a series of experiments, we demonstrate the effectiveness of the proposed model, using multiple metrics and baselines. Qualitative analysis of the model's ability to predict popularity and forecast trends proves its applicability.
