"Text": models, code, and papers

Query-Based Named Entity Recognition

Aug 24, 2019
Yuxian Meng, Xiaoya Li, Zijun Sun, Jiwei Li

In this paper, we propose a new strategy for the task of named entity recognition (NER). We cast the task as a query-based machine reading comprehension task: e.g., the task of extracting entities with the label PER is formalized as answering the question "which person is mentioned in the text?". Such a strategy comes with the advantage that it solves the long-standing issue of handling overlapping or nested entities (the same token participating in more than one entity category) with sequence-labeling techniques for NER. Additionally, since the query encodes informative prior knowledge, this strategy facilitates the process of entity extraction, leading to better performance. We evaluate the proposed model on five widely used English and Chinese NER datasets: MSRA, Resume, OntoNotes, ACE04 and ACE05. The proposed model sets new SOTA results on all of these datasets.

* Work in progress 
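
The query-based formulation in the abstract above can be illustrated with a small data-preparation sketch. This is not the authors' code: the `QUERIES` table, the `to_mrc_examples` helper, and the ORG and LOC question wordings are assumptions made for illustration; only the PER question is taken from the abstract. It shows why nested entities stop being a problem: each label gets its own (query, context) example, so one token can be an answer under several labels.

```python
from typing import Dict, List, Tuple

# Hypothetical label-to-question table. Only the PER wording comes from the
# abstract; the ORG and LOC wordings are assumed for illustration.
QUERIES: Dict[str, str] = {
    "PER": "which person is mentioned in the text?",
    "ORG": "which organization is mentioned in the text?",
    "LOC": "which location is mentioned in the text?",
}

# An entity span: (start token index, end token index inclusive, label).
Span = Tuple[int, int, str]


def to_mrc_examples(tokens: List[str], spans: List[Span]) -> List[dict]:
    """Turn one NER-annotated sentence into one reading-comprehension
    example per entity label. Answers for different labels are stored
    separately, so overlapping or nested spans never compete for a token."""
    examples = []
    for label, query in QUERIES.items():
        answers = [(start, end) for start, end, lab in spans if lab == label]
        examples.append(
            {"label": label, "query": query, "context": tokens, "answers": answers}
        )
    return examples


if __name__ == "__main__":
    tokens = ["the", "University", "of", "Washington", "president"]
    # "Washington" (LOC) is nested inside "University of Washington" (ORG).
    spans = [(1, 3, "ORG"), (3, 3, "LOC")]
    for example in to_mrc_examples(tokens, spans):
        print(example["query"], example["answers"])
```

A labels with no matching span simply yields an empty answer list, which corresponds to an unanswerable question in the reading-comprehension setting.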


On the Robustness of Unsupervised and Semi-supervised Cross-lingual Word Embedding Learning

Aug 21, 2019
Yerai Doval, Jose Camacho-Collados, Luis Espinosa-Anke, Steven Schockaert

Cross-lingual word embeddings are vector representations of words in different languages where words with similar meaning are represented by similar vectors, regardless of the language. Recent developments which construct these embeddings by aligning monolingual spaces have shown that accurate alignments can be obtained with little or no supervision. However, the focus has been on a particular controlled scenario for evaluation, and there is no strong evidence on how current state-of-the-art systems would fare with noisy text or for language pairs with major linguistic differences. In this paper we present an extensive evaluation over multiple cross-lingual embedding models, analyzing their strengths and limitations with respect to different variables such as target language, training corpora and amount of supervision. Our conclusions put in doubt the view that high-quality cross-lingual embeddings can always be learned without much supervision.

* 13 pages, 2 figures, 7 tables 


Multimodal Fusion with Deep Neural Networks for Audio-Video Emotion Recognition

Jul 06, 2019
Juan D. S. Ortega, Mohammed Senoussaoui, Eric Granger, Marco Pedersoli, Patrick Cardinal, Alessandro L. Koerich

This paper presents a novel deep neural network (DNN) for multimodal fusion of audio, video and text modalities for emotion recognition. The proposed DNN architecture has independent and shared layers which aim to learn the representation for each modality, as well as the best combined representation to achieve the best prediction. Experimental results on the AVEC Sentiment Analysis in the Wild dataset indicate that the proposed DNN can achieve a higher Concordance Correlation Coefficient (CCC) than other state-of-the-art systems that perform early fusion of modalities at feature-level (i.e., concatenation) and late fusion at score-level (i.e., weighted average). The proposed DNN has achieved CCCs of 0.606, 0.534, and 0.170 on the development partition of the dataset for predicting arousal, valence and liking, respectively.
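
For readers unfamiliar with the metric reported above, here is a minimal sketch of the Concordance Correlation Coefficient in its standard form, CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), using population statistics. The function name `concordance_cc` is a hypothetical choice, and the paper's own evaluation script may differ in details such as how sequences are pooled.

```python
from typing import Sequence


def concordance_cc(pred: Sequence[float], gold: Sequence[float]) -> float:
    """Concordance Correlation Coefficient between predictions and labels.

    Unlike Pearson correlation, CCC also penalizes differences in mean
    and scale, so a perfectly correlated but shifted prediction scores
    below 1.
    """
    if len(pred) != len(gold) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(pred)
    mean_p = sum(pred) / n
    mean_g = sum(gold) / n
    # Population (divide-by-n) variance and covariance.
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    var_g = sum((g - mean_g) ** 2 for g in gold) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold)) / n
    denom = var_p + var_g + (mean_p - mean_g) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for two identical constant sequences")
    return 2 * cov / denom


if __name__ == "__main__":
    print(concordance_cc([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # perfect agreement
    print(concordance_cc([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))  # correlated but shifted
```

The second call illustrates the design of the metric: the two sequences have a Pearson correlation of 1, yet the constant offset lowers the CCC.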


Attention model for articulatory features detection

Jul 02, 2019
Ievgen Karaulov, Dmytro Tkanov

Articulatory distinctive features, as well as phonetic transcription, play an important role in speech-related tasks: computer-assisted pronunciation training, text-to-speech conversion (TTS), studying speech production mechanisms, and speech recognition for low-resourced languages. End-to-end approaches to speech-related tasks have gained a lot of traction in recent years. We apply the Listen, Attend and Spell (LAS) architecture to phone recognition on a small training set, such as TIMIT. We also introduce a novel decoding technique that allows training detectors for manner and place of articulation end-to-end using attention models, and we explore joint phone recognition and articulatory feature detection in a multitask learning setting.

* Interspeech 2019, 5 pages, 2 figures 


Multi-task Learning for Multi-modal Emotion Recognition and Sentiment Analysis

May 14, 2019
Md Shad Akhtar, Dushyant Singh Chauhan, Deepanway Ghosal, Soujanya Poria, Asif Ekbal, Pushpak Bhattacharyya

Related tasks often depend on each other and perform better when solved in a joint framework. In this paper, we present a deep multi-task learning framework that jointly performs both sentiment and emotion analysis. The multi-modal inputs (i.e., text, acoustic and visual frames) of a video convey diverse and distinctive information, and usually do not contribute equally to the decision making. We propose a context-level inter-modal attention framework for simultaneously predicting the sentiment and expressed emotions of an utterance. We evaluate our proposed approach on the CMU-MOSEI dataset for multi-modal sentiment and emotion analysis. Evaluation results suggest that the multi-task learning framework offers an improvement over the single-task framework. The proposed approach reports new state-of-the-art performance for both sentiment analysis and emotion analysis.

* Accepted for publication in NAACL:HLT-2019 


Public vs Media Opinion on Robots

May 05, 2019
Alireza Javaheri, Navid Moghadamnejad, Hamidreza Keshavarz, Ehsan Javaheri, Chelsea Dobbins, Elaheh Momeni, Reza Rawassizadeh

The fast proliferation of robots in people's everyday lives during recent years calls for a profound examination of public consensus, which is the ultimate determinant of the future of this industry. This paper investigates text corpora, consisting of posts on Twitter, Google News, Bing News, and Kickstarter, over an 8-year period to quantify the public and media opinion about this emerging technology. Results demonstrate that the news platforms and the public take an overall positive position on robots. However, there is a deviation between news coverage and people's attitude. Among various robot types, sex robots raise the fiercest debate. In addition, our evaluation reveals that the public and news media conceptualization of robotics has changed over recent years. More specifically, a shift from solely industrial-purpose machines towards more social, assistive, and multi-purpose gadgets is visible.

* 15 pages, 6 figures, 4 tables 


A human-editable Sign Language representation for software editing---and a writing system?

Nov 05, 2018
Michael Filhol

To equip SL with software properly, we need an input system to represent and manipulate signed contents in the same way that everyday software allows us to process written text. Refuting the claim that video is a good enough medium to serve the purpose, we propose to build a representation that is editable, queryable, synthesisable and user-friendly; we define those terms upfront. The issue being functionally and conceptually linked to that of writing, we study existing writing systems, namely those in use for vocal languages, those designed and proposed for SLs, and more spontaneous ways in which SL users put their language in writing. Observing each paradigm in turn, we move on to propose a new approach to satisfy our goals of integration in software. We finally open the prospect of our proposition being used outside of this restricted scope, as a writing system in itself, and compare its properties to the other writing systems presented.


Crime Event Embedding with Unsupervised Feature Selection

Nov 04, 2018
Shixiang Zhu, Yao Xie

We present a novel event embedding algorithm for crime data that can jointly capture time, location, and the complex free-text component of each event. The embedding is achieved by regularized Restricted Boltzmann Machines (RBMs), and we introduce a new way to regularize by imposing an $\ell_1$ penalty on the conditional distributions of the observed variables of RBMs. This choice of regularization performs feature selection and also leads to efficient computation, since the gradient can be computed in closed form. The feature selection forces the embedding to be based on the most important keywords, which capture the common modus operandi (M.O.) in crime series. Using numerical experiments on a large-scale crime dataset, we show that our regularized RBMs can achieve better event embedding and that the selected features are highly interpretable to humans.


Modeling Empathy and Distress in Reaction to News Stories

Aug 30, 2018
Sven Buechel, Anneke Buffone, Barry Slaff, Lyle Ungar, João Sedoc

Computational detection and understanding of empathy is an important factor in advancing human-computer interaction. Yet to date, text-based empathy prediction has the following major limitations: It underestimates the psychological complexity of the phenomenon, adheres to a weak notion of ground truth where empathic states are ascribed by third parties, and lacks a shared corpus. In contrast, this contribution presents the first publicly available gold standard for empathy prediction. It is constructed using a novel annotation methodology which reliably captures empathy assessments by the writer of a statement using multi-item scales. This is also the first computational work distinguishing between multiple forms of empathy, empathic concern, and personal distress, as recognized throughout psychology. Finally, we present experimental results for three different predictive models, of which a CNN performs the best.

* To appear at EMNLP 2018 


A Skeleton-Based Model for Promoting Coherence Among Sentences in Narrative Story Generation

Aug 27, 2018
Jingjing Xu, Xuancheng Ren, Yi Zhang, Qi Zeng, Xiaoyan Cai, Xu Sun

Narrative story generation is a challenging problem because it demands that the generated sentences have tight semantic connections, a requirement that has not been well studied by most existing generative models. To address this problem, we propose a skeleton-based model to promote the coherence of generated stories. Different from traditional models that generate a complete sentence at a stroke, the proposed model first generates the most critical phrases, called the skeleton, and then expands the skeleton to a complete and fluent sentence. The skeleton is not manually defined, but learned by a reinforcement learning method. Compared to the state-of-the-art models, our skeleton-based model can generate significantly more coherent text according to human evaluation and automatic evaluation. The G-score is improved by 20.1% in the human evaluation. The code is available at

* Accepted by EMNLP 2018 
