Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Text": models, code, and papers

Neural embeddings for metaphor detection in a corpus of Greek texts

Feb 10, 2019
Eirini Florou, Konstantinos Perifanos, Dionysis Goutsos

One of the major challenges that NLP faces is metaphor detection, especially by automatic means, a task that becomes even more difficult for languages lacking in linguistic resources and tools. Our purpose is the automatic differentiation between literal and metaphorical meaning in authentic non-annotated phrases from the Corpus of Greek Texts by means of computational methods of machine learning. For this purpose the theoretical background of distributional semantics is discussed and employed. Distributional Semantics Theory develops concepts and methods for the quantification and classification of semantic similarities displayed by linguistic elements in large amounts of linguistic data according to their distributional properties. In accordance with this model, the approach followed in the thesis takes into account the linguistic context for the computation of the distributional representation of phrases in geometrical space, as well as for their comparison with the distributional representations of other phrases, whose function in speech is already "known" with the objective to reach conclusions about their literal or metaphorical function in the specific linguistic context. This procedure aims at dealing with the lack of linguistic resources for the Greek language, as the almost impossible up to now semantic comparison between "phrases", takes the form of an arithmetical comparison of their distributional representations in geometrical space.

* IISA 2018 - The 9th International Conference on Information, Intelligence, Systems and Applications 

  Access Paper or Ask Questions

Semi-supervised Text Regression with Conditional Generative Adversarial Networks

Oct 02, 2018
Tao Li, Xudong Liu, Shihan Su

Enormous online textual information provides intriguing opportunities for understandings of social and economic semantics. In this paper, we propose a novel text regression model based on a conditional generative adversarial network (GAN), with an attempt to associate textual data and social outcomes in a semi-supervised manner. Besides promising potential of predicting capabilities, our superiorities are twofold: (i) the model works with unbalanced datasets of limited labelled data, which align with real-world scenarios; and (ii) predictions are obtained by an end-to-end framework, without explicitly selecting high-level representations. Finally we point out related datasets for experiments and future research directions.

* submitted to the 3rd International Workshop on Application of Big Data for Computational Social Science of the 2018 IEEE International Conference on Big Data (BigData 2018), Seattle, Washington, USA, December 10-13, 2018 

  Access Paper or Ask Questions

Bleaching Text: Abstract Features for Cross-lingual Gender Prediction

May 08, 2018
Rob van der Goot, Nikola Ljubešić, Ian Matroos, Malvina Nissim, Barbara Plank

Gender prediction has typically focused on lexical and social network features, yielding good performance, but making systems highly language-, topic-, and platform-dependent. Cross-lingual embeddings circumvent some of these limitations, but capture gender-specific style less. We propose an alternative: bleaching text, i.e., transforming lexical strings into more abstract features. This study provides evidence that such features allow for better transfer across languages. Moreover, we present a first study on the ability of humans to perform cross-lingual gender prediction. We find that human predictive power proves similar to that of our bleached models, and both perform better than lexical models.

* Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics 

  Access Paper or Ask Questions

Research Project: Text Engineering Tool for Ontological Scientometry

Jan 08, 2016
Rustam Tagiew

The number of scientific papers grows exponentially in many disciplines. The share of online available papers grows as well. At the same time, the period of time for a paper to loose at chance to be cited anymore shortens. The decay of the citing rate shows similarity to ultradiffusional processes as for other online contents in social networks. The distribution of papers per author shows similarity to the distribution of posts per user in social networks. The rate of uncited papers for online available papers grows while some papers 'go viral' in terms of being cited. Summarized, the practice of scientific publishing moves towards the domain of social networks. The goal of this project is to create a text engineering tool, which can semi-automatically categorize a paper according to its type of contribution and extract relationships between them into an ontological database. Semi-automatic categorization means that the mistakes made by automatic pre-categorization and relationship-extraction will be corrected through a wikipedia-like front-end by volunteers from general public. This tool should not only help researchers and the general public to find relevant supplementary material and peers faster, but also provide more information for research funding agencies.

* 5 pages, 2 figure 

  Access Paper or Ask Questions

Entities, Dates, and Languages: Zero-Shot on Historical Texts with T0

Apr 11, 2022
Francesco De Toni, Christopher Akiki, Javier de la Rosa, Clémentine Fourrier, Enrique Manjavacas, Stefan Schweter, Daniel van Strien

In this work, we explore whether the recently demonstrated zero-shot abilities of the T0 model extend to Named Entity Recognition for out-of-distribution languages and time periods. Using a historical newspaper corpus in 3 languages as test-bed, we use prompts to extract possible named entities. Our results show that a naive approach for prompt-based zero-shot multilingual Named Entity Recognition is error-prone, but highlights the potential of such an approach for historical languages lacking labeled datasets. Moreover, we also find that T0-like models can be probed to predict the publication date and language of a document, which could be very relevant for the study of historical texts.

  Access Paper or Ask Questions

Using Database Rule for Weak Supervised Text-to-SQL Generation

Jul 31, 2019
Tong Guo, Huilin Gao

We present a simple way to do the task of text-to-SQL problem with weak supervision. We call it Rule-SQL. Given the question and the answer from the database table without the SQL logic form, Rule-SQL use the rules based on table column names and question string for the SQL exploration first and then use the explored SQL for supervised training. We design several rules for reducing the exploration search space. For the deep model, we leverage BERT for the representation layer and separate the model to SELECT, AGG and WHERE parts. The experiment result on WikiSQL outperforms the strong baseline of full supervision and is comparable to the start-of-the-art weak supervised mothods.

  Access Paper or Ask Questions

Learning from Video and Text via Large-Scale Discriminative Clustering

Jul 27, 2017
Antoine Miech, Jean-Baptiste Alayrac, Piotr Bojanowski, Ivan Laptev, Josef Sivic

Discriminative clustering has been successfully applied to a number of weakly-supervised learning tasks. Such applications include person and action recognition, text-to-video alignment, object co-segmentation and colocalization in videos and images. One drawback of discriminative clustering, however, is its limited scalability. We address this issue and propose an online optimization algorithm based on the Block-Coordinate Frank-Wolfe algorithm. We apply the proposed method to the problem of weakly supervised learning of actions and actors from movies together with corresponding movie scripts. The scaling up of the learning problem to 66 feature length movies enables us to significantly improve weakly supervised action recognition.

* To appear in ICCV 2017 

  Access Paper or Ask Questions

Text Classification for Predicting Multi-level Product Categories

Sep 02, 2021
Hadi Jahanshahi, Ozan Ozyegen, Mucahit Cevik, Beste Bulut, Deniz Yigit, Fahrettin F. Gonen, Ayşe Başar

In an online shopping platform, a detailed classification of the products facilitates user navigation. It also helps online retailers keep track of the price fluctuations in a certain industry or special discounts on a specific product category. Moreover, an automated classification system may help to pinpoint incorrect or subjective categories suggested by an operator. In this study, we focus on product title classification of the grocery products. We perform a comprehensive comparison of six different text classification models to establish a strong baseline for this task, which involves testing both traditional and recent machine learning methods. In our experiments, we investigate the generalizability of the trained models to the products of other online retailers, the dynamic masking of infeasible subcategories for pretrained language models, and the benefits of incorporating product titles in multiple languages. Our numerical results indicate that dynamic masking of subcategories is effective in improving prediction accuracy. In addition, we observe that using bilingual product titles is generally beneficial, and neural network-based models perform significantly better than SVM and XGBoost models. Lastly, we investigate the reasons for the misclassified products and propose future research directions to further enhance the prediction models.

* CASCON'21; 31st Annual International Conference on Computer Science and Software Engineering; Nov 22-26, 2021; Toronto, Canada} 

  Access Paper or Ask Questions

More Identifiable yet Equally Performant Transformers for Text Classification

Jun 02, 2021
Rishabh Bhardwaj, Navonil Majumder, Soujanya Poria, Eduard Hovy

Interpretability is an important aspect of the trustworthiness of a model's predictions. Transformer's predictions are widely explained by the attention weights, i.e., a probability distribution generated at its self-attention unit (head). Current empirical studies provide shreds of evidence that attention weights are not explanations by proving that they are not unique. A recent study showed theoretical justifications to this observation by proving the non-identifiability of attention weights. For a given input to a head and its output, if the attention weights generated in it are unique, we call the weights identifiable. In this work, we provide deeper theoretical analysis and empirical observations on the identifiability of attention weights. Ignored in the previous works, we find the attention weights are more identifiable than we currently perceive by uncovering the hidden role of the key vector. However, the weights are still prone to be non-unique attentions that make them unfit for interpretation. To tackle this issue, we provide a variant of the encoder layer that decouples the relationship between key and value vector and provides identifiable weights up to the desired length of the input. We prove the applicability of such variations by providing empirical justifications on varied text classification tasks. The implementations are available at

* ACL 2021 
* ACL 2021 

  Access Paper or Ask Questions