"Text": models, code, and papers

Reading Comprehension in Czech via Machine Translation and Cross-lingual Transfer

Jul 03, 2020
Kateřina Macková, Milan Straka

Reading comprehension is a well-studied task, with huge training datasets in English. This work focuses on building reading comprehension systems for Czech, without requiring any manually annotated Czech training data. First of all, we automatically translated the SQuAD 1.1 and SQuAD 2.0 datasets to Czech to create training and development data, which we release. We then trained and evaluated several BERT and XLM-RoBERTa baseline models. However, our main focus lies in cross-lingual transfer models. We report that an XLM-RoBERTa model trained on English data and evaluated on Czech achieves very competitive performance, only approximately 2 percentage points worse than a model trained on the translated Czech data. This result is extremely good, considering the fact that the model has not seen any Czech data during training. The cross-lingual transfer approach is very flexible and provides reading comprehension in any language for which we have enough monolingual raw texts.

* Accepted at TSD 2020, 23rd International Conference on Text, Speech and Dialogue 

The SIGMORPHON 2020 Shared Task on Unsupervised Morphological Paradigm Completion

May 28, 2020
Katharina Kann, Arya McCarthy, Garrett Nicolai, Mans Hulden

In this paper, we describe the findings of the SIGMORPHON 2020 shared task on unsupervised morphological paradigm completion (SIGMORPHON 2020 Task 2), a novel task in the field of inflectional morphology. Participants were asked to submit systems which take raw text and a list of lemmas as input, and output all inflected forms, i.e., the entire morphological paradigm, of each lemma. In order to simulate a realistic use case, we first released data for 5 development languages. However, systems were officially evaluated on 9 surprise languages, which were only revealed a few days before the submission deadline. We provided a modular baseline system, which is a pipeline of 4 components. 3 teams submitted a total of 7 systems, but, surprisingly, none of the submitted systems was able to improve over the baseline on average over all 9 test languages. Only on 3 languages did a submitted system obtain the best results. This shows that unsupervised morphological paradigm completion is still largely unsolved. We present an analysis here, so that this shared task will ground further research on the topic.


Domain-Guided Task Decomposition with Self-Training for Detecting Personal Events in Social Media

Apr 21, 2020
Payam Karisani, Joyce C. Ho, Eugene Agichtein

Mining social media content for tasks such as detecting personal experiences or events suffers from lexical sparsity, insufficient training data, and inventive lexicons. To reduce the burden of creating extensive labeled data and improve classification performance, we propose to perform these tasks in two steps: 1. Decomposing the task into domain-specific sub-tasks by identifying key concepts, thus utilizing human domain understanding; and 2. Combining the results of learners for each key concept using co-training to reduce the requirements for labeled training data. We empirically show the effectiveness and generality of our approach, Co-Decomp, using three representative social media mining tasks, namely Personal Health Mention detection, Crisis Report detection, and Adverse Drug Reaction monitoring. The experiments show that our model is able to outperform the state-of-the-art text classification models, including those using the recently introduced BERT model, when small amounts of training data are available.

* WWW 2020 

PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks

Apr 16, 2020
Wenwen Yu, Ning Lu, Xianbiao Qi, Ping Gong, Rong Xiao

Computer vision with state-of-the-art deep learning models has recently achieved huge success in the field of Optical Character Recognition (OCR), including text detection and recognition tasks. However, Key Information Extraction (KIE) from documents, the downstream task of OCR with a large number of real-world use scenarios, remains a challenge because documents not only have textual features extracted from OCR systems but also have semantic visual features that are not fully exploited and play a critical role in KIE. Too little work has been devoted to efficiently making full use of both the textual and visual features of documents. In this paper, we introduce PICK, a framework that is effective and robust in handling complex document layouts for KIE by combining graph learning with the graph convolution operation, yielding a richer semantic representation containing the textual and visual features and global layout without ambiguity. Extensive experiments on real-world datasets have been conducted to show that our method outperforms baseline methods by significant margins.

* The first two authors contributed equally to this work. 8 pages, 3 figures, 4 tables 

Generating Multilingual Voices Using Speaker Space Translation Based on Bilingual Speaker Data

Apr 10, 2020
Soumi Maiti, Erik Marchi, Alistair Conkie

We present progress towards bilingual Text-to-Speech which is able to transform a monolingual voice to speak a second language while preserving speaker voice quality. We demonstrate that a bilingual speaker embedding space contains a separate distribution for each language and that a simple transform in speaker space generated by the speaker embedding can be used to control the degree of accent of a synthetic voice in a language. The same transform can be applied even to monolingual speakers. In our experiments speaker data from an English-Spanish (Mexican) bilingual speaker was used, and the goal was to enable English speakers to speak Spanish and Spanish speakers to speak English. We found that the simple transform was sufficient to convert a voice from one language to the other with a high degree of naturalness. In one case the transformed voice outperformed a native language voice in listening tests. Experiments further indicated that the transform preserved many of the characteristics of the original voice. The degree of accent present can be controlled and naturalness is relatively consistent across a range of accent values.

* Accepted to IEEE ICASSP 2020 
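
The abstract above does not spell out the transform, so the following is only a minimal sketch under stated assumptions: that the "simple transform" is a translation along the difference between per-language centroids of speaker embeddings, and that a scalar `alpha` sets how far a voice is moved (a stand-in for the degree of accent). All function and parameter names are hypothetical, and the vectors are toy values rather than real speaker embeddings.

```python
from typing import List, Sequence

Vector = List[float]


def centroid(embeddings: Sequence[Sequence[float]]) -> Vector:
    """Return the component-wise mean of a non-empty set of embeddings."""
    if not embeddings:
        raise ValueError("need at least one embedding")
    count = len(embeddings)
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / count for i in range(dim)]


def language_offset(source: Sequence[Sequence[float]],
                    target: Sequence[Sequence[float]]) -> Vector:
    """Assumed transform: vector from the source-language centroid
    to the target-language centroid."""
    src, tgt = centroid(source), centroid(target)
    return [t - s for s, t in zip(src, tgt)]


def translate_speaker(embedding: Sequence[float],
                      offset: Sequence[float],
                      alpha: float) -> Vector:
    """Shift one speaker embedding along the offset.
    alpha = 0 leaves the voice unchanged; alpha = 1 applies the full shift
    (hypothetical accent-control knob)."""
    return [e + alpha * o for e, o in zip(embedding, offset)]


# Toy 2-D embeddings from one bilingual speaker (made-up numbers).
english = [[0.0, 1.0], [2.0, 1.0]]
spanish = [[4.0, 3.0], [6.0, 3.0]]

offset = language_offset(english, spanish)
halfway = translate_speaker([0.0, 0.0], offset, alpha=0.5)
```

In this sketch, intermediate values of `alpha` correspond to partially shifted voices, which is one way to read the abstract's claim that the degree of accent can be controlled.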

Igbo-English Machine Translation: An Evaluation Benchmark

Apr 01, 2020
Ignatius Ezeani, Paul Rayson, Ikechukwu Onyenwe, Chinedu Uchechukwu, Mark Hepple

Although researchers and practitioners are pushing the boundaries and enhancing the capacities of NLP tools and methods, work on African languages is lagging. Much of the focus is on well-resourced languages such as English, Japanese, German, French, Russian, and Mandarin Chinese. Over 97% of the world's 7000 languages, including African languages, are low-resourced for NLP, i.e. they have little or no data, tools, and techniques for NLP research. For instance, only 5 out of 2965 (0.19%) authors of full-text papers in the ACL Anthology, extracted from the 5 major conferences in 2018 (ACL, NAACL, EMNLP, COLING and CoNLL), are affiliated with African institutions. In this work, we discuss our effort toward building a standard machine translation benchmark dataset for Igbo, one of the 3 major Nigerian languages. Igbo is spoken by more than 50 million people globally, with over 50% of the speakers in southeastern Nigeria. Igbo is low-resourced, although there have been some efforts toward developing IgboNLP, such as part-of-speech tagging and diacritic restoration.

* 4 pages 

KaoKore: A Pre-modern Japanese Art Facial Expression Dataset

Feb 20, 2020
Yingtao Tian, Chikahiko Suzuki, Tarin Clanuwat, Mikel Bober-Irizar, Alex Lamb, Asanobu Kitamoto

From classifying handwritten digits to generating strings of text, the datasets which have received long-time focus from the machine learning community vary greatly in their subject matter. This has motivated a renewed interest in building datasets which are socially and culturally relevant, so that algorithmic research may have a more direct and immediate impact on society. One such area is in history and the humanities, where better and relevant machine learning models can accelerate research across various fields. To this end, newly released benchmarks and models have been proposed for transcribing historical Japanese cursive writing, yet for the field as a whole using machine learning for historical Japanese artworks still remains largely uncharted. To bridge this gap, in this work we propose a new dataset KaoKore which consists of faces extracted from pre-modern Japanese artwork. We demonstrate its value as both a dataset for image classification as well as a creative and artistic dataset, which we explore using generative models. Dataset available at

HGAT: Hierarchical Graph Attention Network for Fake News Detection

Feb 05, 2020
Yuxiang Ren, Jiawei Zhang

The explosive growth of fake news has eroded the credibility of media and governments. Fake news detection has become an urgent task. News articles along with other related components like news creators and news subjects can be modeled as a heterogeneous information network (HIN for short). In this paper, we focus on studying the HIN-based fake news detection problem. We propose a novel fake news detection framework, namely Hierarchical Graph Attention Network (HGAT), which employs a novel hierarchical attention mechanism to detect fake news by classifying news article nodes in the HIN. This method can effectively learn information from different types of related nodes through node-level and schema-level attention. Experiments with real-world fake news data show that our model can outperform text-based models and other network-based models. In addition, the experiments demonstrate the expandability and potential of HGAT for heterogeneous graph representation learning in the future.

Learning to Infer User Interface Attributes from Images

Dec 31, 2019
Philippe Schlattner, Pavol Bielik, Martin Vechev

We explore a new domain of learning to infer user interface attributes that helps developers automate the process of user interface implementation. Concretely, given an input image created by a designer, we learn to infer its implementation which, when rendered, looks visually the same as the input image. To achieve this, we take a black box rendering engine and a set of attributes it supports (e.g., colors, border radius, shadow or text properties), use it to generate a suitable synthetic training dataset, and then train specialized neural models to predict each of the attribute values. To improve pixel-level accuracy, we additionally use imitation learning to train a neural policy that refines the predicted attribute values by learning to compute the similarity of the original and rendered images in their attribute space, rather than based on the difference of pixel values. We instantiate our approach on the task of inferring Android Button attribute values and achieve 92.5% accuracy on a dataset consisting of real-world Google Play Store applications.

Document Sub-structure in Neural Machine Translation

Dec 13, 2019
Radina Dobreva, Jie Zhou, Rachel Bawden

Current approaches to machine translation (MT) either translate sentences in isolation, disregarding the context they appear in, or model context on the level of the full document, without a notion of any internal structure the document may have. In this work we consider the fact that documents are rarely homogeneous blocks of text, but rather consist of parts covering different topics. Some documents, e.g. biographies and encyclopedia entries, have highly predictable, regular structures in which sections are characterised by different topics. We draw inspiration from Louis and Webber (2014), who use this information to improve MT, and transfer their proposal into the framework of neural MT. We compare two different methods of including information about the topic of the section within which each sentence is found: one using side constraints and the other using a cache-based model. We create and release the data on which we run our experiments: parallel corpora for three language pairs (Chinese-English, French-English, Bulgarian-English) from Wikipedia biographies, preserving the boundaries of sections within the articles.
