Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Topic": models, code, and papers

GeoTextTagger: High-Precision Location Tagging of Textual Documents using a Natural Language Processing Approach

Jan 22, 2016
Shawn Brunsting, Hans De Sterck, Remco Dolman, Teun van Sprundel

Location tagging, also known as geotagging or geolocation, is the process of assigning geographical coordinates to input data. In this paper we present an algorithm for location tagging of textual documents. Our approach makes use of previous work in natural language processing by using a state-of-the-art part-of-speech tagger and named entity recognizer to find blocks of text which may refer to locations. A knowledge base (OpenStreatMap) is then used to find a list of possible locations for each block. Finally, one location is chosen for each block by assigning distance-based scores to each location and repeatedly selecting the location and block with the best score. We tested our geolocation algorithm with Wikipedia articles about topics with a well-defined geographical location that are geotagged by the articles' authors, where classification approaches have achieved median errors as low as 11 km, with attainable accuracy limited by the class size. Our approach achieved a 10th percentile error of 490 metres and median error of 54 kilometres on the Wikipedia dataset we used. When considering the five location tags with the greatest scores, 50% of articles were assigned at least one tag within 8.5 kilometres of the article's author-assigned true location. We also tested our approach on Twitter messages that are tagged with the location from which the message was sent. Twitter texts are challenging because they are short and unstructured and often do not contain words referring to the location they were sent from, but we obtain potentially useful results. We explain how we use the Spark framework for data analytics to collect and process our test data. In general, classification-based approaches for location tagging may be reaching their upper accuracy limit, but our precision-focused approach has high accuracy for some texts and shows significant potential for improvement overall.

  Access Paper or Ask Questions

Transfer Learning for Video Recognition with Scarce Training Data for Deep Convolutional Neural Network

Jun 15, 2015
Yu-Chuan Su, Tzu-Hsuan Chiu, Chun-Yen Yeh, Hsin-Fu Huang, Winston H. Hsu

Unconstrained video recognition and Deep Convolution Network (DCN) are two active topics in computer vision recently. In this work, we apply DCNs as frame-based recognizers for video recognition. Our preliminary studies, however, show that video corpora with complete ground truth are usually not large and diverse enough to learn a robust model. The networks trained directly on the video data set suffer from significant overfitting and have poor recognition rate on the test set. The same lack-of-training-sample problem limits the usage of deep models on a wide range of computer vision problems where obtaining training data are difficult. To overcome the problem, we perform transfer learning from images to videos to utilize the knowledge in the weakly labeled image corpus for video recognition. The image corpus help to learn important visual patterns for natural images, while these patterns are ignored by models trained only on the video corpus. Therefore, the resultant networks have better generalizability and better recognition rate. We show that by means of transfer learning from image to video, we can learn a frame-based recognizer with only 4k videos. Because the image corpus is weakly labeled, the entire learning process requires only 4k annotated instances, which is far less than the million scale image data sets required by previous works. The same approach may be applied to other visual recognition tasks where only scarce training data is available, and it improves the applicability of DCNs in various computer vision problems. Our experiments also reveal the correlation between meta-parameters and the performance of DCNs, given the properties of the target problem and data. These results lead to a heuristic for meta-parameter selection for future researches, which does not rely on the time consuming meta-parameter search.

  Access Paper or Ask Questions

General factorization framework for context-aware recommendations

May 19, 2015
Balázs Hidasi, Domonkos Tikk

Context-aware recommendation algorithms focus on refining recommendations by considering additional information, available to the system. This topic has gained a lot of attention recently. Among others, several factorization methods were proposed to solve the problem, although most of them assume explicit feedback which strongly limits their real-world applicability. While these algorithms apply various loss functions and optimization strategies, the preference modeling under context is less explored due to the lack of tools allowing for easy experimentation with various models. As context dimensions are introduced beyond users and items, the space of possible preference models and the importance of proper modeling largely increases. In this paper we propose a General Factorization Framework (GFF), a single flexible algorithm that takes the preference model as an input and computes latent feature matrices for the input dimensions. GFF allows us to easily experiment with various linear models on any context-aware recommendation task, be it explicit or implicit feedback based. The scaling properties makes it usable under real life circumstances as well. We demonstrate the framework's potential by exploring various preference models on a 4-dimensional context-aware problem with contexts that are available for almost any real life datasets. We show in our experiments -- performed on five real life, implicit feedback datasets -- that proper preference modelling significantly increases recommendation accuracy, and previously unused models outperform the traditional ones. Novel models in GFF also outperform state-of-the-art factorization algorithms. We also extend the method to be fully compliant to the Multidimensional Dataspace Model, one of the most extensive data models of context-enriched data. Extended GFF allows the seamless incorporation of information into the fac[truncated]

* The final publication is available at Springer via Data Mining and Knowledge Discovery, 2015 

  Access Paper or Ask Questions

Controlling the Risk of Conversational Search via Reinforcement Learning

Jan 15, 2021
Zhenduo Wang, Qingyao Ai

Users often formulate their search queries with immature language without well-developed keywords and complete structures. Such queries fail to express their true information needs and raise ambiguity as fragmental language often yield various interpretations and aspects. This gives search engines a hard time processing and understanding the query, and eventually leads to unsatisfactory retrieval results. An alternative approach to direct answer while facing an ambiguous query is to proactively ask clarifying questions to the user. Recent years have seen many works and shared tasks from both NLP and IR community about identifying the need for asking clarifying question and methodology to generate them. An often neglected fact by these works is that although sometimes the need for clarifying questions is correctly recognized, the clarifying questions these system generate are still off-topic and dissatisfaction provoking to users and may just cause users to leave the conversation. In this work, we propose a risk-aware conversational search agent model to balance the risk of answering user's query and asking clarifying questions. The agent is fully aware that asking clarifying questions can potentially collect more information from user, but it will compare all the choices it has and evaluate the risks. Only after then, it will make decision between answering or asking. To demonstrate that our system is able to retrieve better answers, we conduct experiments on the MSDialog dataset which contains real-world customer service conversations from Microsoft products community. We also purpose a reinforcement learning strategy which allows us to train our model on the original dataset directly and saves us from any further data annotation efforts. Our experiment results show that our risk-aware conversational search agent is able to significantly outperform strong non-risk-aware baselines.

* This paper is accepted by The Web Conference (WWW) 2021 

  Access Paper or Ask Questions

CHAMELEON: A Deep Learning Meta-Architecture for News Recommender Systems [Phd. Thesis]

Dec 29, 2019
Gabriel de Souza Pereira Moreira

Recommender Systems (RS) have became a popular research topic and, since 2016, Deep Learning methods and techniques have been increasingly explored in this area. News RS are aimed to personalize users experiences and help them discover relevant articles from a large and dynamic search space. The main contribution of this research was named CHAMELEON, a Deep Learning meta-architecture designed to tackle the specific challenges of news recommendation. It consists of a modular reference architecture which can be instantiated using different neural building blocks. As information about users' past interactions is scarce in the news domain, the user context can be leveraged to deal with the user cold-start problem. Articles' content is also important to tackle the item cold-start problem. Additionally, the temporal decay of items (articles) relevance is very accelerated in the news domain. Furthermore, external breaking events may temporally attract global readership attention, a phenomenon generally known as concept drift in machine learning. All those characteristics are explicitly modeled on this research by a contextual hybrid session-based recommendation approach using Recurrent Neural Networks. The task addressed by this research is session-based news recommendation, i.e., next-click prediction using only information available in the current user session. A method is proposed for a realistic temporal offline evaluation of such task, replaying the stream of user clicks and fresh articles being continuously published in a news portal. Experiments performed with two large datasets have shown the effectiveness of the CHAMELEON for news recommendation on many quality factors such as accuracy, item coverage, novelty, and reduced item cold-start problem, when compared to other traditional and state-of-the-art session-based recommendation algorithms.

* Phd. Thesis presented on Dec. 09, 2019 to the Instituto Tecnol\'ogico de Aeron\'autica (ITA), in partial fulfillment of the requirements for the degree of Doctor of Science in the Graduate Program of Electronics and Computing Engineering, Field of Informatics 

  Access Paper or Ask Questions

3D Object Recognition with Ensemble Learning --- A Study of Point Cloud-Based Deep Learning Models

May 22, 2019
Daniel Koguciuk, Łukasz Chechliński, Tarek El-Gaaly

In this study, we present an analysis of model-based ensemble learning for 3D point-cloud object classification and detection. An ensemble of multiple model instances is known to outperform a single model instance, but there is little study of the topic of ensemble learning for 3D point clouds. First, an ensemble of multiple model instances trained on the same part of the $\textit{ModelNet40}$ dataset was tested for seven deep learning, point cloud-based classification algorithms: $\textit{PointNet}$, $\textit{PointNet++}$, $\textit{SO-Net}$, $\textit{KCNet}$, $\textit{DeepSets}$, $\textit{DGCNN}$, and $\textit{PointCNN}$. Second, the ensemble of different architectures was tested. Results of our experiments show that the tested ensemble learning methods improve over state-of-the-art on the $\textit{ModelNet40}$ dataset, from $92.65\%$ to $93.64\%$ for the ensemble of single architecture instances, $94.03\%$ for two different architectures, and $94.15\%$ for five different architectures. We show that the ensemble of two models with different architectures can be as effective as the ensemble of 10 models with the same architecture. Third, a study on classic bagging i.e. with different subsets used for training multiple model instances) was tested and sources of ensemble accuracy growth were investigated for best-performing architecture, i.e. $\textit{SO-Net}$. We also investigate the ensemble learning of $\textit{Frustum PointNet}$ approach in the task of 3D object detection, increasing the average precision of 3D box detection on the $\textit{KITTI}$ dataset from $63.1\%$ to $66.5\%$ using only three model instances. We measure the inference time of all 3D classification architectures on a $\textit{Nvidia Jetson TX2}$, a common embedded computer for mobile robots, to allude to the use of these models in real-life applications.

  Access Paper or Ask Questions

Can a CNN Recognize Catalan Diet?

Jul 29, 2016
Pedro Herruzo, Marc Bolaños, Petia Radeva

Nowadays, we can find several diseases related to the unhealthy diet habits of the population, such as diabetes, obesity, anemia, bulimia and anorexia. In many cases, these diseases are related to the food consumption of people. Mediterranean diet is scientifically known as a healthy diet that helps to prevent many metabolic diseases. In particular, our work focuses on the recognition of Mediterranean food and dishes. The development of this methodology would allow to analise the daily habits of users with wearable cameras, within the topic of lifelogging. By using automatic mechanisms we could build an objective tool for the analysis of the patient's behaviour, allowing specialists to discover unhealthy food patterns and understand the user's lifestyle. With the aim to automatically recognize a complete diet, we introduce a challenging multi-labeled dataset related to Mediterranean diet called FoodCAT. The first type of label provided consists of 115 food classes with an average of 400 images per dish, and the second one consists of 12 food categories with an average of 3800 pictures per class. This dataset will serve as a basis for the development of automatic diet recognition. In this context, deep learning and more specifically, Convolutional Neural Networks (CNNs), currently are state-of-the-art methods for automatic food recognition. In our work, we compare several architectures for image classification, with the purpose of diet recognition. Applying the best model for recognising food categories, we achieve a top-1 accuracy of 72.29\%, and top-5 of 97.07\%. In a complete diet recognition of dishes from Mediterranean diet, enlarged with the Food-101 dataset for international dishes recognition, we achieve a top-1 accuracy of 68.07\%, and top-5 of 89.53\%, for a total of 115+101 food classes.

* 9 pages, 6 figures, 6 tables 

  Access Paper or Ask Questions

Monitoring Energy Trends through Automatic Information Extraction

Jan 05, 2022
Dilek Küçük

Energy research is of crucial public importance but the use of computer science technologies like automatic text processing and data management for the energy domain is still rare. Employing these technologies in the energy domain will be a significant contribution to the interdisciplinary topic of ``energy informatics", just like the related progress within the interdisciplinary area of ``bioinformatics". In this paper, we present the architecture of a Web-based semantic system called EneMonIE (Energy Monitoring through Information Extraction) for monitoring up-to-date energy trends through the use of automatic, continuous, and guided information extraction from diverse types of media available on the Web. The types of media handled by the system will include online news articles, social media texts, online news videos, and open-access scholarly papers and technical reports as well as various numeric energy data made publicly available by energy organizations. The system will utilize and contribute to the energy-related ontologies and its ultimate form will comprise components for (i) text categorization, (ii) named entity recognition, (iii) temporal expression extraction, (iv) event extraction, (v) social network construction, (vi) sentiment analysis, (vii) information fusion and summarization, (viii) media interlinking, and (ix) Web-based information retrieval and visualization. Wits its diverse data sources, automatic text processing capabilities, and presentation facilities open for public use; EneMonIE will be an important source of distilled and concise information for decision-makers including energy generation, transmission, and distribution system operators, energy research centres, related investors and entrepreneurs as well as for academicians, students, other individuals interested in the pace of energy events and technologies.

* 5 pages 

  Access Paper or Ask Questions

KLUE: Korean Language Understanding Evaluation

May 21, 2021
Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, Junseong Kim, Yongsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Lyu, Younghoon Jeong, Inkwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Lucy Park, Alice Oh, Jung-Woo Ha, Kyunghyun Cho

We introduce Korean Language Understanding Evaluation (KLUE) benchmark. KLUE is a collection of 8 Korean natural language understanding (NLU) tasks, including Topic Classification, SemanticTextual Similarity, Natural Language Inference, Named Entity Recognition, Relation Extraction, Dependency Parsing, Machine Reading Comprehension, and Dialogue State Tracking. We build all of the tasks from scratch from diverse source corpora while respecting copyrights, to ensure accessibility for anyone without any restrictions. With ethical considerations in mind, we carefully design annotation protocols. Along with the benchmark tasks and data, we provide suitable evaluation metrics and fine-tuning recipes for pretrained language models for each task. We furthermore release the pretrained language models (PLM), KLUE-BERT and KLUE-RoBERTa, to help reproducing baseline models on KLUE and thereby facilitate future research. We make a few interesting observations from the preliminary experiments using the proposed KLUE benchmark suite, already demonstrating the usefulness of this new benchmark suite. First, we find KLUE-RoBERTa-large outperforms other baselines, including multilingual PLMs and existing open-source Korean PLMs. Second, we see minimal degradation in performance even when we replace personally identifiable information from the pretraining corpus, suggesting that privacy and NLU capability are not at odds with each other. Lastly, we find that using BPE tokenization in combination with morpheme-level pre-tokenization is effective in tasks involving morpheme-level tagging, detection and generation. In addition to accelerating Korean NLP research, our comprehensive documentation on creating KLUE will facilitate creating similar resources for other languages in the future. KLUE is available at this URL.

* 76 pages, 10 figures, 36 tables 

  Access Paper or Ask Questions