"Topic": models, code, and papers

An ensemble learning framework based on group decision making

Jul 01, 2020
Jingyi He, Xiaojun Zhou, Rundong Zhang, Chunhua Yang

The classification problem is a significant topic in machine learning which aims to teach machines how to group together data by particular criteria. In this paper, a framework for the ensemble learning (EL) method based on group decision making (GDM) is proposed to address this problem. In this framework, base learners are treated as decision-makers, the categories as alternatives, and the classification results obtained by the diverse base learners as performance ratings; the precision, recall, and accuracy, which reflect the performance of the classification methods, are used to determine the weights of the decision-makers in GDM. Moreover, because precision and recall as defined for binary classification cannot be used directly in multi-class problems, the One vs Rest (OvR) strategy is used to obtain the precision and recall of each base learner for each category. The experimental results demonstrate that the proposed EL method based on GDM has higher accuracy than six other currently popular classification methods in most instances, which verifies the effectiveness of the proposed method.

* 6 pages, 2 figures 
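
The One vs Rest idea in the abstract above can be illustrated with a minimal sketch. It computes per-class precision and recall by treating each class in turn as "positive" and everything else as "negative", then uses per-class weights in a simple weighted vote. The function names and the voting rule are assumptions made for illustration; the abstract does not specify how the paper combines these scores into decision-maker weights.

```python
from collections import defaultdict


def ovr_precision_recall(y_true, y_pred, labels):
    """Per-class (precision, recall), treating each class as positive vs. the rest."""
    scores = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        # Guard against division by zero when a class is never predicted or never occurs.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores[c] = (precision, recall)
    return scores


def weighted_vote(predictions, weights):
    """Hypothetical aggregation rule (not taken from the paper).

    predictions: one predicted label per base learner.
    weights: one dict per base learner mapping label -> weight,
             e.g. that learner's OvR precision for the label.
    Each learner adds its weight for the label it predicted.
    """
    totals = defaultdict(float)
    for pred, w in zip(predictions, weights):
        totals[pred] += w[pred]
    return max(totals, key=totals.get)


# Example: validation labels and one learner's predictions.
y_true = ["a", "a", "b", "c"]
y_pred = ["a", "b", "b", "c"]
scores = ovr_precision_recall(y_true, y_pred, ["a", "b", "c"])
```

Using precision as the vote weight means a learner that rarely raises false alarms for a class counts more when it predicts that class; other weightings (recall, accuracy, or a blend) fit the same structure.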

Traffic Flow Forecast of Road Networks with Recurrent Neural Networks

Jun 08, 2020
Ralf Rüther, Andreas Klos, Marius Rosenbaum, Wolfram Schiffmann

Interest in developing smart cities has increased dramatically in recent years. In this context, intelligent transportation systems are a major topic, and forecasting traffic flow is indispensable for making such a system efficient. Traffic flow forecasting is a difficult task because of its stochastic and nonlinear nature. Besides classical statistical methods, neural networks are a promising way to predict future traffic flow. In our work, this prediction is performed with various recurrent neural networks. These are trained on measurements from induction loops placed at intersections in the city. We used data from the beginning of January to the end of July 2018. Each model takes sequences of the measured traffic flow from all sensors and predicts the future traffic flow for each sensor simultaneously. A variety of model architectures, forecast horizons and input data were investigated. Most often, the vector output model with gated recurrent units achieved the smallest error on the test set over all considered prediction scenarios. Due to the small amount of data, generalization of the trained models is limited.

* 12 pages 

Measuring Emotions in the COVID-19 Real World Worry Dataset

May 14, 2020
Bennett Kleinberg, Isabelle van der Vegt, Maximilian Mozes

The COVID-19 pandemic is having a dramatic impact on societies and economies around the world. With various measures of lockdowns and social distancing in place, it becomes important to understand emotional responses on a large scale. In this paper, we present the first ground truth dataset of emotional responses to COVID-19. We asked participants to indicate their emotions and express these in text. This resulted in the Real World Worry Dataset of 5,000 texts (2,500 short + 2,500 long texts). Our analyses suggest that emotional responses correlated with linguistic measures. Topic modeling further revealed that people in the UK worry about their family and the economic situation. Tweet-sized texts functioned as a call for solidarity, while longer texts shed light on worries and concerns. Using predictive modeling approaches, we were able to approximate the emotional responses of participants from text within 14% of their actual value. We encourage others to use the dataset and improve how we can use automated methods to learn about emotional responses and worries about an urgent problem.

* Accepted to ACL 2020 COVID-19 workshop 

Cost-Sensitive BERT for Generalisable Sentence Classification with Imbalanced Data

Mar 16, 2020
Harish Tayyar Madabushi, Elena Kochkina, Michael Castelle

The automatic identification of propaganda has gained significance in recent years due to technological and social changes in the way news is generated and consumed. That this task can be addressed effectively using BERT, a powerful new architecture which can be fine-tuned for text classification tasks, is not surprising. However, propaganda detection, like other tasks that deal with news documents and other forms of decontextualized social communication (e.g. sentiment analysis), inherently deals with data whose categories are simultaneously imbalanced and dissimilar. We show that BERT, while capable of handling imbalanced classes with no additional data augmentation, does not generalise well when the training and test data are sufficiently dissimilar (as is often the case with news sources, whose topics evolve over time). We show how to address this problem by providing a statistical measure of similarity between datasets and a method of incorporating cost-weighting into BERT when the training and test sets are dissimilar. We test these methods on the Propaganda Techniques Corpus (PTC) and achieve the second-highest score on sentence-level propaganda classification.

* NLP4IF 2019 
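
Cost-weighting for imbalanced classes, as mentioned in the abstract above, is commonly implemented as a class-weighted cross-entropy loss. The sketch below shows that general technique in plain Python; it is an illustration under assumptions, not the authors' implementation, and the weight values and function names are hypothetical. The normalisation by the sum of weights is one common convention, chosen here so that uniform weights reduce to the ordinary mean loss.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cost_weighted_cross_entropy(batch_logits, targets, class_weights):
    """Cross-entropy where each example is scaled by the weight of its true class.

    batch_logits: list of per-example logit lists.
    targets: list of integer class indices.
    class_weights: list of per-class costs (assumed values, e.g. larger
                   for the minority class).
    The result is normalised by the sum of the weights that were applied.
    """
    weighted_loss = 0.0
    weight_sum = 0.0
    for logits, y in zip(batch_logits, targets):
        p_true = softmax(logits)[y]
        w = class_weights[y]
        weighted_loss += -w * math.log(p_true)
        weight_sum += w
    return weighted_loss / weight_sum


# Example: class 1 is the (hypothetical) minority class and is misclassified more.
logits = [[2.0, 0.0], [0.0, 0.0]]
targets = [0, 1]
plain = cost_weighted_cross_entropy(logits, targets, [1.0, 1.0])
weighted = cost_weighted_cross_entropy(logits, targets, [1.0, 5.0])
```

In the example, raising the weight of class 1 increases the loss because the poorly predicted minority example now dominates the average, which is the effect cost-weighting is meant to have during fine-tuning.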

Agile Earth observation satellite scheduling over 20 years: formulations, methods and future directions

Mar 13, 2020
Xinwei Wang, Guohua Wu, Lining Xing, Witold Pedrycz

Agile satellites with advanced attitude maneuvering capability are the new generation of Earth observation satellites (EOSs). The continuous improvement in satellite technology and the decrease in launch cost have boosted the development of agile EOSs (AEOSs). To make efficient use of the growing number of orbiting AEOSs, the AEOS scheduling problem (AEOSSP), which aims to maximize the total observation profit while satisfying all complex operational constraints, has received much attention over the past 20 years. The objectives of this paper are thus to summarize current research on AEOSSP, identify main accomplishments and highlight potential future research directions. To this end, general definitions of AEOSSP with operational constraints are described first, followed by its three typical variations: different definitions of observation profit, multi-objective functions and autonomous models. A detailed literature review from 1997 up to 2019 is then presented, organized by four solution methods, i.e., exact methods, heuristics, metaheuristics and machine learning. Finally, we discuss a number of topics worth pursuing in the future.

An Information Diffusion Approach to Rumor Propagation and Identification on Twitter

Feb 24, 2020
Abiola Osho, Caden Waters, George Amariucai

With the increasing use of online social networks as a source of news and information, the propensity for a rumor to disseminate widely and quickly poses a great concern, especially in disaster situations where users do not have enough time to fact-check posts before making the informed decision to react to a post that appears to be credible. In this study, we explore the propagation pattern of rumors on Twitter by exploring the dynamics of microscopic-level misinformation spread, based on the latent message and user interaction attributes. We perform supervised learning for feature selection and prediction. Experimental results with real-world data sets give the models' prediction accuracy at about 90% for the diffusion of both True and False topics. Our findings confirm that rumor cascades run deeper and that rumor masked as news, and messages that incite fear, will diffuse faster than other messages. We show that the models for True and False message propagation differ significantly, both in the prediction parameters and in the message features that govern the diffusion. Finally, we show that the diffusion pattern is an important metric in identifying the credibility of a tweet.

Machine Learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence

Feb 12, 2020
Sebastian Raschka, Joshua Patterson, Corey Nolet

Smarter applications are making better use of the insights gleaned from data, having an impact on every industry and research discipline. At the core of this revolution lie the tools and the methods that are driving it, from processing the massive piles of data generated each day to learning from and taking useful action. Deep neural networks, along with advancements in classical ML and scalable general-purpose GPU computing, have become critical components of artificial intelligence, enabling many of these astounding breakthroughs and lowering the barrier to adoption. Python continues to be the most preferred language for scientific computing, data science, and machine learning, boosting both performance and productivity by enabling the use of low-level libraries and clean high-level APIs. This survey offers insight into the field of machine learning with Python, taking a tour through important topics to identify some of the core hardware and software paradigms that have enabled it. We cover widely-used libraries and concepts, collected together for holistic comparison, with the goal of educating the reader and driving the field of Python machine learning forward.

WeatherBench: A benchmark dataset for data-driven weather forecasting

Feb 02, 2020
Stephan Rasp, Peter D. Dueben, Sebastian Scher, Jonathan A. Weyn, Soukayna Mouatadid, Nils Thuerey

Data-driven approaches, most prominently deep learning, have become powerful tools for prediction in many domains. A natural question to ask is whether data-driven methods could also be used for numerical weather prediction. First studies show promise, but the lack of a common dataset and evaluation metrics makes inter-comparison between studies difficult. Here we present a benchmark dataset for data-driven medium-range weather forecasting, a topic of high scientific interest for atmospheric and computer scientists alike. We provide data derived from the ERA5 archive that has been processed to facilitate its use in machine learning models. We propose a simple and clear evaluation metric which will enable a direct comparison between different methods. Further, we provide baseline scores from simple linear regression techniques, deep learning models, and purely physical forecasting models. All data is publicly available and the companion code is reproducible with tutorials for getting started. We hope that this dataset will accelerate research in data-driven weather forecasting.

* Link to data still missing 

From Shallow to Deep Interactions Between Knowledge Representation, Reasoning and Machine Learning (Kay R. Amel group)

Dec 13, 2019
Zied Bouraoui, Antoine Cornuéjols, Thierry Denœux, Sébastien Destercke, Didier Dubois, Romain Guillaume, João Marques-Silva, Jérôme Mengin, Henri Prade, Steven Schockaert, Mathieu Serrurier, Christel Vrain

This paper proposes a tentative and original survey of meeting points between Knowledge Representation and Reasoning (KRR) and Machine Learning (ML), two areas which have been developing quite separately in the last three decades. Some common concerns are identified and discussed, such as the types of representation used, the roles of knowledge and data, the lack or the excess of information, and the need for explanations and causal understanding. Then some methodologies combining reasoning and learning are reviewed (such as inductive logic programming, neuro-symbolic reasoning, formal concept analysis, rule-based representations and ML, uncertainty in ML, or case-based reasoning and analogical reasoning), before discussing examples of synergies between KRR and ML (including topics such as belief functions on regression, EM algorithm versus revision, the semantic description of vector representations, the combination of deep learning with high level inference, knowledge graph completion, declarative frameworks for data mining, or preferences and recommendation). This paper is the first step of a work in progress aiming at a better mutual understanding of research in KRR and ML, and of how they could cooperate.

* 53 pages 
