Data-driven analysis and detection of abusive online content covers many different tasks, phenomena, contexts, and methodologies. This paper systematically reviews abusive language dataset creation and content in conjunction with an open website for cataloguing abusive language data. This collection of knowledge leads to a synthesis providing evidence-based recommendations for practitioners working with this complex and highly diverse data.
We introduce a novel algorithmic approach to content recommendation based on adaptive clustering of exploration-exploitation ("bandit") strategies. We provide a sharp regret analysis of this algorithm in a standard stochastic noise setting, demonstrate its scalability properties, and prove its effectiveness on a number of artificial and real-world datasets. Our experiments show a significant increase in prediction performance over state-of-the-art methods for bandit problems.
This article focuses on different aspects of pedestrian (crowd) modeling and simulation. The review includes: various modeling criteria, such as granularity, techniques, and factors involved in modeling pedestrian behavior, and different pedestrian simulation methods with a more detailed look at two approaches for simulating pedestrian behavior in traffic scenes. At the end, benefits and drawbacks of different simulation techniques are discussed and recommendations are made for future research.
DivinAI is an open and collaborative initiative promoted by the European Commission's Joint Research Centre to measure and monitor diversity indicators related to AI conferences, with special focus on gender balance, geographical representation, and presence of academia vs companies. This paper summarizes the main achievements and lessons learnt during the first year of life of the DivinAI project, and proposes a set of recommendations for its further development and maintenance by the AI community.
A formal cyber reasoning framework for automating the threat hunting process is described. The new cyber reasoning methodology introduces an operational semantics that operates over three subspaces -- knowledge, hypothesis, and action -- to enable human-machine co-creation of threat hypotheses and protective recommendations. An implementation of this framework shows that the approach is practical and can be used to generalize evidence-based multi-criteria threat investigations.
We observe a severe under-reporting of the different kinds of errors that Natural Language Generation systems make. This is a problem, because mistakes are an important indicator of where systems should still be improved. If authors only report overall performance metrics, the research community is left in the dark about the specific weaknesses that are exhibited by `state-of-the-art' research. Next to quantifying the extent of error under-reporting, this position paper provides recommendations for error identification, analysis and reporting.
Proof assistants offer tactics to facilitate inductive proofs. However, it still requires human ingenuity to decide what arguments to pass to those induction tactics. To automate this process, we present smart_induct for Isabelle/HOL. Given an inductive problem in any problem domain, smart_induct lists promising arguments for the induct tactic without relying on a search. Our evaluation demonstrated smart_induct produces valuable recommendations across problem domains.
In this paper, we investigate and experiment the notion of error correction memory applied to error correction in technical texts. The main purpose is to induce relatively generic correction patterns associated with more contextual correction recommendations, based on previously memorized and analyzed corrections. The notion of error correction memory is developed within the framework of the LELIE project and illustrated on the case of fuzzy lexical items, which is a major problem in technical texts.
Article discusses the application of Kullback-Leibler divergence to the recognition of speech signals and suggests three algorithms implementing this divergence criterion: correlation algorithm, spectral algorithm and filter algorithm. Discussion covers an approach to the problem of speech variability and is illustrated with the results of experimental modeling of speech signals. The article gives a number of recommendations on the choice of appropriate model parameters and provides a comparison to some other methods of speech recognition.