Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Topic": models, code, and papers

State Selection Algorithms and Their Impact on The Performance of Stateful Network Protocol Fuzzing

Dec 24, 2021
Dongge Liu, Van-Thuan Pham, Gidon Ernst, Toby Murray, Benjamin I. P. Rubinstein

The statefulness property of network protocol implementations poses a unique challenge for testing and verification techniques, including Fuzzing. Stateful fuzzers tackle this challenge by leveraging state models to partition the state space and assist the test generation process. Since not all states are equally important and fuzzing campaigns have time limits, fuzzers need effective state selection algorithms to prioritize progressive states over others. Several state selection algorithms have been proposed but they were implemented and evaluated separately on different platforms, making it hard to achieve conclusive findings. In this work, we evaluate an extensive set of state selection algorithms on the same fuzzing platform that is AFLNet, a state-of-the-art fuzzer for network servers. The algorithm set includes existing ones supported by AFLNet and our novel and principled algorithm called AFLNetLegion. The experimental results on the ProFuzzBench benchmark show that (i) the existing state selection algorithms of AFLNet achieve very similar code coverage, (ii) AFLNetLegion clearly outperforms these algorithms in selected case studies, but (iii) the overall improvement appears insignificant. These are unexpected yet interesting findings. We identify problems and share insights that could open opportunities for future research on this topic.

* 10 pages, 8 figures, coloured, conference 

  Access Paper or Ask Questions

Intelligent Transportation Systems With The Use of External Infrastructure: A Literature Survey

Dec 15, 2021
Christian CreรŸ, Alois C. Knoll

Increasing problems in the transportation segment are accidents, bad traffic flow and pollution. The Intelligent Transportation System with the use of external infrastructure (ITS) can tackle these problems. To the best of our knowledge, there exists no current systematic review of the existing solutions. To fill this knowledge gap, this paper provides an overview about existing ITS which use external infrastructure. Furthermore, this paper discovers the currently not adequately answered research questions. For this reason, we performed a literature review to documents, which describes existing ITS solutions since 2009 until today. We categorized the results according to his technology level and analyzed their properties. Thereby, we made the several ITS comparable and highlighted the past development as well as the current trends. According to the mentioned method, we analyzed more than 346 papers, which includes 40 test bed projects. In summary, the current ITS can deliver high accurate information about individuals in traffic situations in real-time. However, further research in ITS should focus on more reliable perception of the traffic with the use of modern sensors, plug and play mechanism as well as secure real-time distribution in decentralized manner for a high amount of data. With addressing these topics, the development of Intelligent Transportation Systems is in a correction direction for the comprehensive roll-out.

* 14 pages, 4 tables, 4 figures 

  Access Paper or Ask Questions

TBCOV: Two Billion Multilingual COVID-19 Tweets with Sentiment, Entity, Geo, and Gender Labels

Oct 04, 2021
Muhammad Imran, Umair Qazi, Ferda Ofli

The widespread usage of social networks during mass convergence events, such as health emergencies and disease outbreaks, provides instant access to citizen-generated data that carry rich information about public opinions, sentiments, urgent needs, and situational reports. Such information can help authorities understand the emergent situation and react accordingly. Moreover, social media plays a vital role in tackling misinformation and disinformation. This work presents TBCOV, a large-scale Twitter dataset comprising more than two billion multilingual tweets related to the COVID-19 pandemic collected worldwide over a continuous period of more than one year. More importantly, several state-of-the-art deep learning models are used to enrich the data with important attributes, including sentiment labels, named-entities (e.g., mentions of persons, organizations, locations), user types, and gender information. Last but not least, a geotagging method is proposed to assign country, state, county, and city information to tweets, enabling a myriad of data analysis tasks to understand real-world issues. Our sentiment and trend analyses reveal interesting insights and confirm TBCOV's broad coverage of important topics.

* 20 pages, 13 figures, 8 tables 

  Access Paper or Ask Questions

Agreeing to Disagree: Annotating Offensive Language Datasets with Annotators' Disagreement

Sep 28, 2021
Elisa Leonardelli, Stefano Menini, Alessio Palmero Aprosio, Marco Guerini, Sara Tonelli

Since state-of-the-art approaches to offensive language detection rely on supervised learning, it is crucial to quickly adapt them to the continuously evolving scenario of social media. While several approaches have been proposed to tackle the problem from an algorithmic perspective, so to reduce the need for annotated data, less attention has been paid to the quality of these data. Following a trend that has emerged recently, we focus on the level of agreement among annotators while selecting data to create offensive language datasets, a task involving a high level of subjectivity. Our study comprises the creation of three novel datasets of English tweets covering different topics and having five crowd-sourced judgments each. We also present an extensive set of experiments showing that selecting training and test data according to different levels of annotators' agreement has a strong effect on classifiers performance and robustness. Our findings are further validated in cross-domain experiments and studied using a popular benchmark dataset. We show that such hard cases, where low agreement is present, are not necessarily due to poor-quality annotation and we advocate for a higher presence of ambiguous cases in future datasets, particularly in test sets, to better account for the different points of view expressed online.

* To appear at EMNLP 2021 (long paper) 

  Access Paper or Ask Questions

Discriminative Domain-Invariant Adversarial Network for Deep Domain Generalization

Aug 20, 2021
Mohammad Mahfujur Rahman, Clinton Fookes, Sridha Sridharan

Domain generalization approaches aim to learn a domain invariant prediction model for unknown target domains from multiple training source domains with different distributions. Significant efforts have recently been committed to broad domain generalization, which is a challenging and topical problem in machine learning and computer vision communities. Most previous domain generalization approaches assume that the conditional distribution across the domains remain the same across the source domains and learn a domain invariant model by minimizing the marginal distributions. However, the assumption of a stable conditional distribution of the training source domains does not really hold in practice. The hyperplane learned from the source domains will easily misclassify samples scattered at the boundary of clusters or far from their corresponding class centres. To address the above two drawbacks, we propose a discriminative domain-invariant adversarial network (DDIAN) for domain generalization. The discriminativeness of the features are guaranteed through a discriminative feature module and domain-invariant features are guaranteed through the global domain and local sub-domain alignment modules. Extensive experiments on several benchmarks show that DDIAN achieves better prediction on unseen target data during training compared to state-of-the-art domain generalization approaches.

* This manuscript is submitted to Computer Vision and Image Understanding (CVIU) 

  Access Paper or Ask Questions

Fractional Transfer Learning for Deep Model-Based Reinforcement Learning

Aug 14, 2021
Remo Sasso, Matthia Sabatelli, Marco A. Wiering

Reinforcement learning (RL) is well known for requiring large amounts of data in order for RL agents to learn to perform complex tasks. Recent progress in model-based RL allows agents to be much more data-efficient, as it enables them to learn behaviors of visual environments in imagination by leveraging an internal World Model of the environment. Improved sample efficiency can also be achieved by reusing knowledge from previously learned tasks, but transfer learning is still a challenging topic in RL. Parameter-based transfer learning is generally done using an all-or-nothing approach, where the network's parameters are either fully transferred or randomly initialized. In this work we present a simple alternative approach: fractional transfer learning. The idea is to transfer fractions of knowledge, opposed to discarding potentially useful knowledge as is commonly done with random initialization. Using the World Model-based Dreamer algorithm, we identify which type of components this approach is applicable to, and perform experiments in a new multi-source transfer learning setting. The results show that fractional transfer learning often leads to substantially improved performance and faster learning compared to learning from scratch and random initialization.

* 21 pages, 8 figures, 7 tables 

  Access Paper or Ask Questions

Towards the Unseen: Iterative Text Recognition by Distilling from Errors

Jul 26, 2021
Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yi-Zhe Song

Visual text recognition is undoubtedly one of the most extensively researched topics in computer vision. Great progress have been made to date, with the latest models starting to focus on the more practical "in-the-wild" setting. However, a salient problem still hinders practical deployment -- prior arts mostly struggle with recognising unseen (or rarely seen) character sequences. In this paper, we put forward a novel framework to specifically tackle this "unseen" problem. Our framework is iterative in nature, in that it utilises predicted knowledge of character sequences from a previous iteration, to augment the main network in improving the next prediction. Key to our success is a unique cross-modal variational autoencoder to act as a feedback module, which is trained with the presence of textual error distribution data. This module importantly translate a discrete predicted character space, to a continuous affine transformation parameter space used to condition the visual feature map at next iteration. Experiments on common datasets have shown competitive performance over state-of-the-arts under the conventional setting. Most importantly, under the new disjoint setup where train-test labels are mutually exclusive, ours offers the best performance thus showcasing the capability of generalising onto unseen words.

* IEEE International Conference on Computer Vision (ICCV), 2021 

  Access Paper or Ask Questions

Geometric Processing for Image-based 3D Object Modeling

Jun 27, 2021
Rongjun Qin, Xu Huang

Image-based 3D object modeling refers to the process of converting raw optical images to 3D digital representations of the objects. Very often, such models are desired to be dimensionally true, semantically labeled with photorealistic appearance (reality-based modeling). Laser scanning was deemed as the standard (and direct) way to obtaining highly accurate 3D measurements of objects, while one would have to abide the high acquisition cost and its unavailability on some of the platforms. Nowadays the image-based methods backboned by the recently developed advanced dense image matching algorithms and geo-referencing paradigms, are becoming the dominant approaches, due to its high flexibility, availability and low cost. The largely automated geometric processing of images in a 3D object reconstruction workflow, from ordered/unordered raw imagery to textured meshes, is becoming a critical part of the reality-based 3D modeling. This article summarizes the overall geometric processing workflow, with focuses on introducing the state-of-the-art methods of three major components of geometric processing: 1) geo-referencing; 2) Image dense matching 3) texture mapping. Finally, we will draw conclusions and share our outlooks of the topics discussed in this article.

  Access Paper or Ask Questions

Towards Target-dependent Sentiment Classification in News Articles

May 20, 2021
Felix Hamborg, Karsten Donnay, Bela Gipp

Extensive research on target-dependent sentiment classification (TSC) has led to strong classification performances in domains where authors tend to explicitly express sentiment about specific entities or topics, such as in reviews or on social media. We investigate TSC in news articles, a much less researched domain, despite the importance of news as an essential information source in individual and societal decision making. This article introduces NewsTSC, a manually annotated dataset to explore TSC on news articles. Investigating characteristics of sentiment in news and contrasting them to popular TSC domains, we find that sentiment in the news is expressed less explicitly, is more dependent on context and readership, and requires a greater degree of interpretation. In an extensive evaluation, we find that the state of the art in TSC performs worse on news articles than on other domains (average recall AvgRec = 69.8 on NewsTSC compared to AvgRev = [75.6, 82.2] on established TSC datasets). Reasons include incorrectly resolved relation of target and sentiment-bearing phrases and off-context dependence. As a major improvement over previous news TSC, we find that BERT's natural language understanding capabilities capture the less explicit sentiment used in news articles.

  Access Paper or Ask Questions