"Topic": models, code, and papers

Material Classification using Neural Networks

Oct 17, 2017
Anca Sticlaru

The recognition and classification of the diverse materials in the environment around us is a key visual competence that computer vision systems have focused on in recent years. Identifying materials in distinct images is a demanding process that has made use of recent progress in neural networks, which has made it possible to train architectures to extract features for this challenging task. This project uses state-of-the-art Convolutional Neural Network (CNN) techniques and Support Vector Machine (SVM) classifiers in order to classify materials and analyze the results. Building on various widely used material databases, a selection of CNN architectures is evaluated to understand which is the best approach to extract features in order to achieve outstanding results for the task. The results gathered over four material datasets and nine CNNs show that the best overall performance of a CNN using a linear SVM can reach up to ~92.5% mean average precision, while applying a new relevant direction in computer vision, transfer learning. By limiting the amount of information extracted from the layer before the last fully connected layer, transfer learning aims at analyzing the contribution of shading information and reflectance to identify which main characteristics decide the material category the image belongs to. In addition to the main topic of my project, the evaluation of the nine different CNN architectures, it is questioned whether using transfer learning, instead of extracting the information from the last convolutional layer, improves the total accuracy of the system. The results of the comparison show that the accuracy and performance of the system improve, especially on the datasets which consist of a large number of images.

* 45 pages, BSc thesis 
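
The ~92.5% figure in the abstract above is a mean average precision (mAP). As a reminder of how that metric is commonly computed, here is a minimal sketch in plain Python. The function names and the ranking-based (non-interpolated) definition of average precision are assumptions made for illustration; the thesis may use a different variant.

```python
def average_precision(scores, labels):
    """Average precision for one class (illustrative, non-interpolated).

    scores: classifier scores, higher means "more likely this class".
    labels: 1 if the image truly belongs to the class, else 0.
    """
    # Rank images from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            # Precision measured at the rank of each true positive.
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(per_class):
    """Mean of the per-class APs; per_class is a list of (scores, labels)."""
    aps = [average_precision(s, l) for s, l in per_class]
    return sum(aps) / len(aps)
```

With one material class ranked perfectly and another with a single misranked image, the mean is pulled below 1.0 in proportion to how early the mistake appears in the ranking.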


Machine learning & artificial intelligence in the quantum domain

Sep 08, 2017
Vedran Dunjko, Hans J. Briegel

Quantum information technologies, and intelligent learning systems, are both emergent technologies that will likely have a transforming impact on our society. The respective underlying fields of research -- quantum information (QI) versus machine learning (ML) and artificial intelligence (AI) -- have their own specific challenges, which have hitherto been investigated largely independently. However, in a growing body of recent work, researchers have been probing the question to what extent these fields can learn and benefit from each other. Quantum machine learning (QML) explores the interaction between quantum computing and ML, investigating how results and techniques from one field can be used to solve the problems of the other. Recently, we have witnessed breakthroughs in both directions of influence. For instance, quantum computing is finding a vital application in providing speed-ups in ML, critical in our "big data" world. Conversely, ML already permeates cutting-edge technologies, and may become instrumental in advanced quantum technologies. Aside from quantum speed-up in data analysis, or classical ML optimization used in quantum experiments, quantum enhancements have also been demonstrated for interactive learning, highlighting the potential of quantum-enhanced learning agents. Finally, works exploring the use of AI for the very design of quantum experiments, and for performing parts of genuine research autonomously, have reported their first successes. Beyond the topics of mutual enhancement, researchers have also broached the fundamental issue of quantum generalizations of ML/AI concepts. This deals with questions of the very meaning of learning and intelligence in a world that is described by quantum mechanics. In this review, we describe the main ideas, recent developments, and progress in a broad spectrum of research investigating machine learning and artificial intelligence in the quantum domain.

* Review paper. 106 pages. 16 figures 


Brain Responses During Robot-Error Observation

Aug 16, 2017
Dominik Welke, Joos Behncke, Marina Hader, Robin Tibor Schirrmeister, Andreas Schönau, Boris Eßmann, Oliver Müller, Wolfram Burgard, Tonio Ball

Brain-controlled robots are a promising new type of assistive device for severely impaired persons. However, little is known about how to optimize the interaction of humans and brain-controlled robots. Information about the human's perceived correctness of robot performance might provide a useful teaching signal for adaptive control algorithms and thus help enhance robot control. Here, we studied whether watching robots perform erroneous vs. correct actions elicits differential brain responses that can be decoded from single trials of electroencephalographic (EEG) recordings, and whether brain activity during human-robot interaction is modulated by the robot's visual similarity to a human. To address these topics, we designed two experiments. In experiment I, participants watched a robot arm pour liquid into a cup. The robot performed the action either erroneously or correctly, i.e. it either spilled some liquid or not. In experiment II, participants observed two different types of robots, humanoid and non-humanoid, grabbing a ball. The robots either managed to grab the ball or not. We recorded high-resolution EEG during the observation tasks in both experiments to train a Filter Bank Common Spatial Pattern (FBCSP) pipeline on the multivariate EEG signal and decode for the correctness of the observed action, and for the type of the observed robot. Our findings show that it was possible to decode both correctness and robot type for the majority of participants significantly, although often just slightly, above chance level. Our findings suggest that non-invasive recordings of brain responses elicited when observing robots indeed contain decodable information about the correctness of the robot's action and the type of observed robot.


Dense v.s. Sparse: A Comparative Study of Sampling Analysis in Scene Classification of High-Resolution Remote Sensing Imagery

Jul 31, 2015
Jingwen Hu, Gui-Song Xia, Fan Hu, Liangpei Zhang

Scene classification is a key problem in the interpretation of high-resolution remote sensing imagery. Many state-of-the-art methods, e.g. bag-of-visual-words model and its variants, the topic models as well as deep learning-based approaches, share similar procedures: patch sampling, feature description/learning and classification. Patch sampling is the first and a key procedure which has a great influence on the results. In the literature, many different sampling strategies have been used, e.g. dense sampling, random sampling, keypoint-based sampling and saliency-based sampling, etc. However, it is still not clear which sampling strategy is suitable for the scene classification of high-resolution remote sensing images. In this paper, we comparatively study the effects of different sampling strategies under the scenario of scene classification of high-resolution remote sensing images. We divide the existing sampling methods into two types: dense sampling and sparse sampling, the latter of which includes random sampling, keypoint-based sampling and various saliency-based sampling proposed recently. In order to compare their performances, we rely on a standard bag-of-visual-words model to construct our testing scheme, owing to its simplicity, robustness and efficiency. The experimental results on two commonly used datasets show that dense sampling has the best performance among all the strategies but with high spatial and computational complexity, while random sampling gives results better than or comparable to other sparse sampling methods, such as the sophisticated multi-scale key-point operators and the saliency-based methods which are intensively studied and commonly used recently.

* This paper has been withdrawn by the author due to the submission requirement of a journal 
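
To make the dense versus sparse distinction in the abstract above concrete, the following is a minimal sketch of the two simplest strategies it names: dense grid sampling and random sampling of patch positions. The function names, the patch size, and the stride are hypothetical choices for illustration and are not taken from the paper.

```python
import random


def dense_patches(width, height, patch, stride):
    """Top-left corners of square patches laid out on a regular grid."""
    return [(x, y)
            for y in range(0, height - patch + 1, stride)
            for x in range(0, width - patch + 1, stride)]


def random_patches(width, height, patch, count, seed=0):
    """Top-left corners of `count` patches drawn uniformly at random."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [(rng.randint(0, width - patch), rng.randint(0, height - patch))
            for _ in range(count)]


# A 64x64 image with 16x16 patches and stride 8 yields a 7x7 dense grid.
dense = dense_patches(64, 64, patch=16, stride=8)
sparse = random_patches(64, 64, patch=16, count=10)
```

The number of dense patches grows quadratically as the stride shrinks, which is one way to see the spatial and computational cost the abstract attributes to dense sampling, whereas the random sampler's cost is fixed by `count`.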


The thermodynamic cost of fast thought

Jan 26, 2013
Alexandre de Castro

After more than sixty years, Shannon's research [1-3] continues to raise fundamental questions, such as the one formulated by Luce [4,5], which is still unanswered: "Why is information theory not very applicable to psychological problems, despite apparent similarities of concepts?" On this topic, Pinker [6], one of the foremost defenders of the computational theory of mind [6], has argued that thought is simply a type of computation, and that the gap between human cognition and computational models may be illusory. In this context, in his latest book, titled Thinking Fast and Slow [8], Kahneman [7,8] provides further theoretical interpretation by differentiating the two assumed systems of the cognitive functioning of the human mind. He calls them intuition (system 1) determined to be an associative (automatic, fast and perceptual) machine, and reasoning (system 2) required to be voluntary and to operate logical-deductively. In this paper, we propose an ansatz inspired by Ausubel's learning theory for investigating, from the constructivist perspective [9-12], information processing in the working memory of cognizers. Specifically, a thought experiment is performed utilizing the mind of a dual-natured creature known as Maxwell's demon: a tiny "man-machine" solely equipped with the characteristics of system 1, which prevents it from reasoning. The calculation presented here shows that [...]. This result indicates that when the system 2 is shut down, both an intelligent being, as well as a binary machine, incur the same energy cost per unit of information processed, which mathematically proves the computational attribute of the system 1, as Kahneman [7,8] theorized. This finding links information theory to human psychological features and opens a new path toward the conception of a multi-bit reasoning machine.


To what extent should we trust AI models when they extrapolate?

Jan 27, 2022
Roozbeh Yousefzadeh, Xuenan Cao

Many applications affecting human lives rely on models that have come to be known under the umbrella of machine learning and artificial intelligence. These AI models are usually complicated mathematical functions that map from an input space to an output space. Stakeholders are interested in knowing the rationales behind models' decisions and functional behavior. We study this functional behavior in relation to the data used to create the models. On this topic, scholars have often assumed that models do not extrapolate, i.e., they learn from their training samples and process new input by interpolation. This assumption is questionable: we show that models extrapolate frequently; the extent of extrapolation varies and can be socially consequential. We demonstrate that extrapolation happens for a substantial portion of datasets, more than one would consider reasonable. How can we trust models if we do not know whether they are extrapolating? Given a model trained to recommend clinical procedures for patients, can we trust the recommendation when the model considers a patient older or younger than all the samples in the training set? If the training set is mostly Whites, to what extent can we trust its recommendations about Black and Hispanic patients? In which dimension (race, gender, or age) does extrapolation happen? Even if a model is trained on people of all races, it still may extrapolate in significant ways related to race. The leading question is, to what extent can we trust AI models when they process inputs that fall outside their training set? This paper investigates several social applications of AI, showing how models extrapolate without notice. We also look at different sub-spaces of extrapolation for specific individuals subject to AI models and report how these extrapolations can be interpreted, not mathematically, but from a humanistic point of view.
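
The abstract's example of a patient "older or younger than all the samples in the training set" can be sketched as a per-feature range check. This is only a coarse illustration: the function name and the toy records are hypothetical, and a range check flags fewer cases than a stricter geometric definition would, since a point can lie within every feature's range and still fall outside the region the training samples actually cover.

```python
def out_of_range_features(train_rows, query):
    """Return the features on which `query` lies outside the training min/max.

    train_rows: list of dicts mapping feature name -> numeric value.
    query: dict with the same feature names.
    """
    flagged = []
    for name, value in query.items():
        column = [row[name] for row in train_rows]
        if value < min(column) or value > max(column):
            flagged.append(name)
    return flagged


# Toy training set (made-up values) and a patient older than every sample.
train = [{"age": 30, "bmi": 22.0}, {"age": 65, "bmi": 31.5}]
patient = {"age": 82, "bmi": 25.0}
flags = out_of_range_features(train, patient)  # only "age" is out of range
```

A non-empty result means the model is certainly being asked to extrapolate on those features; an empty result does not by itself show that it is interpolating.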


Modelling Direct Messaging Networks with Multiple Recipients for Cyber Deception

Nov 21, 2021
Kristen Moore, Cody J. Christopher, David Liebowitz, Surya Nepal, Renee Selvey

Cyber deception is emerging as a promising approach to defending networks and systems against attackers and data thieves. However, despite being relatively cheap to deploy, the generation of realistic content at scale is very costly, due to the fact that rich, interactive deceptive technologies are largely hand-crafted. With recent improvements in Machine Learning, we now have the opportunity to bring scale and automation to the creation of realistic and enticing simulated content. In this work, we propose a framework to automate the generation of email and instant messaging-style group communications at scale. Such messaging platforms within organisations contain a lot of valuable information inside private communications and document attachments, making them an enticing target for an adversary. We address two key aspects of simulating this type of system: modelling when and with whom participants communicate, and generating topical, multi-party text to populate simulated conversation threads. We present the LogNormMix-Net Temporal Point Process as an approach to the first of these, building upon the intensity-free modeling approach of Shchur et al. to create a generative model for unicast and multi-cast communications. We demonstrate the use of fine-tuned, pre-trained language models to generate convincing multi-party conversation threads. A live email server is simulated by uniting our LogNormMix-Net TPP (to generate the communication timestamp, sender and recipients) with the language model, which generates the contents of the multi-party email threads. We evaluate the generated content with respect to a number of realism-based properties, that encourage a model to learn to generate content that will engage the attention of an adversary to achieve a deception outcome.
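
The timestamp part of the pipeline described above can be illustrated with a toy generator that draws the gap between consecutive messages from a mixture of log-normal distributions, which is the idea the name LogNormMix suggests. This is a sketch under strong assumptions: the mixture weights and parameters below are fixed, made-up constants, whereas a learned model would produce them from the conversation history, and sender and recipient selection is left out entirely. The function name is hypothetical.

```python
import random


def sample_timestamps(weights, mus, sigmas, n_events, seed=0):
    """Sample event times whose gaps follow a log-normal mixture.

    weights: mixture weights, one per component.
    mus, sigmas: parameters of each component's underlying normal.
    """
    rng = random.Random(seed)
    t = 0.0
    times = []
    for _ in range(n_events):
        # Pick a mixture component, then draw a positive gap from it.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        t += rng.lognormvariate(mus[k], sigmas[k])
        times.append(t)
    return times


# Two made-up components: quick replies and long silences.
times = sample_timestamps(weights=[0.7, 0.3], mus=[0.0, 3.0],
                          sigmas=[0.5, 0.5], n_events=5)
```

Because every log-normal draw is positive, the generated timestamps are strictly increasing, and mixing a short-gap and a long-gap component gives the bursty pattern typical of message threads.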


What are the attackers doing now? Automating cyber threat intelligence extraction from text on pace with the changing threat landscape: A survey

Sep 14, 2021
Md Rayhanur Rahman, Rezvan Mahdavi-Hezaveh, Laurie Williams

Cybersecurity researchers have contributed to the automated extraction of CTI from textual sources, such as threat reports and online articles, where cyberattack strategies, procedures, and tools are described. The goal of this article is to help cybersecurity researchers understand the current techniques used for cyberthreat intelligence extraction from text through a survey of relevant studies in the literature. We systematically collect "CTI extraction from text"-related studies from the literature and categorize the CTI extraction purposes. We propose a CTI extraction pipeline abstracted from these studies. We identify the data sources, techniques, and CTI sharing formats utilized in the context of the proposed pipeline. Our work finds ten types of extraction purposes, such as extraction of indicators of compromise, TTPs (tactics, techniques, procedures of attack), and cybersecurity keywords. We also identify seven types of textual sources for CTI extraction, and textual data obtained from hacker forums, threat reports, social media posts, and online news articles have been used by almost 90% of the studies. Natural language processing along with both supervised and unsupervised machine learning techniques such as named entity recognition, topic modelling, dependency parsing, supervised classification, and clustering are used for CTI extraction. We observe the technical challenges associated with these studies related to obtaining available clean, labelled data which could assure replication, validation, and further extension of the studies. As we find the studies focusing on CTI information extraction from text, we advocate for building upon the current CTI extraction work to help cybersecurity practitioners with proactive decision making such as threat prioritization, automated threat modelling to utilize knowledge from past cybersecurity incidents.


SkillNER: Mining and Mapping Soft Skills from any Text

Jan 22, 2021
Silvia Fareri, Nicola Melluso, Filippo Chiarello, Gualtiero Fantoni

In today's digital world there is an increasing focus on soft skills. The reasons are many; however, the main ones can be traced down to the increased complexity of labor market dynamics and the shift towards digitalisation. Digitalisation has also increased the focus on soft skills, since such competencies are hardly acquired by Artificial Intelligence Systems. Despite this growing interest, researchers struggle in accurately defining the soft skill concept and in creating a complete and shared list of soft skills. Therefore, the aim of the present paper is the development of an automated tool capable of extracting soft skills from unstructured texts. Starting from an initial seed list of soft skills, we automatically collect a set of possible textual expressions referring to soft skills, thus creating a Soft Skills list. This has been done by applying Named Entity Recognition (NER) on a corpus of scientific papers developing a novel approach and a software application able to perform the automatic extraction of soft skills from text: the SkillNER. We measured the performance of the tools considering different training models and validated our approach comparing our list of soft skills with the skills labelled as transversal in ESCO (European Skills/Competence Qualification and Occupation). Finally, we give a first example of how the SkillNER can be used, identifying the relationships among ESCO job profiles based on soft skills shared, and the relationships among soft skills based on job profiles in common. The final map of soft skills-job profiles may help academia in achieving and sharing a clearer definition of what soft skills are and fuel future quantitative research on the topic.


Generating Knowledge Graphs by Employing Natural Language Processing and Machine Learning Techniques within the Scholarly Domain

Oct 28, 2020
Danilo Dessì, Francesco Osborne, Diego Reforgiato Recupero, Davide Buscaldi, Enrico Motta

The continuous growth of scientific literature brings innovations and, at the same time, raises new challenges. One of them is related to the fact that its analysis has become difficult due to the high volume of published papers for which manual effort for annotations and management is required. Novel technological infrastructures are needed to help researchers, research policy makers, and companies to time-efficiently browse, analyse, and forecast scientific research. Knowledge graphs, i.e., large networks of entities and relationships, have proved to be an effective solution in this space. Scientific knowledge graphs focus on the scholarly domain and typically contain metadata describing research publications such as authors, venues, organizations, research topics, and citations. However, the current generation of knowledge graphs lacks an explicit representation of the knowledge presented in the research papers. As such, in this paper, we present a new architecture that takes advantage of Natural Language Processing and Machine Learning methods for extracting entities and relationships from research publications and integrates them in a large-scale knowledge graph. Within this research work, we i) tackle the challenge of knowledge extraction by employing several state-of-the-art Natural Language Processing and Text Mining tools, ii) describe an approach for integrating entities and relationships generated by these tools, iii) show the advantage of such a hybrid system over alternative approaches, and iv) as a chosen use case, we generated a scientific knowledge graph including 109,105 triples, extracted from 26,827 abstracts of papers within the Semantic Web domain. As our approach is general and can be applied to any domain, we expect that it can facilitate the management, analysis, dissemination, and processing of scientific knowledge.

* Accepted for publication in Future Generation Computer Systems journal - Special Issue on Machine Learning and Knowledge Graphs 
