"Topic": models, code, and papers

The thermodynamic cost of fast thought

Jan 26, 2013
Alexandre de Castro

After more than sixty years, Shannon's research [1-3] continues to raise fundamental questions, such as the one formulated by Luce [4,5], which is still unanswered: "Why is information theory not very applicable to psychological problems, despite apparent similarities of concepts?" On this topic, Pinker [6], one of the foremost defenders of the computational theory of mind [6], has argued that thought is simply a type of computation, and that the gap between human cognition and computational models may be illusory. In this context, in his latest book, titled Thinking Fast and Slow [8], Kahneman [7,8] provides further theoretical interpretation by differentiating the two assumed systems of the cognitive functioning of the human mind. He calls them intuition (system 1), characterized as an associative (automatic, fast and perceptual) machine, and reasoning (system 2), which is voluntary and operates logical-deductively. In this paper, we propose an ansatz inspired by Ausubel's learning theory for investigating, from the constructivist perspective [9-12], information processing in the working memory of cognizers. Specifically, a thought experiment is performed utilizing the mind of a dual-natured creature known as Maxwell's demon: a tiny "man-machine" solely equipped with the characteristics of system 1, which prevents it from reasoning. The calculation presented here shows that [...]. This result indicates that when system 2 is shut down, an intelligent being and a binary machine incur the same energy cost per unit of information processed, which mathematically proves the computational attribute of system 1, as Kahneman [7,8] theorized. This finding links information theory to human psychological features and opens a new path toward the conception of a multi-bit reasoning machine.

To what extent should we trust AI models when they extrapolate?

Jan 27, 2022
Roozbeh Yousefzadeh, Xuenan Cao

Many applications affecting human lives rely on models that have come to be known under the umbrella of machine learning and artificial intelligence. These AI models are usually complicated mathematical functions that map from an input space to an output space. Stakeholders want to know the rationales behind models' decisions and functional behavior. We study this functional behavior in relation to the data used to create the models. On this topic, scholars have often assumed that models do not extrapolate, i.e., they learn from their training samples and process new input by interpolation. This assumption is questionable: we show that models extrapolate frequently; the extent of extrapolation varies and can be socially consequential. We demonstrate that extrapolation happens for a substantial portion of datasets, more than one would consider reasonable. How can we trust models if we do not know whether they are extrapolating? Given a model trained to recommend clinical procedures for patients, can we trust the recommendation when the model considers a patient older or younger than all the samples in the training set? If the training set is mostly Whites, to what extent can we trust its recommendations about Black and Hispanic patients? Along which dimension (race, gender, or age) does extrapolation happen? Even if a model is trained on people of all races, it still may extrapolate in significant ways related to race. The leading question is, to what extent can we trust AI models when they process inputs that fall outside their training set? This paper investigates several social applications of AI, showing how models extrapolate without notice. We also look at different sub-spaces of extrapolation for specific individuals subject to AI models and report how these extrapolations can be interpreted, not mathematically, but from a humanistic point of view.
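
The abstract's example of a patient "older or younger than all the samples in the training set" can be illustrated with a very simple check. The sketch below is not the authors' method: it only flags features whose value falls outside the range seen in training, which is a necessary but not sufficient test (a point can sit inside every per-feature range and still lie outside the convex hull of the training data). The function name, the feature names, and the toy records are all hypothetical.

```python
from typing import List, Sequence


def out_of_range_features(
    train: Sequence[Sequence[float]],
    query: Sequence[float],
    names: Sequence[str],
) -> List[str]:
    """Return the names of features whose query value lies outside
    the [min, max] range observed in the training rows."""
    flagged = []
    for j, name in enumerate(names):
        column = [row[j] for row in train]
        if query[j] < min(column) or query[j] > max(column):
            flagged.append(name)
    return flagged


# Hypothetical training records: (age, systolic blood pressure).
train = [(34, 118), (51, 130), (67, 142), (45, 125)]
names = ["age", "systolic_bp"]

# A 72-year-old is older than every training sample, so "age" is flagged.
print(out_of_range_features(train, (72, 128), names))
# A 50-year-old with typical blood pressure is inside both ranges.
print(out_of_range_features(train, (50, 126), names))
```

Reporting the flagged feature names, rather than a single yes/no answer, mirrors the abstract's question of which dimension the extrapolation happens along.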

Modelling Direct Messaging Networks with Multiple Recipients for Cyber Deception

Nov 21, 2021
Kristen Moore, Cody J. Christopher, David Liebowitz, Surya Nepal, Renee Selvey

Cyber deception is emerging as a promising approach to defending networks and systems against attackers and data thieves. However, despite being relatively cheap to deploy, the generation of realistic content at scale is very costly, because rich, interactive deceptive technologies are largely hand-crafted. With recent improvements in Machine Learning, we now have the opportunity to bring scale and automation to the creation of realistic and enticing simulated content. In this work, we propose a framework to automate the generation of email and instant messaging-style group communications at scale. Such messaging platforms within organisations contain a lot of valuable information inside private communications and document attachments, making them an enticing target for an adversary. We address two key aspects of simulating this type of system: modelling when and with whom participants communicate, and generating topical, multi-party text to populate simulated conversation threads. We present the LogNormMix-Net Temporal Point Process as an approach to the first of these, building upon the intensity-free modeling approach of Shchur et al. to create a generative model for unicast and multi-cast communications. We demonstrate the use of fine-tuned, pre-trained language models to generate convincing multi-party conversation threads. A live email server is simulated by uniting our LogNormMix-Net TPP (to generate the communication timestamp, sender and recipients) with the language model, which generates the contents of the multi-party email threads. We evaluate the generated content with respect to a number of realism-based properties that encourage a model to learn to generate content that will engage the attention of an adversary to achieve a deception outcome.
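
To make the timestamp-generation idea concrete, here is a minimal sketch of sampling message times from a mixture of log-normal distributions over inter-event times, which is what the name LogNormMix suggests. It assumes fixed, hand-picked mixture parameters; the model in the abstract additionally conditions on history and predicts sender and recipients, none of which is reproduced here. All function names and parameter values are hypothetical.

```python
import random
from typing import List, Sequence


def sample_inter_event_time(
    weights: Sequence[float],
    mus: Sequence[float],
    sigmas: Sequence[float],
    rng: random.Random,
) -> float:
    """Draw one waiting time from a mixture of log-normal components."""
    # Pick a component according to the mixture weights.
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    # Sample from the chosen log-normal (always positive).
    return rng.lognormvariate(mus[k], sigmas[k])


def simulate_timestamps(
    n: int,
    weights: Sequence[float],
    mus: Sequence[float],
    sigmas: Sequence[float],
    seed: int = 0,
) -> List[float]:
    """Accumulate n sampled waiting times into increasing timestamps."""
    rng = random.Random(seed)
    t = 0.0
    timestamps = []
    for _ in range(n):
        t += sample_inter_event_time(weights, mus, sigmas, rng)
        timestamps.append(t)
    return timestamps


# Hypothetical parameters: a "quick reply" component and a "long gap" component.
times = simulate_timestamps(5, weights=[0.7, 0.3], mus=[0.0, 3.0], sigmas=[0.5, 1.0])
print(times)
```

Modelling the waiting-time density directly, instead of an intensity function, makes sampling a two-step draw (component, then log-normal) with no thinning or numerical integration.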

What are the attackers doing now? Automating cyber threat intelligence extraction from text on pace with the changing threat landscape: A survey

Sep 14, 2021
Md Rayhanur Rahman, Rezvan Mahdavi-Hezaveh, Laurie Williams

Cybersecurity researchers have contributed to the automated extraction of cyberthreat intelligence (CTI) from textual sources, such as threat reports and online articles, where cyberattack strategies, procedures, and tools are described. The goal of this article is to help cybersecurity researchers understand the current techniques used for CTI extraction from text through a survey of relevant studies in the literature. We systematically collect "CTI extraction from text"-related studies from the literature and categorize the CTI extraction purposes. We propose a CTI extraction pipeline abstracted from these studies. We identify the data sources, techniques, and CTI sharing formats utilized in the context of the proposed pipeline. Our work finds ten types of extraction purposes, such as extraction of indicators of compromise, TTPs (tactics, techniques, procedures of attack), and cybersecurity keywords. We also identify seven types of textual sources for CTI extraction; textual data obtained from hacker forums, threat reports, social media posts, and online news articles have been used by almost 90% of the studies. Natural language processing, along with both supervised and unsupervised machine learning techniques such as named entity recognition, topic modelling, dependency parsing, supervised classification, and clustering, is used for CTI extraction. We observe that the technical challenges associated with these studies relate to obtaining clean, labelled data, which could assure replication, validation, and further extension of the studies. As the studies we find focus on CTI information extraction from text, we advocate for building upon the current CTI extraction work to help cybersecurity practitioners with proactive decision making, such as threat prioritization and automated threat modelling, to utilize knowledge from past cybersecurity incidents.

SkillNER: Mining and Mapping Soft Skills from any Text

Jan 22, 2021
Silvia Fareri, Nicola Melluso, Filippo Chiarello, Gualtiero Fantoni

In today's digital world there is an increasing focus on soft skills. The reasons are many; however, the main ones can be traced to the increased complexity of labor market dynamics and the shift towards digitalisation. Digitalisation has also increased the focus on soft skills, since such competencies are hardly acquired by Artificial Intelligence Systems. Despite this growing interest, researchers struggle to define the soft skill concept accurately and to create a complete and shared list of soft skills. Therefore, the aim of the present paper is the development of an automated tool capable of extracting soft skills from unstructured texts. Starting from an initial seed list of soft skills, we automatically collect a set of possible textual expressions referring to soft skills, thus creating a Soft Skills list. This has been done by applying Named Entity Recognition (NER) on a corpus of scientific papers, developing a novel approach and a software application able to perform the automatic extraction of soft skills from text: the SkillNER. We measured the performance of the tools considering different training models and validated our approach by comparing our list of soft skills with the skills labelled as transversal in ESCO (European Skills/Competence Qualification and Occupation). Finally, we give a first example of how the SkillNER can be used, identifying the relationships among ESCO job profiles based on shared soft skills, and the relationships among soft skills based on job profiles in common. The final map of soft skills-job profiles may help academia in achieving and sharing a clearer definition of what soft skills are and fuel future quantitative research on the topic.

Generating Knowledge Graphs by Employing Natural Language Processing and Machine Learning Techniques within the Scholarly Domain

Oct 28, 2020
Danilo Dessì, Francesco Osborne, Diego Reforgiato Recupero, Davide Buscaldi, Enrico Motta

The continuous growth of scientific literature brings innovations and, at the same time, raises new challenges. One of them is that its analysis has become difficult due to the high volume of published papers, for which manual effort for annotations and management is required. Novel technological infrastructures are needed to help researchers, research policy makers, and companies to time-efficiently browse, analyse, and forecast scientific research. Knowledge graphs, i.e., large networks of entities and relationships, have proved to be an effective solution in this space. Scientific knowledge graphs focus on the scholarly domain and typically contain metadata describing research publications such as authors, venues, organizations, research topics, and citations. However, the current generation of knowledge graphs lacks an explicit representation of the knowledge presented in the research papers. As such, in this paper, we present a new architecture that takes advantage of Natural Language Processing and Machine Learning methods for extracting entities and relationships from research publications and integrates them in a large-scale knowledge graph. Within this research work, we i) tackle the challenge of knowledge extraction by employing several state-of-the-art Natural Language Processing and Text Mining tools, ii) describe an approach for integrating entities and relationships generated by these tools, iii) show the advantage of such a hybrid system over alternative approaches, and iv) as a chosen use case, generate a scientific knowledge graph including 109,105 triples, extracted from 26,827 abstracts of papers within the Semantic Web domain. As our approach is general and can be applied to any domain, we expect that it can facilitate the management, analysis, dissemination, and processing of scientific knowledge.

* Accepted for publication in Future Generation Computer Systems journal - Special Issue on Machine Learning and Knowledge Graphs 

Synthetic Training for Monocular Human Mesh Recovery

Oct 27, 2020
Yu Sun, Qian Bao, Wu Liu, Wenpeng Gao, Yili Fu, Chuang Gan, Tao Mei

Recovering 3D human mesh from monocular images is a popular topic in computer vision and has a wide range of applications. This paper aims to estimate the 3D mesh of multiple body parts (e.g., body, hands) with large-scale differences from a single RGB image. Existing methods are mostly based on iterative optimization, which is very time-consuming. We propose to train a single-shot model to achieve this goal. The main challenge is the lack of training data that have complete 3D annotations of all body parts in 2D images. To solve this problem, we design a multi-branch framework to disentangle the regression of different body properties, enabling us to separate each component's training in a synthetic training manner using the unpaired data available. Besides, to strengthen the generalization ability, most existing methods have used in-the-wild 2D pose datasets to supervise the estimated 3D pose via 3D-to-2D projection. However, we observe that the commonly used weak-perspective model performs poorly in dealing with the external foreshortening effect of camera projection. Therefore, we propose a depth-to-scale (D2S) projection to incorporate the depth difference into the projection function to derive per-joint scale variants for more proper supervision. The proposed method outperforms previous methods on the CMU Panoptic Studio dataset according to the evaluation results and achieves comparable results on the Human3.6M body and STB hand benchmarks. More impressively, the performance on close-shot images is significantly improved using the proposed D2S projection for weak supervision, while maintaining obvious superiority in computational efficiency.
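
The foreshortening problem mentioned in the abstract can be seen in a few lines. The sketch below compares a weak-perspective projection, which applies one scale to every joint, with a projection whose scale depends on each joint's own depth. It is only an illustration of why per-joint scale matters for close shots; it is not the paper's D2S formulation, and the arm geometry, focal length, and function names are hypothetical.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def weak_perspective(joints: List[Point3D], focal: float) -> List[Point2D]:
    """Project all joints with a single scale derived from the mean depth."""
    mean_z = sum(z for _, _, z in joints) / len(joints)
    scale = focal / mean_z
    return [(scale * x, scale * y) for x, y, _ in joints]


def per_joint_scale(joints: List[Point3D], focal: float) -> List[Point2D]:
    """Project each joint with a scale derived from its own depth."""
    return [(focal / z * x, focal / z * y) for x, y, z in joints]


def max_gap(joints: List[Point3D], focal: float) -> float:
    """Largest 2D distance (in pixels) between the two projections."""
    a = weak_perspective(joints, focal)
    b = per_joint_scale(joints, focal)
    return max(
        ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5
        for (ax, ay), (bx, by) in zip(a, b)
    )


def arm(distance: float) -> List[Point3D]:
    """Hypothetical shoulder, elbow, hand; the hand reaches 0.6 m toward the camera."""
    return [
        (0.2, 0.0, distance),
        (0.3, 0.0, distance - 0.3),
        (0.4, 0.0, distance - 0.6),
    ]


close = max_gap(arm(1.5), focal=1000.0)  # subject 1.5 m from the camera
far = max_gap(arm(6.0), focal=1000.0)    # subject 6 m from the camera
print(f"close-shot gap: {close:.1f} px, distant-shot gap: {far:.1f} px")
```

The gap is much larger in the close shot because the depth spread of the arm is then a large fraction of the camera distance, which is consistent with the abstract's remark that close-shot images benefit most.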

The Elements of End-to-end Deep Face Recognition: A Survey of Recent Advances

Sep 28, 2020
Hang Du, Hailin Shi, Dan Zeng, Tao Mei

Face recognition is one of the most fundamental and long-standing topics in the computer vision community. With the recent developments of deep convolutional neural networks and large-scale datasets, deep face recognition has made remarkable progress and been widely used in real-world applications. Given a natural image or video frame as input, an end-to-end deep face recognition system outputs the face feature for recognition. To achieve this, the whole system is generally built with three key elements: face detection, face preprocessing, and face representation. Face detection locates faces in the image or frame. Then, face preprocessing calibrates the faces to a canonical view and crops them to a normalized pixel size. Finally, in the face representation stage, discriminative features are extracted from the preprocessed faces for recognition. All three elements are implemented with deep convolutional neural networks. In this paper, we present a comprehensive survey of the recent advances in every element of end-to-end deep face recognition, since thriving deep learning techniques have greatly improved their capability. To start with, we introduce an overview of end-to-end deep face recognition, which, as mentioned above, includes face detection, face preprocessing, and face representation. Then, we review the deep learning based advances of each element, covering many aspects such as up-to-date algorithm designs, evaluation metrics, datasets, performance comparison, existing challenges, and promising directions for future research. We hope this survey helps readers better understand the big picture of end-to-end face recognition and explore it more deeply in a systematic way.

Self-Supervised Gait Encoding with Locality-Aware Attention for Person Re-Identification

Aug 21, 2020
Haocong Rao, Siqi Wang, Xiping Hu, Mingkui Tan, Huang Da, Jun Cheng, Bin Hu

Gait-based person re-identification (Re-ID) is valuable for safety-critical applications, and using only 3D skeleton data to extract discriminative gait features for person Re-ID is an emerging open topic. Existing methods either adopt hand-crafted features or learn gait features by traditional supervised learning paradigms. Unlike previous methods, we propose, for the first time, a generic gait encoding approach that can utilize unlabeled skeleton data to learn gait representations in a self-supervised manner. Specifically, we first propose to introduce self-supervision by learning to reconstruct input skeleton sequences in reverse order, which facilitates learning richer high-level semantics and better gait representations. Second, inspired by the fact that motion's continuity endows temporally adjacent skeletons with higher correlations ("locality"), we propose a locality-aware attention mechanism that encourages learning larger attention weights for temporally adjacent skeletons when reconstructing the current skeleton, so as to learn locality when encoding gait. Finally, we propose Attention-based Gait Encodings (AGEs), which are built using context vectors learned by locality-aware attention, as final gait representations. AGEs are directly utilized to realize effective person Re-ID. Our approach typically improves existing skeleton-based methods by 10-20% Rank-1 accuracy, and it achieves comparable or even superior performance to multi-modal methods with extra RGB or depth information. Our codes are available at

* In IJCAI, pages 898-905, 2020 
* Accepted at IJCAI 2020 Main Track. Sole copyright holder is IJCAI. Codes are available at 

SAR Image Despeckling by Deep Neural Networks: from a pre-trained model to an end-to-end training strategy

Jul 02, 2020
Emanuele Dalsasso, Xiangli Yang, Loïc Denis, Florence Tupin, Wen Yang

Speckle reduction is a longstanding topic in synthetic aperture radar (SAR) imaging. Many different schemes have been proposed for the restoration of intensity SAR images. Among the different possible approaches, methods based on convolutional neural networks (CNNs) have recently been shown to reach state-of-the-art performance for SAR image restoration. CNN training requires good training data: many pairs of speckle-free / speckle-corrupted images. This is an issue in SAR applications, given the inherent scarcity of speckle-free images. To handle this problem, this paper analyzes different strategies one can adopt, depending on the speckle removal task one wishes to perform and the availability of multitemporal stacks of SAR data. The first strategy applies a CNN model, trained to remove additive white Gaussian noise from natural images, to a recently proposed SAR speckle removal framework: MuLoG (MUlti-channel LOgarithm with Gaussian denoising). No training on SAR images is performed; the network is readily applied to speckle reduction tasks. The second strategy considers a novel approach to construct a reliable dataset of speckle-free SAR images necessary to train a CNN model. Finally, a hybrid approach is also analyzed: the CNN used to remove additive white Gaussian noise is trained on speckle-free SAR images. The proposed methods are compared to other state-of-the-art speckle removal filters, to evaluate the quality of denoising and to discuss the pros and cons of the different strategies. Along with the paper, we make available the weights of the trained network to allow its usage by other researchers.
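
Why a denoiser trained for additive Gaussian noise can be reused at all is easier to see with a small simulation. The sketch below assumes the standard multiplicative model for multilook intensity speckle (observed intensity = reflectivity times unit-mean Gamma noise) and shows that a logarithm turns it into additive noise whose spread no longer depends on the signal. It illustrates only the log-transform idea behind the MuLoG name; the log-domain noise is not exactly Gaussian, and nothing here reproduces the MuLoG algorithm or the paper's networks. The function name and parameter values are hypothetical.

```python
import math
import random
import statistics
from typing import List


def speckled_intensity(reflectivity: float, looks: int, n: int,
                       rng: random.Random) -> List[float]:
    """Simulate n intensity samples: reflectivity times unit-mean Gamma speckle."""
    # Gamma with shape = looks and scale = 1 / looks has mean 1.
    return [reflectivity * rng.gammavariate(looks, 1.0 / looks) for _ in range(n)]


rng = random.Random(0)
dark = speckled_intensity(1.0, looks=4, n=20000, rng=rng)
bright = speckled_intensity(10.0, looks=4, n=20000, rng=rng)

# In the intensity domain the noise spread grows with the signal.
std_dark = statistics.stdev(dark)
std_bright = statistics.stdev(bright)

# In the log domain the noise is additive and its spread is signal-independent.
log_std_dark = statistics.stdev(math.log(v) for v in dark)
log_std_bright = statistics.stdev(math.log(v) for v in bright)

print(f"intensity std ratio (bright/dark): {std_bright / std_dark:.2f}")
print(f"log-domain std ratio (bright/dark): {log_std_bright / log_std_dark:.2f}")
```

The first ratio tracks the tenfold difference in reflectivity, while the second stays near one, which is the property that lets a signal-independent denoiser operate on log-transformed SAR data.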

* Article submitted to Remote Sensing, MDPI. Notebook with Colab compatibility is available at 
