"Topic": models, code, and papers

CCS-GAN: COVID-19 CT-scan classification with very few positive training images

Oct 01, 2021
Sumeet Menon, Jayalakshmi Mangalagiri, Josh Galita, Michael Morris, Babak Saboury, Yaacov Yesha, Yelena Yesha, Phuong Nguyen, Aryya Gangopadhyay, David Chapman

We present a novel algorithm that is able to classify COVID-19 pneumonia from CT-scan slices using a very small sample of training images exhibiting COVID-19 pneumonia in tandem with a larger number of normal images. This algorithm is able to achieve high classification accuracy using as few as 10 positive training slices (from 10 positive cases), which to the best of our knowledge is one order of magnitude fewer than the next closest published work at the time of writing. Deep learning with extremely small positive training volumes is a very difficult problem and has been an important topic during the COVID-19 pandemic, because for quite some time it was difficult to obtain large volumes of COVID-19 positive images for training. Algorithms that can learn to screen for diseases using few examples are an important area of research. We present the Cycle Consistent Segmentation Generative Adversarial Network (CCS-GAN). CCS-GAN combines style transfer with pulmonary segmentation and relevant transfer learning from negative images in order to create a larger volume of synthetic positive images for the purposes of improving diagnostic classification performance. A VGG-19 classifier plus CCS-GAN was trained using a small sample of positive image slices, ranging from at most 50 down to as few as 10 COVID-19 positive CT-scan images. CCS-GAN achieves high accuracy with few positive images and thereby greatly reduces the barrier of acquiring large training volumes in order to train a diagnostic classifier for COVID-19.

* 10 pages, 9 figures, 1 table, submitted to IEEE Transactions on Medical Imaging 


Proceedings of the 9th International Symposium on Symbolic Computation in Software Science

Sep 06, 2021
Temur Kutsia

This volume contains papers presented at the Ninth International Symposium on Symbolic Computation in Software Science, SCSS 2021. Symbolic Computation is the science of computing with symbolic objects (terms, formulae, programs, representations of algebraic objects, etc.). Powerful algorithms have been developed during the past decades for the major subareas of symbolic computation: computer algebra and computational logic. These algorithms and methods are successfully applied in various fields, including software science, which covers a broad range of topics about software construction and analysis. Meanwhile, artificial intelligence methods and machine learning algorithms are widely used nowadays in various domains and, in particular, combined with symbolic computation. Several approaches mix artificial intelligence and symbolic methods and tools deployed over large corpora to create what is known as cognitive systems. Cognitive computing focuses on building systems that interact with humans naturally by reasoning, aiming at learning at scale. The purpose of SCSS is to promote research on theoretical and practical aspects of symbolic computation in software science, combined with modern artificial intelligence techniques. These proceedings contain the keynote paper by Bruno Buchberger and ten contributed papers. Besides, the conference program included three invited talks, nine short and work-in-progress papers, and a special session on computer algebra and computational logic. Due to the COVID-19 pandemic, the symposium was held completely online. It was organized by the Research Institute for Symbolic Computation (RISC) of the Johannes Kepler University Linz on September 8--10, 2021.

* EPTCS 342, 2021 


MIDV-2020: A Comprehensive Benchmark Dataset for Identity Document Analysis

Jul 01, 2021
Konstantin Bulatov, Ekaterina Emelianova, Daniil Tropin, Natalya Skoryukina, Yulia Chernyshova, Alexander Sheshkus, Sergey Usilin, Zuheng Ming, Jean-Christophe Burie, Muhammad Muzzamil Luqman, Vladimir V. Arlazarov

Identity document recognition is an important sub-field of document analysis, which deals with tasks of robust document detection, type identification, text field recognition, as well as identity fraud prevention and document authenticity validation given photos, scans, or video frames of an identity document capture. A significant amount of research has been published on this topic in recent years; however, a chief difficulty for such research is the scarcity of datasets, due to the subject matter being protected by security requirements. The few datasets of identity documents that are available lack diversity of document types, capturing conditions, or variability of document field values. In addition, the published datasets were typically designed only for a subset of document recognition problems, not for complex identity document analysis. In this paper, we present a dataset MIDV-2020 which consists of 1000 video clips, 2000 scanned images, and 1000 photos of 1000 unique mock identity documents, each with unique text field values and unique artificially generated faces, with rich annotation. For the presented benchmark dataset, baselines are provided for such tasks as document location and identification, text field recognition, and face detection. With 72409 annotated images in total, at the date of publication the proposed dataset is the largest publicly available identity documents dataset with variable artificially generated data, and we believe that it will prove invaluable for advancement of the field of document analysis and recognition. The dataset is available for download at and .


The Contestation of Tech Ethics: A Sociotechnical Approach to Ethics and Technology in Action

Jun 03, 2021
Ben Green

Recent controversies related to topics such as fake news, privacy, and algorithmic bias have prompted increased public scrutiny of digital technologies and soul-searching among many of the people associated with their development. In response, the tech industry, academia, civil society, and governments have rapidly increased their attention to "ethics" in the design and use of digital technologies ("tech ethics"). Yet almost as quickly as ethics discourse has proliferated across the world of digital technologies, the limitations of these approaches have also become apparent: tech ethics is vague and toothless, is subsumed into corporate logics and incentives, and has a myopic focus on individual engineers and technology design rather than on the structures and cultures of technology production. As a result of these limitations, many have grown skeptical of tech ethics and its proponents, charging them with "ethics-washing": promoting ethics research and discourse to defuse criticism and government regulation without committing to ethical behavior. By looking at how ethics has been taken up in both science and business in superficial and depoliticizing ways, I recast tech ethics as a terrain of contestation where the central fault line is not whether it is desirable to be ethical, but what "ethics" entails and who gets to define it. This framing highlights the significant limits of current approaches to tech ethics and the importance of studying the formulation and real-world effects of tech ethics. In order to identify and develop more rigorous strategies for reforming digital technologies and the social relations that they mediate, I describe a sociotechnical approach to tech ethics, one that reflexively applies many of tech ethics' own lessons regarding digital technologies to tech ethics itself.


A Novel Neuron Model of Visual Processor

Apr 15, 2021
Jizhao Liu, Jing Lian, J C Sprott, Yide Ma

Simulating and imitating the neuronal network of humans or mammals is a popular topic that has been explored for many years in the fields of pattern recognition and computer vision. Inspired by neuronal conduction characteristics in the primary visual cortex of cats, pulse-coupled neural networks (PCNNs) can exhibit synchronous oscillation behavior, which can process digital images without training. However, according to the study of single cells in the cat primary visual cortex, when a neuron is stimulated by an external periodic signal, the interspike-interval (ISI) distributions represent a multimodal distribution. This phenomenon cannot be explained by all PCNN models. By analyzing the working mechanism of the PCNN, we present a novel neuron model of the primary visual cortex consisting of a continuous-coupled neural network (CCNN). Our model inherits the threshold exponential decay and synchronous pulse oscillation properties of the original PCNN model, and it can exhibit chaotic behavior consistent with the testing results of cat primary visual cortex neurons. Therefore, our CCNN model is closer to real visual neural networks. For image segmentation tasks, the algorithm based on the CCNN model outperforms state-of-the-art visual cortex neural network models. The strength of our approach is that it helps neurophysiologists further understand how the primary visual cortex works and can be used to quantitatively predict the temporal-spatial behavior of real neural networks. CCNN may also inspire engineers to create brain-inspired deep learning networks for artificial intelligence purposes.
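
The "threshold exponential decay" mentioned in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' CCNN; it assumes a single, uncoupled neuron in the style of a simplified PCNN, where the neuron pulses whenever a constant stimulus exceeds a dynamic threshold, and the threshold then jumps and decays exponentially. The function name and the parameters `alpha`, `v_theta`, and `theta0` are illustrative choices, not values from the paper.

```python
import math

def simulate_threshold_neuron(stimulus, steps, alpha=0.2, v_theta=2.0, theta0=1.0):
    """Simulate one uncoupled neuron with an exponentially decaying threshold.

    Returns the list of time steps at which the neuron fires.
    All parameter values are illustrative assumptions.
    """
    theta = theta0
    spikes = []
    for n in range(steps):
        fired = stimulus > theta  # pulse when the input exceeds the threshold
        if fired:
            spikes.append(n)
        # The threshold decays exponentially and jumps by v_theta after a pulse.
        theta = math.exp(-alpha) * theta + (v_theta if fired else 0.0)
    return spikes

spike_times = simulate_threshold_neuron(stimulus=0.5, steps=40)
intervals = [b - a for a, b in zip(spike_times, spike_times[1:])]
print(spike_times, intervals)
```

With a constant stimulus, this binary-pulse neuron settles into strictly periodic firing, so its interspike intervals collapse to a single value. That regularity is the kind of behavior the abstract contrasts with the multimodal ISI distributions observed in real neurons.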


AutonoML: Towards an Integrated Framework for Autonomous Machine Learning

Dec 23, 2020
David Jacob Kedziora, Katarzyna Musial, Bogdan Gabrys

Over the last decade, the long-running endeavour to automate high-level processes in machine learning (ML) has risen to mainstream prominence, stimulated by advances in optimisation techniques and their impact on selecting ML models/algorithms. Central to this drive is the appeal of engineering a computational system that both discovers and deploys high-performance solutions to arbitrary ML problems with minimal human interaction. Beyond this, an even loftier goal is the pursuit of autonomy, which describes the capability of the system to independently adjust an ML solution over a lifetime of changing contexts. However, these ambitions are unlikely to be achieved in a robust manner without the broader synthesis of various mechanisms and theoretical frameworks, which, at the present time, remain scattered across numerous research threads. Accordingly, this review seeks to motivate a more expansive perspective on what constitutes an automated/autonomous ML system, alongside consideration of how best to consolidate those elements. In doing so, we survey developments in the following research areas: hyperparameter optimisation, multi-component models, neural architecture search, automated feature engineering, meta-learning, multi-level ensembling, dynamic adaptation, multi-objective evaluation, resource constraints, flexible user involvement, and the principles of generalisation. We also develop a conceptual framework throughout the review, augmented by each topic, to illustrate one possible way of fusing high-level mechanisms into an autonomous ML system. Ultimately, we conclude that the notion of architectural integration deserves more discussion, without which the field of automated ML risks stifling both its technical advantages and general uptake.


Single-Frame based Deep View Synchronization for Unsynchronized Multi-Camera Surveillance

Jul 08, 2020
Qi Zhang, Antoni B. Chan

Multi-camera surveillance has been an active research topic for understanding and modeling scenes. Compared to a single camera, multiple cameras provide a larger field of view and more object cues, and the related applications include multi-view counting, multi-view tracking, 3D pose estimation, and 3D reconstruction. It is usually assumed that the cameras are all temporally synchronized when designing models for these multi-camera based tasks. However, this assumption is not always valid, especially for multi-camera systems with network transmission delay and low frame rates due to limited network bandwidth, resulting in desynchronization of the captured frames across cameras. To handle the issue of unsynchronized multi-cameras, in this paper, we propose a synchronization model that works in conjunction with existing DNN-based multi-view models, thus avoiding the redesign of the whole model. Under the low-fps regime, we assume that only a single relevant frame is available from each view, and synchronization is achieved by matching together image contents guided by epipolar geometry. We consider two variants of the model, based on where in the pipeline the synchronization occurs: scene-level synchronization and camera-level synchronization. The view synchronization step and the task-specific view fusion and prediction step are unified in the same framework and trained in an end-to-end fashion. Our view synchronization models are applied to different DNN-based multi-camera vision tasks under the unsynchronized setting, including multi-view counting and 3D pose estimation, and achieve good performance compared to baselines.

* 12 pages 
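
The abstract above says that content matching across views is "guided by epipolar geometry". The authors' model is a learned network, so the following is only a sketch of the underlying geometric constraint: a pixel `x` in one view maps to a line `F x` in the other view, and a correct match should lie close to that line. The function names and the toy fundamental matrix `F_toy` (a pure sideways camera translation with identity intrinsics) are assumptions for illustration.

```python
import math

def epipolar_line(F, x):
    """Return coefficients (a, b, c) of the epipolar line l' = F x for pixel x = (u, v)."""
    u, v = x
    p = (u, v, 1.0)  # homogeneous coordinates
    return tuple(sum(F[r][k] * p[k] for k in range(3)) for r in range(3))

def epipolar_distance(F, x, x_prime):
    """Distance in pixels from x_prime (second view) to the epipolar line of x."""
    a, b, c = epipolar_line(F, x)
    u, v = x_prime
    return abs(a * u + b * v + c) / math.hypot(a, b)

# Toy fundamental matrix: skew-symmetric matrix of t = (1, 0, 0), i.e. a pure
# translation along the x-axis (an assumption, not taken from the paper).
F_toy = [[0.0, 0.0, 0.0],
         [0.0, 0.0, -1.0],
         [0.0, 1.0, 0.0]]

on_line = epipolar_distance(F_toy, (10.0, 20.0), (35.0, 20.0))
off_line = epipolar_distance(F_toy, (10.0, 20.0), (35.0, 23.0))
print(on_line, off_line)
```

For this toy setup the epipolar lines are horizontal, so a candidate match on the same image row has zero distance and one three rows away has distance 3. A matching procedure could use such distances to restrict where it searches for corresponding content.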


Augmenting Visual SLAM with Wi-Fi Sensing For Indoor Applications

Mar 15, 2019
Zakieh S. Hashemifar, Charuvahan Adhivarahan, Anand Balakrishnan, Karthik Dantu

Recent trends have accelerated the development of spatial applications on mobile devices and robots. These include navigation, augmented reality, human-robot interaction, and others. A key enabling technology for such applications is the understanding of the device's location and the map of the surrounding environment. This generic problem, referred to as Simultaneous Localization and Mapping (SLAM), is an extensively researched topic in robotics. However, visual SLAM algorithms face several challenges including perceptual aliasing and high computational cost. These challenges affect the accuracy, efficiency, and viability of visual SLAM algorithms, especially for long-term SLAM, and their use in resource-constrained mobile devices. A parallel trend is the ubiquity of Wi-Fi routers for quick Internet access in most urban environments. Most robots and mobile devices are equipped with a Wi-Fi radio as well. We propose a method to utilize Wi-Fi received signal strength to alleviate the challenges faced by visual SLAM algorithms. To demonstrate the utility of this idea, this work makes the following contributions: (i) we propose a generic way to integrate Wi-Fi sensing into visual SLAM algorithms, (ii) we integrate such sensing into three well-known SLAM algorithms, (iii) using four distinct datasets, we demonstrate the performance of such augmentation in comparison to the original visual algorithms, and (iv) we compare our work to the Wi-Fi augmented FABMAP algorithm. Overall, we show that our approach can improve the accuracy of visual SLAM algorithms by 11% on average and reduce computation time on average by 15% to 25%.

* 16 pages, 19 figures, Autonomous Robots Journal submission (AuRo) 
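
One way Wi-Fi received signal strength (RSS) could reduce both perceptual aliasing and computation is by pruning visual matching candidates to places with a similar Wi-Fi signature. The sketch below illustrates that general idea only; it is not the integration described in the paper. The signature format (access point id mapped to RSS in dBm), the distance measure, the `missing_dbm` fill value, and the threshold are all hypothetical.

```python
def rss_distance(sig_a, sig_b, missing_dbm=-100.0):
    """Mean absolute RSS difference (dB) over all access points seen in either scan.

    Access points missing from one scan are treated as very weak (missing_dbm),
    which is an illustrative assumption.
    """
    aps = set(sig_a) | set(sig_b)
    if not aps:
        return float("inf")
    total = sum(abs(sig_a.get(ap, missing_dbm) - sig_b.get(ap, missing_dbm)) for ap in aps)
    return total / len(aps)

def wifi_candidates(query_sig, keyframes, max_distance_db=8.0):
    """Keep only keyframes whose Wi-Fi signature is close to the query scan."""
    return [kf_id for kf_id, sig in keyframes.items()
            if rss_distance(query_sig, sig) <= max_distance_db]

# Hypothetical keyframe database: keyframe id -> {access point: RSS in dBm}.
keyframes = {
    "kf1": {"ap1": -40.0, "ap2": -70.0},
    "kf2": {"ap1": -85.0, "ap3": -50.0},
    "kf3": {"ap1": -45.0, "ap2": -66.0},
}
query = {"ap1": -42.0, "ap2": -68.0}
print(wifi_candidates(query, keyframes))
```

Only the surviving keyframes would then be passed to the more expensive visual comparison, so visually similar places with very different Wi-Fi signatures are never compared.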


End-to-end Structure-Aware Convolutional Networks for Knowledge Base Completion

Nov 15, 2018
Chao Shang, Yun Tang, Jing Huang, Jinbo Bi, Xiaodong He, Bowen Zhou

Knowledge graph embedding has been an active research topic for knowledge base completion, with progressive improvement from early models such as TransE, TransH, and DistMult to the current state-of-the-art ConvE. ConvE uses 2D convolution over embeddings and multiple layers of nonlinear features to model knowledge graphs. The model can be trained efficiently and scales to large knowledge graphs. However, there is no structure enforcement in the embedding space of ConvE. The recent graph convolutional network (GCN) provides another way of learning graph node embeddings by successfully utilizing graph connectivity structure. In this work, we propose a novel end-to-end Structure-Aware Convolutional Network (SACN) that combines the benefits of GCN and ConvE. SACN consists of an encoder of a weighted graph convolutional network (WGCN), and a decoder of a convolutional network called Conv-TransE. WGCN utilizes knowledge graph node structure, node attributes and edge relation types. It has learnable weights that adapt the amount of information from neighbors used in local aggregation, leading to more accurate embeddings of graph nodes. Node attributes in the graph are represented as additional nodes in the WGCN. The decoder Conv-TransE enables the state-of-the-art ConvE to be translational between entities and relations while keeping the same link prediction performance as ConvE. We demonstrate the effectiveness of the proposed SACN on standard FB15k-237 and WN18RR datasets, and it gives about 10% relative improvement over the state-of-the-art ConvE in terms of HITS@1, HITS@3 and HITS@10.

* The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019) 
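
The abstract describes WGCN as having learnable weights that control how much neighbor information enters the local aggregation, depending on edge relation type. A minimal sketch of one such layer follows, assuming one scalar weight per relation type, a self-loop with weight 1, and a tanh nonlinearity. These details, the function name `wgcn_layer`, and the toy relation names are illustrative assumptions; in a real model `alpha` and `W` would be learned.

```python
import math

def wgcn_layer(H, edges, alpha, W):
    """One relation-weighted graph-convolution step (illustrative sketch).

    H:     list of node feature vectors (n x d_in)
    edges: list of (src, dst, relation_type) triples
    alpha: dict mapping relation_type to a scalar weight
    W:     d_in x d_out weight matrix
    """
    n, d_in, d_out = len(H), len(W), len(W[0])
    # Start from the self-loop term, then add relation-weighted neighbor features.
    agg = [row[:] for row in H]
    for src, dst, rel in edges:
        for k in range(d_in):
            agg[dst][k] += alpha[rel] * H[src][k]
    # Linear map followed by a tanh nonlinearity.
    return [[math.tanh(sum(agg[i][k] * W[k][j] for k in range(d_in)))
             for j in range(d_out)]
            for i in range(n)]

H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 2, "born_in"), (1, 2, "works_at")]  # hypothetical relations
alpha = {"born_in": 0.5, "works_at": 2.0}
W = [[1.0, 0.0], [0.0, 1.0]]
out = wgcn_layer(H, edges, alpha, W)
print(out[2])
```

Node 2 receives half of node 0's features and twice node 1's features, so relation types with larger weights contribute more to the resulting embedding.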


Minimizing Polarization and Disagreement in Social Networks

Dec 28, 2017
Cameron Musco, Christopher Musco, Charalampos E. Tsourakakis

The rise of social media and online social networks has been a disruptive force in society. Opinions are increasingly shaped by interactions on online social media, and social phenomena including disagreement and polarization are now tightly woven into everyday life. In this work we initiate the study of the following question: given $n$ agents, each with its own initial opinion that reflects its core value on a topic, and an opinion dynamics model, what is the structure of a social network that minimizes polarization and disagreement simultaneously? This question is central to recommender systems: should a recommender system prefer a link suggestion between two online users with similar mindsets in order to keep disagreement low, or between two users with different opinions in order to expose each to the other's viewpoint of the world, and decrease overall levels of polarization? Our contributions include a mathematical formalization of this question as an optimization problem and an exact, time-efficient algorithm. We also prove that there always exists a network with $O(n/\epsilon^2)$ edges that is a $(1+\epsilon)$ approximation to the optimum. For a fixed graph, we additionally show how to optimize our objective function over the agents' innate opinions in polynomial time. We perform an empirical study of our proposed methods on synthetic and real-world data that verifies their value as mining tools to better understand the trade-off between disagreement and polarization. We find that there is a lot of room to reduce both polarization and disagreement in real-world networks; for instance, on a Reddit network where users exchange comments on politics, our methods achieve a $\sim 60\,000$-fold reduction in polarization and disagreement.

* 19 pages (accepted, WWW 2018) 
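
The abstract does not name its opinion dynamics model. As a hedged illustration, the sketch below assumes Friedkin-Johnsen dynamics, in which each agent repeatedly averages its innate opinion with its neighbors' expressed opinions. It also assumes polarization is the sum of squared deviations of the equilibrium opinions from their mean, and disagreement is the weighted sum of squared opinion differences across edges. All function names are hypothetical.

```python
def fj_equilibrium(weights, innate, iters=500):
    """Iterate z_i <- (s_i + sum_j w_ij z_j) / (1 + sum_j w_ij) to a fixed point."""
    n = len(innate)
    z = list(innate)
    for _ in range(iters):
        z = [(innate[i] + sum(weights[i][j] * z[j] for j in range(n)))
             / (1.0 + sum(weights[i]))
             for i in range(n)]
    return z

def polarization(z):
    """Sum of squared deviations of expressed opinions from their mean."""
    mean = sum(z) / len(z)
    return sum((zi - mean) ** 2 for zi in z)

def disagreement(weights, z):
    """Weighted sum of squared opinion differences over undirected edges."""
    n = len(z)
    return sum(weights[i][j] * (z[i] - z[j]) ** 2
               for i in range(n) for j in range(i + 1, n))

# Path graph on three agents with unit edge weights and mean-zero innate opinions.
weights = [[0.0, 1.0, 0.0],
           [1.0, 0.0, 1.0],
           [0.0, 1.0, 0.0]]
innate = [-1.0, 0.0, 1.0]
z = fj_equilibrium(weights, innate)
print(z, polarization(z), disagreement(weights, z))
```

Under these assumptions, the combined objective for mean-centered innate opinions $s$ equals $s^T (I + L)^{-1} s$, where $L$ is the graph Laplacian. For the toy graph the equilibrium is approximately $(-0.5, 0, 0.5)$, and polarization and disagreement each come to about 0.5.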
