"Topic": models, code, and papers

Avoiding bias when inferring race using name-based approaches

Apr 14, 2021
Diego Kozlowski, Dakota S. Murray, Alexis Bell, Will Hulsey, Vincent Larivière, Thema Monroe-White, Cassidy R. Sugimoto

Racial disparity in academia is a widely acknowledged problem. The quantitative understanding of race-based systemic inequalities is an important step towards a more equitable research system. However, few large-scale analyses have been performed on this topic, mostly because of the lack of robust race-disambiguation algorithms. Identifying information about authors does not generally include the author's race. Therefore, an algorithm needs to be employed, using known information about authors, i.e., their names, to infer their perceived race. Nevertheless, as with any other algorithm, the process of racial inference can generate biases if it is not carefully considered. When the research is focused on understanding race-based inequalities, such biases undermine the objectives of the investigation and may perpetuate inequities. The goal of this article is to assess the biases introduced by the different approaches used for name-based racial inference. We use information from the US census and mortgage applications to infer the race of US author names in the Web of Science. We estimate the effects of using given and family names, thresholds or continuous distributions, and imputation. Our results demonstrate that the validity of name-based inference varies by race and ethnicity and that threshold approaches underestimate Black authors and overestimate White authors. We conclude with recommendations to avoid potential biases. This article fills an important research gap that will allow more systematic and unbiased studies on racial disparity in science.
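
The contrast between threshold and continuous-distribution approaches mentioned in the abstract can be illustrated with a minimal sketch. The names, probabilities, and function names below are hypothetical and chosen only to show the mechanism; they are not the paper's data, census figures, or code.

```python
# Hypothetical name-to-race probabilities, invented for illustration only.
NAME_PROBS = {
    "name_a": {"White": 0.90, "Black": 0.05, "Hispanic": 0.03, "Asian": 0.02},
    "name_b": {"White": 0.55, "Black": 0.40, "Hispanic": 0.03, "Asian": 0.02},
    "name_c": {"White": 0.60, "Black": 0.35, "Hispanic": 0.03, "Asian": 0.02},
    "name_d": {"White": 0.10, "Black": 0.10, "Hispanic": 0.75, "Asian": 0.05},
}
AUTHORS = ["name_a", "name_b", "name_c", "name_d"]


def threshold_counts(names, threshold):
    """Assign each name to its most likely group, but only if that
    probability reaches the threshold; otherwise the name is dropped."""
    counts = {}
    for name in names:
        group, prob = max(NAME_PROBS[name].items(), key=lambda kv: kv[1])
        if prob >= threshold:
            counts[group] = counts.get(group, 0) + 1
    return counts


def fractional_counts(names):
    """Spread each name across all groups in proportion to its probabilities."""
    totals = {}
    for name in names:
        for group, prob in NAME_PROBS[name].items():
            totals[group] = totals.get(group, 0.0) + prob
    return totals


if __name__ == "__main__":
    print(threshold_counts(AUTHORS, 0.7))
    print(fractional_counts(AUTHORS))
```

In this toy setup, a 0.7 threshold keeps only two of the four names and assigns none to the Black group, whereas the fractional count attributes roughly 0.9 of an author to it. This is the kind of undercount the abstract attributes to threshold approaches, although the magnitude here is an artifact of the invented numbers.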

Omni-swarm: A Decentralized Omnidirectional Visual-Inertial-UWB State Estimation System for Aerial Swarm

Apr 01, 2021
Hao Xu, Yichen Zhang, Boyu Zhou, Luqi Wang, Xinjie Yao, Guotao Meng, Shaojie Shen

Decentralized state estimation is one of the most fundamental components of autonomous aerial swarm systems in GPS-denied areas, and it remains a highly challenging research topic. To address this research niche, this paper proposes Omni-swarm, a decentralized omnidirectional visual-inertial-UWB state estimation system for aerial swarms. To solve the issues of observability, complicated initialization, insufficient accuracy, and lack of global consistency, we introduce an omnidirectional perception system as the front-end of Omni-swarm. It consists of omnidirectional sensors, which include stereo fisheye cameras and ultra-wideband (UWB) sensors, and algorithms, which include fisheye visual-inertial odometry (VIO), multi-drone map-based localization, and a visual object detector. Graph-based optimization and forward propagation work as the back-end of Omni-swarm to fuse the measurements from the front-end. According to the experimental results, the proposed decentralized state estimation method achieves centimeter-level relative state estimation accuracy on the swarm system while ensuring global consistency. Moreover, supported by Omni-swarm, inter-drone collision avoidance can be accomplished in a fully decentralized scheme without any external device, demonstrating the potential of Omni-swarm to be the foundation of autonomous aerial swarm flights in different scenarios.

Unbox the Black-box for the Medical Explainable AI via Multi-modal and Multi-centre Data Fusion: A Mini-Review, Two Showcases and Beyond

Feb 03, 2021
Guang Yang, Qinghao Ye, Jun Xia

Explainable Artificial Intelligence (XAI) is an emerging research topic in machine learning aimed at unboxing how AI systems' black-box choices are made. This research field inspects the measures and models involved in decision-making and seeks solutions to explain them explicitly. Many machine learning algorithms cannot show how and why a decision has been made. This is particularly true of the most popular deep neural network approaches currently in use. Consequently, our confidence in AI systems can be hindered by the lack of explainability in these black-box models. XAI is becoming more and more crucial for deep learning powered applications, especially in medical and healthcare studies, even though these deep neural networks can in general deliver a striking gain in performance. The insufficient explainability and transparency of most existing AI systems may be one of the major reasons that successful implementation and integration of AI tools into routine clinical practice are uncommon. In this study, we first survey the current progress of XAI and in particular its advances in healthcare applications. We then introduce our solutions for XAI leveraging multi-modal and multi-centre data fusion, and subsequently validate them in two showcases following real clinical scenarios. Comprehensive quantitative and qualitative analyses demonstrate the efficacy of our proposed XAI solutions, from which we can envisage successful applications in a broader range of clinical questions.

* 68 pages, 19 figures, submitted to the Information Fusion journal 

Counting Protests in News Articles: A Dataset and Semi-Automated Data Collection Pipeline

Feb 01, 2021
Tommy Leung, L. Nathan Perkins

Between January 2017 and January 2021, thousands of local news sources in the United States reported on over 42,000 protests about topics such as civil rights, immigration, guns, and the environment. Given the vast number of local journalists who report on protests daily, extracting these events as structured data to understand temporal and geographic trends can empower civic decision-making. However, the task of extracting events from news articles presents well-known challenges to the NLP community in the fields of domain detection, slot filling, and coreference resolution. To help improve the resources available for extracting structured data from news stories, our contribution is three-fold. We 1) release a manually labeled dataset of news article URLs, dates, locations, crowd size estimates, and 494 discrete descriptive tags corresponding to 42,347 reported protest events in the United States between January 2017 and January 2021; 2) describe the semi-automated data collection pipeline used to discover, sort, and review the 144,568 English articles that comprise the dataset; and 3) benchmark a long short-term memory (LSTM) low-dimensional classifier that demonstrates the utility of processing news articles based on syntactic structures, such as paragraphs and sentences, to count the number of reported protest events.

Hyperspectral Image Classification -- Traditional to Deep Models: A Survey for Future Prospects

Jan 15, 2021
Sidrah Shabbir, Muhammad Ahmad

Hyperspectral Imaging (HSI) has been extensively utilized in many real-life applications because it benefits from the detailed spectral information contained in each pixel. Notably, the complex characteristics of HSI data, i.e., the nonlinear relation between the captured spectral information and the corresponding object, make accurate classification challenging for traditional methods. In the last few years, deep learning (DL) has been established as a powerful feature extractor that effectively addresses the nonlinear problems that appear in a number of computer vision tasks. This has prompted the deployment of DL for HSI classification (HSIC), where it has shown good performance. This survey presents a systematic overview of DL for HSIC and compares state-of-the-art strategies on the topic. First, we summarize the main challenges of traditional machine learning for HSIC and then show the advantages of DL in addressing these problems. The survey breaks down the state-of-the-art DL frameworks into spectral-feature, spatial-feature, and joint spatial-spectral-feature approaches to systematically analyze the achievements (and future directions) of these frameworks for HSIC. Moreover, we consider the fact that DL requires a large number of labeled training examples, whereas acquiring such a number for HSIC is challenging in terms of time and cost. Therefore, this survey discusses some strategies to improve the generalization performance of DL methods, which can provide some guidelines for future work.

HpGAN: Sequence Search with Generative Adversarial Networks

Dec 10, 2020
Mingxing Zhang, Zhengchun Zhou, Lanping Li, Zilong Liu, Meng Yang, Yanghe Feng

Sequences play an important role in many engineering applications and systems. Searching for sequences with desired properties has long been an interesting but also challenging research topic. This article proposes a novel method, called HpGAN, to search for desired sequences algorithmically using generative adversarial networks (GAN). HpGAN is based on the idea of a zero-sum game to train a generative model, which can generate sequences with characteristics similar to the training sequences. In HpGAN, we design the Hopfield network as an encoder to avoid the limitations of GAN in generating discrete data. Compared with traditional sequence construction by algebraic tools, HpGAN is particularly suitable for intractable problems with complex objectives which prevent mathematical analysis. We demonstrate the search capabilities of HpGAN in two applications: 1) HpGAN successfully found many different mutually orthogonal complementary code sets (MOCCSs) and optimal odd-length Z-complementary pairs (OB-ZCPs) which are not part of the training set. In the literature, both MOCCSs and OB-ZCPs have found wide applications in wireless communications. 2) HpGAN found new sequences which achieve a four-times increase in signal-to-interference ratio, benchmarked against the well-known Legendre sequence, of a mismatched filter (MMF) estimator in pulse compression radar systems. These sequences outperform those found by AlphaSeq.
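
As background on the benchmark named in the abstract, the following minimal sketch constructs a Legendre sequence and checks its periodic autocorrelation. It is not HpGAN and not the paper's radar setup, which concerns a mismatched filter rather than periodic correlation. The function names and the convention of setting the first element to +1 are assumptions made for illustration.

```python
def legendre_sequence(p):
    """Return the length-p Legendre sequence for an odd prime p.

    Element i is +1 if i is a nonzero quadratic residue mod p, else -1.
    Element 0 is set to +1 by convention (an assumption of this sketch).
    """
    residues = {(i * i) % p for i in range(1, p)}
    return [1] + [1 if i in residues else -1 for i in range(1, p)]


def periodic_autocorrelation(seq, shift):
    """Sum of seq[i] * seq[i + shift], with indices wrapping around."""
    n = len(seq)
    return sum(seq[i] * seq[(i + shift) % n] for i in range(n))


if __name__ == "__main__":
    seq = legendre_sequence(7)
    print(seq)
    print([periodic_autocorrelation(seq, t) for t in range(7)])
```

For primes congruent to 3 modulo 4, such as 7 or 11, every nonzero shift gives a periodic autocorrelation of -1 while the zero shift gives the sequence length. This flat off-peak behaviour is one reason the Legendre sequence is a common reference point for sequence design.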

* 12 pages, 16 figures 

Sub-clusters of Normal Data for Anomaly Detection

Nov 17, 2020
Gahye Lee, Seungkyu Lee

Anomaly detection in data analysis is an interesting but still challenging research topic in real-world applications. As the dimensionality and complexity of data increase, it becomes necessary to understand the semantic context of the data for effective anomaly characterization. However, existing anomaly detection methods show limited performance on high-dimensional data such as ImageNet. Existing studies have evaluated their performance on low-dimensional, clean, and well-separated datasets such as MNIST and CIFAR-10. In this paper, we study anomaly detection with high-dimensional and complex normal data. Our observation is that, in general, anomaly data is defined by semantically explainable features which can also be used to define semantic sub-clusters of normal data. We hypothesize that if there exists a reasonably good feature space that semantically separates sub-clusters of the given normal data, unseen anomalies can also be well distinguished from the normal data in that space. We propose to perform semantic clustering on the given normal data and train a classifier to learn the discriminative feature space where anomaly detection is finally performed. Based on our careful and extensive experimental evaluations with MNIST, CIFAR-10, and ImageNet with various combinations of normal and anomaly data, we show that our anomaly detection scheme outperforms state-of-the-art methods, especially with high-dimensional real-world images.

A topological approach to exploring convolutional neural networks

Nov 02, 2020
Yang Zhao, Hao Zhang

Motivated by the elusive understanding of convolutional neural networks (CNNs) from a topological point of view, we present two theoretical frameworks that interpret two topics using topological data analysis. The first one reveals the topological essence of CNN filters. Our theory first abstracts a topological representation of how the features are located for a CNN filter, named feature topology, and characterises it by defining the starting edge density (SED). We reveal a principle of CNN filters: they tend to organize the feature topologies of the same category, and we thus propose the SED Distribution to statistically describe such an organization. We demonstrate that the effectiveness of CNN filters is reflected in the compactness of the SED Distribution, and introduce filter entropy to measure it. Remarkably, the variation of filter entropy during training reveals the essence of CNN training: a filter-entropy-decrease process. Also, based on the principle, we give a metric to assess filter performance. The second one investigates inter-class distinguishability in a model-agnostic way. For each class, we propose the MBC Distribution, a distribution that can differentiate categories by characterising the intrinsic organization of the given category. For multiple classes, we introduce the category distance, which measures the distance between two categories, and moreover propose the CD Matrix, which comprehensively evaluates not just the distinguishability between each pair of categories but also the degree of distinguishability of each category. Finally, our experimental results confirm our theories.

* 8 pages, 4 figures, pnas manuscript 

Infant Pose Learning with Small Data

Oct 13, 2020
Xiaofei Huang, Nihang Fu, Sarah Ostadabbas

With the increasing maturity of the human pose estimation domain, its applications have become increasingly broad. Yet, the performance of state-of-the-art pose estimation models degrades significantly in applications that include novel subjects or poses, such as infants with their unique movements. Infant motion analysis is a topic of critical importance in child health and developmental studies. However, models trained on large-scale adult pose datasets are barely successful in estimating infant poses due to significant differences in body ratio and the versatility of poses infants can take compared to adults. Moreover, privacy and security considerations hinder the availability of enough infant images to train a robust pose estimation model from scratch. Here, we propose a fine-tuned domain-adapted infant pose (FiDIP) estimation model that transfers the knowledge of adult poses into estimating infant pose with the supervision of a domain adaptation technique on a mixed real and synthetic infant pose dataset. In developing FiDIP, we also built a synthetic and real infant pose (SyRIP) dataset with diverse and fully annotated real infant images and generated synthetic infant images. We demonstrate that our FiDIP model outperforms other state-of-the-art human pose estimation models for infant pose estimation, with a mean average precision (AP) as high as 92.2.

Taking Modality-free Human Identification as Zero-shot Learning

Oct 02, 2020
Zhizhe Liu, Xingxing Zhang, Zhenfeng Zhu, Shuai Zheng, Yao Zhao, Jian Cheng

Human identification is an important topic in event detection, person tracking, and public security. Numerous methods have been proposed for human identification, such as face identification, person re-identification, and gait identification. Typically, existing methods classify a queried image to a specific identity in an image gallery set (I2I). This is a serious limitation in the many video surveillance applications where only a textual description of the query or an attribute gallery set is available (A2I or I2A). However, very few efforts have been devoted to modality-free identification, i.e., identifying a query in a gallery set in a scalable way. In this work, we make an initial attempt and formulate this novel Modality-Free Human Identification (MFHI) task as a generic, scalable zero-shot learning model. The model is capable of bridging the visual and semantic modalities by learning a discriminative prototype of each identity. In addition, semantics-guided spatial attention is enforced on the visual modality to obtain representations with both high global category-level and local attribute-level discrimination. Finally, we design and conduct an extensive group of experiments on two common challenging identification tasks, face identification and person re-identification, demonstrating that our method outperforms a wide variety of state-of-the-art methods on modality-free human identification.

* 12 pages, 8 figures 
