Abstract:Artificial Intelligence (AI) has made its way into various scientific fields, providing astonishing improvements over existing algorithms for a wide variety of tasks. In recent years, there have been severe concerns over the trustworthiness of AI technologies. The scientific community has focused on the development of trustworthy AI algorithms. However, machine and deep learning algorithms, popular in the AI community today, depend heavily on the data used during their development. These learning algorithms identify patterns in the data, learning the behavioral objective. Any flaws in the data have the potential to translate directly into algorithms. In this study, we discuss the importance of Responsible Machine Learning Datasets and propose a framework to evaluate the datasets through a responsible rubric. While existing work focuses on the post-hoc evaluation of algorithms for their trustworthiness, we provide a framework that considers the data component separately to understand its role in the algorithm. We discuss responsible datasets through the lens of fairness, privacy, and regulatory compliance and provide recommendations for constructing future datasets. After surveying over 100 datasets, we use 60 datasets for analysis and demonstrate that none of these datasets is immune to issues of fairness, privacy preservation, and regulatory compliance. We provide modifications to the ``datasheets for datasets" with important additions for improved dataset documentation. With governments around the world regularizing data protection laws, the method for the creation of datasets in the scientific community requires revision. We believe this study is timely and relevant in today's era of AI.
Abstract:Skin cancer is one of the deadliest diseases and has a high mortality rate if left untreated. The diagnosis generally starts with visual screening and is followed by a biopsy or histopathological examination. Early detection can aid in lowering mortality rates. Visual screening can be limited by the experience of the doctor. Due to the long tail distribution of dermatological datasets and significant intra-variability between classes, automatic classification utilizing computer-aided methods becomes challenging. In this work, we propose a multitask few-shot-based approach for skin lesions that generalizes well with few labelled data to address the small sample space challenge. The proposed approach comprises a fusion of a segmentation network that acts as an attention module and classification network. The output of the segmentation network helps to focus on the most discriminatory features while making a decision by the classification network. To further enhance the classification performance, we have combined segmentation and classification loss in a weighted manner. We have also included the visualization results that explain the decisions made by the algorithm. Three dermatological datasets are used to evaluate the proposed method thoroughly. We also conducted cross-database experiments to ensure that the proposed approach is generalizable across similar datasets. Experimental results demonstrate the efficacy of the proposed work.
Abstract:Audio has become an increasingly crucial biometric modality due to its ability to provide an intuitive way for humans to interact with machines. It is currently being used for a range of applications, including person authentication to banking to virtual assistants. Research has shown that these systems are also susceptible to spoofing and attacks. Therefore, protecting audio processing systems against fraudulent activities, such as identity theft, financial fraud, and spreading misinformation, is of paramount importance. This paper reviews the current state-of-the-art techniques for detecting audio spoofing and discusses the current challenges along with open research problems. The paper further highlights the importance of considering the ethical and privacy implications of audio spoofing detection systems. Lastly, the work aims to accentuate the need for building more robust and generalizable methods, the integration of automatic speaker verification and countermeasure systems, and better evaluation protocols.
Abstract:The presence of bias in deep models leads to unfair outcomes for certain demographic subgroups. Research in bias focuses primarily on facial recognition and attribute prediction with scarce emphasis on face detection. Existing studies consider face detection as binary classification into 'face' and 'non-face' classes. In this work, we investigate possible bias in the domain of face detection through facial region localization which is currently unexplored. Since facial region localization is an essential task for all face recognition pipelines, it is imperative to analyze the presence of such bias in popular deep models. Most existing face detection datasets lack suitable annotation for such analysis. Therefore, we web-curate the Fair Face Localization with Attributes (F2LA) dataset and manually annotate more than 10 attributes per face, including facial localization information. Utilizing the extensive annotations from F2LA, an experimental setup is designed to study the performance of four pre-trained face detectors. We observe (i) a high disparity in detection accuracies across gender and skin-tone, and (ii) interplay of confounding factors beyond demography. The F2LA data and associated annotations can be accessed at http://iab-rubric.org/index.php/F2LA.
Abstract:Deepfake refers to tailored and synthetically generated videos which are now prevalent and spreading on a large scale, threatening the trustworthiness of the information available online. While existing datasets contain different kinds of deepfakes which vary in their generation technique, they do not consider progression of deepfakes in a "phylogenetic" manner. It is possible that an existing deepfake face is swapped with another face. This process of face swapping can be performed multiple times and the resultant deepfake can be evolved to confuse the deepfake detection algorithms. Further, many databases do not provide the employed generative model as target labels. Model attribution helps in enhancing the explainability of the detection results by providing information on the generative model employed. In order to enable the research community to address these questions, this paper proposes DeePhy, a novel Deepfake Phylogeny dataset which consists of 5040 deepfake videos generated using three different generation techniques. There are 840 videos of one-time swapped deepfakes, 2520 videos of two-times swapped deepfakes and 1680 videos of three-times swapped deepfakes. With over 30 GBs in size, the database is prepared in over 1100 hours using 18 GPUs of 1,352 GB cumulative memory. We also present the benchmark on DeePhy dataset using six deepfake detection algorithms. The results highlight the need to evolve the research of model attribution of deepfakes and generalize the process over a variety of deepfake generation techniques. The database is available at: http://iab-rubric.org/deephy-database
Abstract:Deep Learning systems need large data for training. Datasets for training face verification systems are difficult to obtain and prone to privacy issues. Synthetic data generated by generative models such as GANs can be a good alternative. However, we show that data generated from GANs are prone to bias and fairness issues. Specifically, GANs trained on FFHQ dataset show biased behavior towards generating white faces in the age group of 20-29. We also demonstrate that synthetic faces cause disparate impact, specifically for race attribute, when used for fine tuning face verification systems.
Abstract:Existing facial analysis systems have been shown to yield biased results against certain demographic subgroups. Due to its impact on society, it has become imperative to ensure that these systems do not discriminate based on gender, identity, or skin tone of individuals. This has led to research in the identification and mitigation of bias in AI systems. In this paper, we encapsulate bias detection/estimation and mitigation algorithms for facial analysis. Our main contributions include a systematic review of algorithms proposed for understanding bias, along with a taxonomy and extensive overview of existing bias mitigation algorithms. We also discuss open challenges in the field of biased facial analysis.
Abstract:Globally, cataract is a common eye disease and one of the leading causes of blindness and vision impairment. The traditional process of detecting cataracts involves eye examination using a slit-lamp microscope or ophthalmoscope by an ophthalmologist, who checks for clouding of the normally clear lens of the eye. The lack of resources and unavailability of a sufficient number of experts pose a burden to the healthcare system throughout the world, and researchers are exploring the use of AI solutions for assisting the experts. Inspired by the progress in iris recognition, in this research, we present a novel algorithm for cataract detection using near-infrared eye images. The NIR cameras, which are popularly used in iris recognition, are of relatively low cost and easy to operate compared to ophthalmoscope setup for data capture. However, such NIR images have not been explored for cataract detection. We present deep learning-based eye segmentation and multitask network classification networks for cataract detection using NIR images as input. The proposed segmentation algorithm efficiently and effectively detects non-ideal eye boundaries and is cost-effective, and the classification network yields very high classification performance on the cataract dataset.
Abstract:The rapid progress in the ease of creating and spreading ultra-realistic media over social platforms calls for an urgent need to develop a generalizable deepfake detection technique. It has been observed that current deepfake generation methods leave discriminative artifacts in the frequency spectrum of fake images and videos. Inspired by this observation, in this paper, we present a novel approach, termed as MD-CSDNetwork, for combining the features in the spatial and frequency domains to mine a shared discriminative representation for classifying \textit{deepfakes}. MD-CSDNetwork is a novel cross-stitched network with two parallel branches carrying the spatial and frequency information, respectively. We hypothesize that these multi-domain input data streams can be considered as related supervisory signals. The supervision from both branches ensures better performance and generalization. Further, the concept of cross-stitch connections is utilized where they are inserted between the two branches to learn an optimal combination of domain-specific and shared representations from other domains automatically. Extensive experiments are conducted on the popular benchmark dataset namely FaceForeniscs++ for forgery classification. We report improvements over all the manipulation types in FaceForensics++ dataset and comparable results with state-of-the-art methods for cross-database evaluation on the Celeb-DF dataset and the Deepfake Detection Dataset.
Abstract:Identifying and mitigating bias in deep learning algorithms has gained significant popularity in the past few years due to its impact on the society. Researchers argue that models trained on balanced datasets with good representation provide equal and unbiased performance across subgroups. However, \textit{can seemingly unbiased pre-trained model become biased when input data undergoes certain distortions?} For the first time, we attempt to answer this question in the context of face recognition. We provide a systematic analysis to evaluate the performance of four state-of-the-art deep face recognition models in the presence of image distortions across different \textit{gender} and \textit{race} subgroups. We have observed that image distortions have a relationship with the performance gap of the model across different subgroups.