The segmentation of diseases is a popular topic explored by researchers in the field of machine learning. Brain tumors are extremely dangerous and require the utmost precision to segment for a successful surgery. Patients with tumors usually take 4 MRI scans, T1, T1gd, T2, and FLAIR, which are then sent to radiologists to segment and analyze for possible future surgery. To create a second segmentation, it would be beneficial to both radiologists and patients in being more confident in their conclusions. We propose using a method performed by radiologists called image segmentation and applying it to machine learning models to prove a better segmentation. Using Mask R-CNN, its ResNet backbone being pre-trained on the RSNA pneumonia detection challenge dataset, we can train a model on the Brats2020 Brain Tumor dataset. Center for Biomedical Image Computing & Analytics provides MRI data on patients with and without brain tumors and the corresponding segmentations. We can see how well the method of image subtraction works by comparing it to models without image subtraction through DICE coefficient (F1 score), recall, and precision on the untouched test set. Our model performed with a DICE coefficient of 0.75 in comparison to 0.69 without image subtraction. To further emphasize the usefulness of image subtraction, we compare our final model to current state-of-the-art models to segment tumors from MRI scans.
Fifth generation (5G) networks and beyond envisions massive Internet of Things (IoT) rollout to support disruptive applications such as extended reality (XR), augmented/virtual reality (AR/VR), industrial automation, autonomous driving, and smart everything which brings together massive and diverse IoT devices occupying the radio frequency (RF) spectrum. Along with spectrum crunch and throughput challenges, such a massive scale of wireless devices exposes unprecedented threat surfaces. RF fingerprinting is heralded as a candidate technology that can be combined with cryptographic and zero-trust security measures to ensure data privacy, confidentiality, and integrity in wireless networks. Motivated by the relevance of this subject in the future communication networks, in this work, we present a comprehensive survey of RF fingerprinting approaches ranging from a traditional view to the most recent deep learning (DL) based algorithms. Existing surveys have mostly focused on a constrained presentation of the wireless fingerprinting approaches, however, many aspects remain untold. In this work, however, we mitigate this by addressing every aspect - background on signal intelligence (SIGINT), applications, relevant DL algorithms, systematic literature review of RF fingerprinting techniques spanning the past two decades, discussion on datasets, and potential research avenues - necessary to elucidate this topic to the reader in an encyclopedic manner.
Recent advances in machine learning have created increasing interest in solving visual computing problems using a class of coordinate-based neural networks that parametrize physical properties of scenes or objects across space and time. These methods, which we call neural fields, have seen successful application in the synthesis of 3D shapes and image, animation of human bodies, 3D reconstruction, and pose estimation. However, due to rapid progress in a short time, many papers exist but a comprehensive review and formulation of the problem has not yet emerged. In this report, we address this limitation by providing context, mathematical grounding, and an extensive review of literature on neural fields. This report covers research along two dimensions. In Part I, we focus on techniques in neural fields by identifying common components of neural field methods, including different representations, architectures, forward mapping, and generalization methods. In Part II, we focus on applications of neural fields to different problems in visual computing, and beyond (e.g., robotics, audio). Our review shows the breadth of topics already covered in visual computing, both historically and in current incarnations, demonstrating the improved quality, flexibility, and capability brought by neural fields methods. Finally, we present a companion website that contributes a living version of this review that can be continually updated by the community.
Deep neural network-based image classification can be misled by adversarial examples with small and quasi-imperceptible perturbations. Furthermore, the adversarial examples created on one classification model can also fool another different model. The transferability of the adversarial examples has recently attracted a growing interest since it makes black-box attacks on classification models feasible. As an extension of classification, semantic segmentation has also received much attention towards its adversarial robustness. However, the transferability of adversarial examples on segmentation models has not been systematically studied. In this work, we intensively study this topic. First, we explore the overfitting phenomenon of adversarial examples on classification and segmentation models. In contrast to the observation made on classification models that the transferability is limited by overfitting to the source model, we find that the adversarial examples on segmentations do not always overfit the source models. Even when no overfitting is presented, the transferability of adversarial examples is limited. We attribute the limitation to the architectural traits of segmentation models, i.e., multi-scale object recognition. Then, we propose a simple and effective method, dubbed dynamic scaling, to overcome the limitation. The high transferability achieved by our method shows that, in contrast to the observations in previous work, adversarial examples on a segmentation model can be easy to transfer to other segmentation models. Our analysis and proposals are supported by extensive experiments.
Decoding brain cognitive states from neuroimaging signals is an important topic in neuroscience. In recent years, deep neural networks (DNNs) have been recruited for multiple brain state decoding and achieved good performance. However, the open question of how to interpret the DNN black box remains unanswered. Capitalizing on advances in machine learning, we integrated attention modules into brain decoders to facilitate an in-depth interpretation of DNN channels. A 4D convolution operation was also included to extract temporo-spatial interaction within the fMRI signal. The experiments showed that the proposed model obtains a very high accuracy (97.4%) and outperforms previous researches on the 7 different task benchmarks from the Human Connectome Project (HCP) dataset. The visualization analysis further illustrated the hierarchical emergence of task-specific masks with depth. Finally, the model was retrained to regress individual traits within the HCP and to classify viewing images from the BOLD5000 dataset, respectively. Transfer learning also achieves good performance. A further visualization analysis shows that, after transfer learning, low-level attention masks remained similar to the source domain, whereas high-level attention masks changed adaptively. In conclusion, the proposed 4D model with attention module performed well and facilitated interpretation of DNNs, which is helpful for subsequent research.
Remote sensing provides valuable information about objects or areas from a distance in either active (e.g., RADAR and LiDAR) or passive (e.g., multispectral and hyperspectral) modes. The quality of data acquired by remotely sensed imaging sensors (both active and passive) is often degraded by a variety of noise types and artifacts. Image restoration, which is a vibrant field of research in the remote sensing community, is the task of recovering the true unknown image from the degraded observed image. Each imaging sensor induces unique noise types and artifacts into the observed image. This fact has led to the expansion of restoration techniques in different paths according to each sensor type. This review paper brings together the advances of image restoration techniques with particular focuses on synthetic aperture radar and hyperspectral images as the most active sub-fields of image restoration in the remote sensing community. We, therefore, provide a comprehensive, discipline-specific starting point for researchers at different levels (i.e., students, researchers, and senior researchers) willing to investigate the vibrant topic of data restoration by supplying sufficient detail and references. Additionally, this review paper accompanies a toolbox to provide a platform to encourage interested students and researchers in the field to further explore the restoration techniques and fast-forward the community. The toolboxes are provided in https://github.com/ImageRestorationToolbox.
In label-noise learning, estimating the transition matrix is a hot topic as the matrix plays an important role in building statistically consistent classifiers. Traditionally, the transition from clean distribution to noisy distribution (i.e., clean label transition matrix) has been widely exploited to learn a clean label classifier by employing the noisy data. Motivated by that classifiers mostly output Bayes optimal labels for prediction, in this paper, we study to directly model the transition from Bayes optimal distribution to noisy distribution (i.e., Bayes label transition matrix) and learn a Bayes optimal label classifier. Note that given only noisy data, it is ill-posed to estimate either the clean label transition matrix or the Bayes label transition matrix. But favorably, Bayes optimal labels are less uncertain compared with the clean labels, i.e., the class posteriors of Bayes optimal labels are one-hot vectors while those of clean labels are not. This enables two advantages to estimate the Bayes label transition matrix, i.e., (a) we could theoretically recover a set of Bayes optimal labels under mild conditions; (b) the feasible solution space is much smaller. By exploiting the advantages, we estimate the Bayes label transition matrix by employing a deep neural network in a parameterized way, leading to better generalization and superior classification performance.
With the explosive growth of video data, video summarization, which attempts to seek the minimum subset of frames while still conveying the main story, has become one of the hottest topics. Nowadays, substantial achievements have been made by supervised learning techniques, especially after the emergence of deep learning. However, it is extremely expensive and difficult to collect human annotation for large-scale video datasets. To address this problem, we propose a convolutional attentive adversarial network (CAAN), whose key idea is to build a deep summarizer in an unsupervised way. Upon the generative adversarial network, our overall framework consists of a generator and a discriminator. The former predicts importance scores for all frames of a video while the latter tries to distinguish the score-weighted frame features from original frame features. Specifically, the generator employs a fully convolutional sequence network to extract global representation of a video, and an attention-based network to output normalized importance scores. To learn the parameters, our objective function is composed of three loss functions, which can guide the frame-level importance score prediction collaboratively. To validate this proposed method, we have conducted extensive experiments on two public benchmarks SumMe and TVSum. The results show the superiority of our proposed method against other state-of-the-art unsupervised approaches. Our method even outperforms some published supervised approaches.