Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

"Time": models, code, and papers

Tinto: Multisensor Benchmark for 3D Hyperspectral Point Cloud Segmentation in the Geosciences

May 17, 2023
Ahmed J. Afifi, Samuel T. Thiele, Sandra Lorenz, Pedram Ghamisi, Raimon Tolosana-Delgado, Moritz Kirsch, Richard Gloaguen, Michael Heizmann

Figure 1 for Tinto: Multisensor Benchmark for 3D Hyperspectral Point Cloud Segmentation in the Geosciences

Figure 2 for Tinto: Multisensor Benchmark for 3D Hyperspectral Point Cloud Segmentation in the Geosciences

Figure 3 for Tinto: Multisensor Benchmark for 3D Hyperspectral Point Cloud Segmentation in the Geosciences

Figure 4 for Tinto: Multisensor Benchmark for 3D Hyperspectral Point Cloud Segmentation in the Geosciences

The increasing use of deep learning techniques has reduced interpretation time and, ideally, reduced interpreter bias by automatically deriving geological maps from digital outcrop models. However, accurate validation of these automated mapping approaches is a significant challenge due to the subjective nature of geological mapping and the difficulty in collecting quantitative validation data. Additionally, many state-of-the-art deep learning methods are limited to 2D image data, which is insufficient for 3D digital outcrops, such as hyperclouds. To address these challenges, we present Tinto, a multi-sensor benchmark digital outcrop dataset designed to facilitate the development and validation of deep learning approaches for geological mapping, especially for non-structured 3D data like point clouds. Tinto comprises two complementary sets: 1) a real digital outcrop model from Corta Atalaya (Spain), with spectral attributes and ground-truth data, and 2) a synthetic twin that uses latent features in the original datasets to reconstruct realistic spectral data (including sensor noise and processing artifacts) from the ground-truth. The point cloud is dense and contains 3,242,964 labeled points. We used these datasets to explore the abilities of different deep learning approaches for automated geological mapping. By making Tinto publicly available, we hope to foster the development and adaptation of new deep learning tools for 3D applications in Earth sciences. The dataset can be accessed through this link: https://doi.org/10.14278/rodare.2256.

Via

Access Paper or Ask Questions

A survey of modularized backstepping control design approaches to nonlinear ODE systems

May 03, 2023
Zhengru Ren

Figure 1 for A survey of modularized backstepping control design approaches to nonlinear ODE systems

Figure 2 for A survey of modularized backstepping control design approaches to nonlinear ODE systems

Figure 3 for A survey of modularized backstepping control design approaches to nonlinear ODE systems

Figure 4 for A survey of modularized backstepping control design approaches to nonlinear ODE systems

Backstepping is a mature and powerful Lyapunov-based design approach for a specific set of systems. Throughout the development over three decades, innovative theories and practices have extended backstepping to stabilization and tracking problems for nonlinear systems with growing complexity. The attractions of the backstepping-like approach are the recursive design processes and modularized design. A nonlinear system can be transferred into a group of simple problems and solved it by a sequential superposition of the corresponding approaches for each problem. To handle the complexities, backstepping designs always come up with adaptive control and robust control. The survey aims to review the milestone theoretical achievements among thousands of publications making the state-feedback backstepping designs of complex ODE systems to be systematic and modularized. Several selected elegant methods are reviewed, starting from the general designs, and then the finite-time control enhancing the convergence rate, the fuzzy logic system and neural network estimating the system unknowns, the Nussbaum function handling unknown control coefficients, barrier Lyapunov function solving state constraints, and the hyperbolic tangent function applying in robust designs. The associated assumptions and Lyapunov function candidates, inequalities, and the deduction key points are reviewed. The nonlinearity and complexities lay in state constraints, disturbance, input nonlinearities, time-delay effects, pure feedback systems, event-triggered systems, and stochastic systems. Instead of networked systems, the survey focuses on stand-alone systems.

* 31 pages and 7 figures. The majority of the present survey was written in the final phase of my PhD study in 2019 and was slightly revised in 2020.Since I am too busy to update it by including the most recent research after that, I hope to share this work, and may it helps every beginner

Via

Access Paper or Ask Questions

Uncovering the Representation of Spiking Neural Networks Trained with Surrogate Gradient

Apr 25, 2023
Yuhang Li, Youngeun Kim, Hyoungseob Park, Priyadarshini Panda

Figure 1 for Uncovering the Representation of Spiking Neural Networks Trained with Surrogate Gradient

Figure 2 for Uncovering the Representation of Spiking Neural Networks Trained with Surrogate Gradient

Figure 3 for Uncovering the Representation of Spiking Neural Networks Trained with Surrogate Gradient

Figure 4 for Uncovering the Representation of Spiking Neural Networks Trained with Surrogate Gradient

Spiking Neural Networks (SNNs) are recognized as the candidate for the next-generation neural networks due to their bio-plausibility and energy efficiency. Recently, researchers have demonstrated that SNNs are able to achieve nearly state-of-the-art performance in image recognition tasks using surrogate gradient training. However, some essential questions exist pertaining to SNNs that are little studied: Do SNNs trained with surrogate gradient learn different representations from traditional Artificial Neural Networks (ANNs)? Does the time dimension in SNNs provide unique representation power? In this paper, we aim to answer these questions by conducting a representation similarity analysis between SNNs and ANNs using Centered Kernel Alignment (CKA). We start by analyzing the spatial dimension of the networks, including both the width and the depth. Furthermore, our analysis of residual connections shows that SNNs learn a periodic pattern, which rectifies the representations in SNNs to be ANN-like. We additionally investigate the effect of the time dimension on SNN representation, finding that deeper layers encourage more dynamics along the time dimension. We also investigate the impact of input data such as event-stream data and adversarial attacks. Our work uncovers a host of new findings of representations in SNNs. We hope this work will inspire future research to fully comprehend the representation power of SNNs. Code is released at https://github.com/Intelligent-Computing-Lab-Yale/SNNCKA.

* Published in Transactions on Machine Learning Research (TMLR)

Via

Access Paper or Ask Questions

SneakyPrompt: Evaluating Robustness of Text-to-image Generative Models' Safety Filters

May 20, 2023
Yuchen Yang, Bo Hui, Haolin Yuan, Neil Gong, Yinzhi Cao

Figure 1 for SneakyPrompt: Evaluating Robustness of Text-to-image Generative Models' Safety Filters

Figure 2 for SneakyPrompt: Evaluating Robustness of Text-to-image Generative Models' Safety Filters

Figure 3 for SneakyPrompt: Evaluating Robustness of Text-to-image Generative Models' Safety Filters

Figure 4 for SneakyPrompt: Evaluating Robustness of Text-to-image Generative Models' Safety Filters

Text-to-image generative models such as Stable Diffusion and DALL$\cdot$E 2 have attracted much attention since their publication due to their wide application in the real world. One challenging problem of text-to-image generative models is the generation of Not-Safe-for-Work (NSFW) content, e.g., those related to violence and adult. Therefore, a common practice is to deploy a so-called safety filter, which blocks NSFW content based on either text or image features. Prior works have studied the possible bypass of such safety filters. However, existing works are largely manual and specific to Stable Diffusion's official safety filter. Moreover, the bypass ratio of Stable Diffusion's safety filter is as low as 23.51% based on our evaluation. In this paper, we propose the first automated attack framework, called SneakyPrompt, to evaluate the robustness of real-world safety filters in state-of-the-art text-to-image generative models. Our key insight is to search for alternative tokens in a prompt that generates NSFW images so that the generated prompt (called an adversarial prompt) bypasses existing safety filters. Specifically, SneakyPrompt utilizes reinforcement learning (RL) to guide an agent with positive rewards on semantic similarity and bypass success. Our evaluation shows that SneakyPrompt successfully generated NSFW content using an online model DALL$\cdot$E 2 with its default, closed-box safety filter enabled. At the same time, we also deploy several open-source state-of-the-art safety filters on a Stable Diffusion model and show that SneakyPrompt not only successfully generates NSFW content, but also outperforms existing adversarial attacks in terms of the number of queries and image qualities.

Via

Access Paper or Ask Questions

Human labeling errors and their impact on ConvNets for satellite image scene classification

May 20, 2023
Longkang Peng, Tao Wei, Xuehong Chen, Xiaobei Chen, Rui Sun, Luoma Wan, Xiaolin Zhu

Figure 1 for Human labeling errors and their impact on ConvNets for satellite image scene classification

Figure 2 for Human labeling errors and their impact on ConvNets for satellite image scene classification

Figure 3 for Human labeling errors and their impact on ConvNets for satellite image scene classification

Figure 4 for Human labeling errors and their impact on ConvNets for satellite image scene classification

Convolutional neural networks (ConvNets) have been successfully applied to satellite image scene classification. Human-labeled training datasets are essential for ConvNets to perform accurate classification. Errors in human-labeled training datasets are unavoidable due to the complexity of satellite images. However, the distribution of human labeling errors on satellite images and their impact on ConvNets have not been investigated. To fill this research gap, this study, for the first time, collected real-world labels from 32 participants and explored how their errors affect three ConvNets (VGG16, GoogleNet and ResNet-50) for high-resolution satellite image scene classification. We found that: (1) human labeling errors have significant class and instance dependence, which is fundamentally different from the simulation noise in previous studies; (2) regarding the overall accuracy of all classes, when human labeling errors in training data increase by one unit, the overall accuracy of ConvNets classification decreases by approximately half a unit; (3) regarding the accuracy of each class, the impact of human labeling errors on ConvNets shows large heterogeneity across classes. To uncover the mechanism underlying the impact of human labeling errors on ConvNets, we further compared it with two types of simulated labeling noise: uniform noise (errors independent of both classes and instances) and class-dependent noise (errors independent of instances but not classes). Our results show that the impact of human labeling errors on ConvNets is similar to that of the simulated class-dependent noise but not to that of the simulated uniform noise, suggesting that the impact of human labeling errors on ConvNets is mainly due to class-dependent errors rather than instance-dependent errors.

Via

Access Paper or Ask Questions

In Defense of Pure 16-bit Floating-Point Neural Networks

May 18, 2023
Juyoung Yun, Byungkon Kang, Francois Rameau, Zhoulai Fu

Figure 1 for In Defense of Pure 16-bit Floating-Point Neural Networks

Figure 2 for In Defense of Pure 16-bit Floating-Point Neural Networks

Figure 3 for In Defense of Pure 16-bit Floating-Point Neural Networks

Figure 4 for In Defense of Pure 16-bit Floating-Point Neural Networks

Reducing the number of bits needed to encode the weights and activations of neural networks is highly desirable as it speeds up their training and inference time while reducing memory consumption. For these reasons, research in this area has attracted significant attention toward developing neural networks that leverage lower-precision computing, such as mixed-precision training. Interestingly, none of the existing approaches has investigated pure 16-bit floating-point settings. In this paper, we shed light on the overlooked efficiency of pure 16-bit floating-point neural networks. As such, we provide a comprehensive theoretical analysis to investigate the factors contributing to the differences observed between 16-bit and 32-bit models. We formalize the concepts of floating-point error and tolerance, enabling us to quantitatively explain the conditions under which a 16-bit model can closely approximate the results of its 32-bit counterpart. This theoretical exploration offers perspective that is distinct from the literature which attributes the success of low-precision neural networks to its regularization effect. This in-depth analysis is supported by an extensive series of experiments. Our findings demonstrate that pure 16-bit floating-point neural networks can achieve similar or even better performance than their mixed-precision and 32-bit counterparts. We believe the results presented in this paper will have significant implications for machine learning practitioners, offering an opportunity to reconsider using pure 16-bit networks in various applications.

Via

Access Paper or Ask Questions

Accelerated MR Fingerprinting with Low-Rank and Generative Subspace Modeling

May 18, 2023
Hengfa Lu, Huihui Ye, Lawrence L. Wald, Bo Zhao

Figure 1 for Accelerated MR Fingerprinting with Low-Rank and Generative Subspace Modeling

Figure 2 for Accelerated MR Fingerprinting with Low-Rank and Generative Subspace Modeling

Figure 3 for Accelerated MR Fingerprinting with Low-Rank and Generative Subspace Modeling

Figure 4 for Accelerated MR Fingerprinting with Low-Rank and Generative Subspace Modeling

Magnetic Resonance (MR) Fingerprinting is an emerging multi-parametric quantitative MR imaging technique, for which image reconstruction methods utilizing low-rank and subspace constraints have achieved state-of-the-art performance. However, this class of methods often suffers from an ill-conditioned model-fitting issue, which often degrades the performance as the data acquisition lengths become short and/or the signal-to-noise ratio becomes low. To address this problem, we present a new image reconstruction method for MR Fingerprinting, integrating low-rank and subspace modeling with deep generative priors. Specifically, the proposed method captures the strong spatiotemporal correlation of contrast-weighted time-series images in MR Fingerprinting via a low-rank factorization. Further, it utilizes an untrained convolutional generative neural network to represent the spatial subspace of the low-rank model, while estimating the temporal subspace of the model from simulated magnetization evolutions generated based on spin physics. Here the architecture of the neural network serves as an effective regularizer for the ill-conditioned inverse problem without additional spatial training data that are often expensive to acquire. The proposed formulation results in a non-convex optimization problem, for which we develop an algorithm based on variable splitting and alternating direction method of multipliers. We evaluate the performance of the proposed method with simulations and in vivo experiments and demonstrate that the proposed method outperforms the state-of-the-art low-rank and subspace reconstruction method.

Via

Access Paper or Ask Questions

GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework

May 18, 2023
Ang Lv, Xu Tan, Peiling Lu, Wei Ye, Shikun Zhang, Jiang Bian, Rui Yan

Figure 1 for GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework

Figure 2 for GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework

Figure 3 for GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework

Figure 4 for GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework

Symbolic music generation aims to create musical notes, which can help users compose music, such as generating target instrumental tracks from scratch, or based on user-provided source tracks. Considering the diverse and flexible combination between source and target tracks, a unified model capable of generating any arbitrary tracks is of crucial necessity. Previous works fail to address this need due to inherent constraints in music representations and model architectures. To address this need, we propose a unified representation and diffusion framework named GETMusic (`GET' stands for GEnerate music Tracks), which includes a novel music representation named GETScore, and a diffusion model named GETDiff. GETScore represents notes as tokens and organizes them in a 2D structure, with tracks stacked vertically and progressing horizontally over time. During training, tracks are randomly selected as either the target or source. In the forward process, target tracks are corrupted by masking their tokens, while source tracks remain as ground truth. In the denoising process, GETDiff learns to predict the masked target tokens, conditioning on the source tracks. With separate tracks in GETScore and the non-autoregressive behavior of the model, GETMusic can explicitly control the generation of any target tracks from scratch or conditioning on source tracks. We conduct experiments on music generation involving six instrumental tracks, resulting in a total of 665 combinations. GETMusic provides high-quality results across diverse combinations and surpasses prior works proposed for some specific combinations.

* 15 pages, 4 figures

Via

Access Paper or Ask Questions

Skin Lesion Diagnosis Using Convolutional Neural Networks

May 18, 2023
Daniel Alonso Villanueva Nunez, Yongmin Li

Figure 1 for Skin Lesion Diagnosis Using Convolutional Neural Networks

Figure 2 for Skin Lesion Diagnosis Using Convolutional Neural Networks

Figure 3 for Skin Lesion Diagnosis Using Convolutional Neural Networks

Cancerous skin lesions are one of the most common malignancies detected in humans, and if not detected at an early stage, they can lead to death. Therefore, it is crucial to have access to accurate results early on to optimize the chances of survival. Unfortunately, accurate results are typically obtained by highly trained dermatologists, who may not be accessible to many people, particularly in low-income and middle-income countries. Artificial Intelligence (AI) appears to be a potential solution to this problem, as it has proven to provide equal or even better diagnoses than healthcare professionals. This project aims to address the issue by collecting state-of-the-art techniques for image classification from various fields and implementing them. Some of these techniques include mixup, presizing, and test-time augmentation, among others. Three architectures were used for the implementation: DenseNet121, VGG16 with batch normalization, and ResNet50. The models were designed with two main purposes. First, to classify images into seven categories, including melanocytic nevus, melanoma, benign keratosis-like lesions, basal cell carcinoma, actinic keratoses and intraepithelial carcinoma, vascular lesions, and dermatofibroma. Second, to classify images into benign or malignant. The models were trained using a dataset of 8012 images, and their performance was evaluated using 2003 images. It's worth noting that this model is trained end-to-end, directly from the image to the labels, without the need for handcrafted feature extraction.

Via

Access Paper or Ask Questions

LATTE: Label-efficient Incident Phenotyping from Longitudinal Electronic Health Records

May 19, 2023
Jun Wen, Jue Hou, Clara-Lea Bonzel, Yihan Zhao, Victor M. Castro, Vivian S. Gainer, Dana Weisenfeld, Tianrun Cai, Yuk-Lam Ho, Vidul A. Panickan, Lauren Costa, Chuan Hong, J. Michael Gaziano, Katherine P. Liao, Junwei Lu, Kelly Cho, Tianxi Cai

Figure 1 for LATTE: Label-efficient Incident Phenotyping from Longitudinal Electronic Health Records

Figure 2 for LATTE: Label-efficient Incident Phenotyping from Longitudinal Electronic Health Records

Figure 3 for LATTE: Label-efficient Incident Phenotyping from Longitudinal Electronic Health Records

Figure 4 for LATTE: Label-efficient Incident Phenotyping from Longitudinal Electronic Health Records

Electronic health record (EHR) data are increasingly used to support real-world evidence (RWE) studies. Yet its ability to generate reliable RWE is limited by the lack of readily available precise information on the timing of clinical events such as the onset time of heart failure. We propose a LAbel-efficienT incidenT phEnotyping (LATTE) algorithm to accurately annotate the timing of clinical events from longitudinal EHR data. By leveraging the pre-trained semantic embedding vectors from large-scale EHR data as prior knowledge, LATTE selects predictive EHR features in a concept re-weighting module by mining their relationship to the target event and compresses their information into longitudinal visit embeddings through a visit attention learning network. LATTE employs a recurrent neural network to capture the sequential dependency between the target event and visit embeddings before/after it. To improve label efficiency, LATTE constructs highly informative longitudinal silver-standard labels from large-scale unlabeled patients to perform unsupervised pre-training and semi-supervised joint training. Finally, LATTE enhances cross-site portability via contrastive representation learning. LATTE is evaluated on three analyses: the onset of type-2 diabetes, heart failure, and the onset and relapses of multiple sclerosis. We use various evaluation metrics present in the literature including the $ABC_{gain}$, the proportion of reduction in the area between the observed event indicator and the predicted cumulative incidences in reference to the prediction per incident prevalence. LATTE consistently achieves substantial improvement over benchmark methods such as SAMGEP and RETAIN in all settings.

* ERHs data

Via

Access Paper or Ask Questions