"Image": models, code, and papers

Paired Competing Neurons Improving STDP Supervised Local Learning In Spiking Neural Networks

Aug 04, 2023
Gaspard Goupy, Pierre Tirilly, Ioan Marius Bilasco

Direct training of Spiking Neural Networks (SNNs) on neuromorphic hardware has the potential to significantly reduce the high energy consumption of Artificial Neural Network (ANN) training on modern computers. The biological plausibility of SNNs allows them to benefit from bio-inspired plasticity rules, such as Spike Timing-Dependent Plasticity (STDP). STDP offers gradient-free and unsupervised local learning, which can be easily implemented on neuromorphic hardware. However, relying solely on unsupervised STDP to perform classification tasks is not enough. In this paper, we propose Stabilized Supervised STDP (S2-STDP), a supervised STDP learning rule to train the classification layer of an SNN equipped with unsupervised STDP. S2-STDP integrates error-modulated weight updates that align neuron spikes with desired timestamps derived from the average firing time within the layer. Then, we introduce a training architecture called Paired Competing Neurons (PCN) to further enhance the learning capabilities of our classification layer trained with S2-STDP. PCN associates each class with paired neurons and encourages neuron specialization through intra-class competition. We evaluated our proposed methods on image recognition datasets, including MNIST, Fashion-MNIST, and CIFAR-10. Results showed that our methods outperform the current supervised STDP-based state of the art, for comparable architectures and numbers of neurons. Also, the use of PCN enhances the performance of S2-STDP, regardless of the configuration, and without introducing any hyperparameters. Further analysis demonstrated that our methods exhibited improved hyperparameter robustness, which reduces the need for tuning.

Damage Vision Mining Opportunity for Imbalanced Anomaly Detection

Aug 04, 2023
Takato Yasuno

Over the past decade, balanced datasets have been used to advance algorithms for classification, object detection, semantic segmentation, and anomaly detection in industrial applications. Specifically, for condition-based maintenance, automating visual inspection is crucial to ensure high quality. Deterioration prognostics attempts to optimize the fine decision process for predictive maintenance and proactive repair. In civil infrastructure and living environments, damage data mining cannot avoid the imbalanced data issue, because damage events are rare and unseen and because improved operations keep the overall status at high quality. For visual inspection, the deteriorated classes acquired from the surfaces of concrete and steel components are occasionally imbalanced. From numerous related surveys, we summarize that imbalanced data problems can be categorized into four types: 1) missing ranges of target and label variables, 2) majority-minority class imbalance, 3) foreground-background spatial imbalance, and 4) long-tailed pixel-wise class imbalance. Since 2015, there have been many studies of imbalanced data using deep learning approaches, including regression, image classification, object detection, and semantic segmentation. However, anomaly detection for imbalanced data is not yet well known. In this study, we highlight a one-class anomaly detection application that decides whether an input belongs to the anomalous class or not, and demonstrate clear examples on imbalanced vision datasets: blood smear, lung infection, wooden, concrete deterioration, and disaster damage. We provide key results on the advantage of damage vision mining, hypothesizing that the more effective the range of the positive ratio, the higher the accuracy gain of the anomaly detection application. Finally, the applicability of the damage learning methods, limitations, and future work are discussed.

* 15 pages, 20 figures, 12 tables

SDC-UDA: Volumetric Unsupervised Domain Adaptation Framework for Slice-Direction Continuous Cross-Modality Medical Image Segmentation

May 18, 2023
Hyungseob Shin, Hyeongyu Kim, Sewon Kim, Yohan Jun, Taejoon Eo, Dosik Hwang

Recent advances in deep learning-based medical image segmentation studies achieve nearly human-level performance in a fully supervised manner. However, acquiring pixel-level expert annotations is extremely expensive and laborious in medical imaging fields. Unsupervised domain adaptation (UDA) can alleviate this problem by making it possible to use annotated data in one imaging modality to train a network that can successfully perform segmentation on a target imaging modality with no labels. In this work, we propose SDC-UDA, a simple yet effective volumetric UDA framework for slice-direction continuous cross-modality medical image segmentation which combines intra- and inter-slice self-attentive image translation, uncertainty-constrained pseudo-label refinement, and volumetric self-training. Our method is distinguished from previous methods on UDA for medical image segmentation in that it can obtain continuous segmentation in the slice direction, thereby ensuring higher accuracy and potential in clinical practice. We validate SDC-UDA with multiple publicly available cross-modality medical image segmentation datasets and achieve state-of-the-art segmentation performance, not to mention the superior slice-direction continuity of prediction compared to previous studies.

* 10 pages, 7 figures, CVPR 2023

C-DARL: Contrastive diffusion adversarial representation learning for label-free blood vessel segmentation

Jul 31, 2023
Boah Kim, Yujin Oh, Bradford J. Wood, Ronald M. Summers, Jong Chul Ye

Blood vessel segmentation in medical imaging is one of the essential steps for vascular disease diagnosis and interventional planning in a broad spectrum of clinical scenarios in image-based medicine and interventional medicine. Unfortunately, manual annotation of the vessel masks is challenging and resource-intensive due to subtle branches and complex structures. To overcome this issue, this paper presents a self-supervised vessel segmentation method, dubbed the contrastive diffusion adversarial representation learning (C-DARL) model. Our model is composed of a diffusion module and a generation module that learns the distribution of multi-domain blood vessel data by generating synthetic vessel images from diffusion latent. Moreover, we employ contrastive learning through a mask-based contrastive loss so that the model can learn more realistic vessel representations. To validate the efficacy, C-DARL is trained using various vessel datasets, including coronary angiograms, abdominal digital subtraction angiograms, and retinal imaging. Experimental results confirm that our model achieves performance improvement over baseline methods with noise robustness, suggesting the effectiveness of C-DARL for vessel segmentation.

Gender Biases in Automatic Evaluation Metrics: A Case Study on Image Captioning

May 24, 2023
Haoyi Qiu, Zi-Yi Dou, Tianlu Wang, Asli Celikyilmaz, Nanyun Peng

Pretrained model-based evaluation metrics have demonstrated strong performance with high correlations with human judgments in various natural language generation tasks such as image captioning. Despite the impressive results, their impact on fairness is under-explored -- it is widely acknowledged that pretrained models can encode societal biases, and utilizing them for evaluation purposes may inadvertently manifest and potentially amplify biases. In this paper, we conduct a systematic study of gender biases in model-based evaluation metrics, with a focus on image captioning tasks. Specifically, we first identify and quantify gender biases in different evaluation metrics regarding profession, activity, and object concepts. Then, we demonstrate the negative consequences of using these biased metrics, such as favoring biased generation models in deployment and propagating the biases to generation models through reinforcement learning. We also present a simple but effective alternative to reduce gender biases by combining n-gram matching-based and pretrained model-based evaluation metrics.
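
As a rough illustration of the last sentence, the sketch below blends an n-gram matching score with a pretrained model-based score. The abstract does not say how the two metrics are combined, so the linear interpolation, the weight `alpha`, the clipped unigram precision, and the `model_score` argument (a stand-in for whatever a pretrained metric would return) are all assumptions made for this example, not the authors' method.

```python
from collections import Counter


def unigram_precision(candidate: str, reference: str) -> float:
    """Clipped unigram precision: fraction of candidate words found in the reference."""
    cand_tokens = candidate.lower().split()
    if not cand_tokens:
        return 0.0
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(cand_tokens)
    # Each candidate word is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref_counts[word]) for word, count in cand_counts.items())
    return matched / len(cand_tokens)


def combined_score(candidate: str, reference: str,
                   model_score: float, alpha: float = 0.5) -> float:
    """Hypothetical blend: alpha * n-gram score + (1 - alpha) * model-based score."""
    ngram = unigram_precision(candidate, reference)
    return alpha * ngram + (1.0 - alpha) * model_score


# model_score=0.8 is a made-up value standing in for a pretrained metric's output.
score = combined_score("a nurse reads a chart",
                       "a nurse is reading a chart",
                       model_score=0.8)
print(round(score, 3))
```

The n-gram term only depends on surface overlap with the reference, so in a blend like this it can dampen any preference the pretrained metric has that is not supported by the reference text.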

Learning Dynamic Query Combinations for Transformer-based Object Detection and Segmentation

Jul 27, 2023
Yiming Cui, Linjie Yang, Haichao Yu

Transformer-based detection and segmentation methods use a list of learned detection queries to retrieve information from the transformer network and learn to predict the location and category of one specific object from each query. We empirically find that random convex combinations of the learned queries are still good for the corresponding models. We then propose to learn a convex combination with dynamic coefficients based on the high-level semantics of the image. The generated dynamic queries, named modulated queries, better capture the prior of object locations and categories in the different images. Equipped with our modulated queries, a wide range of DETR-based models achieve consistent and superior performance across multiple tasks including object detection, instance segmentation, panoptic segmentation, and video instance segmentation.

* 12 pages, 4 figures, ICML 2023, code is available at https://github.com/bytedance/DQ-Det
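
The idea of building queries as convex combinations of learned queries can be sketched in a few lines. This is a minimal illustration rather than the code from the linked repository: the function names, the use of a softmax to obtain non-negative weights that sum to one, and the `coeff_logits` input (standing in for coefficients that a small network would predict from image features) are assumptions made for this example.

```python
import math


def softmax(logits):
    """Turn arbitrary scores into non-negative weights that sum to one."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def modulated_queries(basic_queries, coeff_logits):
    """Build each output query as a convex combination of the basic queries.

    basic_queries: list of n learned query vectors, each of length d.
    coeff_logits:  list of m rows with n scores each (hypothetical image-dependent input).
    """
    dim = len(basic_queries[0])
    combined = []
    for row in coeff_logits:
        weights = softmax(row)
        query = [sum(w * q[j] for w, q in zip(weights, basic_queries))
                 for j in range(dim)]
        combined.append(query)
    return combined


basic = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
logits = [[0.0, 0.0, 0.0],     # uniform weights: the mean of the basic queries
          [5.0, -5.0, -5.0]]   # weights concentrated on the first basic query
mq = modulated_queries(basic, logits)
print(mq)
```

Because the weights are non-negative and sum to one, every modulated query stays inside the convex hull of the learned queries, which is the property the abstract's empirical observation relies on.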

DETR Doesn't Need Multi-Scale or Locality Design

Aug 03, 2023
Yutong Lin, Yuhui Yuan, Zheng Zhang, Chen Li, Nanning Zheng, Han Hu

This paper presents an improved DETR detector that maintains a "plain" nature: using a single-scale feature map and global cross-attention calculations without specific locality constraints, in contrast to previous leading DETR-based detectors that reintroduce architectural inductive biases of multi-scale and locality into the decoder. We show that two simple technologies are surprisingly effective within a plain design to compensate for the lack of multi-scale feature maps and locality constraints. The first is a box-to-pixel relative position bias (BoxRPB) term added to the cross-attention formulation, which guides each query to attend to the corresponding object region while also providing encoding flexibility. The second is masked image modeling (MIM)-based backbone pre-training, which helps learn representations with fine-grained localization ability and proves crucial for remedying dependencies on the multi-scale feature maps. By incorporating these technologies and recent advancements in training and problem formation, the improved "plain" DETR showed exceptional improvements over the original DETR detector. By leveraging the Objects365 dataset for pre-training, it achieved 63.9 mAP accuracy using a Swin-L backbone, which is highly competitive with state-of-the-art detectors that all rely heavily on multi-scale feature maps and region-based feature extraction. Code is available at https://github.com/impiga/Plain-DETR.

* To be published in ICCV2023

Motion Degeneracy in Self-supervised Learning of Elevation Angle Estimation for 2D Forward-Looking Sonar

Aug 01, 2023
Yusheng Wang, Yonghoon Ji, Chujie Wu, Hiroshi Tsuchiya, Hajime Asama, Atsushi Yamashita

2D forward-looking sonar is a crucial sensor for underwater robotic perception. A well-known problem in this field is estimating missing information in the elevation direction during sonar imaging. There are demands to estimate 3D information per image for 3D mapping and robot navigation during fly-through missions. Recent learning-based methods have demonstrated their strengths, but there are still drawbacks. Supervised learning methods have achieved high-quality results but may require further efforts to acquire 3D ground-truth labels. The existing self-supervised method requires pretraining using synthetic images with 3D supervision. This study aims to realize stable self-supervised learning of elevation angle estimation without pretraining using synthetic images. Failures during self-supervised learning may be caused by motion degeneracy problems. We first analyze the motion field of 2D forward-looking sonar, which is related to the main supervision signal. We utilize a modern learning framework and prove that if the training dataset is built with effective motions, the network can be trained in a self-supervised manner without the knowledge of synthetic data. Both simulation and real experiments validate the proposed method.

* IROS2023

Improving Pixel-based MIM by Reducing Wasted Modeling Capability

Aug 01, 2023
Yuan Liu, Songyang Zhang, Jiacheng Chen, Zhaohui Yu, Kai Chen, Dahua Lin

There has been significant progress in Masked Image Modeling (MIM). Existing MIM methods can be broadly categorized into two groups based on the reconstruction target: pixel-based and tokenizer-based approaches. The former offers a simpler pipeline and lower computational cost, but it is known to be biased toward high-frequency details. In this paper, we provide a set of empirical studies to confirm this limitation of pixel-based MIM and propose a new method that explicitly utilizes low-level features from shallow layers to aid pixel reconstruction. By incorporating this design into our base method, MAE, we reduce the wasted modeling capability of pixel-based MIM, improving its convergence and achieving non-trivial improvements across various downstream tasks. To the best of our knowledge, we are the first to systematically investigate multi-level feature fusion for isotropic architectures like the standard Vision Transformer (ViT). Notably, when applied to a smaller model (e.g., ViT-S), our method yields significant performance gains, such as 1.2% on fine-tuning, 2.8% on linear probing, and 2.6% on semantic segmentation. Code and models are available at https://github.com/open-mmlab/mmpretrain.

* Accepted by ICCV2023

Dynamic ensemble selection based on Deep Neural Network Uncertainty Estimation for Adversarial Robustness

Aug 01, 2023
Ruoxi Qin, Linyuan Wang, Xuehui Du, Xingyuan Chen, Bin Yan

Deep neural networks have attained significant efficiency in image recognition. However, their recognition robustness is vulnerable under extensive data uncertainty in practical applications. The uncertainty is attributed to inevitable ambient noise and, more importantly, to possible adversarial attacks. Dynamic methods can effectively improve the defense initiative in the arms race between attack and defense of adversarial examples. Unlike previous dynamic methods that depend on the input or the decision, this work explores dynamic attributes at the model level through dynamic ensemble selection to further protect the model from white-box attacks and improve robustness. Specifically, in the training phase, the Dirichlet distribution is applied as the prior of the sub-models' predictive distribution, and a diversity constraint in parameter space is introduced under the lightweight sub-models to construct alternative ensemble model spaces. In the test phase, certain sub-models are dynamically selected based on the rank of their uncertainty values for the final prediction, to ensure the majority-accurate principle in ensemble robustness and accuracy. Compared with previous dynamic methods and static adversarial training models, the presented approach achieves significant robustness results without damaging accuracy by combining the dynamics and diversity properties.
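
The test-phase selection step can be sketched as follows. This is only an illustration under stated assumptions: predictive entropy is used as a stand-in for the Dirichlet-based uncertainty the abstract describes, the lowest-uncertainty sub-models are the ones kept (the abstract only says selection is based on uncertainty rank), and the function names and the value of `k` are hypothetical.

```python
import math


def entropy(probs):
    """Predictive entropy, used here as a stand-in uncertainty score."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_and_predict(sub_model_probs, k):
    """Keep the k least-uncertain sub-models and average their class probabilities."""
    ranked = sorted(range(len(sub_model_probs)),
                    key=lambda i: entropy(sub_model_probs[i]))
    chosen = ranked[:k]
    n_classes = len(sub_model_probs[0])
    avg = [sum(sub_model_probs[i][c] for i in chosen) / k
           for c in range(n_classes)]
    pred = max(range(n_classes), key=lambda c: avg[c])
    return chosen, avg, pred


# Made-up class probabilities from four sub-models for a single input.
probs = [
    [0.90, 0.05, 0.05],   # confident
    [0.34, 0.33, 0.33],   # nearly uniform, highest entropy
    [0.80, 0.10, 0.10],   # fairly confident
    [0.10, 0.45, 0.45],   # split between two classes
]
chosen, avg, pred = select_and_predict(probs, k=2)
print(chosen, pred)
```

Since the selected subset changes per input, an attacker cannot assume a fixed set of sub-models contributes to the final prediction, which is the "dynamic" property the abstract emphasizes.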
