"Information": models, code, and papers

Learning Better Contrastive View from Radiologist's Gaze

May 15, 2023
Sheng Wang, Zixu Zhuang, Xi Ouyang, Lichi Zhang, Zheren Li, Chong Ma, Tianming Liu, Dinggang Shen, Qian Wang

Recent self-supervised contrastive learning methods greatly benefit from the Siamese structure, which aims to minimize distances between positive pairs. These methods usually apply random data augmentation to input images, expecting the augmented views of the same image to be similar and positively paired. However, random augmentation may overlook image semantic information and degrade the quality of augmented views in contrastive learning. This issue is more challenging in medical images, since disease-related abnormalities can be tiny and are easily corrupted (e.g., cropped out) under the current scheme of random augmentation. In this work, we first demonstrate that, for widely used X-ray images, the conventional augmentation prevalent in contrastive pre-training can affect the performance of downstream diagnosis or classification tasks. Then, we propose a novel augmentation method, FocusContrast, that learns from radiologists' gaze during diagnosis and generates contrastive views for medical images with guidance from radiologists' visual attention. Specifically, we track the gaze movement of radiologists and model their visual attention when reading X-ray images for diagnosis. The learned model can predict the visual attention of radiologists given a new input image, and further guide an attention-aware augmentation that rarely neglects disease-related abnormalities. As a plug-and-play and framework-agnostic module, FocusContrast consistently improves the state-of-the-art contrastive learning methods SimCLR, MoCo, and BYOL by 4.0% to 7.0% in classification accuracy on a knee X-ray dataset.
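
The attention-aware augmentation idea can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a predicted attention map is already available (here a toy 2D list), and it simply rejects random crops that keep too little of the attention mass. The function names, the `min_kept` threshold, and the retry budget are all hypothetical.

```python
import random

def attention_kept(attn, top, left, h, w):
    """Fraction of the total attention mass that falls inside the crop window."""
    total = sum(sum(row) for row in attn)
    inside = sum(sum(row[left:left + w]) for row in attn[top:top + h])
    return inside / total if total > 0 else 0.0

def attention_aware_crop(attn, crop_h, crop_w, min_kept=0.8, max_tries=100, rng=None):
    """Sample random crop windows and return the first one that keeps at least
    min_kept of the attention mass, or the best window seen after max_tries.
    (Hypothetical rejection-sampling rule, assumed for illustration.)"""
    rng = rng or random.Random()
    height, width = len(attn), len(attn[0])
    best, best_kept = None, -1.0
    for _ in range(max_tries):
        top = rng.randint(0, height - crop_h)
        left = rng.randint(0, width - crop_w)
        kept = attention_kept(attn, top, left, crop_h, crop_w)
        if kept > best_kept:
            best, best_kept = (top, left), kept
        if kept >= min_kept:
            break
    return best, best_kept

# Toy 6x6 attention map with a small hot spot at rows 1-2, columns 3-4,
# standing in for a tiny abnormality that a random crop could cut out.
attn = [[0.0] * 6 for _ in range(6)]
for r in (1, 2):
    for c in (3, 4):
        attn[r][c] = 1.0

(top, left), kept = attention_aware_crop(attn, 4, 4, min_kept=1.0, rng=random.Random(0))
```

In this toy setting only four of the nine possible 4x4 windows contain the whole hot spot, so a purely random crop would often lose part of it, while the rejection rule keeps sampling until the region is preserved.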

HIORE: Leveraging High-order Interactions for Unified Entity Relation Extraction

May 07, 2023
Yijun Wang, Changzhi Sun, Yuanbin Wu, Lei Li, Junchi Yan, Hao Zhou

Entity relation extraction consists of two sub-tasks: entity recognition and relation extraction. Existing methods either tackle these two tasks separately or unify them with word-by-word interactions. In this paper, we propose HIORE, a new method for unified entity relation extraction. The key insight is to leverage high-order interactions, i.e., the complex associations among word pairs, which contain richer information than first-order word-by-word interactions. For this purpose, we first devise a W-shape DNN (WNet) to capture coarse-level high-order connections. Then, we build a heuristic high-order graph and further calibrate the representations with a graph neural network (GNN). Experiments on three benchmarks (ACE04, ACE05, SciERC) show that HIORE achieves state-of-the-art performance on relation extraction and an improvement of 1.1 to 1.8 F1 points over the prior best unified model.

* 10 pages

PanFlowNet: A Flow-Based Deep Network for Pan-sharpening

May 16, 2023
Gang Yang, Xiangyong Cao, Wenzhe Xiao, Man Zhou, Aiping Liu, Xun Chen, Deyu Meng

Pan-sharpening aims to generate a high-resolution multispectral (HRMS) image by integrating the spectral information of a low-resolution multispectral (LRMS) image with the texture details of a high-resolution panchromatic (PAN) image. It essentially inherits the ill-posed nature of the super-resolution (SR) task, in that diverse HRMS images can degrade into the same LRMS image. However, existing deep learning-based methods recover only one HRMS image from the LRMS image and PAN image using a deterministic mapping, thus ignoring the diversity of the HRMS image. In this paper, to alleviate this ill-posed issue, we propose a flow-based pan-sharpening network (PanFlowNet) to directly learn the conditional distribution of the HRMS image given the LRMS image and PAN image instead of learning a deterministic mapping. Specifically, we first transform this unknown conditional distribution into a given Gaussian distribution by an invertible network, so that the conditional distribution can be explicitly defined. Then, we design an invertible Conditional Affine Coupling Block (CACB) and build the architecture of PanFlowNet by stacking a series of CACBs. Finally, PanFlowNet is trained by maximizing the log-likelihood of the conditional distribution given a training set and can then be used to predict diverse HRMS images. The experimental results verify that the proposed PanFlowNet can generate various HRMS images given an LRMS image and a PAN image. Additionally, the experimental results on different kinds of satellite datasets demonstrate the superiority of PanFlowNet over other state-of-the-art methods both visually and quantitatively.
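
To make the invertibility and log-likelihood ideas concrete, here is a minimal sketch of a generic conditional affine coupling layer in plain Python. It is not the paper's CACB: the `scale_shift` function is a hypothetical toy stand-in for learned networks, and the vectors stand in for image features and conditioning inputs. It only shows that such a layer can be inverted exactly and that its log-determinant enters the likelihood through the standard change-of-variables formula.

```python
import math

def scale_shift(x1, cond):
    """Toy stand-in for learned networks producing log-scale s and shift t.
    Hypothetical: a real block would use neural networks conditioned on cond."""
    ctx = sum(x1) + sum(cond)
    s = [math.tanh(0.1 * ctx + 0.05 * i) for i in range(len(x1))]
    t = [0.5 * ctx - i for i in range(len(x1))]
    return s, t

def coupling_forward(x, cond):
    """Keep the first half unchanged, affinely transform the second half.
    Returns the output and the log-determinant of the Jacobian (sum of s).
    Assumes len(x) is even."""
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    s, t = scale_shift(x1, cond)
    y2 = [b * math.exp(si) + ti for b, si, ti in zip(x2, s, t)]
    return x1 + y2, sum(s)

def coupling_inverse(y, cond):
    """Exact inverse: s and t are recomputed from the untouched first half."""
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    s, t = scale_shift(y1, cond)
    x2 = [(b - ti) * math.exp(-si) for b, si, ti in zip(y2, s, t)]
    return y1 + x2

def log_likelihood(x, cond):
    """log p(x | cond) = log N(z; 0, I) + log|det dz/dx| for a single layer."""
    z, log_det = coupling_forward(x, cond)
    log_pz = -0.5 * sum(v * v for v in z) - 0.5 * len(z) * math.log(2 * math.pi)
    return log_pz + log_det

x = [0.3, -1.2, 2.0, 0.7]   # toy "HRMS" feature vector
cond = [0.5, 0.1]           # toy conditioning vector (stand-in for LRMS and PAN)
z, log_det = coupling_forward(x, cond)
x_rec = coupling_inverse(z, cond)
```

Sampling different `z` values from the Gaussian and passing them through `coupling_inverse` with the same `cond` would yield different outputs, which is the mechanism a flow uses to produce diverse predictions for one conditioning input.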

Blind Image Quality Assessment via Transformer Predicted Error Map and Perceptual Quality Token

May 16, 2023
Jinsong Shi, Pan Gao, Aljosa Smolic

Image quality assessment is a fundamental problem in the field of image processing, and due to the lack of reference images in most practical scenarios, no-reference image quality assessment (NR-IQA) has gained increasing attention recently. With the development of deep learning technology, many deep neural network-based NR-IQA methods have been developed, which try to learn the image quality based on the understanding of database information. Currently, the Transformer has achieved remarkable progress in various vision tasks. Since the characteristics of the attention mechanism in the Transformer fit the global perceptual impact of artifacts perceived by a human, the Transformer is well suited for image quality assessment tasks. In this paper, we propose a Transformer-based NR-IQA model using a predicted objective error map and a perceptual quality token. Specifically, we first generate the predicted error map by pre-training one model consisting of a Transformer encoder and decoder, in which the objective difference between the distorted and the reference images is used as supervision. Then, we freeze the parameters of the pre-trained model and design another branch using the vision Transformer to extract the perceptual quality token for feature fusion with the predicted error map. Finally, the fused features are regressed to the final image quality score. Extensive experiments have shown that our proposed method outperforms the current state-of-the-art on both authentic and synthetic image databases. Moreover, the attentional map extracted by the perceptual quality token also conforms to the characteristics of the human visual system.

* Submitted to TMM

Applying Machine Learning Analysis for Software Quality Test

May 16, 2023
Al Khan, Remudin Reshid Mekuria, Ruslan Isaev

One of the biggest expenses in software development is maintenance. Therefore, it is critical to comprehend what triggers maintenance and whether it can be predicted. Numerous studies have demonstrated that specific methods of assessing the complexity of created programs may produce useful prediction models to ascertain the possibility of maintenance due to software failures. As a routine, this is performed prior to release, and setting up the models frequently calls for certain object-oriented software measurements. Software developers do not always have access to these measurements. In this paper, machine learning is applied to the available data to calculate the cumulative software failure levels. A technique to forecast a software system's residual defectiveness using machine learning can be looked into as a solution to the challenge of predicting residual flaws. Software metrics and defect data were extracted from the static source code repository. Static code is used to create software metrics, and reported bugs in the repository are used to gather defect information. By using a correlation method, metrics that had no connection to the defect data were removed. This makes it possible to analyze all the data without pausing the programming process. The primary issue with large, sophisticated software is that it is impossible to control everything manually, and the cost of an error can be quite high. As a consequence, developers may miss errors during testing, which will raise maintenance costs. Finding a method to accurately forecast software defects is the overall objective.

* 2023 International Conference on Code Quality (ICCQ), IEEE Xplore
* 16 pages, 5 figures and 14 tables
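
The correlation-based filtering step mentioned in the abstract can be sketched as follows. The abstract does not say which correlation measure or cutoff was used, so Pearson correlation and the `min_abs_r` threshold are assumptions, and the metric names and values are made-up toy data rather than anything from the paper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def select_metrics(metrics, defects, min_abs_r=0.3):
    """Keep only metrics whose absolute correlation with the defect counts
    reaches min_abs_r (hypothetical cutoff, assumed for illustration)."""
    kept = {}
    for name, values in metrics.items():
        r = pearson(values, defects)
        if abs(r) >= min_abs_r:
            kept[name] = r
    return kept

# Toy data: one value per module, with the reported defect count per module.
metrics = {
    "lines_of_code": [120, 300, 80, 450, 200],
    "cyclomatic_complexity": [4, 12, 3, 20, 7],
    "comment_count": [5, 5, 5, 5, 5],        # constant, carries no signal
    "file_age_days": [10, 50, 30, 20, 40],   # nearly uncorrelated with defects
}
defects = [1, 4, 0, 7, 2]

kept = select_metrics(metrics, defects)
```

Here `kept` retains the two metrics that track the defect counts closely and drops the constant and nearly uncorrelated ones; the surviving metrics would then be the inputs to whatever prediction model is trained next.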

Constructing a Knowledge Graph from Textual Descriptions of Software Vulnerabilities in the National Vulnerability Database

Apr 30, 2023
Anders Mølmen Høst, Pierre Lison, Leon Moonen

Knowledge graphs have shown promise for several cybersecurity tasks, such as vulnerability assessment and threat analysis. In this work, we present a new method for constructing a vulnerability knowledge graph from information in the National Vulnerability Database (NVD). Our approach combines named entity recognition (NER), relation extraction (RE), and entity prediction using a combination of neural models, heuristic rules, and knowledge graph embeddings. We demonstrate how our method helps to fix missing entities in knowledge graphs used for cybersecurity and evaluate the performance.

* Accepted for publication in the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), Tórshavn, Faroe Islands, May 22nd-24th, 2023

Energy-Efficient Lane Changes Planning and Control for Connected Autonomous Vehicles on Urban Roads

Apr 17, 2023
Eunhyek Joa, Hotae Lee, Eric Yongkeun Choi, Francesco Borrelli

This paper presents a novel energy-efficient motion planning algorithm for Connected Autonomous Vehicles (CAVs) on urban roads. The approach consists of two components: a decision-making algorithm and an optimization-based trajectory planner. The decision-making algorithm leverages Signal Phase and Timing (SPaT) information from connected traffic lights to select a lane with the aim of reducing energy consumption. The algorithm is based on a heuristic rule that is learned from human driving data. The optimization-based trajectory planner generates a safe, smooth, and energy-efficient trajectory toward the selected lane. The proposed strategy is experimentally evaluated in a Vehicle-in-the-Loop (VIL) setting, where a real test vehicle receives SPaT information from both actual and virtual traffic lights and autonomously drives on a testing site, while the surrounding vehicles are simulated. The results demonstrate that the use of SPaT information in autonomous driving leads to improved energy efficiency, with the proposed strategy reducing energy consumption by 37.1% compared to a lane-keeping algorithm.

* IEEE Intelligent Vehicle Symposium, Anchorage, Alaska, June 4-7, 2023

Self-Supervised Video Representation Learning via Latent Time Navigation

May 10, 2023
Di Yang, Yaohui Wang, Quan Kong, Antitza Dantcheva, Lorenzo Garattoni, Gianpiero Francesca, Francois Bremond

Self-supervised video representation learning has aimed at maximizing similarity between different temporal segments of one video in order to enforce feature persistence over time. This leads to a loss of pertinent information related to temporal relationships, rendering actions such as 'enter' and 'leave' indistinguishable. To mitigate this limitation, we propose Latent Time Navigation (LTN), a time-parameterized contrastive learning strategy that is streamlined to capture fine-grained motions. Specifically, we maximize the representation similarity between different video segments from one video, while maintaining their representations time-aware along a subspace of the latent representation code including an orthogonal basis to represent temporal changes. Our extensive experimental analysis suggests that learning video representations by LTN consistently improves performance of action classification in fine-grained and human-oriented tasks (e.g., on the Toyota Smarthome dataset). In addition, we demonstrate that our proposed model, when pre-trained on Kinetics-400, generalizes well to the unseen real-world video benchmark datasets UCF101 and HMDB51, achieving state-of-the-art performance in action recognition.

* AAAI 2023

Mobile Image Restoration via Prior Quantization

May 10, 2023
Shiqi Chen, Jinwen Zhou, Menghao Li, Yueting Chen, Tingting Jiang

In digital images, optical aberration is a multivariate degradation, where the spectrum of the scene, the lens imperfections, and the field of view together contribute to the result. Besides eliminating it at the hardware level, the post-processing system, which utilizes various prior information, is significant for correction. However, due to the content differences among priors, a pipeline that aligns these factors shows limited efficiency and unoptimized restoration. Here, we propose a prior quantization model to correct the optical aberrations in image processing systems. To integrate this information, we encode various priors into a latent space and quantize them with learnable codebooks. After quantization, the prior codes are fused with the image restoration branch to realize targeted optical aberration correction. Comprehensive experiments demonstrate the flexibility of the proposed method and validate its potential to accomplish targeted restoration for a specific camera. Furthermore, our model promises to analyze the correlation between the various priors and the optical aberration of devices, which is helpful for joint software-hardware design.

* Submitted to Elsevier PRL. 5 pages, 5 figures
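
As a rough illustration of quantizing an encoded prior against a codebook, the sketch below performs a generic nearest-neighbor vector quantization lookup. It is not the paper's model: the codebook here is a fixed toy list rather than a learned one, and the function name and vectors are hypothetical.

```python
def quantize(vec, codebook):
    """Return (index, entry) of the codebook vector closest to vec
    under squared Euclidean distance."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    idx = min(range(len(codebook)), key=lambda i: sqdist(vec, codebook[i]))
    return idx, codebook[idx]

# Toy codebook with three 2-D entries; a real one would be learned in training.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]

# Toy latent vector standing in for an encoded prior (e.g., a lens property).
prior = [0.9, 1.2]
idx, code = quantize(prior, codebook)
```

The discrete index `idx` is what makes the prior "quantized"; a downstream restoration branch would consume the selected codebook entry `code` instead of the raw continuous vector.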

Multi-stage Progressive Reasoning for Dunhuang Murals Inpainting

May 10, 2023
Wenjie Liu, Baokai Liu, Shiqiang Du, Yuqing Shi, Jiacheng Li, Jianhua Wang

Dunhuang murals suffer from fading, breakage, surface brittleness, and extensive peeling caused by prolonged environmental erosion. Image inpainting techniques are widely used in the field of digital mural inpainting. Generally speaking, mural inpainting tasks with large-area damage are challenging for any image inpainting method. In this paper, we design a multi-stage progressive reasoning network (MPR-Net) containing global-to-local receptive fields for mural inpainting. This network is capable of recursively inferring the damage boundary and progressively tightening the regional texture constraints. Moreover, to adaptively fuse the plentiful information at various scales of murals, a multi-scale feature aggregation module (MFA) is designed to strengthen the ability to select significant features. The execution of the model is similar to the process followed by a mural restorer (i.e., first inpainting the structure of the damaged mural globally and then adding the local texture details). Our method has been evaluated through both qualitative and quantitative experiments, and the results demonstrate that it outperforms state-of-the-art image inpainting methods.
