Dictionary learning aims to find a dictionary under which the training data can be sparsely represented. Methods in the literature typically formulate dictionary learning as an optimization over two variables, i.e., the dictionary and the sparse coefficients, and solve it by alternating between two stages: sparse coding and dictionary update. The key contribution of this work is a Rank-One Atomic Decomposition (ROAD) formulation in which dictionary learning is cast as an optimization over a single variable, a set of rank-one matrices. The resulting algorithm is hence single-stage. Compared with two-stage algorithms, ROAD minimizes the sparsity of the coefficients while keeping the data-consistency constraint throughout the whole learning process. An alternating direction method of multipliers (ADMM) solver is derived, and a lower bound on the penalty parameter is computed to guarantee global convergence despite the non-convexity of the formulation. From a practical point of view, ROAD reduces the number of tuning parameters required by other benchmark algorithms. Numerical tests demonstrate that ROAD outperforms benchmark algorithms on both synthetic and real data, especially when the number of training samples is small.
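For context, the two-stage baseline that ROAD departs from can be sketched as a simple alternating loop. The sketch below is illustrative only, not the paper's method: it uses a crude hard-thresholded least-squares step in place of a proper sparse coder (such as OMP) and a MOD-style least-squares dictionary update; all sizes and function names are our own assumptions.

```python
import numpy as np

def sparse_code(D, X, k):
    # Crude sparse coding stage: least-squares fit, then keep only the
    # k largest-magnitude coefficients per column (a stand-in for OMP).
    C = np.linalg.lstsq(D, X, rcond=None)[0]
    for j in range(C.shape[1]):
        small = np.argsort(np.abs(C[:, j]))[:-k]
        C[small, j] = 0.0
    return C

def dict_update(X, C):
    # MOD-style dictionary update: minimise ||X - D C||_F over D.
    # (Practical implementations also renormalise the atoms.)
    return X @ C.T @ np.linalg.pinv(C @ C.T)

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 100))   # training data, one sample per column
D = rng.standard_normal((20, 40))    # initial overcomplete dictionary
for _ in range(10):                  # alternate the two stages
    C = sparse_code(D, X, k=5)
    D = dict_update(X, C)
err = np.linalg.norm(X - D @ C) / np.linalg.norm(X)
```

ROAD's point is precisely that this alternation optimizes two coupled variables in turn, whereas its single-variable rank-one formulation avoids the two-stage split.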
Dictionary learning aims to find a dictionary under which the training data can be sparsely represented, and it is usually achieved by iteratively applying two stages: sparse coding and dictionary update. Typical dictionary update methods refine both the dictionary atoms and their corresponding sparse coefficients using the sparsity patterns obtained from the sparse coding stage, making the update a non-convex bilinear inverse problem. In this paper, we propose a Rank-One Matrix Decomposition (ROMD) algorithm that recasts this challenge as a convex problem by resolving the two variables into a set of rank-one matrices. Unlike methods in the literature, ROMD updates the whole dictionary at once using convex programming. The advantages hence include convergence guarantees for the dictionary update and faster convergence of the whole dictionary learning process. The performance of ROMD is compared with that of benchmark dictionary learning algorithms. The results show that ROMD improves recovery accuracy, especially at high sparsity levels and with few observations.
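The reformulation above rests on the elementary identity that a dictionary-times-coefficients product decomposes into a sum of rank-one matrices, DC = Σ_k d_k c_kᵀ. A short numerical check illustrates the identity (the sizes here are arbitrary and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
D = rng.standard_normal((8, 5))    # dictionary, atoms as columns
C = rng.standard_normal((5, 12))   # coefficients, codes as rows
# D C equals the sum of the rank-one matrices d_k c_k^T; ROMD-style
# formulations optimise over this set of rank-one matrices rather
# than over D and C separately.
M = sum(np.outer(D[:, k], C[k, :]) for k in range(D.shape[1]))
```

Each term np.outer(D[:, k], C[k, :]) has rank one, which is what makes the set-of-rank-one-matrices parameterization possible.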
Short-and-sparse deconvolution (SaSD) aims to recover a short kernel and a long, sparse signal from their convolution. In the literature, blind deconvolution is formulated either as a convex program via a matrix lifting of the convolution, or as a bilinear Lasso. Optimization solvers are typically based on bilinear factorizations. In this paper, we formulate SaSD as a non-convex optimization with a rank-one matrix constraint, hence referred to as Rank-One Constrained Optimization (ROCO). The solver is based on the alternating direction method of multipliers (ADMM) and operates on the full rank-one matrix rather than on bilinear factorizations. Closed-form updates are derived for the efficiency of ADMM. Simulations include both synthetic data and real images. Results show substantial improvements in recovery accuracy (at least 19 dB in PSNR for real images) and runtime comparable with benchmark algorithms based on bilinear factorization.
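The matrix lifting mentioned above exploits the fact that the convolution of kernel a and signal x is a linear function of the rank-one matrix axᵀ: entry (i, j) of the lifted matrix is a[i]x[j], and summing its anti-diagonals reproduces y[t] = Σ_i a[i]x[t−i]. A minimal sketch, with made-up toy inputs, checks this against numpy's own convolution:

```python
import numpy as np

def lifted_conv(M, n_out):
    # Convolution as a linear map on the lifted matrix M = a x^T:
    # y[t] = sum over all (i, j) with i + j = t of M[i, j],
    # i.e. the sum of M's anti-diagonals.
    m, n = M.shape
    y = np.zeros(n_out)
    for i in range(m):
        for j in range(n):
            y[i + j] += M[i, j]
    return y

a = np.array([1.0, -2.0, 0.5])       # short kernel (toy example)
x = np.array([0.0, 3.0, 0.0, 1.0])   # sparse signal (toy example)
y1 = np.convolve(a, x)               # direct convolution
y2 = lifted_conv(np.outer(a, x), len(y1))  # via the rank-one lift
```

This linearity in axᵀ is what lets ROCO replace the bilinear factorization with a single rank-one matrix variable.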
Direct automatic segmentation of objects from 3D medical imaging, such as magnetic resonance (MR) imaging, is challenging, as it often involves accurately identifying a number of individual objects with complex geometries within a large volume under investigation. To address this, most deep learning approaches enhance their learning capability by substantially increasing the complexity or the number of trainable parameters in their models. Consequently, these models generally require long inference times on the standard workstations operating clinical MR systems and are restricted to high-performance computing hardware due to their large memory requirements. Further, to fit 3D datasets through these large models using limited computer memory, trade-offs such as patch-wise training are often adopted, sacrificing fine-scale geometric information from the input images that could be clinically significant for diagnostic purposes. To address these challenges, we present a compact convolutional neural network with a small memory footprint that efficiently reduces the number of model parameters required for state-of-the-art performance. This is critical for practical deployment, as most clinical environments only have low-end hardware with limited computing power and memory. The proposed network maintains data integrity by directly processing large full-size 3D input volumes, with no patches required, and significantly reduces the computational time for both training and inference. We also propose a novel loss function with an extra shape constraint to improve accuracy for imbalanced classes in 3D MR images.
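To see why parameter count matters for memory, it helps to recall the standard arithmetic for a 3D convolutional layer: k³ · c_in · c_out weights plus c_out biases. The channel widths below are hypothetical, chosen only to show how quickly widening layers inflates the count:

```python
def conv3d_params(c_in, c_out, k=3):
    # Parameter count of one 3D conv layer with a k x k x k kernel:
    # k^3 weights per (input channel, output channel) pair, plus one
    # bias per output channel.
    return k ** 3 * c_in * c_out + c_out

# Hypothetical comparison: a wide layer vs a compact one.
wide = conv3d_params(64, 128)     # 3^3 * 64 * 128 + 128 = 221,312
compact = conv3d_params(16, 32)   # 3^3 * 16 * 32 + 32  =  13,856
```

Stacking many wide layers is what pushes full-volume 3D models beyond commodity GPU memory, motivating the compact design described above.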
Interpretable brain network models for disease prediction are of great value for the advancement of neuroscience. Graph neural networks (GNNs) are promising for modeling complicated network data, but they are prone to overfitting and suffer from poor interpretability, which prevents their use in decision-critical scenarios such as healthcare. To bridge this gap, we propose BrainNNExplainer, an interpretable GNN framework for brain network analysis. It is mainly composed of two jointly learned modules: a backbone prediction model specifically designed for brain networks, and an explanation generator that highlights disease-specific prominent brain network connections. Extensive experimental results with visualizations on two challenging disease prediction datasets demonstrate the unique interpretability and outstanding performance of BrainNNExplainer.
Various applications of advanced air mobility (AAM) in urban environments facilitate our daily life and public services. As one of the key issues in realizing these applications autonomously, the path planning problem has been studied with the main objectives of minimizing travel distance, flight time, and energy cost. However, AAM operations in metropolitan areas raise safety and societal issues, because most AAM aircraft are unmanned aerial vehicles (UAVs) that may fail during operation, posing fatality risk, property damage risk, and societal impacts (noise and privacy) to the public. To quantitatively assess these risks and mitigate them in the planning phase, this paper proposes an integrated risk assessment model and develops a hybrid algorithm to solve the risk-based 3D path planning problem. The integrated risk assessment method considers probability and severity models of a UAV impacting people and vehicles on the ground. By introducing a gravity model, the population density and traffic density are estimated at a finer scale, which enables more accurate risk assessment. The 3D risk-based path planning problem is first formulated as a special minimum cost flow problem. Then, a hybrid estimation of distribution algorithm (EDA) and risk-based A* (EDA-RA*) algorithm is proposed to solve it. To improve computational efficiency, a k-means clustering method is incorporated into EDA-RA* to provide both global and local search heuristic information, forming the EDA and fast risk-based A* (EDA-FRA*) algorithm. Case study results show that the risk assessment model can capture high-risk areas and that the generated risk map enables safe UAV path planning in complex urban environments.
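The core idea of risk-based A* search can be sketched on a 2D grid: edge costs combine travel distance with the risk of the cell entered, and an admissible distance heuristic guides the search. This is a generic illustration of the concept, not the paper's EDA-RA* algorithm; the grid, costs, and heuristic are our own toy assumptions.

```python
import heapq

def risk_astar(grid_risk, start, goal):
    # A* on a grid where each step costs 1 (distance) plus the risk
    # of the cell entered. Manhattan distance is admissible here
    # because every step costs at least 1 and risks are non-negative.
    rows, cols = len(grid_risk), len(grid_risk[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    frontier = [(h(start), 0.0, start, [start])]
    best = {}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if best.get(node, float("inf")) <= g:
            continue  # already expanded with a cheaper cost
        best[node] = g
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                ng = g + 1.0 + grid_risk[nr][nc]
                heapq.heappush(frontier, (ng + h((nr, nc)), ng,
                                          (nr, nc), path + [(nr, nc)]))
    return None

# Toy risk map: a high-risk corridor in the middle column.
risk = [[0.0, 9.0, 0.0],
        [0.0, 9.0, 0.0],
        [0.0, 0.0, 0.0]]
cost, path = risk_astar(risk, (0, 0), (0, 2))
```

On this toy map the planner detours around the high-risk cells even though the straight path is geometrically shorter, which is exactly the trade-off a risk-aware planner is meant to make.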
Video quality assessment (VQA) is now a fast-growing subject, beginning to mature in the full reference (FR) case, while the burgeoning no reference (NR) case remains challenging. We investigate variants of the popular VMAF video quality assessment algorithm for the FR case, using support vector regression and feedforward neural networks, and extend it to the NR case, using the same learning architectures, to develop a partially unified framework for VQA. When heavily trained, algorithms such as VMAF perform well on test datasets, with over 90% match; but predicting performance in the wild is better assessed by training and testing from scratch, as we do. Even from scratch, we achieve over 90% performance in FR, with gains over VMAF. We also greatly reduce complexity compared with leading recent NR algorithms, VIDEVAL and RAPIQUE, while exceeding 80% in SRCC. In our preliminary testing, we find the improvements in trainability, achieved while also constraining computational complexity, quite encouraging, suggesting further study and analysis.
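The SRCC figure quoted above is Spearman's rank correlation between predicted and subjective quality scores, i.e., the Pearson correlation computed on ranks. A minimal pure-Python sketch (ignoring tie handling, which a production implementation would need) illustrates the computation:

```python
def ranks(values):
    # Rank of each value (0 = smallest). Ties are not averaged here,
    # which is a simplification over the standard definition.
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    for rank, i in enumerate(order):
        r[i] = float(rank)
    return r

def srcc(a, b):
    # Spearman rank correlation coefficient: Pearson correlation
    # of the rank vectors of a and b.
    ra, rb = ranks(a), ranks(b)
    n = len(ra)
    ma, mb = sum(ra) / n, sum(rb) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    da = sum((x - ma) ** 2 for x in ra) ** 0.5
    db = sum((y - mb) ** 2 for y in rb) ** 0.5
    return num / (da * db)
```

Because SRCC depends only on ranks, it rewards predictors that order videos by quality correctly even when their absolute scores are miscalibrated, which is why it is the standard VQA agreement measure.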
Previous research evaluating deep learning (DL) classifiers has often used top-1/top-5 accuracy. However, the accuracy of DL classifiers is unstable, in that it often changes significantly when retested on imperfect or adversarial images. This paper adds to the small but fundamental body of work on benchmarking the robustness of DL classifiers on imperfect images by proposing a two-dimensional metric, consisting of mean accuracy and coefficient of variation, to measure the robustness of DL classifiers. Spearman's rank correlation coefficient and Pearson's correlation coefficient are used and their independence is evaluated. A statistical plot we call mCV is presented, which aims to help visualize the robustness of DL classifier performance across varying amounts of imperfection in the tested images. Finally, we demonstrate that defective images corrupted by two-factor corruption can be used to improve the robustness of DL classifiers. All source code and related image sets are shared on a website (http://www.animpala.com) to support future research.
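The two-dimensional metric pairs mean accuracy with the coefficient of variation (sample standard deviation divided by the mean) over accuracies measured at several corruption severities: a robust classifier should score high on the first axis and low on the second. A minimal sketch with hypothetical accuracy values:

```python
import statistics

def robustness_metric(accuracies):
    # Two-dimensional robustness summary of a classifier tested at
    # several corruption levels: (mean accuracy, coefficient of
    # variation). High mean and low CV indicate accurate, stable
    # performance across corruption severities.
    mean = statistics.fmean(accuracies)
    cv = statistics.stdev(accuracies) / mean
    return mean, cv

# Hypothetical accuracies of one classifier at five severities.
accs = [0.92, 0.88, 0.85, 0.80, 0.75]
mean, cv = robustness_metric(accs)
```

Reporting the pair rather than a single average is the point: two classifiers with equal mean accuracy can differ sharply in how much their accuracy swings as corruption increases.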
Image classification models have achieved satisfactory performance on many datasets, sometimes even surpassing humans. However, model attention remains unclear owing to the lack of interpretability. This paper investigates the fidelity and interpretability of model attention. We propose an Explainable Attribute-based Multi-task (EAT) framework to concentrate the model attention on the discriminative image area and make the attention interpretable. We introduce attribute prediction to the multi-task learning network, helping the network concentrate its attention on the foreground objects. We generate attribute-based textual explanations for the network and ground the attributes on the image to show visual explanations. The multi-modal explanations can not only improve user trust but also help to find the weaknesses of the network and dataset. Our framework can be generalized to any basic model. We perform experiments on three datasets and five basic models. The results indicate that the EAT framework can give multi-modal explanations that interpret the network decision. The performance of several recognition approaches is improved by guiding network attention.