Lei Ma

Structure-aware registration network for liver DCE-CT images

Mar 08, 2023
Peng Xue, Jingyang Zhang, Lei Ma, Mianxin Liu, Yuning Gu, Jiawei Huang, Feihong Liua, Yongsheng Pan, Xiaohuan Cao, Dinggang Shen

Image registration of liver dynamic contrast-enhanced computed tomography (DCE-CT) is crucial for diagnosis and image-guided surgical planning of liver cancer. However, intensity variations due to the flow of contrast agents, combined with complex spatial motion induced by respiration, bring great challenges to existing intensity-based registration methods. To address these problems, we propose a novel structure-aware registration method that incorporates structural information of related organs into a segmentation-guided deep registration network. Existing segmentation-guided registration methods only focus on volumetric registration inside the paired organ segmentations, ignoring the inherent attributes of their anatomical structures. In addition, such paired organ segmentations are not always available in DCE-CT images due to the flow of contrast agents. Unlike existing segmentation-guided registration methods, our proposed method extracts structural information from the hierarchical geometric perspectives of lines and surfaces. Then, according to the extracted structural information, structure-aware constraints are constructed and imposed on the forward and backward deformation fields simultaneously. In this way, all available organ segmentations, including unpaired ones, can be fully utilized to avoid the side effects of the contrast agent and preserve the topology of organs during registration. Extensive experiments on an in-house liver DCE-CT dataset and the public LiTS dataset show that our proposed method achieves higher registration accuracy and preserves anatomical structure more effectively than state-of-the-art methods.

DeepLens: Interactive Out-of-distribution Data Detection in NLP Models

Mar 02, 2023
Da Song, Zhijie Wang, Yuheng Huang, Lei Ma, Tianyi Zhang

Machine Learning (ML) has been widely used in Natural Language Processing (NLP) applications. A fundamental assumption in ML is that training data and real-world data follow a similar distribution. However, a deployed ML model may suffer from out-of-distribution (OOD) issues due to distribution shifts in the real-world data. Though many algorithms have been proposed to detect OOD data in text corpora, there is still a lack of interactive tool support for ML developers. In this work, we propose DeepLens, an interactive system that helps users detect and explore OOD issues in massive text corpora. Users can efficiently explore different OOD types in DeepLens with the help of a text clustering method. Users can also dig into a specific text by inspecting salient words highlighted through neuron activation analysis. In a within-subjects user study with 24 participants, participants using DeepLens were able to accurately find nearly twice as many types of OOD issues, with 22% more confidence, compared with a variant of DeepLens that has no interaction or visualization support.

* The first two authors contributed equally. To appear in the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23), April 23–28, 2023, Hamburg, Germany

DeepSeer: Interactive RNN Explanation and Debugging via State Abstraction

Mar 02, 2023
Zhijie Wang, Yuheng Huang, Da Song, Lei Ma, Tianyi Zhang

Recurrent Neural Networks (RNNs) have been widely used in Natural Language Processing (NLP) tasks given their superior performance in processing sequential data. However, it is challenging to interpret and debug RNNs due to their inherent complexity and lack of transparency. While many explainable AI (XAI) techniques have been proposed for RNNs, most of them only support local explanations rather than global explanations. In this paper, we present DeepSeer, an interactive system that provides both global and local explanations of RNN behavior in multiple tightly-coordinated views for model understanding and debugging. The core of DeepSeer is a state abstraction method that bundles semantically similar hidden states in an RNN model and abstracts the model as a finite state machine. Users can explore the global model behavior by inspecting text patterns associated with each state and the transitions between states. Users can also dive into individual predictions by inspecting the state trace and intermediate prediction results of a given input. A between-subjects user study with 28 participants shows that, compared with a popular XAI technique, LIME, participants using DeepSeer made a deeper and more comprehensive assessment of RNN model behavior, identified the root causes of incorrect predictions more accurately, and came up with more actionable plans to improve the model performance.

* To appear in the 2023 CHI Conference on Human Factors in Computing Systems (CHI '23), April 23–28, 2023, Hamburg, Germany
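
The idea of abstracting an RNN as a finite state machine can be illustrated with a minimal sketch. It is not DeepSeer's method: the abstract does not say how semantically similar hidden states are bundled, so this example swaps in a simple grid quantization as a stand-in. The function names, the `cell` parameter, and the toy traces are all hypothetical.

```python
from collections import Counter

def abstract_state(hidden, cell=0.5):
    """Map a hidden-state vector to a coarse grid cell (stand-in abstraction)."""
    return tuple(int(h // cell) for h in hidden)

def build_fsm(traces, cell=0.5):
    """Count transitions between abstract states over all hidden-state traces."""
    transitions = Counter()
    for trace in traces:
        states = [abstract_state(h, cell) for h in trace]
        # Each consecutive pair of abstract states is one FSM transition.
        for src, dst in zip(states, states[1:]):
            transitions[(src, dst)] += 1
    return transitions

# Hypothetical 2-D hidden-state traces, one per input sequence.
traces = [
    [(0.1, 0.2), (0.3, 0.4), (0.9, 0.8)],
    [(0.2, 0.1), (0.7, 0.9)],
]
fsm = build_fsm(traces)
for (src, dst), count in sorted(fsm.items()):
    print(src, "->", dst, count)
```

The transition counts give a global view of how inputs move between abstract states, while the per-input list of abstract states plays the role of a state trace for a single prediction.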

Neural Episodic Control with State Abstraction

Jan 27, 2023
Zhuo Li, Derui Zhu, Yujing Hu, Xiaofei Xie, Lei Ma, Yan Zheng, Yan Song, Yingfeng Chen, Jianjun Zhao

Existing Deep Reinforcement Learning (DRL) algorithms suffer from sample inefficiency. Generally, episodic control-based approaches are solutions that leverage highly-rewarded past experiences to improve the sample efficiency of DRL algorithms. However, previous episodic control-based approaches fail to utilize the latent information from the historical behaviors (e.g., state transitions, topological similarities, etc.) and lack scalability during DRL training. This work introduces Neural Episodic Control with State Abstraction (NECSA), a simple but effective state abstraction-based episodic control containing a more comprehensive episodic memory, a novel state evaluation, and a multi-step state analysis. We evaluate our approach on the MuJoCo and Atari tasks in OpenAI gym domains. The experimental results indicate that NECSA achieves higher sample efficiency than the state-of-the-art episodic control-based approaches. Our data and code are available at the project website: https://sites.google.com/view/drl-necsa.
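
As a rough illustration of episodic control over abstract states, the sketch below keeps the best discounted return seen for each grid cell of the state space. It is a generic sketch under stated assumptions, not NECSA itself: the grid abstraction, the class name `AbstractEpisodicMemory`, and its parameters are hypothetical, and the paper's state evaluation and multi-step analysis are not modelled.

```python
class AbstractEpisodicMemory:
    """Stores the best discounted return observed for each abstract state."""

    def __init__(self, cell=1.0):
        self.cell = cell          # grid width used for the assumed abstraction
        self.best_return = {}     # abstract state -> highest return seen so far

    def key(self, state):
        # Assumed abstraction: quantize each state dimension onto a grid.
        return tuple(int(x // self.cell) for x in state)

    def update(self, episode, gamma=0.99):
        """episode is a list of (state, reward) pairs in time order."""
        ret = 0.0
        # Walk backwards so ret is the discounted return from each step onward.
        for state, reward in reversed(episode):
            ret = reward + gamma * ret
            k = self.key(state)
            self.best_return[k] = max(ret, self.best_return.get(k, float("-inf")))

    def lookup(self, state, default=0.0):
        return self.best_return.get(self.key(state), default)


memory = AbstractEpisodicMemory(cell=1.0)
memory.update([((0.2, 0.3), 0.0), ((1.4, 0.1), 1.0)], gamma=0.5)
memory.update([((0.5, 0.5), 0.1)], gamma=0.5)  # lower return, does not overwrite
print(memory.lookup((0.9, 0.9)), memory.lookup((1.1, 0.5)), memory.lookup((3.0, 3.0)))
```

Because nearby concrete states share one memory entry, a highly rewarded experience can inform states that were never visited exactly, which is the intuition behind using abstraction to improve sample efficiency.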

An Exploratory Study of AI System Risk Assessment from the Lens of Data Distribution and Uncertainty

Dec 13, 2022
Zhijie Wang, Yuheng Huang, Lei Ma, Haruki Yokoyama, Susumu Tokumoto, Kazuki Munakata

Deep learning (DL) has become a driving force and has been widely adopted in many domains and applications with competitive performance. In practice, to solve nontrivial and complicated tasks in real-world applications, DL is often not used standalone, but instead serves as one component of a larger, complex AI system. Although there is a fast-growing trend of studying the quality issues of deep neural networks (DNNs) at the model level, few studies have investigated both the quality of DNNs at the unit level and their potential impact at the system level. More importantly, there is also a lack of systematic investigation into how to perform risk assessment for AI systems from the unit level to the system level. To bridge this gap, this paper initiates an early exploratory study of AI system risk assessment from both the data distribution and uncertainty angles. We propose a general framework with an exploratory study for analyzing AI systems. After large-scale experiments (700+ experimental configurations and 5000+ GPU hours) and in-depth investigations, we reached a few key findings that highlight the practical need and opportunities for more in-depth investigations into AI systems.

* 51 pages, 5 figures, 27 tables

AI-driven Mobile Apps: an Explorative Study

Dec 03, 2022
Yinghua Li, Xueqi Dang, Haoye Tian, Tiezhu Sun, Zhijie Wang, Lei Ma, Jacques Klein, Tegawende F. Bissyande

Recent years have witnessed an astonishing explosion in the evolution of mobile applications powered by AI technologies. The rapid growth of AI frameworks enables the transition of AI technologies to mobile devices, significantly promoting the adoption of AI apps (i.e., apps that integrate AI into their functions) on smartphone devices. In this paper, we conduct the most extensive empirical study on 56,682 published AI apps from three perspectives: dataset characteristics, development issues, and user feedback and privacy. To this end, we build an automated AI app identification tool, AI Discriminator, that detects eligible AI apps from 7,259,232 mobile apps. First, we carry out a dataset analysis, where we explore the large AndroZoo repository to identify AI apps and their core characteristics. Subsequently, we pinpoint key issues in AI app development (e.g., model protection). Finally, we focus on user reviews and user privacy protection. Our paper provides several notable findings. Among the most important, we reveal insufficient model protection, evidenced by the lack of model encryption, and demonstrate the risk of users' private data being leaked. We have published our large-scale AI app datasets to inspire more future research.

FAF: A novel multimodal emotion recognition approach integrating face, body and text

Nov 20, 2022
Zhongyu Fang, Aoyun He, Qihui Yu, Baopeng Gao, Weiping Ding, Tong Zhang, Lei Ma

Multimodal emotion analysis performs better at emotion recognition because it draws on more comprehensive emotional cues and multimodal emotion datasets. In this paper, we develop a large multimodal emotion dataset, named the "HED" dataset, to facilitate the emotion recognition task, and accordingly propose a multimodal emotion recognition method. To improve recognition accuracy, a "Feature After Feature" framework is used to explore crucial emotional information from the aligned face, body and text samples. We employ various benchmarks to evaluate the "HED" dataset and compare their performance with that of our method. The results show that the five-class classification accuracy of the proposed multimodal fusion method is about 83.75%, an improvement of 1.83%, 9.38%, and 21.62%, respectively, over the individual modalities. The complementarity between the channels is effectively used to improve the performance of emotion recognition. We have also established a multimodal online emotion prediction platform, aiming to provide free emotion prediction to more users.

Common Corruption Robustness of Point Cloud Detectors: Benchmark and Enhancement

Oct 12, 2022
Shuangzhi Li, Zhijie Wang, Felix Juefei-Xu, Qing Guo, Xingyu Li, Lei Ma

Object detection through LiDAR-based point clouds has recently become important in autonomous driving. Although they achieve high accuracy on public benchmarks, state-of-the-art detectors may still go wrong and cause heavy losses due to widespread real-world corruptions such as rain, snow, and sensor noise. Nevertheless, there is no large-scale dataset covering diverse scenes and realistic corruption types with different severities for developing practical and robust point cloud detectors, and building one is challenging due to the heavy collection costs. To alleviate this challenge and take a first step toward robust point cloud detection, we propose physical-aware simulation methods to generate degraded point clouds under different real-world common corruptions. Then, as a first attempt, we construct a benchmark based on the physical-aware common corruptions for point cloud detectors, which contains a total of 1,122,150 examples covering 7,481 scenes, 25 common corruption types, and 6 severities. With this novel benchmark, we conduct extensive empirical studies on 8 state-of-the-art detectors that cover 6 different detection frameworks. We thereby obtain several insightful observations that reveal the vulnerabilities of the detectors and indicate directions for enhancement. Moreover, we further study the effectiveness of existing robustness enhancement methods based on data augmentation and data denoising. The benchmark can potentially be a new platform for evaluating point cloud detectors, opening a door for developing novel robustness enhancement methods.

* 16 pages, 6 figures
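
The benchmark size quoted in the abstract can be checked with a few lines of arithmetic. The reading that there is exactly one degraded example per combination of scene, corruption type, and severity is an inference from the numbers, not something the abstract states.

```python
# Figures taken from the abstract.
scenes = 7_481
corruption_types = 25
severities = 6

# Assumed: one degraded example per (scene, corruption type, severity).
total_examples = scenes * corruption_types * severities
print(f"{total_examples:,}")  # 1,122,150
```

The product matches the reported total of 1,122,150 examples.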

Decompiling x86 Deep Neural Network Executables

Oct 04, 2022
Zhibo Liu, Yuanyuan Yuan, Shuai Wang, Xiaofei Xie, Lei Ma

Due to their widespread use on heterogeneous hardware devices, deep learning (DL) models are compiled into executables by DL compilers to fully leverage low-level hardware primitives. This approach allows DL computations to be undertaken at low cost across a variety of computing platforms, including CPUs, GPUs, and various hardware accelerators. We present BTD (Bin to DNN), a decompiler for deep neural network (DNN) executables. BTD takes DNN executables and outputs full model specifications, including types of DNN operators, network topology, dimensions, and parameters that are (nearly) identical to those of the input models. BTD delivers a practical framework to process DNN executables compiled by different DL compilers and with full optimizations enabled on x86 platforms. It employs learning-based techniques to infer DNN operators, dynamic analysis to reveal network architectures, and symbolic execution to facilitate inferring dimensions and parameters of DNN operators. Our evaluation reveals that BTD enables accurate recovery of full specifications of complex DNNs with millions of parameters (e.g., ResNet). The recovered DNN specifications can be re-compiled into a new DNN executable exhibiting identical behavior to the input executable. We show that BTD can boost two representative attacks, adversarial example generation and knowledge stealing, against DNN executables. We also demonstrate cross-architecture legacy code reuse using BTD, and envision BTD being used for other critical downstream tasks like DNN security hardening and patching.

* The extended version of a paper to appear in the Proceedings of the 32nd USENIX Security Symposium, 2023, (USENIX Security '23), 25 pages

DARTSRepair: Core-failure-set Guided DARTS for Network Robustness to Common Corruptions

Sep 21, 2022
Xuhong Ren, Jianlang Chen, Felix Juefei-Xu, Wanli Xue, Qing Guo, Lei Ma, Jianjun Zhao, Shengyong Chen

Network architecture search (NAS), in particular the differentiable architecture search (DARTS) method, has shown great power in learning excellent model architectures on a specific dataset of interest. In contrast to using a fixed dataset, in this work we focus on a different but important scenario for NAS: how to refine a deployed network's model architecture to enhance its robustness, guided by a few collected and misclassified examples that are degraded by unknown real-world corruptions with a specific pattern (e.g., noise, blur, etc.). To this end, we first conduct an empirical study to validate that model architectures are indeed related to the corruption patterns. Surprisingly, by just adding a few corrupted and misclassified examples (e.g., $10^3$ examples) to the clean training dataset (e.g., $5.0 \times 10^4$ examples), we can refine the model architecture and enhance the robustness significantly. To make this more practical, the key problem, i.e., how to select the proper failure examples for effective NAS guidance, should be carefully investigated. We then propose a novel core-failure-set guided DARTS that embeds a K-center-greedy algorithm for DARTS to select suitable corrupted failure examples to refine the model architecture. We apply our method to DARTS-refined DNNs on the clean dataset as well as 15 corruptions, with the guidance of four specific real-world corruptions. Compared with state-of-the-art NAS as well as data-augmentation-based enhancement methods, our final method achieves higher accuracy on both the corrupted datasets and the original clean dataset. On some of the corruption patterns, we achieve absolute accuracy improvements of over 45%.

* To appear in Pattern Recognition (PR)
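
The K-center-greedy selection named in the abstract is a standard farthest-first heuristic, sketched below in plain Python. How the paper embeds it in DARTS, and which feature space it measures distances in, are not given in the abstract, so the 2-D `features` and the `seed_index` parameter here are hypothetical.

```python
import math

def k_center_greedy(points, k, seed_index=0):
    """Pick k indices; each new pick is the point farthest from all picks so far."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    selected = [seed_index]
    # Distance from every point to its nearest selected center.
    min_dist = [math.dist(p, points[seed_index]) for p in points]
    while len(selected) < k:
        # Farthest-first: take the point worst covered by the current centers.
        next_index = max(range(len(points)), key=lambda i: min_dist[i])
        selected.append(next_index)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[next_index]))
    return selected

# Hypothetical 2-D feature embeddings of misclassified, corrupted examples.
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 9.0)]
core_set = k_center_greedy(features, k=3)
print(core_set)  # [0, 4, 3]
```

The selected indices spread across the three clusters instead of duplicating near-identical failures, which is why a small core set can stand in for the full pool of failure examples.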
