Histopathological whole slide image (WSI) analysis with deep learning has become a research focus in computational pathology. The prevailing paradigm is multiple instance learning (MIL), in which Transformer-based approaches are widely discussed. These methods convert WSI tasks into sequence tasks by representing patches as tokens in a WSI sequence. However, the feature complexity caused by high heterogeneity and the ultra-long sequences caused by gigapixel size make Transformer-based MIL suffer from high memory consumption, slow inference, and suboptimal performance. To this end, we propose a retentive MIL method called RetMIL, which processes WSI sequences through a hierarchical feature propagation structure. At the local level, the WSI sequence is divided into multiple subsequences; the tokens of each subsequence are updated through a parallel linear retention mechanism and aggregated with an attention layer. At the global level, the subsequences are fused into a global sequence, updated through a serial retention mechanism, and finally pooled into a slide-level representation by global attention pooling. We conduct experiments on the public CAMELYON and BRACS datasets and a public-internal LUNG dataset, confirming that RetMIL not only achieves state-of-the-art performance but also significantly reduces computational overhead. Our code will be released shortly.
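For intuition, the following minimal PyTorch sketch illustrates the two-level aggregation idea described above. It is our illustration under stated assumptions, not the RetMIL implementation: the module names, the subsequence length, the decay constant, and the gated attention pooling are all illustrative choices.

```python
# Minimal sketch: subsequence tokens are mixed by a decay-masked (retention-style)
# linear attention and pooled locally; pooled subsequence embeddings are then
# mixed and pooled again at the global level to form a slide embedding.
import torch
import torch.nn as nn

class SimpleRetention(nn.Module):
    """Parallel retention with an exponential decay mask (simplified)."""
    def __init__(self, dim, gamma=0.95):
        super().__init__()
        self.q = nn.Linear(dim, dim, bias=False)
        self.k = nn.Linear(dim, dim, bias=False)
        self.v = nn.Linear(dim, dim, bias=False)
        self.gamma = gamma

    def forward(self, x):                      # x: (n, dim)
        n = x.size(0)
        q, k, v = self.q(x), self.k(x), self.v(x)
        idx = torch.arange(n)
        decay = self.gamma ** (idx[:, None] - idx[None, :]).clamp(min=0).float()
        decay = torch.tril(decay)              # causal decay mask D, D_nm = gamma^(n-m)
        return (q @ k.t() * decay) @ v         # (Q K^T ⊙ D) V

class AttnPool(nn.Module):
    """Attention pooling over a token sequence."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, 128), nn.Tanh(), nn.Linear(128, 1))

    def forward(self, x):                      # x: (n, dim)
        a = torch.softmax(self.score(x), dim=0)
        return (a * x).sum(0)                  # (dim,)

dim, sub_len = 512, 256
tokens = torch.randn(4096, dim)               # patch features of one WSI
local_ret, local_pool = SimpleRetention(dim), AttnPool(dim)
global_ret, global_pool = SimpleRetention(dim), AttnPool(dim)

subs = tokens.split(sub_len)                   # local level: subsequences
locals_ = torch.stack([local_pool(local_ret(s)) for s in subs])
slide_emb = global_pool(global_ret(locals_))   # global level: slide representation
print(slide_emb.shape)                         # torch.Size([512])
```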
Histopathological whole slide image (WSI) classification has become a foundational task in medical microscopic image processing. Prevailing approaches learn WSIs as instance-bag representations, emphasizing significant instances but struggling to capture the interactions between instances. Additionally, conventional graph representation methods use explicit spatial positions to construct topological structures, which restricts flexible interaction between instances at arbitrary locations, particularly when they are spatially distant. In response, we propose a novel dynamic graph representation algorithm that conceptualizes WSIs as a form of knowledge graph. Specifically, we dynamically construct neighbors and directed edge embeddings based on the head and tail relationships between instances. We then devise a knowledge-aware attention mechanism that updates each head node's features by learning a joint attention score over its neighbors and edges. Finally, we obtain a graph-level embedding through global pooling of the updated heads, serving as an implicit representation for WSI classification. Our end-to-end graph representation learning approach outperforms state-of-the-art WSI analysis methods on three TCGA benchmark datasets and in-house test sets. Our code is available at https://github.com/WonderLandxD/WiKG.
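The sketch below illustrates the head/tail, dynamic-neighbor, and joint-attention idea in PyTorch. It is a hedged reconstruction, not the released WiKG code: the neighbor count, the subtraction-based edge embedding, and the residual update are illustrative assumptions.

```python
# Patch features are graph nodes; neighbors are chosen dynamically by head-to-tail
# affinity, directed edge embeddings are formed from the head/tail projections,
# and a joint attention over (neighbor, edge) updates each head node before pooling.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicGraphLayer(nn.Module):
    def __init__(self, dim, k=6):
        super().__init__()
        self.head = nn.Linear(dim, dim)
        self.tail = nn.Linear(dim, dim)
        self.attn = nn.Linear(2 * dim, 1)        # joint score of neighbor + edge
        self.k = k

    def forward(self, x):                         # x: (n, dim) instance embeddings
        h, t = self.head(x), self.tail(x)
        with torch.no_grad():                     # dynamic neighbor selection
            sim = h @ t.t()                       # head-to-tail affinity
            sim.fill_diagonal_(float('-inf'))     # exclude self-loops
            idx = sim.topk(self.k, dim=-1).indices
        nbr = t[idx]                              # (n, k, dim) tail features
        edge = nbr - h.unsqueeze(1)               # directed edge embeddings
        a = F.softmax(self.attn(torch.cat([nbr, edge], -1)), dim=1)
        return x + (a * nbr).sum(1)               # updated head node features

x = torch.randn(1000, 384)                        # instance embeddings of one WSI
layer = DynamicGraphLayer(384)
graph_emb = layer(x).mean(0)                      # global pooling -> WSI embedding
print(graph_emb.shape)                            # torch.Size([384])
```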
Three-dimensional (3D) freehand ultrasound (US) is a widely used imaging modality that allows non-invasive imaging of anatomy without radiation exposure. Freehand 3D US surface reconstruction is vital for acquiring the accurate anatomical structures needed for modeling, registration, and visualization. However, the traditional methods currently in use cannot produce high-quality surfaces due to imaging noise and connectivity issues in US. Although deep learning-based approaches have shown improvements in smoothness, continuity, and resolution, their investigation in freehand 3D US remains limited. In this study, we introduce a self-supervised neural implicit surface reconstruction method that learns signed distance functions (SDFs) from freehand 3D US volumetric point clouds. In particular, our method iteratively learns the SDFs by moving 3D queries sampled around the point clouds toward the surface, assisted by two novel geometric constraints. We assess our method on three imaging systems using twenty-three shapes, comprising six distinct anthropomorphic phantom datasets and seventeen in vivo carotid artery datasets. On the phantoms, our method outperforms the existing approach with a 67% reduction in Chamfer distance, a 60% reduction in Hausdorff distance, and a 61% reduction in average absolute distance. Furthermore, it achieves a 0.92 Dice score on the in vivo datasets and demonstrates great clinical potential.
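To make the "moving 3D queries toward the surface" step concrete, here is a hedged PyTorch sketch in the spirit of Neural-Pull-style self-supervision; it is not the paper's exact formulation and omits the two geometric constraints. The network size, noise scale, and loss are illustrative assumptions.

```python
# An MLP predicts a signed distance; queries sampled near the US point cloud are
# pulled along the normalized SDF gradient so they land on their nearest points.
import torch
import torch.nn as nn

class SDFNet(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, q):
        return self.mlp(q).squeeze(-1)            # signed distance per query

def pull_loss(net, points, queries):
    queries = queries.requires_grad_(True)
    sdf = net(queries)
    grad = torch.autograd.grad(sdf.sum(), queries, create_graph=True)[0]
    grad = grad / (grad.norm(dim=-1, keepdim=True) + 1e-8)
    moved = queries - sdf.unsqueeze(-1) * grad    # pull queries onto the surface
    nearest = points[torch.cdist(queries, points).argmin(dim=-1)]
    return ((moved - nearest) ** 2).sum(-1).mean()

points = torch.rand(2048, 3)                       # US volumetric point cloud
queries = points + 0.02 * torch.randn_like(points) # 3D queries around the cloud
net = SDFNet()
loss = pull_loss(net, points, queries)
loss.backward()
```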
Due to its efficient use of annotations and its ability to handle gigapixel-sized images, multiple instance learning (MIL) has shown great promise as a framework for whole slide image (WSI) classification in digital pathology diagnosis. However, existing methods tend to focus on advanced aggregators with different structures, often overlooking the intrinsic features of H&E pathological slides. To address this limitation, we introduce two pathological priors: the nuclear heterogeneity of diseased cells and the spatial correlation of pathological tiles. Leveraging the former, we propose a data augmentation method that exploits stain separation during extractor training via a contrastive learning strategy to obtain instance-level representations. We then describe the spatial relationships between tiles using an adjacency matrix. By integrating these two views, we design a multi-instance framework for analyzing H&E-stained tissue images based on pathological inductive bias, encompassing feature extraction, filtering, and aggregation. Extensive experiments on the Camelyon16 breast dataset and the TCGA-NSCLC lung dataset demonstrate that our framework effectively handles cancer detection and subtype differentiation, outperforming state-of-the-art MIL-based medical image classification methods. The code will be released later.
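The two priors can be illustrated with a short NumPy/scikit-image sketch. This is our own illustration under stated assumptions, not the paper's pipeline: the per-stain jitter strength and the 8-neighborhood adjacency rule are hypothetical choices; `rgb2hed`/`hed2rgb` are standard skimage stain-deconvolution utilities.

```python
# (1) A stain-separation view for contrastive pretraining via H&E (HED)
#     deconvolution; (2) a spatial adjacency matrix linking tiles whose grid
#     coordinates are neighbors.
import numpy as np
from skimage.color import rgb2hed, hed2rgb

def stain_jitter_view(tile_rgb, sigma=0.05):
    """Perturb the haematoxylin/eosin/DAB channels to build a contrastive view."""
    hed = rgb2hed(tile_rgb)
    hed *= 1.0 + sigma * np.random.randn(3)        # per-stain scaling
    return np.clip(hed2rgb(hed), 0, 1)

def spatial_adjacency(coords):
    """coords: (n, 2) integer tile grid positions; 8-connected neighborhood."""
    cheb = np.abs(coords[:, None, :] - coords[None, :, :]).max(-1)
    return (cheb == 1).astype(np.float32)          # 1 iff tiles are adjacent

tile = np.random.rand(224, 224, 3)                 # a stand-in RGB tile
view = stain_jitter_view(tile)                     # augmented contrastive view
coords = np.stack(np.meshgrid(np.arange(4), np.arange(4)), -1).reshape(-1, 2)
A = spatial_adjacency(coords)                      # (16, 16) adjacency matrix
```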
Objective: The objective of this study is to develop a deep learning-based detection and diagnosis technique for carotid atherosclerosis using a portable freehand 3D ultrasound (US) imaging system. Methods: A total of 127 3D carotid artery datasets were acquired using a portable 3D US imaging system. A U-Net segmentation network was first applied to extract the carotid artery on 2D transverse frames; then a novel 3D reconstruction algorithm using the fast dot projection (FDP) method with position regularization was proposed to reconstruct the carotid artery volume. Furthermore, a convolutional neural network was used to qualitatively classify healthy and diseased cases. 3D volume analysis, including a longitudinal reprojection algorithm and a stenosis grade measurement algorithm, was developed to obtain clinical metrics quantitatively. Results: The proposed system achieved a sensitivity of 0.714, a specificity of 0.851, and an accuracy of 0.803 in the diagnosis of carotid atherosclerosis. The automatically measured stenosis grade showed good correlation (r = 0.762) with measurements by an experienced expert. Conclusion: The developed technique based on 3D US imaging can be applied to the automatic diagnosis of carotid atherosclerosis. Significance: The proposed deep learning-based technique was specially designed for a portable 3D freehand US system, making carotid atherosclerosis examination more convenient and reducing dependence on clinicians' experience.
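As a hedged illustration of the quantitative metric step (not the paper's exact measurement algorithm), a NASCET-style stenosis grade compares the narrowest residual lumen diameter along the reconstructed vessel with a distal "normal" reference diameter; the values below are made up.

```python
import numpy as np

def stenosis_grade(lumen_diameters_mm, reference_mm):
    """Percent stenosis = (1 - d_min / d_reference) * 100 (NASCET-style)."""
    d_min = float(np.min(lumen_diameters_mm))
    return max(0.0, (1.0 - d_min / reference_mm) * 100.0)

diameters = np.array([6.1, 5.8, 3.2, 2.9, 5.5, 6.0])   # mm, along the vessel
print(f"stenosis grade: {stenosis_grade(diameters, reference_mm=6.0):.1f}%")
```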
We propose and analyze a product segmentation newsvendor problem, which generalizes the phenomenon of segmentation sales for a class of perishable items. The product segmentation newsvendor problem is a new variant of the newsvendor problem, reflecting that sellers maximize profit by determining the inventory of the whole item under uncertain demand for its sub-items. We derive a closed-form robust ordering decision by assuming that the means and covariance matrix of the stochastic demand are available but the distributions are not. However, robust approaches that always hedge against the worst-case demand scenario raise concerns about solution conservatism, so traditional robust schemes perform unsatisfactorily. In this paper, we integrate robust optimization and deep reinforcement learning (DRL) techniques and propose a new paradigm, termed robust learning, to increase the attractiveness of robust policies. Notably, we treat the robust decision as human domain knowledge and incorporate it into the DRL training process by designing a full-process human-machine collaborative mechanism of teaching experience, normative decision, and regularization return. Simulation results confirm that our approach effectively improves robust performance and can generalize to various problems that require robust but less conservative solutions. At the same time, fewer training episodes, increased training stability, and interpretability of behavior may facilitate the deployment of DRL algorithms in operational practice. Furthermore, the successful application of RLDQN to 1000-dimensional demand scenarios shows that the algorithm provides a path to solving complex operational problems through human-machine collaboration and may also be significant for other complex operations management problems.
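For context on distribution-free ordering decisions of this kind: the classical single-product counterpart is Scarf's rule, which uses only the demand mean and standard deviation. This is the textbook single-item result, not the paper's multi-product decision with a covariance matrix.

```latex
% Scarf's distribution-free ordering rule for a single product, with demand mean
% \mu, standard deviation \sigma, unit underage cost c_u and unit overage cost c_o:
Q^{\ast} \;=\; \mu \;+\; \frac{\sigma}{2}\left(\sqrt{\frac{c_u}{c_o}} \;-\; \sqrt{\frac{c_o}{c_u}}\right)
```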
Learning self-supervised image representations has been broadly studied to boost various visual understanding tasks. Existing methods typically learn a single level of image semantics, such as pairwise semantic similarity or image clustering patterns. However, these methods can hardly capture the multiple levels of semantic information that naturally exist in an image dataset, e.g., the semantic hierarchy of "Persian cat to cat to mammal" encoded in an image database of species. It is thus unknown whether an arbitrary image self-supervised learning (SSL) approach can benefit from learning such hierarchical semantics. To answer this question, we propose a general framework for Hierarchical Image Representation Learning (HIRL). The framework learns multiple semantic representations for each image, structured to encode image semantics from fine-grained to coarse-grained. Based on a probabilistic factorization, HIRL learns the most fine-grained semantics with an off-the-shelf image SSL approach and learns multiple coarse-grained semantics with a novel semantic path discrimination scheme. We adopt six representative image SSL methods as baselines and study how they perform under HIRL. Under rigorous fair comparison, performance gains are observed for all six methods on diverse downstream tasks, which, for the first time, verifies the general effectiveness of learning hierarchical image semantics. All source code and model weights are available at https://github.com/hirl-team/HIRL.
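The following scikit-learn sketch shows only how fine-to-coarse semantic paths can be attached on top of existing SSL embeddings; it is a hedged illustration, not the HIRL algorithm (the cluster counts and the two-stage k-means construction are assumptions, and the path discrimination loss is omitted).

```python
# Cluster fine-grained SSL embeddings, then cluster the fine centroids again,
# so each image receives a fine-to-coarse "semantic path".
import numpy as np
from sklearn.cluster import KMeans

emb = np.random.randn(10000, 128).astype(np.float32)    # SSL image embeddings
fine = KMeans(n_clusters=100, n_init=10).fit(emb)        # fine-grained semantics
coarse = KMeans(n_clusters=10, n_init=10).fit(fine.cluster_centers_)

fine_id = fine.labels_                                    # per-image fine cluster
coarse_id = coarse.labels_[fine_id]                       # per-image coarse cluster
paths = np.stack([fine_id, coarse_id], axis=1)            # fine -> coarse path
print(paths[:5])
```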
Hierarchical semantic structures naturally exist in an image dataset, in which several semantically relevant image clusters can be further integrated into a larger cluster with coarser-grained semantics. Capturing such structures with image representations can greatly benefit semantic understanding on various downstream tasks. Existing contrastive representation learning methods lack this important capability. In addition, the negative pairs used in these methods are not guaranteed to be semantically distinct, which can further hamper the structural correctness of the learned image representations. To tackle these limitations, we propose a novel contrastive learning framework called Hierarchical Contrastive Selective Coding (HCSC). In this framework, a set of hierarchical prototypes is constructed and dynamically updated to represent the hierarchical semantic structures underlying the data in the latent space. To make image representations better fit such semantic structures, we employ and further improve conventional instance-wise and prototypical contrastive learning via an elaborate pair selection scheme, which selects more diverse positive pairs with similar semantics and more precise negative pairs with truly distinct semantics. On extensive downstream tasks, we verify the superior performance of HCSC over state-of-the-art contrastive methods, and the effectiveness of its major components is validated by extensive analytical studies. Our source code and model weights are available at https://github.com/gyfastas/HCSC.
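A hedged PyTorch sketch of prototype-guided negative selection is shown below; it is illustrative rather than the released HCSC code, and the dimensions, prototype count, and hard-assignment rule are assumptions.

```python
# Candidate negatives that fall into the same prototype (cluster) as the query are
# filtered out, so the remaining negatives are more likely to be truly distinct.
import torch
import torch.nn.functional as F

def select_negatives(query, candidates, prototypes):
    """query: (d,), candidates: (n, d), prototypes: (k, d); all L2-normalized."""
    q_proto = (query @ prototypes.t()).argmax()            # query's prototype
    c_proto = (candidates @ prototypes.t()).argmax(dim=1)  # candidates' prototypes
    keep = c_proto != q_proto                               # drop same-semantics negatives
    return candidates[keep]

d, n, k = 128, 4096, 50
query = F.normalize(torch.randn(d), dim=0)
cands = F.normalize(torch.randn(n, d), dim=1)
protos = F.normalize(torch.randn(k, d), dim=1)
negs = select_negatives(query, cands, protos)
print(negs.shape)                                           # fewer than 4096 negatives
```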
It is difficult for humans to judge whether rumors are true or false, yet current deep learning models can surpass humans and achieve excellent accuracy on many rumor datasets. In this paper, we investigate whether deep learning models that seem to perform well actually learn to detect rumors. We evaluate the models' generalization to out-of-domain examples by fine-tuning BERT-based models on five real-world datasets and evaluating them against all test sets. The experimental results indicate that the generalization of these models to unseen datasets is unsatisfactory, and even common-sense rumors cannot be detected. Moreover, our experiments show that models take shortcuts and learn absurd knowledge when the rumor datasets have serious data pitfalls: simple rule-based modifications of the rumor text lead to inconsistent model predictions. To evaluate rumor detection models more realistically, we propose a new evaluation method called the paired test (PairT), which requires a model to correctly predict both samples of a test pair at the same time. Finally, we make recommendations on how to better create rumor datasets and evaluate rumor detection models.
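A minimal sketch of the paired-test idea follows; the function name and toy data are illustrative, not taken from the paper.

```python
# A model scores a hit only when it predicts both samples of a test pair correctly,
# which penalizes shortcut behavior that flips after a rule-based text modification.
def pair_test_accuracy(preds, labels):
    """preds/labels: lists of (original, modified) prediction/label pairs."""
    hits = sum(p1 == l1 and p2 == l2 for (p1, p2), (l1, l2) in zip(preds, labels))
    return hits / len(labels)

preds  = [(1, 1), (0, 1), (1, 0)]
labels = [(1, 1), (0, 0), (1, 0)]
print(pair_test_accuracy(preds, labels))   # 2/3: the second pair fails
```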
Adversarial training has been proven to be a powerful regularization method for improving the generalization of models. However, current adversarial training methods attack only the original input sample or the embedding vectors, so their attacks lack coverage and diversity. To further enhance the breadth and depth of attack, we propose a novel masked weight adversarial training method called DropAttack, which enhances model generalization by adding intentionally worst-case adversarial perturbations to both the input and hidden layers in different dimensions and minimizing the adversarial risks generated by each layer. DropAttack is a general technique and can be applied to a wide variety of neural network architectures. To validate the effectiveness of the proposed method, we used five public datasets from natural language processing (NLP) and computer vision (CV) for experimental evaluation. We compare the proposed method with other adversarial training and regularization methods, and our method achieves state-of-the-art performance on all datasets. In addition, DropAttack can achieve the same performance using only half the training data required by standard training. Theoretical analysis reveals that DropAttack performs gradient regularization on a random subset of the model's input and weight parameters. Further visualization experiments show that DropAttack pushes the minimum risk of the model to lower and flatter regions of the loss landscape. Our source code is publicly available at https://github.com/nishiwen1214/DropAttack.
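The PyTorch sketch below shows one training step of masked adversarial perturbation in the spirit of the method above; it is a hedged illustration rather than the released DropAttack code, and the attack strength, mask probability, and choice of perturbed layer are assumptions.

```python
# Random Bernoulli masks select a subset of input and weight coordinates, a scaled
# gradient step perturbs them, and the adversarial loss is added to the clean one.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(16, 64), torch.randint(0, 2, (16,))
eps, p_attack = 0.5, 0.7          # attack strength and mask probability (illustrative)

optimizer.zero_grad()
x_adv = x.clone().requires_grad_(True)
clean_loss = F.cross_entropy(model(x_adv), y)
clean_loss.backward()             # gradients drive both the attack and the update

w = model[0].weight
mask_x = torch.bernoulli(torch.full_like(x, p_attack))
mask_w = torch.bernoulli(torch.full_like(w, p_attack))
delta_x = eps * mask_x * x_adv.grad / (x_adv.grad.norm() + 1e-8)
delta_w = eps * mask_w * w.grad / (w.grad.norm() + 1e-8)

with torch.no_grad():
    w += delta_w                  # temporarily perturb a random subset of weights
adv_loss = F.cross_entropy(model(x + delta_x), y)
adv_loss.backward()               # accumulate the adversarial gradients
with torch.no_grad():
    w -= delta_w                  # restore the clean weights

optimizer.step()                  # one step on the clean + adversarial risk
```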