Differentiable architecture search (DARTS) has attracted much attention due to its simplicity and significant improvement in efficiency. However, the excessive accumulation of skip connections makes it suffer from weak long-term stability and low robustness. Many works attempt to restrict the accumulation of skip connections through indicators or manual design; however, these methods are sensitive to thresholds and human priors. In this work, we suggest a more subtle and direct approach that removes skip connections from the operation space. Then, by introducing an adaptive channel allocation strategy, we redesign the DARTS framework to automatically refill the skip connections in the evaluation stage, resolving the performance degradation caused by the absence of skip connections. Our method, dubbed Adaptive-Channel-Allocation-DARTS (ACA-DARTS), can eliminate the inconsistency in operation strength and significantly expand the architecture diversity. We further explore smaller search spaces under our framework and offer a direct search on the entire ImageNet dataset. Experiments show that ACA-DARTS improves search stability and speeds up DARTS by more than ten times while yielding higher accuracy.
The isocitrate dehydrogenase (IDH) gene mutation is an essential biomarker for the diagnosis and prognosis of glioma. Integrating focal tumor image and geometric features with brain network features derived from MRI is a promising way to better predict glioma genotype. Convolutional neural networks show reasonable performance in predicting IDH mutation; however, they cannot learn from non-Euclidean data, e.g., geometric and network data. In this study, we propose a multi-modal learning framework using three separate encoders to extract features from focal tumor images, tumor geometrics, and global brain networks. To mitigate the limited availability of diffusion MRI, we develop a self-supervised approach to generate brain networks from anatomical multi-sequence MRI. Moreover, to extract tumor-related features from the brain network, we design a hierarchical attention module for the brain network encoder. Further, we design a bi-level multi-modal contrastive loss to align the multi-modal features and tackle the domain gap at the focal tumor and global brain. Finally, we propose a weighted population graph to integrate the multi-modal features for genotype prediction. Experimental results on the testing set show that the proposed model outperforms the baseline deep learning models. The ablation experiments validate the performance of different components of the framework. The visualized interpretation corresponds to clinical knowledge with further validation. In conclusion, the proposed learning framework provides a novel approach for predicting the genotype of glioma.
Grammatical Error Correction (GEC) aims to automatically detect and correct grammatical errors. In this task, dominant models are trained by one-iteration learning while performing multiple iterations of corrections during inference. Previous studies mainly focus on data augmentation to combat the exposure bias, an approach that suffers from two drawbacks. First, they simply mix additionally constructed training instances with the original ones to train models, which fails to make models explicitly aware of the procedure of gradual corrections. Second, they ignore the interdependence between different types of corrections. In this paper, we propose a Type-Driven Multi-Turn Corrections approach for GEC. Using this approach, from each training instance, we additionally construct multiple training instances, each of which involves the correction of a specific type of error. Then, we use these additionally constructed training instances and the original one to train the model in turn. Experimental results and in-depth analysis show that our approach significantly benefits the model training. In particular, our enhanced model achieves state-of-the-art single-model performance on English GEC benchmarks. We release our code on GitHub.
Content addressable memory (CAM) is widely used in associative search tasks for its highly parallel pattern matching capability. To accommodate increasingly complex and data-intensive pattern matching tasks, it is critical to keep improving the CAM density to enhance performance and area efficiency. In this work, we demonstrate: i) a novel ultra-compact 1FeFET CAM design that enables parallel associative search and in-memory Hamming distance calculation; ii) a multi-bit CAM for exact search using the same CAM cell; iii) compact device designs that integrate the series resistor current limiter into the intrinsic FeFET structure to turn the 1FeFET1R into an effective 1FeFET cell; iv) a successful 2-step search operation and a sufficient sensing margin of the proposed binary and multi-bit 1FeFET1R CAM array with sizes of practical interest in both experiments and simulations, given the existing unoptimized FeFET device variation; v) 89.9x speedup and 66.5x energy efficiency improvement over the state-of-the-art alignment tools on GPU in accelerating genome pattern matching applications through the hyperdimensional computing paradigm.
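The associative search with Hamming distance mentioned in the abstract above can be sketched in software as follows. This is only an illustration of the search semantics, not of the 1FeFET circuit: a hardware CAM compares every stored row in parallel, whereas this sketch loops over rows. The names `hamming_distance` and `cam_search` and the example bit patterns are hypothetical.

```python
from typing import List, Tuple


def hamming_distance(a: str, b: str) -> int:
    """Count the bit positions where two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("words must have equal length")
    return sum(x != y for x, y in zip(a, b))


def cam_search(stored: List[str], query: str) -> Tuple[int, int]:
    """Return (row index, distance) of the stored word closest to the query.

    Sequential stand-in for a parallel CAM lookup.
    Ties are resolved in favour of the lowest row index.
    """
    distances = [hamming_distance(word, query) for word in stored]
    best = min(range(len(stored)), key=distances.__getitem__)
    return best, distances[best]


# Hypothetical stored words and query, for illustration only.
stored_words = ["1010", "1100", "0111"]
best_row, best_dist = cam_search(stored_words, "1110")
print(best_row, best_dist)
```

An exact-match search is the special case in which only rows at distance zero count as hits.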
In this paper, a unified transformation method in learned image compression (LIC) is proposed from the perspective of modulation. Firstly, the quantization in LIC is considered as a generalized channel with additive uniform noise. Moreover, LIC is interpreted as a particular communication system according to the consistency in structures and optimization objectives. Thus, the technology of communication systems can be applied to guide the design of modules in LIC. Furthermore, a unified transform method based on signal modulation (TSM) is defined. In the view of TSM, the existing transformation methods are mathematically reduced to a linear modulation. A series of transformation methods, e.g., TPM and TJM, are obtained by extending to nonlinear modulation. The experimental results on various datasets and backbone architectures verify the effectiveness and robustness of the proposed method. More importantly, they further confirm the feasibility of guiding LIC design from a communication perspective. For example, when the backbone architecture is a hyperprior combined with a context model, our method achieves a 3.52$\%$ BD-rate reduction over GDN on the Kodak dataset without increasing complexity.
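The view of quantization as a channel with additive uniform noise can be illustrated with a small sketch. It does not implement TSM, TPM, or TJM. It only contrasts hard rounding with the additive-noise proxy commonly used when training learned image compression models, assuming a unit quantization step. The function names and the example latent values are hypothetical.

```python
import random
from typing import List


def quantize_hard(latent: List[float]) -> List[float]:
    """Inference-time quantization: round each latent to the nearest integer."""
    return [float(round(v)) for v in latent]


def quantize_noisy(latent: List[float], rng: random.Random) -> List[float]:
    """Training-time proxy: add noise drawn uniformly from [-0.5, 0.5].

    The perturbation has the same range as the rounding error, which is why
    the quantizer can be treated as an additive uniform noise channel.
    """
    return [v + rng.uniform(-0.5, 0.5) for v in latent]


# Hypothetical latent values, for illustration only.
latent = [0.2, 1.7, -2.4, 3.49]
rng = random.Random(0)
hard = quantize_hard(latent)
noisy = quantize_noisy(latent, rng)

# Both outputs stay within half a quantization step of the input.
print([abs(h - v) <= 0.5 for h, v in zip(hard, latent)])
print([abs(n - v) <= 0.5 for n, v in zip(noisy, latent)])
```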
Alzheimer's disease (AD) is the most common age-related dementia. Mild cognitive impairment (MCI) is the early stage of cognitive decline before AD. It is crucial to predict the MCI-to-AD conversion for precise management, which remains challenging due to the diversity of patients. Previous evidence shows that brain networks generated from diffusion MRI are promising for classifying dementia using deep learning. However, the limited availability of diffusion MRI makes model training challenging. In this study, we develop a self-supervised contrastive learning approach to generate structural brain networks from routine anatomical MRI under the guidance of diffusion MRI. The generated brain networks are applied to train a learning framework for predicting the MCI-to-AD conversion. Instead of directly modelling the AD brain networks, we train a graph encoder and a variational autoencoder to model the healthy ageing trajectories from brain networks of healthy controls. To predict the MCI-to-AD conversion, we further design a recurrent neural network-based approach to model the longitudinal deviation of patients' brain networks from the healthy ageing trajectory. Numerical results show that the proposed methods outperform the benchmarks in the prediction task. We also visualize the model interpretation to explain the prediction and identify abnormal changes of white matter tracts.
Whole slide images (WSI) provide valuable phenotypic information for histological assessment and malignancy grading of tumors. WSI-based computational pathology promises to provide rapid diagnostic support and facilitate digital health. The most commonly used WSI are derived from formalin-fixed paraffin-embedded (FFPE) and frozen sections. Currently, the majority of automatic tumor grading models are developed based on FFPE sections, which could be affected by the artifacts introduced by tissue processing. Here we propose a mutual contrastive learning scheme to integrate FFPE and frozen sections and disentangle cross-modality representations for glioma grading. We first design a mutual learning scheme to jointly optimize the model training based on FFPE and frozen sections. Further, we develop a multi-modality domain alignment mechanism to ensure semantic consistency in the backbone model training. Finally, we design a sphere normalized temperature-scaled cross-entropy loss (NT-Xent), which could promote cross-modality representation disentangling of FFPE and frozen sections. Our experiments show that the proposed scheme achieves better performance than models trained on a single modality or on mixed modalities. The sphere NT-Xent loss outperforms other typical metric loss functions.
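For readers unfamiliar with NT-Xent, the following is a minimal sketch of the standard normalized temperature-scaled cross-entropy loss on cosine similarities. It is not the sphere variant proposed in the abstract above, whose details are not given here. The function names, the temperature value, and the toy two-dimensional embeddings (imagined as FFPE and frozen views of two slides) are assumptions.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent(embeddings: List[Sequence[float]],
            positives: List[int],
            temperature: float = 0.5) -> float:
    """Mean standard NT-Xent loss; positives[i] is the index of i's positive.

    For each anchor i the loss is
    -log(exp(s_ij / t) / sum over k != i of exp(s_ik / t)).
    """
    n = len(embeddings)
    total = 0.0
    for i in range(n):
        logits = [cosine(embeddings[i], embeddings[k]) / temperature
                  for k in range(n) if k != i]
        pos_logit = cosine(embeddings[i], embeddings[positives[i]]) / temperature
        total += math.log(sum(math.exp(l) for l in logits)) - pos_logit
    return total / n


# Toy embeddings: indices 0/1 are two views of slide A, 2/3 of slide B.
views = [(1.0, 0.0), (1.0, 0.1), (0.0, 1.0), (0.1, 1.0)]
aligned = nt_xent(views, positives=[1, 0, 3, 2])
misaligned = nt_xent(views, positives=[2, 3, 0, 1])
print(aligned < misaligned)
```

The loss is lower when each embedding is paired with the view it actually resembles, which is the behaviour a contrastive objective relies on.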
Neural architecture search (NAS) can help find robust network architectures, and defining robustness evaluation metrics is an important step in this process. However, current robustness evaluations in NAS are not sufficiently comprehensive and reliable. In particular, the common practice only considers adversarial noise and quantified metrics such as the Jacobian matrix, whereas some studies indicate that models are also vulnerable to other types of noise, such as natural noise. In addition, existing methods that use adversarial noise for evaluation rely only on the robust accuracy under FGSM or PGD, but these adversarial attacks do not provide a sufficiently reliable evaluation, leaving the models vulnerable to stronger attacks. To alleviate the above problems, we propose a novel framework, called Auto Adversarial Attack and Defense (AAAD), in which we employ neural architecture search methods and consider four types of robustness evaluations, including adversarial noise, natural noise, system noise, and quantified metrics, thereby assisting in finding more robust architectures. Also, for the adversarial noise, we use a composite adversarial attack obtained by random search as the new metric to evaluate the robustness of the model architectures. The empirical results on the CIFAR10 dataset show that the searched efficient attack could help find more robust architectures.
Chinese Spell Checking (CSC) aims to detect and correct Chinese spelling errors, which are mainly caused by phonological or visual similarity. Recently, pre-trained language models (PLMs) have promoted progress on the CSC task. However, there exists a gap between the learned knowledge of PLMs and the goal of the CSC task. PLMs focus on the semantics of text and tend to correct erroneous characters to semantically proper or commonly used ones, but these are not the ground-truth corrections. To address this issue, we propose an Error-driven COntrastive Probability Optimization (ECOPO) framework for the CSC task. ECOPO refines the knowledge representations of PLMs and guides the model to avoid predicting these common characters in an error-driven way. In particular, ECOPO is model-agnostic and can be combined with existing CSC methods to achieve better performance. Extensive experiments and detailed analyses on SIGHAN datasets demonstrate that ECOPO is simple yet effective.
Neural image compression has matched or outperformed traditional methods (such as JPEG, BPG, and WebP). However, its sophisticated network structures with cascaded convolution layers impose a heavy computational burden on practical deployment. In this paper, we explore the structural sparsity in neural image compression networks to obtain real-time acceleration without any specialized hardware design or algorithm. We propose a simple plug-in adaptive binary channel masking (ABCM) to judge the importance of each convolution channel and introduce sparsity during training. During inference, the unimportant channels are pruned to obtain a slimmer network and less computation. We implement our method in three neural image compression networks with different entropy models to verify its effectiveness and generalization. The experimental results show that up to 7x computation reduction and 3x acceleration can be achieved with a negligible performance drop.
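The general idea of masking channels during training and pruning them at inference can be sketched as follows. This is not the authors' ABCM: how the importance scores are learned and made adaptive is not described in the abstract, so the scores, the zero threshold, and all function names here are assumptions made for illustration.

```python
from typing import List, Sequence


def binary_channel_mask(scores: Sequence[float], threshold: float = 0.0) -> List[int]:
    """Map per-channel importance scores to a 0/1 mask (1 = keep)."""
    return [1 if s > threshold else 0 for s in scores]


def apply_mask(channels: List[List[float]], mask: List[int]) -> List[List[float]]:
    """Training-time view: zero out masked channels but keep the shape."""
    return [[v * m for v in ch] for ch, m in zip(channels, mask)]


def prune(channels: List[List[float]], mask: List[int]) -> List[List[float]]:
    """Inference-time view: physically drop the masked channels."""
    return [list(ch) for ch, m in zip(channels, mask) if m == 1]


# Hypothetical scores and a tiny 4-channel feature map, for illustration only.
scores = [0.8, -0.3, 0.1, -1.2]
feature_map = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]

mask = binary_channel_mask(scores)
masked = apply_mask(feature_map, mask)
slim = prune(feature_map, mask)
print(mask, len(slim))
```

Dropping a channel removes both the filter that produces it and the matching input slice of the next layer, which is where the computation savings of structured pruning come from.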