Model-based reinforcement learning (MBRL) attempts to use an available or learned model to improve the data efficiency of reinforcement learning. This work proposes a one-step lookback approach that jointly learns the latent-space model and the policy to realize sample-efficient continuous robotic control, wherein control-theoretic knowledge is used to reduce the difficulty of model learning. Specifically, the so-called one-step backward data is used to build an incremental evolution model, an alternative structured representation of the robot evolution model in MBRL. The incremental evolution model predicts robot motion accurately yet with low sample complexity, because it reduces model learning to a parametric matrix learning problem, which is especially favourable for high-dimensional robotics applications. Imagined data generated by the learned incremental evolution model supplements the training data to enhance sample efficiency. Comparative numerical simulations on benchmark continuous robotic control problems validate the effectiveness of the proposed one-step lookback approach.
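A minimal sketch of the idea described above, under the assumption that the incremental evolution model takes a linear form Δs_{t+1} ≈ A Δs_t + B a_t, where Δs_t = s_t − s_{t−1} is the one-step-backward increment; the matrices A and B then become a least-squares (parametric matrix) learning problem, and the fitted model is used for imagined rollouts. The function names, the linear form, and the `policy` callable are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: incremental evolution model as parametric matrix learning,
# assuming the linear form  s_{t+1} - s_t ≈ A (s_t - s_{t-1}) + B a_t.
import numpy as np

def fit_incremental_model(states, actions):
    """states: (T, ds) array, actions: (T-1, da) array from one trajectory."""
    ds_prev = states[1:-1] - states[:-2]      # one-step-backward increments Δs_t
    ds_next = states[2:] - states[1:-1]       # target increments Δs_{t+1}
    X = np.hstack([ds_prev, actions[1:]])     # regressors [Δs_t, a_t]
    # Solve ds_next ≈ X @ W for the parametric matrix W = [A | B]^T
    W, *_ = np.linalg.lstsq(X, ds_next, rcond=None)
    d = ds_prev.shape[1]
    A, B = W[:d].T, W[d:].T
    return A, B

def imagine_rollout(A, B, s_prev, s_curr, policy, horizon=5):
    """Generate imagined transitions to supplement the real training data."""
    transitions = []
    for _ in range(horizon):
        a = policy(s_curr)
        s_next = s_curr + A @ (s_curr - s_prev) + B @ a
        transitions.append((s_curr, a, s_next))
        s_prev, s_curr = s_curr, s_next
    return transitions
```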
Despite the inherent limitations of existing Large Language Models in directly reading and interpreting single-cell omics data, they demonstrate significant potential and flexibility as foundation models. This research focuses on how to train and adapt Large Language Models to interpret and distinguish cell types in single-cell RNA sequencing data. Our preliminary results indicate that these foundation models excel at accurately categorizing known cell types, demonstrating the potential of Large Language Models as effective tools for uncovering new biological insights.
Representation learning frameworks for unlabeled time series have been proposed for medical signal processing. Despite the considerable progress made in previous works, we observe that the representations extracted from time series still do not generalize well. In this paper, we present a Time series (medical signal) Representation Learning framework via Spectrogram (TRLS) to obtain more informative representations. We transform the input time-domain medical signals into spectrograms and design a time-frequency encoder named Time Frequency RNN (TFRNN) to capture more robust multi-scale representations from the augmented spectrograms. TRLS takes spectrograms produced by two different data augmentations as input and maximizes the similarity between positive pairs, which effectively circumvents the problem of designing negative samples. Our evaluation on four real-world medical signal datasets focusing on medical signal classification shows that TRLS is superior to existing frameworks.
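To make the positive-only objective concrete, the PyTorch-style sketch below encodes two augmented spectrogram views of the same signal and maximizes their cosine similarity, with no negative samples. The `encoder` and `projector` callables stand in for the paper's TFRNN and projection head and are assumptions; frameworks of this kind typically also rely on a stop-gradient or momentum target to avoid representational collapse.

```python
# Hedged sketch of a negative-free positive-pair objective over spectrogram views.
import torch
import torch.nn.functional as F

def positive_pair_loss(encoder, projector, spec_aug1, spec_aug2):
    """spec_aug1/2: (batch, freq, time) spectrograms from two augmentations."""
    z1 = projector(encoder(spec_aug1))   # representation of view 1
    z2 = projector(encoder(spec_aug2))   # representation of view 2
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    # Maximize cosine similarity between the positive pairs only.
    return -(z1 * z2).sum(dim=-1).mean()
```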
Expecting intelligent machines to work efficiently in the real world requires a new method for understanding unstructured information in unknown environments with good accuracy, scalability, and generalization, as humans do. Here, a memristive-neural-computing-based method for perceptual signal differential processing and learning in intelligent machines is presented. By extracting the main features of environmental information and applying associated encoded stimuli to memristors, we obtain human-like ability in processing unstructured environmental information, such as amplification (>720%) and adaptation (<50%) of mechanical stimuli. The method also exhibits good scalability and generalization, validated in two typical applications of intelligent machines: object grasping and autonomous driving. In the former, a robot hand experimentally realizes safe and stable grasping by learning unknown object features (e.g., a sharp corner and a smooth surface) with a single memristor in 1 ms. In the latter, the decision-making information of 10 unstructured autonomous-driving environments (e.g., overtaking cars, pedestrians) is accurately (94%) extracted with a 40x25 memristor array. By mimicking the intrinsic nature of human low-level perception mechanisms in electronic memristive neural circuits, the proposed method is adaptable to diverse sensing technologies, helping intelligent machines generate smart high-level decisions in the real world.
The rapid identification and accurate diagnosis of breast cancer, a leading killer of women, is of great significance for patients. Numerous breast cancer histopathological image classification methods have been proposed, but they still suffer from two problems. (1) These methods can only handle high-resolution (HR) images, whereas low-resolution (LR) images are often collected by digital slide scanners with limited hardware. Compared with HR images, LR images often lose key features such as texture, which deeply affects diagnostic accuracy. (2) The existing methods have fixed receptive fields, so they cannot extract and fuse multi-scale features well for images with different magnification factors. To fill these gaps, we present a \textbf{S}ingle \textbf{H}istopathological \textbf{I}mage \textbf{S}uper-\textbf{R}esolution \textbf{C}lassification network (SHISRCNet), which consists of two modules: a Super-Resolution (SR) module and a Classification (CF) module. The SR module reconstructs LR images into SR ones, and the CF module extracts and fuses the multi-scale features of the SR images for classification. In the training stage, we introduce HR images into the CF module to enhance SHISRCNet's performance. Finally, through the joint training of these two modules, super-resolution and classification of LR images are integrated into our model. The experimental results demonstrate that the performance of our method is close to that of SOTA methods that take HR images as inputs.
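The following sketch illustrates one plausible form of the joint training described above: the SR module is supervised by reconstruction against the HR image, while the CF module is trained on both the reconstructed SR image and the HR image. The specific losses (L1 reconstruction, cross-entropy) and the weighting `alpha` are illustrative assumptions rather than the paper's exact objective.

```python
# Hedged sketch of one joint SR + classification training step.
import torch
import torch.nn.functional as F

def joint_step(sr_module, cf_module, lr_img, hr_img, label, alpha=1.0):
    sr_img = sr_module(lr_img)                               # reconstruct SR from LR
    loss_sr = F.l1_loss(sr_img, hr_img)                      # super-resolution loss
    loss_cf_sr = F.cross_entropy(cf_module(sr_img), label)   # classify the SR image
    loss_cf_hr = F.cross_entropy(cf_module(hr_img), label)   # HR images guide CF in training
    return loss_sr + alpha * (loss_cf_sr + loss_cf_hr)
```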
This paper presents a novel method for depth completion, which leverages multi-view improved monitored distillation to generate more precise depth maps. Our approach builds upon the state-of-the-art ensemble distillation method, into which we introduce a stereo-based model as a teacher to improve the accuracy of the student model for depth completion. By minimizing the reconstruction error for a given image during ensemble distillation, we can avoid learning the inherent error modes of completion-based teachers. To provide self-supervised information, we also employ multi-view depth consistency and multi-scale minimum reprojection. These techniques exploit existing structural constraints to yield supervisory signals for student model training, without requiring costly ground truth depth information. Our extensive experimental evaluation demonstrates that our proposed method significantly improves the accuracy of the baseline monitored distillation method.
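As a rough illustration of the multi-scale minimum reprojection term mentioned above, the sketch below takes, at each scale, the per-pixel minimum photometric error over the source views warped into the target view, which is the standard way such a term suppresses occlusion artifacts. The data layout and the `photometric_error` callable are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch of a multi-scale minimum-reprojection photometric loss.
import torch

def min_reprojection_loss(scales, photometric_error):
    """scales: list of (target, [warped_source, ...]) pairs, one entry per scale;
    each warped_source is a source image reprojected into the target view."""
    total = 0.0
    for target, warped_sources in scales:
        errors = torch.stack([photometric_error(target, w) for w in warped_sources])
        per_pixel_min, _ = errors.min(dim=0)   # minimum error over views, per pixel
        total = total + per_pixel_min.mean()
    return total / len(scales)
```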
At present, backdoor attacks attract attention because they do great harm to deep learning models. The adversary poisons the training data so that a backdoor is injected into the model when victims unknowingly train on the poisoned dataset. In the field of text, however, existing works do not provide sufficient defense against backdoor attacks. In this paper, we propose a Noise-augmented Contrastive Learning (NCL) framework to defend against textual backdoor attacks when training models on untrustworthy data. With the aim of mitigating the mapping between triggers and the target label, we add appropriate noise to perturb possible backdoor triggers, augment the training dataset, and then pull homologous samples together in the feature space using a contrastive learning objective. Experiments demonstrate the effectiveness of our method in defending against three types of textual backdoor attacks, outperforming prior works.
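A rough sketch of the two ingredients named above, under stated assumptions: random token dropout stands in for the noise that perturbs possible triggers, and an in-batch InfoNCE-style loss pulls two noise-augmented views of the same sentence together in feature space. The paper's exact noise scheme and contrastive objective may differ; `encoder` is a placeholder for any sentence encoder returning a (batch, dim) tensor.

```python
# Hedged sketch: noise augmentation plus a contrastive pull on homologous samples.
import random
import torch
import torch.nn.functional as F

def add_token_noise(tokens, drop_prob=0.1):
    """Randomly drop tokens so that a possible backdoor trigger is perturbed."""
    kept = [t for t in tokens if random.random() > drop_prob]
    return kept if kept else tokens

def homology_contrastive_loss(encoder, batch_views1, batch_views2, temperature=0.1):
    """Pull two noise-augmented views of the same sentence together in feature space."""
    z1 = F.normalize(encoder(batch_views1), dim=-1)   # (B, d)
    z2 = F.normalize(encoder(batch_views2), dim=-1)   # (B, d)
    logits = z1 @ z2.t() / temperature                 # pairwise similarities
    labels = torch.arange(z1.size(0), device=logits.device)
    return F.cross_entropy(logits, labels)             # diagonal pairs are positives
```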
Recently, Graph Convolutional Networks (GCNs) have become state-of-the-art algorithms for analyzing non-Euclidean graph data. However, it is challenging to realize efficient GCN training, especially on large graphs. The reasons are manifold: 1) GCN training incurs a substantial memory footprint; full-batch training on large graphs can require hundreds to thousands of gigabytes of memory to buffer the intermediate data for back-propagation. 2) GCN training involves both memory-intensive data reduction and computation-intensive feature/gradient update operations; such a heterogeneous nature challenges current CPU/GPU platforms. 3) The irregularity of graphs and the complex training dataflow jointly increase the difficulty of improving a GCN training system's efficiency. This paper presents GCNear, a hybrid architecture to tackle these challenges. Specifically, GCNear adopts a DIMM-based memory system to provide easy-to-scale memory capacity. To match the heterogeneous nature of GCN training, we categorize its operations as memory-intensive Reduce and computation-intensive Update operations. We then offload Reduce operations to on-DIMM NMEs, making full use of the high aggregated local bandwidth, and adopt a CAE with sufficient computation capacity to process Update operations. We further propose several optimization strategies to deal with the irregularity of GCN tasks and improve GCNear's performance. We also propose a Multi-GCNear system to evaluate the scalability of GCNear.
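The Reduce/Update decomposition referenced above corresponds to the two halves of a standard GCN layer, as the short NumPy sketch below shows: Reduce is the sparse, irregular, memory-bound neighbor aggregation, while Update is the dense, regular, compute-bound feature transformation. The code is for illustration only and is not GCNear's implementation.

```python
# Hedged sketch of one GCN layer split into Reduce and Update operations.
import numpy as np

def gcn_layer(adj_norm, features, weight):
    """adj_norm: normalized (typically sparse) adjacency matrix (N, N),
    features: (N, F_in) node features, weight: (F_in, F_out) layer weights."""
    aggregated = adj_norm @ features   # Reduce: irregular, memory-intensive aggregation
    updated = aggregated @ weight      # Update: regular, computation-intensive transform
    return np.maximum(updated, 0.0)    # ReLU activation
```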
Attributed network representation learning aims at learning node embeddings by integrating network structure and attribute information. It is a challenge to fully capture the microscopic structure and the attribute semantics simultaneously, where the microscopic structure includes the one-step, two-step, and multi-step relations, indicating the first-order, second-order, and high-order proximity of nodes, respectively. In this paper, we propose a deep attributed network representation learning via attribute enhanced neighborhood (DANRL-ANE) model to improve the robustness and effectiveness of node representations. The DANRL-ANE model adopts the autoencoder idea and expands the decoder component into three branches to capture proximities of different orders. We linearly combine the adjacency matrix with the attribute similarity matrix as the input of our model, where the attribute similarity matrix is calculated as the cosine similarity between node attributes, motivated by social homophily. In this way, we preserve the second-order proximity to enhance the robustness of the DANRL-ANE model on sparse networks and handle the topological and attribute information simultaneously. Moreover, the sigmoid cross-entropy loss function is extended to capture the neighborhood character, so that the first-order proximity is better preserved. We compare our model with state-of-the-art models on five real-world datasets and two network analysis tasks, i.e., link prediction and node classification. The DANRL-ANE model performs well on various networks, even on sparse networks or networks with isolated nodes, provided that the attribute information is sufficient.
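The input construction described above, combining the adjacency matrix with the attribute cosine-similarity matrix, can be sketched as follows; the mixing weight `eta` is an illustrative assumption, since the abstract only states that the two matrices are linearly combined.

```python
# Hedged sketch of the combined structure + attribute input matrix.
import numpy as np

def build_input_matrix(adj, attributes, eta=0.5):
    """adj: (N, N) adjacency matrix; attributes: (N, D) node attribute matrix."""
    norms = np.linalg.norm(attributes, axis=1, keepdims=True) + 1e-12
    normalized = attributes / norms
    attr_sim = normalized @ normalized.T        # pairwise cosine similarity of attributes
    return eta * adj + (1.0 - eta) * attr_sim   # linear combination fed to the autoencoder
```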