Representation learning plays a critical role in the analysis of time series data and has high practical value across a wide range of applications, including trend analysis, time series data retrieval and forecasting. In practice, data confusion is a significant issue, as it can considerably impact the effectiveness and accuracy of data analysis, machine learning models and decision-making processes. Previous studies generally did not consider the variability at different levels of granularity, resulting in inadequate information utilization, which further exacerbated the issue of data confusion. This paper proposes an unsupervised framework to realize multi-granularity representation learning for time series. Specifically, we employ a cross-granularity transformer to develop an association between fine- and coarse-grained representations. In addition, we introduce a retrieval task as an unsupervised training task to learn the multi-granularity representation of time series. Moreover, a novel loss function is designed to obtain a comprehensive multi-granularity representation of the time series via unsupervised learning. The experimental results show that the proposed framework has significant advantages over alternative representation learning models.
Cortical surface reconstruction plays a fundamental role in modeling the rapid brain development during the perinatal period. In this work, we propose Conditional Temporal Attention Network (CoTAN), a fast end-to-end framework for diffeomorphic neonatal cortical surface reconstruction. CoTAN predicts multi-resolution stationary velocity fields (SVF) from neonatal brain magnetic resonance images (MRI). Instead of integrating multiple SVFs, CoTAN introduces attention mechanisms to learn a conditional time-varying velocity field (CTVF) by computing the weighted sum of all SVFs at each integration step. The importance of each SVF, which is estimated by learned attention maps, is conditioned on the age of the neonates and varies with the time step of integration. The proposed CTVF defines a diffeomorphic surface deformation, which reduces mesh self-intersection errors effectively. It only requires 0.21 seconds to deform an initial template mesh to cortical white matter and pial surfaces for each brain hemisphere. CoTAN is validated on the Developing Human Connectome Project (dHCP) dataset with 877 3D brain MR images acquired from preterm and term born neonates. Compared to state-of-the-art baselines, CoTAN achieves superior performance with only 0.12mm geometric error and 0.07% self-intersecting faces. The visualization of our attention maps illustrates that CoTAN indeed learns coarse-to-fine surface deformations automatically without intermediate supervision.
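The weighted-sum construction described in the CoTAN abstract above can be illustrated with a small sketch. It is not the authors' implementation: the two toy velocity fields, the `toy_weights` function (a softmax over hand-written logits that depend on age and time), and the forward-Euler integrator are all assumptions chosen for readability, whereas the abstract states that the weights come from learned attention maps.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a short list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def coarse_svf(x):
    # Hypothetical stationary velocity field: constant push along the first axis.
    return (1.0, 0.0, 0.0)

def fine_svf(x):
    # Hypothetical stationary velocity field: constant push along the second axis.
    return (0.0, 1.0, 0.0)

def toy_weights(age, t):
    # Assumption: a stand-in for learned attention. Early steps favour the
    # coarse field, late steps favour the fine one; age sharpens the contrast.
    scale = age / 40.0
    return softmax([scale * (1.0 - t), scale * t])

def integrate_ctvf(point, svfs, weight_fn, age, n_steps=10):
    """Forward-Euler integration of a time-varying field formed, at every
    step, as the weighted sum of the given stationary velocity fields."""
    h = 1.0 / n_steps
    x = list(point)
    for step in range(n_steps):
        t = step * h
        w = weight_fn(age, t)              # one weight per SVF, summing to 1
        velocities = [v(x) for v in svfs]  # each SVF evaluated at the current position
        x = [xi + h * sum(wk * vel[d] for wk, vel in zip(w, velocities))
             for d, xi in enumerate(x)]
    return x

end = integrate_ctvf((0.0, 0.0, 0.0), [coarse_svf, fine_svf], toy_weights, age=40.0)
```

Because the weights sum to one at every step, the toy vertex travels a total of one unit along the first two axes combined; only the split between the coarse and fine fields changes with age and time.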
Fluoroscopy is an imaging technique that uses X-rays to obtain a real-time 2D video of the interior of a 3D object, helping surgeons to observe pathological structures and tissue functions, especially during intervention. However, it suffers from heavy noise that mainly arises from the clinical use of low-dose X-rays, thereby necessitating fluoroscopy denoising. Such denoising is challenged by the relative motion between the object being imaged and the X-ray imaging system. We tackle this challenge by proposing a self-supervised, three-stage framework that exploits the domain knowledge of fluoroscopy imaging. (i) Stabilize: we first construct a dynamic panorama based on optical flow calculation to stabilize the non-stationary background induced by the motion of the X-ray detector. (ii) Decompose: we then propose a novel mask-based Robust Principal Component Analysis (RPCA) decomposition method to separate a video with detector motion into a low-rank background and a sparse foreground. Such a decomposition accommodates the reading habit of experts. (iii) Denoise: we finally denoise the background and foreground separately by a self-supervised learning strategy and fuse the denoised parts into the final output via a bilateral, spatiotemporal filter. To assess the effectiveness of our work, we curate a dedicated fluoroscopy dataset of 27 videos (1,568 frames) and corresponding ground truth. Our experiments demonstrate that our framework achieves significant improvements in terms of denoising and enhancement effects when compared with standard approaches. Finally, expert rating confirms this efficacy.
Fetal Magnetic Resonance Imaging (MRI) is used in prenatal diagnosis and to assess early brain development. Accurate segmentation of the different brain tissues is a vital step in several brain analysis tasks, such as cortical surface reconstruction and tissue thickness measurements. Fetal MRI scans, however, are prone to motion artifacts that can affect the correctness of both manual and automatic segmentation techniques. In this paper, we propose a novel network structure, called CAS-Net, that can simultaneously generate conditional atlases and predict brain tissue segmentation. The conditional atlases provide anatomical priors that can constrain the segmentation connectivity, despite the heterogeneity of intensity values caused by motion or partial volume effects. The proposed method is trained and evaluated on 253 subjects from the developing Human Connectome Project (dHCP). The results demonstrate that the proposed method can generate conditional age-specific atlases with sharp boundaries and shape variance. It also segments multi-category brain tissues for fetal MRI with a high overall Dice similarity coefficient (DSC) of $85.2\%$ for the selected 9 tissue labels.
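For readers unfamiliar with the metric quoted in the CAS-Net abstract above, the Dice similarity coefficient between a predicted region $A$ and a reference region $B$ is $2|A \cap B| / (|A| + |B|)$. The sketch below computes it per label on flattened label lists. How the abstract's overall figure aggregates the 9 labels is not stated, so the unweighted mean used here is an assumption, as are the function names and the toy data.

```python
def dice_per_label(pred, target, labels):
    """Return {label: Dice} for two equally long sequences of integer labels.
    Labels absent from both sequences are skipped."""
    scores = {}
    for label in labels:
        in_pred = [v == label for v in pred]
        in_target = [v == label for v in target]
        intersection = sum(1 for a, b in zip(in_pred, in_target) if a and b)
        denominator = sum(in_pred) + sum(in_target)
        if denominator > 0:
            scores[label] = 2.0 * intersection / denominator
    return scores

def mean_dice(pred, target, labels):
    # Assumption: the overall score is the unweighted mean over labels.
    scores = dice_per_label(pred, target, labels)
    return sum(scores.values()) / len(scores)

# Toy voxels with three tissue labels (0, 1, 2).
pred = [0, 1, 1, 2, 2, 2]
target = [0, 1, 2, 2, 2, 2]
per_label = dice_per_label(pred, target, labels=[0, 1, 2])
overall = mean_dice(pred, target, labels=[0, 1, 2])
```

In the toy data, label 0 matches exactly (Dice 1), label 1 is over-segmented by one voxel (Dice 2/3), and label 2 is under-segmented by one voxel (Dice 6/7).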
We present CortexODE, a deep learning framework for cortical surface reconstruction. CortexODE leverages neural ordinary differential equations (ODEs) to deform an input surface into a target shape by learning a diffeomorphic flow. The trajectories of the points on the surface are modeled as ODEs, where the derivatives of their coordinates are parameterized via a learnable Lipschitz-continuous deformation network. This provides theoretical guarantees for the prevention of self-intersections. CortexODE can be integrated into an automatic learning-based pipeline, which reconstructs cortical surfaces efficiently in less than 6 seconds. The pipeline utilizes a 3D U-Net to predict a white matter segmentation from brain Magnetic Resonance Imaging (MRI) scans, and further generates a signed distance function that represents an initial surface. Fast topology correction is introduced to guarantee homeomorphism to a sphere. Following the isosurface extraction step, two CortexODE models are trained to deform the initial surface to white matter and pial surfaces respectively. The proposed pipeline is evaluated on large-scale neuroimage datasets in various age groups including neonates (25-45 weeks), young adults (22-36 years) and elderly subjects (55-90 years). Our experiments demonstrate that the CortexODE-based pipeline can achieve less than 0.2 mm average geometric error while being orders of magnitude faster compared to conventional processing pipelines.
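The idea in the CortexODE abstract above, moving every surface point along the trajectory of an ODE whose right-hand side is Lipschitz-continuous, can be sketched with a fixed toy field in place of the learned deformation network. Everything here is an assumption made for illustration: the 2D rotation field, the forward-Euler solver, and the step count. The abstract does not say which solver the authors use.

```python
import math

def rotation_field(p):
    # Hypothetical stand-in for the learned network: a linear field
    # dx/dt = (-y, x), which is Lipschitz-continuous with constant 1.
    x, y = p
    return (-y, x)

def deform(points, field, n_steps=20, total_time=1.0):
    """Move every point along dx/dt = field(x) with forward-Euler steps."""
    h = total_time / n_steps
    current = [tuple(p) for p in points]
    for _ in range(n_steps):
        current = [tuple(c + h * v for c, v in zip(p, field(p)))
                   for p in current]
    return current

# Two nearby vertices follow separate trajectories and never collapse onto each other.
start = [(1.0, 0.0), (1.1, 0.0)]
end = deform(start, rotation_field)
gap_before = math.dist(start[0], start[1])
gap_after = math.dist(end[0], end[1])
```

For this linear field each Euler step scales distances by the same factor, $\sqrt{1 + h^2}$, so the gap between the two vertices stays strictly positive. This mirrors, in a toy setting, why distinct trajectories of a Lipschitz-continuous flow do not cross.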
Considering the rapidly increasing number of academic papers, searching for and citing appropriate references has become a non-trivial task during the writing of papers. Recommending a handful of candidate papers for a manuscript before publication could ease the burden on the authors and help the reviewers check the completeness of the cited resources. Conventional approaches to citation recommendation generally consider recommending one ground-truth citation for a query context from an input manuscript, but do not consider co-citation recommendations. However, a piece of context often needs to be supported by two or more co-citation pairs. Here, we propose a novel scientific paper modeling approach for citation recommendation, namely the Multi-Positive BERT Model for Citation Recommendation (MP-BERT4CR), combined with a series of Multi-Positive Triplet objectives to recommend multiple positive citations for a query context. The proposed approach has the following advantages. First, the proposed multi-positive objectives are effective for recommending multiple positive candidates. Second, we adopt noise distributions built from historical co-citation frequencies, so that MP-BERT4CR is not only effective at recommending high-frequency co-citation pairs but also significantly improves the retrieval of low-frequency ones. Third, we propose a dynamic context sampling strategy that captures the ``macro-scoped'' citing intents from a manuscript and makes the citation embeddings content-dependent, which allows the algorithm to further improve performance. Single- and multiple-positive recommendation experiments showed that MP-BERT4CR delivers significant improvements. In addition, MP-BERT4CR is also more effective than prior works at retrieving the full list of co-citations and historically low-frequency co-citation pairs.
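The MP-BERT4CR abstract above does not spell out its Multi-Positive Triplet objectives, so the following is only one plausible reading: a standard margin-based triplet loss averaged over every (positive, negative) pair for a single query context. The function names, the Euclidean distance, the margin, and the toy 2D embeddings are all assumptions.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def multi_positive_triplet_loss(query, positives, negatives, margin=1.0):
    """Average hinge loss over all (positive, negative) pairs.
    Each term is zero once the positive is closer to the query than the
    negative by at least `margin`."""
    terms = [max(0.0, margin + euclidean(query, p) - euclidean(query, n))
             for p in positives for n in negatives]
    return sum(terms) / len(terms)

# Toy embeddings: one query context, two co-cited papers, one sampled negative.
query = (0.0, 0.0)
co_cited = [(1.0, 0.0), (0.0, 2.0)]
well_separated = multi_positive_triplet_loss(query, co_cited, [(3.0, 0.0)])
too_close = multi_positive_triplet_loss(query, co_cited, [(1.5, 0.0)])
```

With the negative at distance 3, both positives clear the margin and the loss is zero. Moving the negative to distance 1.5 violates the margin for both positives, and every co-cited paper contributes to the loss rather than only a single ground-truth citation.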
Point of interest (POI) recommendation is a key task in tourism information systems. However, in contrast to conventional POI recommender systems, the available data is extremely sparse; most tourists visit a few sightseeing spots once, and most of these spots have no check-in data from new tourists. Most conventional systems rank sightseeing spots based on their popularity, reputation, and category-based similarities with users' preferences. They do not clarify what users can experience in these spots, which makes it difficult to meet diverse tourism needs. To this end, in this work, we propose a mechanism to recommend POIs to tourists. Our mechanism includes two components: one is a probabilistic model that reveals user behaviors in tourism; the other is a pseudo rating mechanism to handle the cold-start issue in POI recommendation. We carried out extensive experiments with two datasets collected from Flickr. The experimental results demonstrate that our methods are superior to the state-of-the-art methods in both recommendation performance (precision, recall and F-measure) and fairness. The experimental results also validate the robustness of the proposed methods, i.e., our methods can handle the issue of data sparsity well.
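The three recommendation metrics named in the abstract above can be computed for a single tourist as follows. This is a generic sketch of the standard definitions, not the paper's evaluation code: the function name and the toy spot identifiers are made up, and the abstract does not say at which list length the metrics are measured.

```python
def precision_recall_f1(recommended, relevant):
    """Precision, recall and F-measure of a recommendation list against the
    set of spots the user actually visited. Duplicates are counted once."""
    unique_recommended = list(dict.fromkeys(recommended))  # dedupe, keep order
    relevant_set = set(relevant)
    hits = sum(1 for item in unique_recommended if item in relevant_set)
    precision = hits / len(unique_recommended) if unique_recommended else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1

# Four recommended spots, three spots actually visited, two in common.
p, r, f = precision_recall_f1(["a", "b", "c", "d"], {"b", "d", "e"})
```

Here two of the four recommendations are correct (precision 1/2) and two of the three visited spots are recovered (recall 2/3), giving an F-measure of 4/7.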
Traditional cortical surface reconstruction is time-consuming and limited by the resolution of brain Magnetic Resonance Imaging (MRI). In this work, we introduce Pial Neural Network (PialNN), a 3D deep learning framework for pial surface reconstruction. PialNN is trained end-to-end to deform an initial white matter surface to a target pial surface by a sequence of learned deformation blocks. A local convolutional operation is incorporated in each block to capture the multi-scale MRI information of each vertex and its neighborhood. This operation is fast and memory-efficient, which allows reconstructing a pial surface mesh with 150k vertices in one second. The performance is evaluated on the Human Connectome Project (HCP) dataset including T1-weighted MRI scans of 300 subjects. The experimental results demonstrate that PialNN reduces the geometric error of the predicted pial surface by 30% compared to state-of-the-art deep learning approaches.
Given the exponentially increasing number of academic articles, discovering and citing comprehensive and appropriate resources has become a non-trivial task. Conventional citation recommender methods suffer from severe information loss. For example, they do not consider the section of the paper that the user is writing and for which they need to find a citation, the relatedness between the words in the local context (the text span that describes a citation), or the importance of each word in the local context. These shortcomings make such methods insufficient for recommending adequate citations to academic manuscripts. In this study, we propose a novel embedding-based neural network called "dual attention model for citation recommendation (DACR)" to recommend citations during manuscript preparation. Our method adapts embeddings of three dimensions of semantic information: words in the local context, structural contexts, and the section on which a user is working. A neural network is designed to maximize the similarity between the embedding of the three inputs (local context words, section and structural contexts) and the target citation appearing in the context. The core of the neural network is composed of self-attention and additive attention, where the former aims to capture the relatedness between the contextual words and structural context, and the latter aims to learn their importance. The experiments on real-world datasets demonstrate the effectiveness of the proposed approach.
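The additive attention mentioned in the DACR abstract above can be illustrated with the common formulation in which each input vector $h_i$ receives a score $v^\top \tanh(W h_i)$, the scores are normalized with a softmax, and the inputs are pooled with the resulting weights. Whether DACR uses exactly this parameterization is not stated in the abstract, so the function, the fixed matrix `W`, and the vector `v` below are assumptions. In a trained model `W` and `v` would be learned.

```python
import math

def additive_attention(vectors, W, v):
    """Score each vector with v . tanh(W h), softmax the scores, and return
    (weights, weighted sum of the input vectors)."""
    scores = []
    for h in vectors:
        hidden = [math.tanh(sum(w_ij * h_j for w_ij, h_j in zip(row, h)))
                  for row in W]
        scores.append(sum(v_i * z_i for v_i, z_i in zip(v, hidden)))
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, vectors))
              for d in range(dim)]
    return weights, pooled

# Hypothetical parameters and three 2D "word embeddings".
W = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, -1.0]
words = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights, context = additive_attention(words, W, v)
```

The weights sum to one, and with these parameters the first word receives the largest weight because its score, $\tanh(1)$, is the highest of the three.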