In recent years, point cloud representation has become a research hotspot in computer vision and has been widely used in many fields, such as autonomous driving, virtual reality, and robotics. Although deep learning techniques have achieved great success in processing regularly structured 2D grid image data, processing irregular, unstructured point cloud data remains highly challenging. Point cloud classification is the basis of point cloud analysis, and many deep learning-based methods have been applied to this task. The purpose of this paper is therefore to provide researchers in this field with the latest research progress and future trends. First, we introduce point cloud acquisition, characteristics, and challenges. Second, we review 3D data representations, storage formats, and commonly used datasets for point cloud classification. We then summarize deep learning-based methods for point cloud classification and supplement them with recent research work. Next, we compare and analyze the performance of the main methods. Finally, we discuss some challenges and future directions for point cloud classification.
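As a concrete illustration of the family of methods surveyed (a generic sketch, not taken from any specific reviewed paper), the toy module below follows the PointNet-style recipe: a shared per-point MLP followed by symmetric max pooling, which makes the network invariant to point ordering. The class count and layer widths are arbitrary assumptions.

```python
# Illustrative PointNet-style toy classifier (assumed sizes, not from the survey).
import torch
import torch.nn as nn

class TinyPointClassifier(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        # Shared MLP applied independently to every point
        self.point_mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                       nn.Linear(64, 128), nn.ReLU())
        self.head = nn.Linear(128, num_classes)

    def forward(self, pts):               # pts: (batch, num_points, 3)
        feats = self.point_mlp(pts)       # per-point features, shared weights
        pooled = feats.max(dim=1).values  # symmetric pooling -> permutation invariance
        return self.head(pooled)

logits = TinyPointClassifier()(torch.randn(2, 1024, 3))
print(logits.shape)  # torch.Size([2, 10])
```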
Cardiac magnetic resonance imaging (CMR) has been widely used in clinical practice for the diagnosis of cardiac diseases. However, its long acquisition time hinders real-time applications. Here, we propose a novel self-consistency guided multi-prior learning framework, $k$-$t$ CLAIR, that exploits spatiotemporal correlations in highly undersampled data for accelerated dynamic parallel MRI reconstruction. Because dynamic MRI exhibits high spatiotemporal redundancy, $k$-$t$ CLAIR progressively reconstructs faithful images by iteratively leveraging complementary priors learned in the $x$-$t$, $x$-$f$, and $k$-$t$ domains. Additionally, $k$-$t$ CLAIR incorporates calibration information into prior learning, yielding a more consistent reconstruction. Experimental results on cardiac cine and T1W/T2W images demonstrate that $k$-$t$ CLAIR achieves high-quality dynamic MR reconstruction in terms of both quantitative and qualitative performance.
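To make the domain terminology concrete, the minimal numpy sketch below shows how the $x$-$t$ and $x$-$f$ views relate to the measured $k$-$t$ data through Fourier transforms, which is the redundancy that multi-domain priors of this kind exploit. The array shape and single-coil setting are assumptions; this is not the authors' implementation.

```python
# Minimal sketch of the k-t / x-t / x-f domain views (assumed single coil).
import numpy as np

def domain_views(kt):
    """kt: complex k-space data of shape (nt, nx, ny) for one coil."""
    # k-t -> x-t: inverse 2D FFT over the spatial axes of each frame
    xt = np.fft.ifft2(np.fft.ifftshift(kt, axes=(-2, -1)), axes=(-2, -1))
    # x-t -> x-f: FFT along the temporal axis resolves the dynamics into
    # temporal frequencies, where cardiac motion is sparse
    xf = np.fft.fftshift(np.fft.fft(xt, axis=0), axes=0)
    return xt, xf

kt = np.random.randn(20, 192, 160) + 1j * np.random.randn(20, 192, 160)
xt, xf = domain_views(kt)
print(xt.shape, xf.shape)  # (20, 192, 160) (20, 192, 160)
```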
We introduce CartiMorph, a framework for automated knee articular cartilage morphometrics. It takes an image as input and generates quantitative metrics for cartilage subregions, including the percentage of full-thickness cartilage loss (FCL), mean thickness, surface area, and volume. CartiMorph leverages deep learning models for hierarchical image feature representation. Deep learning models were trained and validated for tissue segmentation, template construction, and template-to-image registration. We established methods for surface-normal-based cartilage thickness mapping, FCL estimation, and rule-based cartilage parcellation. Our cartilage thickness map showed lower error in thin and peripheral regions. We evaluated the effectiveness of the adopted segmentation model by comparing the quantitative metrics obtained from model segmentation with those from manual segmentation. The root-mean-squared deviation of the FCL measurements was less than 8%, and strong correlations were observed for the mean thickness (Pearson's correlation coefficient $\rho \in [0.82,0.97]$), surface area ($\rho \in [0.82,0.98]$), and volume ($\rho \in [0.89,0.98]$) measurements. We compared our FCL measurements with those from a previous study and found that ours deviated less from the ground truth. The proposed rule-based cartilage parcellation method outperformed the atlas-based approach. CartiMorph has the potential to promote imaging biomarker discovery for knee osteoarthritis.
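The agreement analysis described above reduces to computing Pearson's correlation and the root-mean-squared deviation between metrics derived from model and manual segmentations; the sketch below illustrates this with hypothetical FCL values (the numbers are made up for demonstration only).

```python
# Illustrative agreement analysis between model- and manually-derived metrics.
import numpy as np
from scipy.stats import pearsonr

def agreement(model_vals, manual_vals):
    model_vals = np.asarray(model_vals, dtype=float)
    manual_vals = np.asarray(manual_vals, dtype=float)
    rho, _ = pearsonr(model_vals, manual_vals)                 # Pearson's rho
    rmsd = np.sqrt(np.mean((model_vals - manual_vals) ** 2))   # RMSD
    return rho, rmsd

# Hypothetical FCL percentages (%) from model vs. manual segmentation
rho, rmsd = agreement([12.1, 30.4, 5.2, 18.9], [11.5, 31.0, 6.0, 17.8])
print(f"rho={rho:.2f}, RMSD={rmsd:.2f}%")
```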
Despite promising advances in deep learning-based MRI reconstruction methods, restoring high-frequency image details and textures remains challenging for accelerated MRI. To tackle this challenge, we propose a novel context-aware multi-prior network (CAMP-Net) for MRI reconstruction. CAMP-Net leverages the complementary nature of multiple forms of prior knowledge and exploits data redundancy between adjacent slices in the hybrid domain to improve image quality. It incorporates three interleaved modules, for image enhancement, k-space restoration, and calibration consistency, to jointly learn context-aware multiple priors in an end-to-end fashion. The image enhancement module learns a coil-combined image prior to suppress noise-like artifacts, while the k-space restoration module explores multi-coil k-space correlations to recover high-frequency details. The calibration consistency module embeds the known physical properties of MRI acquisition to ensure consistency between the k-space correlations extracted from the measurements and those from the intermediate artifact-free image. The resulting low- and high-frequency reconstructions are hierarchically aggregated in a frequency fusion module and iteratively refined to progressively reconstruct the final image. We evaluated the generalizability and robustness of our method on three large public datasets with various accelerations and sampling patterns. Comprehensive experiments demonstrate that CAMP-Net outperforms state-of-the-art methods in terms of reconstruction quality and quantitative $T_2$ mapping.
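As an illustration of the frequency-fusion idea (assumptions throughout: this is not CAMP-Net's actual module, and the radial mask and cutoff are placeholders), one simple way to merge a low-frequency-faithful image estimate with a k-space estimate that restores high frequencies is to blend them bandwise in k-space:

```python
# Toy bandwise fusion of an image-domain estimate and a k-space estimate.
import numpy as np

def frequency_fusion(img_est, ksp_est, cutoff=0.15):
    """img_est: image-domain estimate (H, W); ksp_est: centered k-space (H, W)."""
    H, W = img_est.shape
    ky, kx = np.meshgrid(np.linspace(-0.5, 0.5, H), np.linspace(-0.5, 0.5, W),
                         indexing="ij")
    low = (np.sqrt(kx**2 + ky**2) < cutoff).astype(float)  # low-frequency band
    ksp_img = np.fft.fftshift(np.fft.fft2(img_est))
    fused_k = low * ksp_img + (1.0 - low) * ksp_est        # blend by band
    return np.fft.ifft2(np.fft.ifftshift(fused_k))

img = np.random.rand(192, 192)
ksp = np.fft.fftshift(np.fft.fft2(np.random.rand(192, 192)))
print(frequency_fusion(img, ksp).shape)  # (192, 192)
```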
Mid-term electricity load forecasting (LF) plays a critical role in power system planning and operation. To address error accumulation and propagation in existing LF models, this paper proposes a novel error-correction-based LF (ECLF) model designed to provide more accurate and stable forecasts. First, time series analysis and feature engineering are applied to the original data to decompose the load into three components and extract relevant features. Then, following the idea of stacking ensemble learning, a long short-term memory network is employed as an error correction module to forecast the components separately, and its forecasts are treated as new features fed into extreme gradient boosting for second-step forecasting. Finally, the component sub-series forecasts are reconstructed to obtain the final LF results. The proposed model is evaluated on real-world electricity load data from two cities in China, and the experimental results demonstrate its superior performance compared to other benchmark models.
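The two-step stacking idea can be sketched as follows; the decomposition, the first-stage forecaster (a naive persistence stands in for the LSTM), and the feature set are placeholders rather than the paper's exact pipeline.

```python
# Minimal sketch of decompose -> first-step forecast -> stacked boosting.
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
load = pd.Series(100 + 10 * np.sin(np.arange(400) * 2 * np.pi / 24)
                 + rng.normal(0, 2, 400))

# Decompose the load into three components (trend, seasonal, residual)
dec = seasonal_decompose(load, period=24, extrapolate_trend="freq")
components = pd.DataFrame({"trend": dec.trend, "seasonal": dec.seasonal,
                           "resid": dec.resid})

# First-step "forecast" of each component: naive one-step-ahead persistence
# stands in here for the LSTM described in the abstract.
first_step = components.shift(1).dropna()
target = load.loc[first_step.index]

# Second step: the component forecasts act as features for gradient boosting.
model = XGBRegressor(n_estimators=200, max_depth=3, learning_rate=0.1)
model.fit(first_step.values, target.values)
print(model.predict(first_step.values[-1:]))
```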
In this letter, we propose a novel tensor-based modulation scheme for massive unsourced random access. The proposed modulation can be viewed as a superposition of third-order tensors whose factors represent subspaces. A constellation design based on a high-dimensional Grassmann manifold is presented for information encoding. The uniqueness of the tensor decomposition provides a theoretical guarantee for active user separation. Simulation results show that the proposed method outperforms state-of-the-art tensor-based modulation schemes.
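The signal model can be sketched as below (assumed tensor dimensions, a noiseless channel, and random factors in place of a Grassmannian constellation; this is not the paper's encoder or decoder): each active user contributes one rank-1 third-order tensor, and the receiver observes their superposition.

```python
# Toy superposition of rank-1 third-order tensors, one per active user.
import numpy as np

def rank1(a, b, c):
    """Outer product a o b o c of three vectors -> third-order tensor."""
    return np.einsum("i,j,k->ijk", a, b, c)

I, J, K, users = 8, 8, 8, 3
rng = np.random.default_rng(1)
factors = [(rng.standard_normal(I) + 1j * rng.standard_normal(I),
            rng.standard_normal(J) + 1j * rng.standard_normal(J),
            rng.standard_normal(K) + 1j * rng.standard_normal(K))
           for _ in range(users)]

# Superposition of all users' rank-1 tensors, as seen by the receiver
Y = sum(rank1(a, b, c) for a, b, c in factors)
print(Y.shape)  # (8, 8, 8)
```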
Harmonic retrieval (HR) has a wide range of applications in scenarios where signals are modeled as a summation of sinusoids. Past works have developed a number of approaches to recover the original signals, most of which rely on the classical singular value decomposition and are therefore vulnerable to unexpected outliers. In this paper, we present new decomposition algorithms for third-order complex-valued tensors based on $L_1$-principal component analysis ($L_1$-PCA) of complex data and apply them to a novel random access HR model in the presence of outliers. We also develop a novel subcarrier recovery method for the proposed model. Simulations compare the proposed method with existing tensor-based algorithms for HR, and the results demonstrate its insensitivity to outliers.
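For concreteness, the sketch below builds the HR signal model referred to above: a sum of complex sinusoids folded into a third-order tensor, corrupted by a few gross outliers of the kind that degrade SVD-based pipelines and that $L_1$-PCA-style decompositions are designed to resist. The tensor sizes, frequencies, and outlier pattern are arbitrary assumptions.

```python
# Toy HR data: sum of complex sinusoids with sparse outliers, folded to a tensor.
import numpy as np

N1, N2, N3 = 8, 8, 8
n = np.arange(N1 * N2 * N3)
freqs, amps = np.array([0.11, 0.23, 0.37]), np.array([1.0, 0.8, 0.5])
x = sum(a * np.exp(2j * np.pi * f * n) for a, f in zip(amps, freqs))

# Sparse outliers (e.g., impulsive noise on a few samples)
rng = np.random.default_rng(2)
idx = rng.choice(x.size, size=5, replace=False)
x[idx] += 20 * (rng.standard_normal(5) + 1j * rng.standard_normal(5))

# Fold into a third-order tensor: T[i, j, k] = x[i + N1*j + N1*N2*k],
# which has a CP structure with Vandermonde factors in the outlier-free case.
T = x.reshape((N1, N2, N3), order="F")
print(T.shape)  # (8, 8, 8)
```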
How should a face pattern be represented? While faces are perceived in a continuous way by our visual system, computers typically store and process face images discretely, as 2D arrays of pixels. In this study, we attempt to learn a continuous representation of face images with explicit functions. First, we propose an explicit model (EmFace) for human face representation in the form of a finite sum of mathematical terms, where each term is an analytic function element. Further, to estimate the unknown parameters of EmFace, a novel neural network, EmNet, is designed with an encoder-decoder structure and trained using backpropagation, where the encoder is a deep convolutional neural network and the decoder is the explicit mathematical expression of EmFace. Experimental results show that EmFace achieves higher representation performance than other methods on faces with various expressions, postures, and other variations. Furthermore, EmFace achieves reasonable performance on several face image processing tasks, including face image restoration, denoising, and transformation.
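To illustrate what an explicit function-sum representation looks like (a generic 2D Gaussian stands in for the paper's analytic element, and the parameterization is an assumption rather than EmFace's actual formula), the sketch below evaluates a finite sum of parametric elements on a continuous pixel grid, which is what an EmNet-style decoder would compute from predicted parameters.

```python
# Toy "image as a finite sum of analytic elements" representation.
import numpy as np

def gaussian_element(xx, yy, p):
    """One analytic element with parameters p = (amplitude, cx, cy, sx, sy)."""
    a, cx, cy, sx, sy = p
    return a * np.exp(-((xx - cx) ** 2 / (2 * sx**2) + (yy - cy) ** 2 / (2 * sy**2)))

def explicit_sum(params, height=64, width=64):
    yy, xx = np.meshgrid(np.linspace(0, 1, height), np.linspace(0, 1, width),
                         indexing="ij")
    return sum(gaussian_element(xx, yy, p) for p in params)

# Hypothetical parameters for three elements
params = [(0.9, 0.5, 0.4, 0.2, 0.25), (0.3, 0.35, 0.55, 0.05, 0.05),
          (0.3, 0.65, 0.55, 0.05, 0.05)]
face = explicit_sum(params)
print(face.shape, float(face.max()))
```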
Monolithic integration of multiband (1.4 to 6.0 GHz) RF acoustic devices was successfully demonstrated within the same process flow using a lithium niobate (LN) thin film on silicon carbide (LNOSiC) substrate. A novel surface mode with a sinking energy distribution was proposed, exhibiting reduced propagation loss. Surface wave and Lamb wave resonators with suppressed transverse and leaky modes were demonstrated, showing scalable resonances from 1.4 to 5.7 GHz, electromechanical coupling coefficients ($k^2$) between 7.9% and 29.3%, and maximum Bode-Q ($Q_{max}$) larger than 3200. Arrayed filters with a small footprint (4.0 x 2.5 mm$^2$) but diverse center frequencies ($f_c$) and 3-dB fractional bandwidths (FBW) were achieved, showing $f_c$ from 1.4 to 6.0 GHz, FBW between 3.3% and 13.3%, and insertion loss (IL) between 0.59 and 2.10 dB. These results may promote the progress of hundred-filter sub-6 GHz RF front-end modules (RF-FEMs).
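As a quick arithmetic note on the quantities reported above, the 3-dB fractional bandwidth is simply the 3-dB bandwidth divided by the center frequency; the filter numbers in the sketch below are hypothetical, not measurements from this work.

```python
# Fractional bandwidth from hypothetical 3-dB passband edges (GHz).
def fractional_bandwidth(f_low_ghz, f_high_ghz):
    bw = f_high_ghz - f_low_ghz          # 3-dB bandwidth
    fc = 0.5 * (f_low_ghz + f_high_ghz)  # center frequency
    return 100.0 * bw / fc

# e.g., a hypothetical filter with a 3-dB passband of 2.35-2.55 GHz
print(f"FBW = {fractional_bandwidth(2.35, 2.55):.1f}%")  # ~8.2%
```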
Accurate liver and lesion segmentation from computed tomography (CT) images is in high demand in clinical practice for assisting the diagnosis and assessment of hepatic tumor disease. However, automatic liver and lesion segmentation from contrast-enhanced CT volumes is extremely challenging due to the diversity in contrast, resolution, and quality of images. Previous UNet-based methods that segment 2D slice-by-slice or 3D volume-by-volume either lack sufficient spatial context or suffer from high GPU computational cost, which limits performance. To tackle these issues, we propose a novel context-aware PolyUNet for accurate liver and lesion segmentation. It jointly exploits structural diversity and consecutive t-adjacent slices to enrich feature expressive power and spatial contextual information while avoiding excessive GPU memory consumption. In addition, we utilize a zoom-out/in and two-stage refinement strategy to exclude irrelevant context and focus on the specific region for fine-grained segmentation. Our method achieved very competitive performance in the MICCAI 2017 Liver Tumor Segmentation (LiTS) Challenge across all tasks with a single model, ranking $3^{rd}$, $12^{th}$, $2^{nd}$, and $5^{th}$ in liver segmentation, lesion segmentation, lesion detection, and tumor burden estimation, respectively.
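The t-adjacent-slice idea can be sketched as follows (the slice count, volume shape, and function name are assumptions, and this is not the PolyUNet architecture): stacking neighboring slices as input channels gives a 2D network through-plane context without the memory cost of a full 3D model.

```python
# Toy extraction of t-adjacent CT slices as input channels for a 2D network.
import numpy as np

def adjacent_slice_stack(volume, index, t=3):
    """volume: (D, H, W) CT volume; returns a (2t+1, H, W) channel stack
    centered on `index`, clamping at the volume boundaries."""
    D = volume.shape[0]
    ids = np.clip(np.arange(index - t, index + t + 1), 0, D - 1)
    return volume[ids]

vol = np.random.randn(96, 512, 512).astype(np.float32)
x = adjacent_slice_stack(vol, index=0, t=2)
print(x.shape)  # (5, 512, 512)
```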