Northeast Normal University
Abstract:Autism Spectrum Disorder (ASD) has been emerging as a growing public health threat. Early diagnosis of ASD is crucial for timely, effective intervention and treatment. However, conventional diagnosis methods based on communications and behavioral patterns are unreliable for children younger than 2 years of age. Given evidences of neurodevelopmental abnormalities in ASD infants, we resort to a novel deep learning-based method to extract key features from the inherently scarce, class-imbalanced, and heterogeneous structural MR images for early autism diagnosis. Specifically, we propose a Siamese verification framework to extend the scarce data, and an unsupervised compressor to alleviate data imbalance by extracting key features. We also proposed weight constraints to cope with sample heterogeneity by giving different samples different voting weights during validation, and we used Path Signature to unravel meaningful developmental features from the two-time point data longitudinally. Extensive experiments have shown that our method performed well under practical scenarios, transcending existing machine learning methods.
Abstract:Brain structural MRI has been widely used to assess the future progression of cognitive impairment (CI). Previous learning-based studies usually suffer from the issue of small-sized labeled training data, while there exist a huge amount of structural MRIs in large-scale public databases. Intuitively, brain anatomical structures derived from these public MRIs (even without task-specific label information) can be used to boost CI progression trajectory prediction. However, previous studies seldom take advantage of such brain anatomy prior. To this end, this paper proposes a brain anatomy prior modeling (BAPM) framework to forecast the clinical progression of cognitive impairment with small-sized target MRIs by exploring anatomical brain structures. Specifically, the BAPM consists of a pretext model and a downstream model, with a shared brain anatomy-guided encoder to model brain anatomy prior explicitly. Besides the encoder, the pretext model also contains two decoders for two auxiliary tasks (i.e., MRI reconstruction and brain tissue segmentation), while the downstream model relies on a predictor for classification. The brain anatomy-guided encoder is pre-trained with the pretext model on 9,344 auxiliary MRIs without diagnostic labels for anatomy prior modeling. With this encoder frozen, the downstream model is then fine-tuned on limited target MRIs for prediction. We validate the BAPM on two CI-related studies with T1-weighted MRIs from 448 subjects. Experimental results suggest the effectiveness of BAPM in (1) four CI progression prediction tasks, (2) MR image reconstruction, and (3) brain tissue segmentation, compared with several state-of-the-art methods.
Abstract:Multi-view shape reconstruction has achieved impressive progresses thanks to the latest advances in neural implicit surface rendering. However, existing methods based on signed distance function (SDF) are limited to closed surfaces, failing to reconstruct a wide range of real-world objects that contain open-surface structures. In this work, we introduce a new neural rendering framework, coded NeUDF, that can reconstruct surfaces with arbitrary topologies solely from multi-view supervision. To gain the flexibility of representing arbitrary surfaces, NeUDF leverages the unsigned distance function (UDF) as surface representation. While a naive extension of an SDF-based neural renderer cannot scale to UDF, we propose two new formulations of weight function specially tailored for UDF-based volume rendering. Furthermore, to cope with open surface rendering, where the in/out test is no longer valid, we present a dedicated normal regularization strategy to resolve the surface orientation ambiguity. We extensively evaluate our method over a number of challenging datasets, including DTU}, MGN, and Deep Fashion 3D. Experimental results demonstrate that nEudf can significantly outperform the state-of-the-art method in the task of multi-view surface reconstruction, especially for complex shapes with open boundaries.
Abstract:While most recent autonomous driving system focuses on developing perception methods on ego-vehicle sensors, people tend to overlook an alternative approach to leverage intelligent roadside cameras to extend the perception ability beyond the visual range. We discover that the state-of-the-art vision-centric bird's eye view detection methods have inferior performances on roadside cameras. This is because these methods mainly focus on recovering the depth regarding the camera center, where the depth difference between the car and the ground quickly shrinks while the distance increases. In this paper, we propose a simple yet effective approach, dubbed BEVHeight, to address this issue. In essence, instead of predicting the pixel-wise depth, we regress the height to the ground to achieve a distance-agnostic formulation to ease the optimization process of camera-only perception methods. On popular 3D detection benchmarks of roadside cameras, our method surpasses all previous vision-centric methods by a significant margin. The code is available at {\url{https://github.com/ADLab-AutoDrive/BEVHeight}}.
Abstract:Extreme multi-label text classification utilizes the label hierarchy to partition extreme labels into multiple label groups, turning the task into simple multi-group multi-label classification tasks. Current research encodes labels as a vector with fixed length which needs establish multiple classifiers for different label groups. The problem is how to build only one classifier without sacrificing the label relationship in the hierarchy. This paper adopts the multi-answer questioning task for extreme multi-label classification. This paper also proposes an auxiliary classification evaluation metric. This study adopts the proposed method and the evaluation metric to the legal domain. The utilization of legal Berts and the study on task distribution are discussed. The experiment results show that the proposed hierarchy and multi-answer questioning task can do extreme multi-label classification for EURLEX dataset. And in minor/fine-tuning the multi-label classification task, the domain adapted BERT models could not show apparent advantages in this experiment. The method is also theoretically applicable to zero-shot learning.
Abstract:Previous works on emotion recognition in conversation (ERC) follow a two-step paradigm, which can be summarized as first producing context-independent features via fine-tuning pretrained language models (PLMs) and then analyzing contextual information and dialogue structure information among the extracted features. However, we discover that this paradigm has several limitations. Accordingly, we propose a novel paradigm, i.e., exploring contextual information and dialogue structure information in the fine-tuning step, and adapting the PLM to the ERC task in terms of input text, classification structure, and training strategy. Furthermore, we develop our model BERT-ERC according to the proposed paradigm, which improves ERC performance in three aspects, namely suggestive text, fine-grained classification module, and two-stage training. Compared to existing methods, BERT-ERC achieves substantial improvement on four datasets, indicating its effectiveness and generalization capability. Besides, we also set up the limited resources scenario and the online prediction scenario to approximate real-world scenarios. Extensive experiments demonstrate that the proposed paradigm significantly outperforms the previous one and can be adapted to various scenes.
Abstract:Deep operator network (DeepONet) has demonstrated great success in various learning tasks, including learning solution operators of partial differential equations. In particular, it provides an efficient approach to predict the evolution equations in a finite time horizon. Nevertheless, the vanilla DeepONet suffers from the issue of stability degradation in the long-time prediction. This paper proposes a {\em transfer-learning} aided DeepONet to enhance the stability. Our idea is to use transfer learning to sequentially update the DeepONets as the surrogates for propagators learned in different time frames. The evolving DeepONets can better track the varying complexities of the evolution equations, while only need to be updated by efficient training of a tiny fraction of the operator networks. Through systematic experiments, we show that the proposed method not only improves the long-time accuracy of DeepONet while maintaining similar computational cost but also substantially reduces the sample size of the training set.
Abstract:Conventional channel estimation (CE) for Internet of Things (IoT) systems encounters challenges such as low spectral efficiency, high energy consumption, and blocked propagation paths. Although superimposed pilot-based CE schemes and the reconfigurable intelligent surface (RIS) could partially tackle these challenges, limited researches have been done for a systematic solution. In this paper, a superimposed pilot-based CE with the reconfigurable intelligent surface (RIS)-assisted mode is proposed and further enhanced the performance by networks. Specifically, at the user equipment (UE), the pilot for CE is superimposed on the uplink user data to improve the spectral efficiency and energy consumption for IoT systems, and two lightweight networks at the base station (BS) alleviate the computational complexity and processing delay for the CE and symbol detection (SD). These dedicated networks are developed in a cooperation manner. That is, the conventional methods are employed to perform initial feature extraction, and the developed neural networks (NNs) are oriented to learn along with the extracted features. With the assistance of the extracted initial feature, the number of training data for network training is reduced. Simulation results show that, the computational complexity and processing delay are decreased without sacrificing the accuracy of CE and SD, and the normalized mean square error (NMSE) and bit error rate (BER) performance at the BS are improved against the parameter variance.
Abstract:Reconfigurable antennas (RAs) are a promising technology to enhance the capacity and coverage of wireless communication systems. However, RA systems have two major challenges: (i) High computational complexity of mode selection, and (ii) High overhead of channel estimation for all modes. In this paper, we develop a low-complexity iterative mode selection algorithm for data transmission in an RA-MIMO system. Furthermore, we study channel estimation of an RA multi-user MIMO system. However, given the coherence time, it is challenging to estimate channels of all modes. We propose a mode selection scheme to select a subset of modes, train channels for the selected subset, and predict channels for the remaining modes. In addition, we propose a prediction scheme based on pattern correlation between modes. Representative simulation results demonstrate the system's channel estimation error and achievable sum-rate for various selected modes and different signal-to-noise ratios (SNRs).
Abstract:MIMO technology has enabled spatial multiple access and has provided a higher system spectral efficiency (SE). However, this technology has some drawbacks, such as the high number of RF chains that increases complexity in the system. One of the solutions to this problem can be to employ reconfigurable antennas (RAs) that can support different radiation patterns during transmission to provide similar performance with fewer RF chains. In this regard, the system aims to maximize the SE with respect to optimum beamforming design and RA mode selection. Due to the non-convexity of this problem, we propose machine learning-based methods for RA antenna mode selection in both dynamic and static scenarios. In the static scenario, we present how to solve the RA mode selection problem, an integer optimization problem in nature, via deep convolutional neural networks (DCNN). A Multi-Armed-bandit (MAB) consisting of offline and online training is employed for the dynamic RA state selection. For the proposed MAB, the computational complexity of the optimization problem is reduced. Finally, the proposed methods in both dynamic and static scenarios are compared with exhaustive search and random selection methods.