Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yi Li

Victor

Cross-Temporal Spectrogram Autoencoder (CTSAE): Unsupervised Dimensionality Reduction for Clustering Gravitational Wave Glitches

Apr 23, 2024

Yi Li, Yunan Wu, Aggelos K. Katsaggelos

Figure 1 for Cross-Temporal Spectrogram Autoencoder (CTSAE): Unsupervised Dimensionality Reduction for Clustering Gravitational Wave Glitches

Figure 2 for Cross-Temporal Spectrogram Autoencoder (CTSAE): Unsupervised Dimensionality Reduction for Clustering Gravitational Wave Glitches

Figure 3 for Cross-Temporal Spectrogram Autoencoder (CTSAE): Unsupervised Dimensionality Reduction for Clustering Gravitational Wave Glitches

Figure 4 for Cross-Temporal Spectrogram Autoencoder (CTSAE): Unsupervised Dimensionality Reduction for Clustering Gravitational Wave Glitches

Abstract:The advancement of The Laser Interferometer Gravitational-Wave Observatory (LIGO) has significantly enhanced the feasibility and reliability of gravitational wave detection. However, LIGO's high sensitivity makes it susceptible to transient noises known as glitches, which necessitate effective differentiation from real gravitational wave signals. Traditional approaches predominantly employ fully supervised or semi-supervised algorithms for the task of glitch classification and clustering. In the future task of identifying and classifying glitches across main and auxiliary channels, it is impractical to build a dataset with manually labeled ground-truth. In addition, the patterns of glitches can vary with time, generating new glitches without manual labels. In response to this challenge, we introduce the Cross-Temporal Spectrogram Autoencoder (CTSAE), a pioneering unsupervised method for the dimensionality reduction and clustering of gravitational wave glitches. CTSAE integrates a novel four-branch autoencoder with a hybrid of Convolutional Neural Networks (CNN) and Vision Transformers (ViT). To further extract features across multi-branches, we introduce a novel multi-branch fusion method using the CLS (Class) token. Our model, trained and evaluated on the GravitySpy O3 dataset on the main channel, demonstrates superior performance in clustering tasks when compared to state-of-the-art semi-supervised learning methods. To the best of our knowledge, CTSAE represents the first unsupervised approach tailored specifically for clustering LIGO data, marking a significant step forward in the field of gravitational wave research. The code of this paper is available at https://github.com/Zod-L/CTSAE

Via

Access Paper or Ask Questions

Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model

Apr 08, 2024

Jichang Yang, Hegan Chen, Jia Chen, Songqi Wang, Shaocong Wang, Yifei Yu, Xi Chen, Bo Wang, Xinyuan Zhang, Binbin Cui(+12 more)

Figure 1 for Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model

Figure 2 for Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model

Figure 3 for Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model

Figure 4 for Resistive Memory-based Neural Differential Equation Solver for Score-based Diffusion Model

Abstract:Human brains image complicated scenes when reading a novel. Replicating this imagination is one of the ultimate goals of AI-Generated Content (AIGC). However, current AIGC methods, such as score-based diffusion, are still deficient in terms of rapidity and efficiency. This deficiency is rooted in the difference between the brain and digital computers. Digital computers have physically separated storage and processing units, resulting in frequent data transfers during iterative calculations, incurring large time and energy overheads. This issue is further intensified by the conversion of inherently continuous and analog generation dynamics, which can be formulated by neural differential equations, into discrete and digital operations. Inspired by the brain, we propose a time-continuous and analog in-memory neural differential equation solver for score-based diffusion, employing emerging resistive memory. The integration of storage and computation within resistive memory synapses surmount the von Neumann bottleneck, benefiting the generative speed and energy efficiency. The closed-loop feedback integrator is time-continuous, analog, and compact, physically implementing an infinite-depth neural network. Moreover, the software-hardware co-design is intrinsically robust to analog noise. We experimentally validate our solution with 180 nm resistive memory in-memory computing macros. Demonstrating equivalent generative quality to the software baseline, our system achieved remarkable enhancements in generative speed for both unconditional and conditional generation tasks, by factors of 64.8 and 156.5, respectively. Moreover, it accomplished reductions in energy consumption by factors of 5.2 and 4.1. Our approach heralds a new horizon for hardware solutions in edge computing for generative AI applications.

Via

Access Paper or Ask Questions

RoNet: Rotation-oriented Continuous Image Translation

Apr 06, 2024

Yi Li, Xin Xie, Lina Lei, Haiyan Fu, Yanqing Guo

Abstract:The generation of smooth and continuous images between domains has recently drawn much attention in image-to-image (I2I) translation. Linear relationship acts as the basic assumption in most existing approaches, while applied to different aspects including features, models or labels. However, the linear assumption is hard to conform with the element dimension increases and suffers from the limit that having to obtain both ends of the line. In this paper, we propose a novel rotation-oriented solution and model the continuous generation with an in-plane rotation over the style representation of an image, achieving a network named RoNet. A rotation module is implanted in the generation network to automatically learn the proper plane while disentangling the content and the style of an image. To encourage realistic texture, we also design a patch-based semantic style loss that learns the different styles of the similar object in different domains. We conduct experiments on forest scenes (where the complex texture makes the generation very challenging), faces, streetscapes and the iphone2dslr task. The results validate the superiority of our method in terms of visual quality and continuity.

* 14 pages

Via

Access Paper or Ask Questions

Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer

Apr 05, 2024

Toru Shirakawa, Yi Li, Yulun Wu, Sky Qiu, Yuxuan Li, Mingduo Zhao, Hiroyasu Iso, Mark van der Laan

Figure 1 for Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer

Figure 2 for Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer

Figure 3 for Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer

Figure 4 for Longitudinal Targeted Minimum Loss-based Estimation with Temporal-Difference Heterogeneous Transformer

Abstract:We propose Deep Longitudinal Targeted Minimum Loss-based Estimation (Deep LTMLE), a novel approach to estimate the counterfactual mean of outcome under dynamic treatment policies in longitudinal problem settings. Our approach utilizes a transformer architecture with heterogeneous type embedding trained using temporal-difference learning. After obtaining an initial estimate using the transformer, following the targeted minimum loss-based likelihood estimation (TMLE) framework, we statistically corrected for the bias commonly associated with machine learning algorithms. Furthermore, our method also facilitates statistical inference by enabling the provision of 95% confidence intervals grounded in asymptotic statistical theory. Simulation results demonstrate our method's superior performance over existing approaches, particularly in complex, long time-horizon scenarios. It remains effective in small-sample, short-duration contexts, matching the performance of asymptotically efficient estimators. To demonstrate our method in practice, we applied our method to estimate counterfactual mean outcomes for standard versus intensive blood pressure management strategies in a real-world cardiovascular epidemiology cohort study.

Via

Access Paper or Ask Questions

GT-Rain Single Image Deraining Challenge Report

Mar 18, 2024

Howard Zhang, Yunhao Ba, Ethan Yang, Rishi Upadhyay, Alex Wong, Achuta Kadambi, Yun Guo, Xueyao Xiao, Xiaoxiong Wang, Yi Li(+13 more)

Figure 1 for GT-Rain Single Image Deraining Challenge Report

Figure 2 for GT-Rain Single Image Deraining Challenge Report

Figure 3 for GT-Rain Single Image Deraining Challenge Report

Abstract:This report reviews the results of the GT-Rain challenge on single image deraining at the UG2+ workshop at CVPR 2023. The aim of this competition is to study the rainy weather phenomenon in real world scenarios, provide a novel real world rainy image dataset, and to spark innovative ideas that will further the development of single image deraining methods on real images. Submissions were trained on the GT-Rain dataset and evaluated on an extension of the dataset consisting of 15 additional scenes. Scenes in GT-Rain are comprised of real rainy image and ground truth image captured moments after the rain had stopped. 275 participants were registered in the challenge and 55 competed in the final testing phase.

Via

Access Paper or Ask Questions

AllSpark: Reborn Labeled Features from Unlabeled in Transformer for Semi-Supervised Semantic Segmentation

Mar 14, 2024

Haonan Wang, Qixiang Zhang, Yi Li, Xiaomeng Li

Abstract:Semi-supervised semantic segmentation (SSSS) has been proposed to alleviate the burden of time-consuming pixel-level manual labeling, which leverages limited labeled data along with larger amounts of unlabeled data. Current state-of-the-art methods train the labeled data with ground truths and unlabeled data with pseudo labels. However, the two training flows are separate, which allows labeled data to dominate the training process, resulting in low-quality pseudo labels and, consequently, sub-optimal results. To alleviate this issue, we present AllSpark, which reborns the labeled features from unlabeled ones with the channel-wise cross-attention mechanism. We further introduce a Semantic Memory along with a Channel Semantic Grouping strategy to ensure that unlabeled features adequately represent labeled features. The AllSpark shed new light on the architecture level designs of SSSS rather than framework level, which avoids increasingly complicated training pipeline designs. It can also be regarded as a flexible bottleneck module that can be seamlessly integrated into a general transformer-based segmentation model. The proposed AllSpark outperforms existing methods across all evaluation protocols on Pascal, Cityscapes and COCO benchmarks without bells-and-whistles. Code and model weights are available at: https://github.com/xmed-lab/AllSpark.

* Accepted by CVPR 2024; correct typos; this is not the camera-ready version

Via

Access Paper or Ask Questions

Deep Separable Spatiotemporal Learning for Fast Dynamic Cardiac MRI

Feb 24, 2024

Zi Wang, Min Xiao, Yirong Zhou, Chengyan Wang, Naiming Wu, Yi Li, Yiwen Gong, Shufu Chang, Yinyin Chen, Liuhong Zhu(+6 more)

Figure 1 for Deep Separable Spatiotemporal Learning for Fast Dynamic Cardiac MRI

Figure 2 for Deep Separable Spatiotemporal Learning for Fast Dynamic Cardiac MRI

Figure 3 for Deep Separable Spatiotemporal Learning for Fast Dynamic Cardiac MRI

Figure 4 for Deep Separable Spatiotemporal Learning for Fast Dynamic Cardiac MRI

Abstract:Dynamic magnetic resonance imaging (MRI) plays an indispensable role in cardiac diagnosis. To enable fast imaging, the k-space data can be undersampled but the image reconstruction poses a great challenge of high-dimensional processing. This challenge leads to necessitate extensive training data in many deep learning reconstruction methods. This work proposes a novel and efficient approach, leveraging a dimension-reduced separable learning scheme that excels even with highly limited training data. We further integrate it with spatiotemporal priors to develop a Deep Separable Spatiotemporal Learning network (DeepSSL), which unrolls an iteration process of a reconstruction model with both temporal low-rankness and spatial sparsity. Intermediate outputs are visualized to provide insights into the network's behavior and enhance its interpretability. Extensive results on cardiac cine datasets show that the proposed DeepSSL is superior to the state-of-the-art methods visually and quantitatively, while reducing the demand for training cases by up to 75%. And its preliminary adaptability to cardiac patients has been verified through experienced radiologists' and cardiologists' blind reader study. Additionally, DeepSSL also benefits for achieving the downstream task of cardiac segmentation with higher accuracy and shows robustness in prospective real-time cardiac MRI.

* 10 pages, 11 figures, 3 tables

Via

Access Paper or Ask Questions

Spin: An Efficient Secure Computation Framework with GPU Acceleration

Feb 04, 2024

Wuxuan Jiang, Xiangjun Song, Shenbai Hong, Haijun Zhang, Wenxin Liu, Bo Zhao, Wei Xu, Yi Li

Figure 1 for Spin: An Efficient Secure Computation Framework with GPU Acceleration

Figure 2 for Spin: An Efficient Secure Computation Framework with GPU Acceleration

Figure 3 for Spin: An Efficient Secure Computation Framework with GPU Acceleration

Figure 4 for Spin: An Efficient Secure Computation Framework with GPU Acceleration

Abstract:Accuracy and efficiency remain challenges for multi-party computation (MPC) frameworks. Spin is a GPU-accelerated MPC framework that supports multiple computation parties and a dishonest majority adversarial setup. We propose optimized protocols for non-linear functions that are critical for machine learning, as well as several novel optimizations specific to attention that is the fundamental unit of Transformer models, allowing Spin to perform non-trivial CNNs training and Transformer inference without sacrificing security. At the backend level, Spin leverages GPU, CPU, and RDMA-enabled smart network cards for acceleration. Comprehensive evaluations demonstrate that Spin can be up to $2\times$ faster than the state-of-the-art for deep neural network training. For inference on a Transformer model with 18.9 million parameters, our attention-specific optimizations enable Spin to achieve better efficiency, less communication, and better accuracy.

Via

Access Paper or Ask Questions

Exploiting Hierarchical Interactions for Protein Surface Learning

Jan 17, 2024

Yiqun Lin, Liang Pan, Yi Li, Ziwei Liu, Xiaomeng Li

Abstract:Predicting interactions between proteins is one of the most important yet challenging problems in structural bioinformatics. Intrinsically, potential function sites in protein surfaces are determined by both geometric and chemical features. However, existing works only consider handcrafted or individually learned chemical features from the atom type and extract geometric features independently. Here, we identify two key properties of effective protein surface learning: 1) relationship among atoms: atoms are linked with each other by covalent bonds to form biomolecules instead of appearing alone, leading to the significance of modeling the relationship among atoms in chemical feature learning. 2) hierarchical feature interaction: the neighboring residue effect validates the significance of hierarchical feature interaction among atoms and between surface points and atoms (or residues). In this paper, we present a principled framework based on deep learning techniques, namely Hierarchical Chemical and Geometric Feature Interaction Network (HCGNet), for protein surface analysis by bridging chemical and geometric features with hierarchical interactions. Extensive experiments demonstrate that our method outperforms the prior state-of-the-art method by 2.3% in site prediction task and 3.2% in interaction matching task, respectively. Our code is available at https://github.com/xmed-lab/HCGNet.

* Accepted to J-BHI

Via

Access Paper or Ask Questions

Digital twin-assisted three-dimensional electrical capacitance tomography for multiphase flow imaging

Dec 22, 2023

Shengnan Wang, Yi Li, Zhou Chen, Yunjie Yang

Abstract:Three-dimensional electrical capacitance tomography (3D-ECT) has shown promise for visualizing industrial multiphase flows. However, existing 3D-ECT approaches suffer from limited imaging resolution and lack assessment metrics, hampering their effectiveness in quantitative multiphase flow imaging. This paper presents a digital twin (DT)-assisted 3D-ECT, aiming to overcome these limitations and enhance our understanding of multiphase flow dynamics. The DT framework incorporates a 3D fluid-electrostatic field coupling model (3D-FECM) that digitally represents the physical 3D-ECT system, which enables us to simulate real multiphase flows and generate a comprehensive virtual multiphase flow 3D imaging dataset. Additionally, the framework includes a deep neural network named 3D deep back projection (3D-DBP), which learns multiphase flow features from the virtual dataset and enables more accurate 3D flow imaging in the real world. The DT-assisted 3D-ECT was validated through virtual and physical experiments, demonstrating superior image quality, noise robustness and computational efficiency compared to conventional 3D-ECT approaches. This research contributes to developing accurate and reliable 3D-ECT techniques and their implementation in multiphase flow systems across various industries.

Via

Access Paper or Ask Questions