Zhenyu Zhou

Accutar Biotechnology

A Geometric Perspective on Diffusion Models

May 31, 2023
Defang Chen, Zhenyu Zhou, Jian-Ping Mei, Chunhua Shen, Chun Chen, Can Wang

Recent years have witnessed significant progress in developing efficient training and fast sampling approaches for diffusion models. A remarkable recent advancement is the use of stochastic differential equations (SDEs) to describe data perturbation and generative modeling in a unified mathematical framework. In this paper, we reveal several intriguing geometric structures of diffusion models and contribute a simple yet powerful interpretation of their sampling dynamics. By carefully inspecting a popular variance-exploding SDE and its marginal-preserving ordinary differential equation (ODE) for sampling, we discover that the data distribution and the noise distribution are smoothly connected by an explicit, quasi-linear sampling trajectory and by another, implicit denoising trajectory, which converges even faster in terms of visual quality. We also establish a theoretical relationship between optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm, with which we can characterize the asymptotic behavior of diffusion models and identify the score deviation. These new geometric observations enable us to improve previous sampling algorithms, re-examine latent interpolation, and re-explain the working principles of distillation-based fast sampling techniques.
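
For readers unfamiliar with the mean-shift algorithm mentioned in the abstract, the following is a minimal sketch of the classic Gaussian-kernel mean-shift iteration. It is not the paper's sampler: the function names, the toy `data` array, and the bandwidth `sigma` are assumptions chosen purely for illustration.

```python
import numpy as np

def mean_shift_step(x, data, sigma):
    """One Gaussian-kernel mean-shift update of a single query point x."""
    sq_dist = np.sum((data - x) ** 2, axis=1)
    logits = -sq_dist / (2.0 * sigma ** 2)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # The new point is the kernel-weighted average of the data points.
    return weights @ data

def mean_shift(x, data, sigma, steps=50):
    """Repeat the update so that x drifts toward a nearby mode."""
    for _ in range(steps):
        x = mean_shift_step(x, data, sigma)
    return x

# Hypothetical toy data: two tight clusters in the plane.
data = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
mode = mean_shift(np.array([0.5, 0.5]), data, sigma=0.5)
```

Each update replaces the query point with a softmax-weighted average of the data, so a point started near the first cluster settles between its two members rather than being pulled toward the distant cluster.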

HDR Video Reconstruction with a Large Dynamic Dataset in Raw and sRGB Domains

Apr 12, 2023
Huanjing Yue, Yubo Peng, Biting Yu, Xuanwu Yin, Zhenyu Zhou, Jingyu Yang

High dynamic range (HDR) video reconstruction is attracting increasing attention because of its superior visual quality compared with low dynamic range (LDR) video. The availability of LDR-HDR training pairs is essential for HDR reconstruction quality. However, there are still no real LDR-HDR pairs for dynamic scenes, owing to the difficulty of capturing LDR and HDR frames simultaneously. In this work, we propose to use a staggered sensor to capture two alternate-exposure images simultaneously, which are then fused into an HDR frame in both the raw and sRGB domains. In this way, we build a large-scale LDR-HDR video dataset with 85 scenes, each containing 60 frames. Based on this dataset, we further propose Raw-HDRNet, which takes raw LDR frames as inputs. We propose a pyramid flow-guided deformation convolution to align neighboring frames. Experimental results demonstrate that 1) the proposed dataset improves HDR reconstruction performance on real scenes for three benchmark networks; and 2) compared with sRGB inputs, raw inputs further improve reconstruction quality, and our proposed Raw-HDRNet is a strong baseline for raw HDR reconstruction. Our dataset and code will be released after the acceptance of this paper.

BEVFusion4D: Learning LiDAR-Camera Fusion Under Bird's-Eye-View via Cross-Modality Guidance and Temporal Aggregation

Mar 30, 2023
Hongxiang Cai, Zeyuan Zhang, Zhenyu Zhou, Ziyin Li, Wenbo Ding, Jiuhua Zhao

Integrating LiDAR and camera information into Bird's-Eye-View (BEV) has become an essential topic for 3D object detection in autonomous driving. Existing methods mostly adopt an independent dual-branch framework to generate LiDAR and camera BEV, then perform an adaptive modality fusion. Since point clouds provide more accurate localization and geometry information, they can serve as a reliable spatial prior for acquiring relevant semantic information from the images. Therefore, we design a LiDAR-Guided View Transformer (LGVT) to effectively obtain the camera representation in BEV space and thus benefit the whole dual-branch fusion system. LGVT takes camera BEV as the primitive semantic query and repeatedly leverages the spatial cue of LiDAR BEV to extract image features across multiple camera views. Moreover, we extend our framework into the temporal domain with our proposed Temporal Deformable Alignment (TDA) module, which aggregates BEV features from multiple historical frames. With these two modules, our framework, dubbed BEVFusion4D, achieves state-of-the-art results in 3D object detection, with 72.0% mAP and 73.5% NDS on the nuScenes validation set, and 73.3% mAP and 74.7% NDS on the nuScenes test set.

* 13 pages, 7 figures 

Two-timescale Resource Allocation for Automated Networks in IIoT

Mar 24, 2022
Yanhua He, Yun Ren, Zhenyu Zhou, Shahid Mumtaz, Saba Al-Rubaye, Antonios Tsourdos, Octavia A. Dobre

The rapid advances in cellular technologies will revolutionize network automation in the industrial internet of things (IIoT). In this paper, we investigate the two-timescale resource allocation problem in IIoT networks with hybrid energy supply, where the temporal variations of energy harvesting (EH), electricity price, channel state, and data arrival exhibit different granularities. The formulated problem consists of energy management at a large timescale, as well as rate control, channel selection, and power allocation at a small timescale. To address this challenge, we develop an online solution that guarantees bounded performance deviation with only causal information. Specifically, Lyapunov optimization is leveraged to transform the long-term stochastic optimization problem into a series of short-term deterministic optimization problems. Then, a low-complexity rate control algorithm is developed based on the alternating direction method of multipliers (ADMM), which accelerates convergence via the decomposition-coordination approach. Next, the joint channel selection and power allocation problem is transformed into a one-to-many matching problem and solved by the proposed price-based matching with quota restriction. Finally, the proposed algorithm is verified through simulations under various system configurations.
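
The step of turning a long-term stochastic problem into per-slot deterministic ones can be illustrated with a textbook drift-plus-penalty rule on a single queue. This is only a sketch under stated assumptions and not the paper's formulation: the quadratic cost, the service levels in `actions`, the price range, the arrival process, and the trade-off weight `V` are all hypothetical.

```python
import random

def drift_plus_penalty_action(queue, price, actions, V):
    """Pick the service level b that minimizes V * cost - queue * b in this slot."""
    return min(actions, key=lambda b: V * price * b ** 2 - queue * b)

def simulate(slots=2000, V=10.0, seed=0):
    rng = random.Random(seed)
    actions = [0, 1, 2, 3]  # hypothetical service levels (units served per slot)
    queue, total_cost, backlog = 0.0, 0.0, []
    for _ in range(slots):
        # Only the current price is observed, so the decision is causal.
        price = rng.uniform(0.5, 1.5)
        b = drift_plus_penalty_action(queue, price, actions, V)
        total_cost += price * b ** 2
        arrivals = rng.choice([0, 1, 2])
        # Standard queue update: serve first, then admit new arrivals.
        queue = max(queue - b, 0.0) + arrivals
        backlog.append(queue)
    return total_cost / slots, backlog

avg_cost, backlog = simulate()
```

In this toy setup the rule always picks the highest service level once the backlog exceeds 5 * V * 1.5 = 75, so the queue never grows beyond 77. A larger `V` weights cost more heavily and tolerates a longer queue, which is the usual trade-off in this family of methods.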

Molecular modeling with machine-learned universal potential functions

Mar 06, 2021
Ke Liu, Zekun Ni, Zhenyu Zhou, Suocheng Tan, Xun Zou, Haoming Xing, Xiangyan Sun, Qi Han, Junqiu Wu, Jie Fan

Molecular modeling is an important topic in drug discovery. Decades of research have led to the development of high-quality, scalable molecular force fields. In this paper, we show that neural networks can be used to train a universal approximator for energy potential functions. By incorporating a fully automated training process, we have been able to train smooth, differentiable, and predictive potential functions on large-scale crystal structures. A variety of tests have also been performed to show the superiority and versatility of the machine-learned model.

Boosting Gradient for White-Box Adversarial Attacks

Oct 21, 2020
Hongying Liu, Zhenyu Zhou, Fanhua Shang, Xiaoyu Qi, Yuanyuan Liu, Licheng Jiao

Deep neural networks (DNNs) play key roles in various artificial intelligence applications such as image classification and object recognition. However, a growing number of studies have shown that DNNs are vulnerable to adversarial examples, which differ almost imperceptibly from the original samples but can greatly change the network output. Existing white-box attack algorithms can generate powerful adversarial examples. Nevertheless, most of these algorithms concentrate on how to iteratively make the best use of gradients to improve adversarial performance. In contrast, in this paper we focus on the properties of the widely used ReLU activation function and discover two phenomena (i.e., wrong blocking and over transmission) that mislead the calculation of gradients in ReLU during backpropagation. Both issues enlarge the difference between the change in the loss function predicted from the gradient and the corresponding actual change, and they mislead the gradients, which results in larger perturbations. Therefore, we propose a universal adversarial example generation method, called ADV-ReLU, to enhance the performance of gradient-based white-box attack algorithms. During backpropagation, our approach calculates the gradient of the loss function with respect to the network input, maps the values to scores, and selects a part of them to update the misleading gradients. Comprehensive experimental results on ImageNet demonstrate that our ADV-ReLU can be easily integrated into many state-of-the-art gradient-based white-box attack algorithms, as well as transferred to black-box attackers, to further decrease perturbations in the $\ell_2$-norm.
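
The gap between the gradient-predicted change and the actual change can be seen on a single ReLU unit. The sketch below is only an illustration of how such a gap can arise, not the ADV-ReLU method and not necessarily the paper's definition of wrong blocking; the weight `w`, bias `b`, input `x`, and step `delta` are hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def unit(x, w=1.0, b=-0.1):
    """A single ReLU unit applied to a scalar input."""
    return relu(w * x + b)

def unit_grad(x, w=1.0, b=-0.1):
    """Derivative of the unit with respect to x: w if the unit is active, else 0."""
    return w if (w * x + b) > 0 else 0.0

x, delta = 0.0, 0.5
# First-order prediction: the unit is inactive at x, so the gradient is zero.
predicted = unit_grad(x) * delta
# Actual change: the step pushes the pre-activation above zero.
actual = unit(x + delta) - unit(x)
gap = actual - predicted
```

Here the gradient at `x` is blocked by the inactive unit, so the first-order estimate predicts no change even though the finite step does change the output.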

* 9 pages, 6 figures

Prediction of amino acid side chain conformation using a deep neural network

Jul 26, 2017
Ke Liu, Xiangyan Sun, Jun Ma, Zhenyu Zhou, Qilin Dong, Shengwen Peng, Junqiu Wu, Suocheng Tan, Günter Blobel, Jie Fan

A deep neural network-based architecture was constructed to predict amino acid side chain conformation with unprecedented accuracy. Amino acid side chain conformation prediction is essential for protein homology modeling and protein design. Current widely adopted methods use physics-based energy functions to evaluate side chain conformation. Here, using a deep neural network architecture without physics-based assumptions, we demonstrate that side chain conformation prediction accuracy can be improved by more than 25% compared with current standard methods, especially for aromatic residues. More strikingly, the prediction method presented here is robust enough to identify individual conformational outliers from high-resolution structures in a protein data bank without providing its structural factors. We envisage that our amino acid side chain predictor could be used as a quality check step for future protein structure model validation and in many other potential applications, such as side chain assignment in cryo-electron microscopy, crystallography model auto-building, protein folding, and small-molecule ligand docking.
