Shifeng Zhang

SA-Solver: Stochastic Adams Solver for Fast Sampling of Diffusion Models

Sep 10, 2023
Shuchen Xue, Mingyang Yi, Weijian Luo, Shifeng Zhang, Jiacheng Sun, Zhenguo Li, Zhi-Ming Ma

Diffusion Probabilistic Models (DPMs) have achieved considerable success in generation tasks. Since sampling from DPMs amounts to solving a diffusion SDE or ODE, which is time-consuming, numerous fast sampling methods built upon improved differential equation solvers have been proposed. Most of these techniques solve the diffusion ODE because of its superior efficiency. However, stochastic sampling can offer additional advantages in generating diverse and high-quality data. In this work, we present a comprehensive analysis of stochastic sampling from two aspects: variance-controlled diffusion SDEs and linear multi-step SDE solvers. Based on this analysis, we propose SA-Solver, an improved and efficient stochastic Adams method for solving diffusion SDEs to generate data of high quality. Our experiments show that SA-Solver achieves: 1) improved or comparable performance compared with existing state-of-the-art sampling methods for few-step sampling; 2) SOTA FID scores on substantial benchmark datasets under a suitable number of function evaluations (NFEs).
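
Since the abstract centers on a stochastic linear multi-step (Adams-type) solver, a minimal sketch of that idea follows. This is not the SA-Solver update rule from the paper: the drift parametrisation, the eps_model placeholder, the Adams-Bashforth coefficients for uniformly spaced steps, and the tau-scaled noise injection are simplifying assumptions chosen only to illustrate how past model evaluations are reused across steps.

import numpy as np

def eps_model(x, sigma):
    # Hypothetical noise-prediction network; a dummy stand-in here.
    return x / (1.0 + sigma ** 2) ** 0.5

def adams_bashforth_coeffs(order):
    # Classical Adams-Bashforth coefficients for uniformly spaced steps (orders 1-3).
    return {1: [1.0], 2: [1.5, -0.5], 3: [23 / 12, -16 / 12, 5 / 12]}[order]

def stochastic_multistep_sample(x, sigmas, tau=1.0, order=3, seed=0):
    rng = np.random.default_rng(seed)
    history = []  # buffer of past drift evaluations, newest first
    for i in range(len(sigmas) - 1):
        s, s_next = sigmas[i], sigmas[i + 1]
        history = ([eps_model(x, s)] + history)[:order]
        coeffs = adams_bashforth_coeffs(len(history))
        drift = sum(c * h for c, h in zip(coeffs, history))
        x = x + (s_next - s) * drift                         # deterministic multi-step part
        noise_scale = tau * np.sqrt(max(s ** 2 - s_next ** 2, 0.0))
        x = x + noise_scale * rng.standard_normal(x.shape)   # stochastic injection
    return x

x0 = np.random.default_rng(1).standard_normal((4, 2))
sample = stochastic_multistep_sample(x0, sigmas=np.linspace(10.0, 0.01, 20))

Reusing the buffered evaluations is what lets a multi-step method reach higher order without extra network calls per step, which is the efficiency argument behind Adams-type solvers.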

Diff-Instruct: A Universal Approach for Transferring Knowledge From Pre-trained Diffusion Models

May 29, 2023
Weijian Luo, Tianyang Hu, Shifeng Zhang, Jiacheng Sun, Zhenguo Li, Zhihua Zhang

Due to the ease of training, ability to scale, and high sample quality, diffusion models (DMs) have become the preferred option for generative modeling, with numerous pre-trained models available for a wide variety of datasets. Containing intricate information about data distributions, pre-trained DMs are valuable assets for downstream applications. In this work, we consider learning from pre-trained DMs and transferring their knowledge to other generative models in a data-free fashion. Specifically, we propose a general framework called Diff-Instruct to instruct the training of arbitrary generative models as long as the generated samples are differentiable with respect to the model parameters. Diff-Instruct is built on a rigorous mathematical foundation in which the instruction process directly corresponds to minimizing a novel divergence we call the Integral Kullback-Leibler (IKL) divergence. IKL is tailored for DMs by calculating the integral of the KL divergence along a diffusion process, which we show to be more robust in comparing distributions with misaligned supports. We also reveal non-trivial connections of our method to existing works such as DreamFusion and generative adversarial training. To demonstrate the effectiveness and universality of Diff-Instruct, we consider two scenarios: distilling pre-trained diffusion models and refining existing GAN models. The experiments on distilling pre-trained diffusion models show that Diff-Instruct results in state-of-the-art single-step diffusion-based models. The experiments on refining GAN models show that Diff-Instruct can consistently improve the pre-trained generators of GAN models across various settings.
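
For reference, the Integral Kullback-Leibler divergence described in the abstract integrates a KL term along the forward diffusion; a schematic form, with the weighting $w(t)$ and the parametrisation left unspecified (the paper's exact choices are not reproduced here), is

$\mathrm{IKL}(q \,\|\, p) = \int_{0}^{T} w(t)\, \mathrm{KL}\!\left(q_{t} \,\|\, p_{t}\right) \mathrm{d}t,$

where $q_t$ and $p_t$ denote the marginal distributions obtained by diffusing samples from the student generator and from the distribution captured by the pre-trained DM up to time $t$. Averaging the comparison over noise levels is what makes the divergence informative even when the two distributions have misaligned supports at $t = 0$.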

PILC: Practical Image Lossless Compression with an End-to-end GPU Oriented Neural Framework

Jun 10, 2022
Ning Kang, Shanzhao Qiu, Shifeng Zhang, Zhenguo Li, Shutao Xia

Generative-model-based lossless image compression algorithms have seen great success in improving compression ratio. However, the throughput of most of them is below 1 MB/s even on the most advanced AI-accelerated chips, which keeps them out of most real-world applications that often require around 100 MB/s. In this paper, we propose PILC, an end-to-end image lossless compression framework that achieves 200 MB/s for both compression and decompression on a single NVIDIA Tesla V100 GPU, 10 times faster than the most efficient prior method. To obtain this result, we first develop an AI codec that combines an autoregressive model with a VQ-VAE and performs well in a lightweight setting, and we then design a low-complexity entropy coder that works well with our codec. Experiments show that our framework compresses better than PNG by a margin of 30% on multiple datasets. We believe this is an important step toward bringing AI compression into commercial use.
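
As a structural illustration of the codec-plus-entropy-coder pairing the abstract describes, the sketch below computes the ideal code length implied by a learned per-symbol probability model. It is not the PILC implementation: placeholder_model_probs is a hypothetical stand-in for the autoregressive/VQ-VAE codec, and the entropy coder is replaced by its theoretical $-\log_2 p$ cost.

import numpy as np

def placeholder_model_probs(pixels):
    # Hypothetical predictive distributions over 256 byte values per symbol.
    return np.full((len(pixels), 256), 1.0 / 256)

def ideal_code_length_bits(pixels, probs):
    # An entropy coder approaches -log2 p(symbol) bits per symbol, so this
    # is the (near-)achievable compressed size for the given model.
    p = probs[np.arange(len(pixels)), pixels]
    return float(-np.log2(p).sum())

pixels = np.random.default_rng(0).integers(0, 256, size=1024)
bits = ideal_code_length_bits(pixels, placeholder_model_probs(pixels))
print(f"raw: {len(pixels)} bytes, ideal compressed: {bits / 8:.1f} bytes")

The better the model predicts each symbol, the smaller this bound; the engineering contribution of a system like PILC is making both the model and the coder fast enough to realise it at hundreds of MB/s.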

Split Hierarchical Variational Compression

Apr 05, 2022
Tom Ryder, Chen Zhang, Ning Kang, Shifeng Zhang

Variational autoencoders (VAEs) have seen great success in compressing image datasets. This success, made possible by the bits-back coding framework, has produced competitive compression performance across many benchmarks. Despite this, however, VAE architectures are currently limited by a combination of coding practicalities and compression ratios: not only do state-of-the-art methods such as normalizing flows often outperform them, but the initial bits required for coding make single-image and parallel compression challenging. To remedy this, we introduce Split Hierarchical Variational Compression (SHVC). SHVC introduces two novelties. Firstly, we propose an efficient autoregressive prior, the autoregressive sub-pixel convolution, which generalises between per-pixel autoregression and fully factorised probability models. Secondly, we define our coding framework, the autoregressive initial bits, which flexibly supports parallel coding and avoids -- for the first time -- many of the practicalities commonly associated with bits-back coding. In our experiments, we demonstrate that SHVC achieves state-of-the-art compression performance on full-resolution lossless image compression tasks, with up to 100x fewer model parameters than competing VAE approaches.
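
For background on the bits-back framework the abstract builds on (a standard result rather than anything SHVC-specific), the net rate of a VAE-based lossless codec is

$L_{\text{bits-back}}(x) = \mathbb{E}_{q(z \mid x)}\big[ -\log_2 p(x \mid z) - \log_2 p(z) + \log_2 q(z \mid x) \big],$

i.e. the negative ELBO expressed in bits. The $+\log_2 q(z \mid x)$ term is the "bits back": those bits are recovered by decoding $z$ from previously encoded content, which is why a supply of initial bits is needed before the first image and why single-image and parallel coding are awkward in the standard scheme -- the practicalities that SHVC's autoregressive initial bits are designed to avoid.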

Memory Replay with Data Compression for Continual Learning

Mar 09, 2022
Liyuan Wang, Xingxing Zhang, Kuo Yang, Longhui Yu, Chongxuan Li, Lanqing Hong, Shifeng Zhang, Zhenguo Li, Yi Zhong, Jun Zhu

Continual learning needs to overcome catastrophic forgetting of the past. Memory replay of representative old training samples has been shown to be an effective solution and achieves state-of-the-art (SOTA) performance. However, existing work mainly relies on a small memory buffer containing a few original samples, which cannot fully characterize the old data distribution. In this work, we propose memory replay with data compression (MRDC) to reduce the storage cost of old training samples and thus increase the number that can be stored in the memory buffer. Observing that the trade-off between the quality and quantity of compressed data is highly nontrivial for the efficacy of memory replay, we propose a novel method based on determinantal point processes (DPPs) to efficiently determine an appropriate compression quality for currently arriving training samples. In this way, using a naive data compression algorithm with a properly selected quality can largely boost recent strong baselines by saving more compressed data in a limited storage space. We extensively validate this across several class-incremental learning benchmarks and in a realistic object-detection scenario for autonomous driving.

* ICLR 2022  
* arXiv admin note: text overlap with arXiv:1207.6083 by other authors 
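
A minimal sketch of the storage-side idea, filling a fixed-byte replay budget with JPEG-compressed exemplars via Pillow, is given below. It is not the paper's code: the DPP-based selection of compression quality is not reproduced, and the fixed quality value, image sizes, and budget are arbitrary illustrative choices.

import io
import numpy as np
from PIL import Image

def compress_exemplar(img_array, quality=50):
    buf = io.BytesIO()
    Image.fromarray(img_array).save(buf, format="JPEG", quality=quality)
    return buf.getvalue()

def decompress_exemplar(jpeg_bytes):
    return np.array(Image.open(io.BytesIO(jpeg_bytes)))

def fill_buffer(images, budget_bytes, quality=50):
    buffer, used = [], 0
    for img in images:
        blob = compress_exemplar(img, quality)
        if used + len(blob) > budget_bytes:
            break
        buffer.append(blob)      # lower quality -> smaller blobs -> more exemplars
        used += len(blob)
    return buffer

rng = np.random.default_rng(0)
images = [rng.integers(0, 256, (32, 32, 3), dtype=np.uint8) for _ in range(100)]
buffer = fill_buffer(images, budget_bytes=50_000)
restored = decompress_exemplar(buffer[0])   # lossy reconstruction used for replay
print(len(buffer), "compressed exemplars fit in the budget")

The quality/quantity trade-off the abstract highlights is visible here: lowering the JPEG quality fits more exemplars into the same budget at the cost of replaying noisier reconstructions.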

OSOA: One-Shot Online Adaptation of Deep Generative Models for Lossless Compression

Nov 02, 2021
Chen Zhang, Shifeng Zhang, Fabio Maria Carlucci, Zhenguo Li

Explicit deep generative models (DGMs), e.g., VAEs and normalizing flows, have been shown to offer an effective data-modelling alternative for lossless compression. However, DGMs themselves normally require large storage space, which undermines the advantage brought by accurate data density estimation. To eliminate the need to save separate models for different target datasets, we propose a novel setting that starts from a pretrained deep generative model and compresses the data batches while adapting the model with a dynamical system for only one epoch. We formalise this setting as One-Shot Online Adaptation (OSOA) of DGMs for lossless compression and propose a vanilla algorithm under this setting. Experimental results show that vanilla OSOA can save significant time versus training bespoke models and significant space versus using one model for all targets. With the same number of adaptation steps or the same adaptation time, vanilla OSOA is shown to exhibit better space efficiency, e.g., $47\%$ less space, than fine-tuning the pretrained model and saving the fine-tuned model. Moreover, we showcase the potential of OSOA and motivate more sophisticated OSOA algorithms by showing further space or time savings with multiple updates per batch and early stopping.
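
The following sketch shows the coding loop structure the abstract describes: each batch is coded with the current model, and the model is then adapted deterministically on that batch, so a decoder holding the same pretrained weights can replay the updates. The PlaceholderModel (a Gaussian code-length proxy with a moment-matching update) is a hypothetical stand-in, not the DGMs or the dynamical system used in the paper.

import numpy as np

class PlaceholderModel:
    def __init__(self, scale=1.0):
        self.scale = scale  # variance of a zero-mean Gaussian data model

    def bits(self, batch):
        # Ideal code length of `batch` under N(0, scale), ignoring discretisation.
        nll_nats = 0.5 * np.sum(batch ** 2) / self.scale \
                   + 0.5 * batch.size * np.log(2 * np.pi * self.scale)
        return float(nll_nats / np.log(2))

    def update(self, batch):
        # Deterministic adaptation step (stand-in for one gradient update).
        self.scale = 0.9 * self.scale + 0.1 * float(np.mean(batch ** 2))

def osoa_encode(batches, model):
    total_bits = 0.0
    for batch in batches:
        total_bits += model.bits(batch)  # code the batch with the *current* model
        model.update(batch)              # then adapt; the decoder mirrors this step
    return total_bits

rng = np.random.default_rng(0)
batches = [2.0 * rng.standard_normal((64, 8)) for _ in range(10)]
print(osoa_encode(batches, PlaceholderModel()))

Because adaptation only ever uses batches that have already been coded, no adapted weights need to be transmitted or stored, which is the source of the space savings reported in the abstract.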

iFlow: Numerically Invertible Flows for Efficient Lossless Compression via a Uniform Coder

Nov 01, 2021
Shifeng Zhang, Ning Kang, Tom Ryder, Zhenguo Li

It was estimated that the world produced 59 ZB ($5.9 \times 10^{13}$ GB) of data in 2020, resulting in enormous costs for both data storage and transmission. Fortunately, recent advances in deep generative models have spearheaded a new class of so-called "neural compression" algorithms, which significantly outperform traditional codecs in terms of compression ratio. Unfortunately, the application of neural compression attracts little commercial interest due to its limited throughput; therefore, developing highly efficient frameworks is of critical practical importance. In this paper, we discuss lossless compression using normalizing flows, which have demonstrated a great capacity for achieving high compression ratios. We introduce iFlow, a new method for achieving efficient lossless compression. We first propose the Modular Scale Transform (MST) and a novel family of numerically invertible flow transformations based on MST. We then introduce the Uniform Base Conversion System (UBCS), a fast uniform-distribution codec incorporated into iFlow that enables efficient compression. iFlow achieves state-of-the-art compression ratios and is $5\times$ faster than other high-performance schemes. Furthermore, the techniques presented in this paper can be used to accelerate coding for a broad class of flow-based algorithms.

* Accepted for NeurIPS 2021 Spotlight 
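
As background, any flow-based lossless codec relies on the change-of-variables identity (general to this family of methods, not iFlow-specific machinery):

$-\log_2 p_X(x) = -\log_2 p_Z\big(f(x)\big) - \log_2 \left| \det \frac{\partial f(x)}{\partial x} \right|,$

so the data $x$ are mapped to $z = f(x)$ and $z$ is entropy-coded under the base distribution $p_Z$. Lossless decoding requires $f$ to be exactly invertible under finite precision, the role the abstract assigns to the MST-based transformations, while overall throughput is dominated by how fast $z$ can be coded, the role of UBCS.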

iVPF: Numerical Invertible Volume Preserving Flow for Efficient Lossless Compression

Mar 30, 2021
Shifeng Zhang, Chen Zhang, Ning Kang, Zhenguo Li

Storing today's rapidly growing big data is nontrivial and demands high-performance lossless compression techniques. Likelihood-based generative models have seen success in lossless compression, among which flow-based models are desirable because they allow exact data likelihood optimisation with bijective mappings. However, common continuous flows conflict with the discreteness of coding schemes, which requires either 1) imposing strict constraints on the flow models, degrading performance, or 2) coding numerous bijective-mapping errors, reducing efficiency. In this paper, we investigate volume-preserving flows for lossless compression and show that a bijective mapping without error is possible. We propose the Numerical Invertible Volume Preserving Flow (iVPF), derived from general volume-preserving flows. By introducing novel computation algorithms on flow models, an exact bijective mapping is achieved without any numerical error. We also propose a lossless compression algorithm based on iVPF. Experiments on various datasets show that the iVPF-based algorithm achieves state-of-the-art compression ratios among lightweight compression algorithms.
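
To illustrate what an error-free bijective mapping looks like in the volume-preserving case, the sketch below applies an additive (unit-Jacobian) coupling to integer-valued data and inverts it exactly. It is an illustrative toy, not the paper's iVPF construction: shift_fn is a hypothetical conditioning function, rounded so that all arithmetic stays in integers.

import numpy as np

def shift_fn(x1):
    # Placeholder conditioning network, rounded to keep arithmetic exact.
    return np.round(8 * np.tanh(x1 / 64.0)).astype(np.int64)

def forward(x1, x2):
    return x1, x2 + shift_fn(x1)   # additive coupling: Jacobian determinant is 1

def inverse(z1, z2):
    return z1, z2 - shift_fn(z1)   # exact inverse: no rounding error accumulates

rng = np.random.default_rng(0)
x1 = rng.integers(0, 256, size=16)
x2 = rng.integers(0, 256, size=16)
z1, z2 = forward(x1, x2)
r1, r2 = inverse(z1, z2)
assert np.array_equal(x1, r1) and np.array_equal(x2, r2)

Because the mapping is exactly invertible on integers and preserves volume, the latent can be entropy-coded and decoded back to the original data without the error-coding overhead mentioned in the abstract.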

Loss Function Search for Face Recognition

Jul 10, 2020
Xiaobo Wang, Shuo Wang, Cheng Chi, Shifeng Zhang, Tao Mei

In face recognition, designing margin-based (e.g., angular, additive, or additive angular margin) softmax loss functions plays an important role in learning discriminative features. However, these hand-crafted heuristics are sub-optimal because exploring the large design space requires considerable effort. Recently, AM-LFS, an AutoML-based loss function search method, has been proposed, which leverages reinforcement learning to search for loss functions during training. However, its search space is complex and unstable, which hinders its effectiveness. In this paper, we first show that the key to enhancing feature discrimination is in fact how to reduce the softmax probability. We then design a unified formulation for the current margin-based softmax losses. Accordingly, we define a novel search space and develop a reward-guided search method to automatically obtain the best candidate. Experimental results on a variety of face recognition benchmarks demonstrate the effectiveness of our method over state-of-the-art alternatives.

* Accepted by ICML2020. arXiv admin note: substantial text overlap with arXiv:1912.00833; text overlap with arXiv:1905.07375 by other authors 
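
For context, the margin-based softmax losses the abstract refers to are commonly written in the unified form below (the paper's own unified formulation and search space may differ):

$L = -\log \frac{e^{s\, f(m, \theta_{y})}}{e^{s\, f(m, \theta_{y})} + \sum_{k \neq y} e^{s \cos\theta_{k}}},$

where $\theta_k$ is the angle between the feature and the $k$-th class weight, $s$ is a scale factor, and $f(m, \theta_y)$ equals $\cos(m\theta_y)$, $\cos\theta_y - m$, or $\cos(\theta_y + m)$ for the angular, additive, and additive angular margins respectively. Since every margin choice makes $f(m, \theta_y) \le \cos\theta_y$ within the usual operating range of $\theta_y$, each of them lowers the target-class softmax probability during training, which is the "reduce the softmax probability" observation that motivates the paper's search space.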