Yibo Yang

Dual-Flow Transformation Network for Deformable Image Registration with Region Consistency Constraint

Dec 04, 2021

Xinke Ma, Yibo Yang, Yong Xia, Dacheng Tao

Abstract: Deformable image registration can achieve fast and accurate alignment between a pair of images and thus plays an important role in many medical image studies. Current deep learning (DL)-based image registration approaches directly learn the spatial transformation from one image to another by leveraging a convolutional neural network, requiring either ground truth or a similarity metric. Nevertheless, these methods use only a global similarity energy function to evaluate the similarity of a pair of images, which ignores the similarity of regions of interest (ROIs) within the images. Moreover, DL-based methods often estimate the global spatial transformation of an image directly and pay no attention to the regional spatial transformations of ROIs within the images. In this paper, we present a novel dual-flow transformation network with a region consistency constraint, which maximizes the similarity of ROIs within a pair of images and estimates both global and regional spatial transformations simultaneously. Experiments on four public 3D MRI datasets show that the proposed method achieves the best registration performance in accuracy and generalization compared with other state-of-the-art methods.

Towards Empirical Sandwich Bounds on the Rate-Distortion Function

Nov 23, 2021

Yibo Yang, Stephan Mandt

Abstract: The rate-distortion (R-D) function, a key quantity in information theory, characterizes the fundamental limit of how much a data source can be compressed subject to a fidelity criterion, by any compression algorithm. As researchers push for ever-improving compression performance, establishing the R-D function of a given data source is not only of scientific interest, but also sheds light on the possible room for improving compression algorithms. Previous work on this problem relied on distributional assumptions on the data source (Gibson, 2017) or only applied to discrete data. By contrast, this paper makes the first attempt at an algorithm for sandwiching the R-D function of a general (not necessarily discrete) source requiring only i.i.d. data samples. We estimate R-D sandwich bounds on Gaussian and high-dimensional banana-shaped sources, as well as GAN-generated images. Our R-D upper bound on natural images indicates room for improving the performance of state-of-the-art image compression methods by 1 dB in PSNR at various bitrates.
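
For readers unfamiliar with the quantity being bounded, the block below restates the textbook definition of the R-D function and what a "sandwich" bound means. The notation (source X, reproduction Y, distortion measure \rho, bounds R_L and R_U) is an assumption chosen here for illustration and is not taken from the abstract. The closed form in the last line is the well-known result for a scalar Gaussian source under squared error, with the rate measured in nats.

```latex
% Rate-distortion function of a source X under distortion measure \rho:
R(D) \;=\; \inf_{Q_{Y \mid X} \,:\; \mathbb{E}[\rho(X, Y)] \,\le\, D} \; I(X; Y)

% A sandwich bound brackets R(D) from below and above at each distortion level D:
R_L(D) \;\le\; R(D) \;\le\; R_U(D)

% Known closed form for X \sim \mathcal{N}(0, \sigma^2) with squared-error distortion:
R(D) \;=\; \max\!\left( 0, \; \tfrac{1}{2} \log \frac{\sigma^2}{D} \right)
```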

Biologically Plausible Training Mechanisms for Self-Supervised Learning in Deep Networks

Oct 13, 2021

Mufeng Tang, Yibo Yang, Yali Amit

Abstract: We develop biologically plausible training mechanisms for self-supervised learning (SSL) in deep networks. SSL, with a contrastive loss, is more natural as it does not require labelled data and its robustness to perturbations yields more adaptable embeddings. Moreover, the perturbation of data required to create positive pairs for SSL is easily produced in a natural environment by observing objects in motion and with variable lighting over time. We propose a contrastive hinge-based loss whose error involves simple local computations, as opposed to the standard contrastive losses employed in the literature, which do not lend themselves easily to implementation in a network architecture due to complex computations involving ratios and inner products. Furthermore, we show that learning can be performed with one of two more plausible alternatives to backpropagation. The first is difference target propagation (DTP), which trains network parameters using target-based local losses and employs a Hebbian learning rule, thus overcoming the biologically implausible symmetric weight problem in backpropagation. The second is simply layer-wise learning, where each layer is directly connected to a layer computing the loss error. The layers are either updated sequentially in a greedy fashion (GLL) or in random order (RLL), and each training stage involves a single hidden layer network. The one-step backpropagation needed for each such network can be altered either with fixed random feedback weights, as proposed in Lillicrap et al. (2016), or with updated random feedback, as in Amit (2019). Both methods represent alternatives to the symmetric weight issue of backpropagation. By training convolutional neural networks (CNNs) with SSL and DTP, GLL or RLL, we find that our proposed framework achieves comparable performance to its implausible counterparts in both linear evaluation and transfer learning tasks.

Insights from Generative Modeling for Neural Video Compression

Jul 28, 2021

Ruihan Yang, Yibo Yang, Joseph Marino, Stephan Mandt

Abstract: While recent machine learning research has revealed connections between deep generative models such as VAEs and rate-distortion losses used in learned compression, most of this work has focused on images. In a similar spirit, we view recently proposed neural video coding algorithms through the lens of deep autoregressive and latent variable modeling. We present recent neural video codecs as instances of a generalized stochastic temporal autoregressive transform, and propose new avenues for further improvements inspired by normalizing flows and structured priors. We propose several architectures that yield state-of-the-art video compression performance on full-resolution video and discuss their tradeoffs and ablations. In particular, we propose (i) improved temporal autoregressive transforms, (ii) improved entropy models with structured and temporal dependencies, and (iii) variable bitrate versions of our algorithms. Since our improvements are compatible with a large class of existing models, we provide further evidence that the generative modeling viewpoint can advance the neural video coding field.

* This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible. arXiv admin note: text overlap with arXiv:2010.10258

Output-Weighted Sampling for Multi-Armed Bandits with Extreme Payoffs

Feb 19, 2021

Yibo Yang, Antoine Blanchard, Themistoklis Sapsis, Paris Perdikaris

Abstract: We present a new type of acquisition function for online decision making in multi-armed and contextual bandit problems with extreme payoffs. Specifically, we model the payoff function as a Gaussian process and formulate a novel type of upper confidence bound (UCB) acquisition function that guides exploration towards the bandits that are deemed most relevant according to the variability of the observed rewards. This is achieved by computing a tractable likelihood ratio that quantifies the importance of the output relative to the inputs and essentially acts as an attention mechanism that promotes exploration of extreme rewards. We demonstrate the benefits of the proposed methodology across several synthetic benchmarks, as well as a realistic example involving noisy sensor network data. Finally, we provide a JAX library for efficient bandit optimization using Gaussian processes.

* 10 pages, 4 figures, 1 table
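
As a rough illustration of the baseline that the abstract builds on, the sketch below implements a plain upper confidence bound (UCB) arm selection rule. It is not the output-weighted acquisition function proposed in the paper: the likelihood-ratio weighting is omitted entirely. The function names, the `kappa` exploration parameter, and the idea of passing posterior means and standard deviations as plain lists (rather than computing them from a Gaussian process) are all assumptions made for this example.

```python
def ucb_scores(means, stds, kappa=2.0):
    """Plain UCB score per arm: posterior mean plus kappa times posterior std.

    `means` and `stds` are assumed to come from some surrogate model
    (for example a Gaussian process posterior); here they are plain lists.
    """
    return [m + kappa * s for m, s in zip(means, stds)]


def select_arm(means, stds, kappa=2.0):
    """Return the index of the arm with the highest UCB score."""
    scores = ucb_scores(means, stds, kappa)
    return max(range(len(scores)), key=lambda i: scores[i])


# Hypothetical posterior summaries for three arms.
means = [0.1, 0.5, 0.3]
stds = [1.0, 0.1, 0.2]

explore_choice = select_arm(means, stds, kappa=2.0)  # uncertainty dominates
exploit_choice = select_arm(means, stds, kappa=0.0)  # pure exploitation
```

With a large `kappa` the rule favors the most uncertain arm, and with `kappa=0` it reduces to picking the arm with the highest estimated mean.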

Towards Improving the Consistency, Efficiency, and Flexibility of Differentiable Neural Architecture Search

Jan 27, 2021

Yibo Yang, Shan You, Hongyang Li, Fei Wang, Chen Qian, Zhouchen Lin

Abstract: Most differentiable neural architecture search methods construct a super-net for search and derive a target-net as its sub-graph for evaluation. There exists a significant gap between the architectures in search and evaluation. As a result, current methods suffer from an inconsistent, inefficient, and inflexible search process. In this paper, we introduce EnTranNAS, which is composed of Engine-cells and Transit-cells. The Engine-cell is differentiable for architecture search, while the Transit-cell only transits a sub-graph by architecture derivation. Consequently, the gap between the architectures in search and evaluation is significantly reduced. Our method also saves considerable memory and computation cost, which speeds up the search process. A feature sharing strategy is introduced for more balanced optimization and more efficient search. Furthermore, we develop an architecture derivation method to replace the traditional one that is based on a hand-crafted rule. Our method enables differentiable sparsification, and keeps the derived architecture equivalent to that of the Engine-cell, which further improves the consistency between search and evaluation. Besides, it supports the search for topology where a node can be connected to prior nodes with any number of connections, so that the searched architectures can be more flexible. For experiments on CIFAR-10, our search on the standard space requires only 0.06 GPU-day. We further achieve an error rate of 2.22% with 0.07 GPU-day for the search on an extended space. We can also directly perform the search on ImageNet with learnable topology and achieve a top-1 error rate of 23.8% in 2.1 GPU-day.

Explicitly Learning Topology for Differentiable Neural Architecture Search

Nov 18, 2020

Tao Huang, Shan You, Yibo Yang, Zhuozhuo Tu, Fei Wang, Chen Qian, Changshui Zhang

Abstract: Differentiable neural architecture search (DARTS) has gained much success in discovering more flexible and diverse cell types. Current methods couple the operations and topology during search, and simply derive the optimal topology by a hand-crafted rule. However, topology also matters for neural architectures since it controls the interactions between features of operations. In this paper, we highlight topology learning in differentiable NAS, and propose an explicit topology modeling method, named TopoNAS, to directly decouple the operation selection and topology during search. Concretely, we introduce a set of topological variables and a combinatorial probabilistic distribution to explicitly indicate the target topology. Besides, we also leverage a passive-aggressive regularization to suppress invalid topology within the supernet. Our introduced topological variables can be jointly learned with operation variables and supernet weights, and apply to various DARTS variants. Extensive experiments on CIFAR-10 and ImageNet validate the effectiveness of our proposed TopoNAS. The results show that TopoNAS does enable the search of cells with more diverse and complex topology, and boosts the performance significantly. For example, TopoNAS can improve DARTS by 0.16% accuracy on the CIFAR-10 dataset with 40% of parameters reduced, or by 0.35% with similar parameters.
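
To make the "coupled operations and topology" point concrete, here is a minimal sketch of the generic DARTS-style setup that the abstract contrasts itself with: a softmax-weighted mixture over candidate operations on each edge, and a hand-crafted rule that keeps the top-k incoming edges of a node. This is not TopoNAS. The function names, the toy scalar operations, and `k=2` are illustrative assumptions, and details such as excluding a "zero" operation during derivation are left out.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_op(x, ops, alphas):
    """Continuous relaxation: softmax(alphas)-weighted sum of all candidate ops."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, ops))


def derive_node(edge_alphas, k=2):
    """Hand-crafted derivation rule: for each incoming edge take its strongest
    op, then keep the k edges whose strongest op has the largest weight.

    Returns a sorted list of (edge_index, op_index) pairs.
    """
    scored = []
    for edge, alphas in enumerate(edge_alphas):
        weights = softmax(alphas)
        best = max(range(len(weights)), key=lambda i: weights[i])
        scored.append((weights[best], edge, best))
    scored.sort(reverse=True)
    return sorted((edge, op) for _, edge, op in scored[:k])


# Toy scalar "operations" standing in for conv / pooling / skip layers.
ops = [lambda x: x, lambda x: 0.0]
mixed = mixed_op(2.0, ops, [0.0, 0.0])  # equal weights on both ops

# Three incoming edges, two candidate ops each (hypothetical logits).
kept = derive_node([[2.0, 0.0], [0.0, 0.1], [0.0, 3.0]], k=2)
```

Because the edge ranking in `derive_node` depends only on operation weights, topology is a by-product of operation selection, which is the coupling the abstract describes.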

Hierarchical Autoregressive Modeling for Neural Video Compression

Oct 19, 2020

Ruihan Yang, Yibo Yang, Joseph Marino, Stephan Mandt

Abstract: Recent work by Marino et al. (2020) showed improved performance in sequential density estimation by combining masked autoregressive flows with hierarchical latent variable models. We draw a connection between such autoregressive generative models and the task of lossy video compression. Specifically, we view recent neural video compression methods (Lu et al., 2019; Yang et al., 2020b; Agustsson et al., 2020) as instances of a generalized stochastic temporal autoregressive transform, and propose avenues for enhancement based on this insight. Comprehensive evaluations on large-scale video data show improved rate-distortion performance over both state-of-the-art neural and conventional video compression methods.

ISTA-NAS: Efficient and Consistent Neural Architecture Search by Sparse Coding

Oct 13, 2020

Yibo Yang, Hongyang Li, Shan You, Fei Wang, Chen Qian, Zhouchen Lin

Abstract: Neural architecture search (NAS) aims to produce the optimal sparse solution from a high-dimensional space spanned by all candidate connections. Current gradient-based NAS methods commonly ignore the constraint of sparsity in the search phase, and instead project the optimized solution onto a sparse one by post-processing. As a result, the dense super-net for search is inefficient to train and has a gap with the projected architecture for evaluation. In this paper, we formulate neural architecture search as a sparse coding problem. We perform the differentiable search on a compressed lower-dimensional space that has the same validation loss as the original sparse solution space, and recover an architecture by solving the sparse coding problem. The differentiable search and architecture recovery are optimized in an alternating manner. By doing so, our network for search at each update satisfies the sparsity constraint and is efficient to train. In order to also eliminate the depth and width gap between the network in search and the target-net in evaluation, we further propose a method to search and evaluate in one stage under the target-net settings. When training finishes, architecture variables are absorbed into network weights. Thus we get the searched architecture and optimized parameters in a single run. In experiments, our two-stage method on CIFAR-10 requires only 0.05 GPU-day for search. Our one-stage method produces state-of-the-art performance on both CIFAR-10 and ImageNet at the cost of only evaluation time.

* NeurIPS 2020
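
The "ISTA" in the title refers to the iterative shrinkage-thresholding algorithm, a standard solver for sparse coding problems. The sketch below shows generic ISTA for the LASSO objective 0.5 * ||Ax - b||^2 + lam * ||x||_1 in plain Python. It illustrates the sparse-recovery step in general terms only and is not the ISTA-NAS search procedure. The function names and the explicit `step` argument are assumptions for this example, and `step` is assumed to be at most 1 / L, where L is the largest eigenvalue of A^T A.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(A, b, lam, step, n_iters=200):
    """Generic ISTA for min_x 0.5 * ||A x - b||^2 + lam * ||x||_1.

    A is a list of rows, b a list of floats. `step` must not exceed
    1 / (largest eigenvalue of A^T A) for the iteration to converge.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(n_iters):
        # Residual r = A x - b.
        residual = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        # Gradient of the smooth part: A^T r.
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by soft-thresholding (the proximal step).
        x = [soft_threshold(x[j] - step * grad[j], step * lam) for j in range(n)]
    return x


# With A equal to the identity, the solution is soft_threshold(b, lam) per entry.
identity = [[1.0, 0.0], [0.0, 1.0]]
solution = ista(identity, [3.0, 0.5], lam=1.0, step=1.0)
```

The soft-thresholding step is what drives small coefficients exactly to zero, so every iterate after the first is already sparse rather than being sparsified by post-processing.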

Improving Inference for Neural Image Compression

Jun 09, 2020

Yibo Yang, Robert Bamler, Stephan Mandt

Abstract: We consider the problem of lossy image compression with deep latent variable models. State-of-the-art methods build on hierarchical variational autoencoders (VAEs) and learn inference networks to predict a compressible latent representation of each data point. Drawing on the variational inference perspective on compression, we identify three approximation gaps which limit performance in the conventional approach: (i) an amortization gap, (ii) a discretization gap, and (iii) a marginalization gap. We propose improvements to each of these three shortcomings based on ideas related to iterative inference, stochastic annealing for discrete optimization, and bits-back coding, resulting in the first application of bits-back coding to lossy compression. In our experiments, which include extensive baseline comparisons and ablation studies, we achieve new state-of-the-art performance on lossy image compression using an established VAE architecture, by changing only the inference method.

* 8 pages + detailed supplement with additional qualitative and quantitative results
