Rui Gao

Sinkhorn Distributionally Robust Optimization

Sep 24, 2021

Jie Wang, Rui Gao, Yao Xie

Abstract: We study distributionally robust optimization (DRO) with Sinkhorn distance -- a variant of Wasserstein distance based on entropic regularization. We derive convex programming dual reformulations when the nominal distribution is an empirical distribution and a general distribution, respectively. Compared with Wasserstein DRO, Sinkhorn DRO is computationally tractable for a larger class of loss functions, and its worst-case distribution is more reasonable. To solve the dual reformulation, we propose an efficient batch gradient descent with a bisection search algorithm. Finally, we provide various numerical examples using both synthetic and real data to demonstrate its competitive performance.

* 34 pages, 2 figures
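
To illustrate what "entropic regularization" means for an optimal transport distance, here is a minimal sketch of the standard Sinkhorn matrix-scaling iteration between two small histograms. It is not the paper's DRO algorithm (the batch gradient descent with bisection search is not reproduced here); the function names, the regularization strength `eps`, and the toy cost matrix are assumptions chosen for illustration.

```python
import math

def sinkhorn_plan(a, b, cost, eps=0.5, n_iter=200):
    """Entropy-regularized transport plan between histograms a and b."""
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-c_ij / eps); smaller eps means less smoothing.
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the two marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

def transport_cost(plan, cost):
    """Total cost <plan, cost> of moving mass according to the plan."""
    return sum(p * c for prow, crow in zip(plan, cost) for p, c in zip(prow, crow))

# Toy example (hypothetical data): two bins, unit cost to move between them.
a = [0.5, 0.5]
b = [0.25, 0.75]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn_plan(a, b, cost)
print(transport_cost(plan, cost))
```

Because the plan is smoothed by the entropy term, its cost is at least the unregularized optimal transport cost (0.25 for this toy input) and approaches it as `eps` shrinks.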

Hierarchical Non-Stationary Temporal Gaussian Processes With $L^1$-Regularization

May 20, 2021

Zheng Zhao, Rui Gao, Simo Särkkä

Abstract: This paper is concerned with regularized extensions of hierarchical non-stationary temporal Gaussian processes (NSGPs) in which the parameters (e.g., length-scale) are modeled as GPs. In particular, we consider two commonly used NSGP constructions which are based on explicitly constructed non-stationary covariance functions and stochastic differential equations, respectively. We extend these NSGPs by including $L^1$-regularization on the processes in order to induce sparseness. To solve the resulting regularized NSGP (R-NSGP) regression problem, we develop a method based on the alternating direction method of multipliers (ADMM) and analyze its convergence properties theoretically. We also evaluate the performance of the proposed methods on simulated and real-world datasets.

* 20 pages. Submitted to Statistics and Computing
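
As a rough illustration of how ADMM handles an $L^1$ penalty, the sketch below applies it to a toy denoising problem, minimizing 0.5 * ||x - y||^2 + lam * ||z||_1 subject to x = z. This is not the R-NSGP regression problem from the paper; the problem, the function names, and the penalty parameter `rho` are assumptions picked because the exact answer (soft-thresholding of `y`) is known and easy to check.

```python
def soft_threshold(x, t):
    """Proximal operator of t * |x|: shrink x toward zero by t."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def admm_l1_denoise(y, lam, rho=1.0, n_iter=100):
    """ADMM for min 0.5*||x - y||^2 + lam*||z||_1  subject to  x = z."""
    n = len(y)
    x, z, u = [0.0] * n, [0.0] * n, [0.0] * n
    for _ in range(n_iter):
        # x-update: quadratic subproblem, solved in closed form.
        x = [(yi + rho * (zi - ui)) / (1.0 + rho) for yi, zi, ui in zip(y, z, u)]
        # z-update: the L1 term reduces to soft-thresholding.
        z = [soft_threshold(xi + ui, lam / rho) for xi, ui in zip(x, u)]
        # u-update: scaled dual variable accumulates the constraint residual.
        u = [ui + xi - zi for ui, xi, zi in zip(u, x, z)]
    return z

y = [3.0, 0.5, -3.0, 1.0]   # hypothetical noisy observations
lam = 1.0
z_hat = admm_l1_denoise(y, lam)
print(z_hat)
```

Entries of `y` with magnitude at most `lam` are driven exactly to zero, which is the sparsity-inducing effect the abstract refers to.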

Learning While Dissipating Information: Understanding the Generalization Capability of SGLD

Feb 05, 2021

Hao Wang, Yizhe Huang, Rui Gao, Flavio P. Calmon

Abstract: Understanding the generalization capability of learning algorithms is at the heart of statistical learning theory. In this paper, we investigate the generalization gap of stochastic gradient Langevin dynamics (SGLD), a widely used optimizer for training deep neural networks (DNNs). We derive an algorithm-dependent generalization bound by analyzing SGLD through an information-theoretic lens. Our analysis reveals an intricate trade-off between learning and information dissipation: SGLD learns from data by updating parameters at each iteration while dissipating information from early training stages. Our bound also involves the variance of gradients which captures a particular kind of "sharpness" of the loss landscape. The main proof techniques in this paper rely on strong data processing inequalities -- a fundamental concept in information theory -- and Otto-Villani's HWI inequality. Finally, we demonstrate our bound through numerical experiments, showing that it can predict the behavior of the true generalization gap.
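
For readers unfamiliar with SGLD, the following is a minimal sketch of the generic update: a minibatch gradient step plus Gaussian noise scaled by an inverse temperature. It does not compute the paper's generalization bound; the scalar quadratic loss, the parameter names (`step`, `beta`, `batch`), and the toy data are all assumptions for illustration.

```python
import math
import random

def sgld(grad_fn, data, theta0, step=0.01, beta=1e4, batch=8, n_iter=2000, seed=0):
    """Scalar SGLD: theta <- theta - step * minibatch_grad + sqrt(2*step/beta) * N(0, 1)."""
    rng = random.Random(seed)
    theta = theta0
    noise_scale = math.sqrt(2.0 * step / beta)
    for _ in range(n_iter):
        minibatch = rng.sample(data, batch)          # sampled without replacement
        grad = sum(grad_fn(theta, x) for x in minibatch) / batch
        theta = theta - step * grad + noise_scale * rng.gauss(0.0, 1.0)
    return theta

# Toy loss 0.5 * (theta - x)^2 per sample, so the minimizer is the data mean.
data = [1.0, 2.0, 3.0, 4.0] * 4
theta_hat = sgld(lambda theta, x: theta - x, data, theta0=0.0)
print(theta_hat)
```

The injected noise is what distinguishes SGLD from plain SGD; with a large `beta` (low temperature) the iterate stays close to the minimizer, here the data mean 2.5.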

Generalize Ultrasound Image Segmentation via Instant and Plug & Play Style Transfer

Jan 11, 2021

Zhendong Liu, Xiaoqiong Huang, Xin Yang, Rui Gao, Rui Li, Yuanji Zhang, Yankai Huang, Guangquan Zhou, Yi Xiong, Alejandro F Frangi (+1 more)

Abstract: Deep segmentation models that generalize to images with unknown appearance are important for real-world medical image analysis. Retraining models leads to high latency and complex pipelines, which are impractical in clinical settings. The situation becomes more severe for ultrasound image analysis because ultrasound images exhibit large appearance shifts. In this paper, we propose a novel method for robust segmentation under unknown appearance shifts. Our contribution is three-fold. First, we advance a one-stage plug-and-play solution by embedding hierarchical style transfer units into a segmentation architecture. Our solution can remove appearance shifts and perform segmentation simultaneously. Second, we adopt Dynamic Instance Normalization to conduct precise and dynamic style transfer in a learnable manner, rather than the fixed style normalization used previously. Third, our solution is fast and lightweight for routine clinical adoption. Given a 400*400 image input, our solution needs only an additional 0.2ms and 1.92M FLOPs to handle appearance shifts compared to the baseline pipeline. Extensive experiments on a large dataset from three vendors demonstrate that our proposed method enhances the robustness of deep segmentation models.

* Accepted by IEEE ISBI 2021

Reliable Off-policy Evaluation for Reinforcement Learning

Nov 08, 2020

Jie Wang, Rui Gao, Hongyuan Zha

Abstract: In a sequential decision-making problem, off-policy evaluation (OPE) estimates the expected cumulative reward of a target policy using logged transition data generated from a different behavior policy, without executing the target policy. Reinforcement learning in high-stakes environments, such as healthcare and education, is often limited to off-policy settings due to safety or ethical concerns, or the inability to explore. Hence it is imperative to quantify the uncertainty of the off-policy estimate before deployment of the target policy. In this paper, we propose a novel framework that provides robust and optimistic cumulative reward estimates with statistical guarantees and develop non-asymptotic as well as asymptotic confidence intervals for OPE, leveraging methodologies from distributionally robust optimization. Our theoretical results are also supported by empirical analysis.

* 36 pages, 4 figures

Two-sample Test using Projected Wasserstein Distance: Breaking the Curse of Dimensionality

Oct 22, 2020

Jie Wang, Rui Gao, Yao Xie

Abstract: We develop a projected Wasserstein distance for the two-sample test, a fundamental problem in statistics and machine learning: given two sets of samples, determine whether they are from the same distribution. In particular, we aim to circumvent the curse of dimensionality in Wasserstein distance: when the dimension is high, it has diminishing testing power, which is inherently due to the slow concentration property of Wasserstein metrics in high-dimensional spaces. A key contribution is to couple optimal projection to find the low-dimensional linear mapping that maximizes the Wasserstein distance between projected probability distributions. We characterize the theoretical property of the finite-sample convergence rate on integral probability metrics (IPMs) and present practical algorithms for computing this metric. Numerical examples validate our theoretical results.

* 16 pages, 5 figures. Submitted to AISTATS 2021
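
The idea of maximizing a Wasserstein distance over projections can be sketched in a heavily simplified setting: 2-D samples of equal size, one-dimensional projections, and a crude grid search over directions. The paper's own optimization procedure is not reproduced here; the grid search, the function names, and the toy samples are assumptions for illustration only.

```python
import math

def w1_1d(xs, ys):
    """1-D Wasserstein-1 distance between equal-size samples (sorted matching)."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(p - q) for p, q in zip(sorted(xs), sorted(ys))) / len(xs)

def projected_w1(X, Y, direction):
    """W1 distance between samples after projecting onto a unit direction."""
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    project = lambda pts: [sum(c * d for c, d in zip(p, unit)) for p in pts]
    return w1_1d(project(X), project(Y))

def best_projection_2d(X, Y, n_angles=180):
    """Grid search over directions in the half-circle; returns (distance, direction)."""
    candidates = [(math.cos(k * math.pi / n_angles), math.sin(k * math.pi / n_angles))
                  for k in range(n_angles)]
    return max((projected_w1(X, Y, d), d) for d in candidates)

# Hypothetical samples: Y is X shifted by 3 along the first coordinate.
X = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0)]
Y = [(3.0, 0.0), (4.0, 1.0), (5.0, 0.0)]
best_dist, best_dir = best_projection_2d(X, Y)
print(best_dist, best_dir)
```

Projecting onto the second coordinate hides the difference between the two samples entirely, while the first coordinate exposes it, which is why the search over directions matters.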

Contrastive Rendering for Ultrasound Image Segmentation

Oct 10, 2020

Haoming Li, Xin Yang, Jiamin Liang, Wenlong Shi, Chaoyu Chen, Haoran Dou, Rui Li, Rui Gao, Guangquan Zhou, Jinghui Fang (+5 more)

Abstract: Ultrasound (US) image segmentation has improved significantly in the deep learning era. However, the lack of sharp boundaries in US images remains an inherent challenge for segmentation. Previous methods often resort to global context, multi-scale cues or auxiliary guidance to estimate the boundaries. It is hard for these methods to approach pixel-level learning for fine-grained boundary generation. In this paper, we propose a novel and effective framework to improve boundary estimation in US images. Our work has three highlights. First, we propose to formulate the boundary estimation as a rendering task, which can recognize ambiguous points (pixels/voxels) and calibrate the boundary prediction via enriched feature representation learning. Second, we introduce point-wise contrastive learning to enhance the similarity of points from the same class and contrastively decrease the similarity of points from different classes. Boundary ambiguities are therefore further addressed. Third, both rendering and contrastive learning tasks contribute to consistent improvement while reducing network parameters. As a proof-of-concept, we performed validation experiments on a challenging dataset of 86 ovarian US volumes. Results show that our proposed method outperforms state-of-the-art methods and has the potential to be used in clinical practice.

* 10 pages, 5 figures, 2 tables, 13 references

Finite-Sample Guarantees for Wasserstein Distributionally Robust Optimization: Breaking the Curse of Dimensionality

Sep 09, 2020

Rui Gao

Abstract: Wasserstein distributionally robust optimization (DRO) aims to find robust and generalizable solutions by hedging against data perturbations in Wasserstein distance. Despite its recent empirical success in operations research and machine learning, existing performance guarantees for generic loss functions are either overly conservative due to the curse of dimensionality, or plausible only in large-sample asymptotics. In this paper, we develop a non-asymptotic framework for analyzing the out-of-sample performance for Wasserstein robust learning and the generalization bound for its related Lipschitz and gradient regularization problems. To the best of our knowledge, this gives the first finite-sample guarantee for generic Wasserstein DRO problems without suffering from the curse of dimensionality. Our results highlight the bias-variation trade-off intrinsic in Wasserstein DRO, which automatically balances between the empirical mean of the loss and the variation of the loss, measured by the Lipschitz norm or the gradient norm of the loss. Our analysis is based on two novel methodological developments which are of independent interest: 1) a new concentration inequality characterizing the decay rate of large deviation probabilities by the variation of the loss, and 2) a localized Rademacher complexity theory based on the variation of the loss.

Distributionally Robust $k$-Nearest Neighbors for Few-Shot Learning

Jun 07, 2020

Shixiang Zhu, Liyan Xie, Minghe Zhang, Rui Gao, Yao Xie

Abstract: Learning a robust classifier from a few samples remains a key challenge in machine learning. A major thrust of research in few-shot classification has been based on metric learning to capture similarities between samples and then perform the $k$-nearest neighbor algorithm. To make such an algorithm more robust, in this paper, we propose a distributionally robust $k$-nearest neighbor algorithm, Dr.k-NN, which assigns minimax optimal weights to training samples when performing classification. We also couple it with neural-network-based feature embedding. We demonstrate the competitive performance of our algorithm compared with the state of the art in the few-shot learning setting with various real-data experiments.

Remove Appearance Shift for Ultrasound Image Segmentation via Fast and Universal Style Transfer

Feb 14, 2020

Zhendong Liu, Xin Yang, Rui Gao, Shengfeng Liu, Haoran Dou, Shuangchi He, Yuhao Huang, Yankai Huang, Huanjia Luo, Yuanji Zhang (+2 more)

Abstract: Deep Neural Networks (DNNs) suffer from performance degradation when image appearance shift occurs, especially in ultrasound (US) image segmentation. In this paper, we propose a novel and intuitive framework to remove the appearance shift, and hence improve the generalization ability of DNNs. Our work has three highlights. First, we follow the spirit of universal style transfer to remove appearance shifts, which had not been explored before for US images. Without sacrificing image structure details, it enables arbitrary style-content transfer. Second, accelerated with an Adaptive Instance Normalization block, our framework achieves the real-time speed required in clinical US scanning. Third, an efficient and effective style image selection strategy is proposed to ensure that the target-style US image and the testing content US image properly match each other. Experiments on two large US datasets demonstrate that our methods are superior to state-of-the-art methods at making DNNs robust against various appearance shifts.

* IEEE International Symposium on Biomedical Imaging (IEEE ISBI 2020)
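
The Adaptive Instance Normalization (AdaIN) operation mentioned in the abstract can be sketched for a single feature channel: the content features are normalized and then rescaled to carry the mean and standard deviation of the style features. This is the generic AdaIN formula applied to flat lists of numbers, not the authors' network or their style image selection strategy; the function name, the `eps` stabilizer, and the toy values are assumptions.

```python
import statistics

def adain(content, style, eps=1e-8):
    """Shift and scale one content channel to match the style channel's statistics."""
    mu_c, mu_s = statistics.fmean(content), statistics.fmean(style)
    sd_c, sd_s = statistics.pstdev(content), statistics.pstdev(style)
    # Normalize the content channel, then apply the style mean and std.
    return [sd_s * (x - mu_c) / (sd_c + eps) + mu_s for x in content]

# Hypothetical single-channel feature values.
content = [0.0, 2.0, 4.0, 6.0]
style = [10.0, 10.5, 11.0, 11.5]
stylized = adain(content, style)
print(stylized)
```

The relative ordering of the content values is preserved, which loosely mirrors the abstract's point that structure is kept while appearance statistics change.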
