Gen Li

DeSRA: Detect and Delete the Artifacts of GAN-based Real-World Super-Resolution Models

Jul 05, 2023
Liangbin Xie, Xintao Wang, Xiangyu Chen, Gen Li, Ying Shan, Jiantao Zhou, Chao Dong

Image super-resolution (SR) with generative adversarial networks (GAN) has achieved great success in restoring realistic details. However, it is notorious that GAN-based SR models will inevitably produce unpleasant and undesirable artifacts, especially in practical scenarios. Previous works typically suppress artifacts with an extra loss penalty in the training phase. They only work for in-distribution artifact types generated during training. When applied in real-world scenarios, we observe that those improved methods still generate obviously annoying artifacts during inference. In this paper, we analyze the cause and characteristics of the GAN artifacts produced in unseen test data without ground-truths. We then develop a novel method, namely, DeSRA, to Detect and then Delete those SR Artifacts in practice. Specifically, we propose to measure a relative local variance distance from MSE-SR results and GAN-SR results, and locate the problematic areas based on the above distance and semantic-aware thresholds. After detecting the artifact regions, we develop a finetune procedure to improve GAN-based SR models with a few samples, so that they can deal with similar types of artifacts in more unseen real data. Equipped with our DeSRA, we can successfully eliminate artifacts from inference and improve the ability of SR models to be applied in real-world scenarios. The code will be available at https://github.com/TencentARC/DeSRA.
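
The detection idea in the abstract can be illustrated with a rough sketch. The abstract does not give the exact formula, so everything below is an assumption: the normalized difference used as the "relative local variance distance", the 3x3 window, the `eps` stabilizer, and the single fixed threshold (the paper uses semantic-aware thresholds). The function names are hypothetical and are not taken from the DeSRA repository.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def local_variance(image, window=3):
    """Variance of every window x window patch (valid positions only)."""
    patches = sliding_window_view(image.astype(float), (window, window))
    return patches.var(axis=(-2, -1))


def relative_variance_distance(mse_sr, gan_sr, window=3, eps=1e-8):
    """Assumed form: normalized gap between GAN-SR and MSE-SR local variance."""
    var_mse = local_variance(mse_sr, window)
    var_gan = local_variance(gan_sr, window)
    return (var_gan - var_mse) / (var_gan + var_mse + eps)


# Toy grayscale example: the MSE-SR output is flat, while the GAN-SR output
# contains a checkerboard patch standing in for an artifact.
mse_sr = np.zeros((8, 8))
gan_sr = np.zeros((8, 8))
gan_sr[:4, :4] = np.indices((4, 4)).sum(axis=0) % 2

distance = relative_variance_distance(mse_sr, gan_sr)
artifact_mask = distance > 0.5  # hypothetical fixed threshold, not semantic-aware
```

In this toy case the mask flags only windows that overlap the checkerboard patch; regions where both outputs are flat get a distance of zero.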

* The code and models will be made publicly available at https://github.com/TencentARC/DeSRA

Referenceless User Controllable Semantic Image Synthesis

Jun 18, 2023
Jonghyun Kim, Gen Li, Joongkyu Kim

Despite recent progress in semantic image synthesis, complete control over image style remains a challenging problem. Existing methods require reference images to feed style information into semantic layouts, which indicates that the style is constrained by the given image. In this paper, we propose a model named RUCGAN for user controllable semantic image synthesis, which utilizes a singular color to represent the style of a specific semantic region. The proposed network achieves reference-free semantic image synthesis by injecting color as user-desired styles into each semantic layout, and is able to synthesize semantic images with unusual colors. Extensive experimental results on various challenging datasets show that the proposed method outperforms existing methods, and we further provide an interactive UI to demonstrate the advantage of our approach for style controllability.

* Accepted to IJCNN 2023

Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models

Jun 15, 2023
Gen Li, Yuting Wei, Yuxin Chen, Yuejie Chi

Diffusion models, which convert noise into new data instances by learning to reverse a Markov diffusion process, have become a cornerstone in contemporary generative modeling. While their practical power has now been widely recognized, the theoretical underpinnings remain far from mature. In this work, we develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models in discrete time, assuming access to reliable estimates of the (Stein) score functions. For a popular deterministic sampler (based on the probability flow ODE), we establish a convergence rate proportional to $1/T$ (with $T$ the total number of steps), improving upon past results; for another mainstream stochastic sampler (i.e., a type of the denoising diffusion probabilistic model (DDPM)), we derive a convergence rate proportional to $1/\sqrt{T}$, matching the state-of-the-art theory. Our theory imposes only minimal assumptions on the target data distribution (e.g., no smoothness assumption is imposed), and is developed based on an elementary yet versatile non-asymptotic approach without resorting to toolboxes for SDEs and ODEs. Further, we design two accelerated variants, improving the convergence to $1/T^2$ for the ODE-based sampler and $1/T$ for the DDPM-type sampler, which might be of independent theoretical and empirical interest.

Dynamic Sparsity Is Channel-Level Sparsity Learner

May 30, 2023
Lu Yin, Gen Li, Meng Fang, Li Shen, Tianjin Huang, Zhangyang Wang, Vlado Menkovski, Xiaolong Ma, Mykola Pechenizkiy, Shiwei Liu

Sparse training has received surging interest in machine learning due to its tantalizing potential to reduce cost across the entire training process as well as inference. Dynamic sparse training (DST), as a leading sparse training approach, can train deep neural networks at high sparsity from scratch to match the performance of their dense counterparts. However, most if not all prior DST work demonstrates its effectiveness on unstructured sparsity with highly irregular sparse patterns, which receives limited support on common hardware. This limitation hinders the usage of DST in practice. In this paper, we propose Channel-aware dynamic sparse (Chase), which for the first time seamlessly translates the promise of unstructured dynamic sparsity to GPU-friendly channel-level sparsity (not fine-grained N:M or group sparsity) during one end-to-end training process, without any ad-hoc operations. The resulting small sparse networks can be directly accelerated by commodity hardware, without any specialized sparsity-aware hardware accelerators. This appealing outcome is partially motivated by a hidden phenomenon of dynamic sparsity: off-the-shelf unstructured DST implicitly involves biased parameter reallocation across channels, with a large fraction of channels (up to 60%) being sparser than others. By progressively identifying and removing these channels during training, our approach translates unstructured sparsity to channel-wise sparsity. Our experimental results demonstrate that Chase achieves a 1.7x inference throughput speedup on common GPU devices without compromising accuracy with ResNet-50 on ImageNet. Our code is released at https://github.com/luuyin/chase.
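
To make the "sparser channels" observation concrete, here is a minimal sketch that measures per-channel sparsity of an unstructured mask and picks the sparsest channels as removal candidates. The selection rule (rank channels by their fraction of zeroed weights) and the function names are assumptions for illustration only; the abstract does not specify the criterion Chase actually uses.

```python
import numpy as np


def channel_sparsity(mask):
    """Fraction of zeroed weights per output channel.

    `mask` is a binary array shaped (out_channels, in_channels, kh, kw).
    """
    kept_fraction = mask.reshape(mask.shape[0], -1).mean(axis=1)
    return 1.0 - kept_fraction


def sparsest_channels(mask, n_remove):
    """Indices of the `n_remove` sparsest output channels (assumed criterion)."""
    sparsity = channel_sparsity(mask)
    return np.argsort(-sparsity, kind="stable")[:n_remove]


# Toy mask with three output channels: dense, fully pruned, half pruned.
mask = np.array([
    [[[1]], [[1]]],
    [[[0]], [[0]]],
    [[[1]], [[0]]],
])
per_channel = channel_sparsity(mask)
to_remove = sparsest_channels(mask, n_remove=1)
```

Removing whole output channels, rather than scattered weights, is what lets a dense GPU kernel run on a physically smaller layer.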

Sharp high-probability sample complexities for policy evaluation with linear function approximation

May 30, 2023
Gen Li, Weichen Wu, Yuejie Chi, Cong Ma, Alessandro Rinaldo, Yuting Wei

This paper is concerned with the problem of policy evaluation with linear function approximation in discounted infinite horizon Markov decision processes. We investigate the sample complexities required to guarantee a predefined estimation error of the best linear coefficients for two widely-used policy evaluation algorithms: the temporal difference (TD) learning algorithm and the two-timescale linear TD with gradient correction (TDC) algorithm. In both the on-policy setting, where observations are generated from the target policy, and the off-policy setting, where samples are drawn from a behavior policy potentially different from the target policy, we establish the first sample complexity bound with high-probability convergence guarantee that attains the optimal dependence on the tolerance level. We also exhibit an explicit dependence on problem-related quantities, and show in the on-policy setting that our upper bound matches the minimax lower bound on crucial problem parameters, including the choice of the feature maps and the problem dimension.
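
For readers unfamiliar with the first of the two algorithms, the following is a minimal sketch of standard TD(0) with linear function approximation. It is the textbook update, not the specific variant or step-size schedule analyzed in the paper; the constant step size, the replay of a fixed set of transitions, and the toy two-state chain are all assumptions made for illustration.

```python
import numpy as np


def linear_td0(features, rewards, next_features,
               gamma=0.9, step_size=0.05, n_passes=200):
    """TD(0) with linear value estimates V(s) = phi(s) @ theta.

    Each row of `features` / `next_features` is the feature vector of the
    current / next state of one observed transition.
    """
    theta = np.zeros(features.shape[1])
    for _ in range(n_passes):
        for phi, reward, phi_next in zip(features, rewards, next_features):
            td_error = reward + gamma * (phi_next @ theta) - phi @ theta
            theta += step_size * td_error * phi
    return theta


# Toy chain with one-hot features: state 0 moves to state 1 with reward 1,
# state 1 loops on itself with reward 0, so V(0) = 1 and V(1) = 0.
features = np.array([[1.0, 0.0], [0.0, 1.0]])
next_features = np.array([[0.0, 1.0], [0.0, 1.0]])
rewards = np.array([1.0, 0.0])
theta = linear_td0(features, rewards, next_features)
```

With one-hot features the linear coefficients coincide with the state values, which makes the fixed point easy to check by hand.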

* The first two authors contributed equally

The Curious Price of Distributional Robustness in Reinforcement Learning with a Generative Model

May 26, 2023
Laixi Shi, Gen Li, Yuting Wei, Yuxin Chen, Matthieu Geist, Yuejie Chi

This paper investigates model robustness in reinforcement learning (RL) to reduce the sim-to-real gap in practice. We adopt the framework of distributionally robust Markov decision processes (RMDPs), aimed at learning a policy that optimizes the worst-case performance when the deployed environment falls within a prescribed uncertainty set around the nominal MDP. Despite recent efforts, the sample complexity of RMDPs remained mostly unsettled regardless of the uncertainty set in use. It was unclear if distributional robustness bears any statistical consequences when benchmarked against standard RL. Assuming access to a generative model that draws samples based on the nominal MDP, we characterize the sample complexity of RMDPs when the uncertainty set is specified via either the total variation (TV) distance or $\chi^2$ divergence. The algorithm studied here is a model-based method called distributionally robust value iteration, which is shown to be near-optimal for the full range of uncertainty levels. Somewhat surprisingly, our results uncover that RMDPs are not necessarily easier or harder to learn than standard MDPs. The statistical consequence incurred by the robustness requirement depends heavily on the size and shape of the uncertainty set: in the case with respect to the TV distance, the minimax sample complexity of RMDPs is always smaller than that of standard MDPs; in the case with respect to the $\chi^2$ divergence, the sample complexity of RMDPs can often far exceed the standard MDP counterpart.
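
As a rough illustration of what a TV uncertainty set does, the sketch below computes the worst-case expected next-state value over all distributions within TV distance `sigma` of a nominal distribution. This is only the inner step that a robust Bellman backup would need, not the paper's full algorithm. It assumes the convention TV(q, p) = (1/2) * ||q - p||_1 and a per-(state, action) uncertainty set; the function name is hypothetical.

```python
import numpy as np


def worst_case_expectation_tv(p, v, sigma):
    """Minimize q @ v over distributions q with TV(q, p) <= sigma.

    The minimizer shifts up to `sigma` probability mass away from the
    highest-value states and places it on the lowest-value state.
    """
    q = np.asarray(p, dtype=float).copy()
    v = np.asarray(v, dtype=float)
    lowest = int(np.argmin(v))
    budget = float(sigma)
    for s in np.argsort(-v):  # states in order of decreasing value
        if budget <= 0:
            break
        if s == lowest:
            continue
        moved = min(q[s], budget)
        q[s] -= moved
        q[lowest] += moved
        budget -= moved
    return float(q @ v)


nominal = np.array([0.5, 0.5])
values = np.array([0.0, 1.0])
robust_value = worst_case_expectation_tv(nominal, values, sigma=0.2)
```

Setting `sigma=0` recovers the ordinary expectation under the nominal model, so the gap between the two numbers is the price paid for robustness at that state-action pair.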

Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs with Short Burn-In Time

May 24, 2023
Xiang Ji, Gen Li

A crucial problem in reinforcement learning is learning the optimal policy. We study this in tabular infinite-horizon discounted Markov decision processes under the online setting. The existing algorithms either fail to achieve regret optimality or have to incur a high memory and computational cost. In addition, existing optimal algorithms all require a long burn-in time in order to achieve optimal sample efficiency, i.e., their optimality is not guaranteed unless sample size surpasses a high threshold. We address both open problems by introducing a model-free algorithm that employs variance reduction and a novel technique that switches the execution policy in a slow-yet-adaptive manner. This is the first regret-optimal model-free algorithm in the discounted setting, with the additional benefit of a low burn-in time.

Reward-agnostic Fine-tuning: Provable Statistical Benefits of Hybrid Reinforcement Learning

May 17, 2023
Gen Li, Wenhao Zhan, Jason D. Lee, Yuejie Chi, Yuxin Chen

This paper studies tabular reinforcement learning (RL) in the hybrid setting, which assumes access to both an offline dataset and online interactions with the unknown environment. A central question boils down to how to efficiently utilize online data collection to strengthen and complement the offline dataset and enable effective policy fine-tuning. Leveraging recent advances in reward-agnostic exploration and model-based offline RL, we design a three-stage hybrid RL algorithm that beats the best of both worlds -- pure offline RL and pure online RL -- in terms of sample complexities. The proposed algorithm does not require any reward information during data collection. Our theory is developed based on a new notion called single-policy partial concentrability, which captures the trade-off between distribution mismatch and miscoverage and guides the interplay between offline and online data.

Provable Identifiability of Two-Layer ReLU Neural Networks via LASSO Regularization

May 07, 2023
Gen Li, Ganghua Wang, Jie Ding

LASSO regularization is a popular regression tool to enhance the prediction accuracy of statistical models by performing variable selection through the $\ell_1$ penalty, initially formulated for the linear model and its variants. In this paper, the territory of LASSO is extended to two-layer ReLU neural networks, a fashionable and powerful nonlinear regression model. Specifically, given a neural network whose output $y$ depends only on a small subset of input $\boldsymbol{x}$, denoted by $\mathcal{S}^{\star}$, we prove that the LASSO estimator can stably reconstruct the neural network and identify $\mathcal{S}^{\star}$ when the number of samples scales logarithmically with the input dimension. This challenging regime has been well understood for linear models while barely studied for neural networks. Our theory lies in an extended Restricted Isometry Property (RIP)-based analysis framework for two-layer ReLU neural networks, which may be of independent interest to other LASSO or neural network settings. Based on the result, we advocate a neural network-based variable selection method. Experiments on simulated and real-world datasets show promising performance of the variable selection approach compared with existing techniques.
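
The variable-selection effect of the $\ell_1$ penalty can be seen in a small sketch. The code below solves the classical linear LASSO with proximal gradient descent (ISTA), which is a swapped-in, simpler setting than the two-layer ReLU networks studied in the paper. The objective scaling, step size, and toy data are assumptions chosen so the answer can be verified by hand.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink toward zero, clip at zero."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_ista(X, y, lam, n_iter=500):
    """Minimize 0.5 * ||X w - y||^2 + lam * ||w||_1 by proximal gradient."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        w = soft_threshold(w - step * grad, step * lam)
    return w


# With an identity design the solution is soft_threshold(y, lam) exactly.
X = np.eye(4)
y = np.array([3.0, 0.1, 0.0, -2.0])
w_hat = lasso_ista(X, y, lam=0.5)
selected = np.flatnonzero(w_hat)  # indices of the selected variables
```

Coefficients whose magnitude falls below the penalty level are set exactly to zero, which is what turns the estimator into a variable selector.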

* IEEE Transactions on Information Theory, 2023

Minimax-Optimal Reward-Agnostic Exploration in Reinforcement Learning

Apr 14, 2023
Gen Li, Yuling Yan, Yuxin Chen, Jianqing Fan

This paper studies reward-agnostic exploration in reinforcement learning (RL) -- a scenario where the learner is unaware of the reward functions during the exploration stage -- and designs an algorithm that improves over the state of the art. More precisely, consider a finite-horizon non-stationary Markov decision process with $S$ states, $A$ actions, and horizon length $H$, and suppose that there are no more than a polynomial number of given reward functions of interest. By collecting on the order of $\frac{SAH^3}{\varepsilon^2}$ sample episodes (up to log factors) without guidance of the reward information, our algorithm is able to find $\varepsilon$-optimal policies for all these reward functions, provided that $\varepsilon$ is sufficiently small. This forms the first reward-agnostic exploration scheme in this context that achieves provable minimax optimality. Furthermore, once the sample size exceeds $\frac{S^2AH^3}{\varepsilon^2}$ episodes (up to log factors), our algorithm is able to yield $\varepsilon$ accuracy for arbitrarily many reward functions (even when they are adversarially designed), a task commonly dubbed "reward-free exploration." The novelty of our algorithm design draws on insights from offline RL: the exploration scheme attempts to maximize a critical reward-agnostic quantity that dictates the performance of offline RL, while the policy learning paradigm leverages ideas from sample-optimal offline RL paradigms.
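
To get a feel for the two sample-size orders quoted in the abstract, the short sketch below evaluates them for concrete values. It drops all constants and log factors, so the outputs are orders of magnitude only, not actual episode counts; the function names and the example values of $S$, $A$, $H$, and $\varepsilon$ are illustrative assumptions.

```python
def reward_agnostic_order(S, A, H, eps):
    """Order S * A * H^3 / eps^2 (polynomially many given reward functions)."""
    return S * A * H**3 / eps**2


def reward_free_order(S, A, H, eps):
    """Order S^2 * A * H^3 / eps^2 (arbitrarily many reward functions)."""
    return S**2 * A * H**3 / eps**2


S, A, H, eps = 10, 5, 20, 0.1
agnostic = reward_agnostic_order(S, A, H, eps)
free = reward_free_order(S, A, H, eps)
ratio = free / agnostic  # the two orders differ by a factor of S
```

For these values the reward-agnostic order is about $4 \times 10^7$ and the reward-free order is about $4 \times 10^8$, a gap of exactly the factor $S$.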
