Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Qian Lin

The phase diagram of kernel interpolation in large dimensions

Apr 19, 2024

Haobo Zhang, Weihao Lu, Qian Lin

Figure 1 for The phase diagram of kernel interpolation in large dimensions

Abstract:The generalization ability of kernel interpolation in large dimensions (i.e., $n \asymp d^{\gamma}$ for some $\gamma>0$) might be one of the most interesting problems in the recent renaissance of kernel regression, since it may help us understand the 'benign overfitting phenomenon' reported in the neural networks literature. Focusing on the inner product kernel on the sphere, we fully characterized the exact order of both the variance and bias of large-dimensional kernel interpolation under various source conditions $s\geq 0$. Consequently, we obtained the $(s,\gamma)$-phase diagram of large-dimensional kernel interpolation, i.e., we determined the regions in $(s,\gamma)$-plane where the kernel interpolation is minimax optimal, sub-optimal and inconsistent.

* 18 pages, 1 figure

Via

Access Paper or Ask Questions

The Optimality of Kernel Classifiers in Sobolev Space

Feb 02, 2024

Jianfa Lai, Zhifan Li, Dongming Huang, Qian Lin

Figure 1 for The Optimality of Kernel Classifiers in Sobolev Space

Figure 2 for The Optimality of Kernel Classifiers in Sobolev Space

Figure 3 for The Optimality of Kernel Classifiers in Sobolev Space

Abstract:Kernel methods are widely used in machine learning, especially for classification problems. However, the theoretical analysis of kernel classification is still limited. This paper investigates the statistical performances of kernel classifiers. With some mild assumptions on the conditional probability $\eta(x)=\mathbb{P}(Y=1\mid X=x)$, we derive an upper bound on the classification excess risk of a kernel classifier using recent advances in the theory of kernel regression. We also obtain a minimax lower bound for Sobolev spaces, which shows the optimality of the proposed classifier. Our theoretical results can be extended to the generalization error of overparameterized neural network classifiers. To make our theoretical results more applicable in realistic settings, we also propose a simple method to estimate the interpolation smoothness of $2\eta(x)-1$ and apply the method to real datasets.

* 21 pages, 2 figures

Via

Access Paper or Ask Questions

Off-Policy Primal-Dual Safe Reinforcement Learning

Jan 26, 2024

Zifan Wu, Bo Tang, Qian Lin, Chao Yu, Shangqin Mao, Qianlong Xie, Xingxing Wang, Dong Wang

Figure 1 for Off-Policy Primal-Dual Safe Reinforcement Learning

Figure 2 for Off-Policy Primal-Dual Safe Reinforcement Learning

Figure 3 for Off-Policy Primal-Dual Safe Reinforcement Learning

Figure 4 for Off-Policy Primal-Dual Safe Reinforcement Learning

Abstract:Primal-dual safe RL methods commonly perform iterations between the primal update of the policy and the dual update of the Lagrange Multiplier. Such a training paradigm is highly susceptible to the error in cumulative cost estimation since this estimation serves as the key bond connecting the primal and dual update processes. We show that this problem causes significant underestimation of cost when using off-policy methods, leading to the failure to satisfy the safety constraint. To address this issue, we propose \textit{conservative policy optimization}, which learns a policy in a constraint-satisfying area by considering the uncertainty in cost estimation. This improves constraint satisfaction but also potentially hinders reward maximization. We then introduce \textit{local policy convexification} to help eliminate such suboptimality by gradually reducing the estimation uncertainty. We provide theoretical interpretations of the joint coupling effect of these two ingredients and further verify them by extensive experiments. Results on benchmark tasks show that our method not only achieves an asymptotic performance comparable to state-of-the-art on-policy methods while using much fewer samples, but also significantly reduces constraint violation during training. Our code is available at https://github.com/ZifanWu/CAL.

* ICLR 2024 Poster

Via

Access Paper or Ask Questions

Policy-regularized Offline Multi-objective Reinforcement Learning

Jan 04, 2024

Qian Lin, Chao Yu, Zongkai Liu, Zifan Wu

Figure 1 for Policy-regularized Offline Multi-objective Reinforcement Learning

Figure 2 for Policy-regularized Offline Multi-objective Reinforcement Learning

Figure 3 for Policy-regularized Offline Multi-objective Reinforcement Learning

Figure 4 for Policy-regularized Offline Multi-objective Reinforcement Learning

Abstract:In this paper, we aim to utilize only offline trajectory data to train a policy for multi-objective RL. We extend the offline policy-regularized method, a widely-adopted approach for single-objective offline RL problems, into the multi-objective setting in order to achieve the above goal. However, such methods face a new challenge in offline MORL settings, namely the preference-inconsistent demonstration problem. We propose two solutions to this problem: 1) filtering out preference-inconsistent demonstrations via approximating behavior preferences, and 2) adopting regularization techniques with high policy expressiveness. Moreover, we integrate the preference-conditioned scalarized update method into policy-regularized offline RL, in order to simultaneously learn a set of policies using a single policy network, thus reducing the computational cost induced by the training of a large number of individual policies for various preferences. Finally, we introduce Regularization Weight Adaptation to dynamically determine appropriate regularization weights for arbitrary target preferences during deployment. Empirical results on various multi-objective datasets demonstrate the capability of our approach in solving offline MORL problems.

Via

Access Paper or Ask Questions

Generalization Error Curves for Analytic Spectral Algorithms under Power-law Decay

Jan 03, 2024

Yicheng Li, Weiye Gan, Zuoqiang Shi, Qian Lin

Abstract:The generalization error curve of certain kernel regression method aims at determining the exact order of generalization error with various source condition, noise level and choice of the regularization parameter rather than the minimax rate. In this work, under mild assumptions, we rigorously provide a full characterization of the generalization error curves of the kernel gradient descent method (and a large class of analytic spectral algorithms) in kernel regression. Consequently, we could sharpen the near inconsistency of kernel interpolation and clarify the saturation effects of kernel regression algorithms with higher qualification, etc. Thanks to the neural tangent kernel theory, these results greatly improve our understanding of the generalization behavior of training the wide neural networks. A novel technical contribution, the analytic functional argument, might be of independent interest.

Via

Access Paper or Ask Questions

Optimal Rates of Kernel Ridge Regression under Source Condition in Large Dimensions

Jan 02, 2024

Haobo Zhang, Yicheng Li, Weihao Lu, Qian Lin

Abstract:Motivated by the studies of neural networks (e.g.,the neural tangent kernel theory), we perform a study on the large-dimensional behavior of kernel ridge regression (KRR) where the sample size $n \asymp d^{\gamma}$ for some $\gamma > 0$. Given an RKHS $\mathcal{H}$ associated with an inner product kernel defined on the sphere $\mathbb{S}^{d}$, we suppose that the true function $f_{\rho}^{*} \in [\mathcal{H}]^{s}$, the interpolation space of $\mathcal{H}$ with source condition $s>0$. We first determined the exact order (both upper and lower bound) of the generalization error of kernel ridge regression for the optimally chosen regularization parameter $\lambda$. We then further showed that when $0<s\le1$, KRR is minimax optimal; and when $s>1$, KRR is not minimax optimal (a.k.a. he saturation effect). Our results illustrate that the curves of rate varying along $\gamma$ exhibit the periodic plateau behavior and the multiple descent behavior and show how the curves evolve with $s>0$. Interestingly, our work provides a unified viewpoint of several recent works on kernel regression in the large-dimensional setting, which correspond to $s=0$ and $s=1$ respectively.

* 61 pages, 11 figures

Via

Access Paper or Ask Questions

On the Asymptotic Learning Curves of Kernel Ridge Regression under Power-law Decay

Sep 23, 2023

Yicheng Li, Haobo Zhang, Qian Lin

Figure 1 for On the Asymptotic Learning Curves of Kernel Ridge Regression under Power-law Decay

Figure 2 for On the Asymptotic Learning Curves of Kernel Ridge Regression under Power-law Decay

Figure 3 for On the Asymptotic Learning Curves of Kernel Ridge Regression under Power-law Decay

Figure 4 for On the Asymptotic Learning Curves of Kernel Ridge Regression under Power-law Decay

Abstract:The widely observed 'benign overfitting phenomenon' in the neural network literature raises the challenge to the 'bias-variance trade-off' doctrine in the statistical learning theory. Since the generalization ability of the 'lazy trained' over-parametrized neural network can be well approximated by that of the neural tangent kernel regression, the curve of the excess risk (namely, the learning curve) of kernel ridge regression attracts increasing attention recently. However, most recent arguments on the learning curve are heuristic and are based on the 'Gaussian design' assumption. In this paper, under mild and more realistic assumptions, we rigorously provide a full characterization of the learning curve: elaborating the effect and the interplay of the choice of the regularization parameter, the source condition and the noise. In particular, our results suggest that the 'benign overfitting phenomenon' exists in very wide neural networks only when the noise level is small.

Via

Access Paper or Ask Questions

Optimal Rate of Kernel Regression in Large Dimensions

Sep 08, 2023

Weihao Lu, Haobo Zhang, Yicheng Li, Manyun Xu, Qian Lin

Figure 1 for Optimal Rate of Kernel Regression in Large Dimensions

Figure 2 for Optimal Rate of Kernel Regression in Large Dimensions

Figure 3 for Optimal Rate of Kernel Regression in Large Dimensions

Abstract:We perform a study on kernel regression for large-dimensional data (where the sample size $n$ is polynomially depending on the dimension $d$ of the samples, i.e., $n\asymp d^{\gamma}$ for some $\gamma >0$ ). We first build a general tool to characterize the upper bound and the minimax lower bound of kernel regression for large dimensional data through the Mendelson complexity $\varepsilon_{n}^{2}$ and the metric entropy $\bar{\varepsilon}_{n}^{2}$ respectively. When the target function falls into the RKHS associated with a (general) inner product model defined on $\mathbb{S}^{d}$, we utilize the new tool to show that the minimax rate of the excess risk of kernel regression is $n^{-1/2}$ when $n\asymp d^{\gamma}$ for $\gamma =2, 4, 6, 8, \cdots$. We then further determine the optimal rate of the excess risk of kernel regression for all the $\gamma>0$ and find that the curve of optimal rate varying along $\gamma$ exhibits several new phenomena including the {\it multiple descent behavior} and the {\it periodic plateau behavior}. As an application, For the neural tangent kernel (NTK), we also provide a similar explicit description of the curve of optimal rate. As a direct corollary, we know these claims hold for wide neural networks as well.

Via

Access Paper or Ask Questions

Safe Offline Reinforcement Learning with Real-Time Budget Constraints

Jun 01, 2023

Qian Lin, Bo Tang, Zifan Wu, Chao Yu, Shangqin Mao, Qianlong Xie, Xingxing Wang, Dong Wang

Figure 1 for Safe Offline Reinforcement Learning with Real-Time Budget Constraints

Figure 2 for Safe Offline Reinforcement Learning with Real-Time Budget Constraints

Figure 3 for Safe Offline Reinforcement Learning with Real-Time Budget Constraints

Figure 4 for Safe Offline Reinforcement Learning with Real-Time Budget Constraints

Abstract:Aiming at promoting the safe real-world deployment of Reinforcement Learning (RL), research on safe RL has made significant progress in recent years. However, most existing works in the literature still focus on the online setting where risky violations of the safety budget are likely to be incurred during training. Besides, in many real-world applications, the learned policy is required to respond to dynamically determined safety budgets (i.e., constraint threshold) in real time. In this paper, we target at the above real-time budget constraint problem under the offline setting, and propose Trajectory-based REal-time Budget Inference (TREBI) as a novel solution that approaches this problem from the perspective of trajectory distribution. Theoretically, we prove an error bound of the estimation on the episodic reward and cost under the offline setting and thus provide a performance guarantee for TREBI. Empirical results on a wide range of simulation tasks and a real-world large-scale advertising application demonstrate the capability of TREBI in solving real-time budget constraint problems under offline settings.

* We propose a method to handle the constraint problem with dynamically determined safety budgets under the offline setting

Via

Access Paper or Ask Questions

Generalization Ability of Wide Residual Networks

May 29, 2023

Jianfa Lai, Zixiong Yu, Songtao Tian, Qian Lin

Abstract:In this paper, we study the generalization ability of the wide residual network on $\mathbb{S}^{d-1}$ with the ReLU activation function. We first show that as the width $m\rightarrow\infty$, the residual network kernel (RNK) uniformly converges to the residual neural tangent kernel (RNTK). This uniform convergence further guarantees that the generalization error of the residual network converges to that of the kernel regression with respect to the RNTK. As direct corollaries, we then show $i)$ the wide residual network with the early stopping strategy can achieve the minimax rate provided that the target regression function falls in the reproducing kernel Hilbert space (RKHS) associated with the RNTK; $ii)$ the wide residual network can not generalize well if it is trained till overfitting the data. We finally illustrate some experiments to reconcile the contradiction between our theoretical result and the widely observed ``benign overfitting phenomenon''

* 28 pages, 3 figures

Via

Access Paper or Ask Questions