Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yuxin Chen

Online Learning of Energy Consumption for Navigation of Electric Vehicles

Nov 03, 2021
Niklas Åkerblom, Yuxin Chen, Morteza Haghir Chehreghani

Figure 1 for Online Learning of Energy Consumption for Navigation of Electric Vehicles

Figure 2 for Online Learning of Energy Consumption for Navigation of Electric Vehicles

Figure 3 for Online Learning of Energy Consumption for Navigation of Electric Vehicles

Figure 4 for Online Learning of Energy Consumption for Navigation of Electric Vehicles

Energy-efficient navigation constitutes an important challenge in electric vehicles, due to their limited battery capacity. We employ a Bayesian approach to model the energy consumption at road segments for efficient navigation. In order to learn the model parameters, we develop an online learning framework and investigate several exploration strategies such as Thompson Sampling and Upper Confidence Bound. We then extend our online learning framework to multi-agent setting, where multiple vehicles adaptively navigate and learn the parameters of the energy model. We analyze Thompson Sampling and establish rigorous regret bounds on its performance in the single-agent and multi-agent settings, through an analysis of the algorithm under batched feedback. Finally, we demonstrate the performance of our methods via experiments on several real-world city road networks.

* Extension of IJCAI 2020 Main Track conference paper, sole copyright holder IJCAI (International Joint Conferences on Artificial Intelligence), all rights reserved. Original conference paper available at https://www.ijcai.org/Proceedings/2020/284 and arXiv:2003.01416

Via

Access Paper or Ask Questions

Teaching an Active Learner with Contrastive Examples

Oct 29, 2021
Chaoqi Wang, Adish Singla, Yuxin Chen

Figure 1 for Teaching an Active Learner with Contrastive Examples

Figure 2 for Teaching an Active Learner with Contrastive Examples

Figure 3 for Teaching an Active Learner with Contrastive Examples

Figure 4 for Teaching an Active Learner with Contrastive Examples

We study the problem of active learning with the added twist that the learner is assisted by a helpful teacher. We consider the following natural interaction protocol: At each round, the learner proposes a query asking for the label of an instance $x^q$, the teacher provides the requested label $\{x^q, y^q\}$ along with explanatory information to guide the learning process. In this paper, we view this information in the form of an additional contrastive example ($\{x^c, y^c\}$) where $x^c$ is picked from a set constrained by $x^q$ (e.g., dissimilar instances with the same label). Our focus is to design a teaching algorithm that can provide an informative sequence of contrastive examples to the learner to speed up the learning process. We show that this leads to a challenging sequence optimization problem where the algorithm's choices at a given round depend on the history of interactions. We investigate an efficient teaching algorithm that adaptively picks these contrastive examples. We derive strong performance guarantees for our algorithm based on two problem-dependent parameters and further show that for specific types of active learners (e.g., a generalized binary search learner), the proposed teaching algorithm exhibits strong approximation guarantees. Finally, we illustrate our bounds and demonstrate the effectiveness of our teaching framework via two numerical case studies.

* Accepted at NeurIPS 2021

Via

Access Paper or Ask Questions

Class-wise Thresholding for Detecting Out-of-Distribution Data

Oct 28, 2021
Matteo Guarrera, Baihong Jin, Tung-Wei Lin, Maria Zuluaga, Yuxin Chen, Alberto Sangiovanni-Vincentelli

Figure 1 for Class-wise Thresholding for Detecting Out-of-Distribution Data

Figure 2 for Class-wise Thresholding for Detecting Out-of-Distribution Data

Figure 3 for Class-wise Thresholding for Detecting Out-of-Distribution Data

Figure 4 for Class-wise Thresholding for Detecting Out-of-Distribution Data

We consider the problem of detecting OoD(Out-of-Distribution) input data when using deep neural networks, and we propose a simple yet effective way to improve the robustness of several popular OoD detection methods against label shift. Our work is motivated by the observation that most existing OoD detection algorithms consider all training/test data as a whole, regardless of which class entry each input activates (inter-class differences). Through extensive experimentation, we have found that such practice leads to a detector whose performance is sensitive and vulnerable to label shift. To address this issue, we propose a class-wise thresholding scheme that can apply to most existing OoD detection algorithms and can maintain similar OoD detection performance even in the presence of label shift in the test distribution.

* 12 pages, 7 figures, 7 tables

Via

Access Paper or Ask Questions

Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning

Oct 09, 2021
Gen Li, Laixi Shi, Yuxin Chen, Yuantao Gu, Yuejie Chi

Figure 1 for Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning

Achieving sample efficiency in online episodic reinforcement learning (RL) requires optimally balancing exploration and exploitation. When it comes to a finite-horizon episodic Markov decision process with $S$ states, $A$ actions and horizon length $H$, substantial progress has been achieved towards characterizing the minimax-optimal regret, which scales on the order of $\sqrt{H^2SAT}$ (modulo log factors) with $T$ the total number of samples. While several competing solution paradigms have been proposed to minimize regret, they are either memory-inefficient, or fall short of optimality unless the sample size exceeds an enormous threshold (e.g., $S^6A^4 \,\mathrm{poly}(H)$ for existing model-free methods). To overcome such a large sample size barrier to efficient RL, we design a novel model-free algorithm, with space complexity $O(SAH)$, that achieves near-optimal regret as soon as the sample size exceeds the order of $SA\,\mathrm{poly}(H)$. In terms of this sample size requirement (also referred to the initial burn-in cost), our method improves -- by at least a factor of $S^5A^3$ -- upon any prior memory-efficient algorithm that is asymptotically regret-optimal. Leveraging the recently introduced variance reduction strategy (also called {\em reference-advantage decomposition}), the proposed algorithm employs an {\em early-settled} reference update rule, with the aid of two Q-learning sequences with upper and lower confidence bounds. The design principle of our early-settled variance reduction method might be of independent interest to other RL settings that involve intricate exploration-exploitation trade-offs.

* Short version in Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021)

Via

Access Paper or Ask Questions

Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition

Aug 23, 2021
Yuxin Chen, Ziqi Zhang, Chunfeng Yuan, Bing Li, Ying Deng, Weiming Hu

Figure 1 for Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition

Figure 2 for Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition

Figure 3 for Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition

Figure 4 for Channel-wise Topology Refinement Graph Convolution for Skeleton-Based Action Recognition

Graph convolutional networks (GCNs) have been widely used and achieved remarkable results in skeleton-based action recognition. In GCNs, graph topology dominates feature aggregation and therefore is the key to extracting representative features. In this work, we propose a novel Channel-wise Topology Refinement Graph Convolution (CTR-GC) to dynamically learn different topologies and effectively aggregate joint features in different channels for skeleton-based action recognition. The proposed CTR-GC models channel-wise topologies through learning a shared topology as a generic prior for all channels and refining it with channel-specific correlations for each channel. Our refinement method introduces few extra parameters and significantly reduces the difficulty of modeling channel-wise topologies. Furthermore, via reformulating graph convolutions into a unified form, we find that CTR-GC relaxes strict constraints of graph convolutions, leading to stronger representation capability. Combining CTR-GC with temporal modeling modules, we develop a powerful graph convolutional network named CTR-GCN which notably outperforms state-of-the-art methods on the NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets.

* Accepted to ICCV2021. Camera-ready version with supplementary materials. Code is available at https://github.com/Uason-Chen/CTR-GCN

Via

Access Paper or Ask Questions

A Contract Theory based Incentive Mechanism for Federated Learning

Aug 12, 2021
Mengmeng Tian, Yuxin Chen, Yuan Liu, Zehui Xiong, Cyril Leung, Chunyan Miao

Figure 1 for A Contract Theory based Incentive Mechanism for Federated Learning

Figure 2 for A Contract Theory based Incentive Mechanism for Federated Learning

Figure 3 for A Contract Theory based Incentive Mechanism for Federated Learning

Federated learning (FL) serves as a data privacy-preserved machine learning paradigm, and realizes the collaborative model trained by distributed clients. To accomplish an FL task, the task publisher needs to pay financial incentives to the FL server and FL server offloads the task to the contributing FL clients. It is challenging to design proper incentives for the FL clients due to the fact that the task is privately trained by the clients. This paper aims to propose a contract theory based FL task training model towards minimizing incentive budget subject to clients being individually rational (IR) and incentive compatible (IC) in each FL training round. We design a two-dimensional contract model by formally defining two private types of clients, namely data quality and computation effort. To effectively aggregate the trained models, a contract-based aggregator is proposed. We analyze the feasible and optimal contract solutions to the proposed contract model. %Experimental results demonstrate that the proposed framework and contract model can effective improve the generation accuracy of FL tasks. Experimental results show that the generalization accuracy of the FL tasks can be improved by the proposed incentive mechanism where contract-based aggregation is applied.

* 7 pages, 2 figures, International Workshop on Federated and Transfer Learning for Data Sparsity and Confidentiality in Conjunction with IJCAI 2021 (FTL-IJCAI'21), Best Student Paper Award

Via

Access Paper or Ask Questions

Inference for Heteroskedastic PCA with Missing Data

Jul 26, 2021
Yuling Yan, Yuxin Chen, Jianqing Fan

Figure 1 for Inference for Heteroskedastic PCA with Missing Data

Figure 2 for Inference for Heteroskedastic PCA with Missing Data

Figure 3 for Inference for Heteroskedastic PCA with Missing Data

Figure 4 for Inference for Heteroskedastic PCA with Missing Data

This paper studies how to construct confidence regions for principal component analysis (PCA) in high dimension, a problem that has been vastly under-explored. While computing measures of uncertainty for nonlinear/nonconvex estimators is in general difficult in high dimension, the challenge is further compounded by the prevalent presence of missing data and heteroskedastic noise. We propose a suite of solutions to perform valid inference on the principal subspace based on two estimators: a vanilla SVD-based approach, and a more refined iterative scheme called $\textsf{HeteroPCA}$ (Zhang et al., 2018). We develop non-asymptotic distributional guarantees for both estimators, and demonstrate how these can be invoked to compute both confidence regions for the principal subspace and entrywise confidence intervals for the spiked covariance matrix. Particularly worth highlighting is the inference procedure built on top of $\textsf{HeteroPCA}$, which is not only valid but also statistically efficient for broader scenarios (e.g., it covers a wider range of missing rates and signal-to-noise ratios). Our solutions are fully data-driven and adaptive to heteroskedastic random noise, without requiring prior knowledge about the noise levels and noise distributions.

Via

Access Paper or Ask Questions

Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence

May 24, 2021
Wenhao Zhan, Shicong Cen, Baihe Huang, Yuxin Chen, Jason D. Lee, Yuejie Chi

Figure 1 for Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence

Policy optimization, which learns the policy of interest by maximizing the value function via large-scale optimization techniques, lies at the heart of modern reinforcement learning (RL). In addition to value maximization, other practical considerations arise commonly as well, including the need of encouraging exploration, and that of ensuring certain structural properties of the learned policy due to safety, resource and operational constraints. These considerations can often be accounted for by resorting to regularized RL, which augments the target value function with a structure-promoting regularization term. Focusing on an infinite-horizon discounted Markov decision process, this paper proposes a generalized policy mirror descent (GPMD) algorithm for solving regularized RL. As a generalization of policy mirror descent Lan (2021), the proposed algorithm accommodates a general class of convex regularizers as well as a broad family of Bregman divergence in cognizant of the regularizer in use. We demonstrate that our algorithm converges linearly over an entire range of learning rates, in a dimension-free fashion, to the global solution, even when the regularizer lacks strong convexity and smoothness. In addition, this linear convergence feature is provably stable in the face of inexact policy evaluation and imperfect policy updates. Numerical experiments are provided to corroborate the applicability and appealing performance of GPMD.

Via

Access Paper or Ask Questions

Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting

May 17, 2021
Gen Li, Yuxin Chen, Yuejie Chi, Yuantao Gu, Yuting Wei

Figure 1 for Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting

Low-complexity models such as linear function representation play a pivotal role in enabling sample-efficient reinforcement learning (RL). The current paper pertains to a scenario with value-based linear representation, which postulates the linear realizability of the optimal Q-function (also called the "linear $Q^{\star}$ problem"). While linear realizability alone does not allow for sample-efficient solutions in general, the presence of a large sub-optimality gap is a potential game changer, depending on the sampling mechanism in use. Informally, sample efficiency is achievable with a large sub-optimality gap when a generative model is available but is unfortunately infeasible when we turn to standard online RL settings. In this paper, we make progress towards understanding this linear $Q^{\star}$ problem by investigating a new sampling protocol, which draws samples in an online/exploratory fashion but allows one to backtrack and revisit previous states in a controlled and infrequent manner. This protocol is more flexible than the standard online RL setting, while being practically relevant and far more restrictive than the generative model. We develop an algorithm tailored to this setting, achieving a sample complexity that scales polynomially with the feature dimension, the horizon, and the inverse sub-optimality gap, but not the size of the state/action space. Our findings underscore the fundamental interplay between sampling protocols and low-complexity structural representation in RL.

Via

Access Paper or Ask Questions