Renzhe Xu

Model Agnostic Sample Reweighting for Out-of-Distribution Learning

Jan 24, 2023

Xiao Zhou, Yong Lin, Renjie Pi, Weizhong Zhang, Renzhe Xu, Peng Cui, Tong Zhang

Abstract:Distributionally robust optimization (DRO) and invariant risk minimization (IRM) are two popular methods proposed to improve out-of-distribution (OOD) generalization performance of machine learning models. While effective for small models, it has been observed that these methods can be vulnerable to overfitting with large overparameterized models. This work proposes a principled method, Model Agnostic samPLe rEweighting (MAPLE), to effectively address the OOD problem, especially in overparameterized scenarios. Our key idea is to find an effective reweighting of the training samples so that the standard empirical risk minimization training of a large model on the weighted training data leads to superior OOD generalization performance. The overfitting issue is addressed by considering a bilevel formulation to search for the sample reweighting, in which the generalization complexity depends on the search space of sample weights instead of the model size. We present theoretical analysis in the linear case to prove the insensitivity of MAPLE to model size, and empirically verify its superiority in surpassing state-of-the-art methods by a large margin. Code is available at https://github.com/x-zho14/MAPLE.
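
The step this abstract builds on, empirical risk minimization on weighted training data, can be illustrated with a small sketch. This is not the MAPLE implementation: the bilevel search for the weights is omitted, the weights are set by hand, and the function name `weighted_least_squares` and the toy data are assumptions made for illustration only.

```python
import numpy as np

def weighted_least_squares(X, y, w):
    """Minimise sum_i w_i * (x_i . theta - y_i)^2 in closed form.

    Hypothetical helper: a linear model stands in for the large
    model mentioned in the abstract.
    """
    Xw = X * w[:, None]                     # scale each row by its sample weight
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)

# Toy data: the first three points lie on y = x, the last one does not.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # intercept column + feature
y = np.array([0.0, 1.0, 2.0, 10.0])

uniform = weighted_least_squares(X, y, np.ones(4))
# Hand-picked weights that switch off the last sample (an assumption, not learned).
reweighted = weighted_least_squares(X, y, np.array([1.0, 1.0, 1.0, 0.0]))
```

With the last sample weighted to zero, the fit recovers intercept 0 and slope 1, whereas uniform weights give a steeper slope. The same model and loss yield different solutions purely through the choice of sample weights.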

Stable Learning via Sparse Variable Independence

Dec 02, 2022

Han Yu, Peng Cui, Yue He, Zheyan Shen, Yong Lin, Renzhe Xu, Xingxuan Zhang

Abstract:The problem of covariate-shift generalization has attracted intensive research attention. Previous stable learning algorithms employ sample reweighting schemes to decorrelate the covariates when there is no explicit domain information about training data. However, with finite samples, it is difficult to achieve the desirable weights that ensure perfect independence to get rid of the unstable variables. Besides, decorrelating within stable variables may bring about high variance of learned models because of the over-reduced effective sample size. A tremendous sample size is required for these algorithms to work. In this paper, with theoretical justification, we propose SVI (Sparse Variable Independence) for the covariate-shift generalization problem. We introduce sparsity constraint to compensate for the imperfectness of sample reweighting under the finite-sample setting in previous methods. Furthermore, we organically combine independence-based sample reweighting and sparsity-based variable selection in an iterative way to avoid decorrelating within stable variables, increasing the effective sample size to alleviate variance inflation. Experiments on both synthetic and real-world datasets demonstrate the improvement of covariate-shift generalization performance brought by SVI.

* Accepted by AAAI 2023

Product Ranking for Revenue Maximization with Multiple Purchases

Oct 15, 2022

Renzhe Xu, Xingxuan Zhang, Bo Li, Yafeng Zhang, Xiaolong Chen, Peng Cui

Abstract:Product ranking is the core problem for revenue-maximizing online retailers. To design proper product ranking algorithms, various consumer choice models are proposed to characterize the consumers' behaviors when they are provided with a list of products. However, existing works assume that each consumer purchases at most one product or will keep viewing the product list after purchasing a product, which does not agree with the common practice in real scenarios. In this paper, we assume that each consumer can purchase multiple products at will. To model consumers' willingness to view and purchase, we set a random attention span and purchase budget, which determine the maximum number of products that he/she views and purchases, respectively. Under this setting, we first design an optimal ranking policy when the online retailer can precisely model consumers' behaviors. Based on the policy, we further develop the Multiple-Purchase-with-Budget UCB (MPB-UCB) algorithms with $\tilde{O}(\sqrt{T})$ regret that estimate consumers' behaviors and maximize revenue simultaneously in online settings. Experiments on both synthetic and semi-synthetic datasets demonstrate the effectiveness of the proposed algorithms.

* NeurIPS 2022
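
The consumer model in this abstract (an attention span limiting how many products are viewed, and a purchase budget limiting how many are bought) can be sketched as an expected-revenue computation for one fixed span and budget. This is an illustrative sketch, not the paper's ranking policy or the MPB-UCB algorithm. The function name `expected_revenue`, the assumption that each viewed product is bought independently with a known probability, and the toy numbers are all hypothetical.

```python
def expected_revenue(ranking, span, budget):
    """Expected revenue from one consumer with a fixed span and budget.

    ranking: list of (purchase_probability, revenue) in display order.
    Assumption: purchases of viewed products are independent events.
    """
    # dist[k] = probability that the consumer has bought exactly k products so far
    dist = [1.0] + [0.0] * budget
    total = 0.0
    for prob, revenue in ranking[:span]:        # only the first `span` products are viewed
        new = [0.0] * (budget + 1)
        for k in range(budget):                 # consumers who still have budget left
            total += dist[k] * prob * revenue
            new[k + 1] += dist[k] * prob        # bought this product
            new[k] += dist[k] * (1.0 - prob)    # skipped this product
        new[budget] += dist[budget]             # budget exhausted: no further purchases
        dist = new
    return total

products = [(0.5, 10.0), (1.0, 4.0)]
rev_high_value_first = expected_revenue(products, span=2, budget=1)
rev_sure_sale_first = expected_revenue(products[::-1], span=2, budget=1)
rev_budget_two = expected_revenue(products, span=2, budget=2)
```

With a budget of one purchase, the order of the two products changes the expected revenue, which is why the ranking matters once budgets bind. With a budget of two, both products can be bought and the order is irrelevant in this toy case.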

NICO++: Towards Better Benchmarking for Domain Generalization

Apr 21, 2022

Xingxuan Zhang, Yue He, Renzhe Xu, Han Yu, Zheyan Shen, Peng Cui

Abstract:Despite the remarkable performance that modern deep neural networks have achieved on independent and identically distributed (I.I.D.) data, they can crash under distribution shifts. Most current evaluation methods for domain generalization (DG) adopt the leave-one-out strategy as a compromise on the limited number of domains. We propose a large-scale benchmark with extensive labeled domains named NICO++ along with more rational evaluation methods for comprehensively evaluating DG algorithms. To evaluate DG datasets, we propose two metrics to quantify covariate shift and concept shift, respectively. Two novel generalization bounds from the perspective of data construction are proposed to prove that limited concept shift and significant covariate shift favor the evaluation capability for generalization. Through extensive experiments, NICO++ shows its superior evaluation capability compared with current DG datasets and its contribution in alleviating unfairness caused by the leak of oracle knowledge in model selection.

* The NICO challenge based on NICO++ can be found at https://nicochallenge.com/

Towards Domain Generalization in Object Detection

Mar 27, 2022

Xingxuan Zhang, Zekai Xu, Renzhe Xu, Jiashuo Liu, Peng Cui, Weitao Wan, Chong Sun, Chen Li

Abstract:Despite the striking performance achieved by modern detectors when training and test data are sampled from the same or similar distribution, the generalization ability of detectors under unknown distribution shifts remains largely unstudied. Several recent works have discussed detectors' ability to adapt to a specific target domain, but these approaches are not readily applicable in real-world applications, since detectors may encounter various environments or situations and pre-collecting all of them before training is infeasible. In this paper, we study the critical problem of domain generalization in object detection (DGOD), where detectors are trained with source domains and evaluated on unknown target domains. To thoroughly evaluate detectors under unknown distribution shifts, we formulate the DGOD problem and propose a comprehensive evaluation benchmark to fill the vacancy. Moreover, we propose a novel method named Region Aware Proposal reweighTing (RAPT) to eliminate dependence within RoI features. Extensive experiments demonstrate that current DG methods fail to address the DGOD problem and our method outperforms other state-of-the-art counterparts.

Regulatory Instruments for Fair Personalized Pricing

Feb 09, 2022

Renzhe Xu, Xingxuan Zhang, Peng Cui, Bo Li, Zheyan Shen, Jiazheng Xu

Abstract:Personalized pricing is a business strategy to charge different prices to individual consumers based on their characteristics and behaviors. It has become common practice in many industries nowadays due to the availability of a growing amount of highly granular consumer data. The discriminatory nature of personalized pricing has triggered heated debates among policymakers and academics on how to design regulation policies to balance market efficiency and equity. In this paper, we propose two sound policy instruments, i.e., capping the range of the personalized prices or their ratios. We investigate the optimal pricing strategy of a profit-maximizing monopoly under both regulatory constraints and the impact of imposing them on consumer surplus, producer surplus, and social welfare. We theoretically prove that both proposed constraints can help balance consumer surplus and producer surplus at the expense of total surplus for common demand distributions, such as uniform, logistic, and exponential distributions. Experiments on both simulation and real-world datasets demonstrate the correctness of these theoretical results. Our findings and insights shed light on regulatory policy design for the increasingly monopolized business in the digital era.

* WWW 2022
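
The two policy instruments named in this abstract can be made concrete with a small check on a set of personalized prices. This is a sketch under one reading of the abstract: the range cap is taken to bound the difference between the highest and lowest price, and the ratio cap to bound their quotient. The function names, the cap values, and the example prices are hypothetical and not taken from the paper.

```python
def satisfies_range_cap(prices, cap):
    """True if the gap between the highest and lowest price is at most `cap`."""
    return max(prices) - min(prices) <= cap

def satisfies_ratio_cap(prices, cap):
    """True if the highest price is at most `cap` times the lowest.

    Assumes all prices are strictly positive.
    """
    return max(prices) / min(prices) <= cap

# Hypothetical prices offered to three consumers.
prices = [8.0, 10.0, 12.0]

range_ok = satisfies_range_cap(prices, cap=4.0)      # 12 - 8 = 4
range_tight = satisfies_range_cap(prices, cap=3.0)
ratio_ok = satisfies_ratio_cap(prices, cap=1.5)      # 12 / 8 = 1.5
ratio_tight = satisfies_ratio_cap(prices, cap=1.4)
```

A regulator choosing a tighter cap makes the same price schedule non-compliant, which forces the seller to compress prices. How the seller then re-optimizes is the subject of the paper and is not modelled here.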

Why Stable Learning Works? A Theory of Covariate Shift Generalization

Nov 03, 2021

Renzhe Xu, Peng Cui, Zheyan Shen, Xingxuan Zhang, Tong Zhang

Abstract:Covariate shift generalization, a typical case in out-of-distribution (OOD) generalization, requires good performance on the unknown testing distribution, which varies from the accessible training distribution in the form of covariate shift. Recently, stable learning algorithms have shown empirical effectiveness in dealing with covariate shift generalization on several learning models involving regression algorithms and deep neural networks. However, the theoretical explanations for such effectiveness are still missing. In this paper, we take a step further towards the theoretical analysis of stable learning algorithms by explaining them as feature selection processes. We first specify a set of variables, named minimal stable variable set, that is minimal and optimal to deal with covariate shift generalization for common loss functions, including the mean squared loss and binary cross entropy loss. Then we prove that under ideal conditions, stable learning algorithms could identify the variables in this set. Further analysis of asymptotic properties and error propagation is also provided. These theories shed light on why stable learning works for covariate shift generalization.

* 25 pages

Towards Out-Of-Distribution Generalization: A Survey

Aug 31, 2021

Zheyan Shen, Jiashuo Liu, Yue He, Xingxuan Zhang, Renzhe Xu, Han Yu, Peng Cui

Abstract:Classic machine learning methods are built on the i.i.d. assumption that training and testing data are independent and identically distributed. However, in real scenarios, the i.i.d. assumption can hardly be satisfied, leading to sharp drops in the performance of classic machine learning algorithms under distributional shifts, which indicates the significance of investigating the Out-of-Distribution generalization problem. The Out-of-Distribution (OOD) generalization problem addresses the challenging setting where the testing distribution is unknown and different from the training distribution. This paper serves as the first effort to systematically and comprehensively discuss the OOD generalization problem, from the definition, methodology, and evaluation to the implications and future directions. Firstly, we provide the formal definition of the OOD generalization problem. Secondly, existing methods are categorized into three parts based on their positions in the whole learning pipeline, namely unsupervised representation learning, supervised model learning and optimization, and typical methods for each category are discussed in detail. We then demonstrate the theoretical connections of different categories, and introduce the commonly used datasets and evaluation metrics. Finally, we summarize the whole literature and raise some future directions for the OOD generalization problem. The summary of OOD generalization methods reviewed in this survey can be found at http://out-of-distribution-generalization.com.

Domain-Irrelevant Representation Learning for Unsupervised Domain Generalization

Jul 13, 2021

Xingxuan Zhang, Linjun Zhou, Renzhe Xu, Peng Cui, Zheyan Shen, Haoxin Liu

Abstract:Domain generalization (DG) aims to help models trained on a set of source domains generalize better on unseen target domains. The performance of current DG methods largely relies on sufficient labeled data, which are usually costly or unavailable. Since unlabeled data are far more accessible, we seek to explore how unsupervised learning can help deep models generalize across domains. Specifically, we study a novel generalization problem called unsupervised domain generalization, which aims to learn generalizable models with unlabeled data. Furthermore, we propose a Domain-Irrelevant Unsupervised Learning (DIUL) method to cope with the significant and misleading heterogeneity within unlabeled data and severe distribution shifts between source and target data. Surprisingly, we observe that DIUL can not only counterbalance the scarcity of labeled data but also further strengthen the generalization ability of models when the labeled data are sufficient. As a pretraining approach, DIUL proves superior to the ImageNet pretraining protocol even when the available data are unlabeled and far smaller in amount than ImageNet. Extensive experiments clearly demonstrate the effectiveness of our method compared with state-of-the-art unsupervised learning counterparts.

Deep Stable Learning for Out-Of-Distribution Generalization

Apr 16, 2021

Xingxuan Zhang, Peng Cui, Renzhe Xu, Linjun Zhou, Yue He, Zheyan Shen

Abstract:Approaches based on deep neural networks have achieved striking performance when testing data and training data share a similar distribution, but can significantly fail otherwise. Therefore, eliminating the impact of distribution shifts between training and testing data is crucial for building performance-promising deep models. Conventional methods assume either the known heterogeneity of training data (e.g. domain labels) or the approximately equal capacities of different domains. In this paper, we consider a more challenging case where neither of the above assumptions holds. We propose to address this problem by removing the dependencies between features via learning weights for training samples, which helps deep models get rid of spurious correlations and, in turn, concentrate more on the true connection between discriminative features and labels. Through extensive experiments on distribution generalization benchmarks including PACS, VLCS, MNIST-M, and NICO, we show the effectiveness of our method compared with state-of-the-art counterparts.
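
The idea of removing dependencies between features through sample weights can be illustrated on two toy features. This sketch measures only linear correlation and uses hand-picked weights; it is not the method of the paper, which learns the weights. The function name `weighted_correlation` and the data are assumptions for illustration.

```python
import numpy as np

def weighted_correlation(a, b, w):
    """Pearson correlation of a and b under sample weights w (hypothetical helper)."""
    w = w / w.sum()                                  # normalise weights to sum to 1
    mean_a, mean_b = np.sum(w * a), np.sum(w * b)
    cov = np.sum(w * (a - mean_a) * (b - mean_b))
    var_a = np.sum(w * (a - mean_a) ** 2)
    var_b = np.sum(w * (b - mean_b) ** 2)
    return cov / np.sqrt(var_a * var_b)

# Two binary features; the combination (1, 1) is over-represented,
# which induces a positive correlation in the raw sample.
a = np.array([0.0, 1.0, 1.0, 0.0, 1.0])
b = np.array([0.0, 1.0, 1.0, 1.0, 0.0])

raw = weighted_correlation(a, b, np.ones(5))
# Hand-picked weights that halve the over-represented (1, 1) samples.
balanced = weighted_correlation(a, b, np.array([1.0, 0.5, 0.5, 1.0, 1.0]))
```

Under uniform weights the two features are positively correlated. Down-weighting the duplicated combination makes every combination equally represented, and the weighted correlation drops to zero.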
