Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Chuan Yu

Scalable Virtual Valuations Combinatorial Auction Design by Combining Zeroth-Order and First-Order Optimization Method

Feb 19, 2024

Zhijian Duan, Haoran Sun, Yichong Xia, Siqiang Wang, Zhilin Zhang, Chuan Yu, Jian Xu, Bo Zheng, Xiaotie Deng

Abstract:Automated auction design seeks to discover empirically high-revenue and incentive-compatible mechanisms using machine learning. Ensuring dominant strategy incentive compatibility (DSIC) is crucial, and the most effective approach is to confine the mechanism to Affine Maximizer Auctions (AMAs). Nevertheless, existing AMA-based approaches encounter challenges such as scalability issues (arising from combinatorial candidate allocations) and the non-differentiability of revenue. In this paper, to achieve a scalable AMA-based method, we further restrict the auction mechanism to Virtual Valuations Combinatorial Auctions (VVCAs), a subset of AMAs with significantly fewer parameters. Initially, we employ a parallelizable dynamic programming algorithm to compute the winning allocation of a VVCA. Subsequently, we propose a novel optimization method that combines both zeroth-order and first-order techniques to optimize the VVCA parameters. Extensive experiments demonstrate the efficacy and scalability of our proposed approach, termed Zeroth-order and First-order Optimization of VVCAs (ZFO-VVCA), particularly when applied to large-scale auctions.

Via

Access Paper or Ask Questions

Sustainable Online Reinforcement Learning for Auto-bidding

Oct 13, 2022

Zhiyu Mou, Yusen Huo, Rongquan Bai, Mingzhou Xie, Chuan Yu, Jian Xu, Bo Zheng

Figure 1 for Sustainable Online Reinforcement Learning for Auto-bidding

Figure 2 for Sustainable Online Reinforcement Learning for Auto-bidding

Figure 3 for Sustainable Online Reinforcement Learning for Auto-bidding

Figure 4 for Sustainable Online Reinforcement Learning for Auto-bidding

Abstract:Recently, auto-bidding technique has become an essential tool to increase the revenue of advertisers. Facing the complex and ever-changing bidding environments in the real-world advertising system (RAS), state-of-the-art auto-bidding policies usually leverage reinforcement learning (RL) algorithms to generate real-time bids on behalf of the advertisers. Due to safety concerns, it was believed that the RL training process can only be carried out in an offline virtual advertising system (VAS) that is built based on the historical data generated in the RAS. In this paper, we argue that there exists significant gaps between the VAS and RAS, making the RL training process suffer from the problem of inconsistency between online and offline (IBOO). Firstly, we formally define the IBOO and systematically analyze its causes and influences. Then, to avoid the IBOO, we propose a sustainable online RL (SORL) framework that trains the auto-bidding policy by directly interacting with the RAS, instead of learning in the VAS. Specifically, based on our proof of the Lipschitz smooth property of the Q function, we design a safe and efficient online exploration (SER) policy for continuously collecting data from the RAS. Meanwhile, we derive the theoretical lower bound on the safety of the SER policy. We also develop a variance-suppressed conservative Q-learning (V-CQL) method to effectively and stably learn the auto-bidding policy with the collected data. Finally, extensive simulated and real-world experiments validate the superiority of our approach over the state-of-the-art auto-bidding algorithm.

* NeurIPS 2022

Via

Access Paper or Ask Questions

Hierarchically Constrained Adaptive Ad Exposure in Feeds

May 31, 2022

Dagui Chen, Qi Yan, Chunjie Chen, Zhenzhe Zheng, Yangsu Liu, Zhenjia Ma, Chuan Yu, Jian Xu, Bo Zheng

Figure 1 for Hierarchically Constrained Adaptive Ad Exposure in Feeds

Figure 2 for Hierarchically Constrained Adaptive Ad Exposure in Feeds

Figure 3 for Hierarchically Constrained Adaptive Ad Exposure in Feeds

Figure 4 for Hierarchically Constrained Adaptive Ad Exposure in Feeds

Abstract:A contemporary feed application usually provides blended results of organic items and sponsored items~(ads) to users. Conventionally, ads are exposed at fixed positions. Such a static exposure strategy is inefficient due to ignoring users' personalized preferences towards ads. To this end, adaptive ad exposure has become an appealing strategy to boost the overall performance of the feed. However, existing approaches to implementing the adaptive ad exposure still suffer from several limitations: 1) they usually fall into sub-optimal solutions because of only focusing on request-level optimization without consideration of the long-term application-level performance and constraints, 2) they neglect the necessity of keeping the game-theoretical properties of ad auctions, which may lead to anarchy in bidding, and 3) they can hardly be deployed in large-scale applications due to high computational complexity. In this paper, we focus on long-term performance optimization under hierarchical constraints in feeds and formulate the adaptive ad exposure as a Dynamic Knapsack Problem. We propose an effective approach: Hierarchically Constrained Adaptive Ad Exposure~(HCA2E). We present that HCA2E possesses desired game-theoretical properties, computational efficiency, and performance robustness. Comprehensive offline and online experiments on a leading e-commerce application demonstrate the significant performance superiority of HCA2E over representative baselines. HCA2E has also been deployed on this application to serve millions of daily users.

Via

Access Paper or Ask Questions

MBCT: Tree-Based Feature-Aware Binning for Individual Uncertainty Calibration

Feb 09, 2022

Siguang Huang, Yunli Wang, Lili Mou, Huayue Zhang, Han Zhu, Chuan Yu, Bo Zheng

Figure 1 for MBCT: Tree-Based Feature-Aware Binning for Individual Uncertainty Calibration

Figure 2 for MBCT: Tree-Based Feature-Aware Binning for Individual Uncertainty Calibration

Figure 3 for MBCT: Tree-Based Feature-Aware Binning for Individual Uncertainty Calibration

Figure 4 for MBCT: Tree-Based Feature-Aware Binning for Individual Uncertainty Calibration

Abstract:Most machine learning classifiers only concern classification accuracy, while certain applications (such as medical diagnosis, meteorological forecasting, and computation advertising) require the model to predict the true probability, known as a calibrated estimate. In previous work, researchers have developed several calibration methods to post-process the outputs of a predictor to obtain calibrated values, such as binning and scaling methods. Compared with scaling, binning methods are shown to have distribution-free theoretical guarantees, which motivates us to prefer binning methods for calibration. However, we notice that existing binning methods have several drawbacks: (a) the binning scheme only considers the original prediction values, thus limiting the calibration performance; and (b) the binning approach is non-individual, mapping multiple samples in a bin to the same value, and thus is not suitable for order-sensitive applications. In this paper, we propose a feature-aware binning framework, called Multiple Boosting Calibration Trees (MBCT), along with a multi-view calibration loss to tackle the above issues. Our MBCT optimizes the binning scheme by the tree structures of features, and adopts a linear function in a tree node to achieve individual calibration. Our MBCT is non-monotonic, and has the potential to improve order accuracy, due to its learnable binning scheme and the individual calibration. We conduct comprehensive experiments on three datasets in different fields. Results show that our method outperforms all competing models in terms of both calibration error and order accuracy. We also conduct simulation experiments, justifying that the proposed multi-view calibration loss is a better metric in modeling calibration error.

* WWW 2022

Via

Access Paper or Ask Questions

A Cooperative-Competitive Multi-Agent Framework for Auto-bidding in Online Advertising

Jun 11, 2021

Chao Wen, Miao Xu, Zhilin Zhang, Zhenzhe Zheng, Yuhui Wang, Xiangyu Liu, Yu Rong, Dong Xie, Xiaoyang Tan, Chuan Yu(+4 more)

Figure 1 for A Cooperative-Competitive Multi-Agent Framework for Auto-bidding in Online Advertising

Figure 2 for A Cooperative-Competitive Multi-Agent Framework for Auto-bidding in Online Advertising

Figure 3 for A Cooperative-Competitive Multi-Agent Framework for Auto-bidding in Online Advertising

Figure 4 for A Cooperative-Competitive Multi-Agent Framework for Auto-bidding in Online Advertising

Abstract:In online advertising, auto-bidding has become an essential tool for advertisers to optimize their preferred ad performance metrics by simply expressing the high-level campaign objectives and constraints. Previous works consider the design of auto-bidding agents from the single-agent view without modeling the mutual influence between agents. In this paper, we instead consider this problem from the perspective of a distributed multi-agent system, and propose a general Multi-Agent reinforcement learning framework for Auto-Bidding, namely MAAB, to learn the auto-bidding strategies. First, we investigate the competition and cooperation relation among auto-bidding agents, and propose temperature-regularized credit assignment for establishing a mixed cooperative-competitive paradigm. By carefully making a competition and cooperation trade-off among the agents, we can reach an equilibrium state that guarantees not only individual advertiser's utility but also the system performance (social welfare). Second, due to the observed collusion behaviors of bidding low prices underlying the cooperation, we further propose bar agents to set a personalized bidding bar for each agent, and then to alleviate the degradation of revenue. Third, to deploy MAAB to the large-scale advertising system with millions of advertisers, we propose a mean-field approach. By grouping advertisers with the same objective as a mean auto-bidding agent, the interactions among advertisers are greatly simplified, making it practical to train MAAB efficiently. Extensive experiments on the offline industrial dataset and Alibaba advertising platform demonstrate that our approach outperforms several baseline methods in terms of social welfare and guarantees the ad platform's revenue.

* 10 pages

Via

Access Paper or Ask Questions

We Know What You Want: An Advertising Strategy Recommender System for Online Advertising

Jun 08, 2021

Liyi Guo, Junqi Jin, Haoqi Zhang, Zhenzhe Zheng, Zhiye Yang, Zhizhuang Xing, Fei Pan, Lvyin Niu, Fan Wu, Haiyang Xu(+3 more)

Figure 1 for We Know What You Want: An Advertising Strategy Recommender System for Online Advertising

Figure 2 for We Know What You Want: An Advertising Strategy Recommender System for Online Advertising

Figure 3 for We Know What You Want: An Advertising Strategy Recommender System for Online Advertising

Figure 4 for We Know What You Want: An Advertising Strategy Recommender System for Online Advertising

Abstract:Advertising expenditures have become the major source of revenue for e-commerce platforms. Providing good advertising experiences for advertisers through reducing their costs of trial and error for discovering the optimal advertising strategies is crucial for the long-term prosperity of online advertising. To achieve this goal, the advertising platform needs to identify the advertisers' marketing objectives, and then recommend the corresponding strategies to fulfill this objective. In this work, we first deploy a prototype of strategy recommender system on Taobao display advertising platform, recommending bid prices and targeted users to advertisers. We further augment this prototype system by directly revealing the advertising performance, and then infer the advertisers' marketing objectives through their adoptions of different recommending advertising performance. We use the techniques from context bandit to jointly learn the advertisers' marketing objectives and the recommending strategies. Online evaluations show that the designed advertising strategy recommender system can optimize the advertisers' advertising performance and increase the platform's revenue. Simulation experiments based on Taobao online bidding data show that the designed contextual bandit algorithm can effectively optimize the strategy adoption rate of advertisers.

* KDD 2021, Virtual Event, Singapore
* Accepted by KDD 2021

Via

Access Paper or Ask Questions

Neural Auction: End-to-End Learning of Auction Mechanisms for E-Commerce Advertising

Jun 07, 2021

Xiangyu Liu, Chuan Yu, Zhilin Zhang, Zhenzhe Zheng, Yu Rong, Hongtao Lv, Da Huo, Yiqing Wang, Dagui Chen, Jian Xu(+3 more)

Abstract:In e-commerce advertising, it is crucial to jointly consider various performance metrics, e.g., user experience, advertiser utility, and platform revenue. Traditional auction mechanisms, such as GSP and VCG auctions, can be suboptimal due to their fixed allocation rules to optimize a single performance metric (e.g., revenue or social welfare). Recently, data-driven auctions, learned directly from auction outcomes to optimize multiple performance metrics, have attracted increasing research interests. However, the procedure of auction mechanisms involves various discrete calculation operations, making it challenging to be compatible with continuous optimization pipelines in machine learning. In this paper, we design \underline{D}eep \underline{N}eural \underline{A}uctions (DNAs) to enable end-to-end auction learning by proposing a differentiable model to relax the discrete sorting operation, a key component in auctions. We optimize the performance metrics by developing deep models to efficiently extract contexts from auctions, providing rich features for auction design. We further integrate the game theoretical conditions within the model design, to guarantee the stability of the auctions. DNAs have been successfully deployed in the e-commerce advertising system at Taobao. Experimental evaluation results on both large-scale data set as well as online A/B test demonstrated that DNAs significantly outperformed other mechanisms widely adopted in industry.

* To appear in the Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2021

Via

Access Paper or Ask Questions

Computation Resource Allocation Solution in Recommender Systems

Mar 03, 2021

Xun Yang, Yunli Wang, Cheng Chen, Qing Tan, Chuan Yu, Jian Xu, Xiaoqiang Zhu

Figure 1 for Computation Resource Allocation Solution in Recommender Systems

Figure 2 for Computation Resource Allocation Solution in Recommender Systems

Figure 3 for Computation Resource Allocation Solution in Recommender Systems

Figure 4 for Computation Resource Allocation Solution in Recommender Systems

Abstract:Recommender systems rely heavily on increasing computation resources to improve their business goal. By deploying computation-intensive models and algorithms, these systems are able to inference user interests and exhibit certain ads or commodities from the candidate set to maximize their business goals. However, such systems are facing two challenges in achieving their goals. On the one hand, facing massive online requests, computation-intensive models and algorithms are pushing their computation resources to the limit. On the other hand, the response time of these systems is strictly limited to a short period, e.g. 300 milliseconds in our real system, which is also being exhausted by the increasingly complex models and algorithms. In this paper, we propose the computation resource allocation solution (CRAS) that maximizes the business goal with limited computation resources and response time. We comprehensively illustrate the problem and formulate such a problem as an optimization problem with multiple constraints, which could be broken down into independent sub-problems. To solve the sub-problems, we propose the revenue function to facilitate the theoretical analysis, and obtain the optimal computation resource allocation strategy. To address the applicability issues, we devise the feedback control system to help our strategy constantly adapt to the changing online environment. The effectiveness of our method is verified by extensive experiments based on the real dataset from Taobao.com. We also deploy our method in the display advertising system of Alibaba. The online results show that our computation resource allocation solution achieves significant business goal improvement without any increment of computation cost, which demonstrates the efficacy of our method in real industrial practice.

Via

Access Paper or Ask Questions

Optimizing Multiple Performance Metrics with Deep GSP Auctions for E-commerce Advertising

Jan 08, 2021

Zhilin Zhang, Xiangyu Liu, Zhenzhe Zheng, Chenrui Zhang, Miao Xu, Junwei Pan, Chuan Yu, Fan Wu, Jian Xu, Kun Gai

Figure 1 for Optimizing Multiple Performance Metrics with Deep GSP Auctions for E-commerce Advertising

Figure 2 for Optimizing Multiple Performance Metrics with Deep GSP Auctions for E-commerce Advertising

Figure 3 for Optimizing Multiple Performance Metrics with Deep GSP Auctions for E-commerce Advertising

Figure 4 for Optimizing Multiple Performance Metrics with Deep GSP Auctions for E-commerce Advertising

Abstract:In e-commerce advertising, the ad platform usually relies on auction mechanisms to optimize different performance metrics, such as user experience, advertiser utility, and platform revenue. However, most of the state-of-the-art auction mechanisms only focus on optimizing a single performance metric, e.g., either social welfare or revenue, and are not suitable for e-commerce advertising with various, dynamic, difficult to estimate, and even conflicting performance metrics. In this paper, we propose a new mechanism called Deep GSP auction, which leverages deep learning to design new rank score functions within the celebrated GSP auction framework. These new rank score functions are implemented via deep neural network models under the constraints of monotone allocation and smooth transition. The requirement of monotone allocation ensures Deep GSP auction nice game theoretical properties, while the requirement of smooth transition guarantees the advertiser utilities would not fluctuate too much when the auction mechanism switches among candidate mechanisms to achieve different optimization objectives. We deployed the proposed mechanisms in a leading e-commerce ad platform and conducted comprehensive experimental evaluations with both offline simulations and online A/B tests. The results demonstrated the effectiveness of the Deep GSP auction compared to the state-of-the-art auction mechanisms.

* To appear in the Proceedings of the 14th ACM International Conference on Web Search and Data Mining (WSDM), 2021

Via

Access Paper or Ask Questions

Learning to Infer User Hidden States for Online Sequential Advertising

Sep 03, 2020

Zhaoqing Peng, Junqi Jin, Lan Luo, Yaodong Yang, Rui Luo, Jun Wang, Weinan Zhang, Haiyang Xu, Miao Xu, Chuan Yu(+4 more)

Figure 1 for Learning to Infer User Hidden States for Online Sequential Advertising

Figure 2 for Learning to Infer User Hidden States for Online Sequential Advertising

Figure 3 for Learning to Infer User Hidden States for Online Sequential Advertising

Figure 4 for Learning to Infer User Hidden States for Online Sequential Advertising

Abstract:To drive purchase in online advertising, it is of the advertiser's great interest to optimize the sequential advertising strategy whose performance and interpretability are both important. The lack of interpretability in existing deep reinforcement learning methods makes it not easy to understand, diagnose and further optimize the strategy. In this paper, we propose our Deep Intents Sequential Advertising (DISA) method to address these issues. The key part of interpretability is to understand a consumer's purchase intent which is, however, unobservable (called hidden states). In this paper, we model this intention as a latent variable and formulate the problem as a Partially Observable Markov Decision Process (POMDP) where the underlying intents are inferred based on the observable behaviors. Large-scale industrial offline and online experiments demonstrate our method's superior performance over several baselines. The inferred hidden states are analyzed, and the results prove the rationality of our inference.

* to be published in CIKM 2020

Via

Access Paper or Ask Questions