Many important multiple-objective decision problems can be cast as ranking under constraints and solved via a weighted bipartite matching linear program. Some of these optimization problems, such as personalized content recommendation, must be solved in real time and therefore have to meet strict time requirements so that consumers do not perceive latency. Classical linear programming is too computationally expensive for such settings. We propose a novel approach to scaling up ranking under constraints that replaces the weighted bipartite matching optimization with a prediction problem at the algorithm deployment stage. We show empirically that the proposed approximate solution to the ranking problem substantially reduces the required computing resources with little sacrifice in constraint compliance or achieved utility, allowing us to solve larger constrained ranking problems in real time, within the required 50 milliseconds, than previously reported.
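To make the matching view of constrained ranking concrete, the following is a minimal sketch, not the method proposed above. It assigns items to ranked positions so as to maximize position-weighted utility, subject to one illustrative constraint: at least a minimum number of "promoted" items must appear in the top k positions. The function name, the position weights, the promoted flag, and the exhaustive search (standing in for the linear program) are all assumptions made for illustration, and the search is only feasible for a handful of items.

```python
from itertools import permutations


def rank_with_constraint(utilities, position_weights, is_promoted, top_k, min_promoted):
    """Return the best feasible item order and its value by exhaustive search.

    Hypothetical helper: each permutation is one perfect matching between
    items and positions; the objective is sum(utility[item] * weight[position]).
    """
    n = len(utilities)
    best_order, best_value = None, float("-inf")
    for order in permutations(range(n)):
        # order[p] is the index of the item placed at position p.
        promoted_in_top = sum(1 for i in order[:top_k] if is_promoted[i])
        if promoted_in_top < min_promoted:
            continue  # violates the (assumed) exposure constraint
        value = sum(utilities[i] * position_weights[p] for p, i in enumerate(order))
        if value > best_value:
            best_order, best_value = order, value
    return best_order, best_value


# Toy instance with made-up numbers: item 2 is the only promoted item.
utilities = [0.9, 0.8, 0.3, 0.2]
position_weights = [1.0, 0.5, 0.25, 0.125]
is_promoted = [False, False, True, False]

order, value = rank_with_constraint(
    utilities, position_weights, is_promoted, top_k=2, min_promoted=1
)
print(order)  # (0, 2, 1, 3): item 2 is lifted into the top 2
```

Without the constraint, the items would simply be sorted by utility. The constraint forces a lower-utility item into the top positions, which is the trade-off between utility and constraint compliance that the matching formulation captures.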
The paper outlines a framework for autonomous control of a customer relationship management (CRM) system. First, it explores how a modified version of the widely accepted Recency-Frequency-Monetary Value (RFM) system of metrics can be used to define the state space of clients or donors. Second, it describes a procedure for determining the optimal direct marketing action, in discrete and continuous action spaces, for a given individual based on that individual's position in the state space. The procedure uses model-free Q-learning to train a deep neural network that relates a client's position in the state space to the rewards associated with possible marketing actions. The estimated value function over the client state space can be interpreted as customer lifetime value (CLV), and thus allows for quick plug-in estimation of CLV for a given client. Experimental results are presented, based on the KDD Cup 1998 mailing dataset of donation solicitations.
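The Q-learning step and the plug-in CLV estimate can be illustrated with a small sketch. It deliberately swaps the deep neural network for a lookup table so that the update rule is easy to follow. The two-action set, the bucketed RFM state tuples, the learning rate, the discount factor, and the reward value are all hypothetical and are not taken from the paper or the dataset.

```python
from collections import defaultdict

# Hypothetical discrete action set for a direct-mail campaign.
ACTIONS = ("no_mail", "mail")


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one model-free Q-learning update to the table q in place."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


def best_action(q, state):
    """Greedy marketing action for a client in the given state."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


def plug_in_clv(q, state):
    """Estimated state value max_a Q(s, a), read here as a CLV estimate."""
    return max(q[(state, a)] for a in ACTIONS)


# States are assumed (recency, frequency, monetary) bucket indices.
q = defaultdict(float)
state = (2, 1, 1)
next_state = (0, 2, 1)

# One observed transition: mailing led to a net reward of 10.0 (made-up value).
q_update(q, state, "mail", reward=10.0, next_state=next_state)

print(best_action(q, state))
print(plug_in_clv(q, state))
```

A table works only when the state space is coarsely bucketed. Using a function approximator such as the neural network described above lets the same update generalize across a continuous RFM state space.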