

We present an algorithm, called the Offset Tree, for learning to make decisions in situations where the payoff of only one choice is observed, rather than all choices. The algorithm reduces this setting to binary classification, allowing one to reuse of any existing, fully supervised binary classification algorithm in this partial information setting. We show that the Offset Tree is an optimal reduction to binary classification. In particular, it has regret at most $(k-1)$ times the regret of the binary classifier it uses (where $k$ is the number of choices), and no reduction to binary classification can do better. This reduction is also computationally optimal, both at training and test time, requiring just $O(\log_2 k)$ work to train on an example or make a prediction. Experiments with the Offset Tree show that it generally performs better than several alternative approaches.