Xiaoqiang Zhu

Rocket Launching: A Universal and Efficient Framework for Training Well-performing Light Net

Mar 15, 2018

Guorui Zhou, Ying Fan, Runpeng Cui, Weijie Bian, Xiaoqiang Zhu, Kun Gai

Abstract: Models applied to real-time response tasks, such as click-through rate (CTR) prediction, require high accuracy under strict response-time constraints. Top-performing deep models of high depth and complexity are therefore poorly suited to these applications, given the limits on inference time. To further improve neural network performance under these time and computational limits, we propose an approach that exploits a cumbersome net to help train a lightweight net for prediction. We dub the whole process rocket launching: the cumbersome booster net guides the learning of the target light net throughout the whole training process. We analyze different loss functions that push the light net to behave similarly to the booster net, and adopt the one with the best performance in our experiments. We use a technique called gradient block to further improve the performance of both the light net and the booster net. Experiments on benchmark datasets and real-life industrial advertisement data show that our light model can reach performance previously achievable only with more complex models.

* 10 pages, AAAI 2018
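
As a rough illustration of the training objective the Rocket Launching abstract describes, the sketch below evaluates a joint loss for a single example: one cross-entropy term per net plus a hint term that pulls the light net toward the booster net. The abstract does not say which hint loss was adopted, so the squared distance between logits, the function names, and the `hint_weight` parameter are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(label, logits):
    """Negative log-probability of the true class index `label`."""
    return -math.log(softmax(logits)[label])


def hint_loss(light_logits, booster_logits):
    """Assumed hint term: squared distance between the two nets' logits.

    One plausible reading of "gradient block" is that, in an autodiff
    framework, booster_logits would be treated as a constant here so the
    hint term only updates the light net. This sketch evaluates the value
    only and computes no gradients.
    """
    return sum((l - z) ** 2 for l, z in zip(light_logits, booster_logits))


def rocket_objective(label, light_logits, booster_logits, hint_weight=1.0):
    """Joint loss: both nets fit the label; the light net also mimics the booster."""
    return (
        cross_entropy(label, light_logits)
        + cross_entropy(label, booster_logits)
        + hint_weight * hint_loss(light_logits, booster_logits)
    )


if __name__ == "__main__":
    # Hypothetical logits for a 2-class (click / no click) example.
    print(rocket_objective(1, [0.2, 0.9], [0.1, 1.5], hint_weight=0.5))
```

Only the light net would be kept for online prediction in such a setup; the booster terms exist purely to shape training.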

Learning Piece-wise Linear Models from Large Scale Data for Ad Click Prediction

Apr 18, 2017

Kun Gai, Xiaoqiang Zhu, Han Li, Kai Liu, Zhe Wang

Abstract: CTR prediction in real-world business is a difficult machine learning problem involving large-scale, nonlinear, sparse data. In this paper, we introduce an industrial-strength solution, a model named Large Scale Piece-wise Linear Model (LS-PLM). We formulate the learning problem with $L_1$ and $L_{2,1}$ regularizers, leading to a non-convex and non-smooth optimization problem. We then propose a novel algorithm to solve it efficiently, based on directional derivatives and a quasi-Newton method. In addition, we design a distributed system that runs on hundreds of machines in parallel and provides industrial scalability. LS-PLM can capture nonlinear patterns from massive sparse data, saving us from heavy feature engineering. Since 2012, LS-PLM has been the main CTR prediction model in Alibaba's online display advertising system, serving hundreds of millions of users every day.
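
To make the piece-wise linear idea in the LS-PLM abstract more concrete, here is a minimal sketch of one common form of such a model: a softmax gate softly assigns the input to regions, and each region has its own logistic predictor. The abstract does not spell out the model equation, so this mixture form, the function names, and the choice to group the $L_{2,1}$ penalty by feature are assumptions for illustration; the paper's optimizer and distributed system are not shown.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(t):
    """Logistic function, written to avoid overflow for large |t|."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def lsplm_predict(x, gate_weights, fit_weights):
    """Assumed form: sum_i softmax_i(u_i . x) * sigmoid(w_i . x).

    gate_weights: one vector u_i per region (decides region membership).
    fit_weights:  one vector w_i per region (linear predictor in that region).
    """
    scores = [dot(u, x) for u in gate_weights]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return sum((e / total) * sigmoid(dot(w, x)) for e, w in zip(exps, fit_weights))


def l1_penalty(vectors):
    """Sum of absolute values of all parameters."""
    return sum(abs(v) for vec in vectors for v in vec)


def l21_penalty(vectors):
    """Sum over features of the L2 norm of that feature's parameters.

    Grouping by feature (assumed here) pushes all parameters of an
    unhelpful feature to zero together.
    """
    n_features = len(vectors[0])
    return sum(
        math.sqrt(sum(vec[j] ** 2 for vec in vectors)) for j in range(n_features)
    )


if __name__ == "__main__":
    # Hypothetical 2-region model over 3 sparse features.
    gates = [[0.5, 0.0, -0.2], [-0.5, 0.0, 0.2]]
    fits = [[1.0, 0.0, 0.3], [-0.8, 0.0, 0.1]]
    x = [1.0, 0.0, 1.0]
    print(lsplm_predict(x, gates, fits))
    print(l1_penalty(gates + fits), l21_penalty(gates + fits))
```

With a single region the gate is constant and the model reduces to ordinary logistic regression, which is one way to see why more regions let it capture nonlinear patterns.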
