Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Markus Weimer

PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems

Oct 14, 2018

Yunseong Lee, Alberto Scolari, Byung-Gon Chun, Marco Domenico Santambrogio, Markus Weimer, Matteo Interlandi

Figure 1 for PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems

Figure 2 for PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems

Figure 3 for PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems

Figure 4 for PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems

Abstract:Machine Learning models are often composed of pipelines of transformations. While this design allows to efficiently execute single model components at training time, prediction serving has different requirements such as low latency, high throughput and graceful performance degradation under heavy load. Current prediction serving systems consider models as black boxes, whereby prediction-time-specific optimizations are ignored in favor of ease of deployment. In this paper, we present PRETZEL, a prediction serving system introducing a novel white box architecture enabling both end-to-end and multi-model optimizations. Using production-like model pipelines, our experiments show that PRETZEL is able to introduce performance improvements over different dimensions; compared to state-of-the-art approaches PRETZEL is on average able to reduce 99th percentile latency by 5.5x while reducing memory footprint by 25x, and increasing throughput by 4.7x.

* 16 pages, 14 figures, 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2018

Via

Access Paper or Ask Questions

Batch-Expansion Training: An Efficient Optimization Framework

Feb 23, 2018

Michał Dereziński, Dhruv Mahajan, S. Sathiya Keerthi, S. V. N. Vishwanathan, Markus Weimer

Figure 1 for Batch-Expansion Training: An Efficient Optimization Framework

Figure 2 for Batch-Expansion Training: An Efficient Optimization Framework

Figure 3 for Batch-Expansion Training: An Efficient Optimization Framework

Figure 4 for Batch-Expansion Training: An Efficient Optimization Framework

Abstract:We propose Batch-Expansion Training (BET), a framework for running a batch optimizer on a gradually expanding dataset. As opposed to stochastic approaches, batches do not need to be resampled i.i.d. at every iteration, thus making BET more resource efficient in a distributed setting, and when disk-access is constrained. Moreover, BET can be easily paired with most batch optimizers, does not require any parameter-tuning, and compares favorably to existing stochastic and batch methods. We show that when the batch size grows exponentially with the number of outer iterations, BET achieves optimal $O(1/\epsilon)$ data-access convergence rate for strongly convex objectives. Experiments in parallel and distributed settings show that BET performs better than standard batch and stochastic approaches.

Via

Access Paper or Ask Questions

Towards Geo-Distributed Machine Learning

Mar 30, 2016

Ignacio Cano, Markus Weimer, Dhruv Mahajan, Carlo Curino, Giovanni Matteo Fumarola

Figure 1 for Towards Geo-Distributed Machine Learning

Figure 2 for Towards Geo-Distributed Machine Learning

Abstract:Latency to end-users and regulatory requirements push large companies to build data centers all around the world. The resulting data is "born" geographically distributed. On the other hand, many machine learning applications require a global view of such data in order to achieve the best results. These types of applications form a new class of learning problems, which we call Geo-Distributed Machine Learning (GDML). Such applications need to cope with: 1) scarce and expensive cross-data center bandwidth, and 2) growing privacy concerns that are pushing for stricter data sovereignty regulations. Current solutions to learning from geo-distributed data sources revolve around the idea of first centralizing the data in one data center, and then training locally. As machine learning algorithms are communication-intensive, the cost of centralizing the data is thought to be offset by the lower cost of intra-data center communication during training. In this work, we show that the current centralized practice can be far from optimal, and propose a system for doing geo-distributed training. Furthermore, we argue that the geo-distributed approach is structurally more amenable to dealing with regulatory constraints, as raw data never leaves the source data center. Our empirical evaluation on three real datasets confirms the general validity of our approach, and shows that GDML is not only possible but also advisable in many scenarios.

Via

Access Paper or Ask Questions

Iterative MapReduce for Large Scale Machine Learning

Mar 13, 2013

Joshua Rosen, Neoklis Polyzotis, Vinayak Borkar, Yingyi Bu, Michael J. Carey, Markus Weimer, Tyson Condie, Raghu Ramakrishnan

Figure 1 for Iterative MapReduce for Large Scale Machine Learning

Figure 2 for Iterative MapReduce for Large Scale Machine Learning

Figure 3 for Iterative MapReduce for Large Scale Machine Learning

Figure 4 for Iterative MapReduce for Large Scale Machine Learning

Abstract:Large datasets ("Big Data") are becoming ubiquitous because the potential value in deriving insights from data, across a wide range of business and scientific applications, is increasingly recognized. In particular, machine learning - one of the foundational disciplines for data analysis, summarization and inference - on Big Data has become routine at most organizations that operate large clouds, usually based on systems such as Hadoop that support the MapReduce programming paradigm. It is now widely recognized that while MapReduce is highly scalable, it suffers from a critical weakness for machine learning: it does not support iteration. Consequently, one has to program around this limitation, leading to fragile, inefficient code. Further, reliance on the programmer is inherently flawed in a multi-tenanted cloud environment, since the programmer does not have visibility into the state of the system when his or her program executes. Prior work has sought to address this problem by either developing specialized systems aimed at stylized applications, or by augmenting MapReduce with ad hoc support for saving state across iterations (driven by an external loop). In this paper, we advocate support for looping as a first-class construct, and propose an extension of the MapReduce programming paradigm called {\em Iterative MapReduce}. We then develop an optimizer for a class of Iterative MapReduce programs that cover most machine learning techniques, provide theoretical justifications for the key optimization steps, and empirically demonstrate that system-optimized programs for significant machine learning tasks are competitive with state-of-the-art specialized solutions.

Via

Access Paper or Ask Questions

Scaling Datalog for Machine Learning on Big Data

Mar 02, 2012

Yingyi Bu, Vinayak Borkar, Michael J. Carey, Joshua Rosen, Neoklis Polyzotis, Tyson Condie, Markus Weimer, Raghu Ramakrishnan

Figure 1 for Scaling Datalog for Machine Learning on Big Data

Figure 2 for Scaling Datalog for Machine Learning on Big Data

Figure 3 for Scaling Datalog for Machine Learning on Big Data

Figure 4 for Scaling Datalog for Machine Learning on Big Data

Abstract:In this paper, we present the case for a declarative foundation for data-intensive machine learning systems. Instead of creating a new system for each specific flavor of machine learning task, or hardcoding new optimizations, we argue for the use of recursive queries to program a variety of machine learning systems. By taking this approach, database query optimization techniques can be utilized to identify effective execution plans, and the resulting runtime plans can be executed on a single unified data-parallel query processing engine. As a proof of concept, we consider two programming models--Pregel and Iterative Map-Reduce-Update---from the machine learning domain, and show how they can be captured in Datalog, tuned for a specific task, and then compiled into an optimized physical plan. Experiments performed on a large computing cluster with real data demonstrate that this declarative approach can provide very good performance while offering both increased generality and programming ease.

Via

Access Paper or Ask Questions