Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Francis Y. Yan

AutoSpec: Automated Generation of Neural Network Specifications

Sep 17, 2024

Shuowei Jin, Francis Y. Yan, Cheng Tan, Anuj Kalia, Xenofon Foukas, Z. Morley Mao

Figure 1 for AutoSpec: Automated Generation of Neural Network Specifications

Figure 2 for AutoSpec: Automated Generation of Neural Network Specifications

Figure 3 for AutoSpec: Automated Generation of Neural Network Specifications

Figure 4 for AutoSpec: Automated Generation of Neural Network Specifications

Abstract:The increasing adoption of neural networks in learning-augmented systems highlights the importance of model safety and robustness, particularly in safety-critical domains. Despite progress in the formal verification of neural networks, current practices require users to manually define model specifications -- properties that dictate expected model behavior in various scenarios. This manual process, however, is prone to human error, limited in scope, and time-consuming. In this paper, we introduce AutoSpec, the first framework to automatically generate comprehensive and accurate specifications for neural networks in learning-augmented systems. We also propose the first set of metrics for assessing the accuracy and coverage of model specifications, establishing a benchmark for future comparisons. Our evaluation across four distinct applications shows that AutoSpec outperforms human-defined specifications as well as two baseline approaches introduced in this study.

Via

Access Paper or Ask Questions

LLM-ABR: Designing Adaptive Bitrate Algorithms via Large Language Models

Apr 02, 2024

Zhiyuan He, Aashish Gottipati, Lili Qiu, Francis Y. Yan, Xufang Luo, Kenuo Xu, Yuqing Yang

Abstract:We present LLM-ABR, the first system that utilizes the generative capabilities of large language models (LLMs) to autonomously design adaptive bitrate (ABR) algorithms tailored for diverse network characteristics. Operating within a reinforcement learning framework, LLM-ABR empowers LLMs to design key components such as states and neural network architectures. We evaluate LLM-ABR across diverse network settings, including broadband, satellite, 4G, and 5G. LLM-ABR consistently outperforms default ABR algorithms.

Via

Access Paper or Ask Questions

Autothrust: A Practical Framework for Harvesting CPUs from SLO-Targeted Microservices

Jan 06, 2023

Zibo Wang, Pinghe Li, Chieh-Jan Mike Liang, Feng Wu, Francis Y. Yan

Figure 1 for Autothrust: A Practical Framework for Harvesting CPUs from SLO-Targeted Microservices

Figure 2 for Autothrust: A Practical Framework for Harvesting CPUs from SLO-Targeted Microservices

Figure 3 for Autothrust: A Practical Framework for Harvesting CPUs from SLO-Targeted Microservices

Figure 4 for Autothrust: A Practical Framework for Harvesting CPUs from SLO-Targeted Microservices

Abstract:As the number of distributed services (or microservices) of cloud-native applications grows, resource management becomes a challenging task. These applications tend to be user-facing and latency-sensitive, and our goal is to continuously minimize the amount of CPU resources allocated while still satisfying the application latency SLO. Although previous efforts have proposed simple heuristics and sophisticated ML-based techniques, we believe that a practical resource manager should accurately scale CPU resources for diverse applications, with minimum human efforts and operation overheads. To this end, we ask: can we systematically break resource management down to subproblems solvable by practical policies? Based on the notion of CPU-throttle-based performance target, we decouple the mechanisms of SLO feedback and resource control, and implement a two-level framework -- Autothrust. It combines a lightweight learned controller at the global level, and agile per-microservice controllers at the local level. We evaluate Autothrust on three microservice applications, with both short-term and 21-day production workload traces. Empirical results show Autothrust's superior CPU core savings up to 26.21% over the best-performing baselines across applications, while maintaining the latency SLO.

Via

Access Paper or Ask Questions

Teal: Learning-Accelerated Optimization of Traffic Engineering

Oct 25, 2022

Zhiying Xu, Francis Y. Yan, Rachee Singh, Justin T. Chiu, Alexander M. Rush, Minlan Yu

Figure 1 for Teal: Learning-Accelerated Optimization of Traffic Engineering

Figure 2 for Teal: Learning-Accelerated Optimization of Traffic Engineering

Figure 3 for Teal: Learning-Accelerated Optimization of Traffic Engineering

Figure 4 for Teal: Learning-Accelerated Optimization of Traffic Engineering

Abstract:In the last decade, global cloud wide-area networks (WANs) have grown 10$\times$ in size due to the deployment of new network sites and datacenters, making it challenging for commercial optimization engines to solve the network traffic engineering (TE) problem within the temporal budget of a few minutes. In this work, we show that carefully designed deep learning models are key to accelerating the running time of intra-WAN TE systems for large deployments since deep learning is both massively parallel and it benefits from the wealth of historical traffic allocation data from production WANs. However, off-the-shelf deep learning methods fail to perform well on the TE task since they ignore the effects of network connectivity on flow allocations. They are also faced with a tractability challenge posed by the large problem scale of TE optimization. Moreover, neural networks do not have mechanisms to readily enforce hard constraints on model outputs (e.g., link capacity constraints). We tackle these challenges by designing a deep learning-based TE system -- Teal. First, Teal leverages graph neural networks (GNN) to faithfully capture connectivity and model network flows. Second, Teal devises a multi-agent reinforcement learning (RL) algorithm to process individual demands independently in parallel to lower the problem scale. Finally, Teal reduces link capacity violations and improves solution quality using the alternating direction method of multipliers (ADMM). We evaluate Teal on traffic matrices of a global commercial cloud provider and find that Teal computes near-optimal traffic allocations with a 59$\times$ speedup over state-of-the-art TE systems on a WAN topology of over 1,500 nodes.

Via

Access Paper or Ask Questions