Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Wan Du

Towards VM Rescheduling Optimization Through Deep Reinforcement Learning

May 23, 2025

Xianzhong Ding, Yunkai Zhang, Binbin Chen, Donghao Ying, Tieying Zhang, Jianjun Chen, Lei Zhang, Alberto Cerpa, Wan Du

Abstract:Modern industry-scale data centers need to manage a large number of virtual machines (VMs). Due to the continual creation and release of VMs, many small resource fragments are scattered across physical machines (PMs). To handle these fragments, data centers periodically reschedule some VMs to alternative PMs, a practice commonly referred to as VM rescheduling. Despite the increasing importance of VM rescheduling as data centers grow in size, the problem remains understudied. We first show that, unlike most combinatorial optimization tasks, the inference time of VM rescheduling algorithms significantly influences their performance, due to dynamic VM state changes during this period. This causes existing methods to scale poorly. Therefore, we develop a reinforcement learning system for VM rescheduling, VM2RL, which incorporates a set of customized techniques, such as a two-stage framework that accommodates diverse constraints and workload conditions, a feature extraction module that captures relational information specific to rescheduling, as well as a risk-seeking evaluation enabling users to optimize the trade-off between latency and accuracy. We conduct extensive experiments with data from an industry-scale data center. Our results show that VM2RL can achieve a performance comparable to the optimal solution but with a running time of seconds. Code and datasets are open-sourced: https://github.com/zhykoties/VMR2L_eurosys, https://drive.google.com/drive/folders/1PfRo1cVwuhH30XhsE2Np3xqJn2GpX5qy.

* Twentieth European Conference on Computer Systems (EuroSys '25), Rotterdam, Netherlands, March 30-April 3 2025

Via

Access Paper or Ask Questions

Disentangling Uncertainties by Learning Compressed Data Representation

Mar 20, 2025

Zhiyu An, Zhibo Hou, Wan Du

Abstract:We study aleatoric and epistemic uncertainty estimation in a learned regressive system dynamics model. Disentangling aleatoric uncertainty (the inherent randomness of the system) from epistemic uncertainty (the lack of data) is crucial for downstream tasks such as risk-aware control and reinforcement learning, efficient exploration, and robust policy transfer. While existing approaches like Gaussian Processes, Bayesian networks, and model ensembles are widely adopted, they suffer from either high computational complexity or inaccurate uncertainty estimation. To address these limitations, we propose the Compressed Data Representation Model (CDRM), a framework that learns a neural network encoding of the data distribution and enables direct sampling from the output distribution. Our approach incorporates a novel inference procedure based on Langevin dynamics sampling, allowing CDRM to predict arbitrary output distributions rather than being constrained to a Gaussian prior. Theoretical analysis provides the conditions where CDRM achieves better memory and computational complexity compared to bin-based compression methods. Empirical evaluations show that CDRM demonstrates a superior capability to identify aleatoric and epistemic uncertainties separately, achieving AUROCs of 0.8876 and 0.9981 on a single test set containing a mixture of both uncertainties. Qualitative results further show that CDRM's capability extends to datasets with multimodal output distributions, a challenging scenario where existing methods consistently fail. Code and supplementary materials are available at https://github.com/ryeii/CDRM.

* Accepted by the 7th Annual Learning for Dynamics & Control Conference (L4DC) 2025

Via

Access Paper or Ask Questions

GWRF: A Generalizable Wireless Radiance Field for Wireless Signal Propagation Modeling

Feb 08, 2025

Kang Yang, Yuning Chen, Wan Du

Figure 1 for GWRF: A Generalizable Wireless Radiance Field for Wireless Signal Propagation Modeling

Figure 2 for GWRF: A Generalizable Wireless Radiance Field for Wireless Signal Propagation Modeling

Figure 3 for GWRF: A Generalizable Wireless Radiance Field for Wireless Signal Propagation Modeling

Figure 4 for GWRF: A Generalizable Wireless Radiance Field for Wireless Signal Propagation Modeling

Abstract:We present Generalizable Wireless Radiance Fields (GWRF), a framework for modeling wireless signal propagation at arbitrary 3D transmitter and receiver positions. Unlike previous methods that adapt vanilla Neural Radiance Fields (NeRF) from the optical to the wireless signal domain, requiring extensive per-scene training, GWRF generalizes effectively across scenes. First, a geometry-aware Transformer encoder-based wireless scene representation module incorporates information from geographically proximate transmitters to learn a generalizable wireless radiance field. Second, a neural-driven ray tracing algorithm operates on this field to automatically compute signal reception at the receiver. Experimental results demonstrate that GWRF outperforms existing methods on single scenes and achieves state-of-the-art performance on unseen scenes.

Via

Access Paper or Ask Questions

See Behind Walls in Real-time Using Aerial Drones and Augmented Reality

Oct 17, 2024

Sikai Yang, Kang Yang, Yuning Chen, Fan Zhao, Wan Du

Figure 1 for See Behind Walls in Real-time Using Aerial Drones and Augmented Reality

Figure 2 for See Behind Walls in Real-time Using Aerial Drones and Augmented Reality

Figure 3 for See Behind Walls in Real-time Using Aerial Drones and Augmented Reality

Figure 4 for See Behind Walls in Real-time Using Aerial Drones and Augmented Reality

Abstract:This work presents ARD2, a framework that enables real-time through-wall surveillance using two aerial drones and an augmented reality (AR) device. ARD2 consists of two main steps: target direction estimation and contour reconstruction. In the first stage, ARD2 leverages geometric relationships between the drones, the user, and the target to project the target's direction onto the user's AR display. In the second stage, images from the drones are synthesized to reconstruct the target's contour, allowing the user to visualize the target behind walls. Experimental results demonstrate the system's accuracy in both direction estimation and contour reconstruction.

* 6 pages

Via

Access Paper or Ask Questions

Magnetic Distortion Resistant Orientation Estimation

Oct 16, 2024

Sikai Yang, Miaomiao Liu, Wan Du

Abstract:Inertial Measurement Unit (IMU) sensors, including accelerometers, gyroscopes, and magnetometers, are used to estimate the orientation of mobile devices. However, indoor magnetic fields are often distorted, causing the magnetometer's readings to deviate from true north and resulting in inaccurate orientation estimates. Existing solutions either ignore magnetic distortion or avoid using the magnetometer when distortion is detected. In this paper, we develop MDR, a Magnetic Distortion Resistant orientation estimation system that fundamentally models and corrects magnetic distortion. MDR builds a database to record magnetic directions at different locations and uses it to correct orientation estimates affected by magnetic distortion. To avoid the overhead of database preparation, MDR adopts practical designs to automatically update the database in parallel with orientation estimation. Experiments on 27+ hours of arm motion data show that MDR outperforms the state-of-the-art method by 35.34%.

* 14pages

Via

Access Paper or Ask Questions

Golden-Retriever: High-Fidelity Agentic Retrieval Augmented Generation for Industrial Knowledge Base

Jul 20, 2024

Zhiyu An, Xianzhong Ding, Yen-Chun Fu, Cheng-Chung Chu, Yan Li, Wan Du

Abstract:This paper introduces Golden-Retriever, designed to efficiently navigate vast industrial knowledge bases, overcoming challenges in traditional LLM fine-tuning and RAG frameworks with domain-specific jargon and context interpretation. Golden-Retriever incorporates a reflection-based question augmentation step before document retrieval, which involves identifying jargon, clarifying its meaning based on context, and augmenting the question accordingly. Specifically, our method extracts and lists all jargon and abbreviations in the input question, determines the context against a pre-defined list, and queries a jargon dictionary for extended definitions and descriptions. This comprehensive augmentation ensures the RAG framework retrieves the most relevant documents by providing clear context and resolving ambiguities, significantly improving retrieval accuracy. Evaluations using three open-source LLMs on a domain-specific question-answer dataset demonstrate Golden-Retriever's superior performance, providing a robust solution for efficiently integrating and querying industrial knowledge bases.

Via

Access Paper or Ask Questions

MARLP: Time-series Forecasting Control for Agricultural Managed Aquifer Recharge

Jul 01, 2024

Yuning Chen, Kang Yang, Zhiyu An, Brady Holder, Luke Paloutzian, Khaled Bali, Wan Du

Figure 1 for MARLP: Time-series Forecasting Control for Agricultural Managed Aquifer Recharge

Figure 2 for MARLP: Time-series Forecasting Control for Agricultural Managed Aquifer Recharge

Figure 3 for MARLP: Time-series Forecasting Control for Agricultural Managed Aquifer Recharge

Figure 4 for MARLP: Time-series Forecasting Control for Agricultural Managed Aquifer Recharge

Abstract:The rapid decline in groundwater around the world poses a significant challenge to sustainable agriculture. To address this issue, agricultural managed aquifer recharge (Ag-MAR) is proposed to recharge the aquifer by artificially flooding agricultural lands using surface water. Ag-MAR requires a carefully selected flooding schedule to avoid affecting the oxygen absorption of crop roots. However, current Ag-MAR scheduling does not take into account complex environmental factors such as weather and soil oxygen, resulting in crop damage and insufficient recharging amounts. This paper proposes MARLP, the first end-to-end data-driven control system for Ag-MAR. We first formulate Ag-MAR as an optimization problem. To that end, we analyze four-year in-field datasets, which reveal the multi-periodicity feature of the soil oxygen level trends and the opportunity to use external weather forecasts and flooding proposals as exogenous clues for soil oxygen prediction. Then, we design a two-stage forecasting framework. In the first stage, it extracts both the cross-variate dependency and the periodic patterns from historical data to conduct preliminary forecasting. In the second stage, it uses weather-soil and flooding-soil causality to facilitate an accurate prediction of soil oxygen levels. Finally, we conduct model predictive control (MPC) for Ag-MAR flooding. To address the challenge of large action spaces, we devise a heuristic planning module to reduce the number of flooding proposals to enable the search for optimal solutions. Real-world experiments show that MARLP reduces the oxygen deficit ratio by 86.8% while improving the recharging amount in unit time by 35.8%, compared with the previous four years.

* Accepted by KDD 2024

Via

Access Paper or Ask Questions

Go Beyond Black-box Policies: Rethinking the Design of Learning Agent for Interpretable and Verifiable HVAC Control

Feb 29, 2024

Zhiyu An, Xianzhong Ding, Wan Du

Abstract:Recent research has shown the potential of Model-based Reinforcement Learning (MBRL) to enhance energy efficiency of Heating, Ventilation, and Air Conditioning (HVAC) systems. However, existing methods rely on black-box thermal dynamics models and stochastic optimizers, lacking reliability guarantees and posing risks to occupant health. In this work, we overcome the reliability bottleneck by redesigning HVAC controllers using decision trees extracted from existing thermal dynamics models and historical data. Our decision tree-based policies are deterministic, verifiable, interpretable, and more energy-efficient than current MBRL methods. First, we introduce a novel verification criterion for RL agents in HVAC control based on domain knowledge. Second, we develop a policy extraction procedure that produces a verifiable decision tree policy. We found that the high dimensionality of the thermal dynamics model input hinders the efficiency of policy extraction. To tackle the dimensionality challenge, we leverage importance sampling conditioned on historical data distributions, significantly improving policy extraction efficiency. Lastly, we present an offline verification algorithm that guarantees the reliability of a control policy. Extensive experiments show that our method saves 68.4% more energy and increases human comfort gain by 14.8% compared to the state-of-the-art method, in addition to an 1127x reduction in computation overhead. Our code and data are available at https://github.com/ryeii/Veri_HVAC

* Accepted for the 61st Design Automation Conference (DAC)

Via

Access Paper or Ask Questions

Reward Bound for Behavioral Guarantee of Model-based Planning Agents

Feb 20, 2024

Zhiyu An, Xianzhong Ding, Wan Du

Abstract:Recent years have seen an emerging interest in the trustworthiness of machine learning-based agents in the wild, especially in robotics, to provide safety assurance for the industry. Obtaining behavioral guarantees for these agents remains an important problem. In this work, we focus on guaranteeing a model-based planning agent reaches a goal state within a specific future time step. We show that there exists a lower bound for the reward at the goal state, such that if the said reward is below that bound, it is impossible to obtain such a guarantee. By extension, we show how to enforce preferences over multiple goals.

* To be published in ICLR 24 tiny paper track

Via

Access Paper or Ask Questions

Optimizing Irrigation Efficiency using Deep Reinforcement Learning in the Field

Apr 04, 2023

Xianzhong Ding, Wan Du

Abstract:Agricultural irrigation is a significant contributor to freshwater consumption. However, the current irrigation systems used in the field are not efficient. They rely mainly on soil moisture sensors and the experience of growers, but do not account for future soil moisture loss. Predicting soil moisture loss is challenging because it is influenced by numerous factors, including soil texture, weather conditions, and plant characteristics. This paper proposes a solution to improve irrigation efficiency, which is called DRLIC. DRLIC is a sophisticated irrigation system that uses deep reinforcement learning (DRL) to optimize its performance. The system employs a neural network, known as the DRL control agent, which learns an optimal control policy that considers both the current soil moisture measurement and the future soil moisture loss. We introduce an irrigation reward function that enables our control agent to learn from previous experiences. However, there may be instances where the output of our DRL control agent is unsafe, such as irrigating too much or too little water. To avoid damaging the health of the plants, we implement a safety mechanism that employs a soil moisture predictor to estimate the performance of each action. If the predicted outcome is deemed unsafe, we perform a relatively-conservative action instead. To demonstrate the real-world application of our approach, we developed an irrigation system that comprises sprinklers, sensing and control nodes, and a wireless network. We evaluate the performance of DRLIC by deploying it in a testbed consisting of six almond trees. During a 15-day in-field experiment, we compared the water consumption of DRLIC with a widely-used irrigation scheme. Our results indicate that DRLIC outperformed the traditional irrigation method by achieving a water savings of up to 9.52%.

* 15 pages, 19 figures

Via

Access Paper or Ask Questions