Model predictive control (MPC) has been shown to significantly improve the energy efficiency of buildings while maintaining thermal comfort. Data-driven approaches based on neural networks have been proposed to facilitate system modelling. However, such approaches are generally nonconvex and result in computationally intractable optimization problems. In this work, we design a readily implementable energy management method for small commercial buildings. We then leverage our approach to formulate a real-time demand bidding strategy. We propose a data-driven and mixed-integer convex MPC which is solved via derivative-free optimization given a limited computational time of 5 minutes to respect operational constraints. We consider rooftop unit heating, ventilation, and air conditioning systems with discrete controls to accurately model the operation of most commercial buildings. Our approach uses an input convex recurrent neural network to model the thermal dynamics. We apply our approach in several demand response (DR) settings, including a demand bidding, a time-of-use, and a critical peak rebate program. Controller performance is evaluated on a state-of-the-art building simulation. The proposed approach improves thermal comfort while reducing energy consumption and cost through DR participation, when compared to other data-driven approaches or a set-point controller.
Thousands of satellites, asteroids, and rocket bodies break, collide, or degrade, resulting in large amounts of space debris in low Earth orbit. The presence of space debris poses a serious threat to satellite mega-constellations and to future space missions. Debris can be avoided if detected within the safety range of a satellite. In this paper, an integrated sensing and communication technique is proposed to detect space debris for satellite mega-constellations. The canonical polyadic (CP) tensor decomposition method is used to estimate the rank of the tensor, which denotes the number of paths, including line-of-sight and non-line-of-sight, by exploiting the sparsity of the THz channel with limited scattering. The analysis reveals that the reflected THz signals can be utilized for the detection of space debris. The CP decomposition is cast as an optimization problem and solved using the alternating least squares (ALS) algorithm. Simulation results show that the probability of detection of the proposed tensor-based scheme is higher than that of the conventional energy-based detection scheme for space debris detection.
High Throughput Satellites (HTSs) outpace traditional satellites due to their multi-beam transmission. The rise of low Earth orbit mega-constellations amplifies HTS data rate demands to terabits/second with acceptable latency. This surge in data rate necessitates multiple modems, often exceeding single device capabilities. Consequently, satellites employ several processors, forming a complex packet-switched network. This can lead to potential internal congestion and challenges in adhering to strict quality of service (QoS) constraints. While significant research exists on constellation-level routing, a literature gap remains on the internal routing within a single HTS. The intricacy of this internal network architecture makes achieving high data rates a significant challenge. This paper introduces an online optimal flow allocation and scheduling method for HTSs. The problem is treated as a multi-commodity flow instance with different priority data streams. An initial full time horizon model is proposed as a benchmark. We apply a model predictive control (MPC) approach to enable adaptive routing based on current information and the forecast within the prediction time horizon while allowing for deviation of the latter. Importantly, MPC is inherently suited to handle uncertainty in incoming flows. Our approach minimizes packet loss by optimally and adaptively managing the priority queue schedulers and flow exchanges between satellite processing modules. Central to our method is a routing model focusing on optimal priority scheduling to enhance data rates and maintain QoS. The model's stages are critically evaluated, and results are compared to traditional methods via numerical simulations. Through simulations, our method demonstrates performance nearly on par with the hindsight optimum, showcasing its efficiency and adaptability in addressing satellite communication challenges.
High throughput satellites (HTS), with their digital payload technology, are expected to play a key role as enablers of the upcoming 6G networks. HTS are mainly designed to provide higher data rates and capacities. Fueled by technological advancements including beamforming, advanced modulation techniques, reconfigurable phased array technologies, and electronically steerable antennas, HTS have emerged as a fundamental component for future network generations. This paper offers a comprehensive review of the state of the art of HTS systems, with a focus on standardization, patents, channel multiple access techniques, routing, load balancing, and the role of software-defined networking (SDN). In addition, we provide a vision for next-generation satellite systems, which we name extremely-HTS (EHTS), toward autonomous satellites, supported by the main requirements and key technologies expected for these systems. The EHTS system will be designed such that it maximizes spectrum reuse and data rates, and flexibly steers the capacity to satisfy user demand. We introduce a novel architecture for future regenerative payloads while summarizing the challenges imposed by this architecture.
We propose new algorithms with provable performance for online binary optimization subject to general constraints and in dynamic settings. We consider the subset of problems in which the objective function is submodular. We propose the online submodular greedy algorithm (OSGA) which solves to optimality an approximation of the previous round's loss function to avoid the NP-hardness of the original problem. We extend OSGA to a generic approximation function. We show that OSGA has a dynamic regret bound similar to the tightest bounds in online convex optimization. For instances where no approximation exists or a computationally simpler implementation is desired, we design the online submodular projected gradient descent (OSPGD) by leveraging the Lovász extension. We obtain a regret bound that is akin to that of conventional online gradient descent (OGD). Finally, we numerically test our algorithms in two power system applications: fast-timescale demand response and real-time distribution network reconfiguration.
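As an illustration of the Lovász extension mentioned above, the following minimal Python sketch evaluates the extension of a normalized set function (f(∅) = 0) at a point in [0,1]^n, together with one subgradient, and then takes a single projected subgradient step onto the unit box. It is not the OSPGD algorithm of the abstract; the function names (`lovasz_extension`, `projected_subgradient_step`), the example set function `sqrt_card`, and the step size are assumptions chosen for illustration only.

```python
import math


def lovasz_extension(f, x):
    """Return (value, subgradient) of the Lovász extension of f at x.

    f maps a frozenset of indices to a float and is assumed to satisfy
    f(frozenset()) == 0. x is a list of floats in [0, 1].
    """
    # Visit coordinates in decreasing order of x.
    order = sorted(range(len(x)), key=lambda i: x[i], reverse=True)
    value = 0.0
    subgrad = [0.0] * len(x)
    chosen = set()
    prev = f(frozenset())
    for i in order:
        chosen.add(i)
        cur = f(frozenset(chosen))
        subgrad[i] = cur - prev      # marginal gain of adding element i
        value += x[i] * subgrad[i]
        prev = cur
    return value, subgrad


def projected_subgradient_step(f, x, eta):
    """One subgradient step on the extension, projected onto [0, 1]^n."""
    _, g = lovasz_extension(f, x)
    return [min(1.0, max(0.0, xi - eta * gi)) for xi, gi in zip(x, g)]


def sqrt_card(s):
    """Hypothetical submodular example: concave function of the cardinality."""
    return math.sqrt(len(s))


value_at_vertex, _ = lovasz_extension(sqrt_card, [1.0, 0.0, 1.0])
value_at_center, _ = lovasz_extension(sqrt_card, [0.5, 0.5, 0.5])
next_point = projected_subgradient_step(sqrt_card, [0.5, 0.5, 0.5], eta=0.1)
```

At a binary point the extension coincides with the set function itself, and for a submodular f the extension is convex, which is what makes gradient-type updates on the relaxed problem meaningful.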
To integrate high amounts of renewable energy resources, electrical power grids must be able to cope with high amplitude, fast timescale variations in power generation. Frequency regulation through demand response has the potential to coordinate temporally flexible loads, such as air conditioners, to counteract these variations. Existing approaches for discrete control with dynamic constraints struggle to provide satisfactory performance for fast timescale action selection with hundreds of agents. We propose a decentralized agent trained with multi-agent proximal policy optimization with localized communication. We explore two communication frameworks: hand-engineered, or learned through targeted multi-agent communication. The resulting policies perform well and robustly for frequency regulation, and scale seamlessly to arbitrary numbers of houses for constant processing times.
We formulate an efficient approximation for multi-agent batch reinforcement learning, the approximate multi-agent fitted Q iteration (AMAFQI). We present a detailed derivation of our approach. We propose an iterative policy search and show that it yields a greedy policy with respect to multiple approximations of the centralized, standard Q-function. In each iteration and policy evaluation, AMAFQI requires a number of computations that scales linearly with the number of agents, whereas the analogous number of computations increases exponentially for the fitted Q iteration (FQI), one of the most commonly used approaches in batch reinforcement learning. This property of AMAFQI is fundamental for the design of a tractable multi-agent approach. We evaluate the performance of AMAFQI and compare it to FQI in numerical simulations. Numerical examples illustrate the significant computation time reduction when using AMAFQI instead of FQI in multi-agent problems and corroborate the similar decision-making performance of both approaches.
We consider online optimization with binary decision variables and convex loss functions. We design a new algorithm, binary online gradient descent (bOGD), and bound its expected dynamic regret. The bound is sublinear in time and linear in the cumulative variation of the relaxed, continuous round optima. We apply bOGD to demand response with thermostatically controlled loads, in which binary constraints model discrete on/off settings. We also model uncertainty and varying load availability, which depend on temperature deadbands, lock-out of cooling units and manual overrides. We test the performance of bOGD in several simulations based on demand response.
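The general idea of pairing a gradient step on a relaxed problem with binary decisions can be sketched as follows. This is a hedged illustration and not the bOGD algorithm of the abstract: it assumes a projected online gradient step on the box [0,1]^n followed by independent randomized rounding, and the tracking loss, load powers, signal, and step size are all hypothetical.

```python
import random


def project_box(x):
    """Project a point onto the unit box [0, 1]^n."""
    return [min(1.0, max(0.0, xi)) for xi in x]


def ogd_step(x, grad, eta):
    """One projected online gradient descent step on the relaxed variable."""
    return project_box([xi - eta * gi for xi, gi in zip(x, grad)])


def randomized_round(x, rng):
    """Sample a binary decision whose expectation equals the relaxed x."""
    return [1 if rng.random() < xi else 0 for xi in x]


def tracking_gradient(x, powers, target):
    """Gradient of the assumed loss (sum_i p_i x_i - target)^2."""
    error = sum(p * xi for p, xi in zip(powers, x)) - target
    return [2.0 * error * p for p in powers]


def run_demo(seed=0):
    rng = random.Random(seed)
    powers = [1.0, 2.0, 1.5]            # hypothetical load powers (kW)
    signal = [2.0, 3.0, 1.0, 4.0]       # hypothetical regulation targets
    x = [0.5, 0.5, 0.5]                 # relaxed on/off state of each load
    decisions = []
    for target in signal:
        decisions.append(randomized_round(x, rng))   # play a binary decision
        grad = tracking_gradient(x, powers, target)  # observe the round's loss
        x = ogd_step(x, grad, eta=0.05)              # update the relaxed state
    return decisions, x


decisions, final_x = run_demo()
```

Rounding each coordinate independently keeps the expected binary decision equal to the relaxed iterate, which is one simple way to relate the performance of the binary decisions to that of the continuous update.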
We extend the regret analysis of the online distributed weighted dual averaging (DWDA) algorithm [1] to the dynamic setting and provide the tightest dynamic regret bound known to date for a distributed online convex optimization (OCO) algorithm. Our bound is linear in the cumulative difference between consecutive optima and does not depend explicitly on the time horizon. We use dynamic-online DWDA (D-ODWDA) and formulate a performance-guaranteed distributed online demand response approach for heating, ventilation, and air-conditioning (HVAC) systems of commercial buildings. We show the performance of our approach for fast timescale demand response in numerical simulations and obtain demand response decisions that closely reproduce the centralized optimal ones.
We incorporate future information in the form of the estimated value of future gradients in online convex optimization. This is motivated by demand response in power systems, where forecasts about the current round, e.g., the weather or the loads' behavior, can be used to improve on predictions made with only past observations. Specifically, we introduce an additional predictive step that follows the standard online convex optimization step when certain conditions on the estimated gradient and descent direction are met. We show that under these conditions and without any assumptions on the predictability of the environment, the predictive update strictly improves on the performance of the standard update. We give two types of predictive updates for various families of loss functions. We provide a regret bound for each of our predictive online convex optimization algorithms. Finally, we apply our framework to an example based on demand response, which demonstrates its superior performance compared to a standard online convex optimization algorithm.