"Time": models, code, and papers

Federated Multi-Agent Deep Reinforcement Learning Approach via Physics-Informed Reward for Multi-Microgrid Energy Management

Dec 29, 2022
Yuanzheng Li, Shangyang He, Yang Li, Yang Shi, Zhigang Zeng

The utilization of large-scale distributed renewable energy promotes the development of the multi-microgrid (MMG), which raises the need for an effective energy management method that minimizes economic costs and maintains energy self-sufficiency. Multi-agent deep reinforcement learning (MADRL) has been widely used for the energy management problem because of its real-time scheduling ability. However, its training requires massive energy operation data from microgrids (MGs), and gathering these data from different MGs would threaten their privacy and data security. Therefore, this paper tackles this practical yet challenging issue by proposing a federated multi-agent deep reinforcement learning (F-MADRL) algorithm via the physics-informed reward. In this algorithm, the federated learning (FL) mechanism is introduced to train the F-MADRL algorithm, thus ensuring the privacy and security of data. In addition, a decentralized MMG model is built, and the energy of each participating MG is managed by an agent, which aims to minimize economic costs and maintain energy self-sufficiency according to the physics-informed reward. First, MGs individually execute self-training based on local energy operation data to train their local agent models. Then, these local models are periodically uploaded to a server and their parameters are aggregated to build a global agent, which is broadcast to the MGs and replaces their local agents. In this way, the experience of each MG agent can be shared while the energy operation data is not explicitly transmitted, thus protecting privacy and ensuring data security. Finally, experiments are conducted on the Oak Ridge National Laboratory distributed energy control communication lab microgrid (ORNL-MG) test system, and comparisons are carried out to verify the effectiveness of introducing the FL mechanism and the superior performance of the proposed F-MADRL.

* Accepted by IEEE Transactions on Neural Networks and Learning Systems
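
The aggregation step described in the abstract above (upload local agent parameters, aggregate them on a server, broadcast the global agent back) can be sketched as follows. This is a minimal illustration that assumes plain unweighted averaging in the style of FedAvg; the abstract does not state the aggregation rule, and `federated_average`, `broadcast`, and the dictionary layout of the parameters are hypothetical names chosen for this example.

```python
from typing import Dict, List

# Hypothetical parameter layout: layer name -> flat list of weights.
Params = Dict[str, List[float]]


def federated_average(local_params: List[Params]) -> Params:
    """Element-wise mean of the parameters of all local agents.

    Assumption: unweighted averaging. The paper may weight or filter
    the local models differently.
    """
    if not local_params:
        raise ValueError("need at least one local model")
    n = len(local_params)
    return {
        name: [sum(vals) / n for vals in zip(*(p[name] for p in local_params))]
        for name in local_params[0]
    }


def broadcast(global_params: Params, n_agents: int) -> List[Params]:
    """Give every microgrid agent its own copy of the global parameters."""
    return [
        {name: list(values) for name, values in global_params.items()}
        for _ in range(n_agents)
    ]


# Two microgrid agents trained locally; only parameters leave each site.
local_models = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [2.0]},
]
global_model = federated_average(local_models)
local_models = broadcast(global_model, n_agents=2)
```

Only model parameters cross the network in this sketch, which is the property the abstract relies on for keeping the energy operation data local.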

Trajectory Adaptive Prediction for Moving Objects in Uncertain Environment

Dec 13, 2022
Hu Jin

Existing trajectory prediction methods struggle to accurately describe the trajectories of moving objects in complex and uncertain environments. To solve this problem, this paper proposes an adaptive trajectory prediction method for moving objects in dynamic environments (ESATP), based on a variational Gaussian mixture model (VGMM). First, building on the traditional Gaussian mixture model, we use approximate variational Bayesian inference to process the Gaussian mixture distribution during model training. Second, variational Bayesian expectation maximization iterations are used to learn the model parameters, and prior information is used to obtain a more precise prediction model. Finally, for the input trajectories, a parameter adaptive selection algorithm automatically adjusts the combination of parameters. Experimental results show that ESATP achieves high predictive accuracy while maintaining high time efficiency. The model can be used in mobile vehicle positioning products.

ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency

Dec 02, 2022
Chuming Li, Jie Liu, Yinmin Zhang, Yuhong Wei, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang

Multi-agent reinforcement learning (MARL) suffers from the non-stationarity problem: the learning targets change at every iteration when multiple agents update their policies at the same time. Starting from first principles, in this paper we solve the non-stationarity problem by proposing bidirectional action-dependent Q-learning (ACE). Central to the development of ACE is the sequential decision-making process wherein only one agent is allowed to take action at one time. Within this process, each agent maximizes its value function given the actions taken by the preceding agents at the inference stage. In the learning phase, each agent minimizes the TD error that is dependent on how the subsequent agents have reacted to its chosen action. Given the design of bidirectional dependency, ACE effectively turns a multi-agent MDP into a single-agent MDP. We implement the ACE framework by identifying the proper network representation to formulate the action dependency, so that the sequential decision process is computed implicitly in one forward pass. To validate ACE, we compare it with strong baselines on two MARL benchmarks. Empirical experiments demonstrate that ACE outperforms the state-of-the-art algorithms on Google Research Football and StarCraft Multi-Agent Challenge by a large margin. In particular, on SMAC tasks, ACE achieves a 100% success rate on almost all the hard and super-hard maps. We further study extensive research problems regarding ACE, including extension, generalization, and practicability. Code is made available to facilitate further research.

* Accepted by the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI 2023)
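
The inference-time rule in the abstract above (agents act one at a time, each maximizing its value given the actions already taken by the preceding agents) can be illustrated with a small sketch. It explicitly loops over agents for clarity, whereas the abstract says ACE computes the sequence implicitly in one forward pass. The function `sequential_greedy` and the toy value functions are hypothetical and are not taken from the released code.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical value function: q(actions_of_preceding_agents, own_action) -> value.
QFn = Callable[[Tuple[int, ...], int], float]


def sequential_greedy(q_fns: Sequence[QFn], n_actions: int) -> List[int]:
    """Pick actions agent by agent, conditioning on earlier choices."""
    chosen: List[int] = []
    for q in q_fns:
        prefix = tuple(chosen)  # actions already fixed by preceding agents
        best = max(range(n_actions), key=lambda a: q(prefix, a))
        chosen.append(best)
    return chosen


# Toy example: agent 0 prefers action 1; agent 1 is rewarded for matching agent 0.
def q_agent0(prefix: Tuple[int, ...], action: int) -> float:
    return float(action)


def q_agent1(prefix: Tuple[int, ...], action: int) -> float:
    return 1.0 if action == prefix[0] else 0.0


joint_action = sequential_greedy([q_agent0, q_agent1], n_actions=2)
```

Because each agent sees a fixed prefix of actions when it decides, its decision problem looks like a single-agent one, which is the intuition behind the reduction the abstract describes.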

Randomized Conditional Flow Matching for Video Prediction

Nov 26, 2022
Aram Davtyan, Sepehr Sameni, Paolo Favaro

We introduce a novel generative model for video prediction based on latent flow matching, an efficient alternative to diffusion-based models. In contrast to prior work that either incurs a high training cost by modeling the past through a memory state, as in recurrent neural networks, or limits the computational load by conditioning only on a predefined window of past frames, we efficiently and effectively take the past into account by conditioning at inference time only on a small random set of past frames at each integration step of the learned flow. Moreover, to enable the generation of high-resolution videos and speed up the training, we work in the latent space of a pretrained VQGAN. Furthermore, we propose to approximate the initial condition of the flow ODE with the previous noisy frame. This allows us to reduce the number of integration steps and hence speed up sampling at inference time. We call our model Random frame conditional flow Integration for VidEo pRediction, or, in short, RIVER. We show that RIVER achieves performance superior to or on par with prior work on common video prediction benchmarks.
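
The sampling idea in the abstract above, drawing a fresh random past frame as conditioning at every integration step of the flow ODE, can be sketched with a plain Euler integrator. This is a toy illustration under strong assumptions: frames are scalars rather than VQGAN latents, `velocity` stands in for the learned network, and one frame is drawn per step. The names `integrate_flow` and `velocity` are hypothetical, and the initialization from the previous noisy frame is not shown.

```python
import random
from typing import Callable, Sequence

# Hypothetical signature of the learned vector field:
# velocity(x, t, conditioning_frame) -> dx/dt
Velocity = Callable[[float, float, float], float]


def integrate_flow(
    velocity: Velocity,
    x0: float,
    past_frames: Sequence[float],
    n_steps: int,
    rng: random.Random,
) -> float:
    """Euler integration from t=0 to t=1 with random per-step conditioning."""
    if not past_frames:
        raise ValueError("need at least one past frame")
    x = x0
    dt = 1.0 / n_steps
    for step in range(n_steps):
        t = step * dt
        cond = rng.choice(past_frames)  # new random past frame at every step
        x += dt * velocity(x, t, cond)
    return x


# Toy vector field that simply moves at a speed given by the conditioning frame.
rng = random.Random(0)
next_frame = integrate_flow(lambda x, t, c: c, 0.0, [2.0, 2.0, 2.0], n_steps=4, rng=rng)
```

The cost per step stays constant regardless of how long the past is, since only the sampled frame is passed to the vector field.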

A Geometric Method for Improved Uncertainty Estimation in Real-time

Jun 23, 2022
Gabriella Chouraqui, Liron Cohen, Gil Einziger, Liel Leman

Machine learning classifiers are probabilistic in nature, and thus inevitably involve uncertainty. Predicting the probability that the classification of a specific input is correct is called uncertainty (or confidence) estimation and is crucial for risk management. Post-hoc model calibration can improve a model's uncertainty estimates without retraining and without changing the model. Our work puts forward a geometry-based approach for uncertainty estimation. Roughly speaking, we use the geometric distance of the current input from the existing training inputs as a signal for estimating uncertainty and then calibrate that signal (instead of the model's estimation) using standard post-hoc calibration techniques. We show that our method yields better uncertainty estimations than recently proposed approaches through extensive evaluation on multiple datasets and models. In addition, we demonstrate that our approach can be used in near real-time applications. Our code is available on GitHub at https://github.com/NoSleepDeveloper/Geometric-Calibrator.

* Conference on Uncertainty in Artificial Intelligence (UAI)
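
The two-stage idea in the abstract above (compute a geometric distance signal, then calibrate that signal post hoc) can be sketched as follows. This is a simplified stand-in and not the authors' method: it assumes Euclidean distance to the nearest training input as the signal and histogram binning as the calibrator, and the function names are hypothetical.

```python
import bisect
import math
from typing import List, Sequence


def nearest_distance(x: Sequence[float], train_inputs: Sequence[Sequence[float]]) -> float:
    """Distance from x to its closest training input (the raw signal)."""
    return min(math.dist(x, t) for t in train_inputs)


def fit_binned_calibrator(
    distances: Sequence[float], correct: Sequence[bool], edges: Sequence[float]
) -> List[float]:
    """Histogram binning: empirical accuracy per distance bin on held-out data."""
    n_bins = len(edges) + 1
    hits = [0] * n_bins
    counts = [0] * n_bins
    for d, c in zip(distances, correct):
        b = bisect.bisect_right(edges, d)
        hits[b] += int(c)
        counts[b] += 1
    overall = sum(hits) / max(sum(counts), 1)  # fallback for empty bins
    return [hits[b] / counts[b] if counts[b] else overall for b in range(n_bins)]


def confidence(d: float, edges: Sequence[float], bin_acc: Sequence[float]) -> float:
    """Calibrated confidence for a new input with distance signal d."""
    return bin_acc[bisect.bisect_right(edges, d)]


train = [(0.0, 0.0), (1.0, 0.0)]
signal = nearest_distance((0.0, 3.0), train)
bin_acc = fit_binned_calibrator([0.1, 0.2, 2.0, 3.0], [True, True, False, True], edges=[1.0])
estimate = confidence(signal, [1.0], bin_acc)
```

The classifier itself is never modified here; only the mapping from distance to confidence is fitted, which matches the post-hoc setting the abstract describes.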

Real Time Multi-Object Detection for Helmet Safety

May 19, 2022
Mrinal Mathur, Archana Benkkallpalli Chandrashekhar, Venkata Krishna Chaithanya Nuthalapati

The National Football League and Amazon Web Services teamed up to develop the best sports injury surveillance and mitigation program via a Kaggle competition. Through it, the NFL wants to assign a specific player to each helmet, which would help accurately identify each player's "exposures" throughout a football play. We aim to implement computer vision based ML algorithms capable of assigning detected helmet impacts to the correct players via tracking information. Our paper explains the approach to automatically track player helmets and their collisions. This will also allow the NFL to review previous plays and explore trends in exposure over time.

Robust, fast and accurate mapping of diffusional mean kurtosis

Nov 30, 2022
Megan E. Farquhar, Qianqian Yang, Viktor Vegh

Diffusion weighted magnetic resonance imaging produces data encoded with the random motion of water molecules in biological tissues. The collection and extraction of information from such data have become critical to modern imaging studies, and particularly those focusing on neuroimaging. A range of mathematical models are routinely applied to infer tissue microstructure properties. Diffusional kurtosis imaging entails a model for measuring the extent of non-Gaussian diffusion in biological tissues. The method has seen wide assimilation across a range of clinical applications, and promises to be an increasingly important tool for clinical diagnosis, treatment planning and monitoring. However, accurate and robust estimation of kurtosis from clinically feasible data acquisitions remains a challenge. We outline a fast and robust way of estimating mean kurtosis via the sub-diffusion mathematical framework. Our kurtosis mapping method is evaluated using simulations and the Connectome 1.0 human brain data. Results show that fitting the sub-diffusion model to multiple diffusion time data and then directly calculating the mean kurtosis greatly improves the quality of the estimation. Suggestions for diffusion encoding sampling, the number of diffusion times to be acquired and the separation between them are provided. Exquisite tissue contrast is achieved even when the diffusion encoded data is collected in only minutes. Our findings suggest robust estimation of mean kurtosis can be realised within a clinically feasible diffusion weighted magnetic resonance imaging data acquisition time.

Control Barrier Functionals: Safety-critical Control for Time Delay Systems

Jun 16, 2022
Adam K. Kiss, Tamas G. Molnar, Aaron D. Ames, Gabor Orosz

This work presents a theoretical framework for the safety-critical control of time delay systems. The theory of control barrier functions, which provides formal safety guarantees for delay-free systems, is extended to systems with state delay. The notion of control barrier functionals is introduced to attain formal safety guarantees by enforcing the forward invariance of safe sets defined in the infinite dimensional state space. The proposed framework is able to handle multiple delays and distributed delays both in the dynamics and in the safety condition, and provides an affine constraint on the control input that yields provable safety. This constraint can be incorporated into optimization problems to synthesize pointwise optimal and provably safe controllers. The applicability of the proposed method is demonstrated by numerical simulation examples.

* Submitted to the International Journal of Robust and Nonlinear Control (JRNC). 25 pages, 3 figures
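
For orientation, the delay-free construction that the abstract above says is being extended can be written out. The following is the standard control barrier function condition for a control-affine system, not the paper's functional version for delayed systems; the symbols $f$, $g$, $h$, $\alpha$, and $k_d$ are notation assumed here and do not appear in the abstract.

```latex
% Delay-free control-affine system and safe set
\dot{x} = f(x) + g(x)\,u, \qquad S = \{\, x : h(x) \ge 0 \,\}

% Control barrier function condition: an affine constraint on u
L_f h(x) + L_g h(x)\,u \;\ge\; -\alpha\bigl(h(x)\bigr)

% Pointwise optimal safe controller as a quadratic program,
% with k_d a desired (possibly unsafe) controller
k(x) = \arg\min_{u} \; \lVert u - k_d(x) \rVert^2
\quad \text{s.t.} \quad
L_f h(x) + L_g h(x)\,u \ge -\alpha\bigl(h(x)\bigr)
```

Here $\alpha$ is an extended class-$\mathcal{K}$ function. The constraint is affine in $u$, which is why it fits directly into a quadratic program; according to the abstract, the barrier functional framework yields a constraint with the same affine structure for systems with state delay.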

Interactive Concept Bottleneck Models

Dec 14, 2022
Kushal Chauhan, Rishabh Tiwari, Jan Freyberg, Pradeep Shenoy, Krishnamurthy Dvijotham

Concept bottleneck models (CBMs) (Koh et al. 2020) are interpretable neural networks that first predict labels for human-interpretable concepts relevant to the prediction task, and then predict the final label based on the concept label predictions. We extend CBMs to interactive prediction settings where the model can query a human collaborator for the labels of some concepts. We develop an interaction policy that, at prediction time, chooses which concepts to request a label for so as to maximally improve the final prediction. We demonstrate that a simple policy combining concept prediction uncertainty and influence of the concept on the final prediction achieves strong performance and outperforms a static approach proposed in Koh et al. (2020) as well as active feature acquisition methods proposed in the literature. We show that the interactive CBM can achieve accuracy gains of 5-10% with only 5 interactions over competitive baselines on the Caltech-UCSD Birds, CheXpert and OAI datasets.

* To appear at AAAI 2023
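
A policy of the kind described in the abstract above, combining concept prediction uncertainty with the concept's influence on the final prediction, might look like the sketch below. The abstract does not say how the two quantities are combined; this example assumes binary concepts, binary entropy as the uncertainty measure, the absolute weight of a linear label predictor as the influence, and their product as the score. All names are hypothetical.

```python
import math
from typing import List, Sequence


def binary_entropy(p: float) -> float:
    """Entropy in bits of a binary concept predicted with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def choose_concepts(probs: Sequence[float], influence: Sequence[float], k: int) -> List[int]:
    """Indices of the k concepts to ask the human about, best first.

    Score = uncertainty * |influence| (an assumed combination rule).
    """
    scores = [binary_entropy(p) * abs(w) for p, w in zip(probs, influence)]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Concept 0 is maximally uncertain; concept 1 is influential but nearly certain;
# concept 2 is uncertain but barely affects the final label.
queries = choose_concepts(probs=[0.5, 0.99, 0.6], influence=[1.0, 5.0, 0.1], k=2)
```

With a budget of only a few interactions, a score like this avoids spending queries on concepts that are either already confidently predicted or irrelevant to the label.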

Reinforcement Learning in System Identification

Dec 14, 2022
Jose Antonio Martin H., Oscar Fernandez Vicente, Sergio Perez, Anas Belfadil, Cristina Ibanez-Llano, Freddy Jose Perozo Rondon, Jose Javier Valle, Javier Arechalde Pelaz

System identification, also known as learning forward models, transfer functions, system dynamics, etc., has a long tradition both in science and engineering in different fields. In particular, it is a recurring theme in Reinforcement Learning research, where forward models approximate the state transition function of a Markov Decision Process by learning a mapping function from the current state and action to the next state. This problem is commonly defined directly as a Supervised Learning problem. This common approach faces several difficulties due to the inherent complexities of the dynamics to learn, for example, delayed effects, high non-linearity, non-stationarity, partial observability and, more importantly, error accumulation when using bootstrapped predictions (predictions based on past predictions) over long time horizons. Here we explore the use of Reinforcement Learning in this problem. We elaborate on why and how this problem fits naturally and soundly as a Reinforcement Learning problem, and present some experimental results that demonstrate RL is a promising technique to solve this kind of problem.

* Accepted at the NeurIPS Deep Reinforcement Learning Workshop 2022: https://openreview.net/forum?id=fGcbpWQIJZV
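
The error accumulation mentioned in the abstract above can be seen in a very small example. The sketch assumes a scalar linear system x[t+1] = a * x[t] and a learned model with a slightly wrong coefficient; it compares one-step predictions made from true states with bootstrapped predictions made from the model's own previous outputs. The system, the coefficients, and the function names are hypothetical and are not taken from the paper.

```python
from typing import List


def rollout(a: float, x0: float, horizon: int) -> List[float]:
    """States x[0..horizon] of the scalar system x[t+1] = a * x[t]."""
    xs = [x0]
    for _ in range(horizon):
        xs.append(a * xs[-1])
    return xs


def one_step_errors(a_true: float, a_model: float, x0: float, horizon: int) -> List[float]:
    """Error when each prediction starts from the true current state."""
    truth = rollout(a_true, x0, horizon)
    return [abs(a_model * truth[t] - truth[t + 1]) for t in range(horizon)]


def bootstrapped_errors(a_true: float, a_model: float, x0: float, horizon: int) -> List[float]:
    """Error when each prediction starts from the previous prediction."""
    truth = rollout(a_true, x0, horizon)
    pred = rollout(a_model, x0, horizon)
    return [abs(p - q) for p, q in zip(pred[1:], truth[1:])]


# The model overestimates the coefficient by 10 percent.
single = one_step_errors(a_true=1.0, a_model=1.1, x0=1.0, horizon=10)
chained = bootstrapped_errors(a_true=1.0, a_model=1.1, x0=1.0, horizon=10)
```

In this toy setting the one-step error stays constant at about 0.1, while the bootstrapped error grows with the horizon. A model with a small supervised one-step loss can therefore still be poor over long rollouts, which is the difficulty the abstract highlights.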
