Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Eric Moulines

MBZUAI, LRE

Deep Reinforcement Learning Algorithms for Hybrid V2X Communication: A Benchmarking Study

Oct 04, 2023

Fouzi Boukhalfa, Reda Alami, Mastane Achab, Eric Moulines, Mehdi Bennis

Figure 1 for Deep Reinforcement Learning Algorithms for Hybrid V2X Communication: A Benchmarking Study

Figure 2 for Deep Reinforcement Learning Algorithms for Hybrid V2X Communication: A Benchmarking Study

Figure 3 for Deep Reinforcement Learning Algorithms for Hybrid V2X Communication: A Benchmarking Study

Figure 4 for Deep Reinforcement Learning Algorithms for Hybrid V2X Communication: A Benchmarking Study

Abstract:In today's era, autonomous vehicles demand a safety level on par with aircraft. Taking a cue from the aerospace industry, which relies on redundancy to achieve high reliability, the automotive sector can also leverage this concept by building redundancy in V2X (Vehicle-to-Everything) technologies. Given the current lack of reliable V2X technologies, this idea is particularly promising. By deploying multiple RATs (Radio Access Technologies) in parallel, the ongoing debate over the standard technology for future vehicles can be put to rest. However, coordinating multiple communication technologies is a complex task due to dynamic, time-varying channels and varying traffic conditions. This paper addresses the vertical handover problem in V2X using Deep Reinforcement Learning (DRL) algorithms. The goal is to assist vehicles in selecting the most appropriate V2X technology (DSRC/V-VLC) in a serpentine environment. The results show that the benchmarked algorithms outperform the current state-of-the-art approaches in terms of redundancy and usage rate of V-VLC headlights. This result is a significant reduction in communication costs while maintaining a high level of reliability. These results provide strong evidence for integrating advanced DRL decision mechanisms into the architecture as a promising approach to solving the vertical handover problem in V2X.

Via

Access Paper or Ask Questions

Monte Carlo guided Diffusion for Bayesian linear inverse problems

Aug 15, 2023

Gabriel Cardoso, Yazid Janati El Idrissi, Sylvain Le Corff, Eric Moulines

Figure 1 for Monte Carlo guided Diffusion for Bayesian linear inverse problems

Figure 2 for Monte Carlo guided Diffusion for Bayesian linear inverse problems

Figure 3 for Monte Carlo guided Diffusion for Bayesian linear inverse problems

Figure 4 for Monte Carlo guided Diffusion for Bayesian linear inverse problems

Abstract:Ill-posed linear inverse problems that combine knowledge of the forward measurement model with prior models arise frequently in various applications, from computational photography to medical imaging. Recent research has focused on solving these problems with score-based generative models (SGMs) that produce perceptually plausible images, especially in inpainting problems. In this study, we exploit the particular structure of the prior defined in the SGM to formulate recovery in a Bayesian framework as a Feynman--Kac model adapted from the forward diffusion model used to construct score-based diffusion. To solve this Feynman--Kac problem, we propose the use of Sequential Monte Carlo methods. The proposed algorithm, MCGdiff, is shown to be theoretically grounded and we provide numerical simulations showing that it outperforms competing baselines when dealing with ill-posed inverse problems.

* preprint

Via

Access Paper or Ask Questions

Conformal Prediction for Federated Uncertainty Quantification Under Label Shift

Jun 08, 2023

Vincent Plassier, Mehdi Makni, Aleksandr Rubashevskii, Eric Moulines, Maxim Panov

Figure 1 for Conformal Prediction for Federated Uncertainty Quantification Under Label Shift

Figure 2 for Conformal Prediction for Federated Uncertainty Quantification Under Label Shift

Figure 3 for Conformal Prediction for Federated Uncertainty Quantification Under Label Shift

Figure 4 for Conformal Prediction for Federated Uncertainty Quantification Under Label Shift

Abstract:Federated Learning (FL) is a machine learning framework where many clients collaboratively train models while keeping the training data decentralized. Despite recent advances in FL, the uncertainty quantification topic (UQ) remains partially addressed. Among UQ methods, conformal prediction (CP) approaches provides distribution-free guarantees under minimal assumptions. We develop a new federated conformal prediction method based on quantile regression and take into account privacy constraints. This method takes advantage of importance weighting to effectively address the label shift between agents and provides theoretical guarantees for both valid coverage of the prediction sets and differential privacy. Extensive experimental studies demonstrate that this method outperforms current competitors.

* ICML 2023

Via

Access Paper or Ask Questions

FAVAS: Federated AVeraging with ASynchronous clients

May 25, 2023

Louis Leconte, Van Minh Nguyen, Eric Moulines

Abstract:In this paper, we propose a novel centralized Asynchronous Federated Learning (FL) framework, FAVAS, for training Deep Neural Networks (DNNs) in resource-constrained environments. Despite its popularity, ``classical'' federated learning faces the increasingly difficult task of scaling synchronous communication over large wireless networks. Moreover, clients typically have different computing resources and therefore computing speed, which can lead to a significant bias (in favor of ``fast'' clients) when the updates are asynchronous. Therefore, practical deployment of FL requires to handle users with strongly varying computing speed in communication/resource constrained setting. We provide convergence guarantees for FAVAS in a smooth, non-convex environment and carefully compare the obtained convergence guarantees with existing bounds, when they are available. Experimental results show that the FAVAS algorithm outperforms current methods on standard benchmarks.

Via

Access Paper or Ask Questions

First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities

May 25, 2023

Aleksandr Beznosikov, Sergey Samsonov, Marina Sheshukova, Alexander Gasnikov, Alexey Naumov, Eric Moulines

Abstract:This paper delves into stochastic optimization problems that involve Markovian noise. We present a unified approach for the theoretical analysis of first-order gradient methods for stochastic optimization and variational inequalities. Our approach covers scenarios for both non-convex and strongly convex minimization problems. To achieve an optimal (linear) dependence on the mixing time of the underlying noise sequence, we use the randomized batching scheme, which is based on the multilevel Monte Carlo method. Moreover, our technique allows us to eliminate the limiting assumptions of previous research on Markov noise, such as the need for a bounded domain and uniformly bounded stochastic gradients. Our extension to variational inequalities under Markovian noise is original. Additionally, we provide lower bounds that match the oracle complexity of our method in the case of strongly convex optimization problems.

* 47 pages, 3 algorithms, 2 tables

Via

Access Paper or Ask Questions

One-Step Distributional Reinforcement Learning

Apr 27, 2023

Mastane Achab, Reda Alami, Yasser Abdelaziz Dahou Djilali, Kirill Fedyanin, Eric Moulines

Figure 1 for One-Step Distributional Reinforcement Learning

Figure 2 for One-Step Distributional Reinforcement Learning

Figure 3 for One-Step Distributional Reinforcement Learning

Figure 4 for One-Step Distributional Reinforcement Learning

Abstract:Reinforcement learning (RL) allows an agent interacting sequentially with an environment to maximize its long-term expected return. In the distributional RL (DistrRL) paradigm, the agent goes beyond the limit of the expected value, to capture the underlying probability distribution of the return across all time steps. The set of DistrRL algorithms has led to improved empirical performance. Nevertheless, the theory of DistrRL is still not fully understood, especially in the control case. In this paper, we present the simpler one-step distributional reinforcement learning (OS-DistrRL) framework encompassing only the randomness induced by the one-step dynamics of the environment. Contrary to DistrRL, we show that our approach comes with a unified theory for both policy evaluation and control. Indeed, we propose two OS-DistrRL algorithms for which we provide an almost sure convergence analysis. The proposed approach compares favorably with categorical DistrRL on various environments.

Via

Access Paper or Ask Questions

Restarted Bayesian Online Change-point Detection for Non-Stationary Markov Decision Processes

Apr 01, 2023

Reda Alami, Mohammed Mahfoud, Eric Moulines

Figure 1 for Restarted Bayesian Online Change-point Detection for Non-Stationary Markov Decision Processes

Figure 2 for Restarted Bayesian Online Change-point Detection for Non-Stationary Markov Decision Processes

Abstract:We consider the problem of learning in a non-stationary reinforcement learning (RL) environment, where the setting can be fully described by a piecewise stationary discrete-time Markov decision process (MDP). We introduce a variant of the Restarted Bayesian Online Change-Point Detection algorithm (R-BOCPD) that operates on input streams originating from the more general multinomial distribution and provides near-optimal theoretical guarantees in terms of false-alarm rate and detection delay. Based on this, we propose an improved version of the UCRL2 algorithm for MDPs with state transition kernel sampled from a multinomial distribution, which we call R-BOCPD-UCRL2. We perform a finite-time performance analysis and show that R-BOCPD-UCRL2 enjoys a favorable regret bound of $O\left(D O \sqrt{A T K_T \log\left (\frac{T}{\delta} \right) + \frac{K_T \log \frac{K_T}{\delta}}{\min\limits_\ell \: \mathbf{KL}\left( {\mathbf{\theta}^{(\ell+1)}}\mid\mid{\mathbf{\theta}^{(\ell)}}\right)}}\right)$, where $D$ is the largest MDP diameter from the set of MDPs defining the piecewise stationary MDP setting, $O$ is the finite number of states (constant over all changes), $A$ is the finite number of actions (constant over all changes), $K_T$ is the number of change points up to horizon $T$, and $\mathbf{\theta}^{(\ell)}$ is the transition kernel during the interval $[c_\ell, c_{\ell+1})$, which we assume to be multinomially distributed over the set of states $\mathbb{O}$. Interestingly, the performance bound does not directly scale with the variation in MDP state transition distributions and rewards, ie. can also model abrupt changes. In practice, R-BOCPD-UCRL2 outperforms the state-of-the-art in a variety of scenarios in synthetic environments. We provide a detailed experimental setup along with a code repository (upon publication) that can be used to easily reproduce our experiments.

Via

Access Paper or Ask Questions

Orthogonal Directions Constrained Gradient Method: from non-linear equality constraints to Stiefel manifold

Mar 16, 2023

Sholom Schechtman, Daniil Tiapkin, Michael Muehlebach, Eric Moulines

Figure 1 for Orthogonal Directions Constrained Gradient Method: from non-linear equality constraints to Stiefel manifold

Figure 2 for Orthogonal Directions Constrained Gradient Method: from non-linear equality constraints to Stiefel manifold

Figure 3 for Orthogonal Directions Constrained Gradient Method: from non-linear equality constraints to Stiefel manifold

Figure 4 for Orthogonal Directions Constrained Gradient Method: from non-linear equality constraints to Stiefel manifold

Abstract:We consider the problem of minimizing a non-convex function over a smooth manifold $\mathcal{M}$. We propose a novel algorithm, the Orthogonal Directions Constrained Gradient Method (ODCGM) which only requires computing a projection onto a vector space. ODCGM is infeasible but the iterates are constantly pulled towards the manifold, ensuring the convergence of ODCGM towards $\mathcal{M}$. ODCGM is much simpler to implement than the classical methods which require the computation of a retraction. Moreover, we show that ODCGM exhibits the near-optimal oracle complexities $\mathcal{O}(1/\varepsilon^2)$ and $\mathcal{O}(1/\varepsilon^4)$ in the deterministic and stochastic cases, respectively. Furthermore, we establish that, under an appropriate choice of the projection metric, our method recovers the landing algorithm of Ablin and Peyr\'e (2022), a recently introduced algorithm for optimization over the Stiefel manifold. As a result, we significantly extend the analysis of Ablin and Peyr\'e (2022), establishing near-optimal rates both in deterministic and stochastic frameworks. Finally, we perform numerical experiments which shows the efficiency of ODCGM in a high-dimensional setting.

Via

Access Paper or Ask Questions

Fast Rates for Maximum Entropy Exploration

Mar 14, 2023

Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Remi Munos, Alexey Naumov, Pierre Perrault, Yunhao Tang, Michal Valko, Pierre Menard

Figure 1 for Fast Rates for Maximum Entropy Exploration

Figure 2 for Fast Rates for Maximum Entropy Exploration

Figure 3 for Fast Rates for Maximum Entropy Exploration

Figure 4 for Fast Rates for Maximum Entropy Exploration

Abstract:We consider the reinforcement learning (RL) setting, in which the agent has to act in unknown environment driven by a Markov Decision Process (MDP) with sparse or even reward free signals. In this situation, exploration becomes the main challenge. In this work, we study the maximum entropy exploration problem of two different types. The first type is visitation entropy maximization that was previously considered by Hazan et al. (2019) in the discounted setting. For this type of exploration, we propose an algorithm based on a game theoretic representation that has $\widetilde{\mathcal{O}}(H^3 S^2 A / \varepsilon^2)$ sample complexity thus improving the $\varepsilon$-dependence of Hazan et al. (2019), where $S$ is a number of states, $A$ is a number of actions, $H$ is an episode length, and $\varepsilon$ is a desired accuracy. The second type of entropy we study is the trajectory entropy. This objective function is closely related to the entropy-regularized MDPs, and we propose a simple modification of the UCBVI algorithm that has a sample complexity of order $\widetilde{\mathcal{O}}(1/\varepsilon)$ ignoring dependence in $S, A, H$. Interestingly enough, it is the first theoretical result in RL literature establishing that the exploration problem for the regularized MDPs can be statistically strictly easier (in terms of sample complexity) than for the ordinary MDPs.

Via

Access Paper or Ask Questions

Rosenthal-type inequalities for linear statistics of Markov chains

Mar 10, 2023

Alain Durmus, Eric Moulines, Alexey Naumov, Sergey Samsonov, Marina Sheshukova

Abstract:In this paper, we establish novel deviation bounds for additive functionals of geometrically ergodic Markov chains similar to Rosenthal and Bernstein-type inequalities for sums of independent random variables. We pay special attention to the dependence of our bounds on the mixing time of the corresponding chain. Our proof technique is, as far as we know, new and based on the recurrent application of the Poisson decomposition. We relate the constants appearing in our moment bounds to the constants from the martingale version of the Rosenthal inequality and show an explicit dependence on the parameters of the underlying Markov kernel.

Via

Access Paper or Ask Questions