Title: Dynamic and Distributed Routing in IoT Networks based on Multi-Objective Q-Learning

May 01, 2025

Shubham Vaishnav, Praveen Kumar Donta, Sindri Magnússon

Abstract: The last few decades have witnessed a rapid increase in IoT devices owing to their wide range of applications, such as smart healthcare monitoring systems, smart cities, and environmental monitoring. A critical task in IoT networks is sensing and transmitting information over the network. IoT nodes gather data by sensing the environment and then transmit it to a destination node via multi-hop communication, following a routing protocol. These protocols are usually designed to optimize possibly conflicting objectives, such as maximizing packet delivery ratio and energy efficiency. While most of the literature has focused on optimizing a static objective that remains unchanged, many real-world IoT applications require adapting to rapidly shifting priorities. For example, in monitoring systems, some transmissions are time-critical and place a high priority on low latency, while other transmissions are less urgent and instead prioritize energy efficiency. To meet such dynamic demands, we propose a novel dynamic and distributed routing scheme based on multi-objective Q-learning that can adapt to changes in preferences in real time. Our algorithm builds on ideas from both multi-objective optimization and Q-learning. We also propose a novel greedy interpolation policy scheme to make near-optimal decisions under unexpected preference changes. The proposed scheme can approximate and utilize the Pareto-efficient solutions for dynamic preferences, thus using past knowledge to adapt quickly to unpredictable preferences during runtime. Simulation results show that the proposed scheme outperforms state-of-the-art algorithms across various exploration strategies, preference variation patterns, and important metrics such as overall reward, energy efficiency, and packet delivery ratio.
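
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea and not the authors' method. It assumes a hypothetical 4-node topology with made-up (latency, energy) link costs, a linear scalarization of the two objectives, and plain linear interpolation between Q-tables learned for two anchor preferences as a stand-in for the greedy interpolation policy. All names (`LINKS`, `train_q`, `interpolated_next_hop`) and hyperparameters are assumptions made for illustration.

```python
import numpy as np

# Hypothetical topology: node -> {next_hop: (latency_cost, energy_cost)}.
# The route via node 1 is fast but energy-hungry; via node 2 it is slow but frugal.
LINKS = {
    0: {1: (1.0, 5.0), 2: (4.0, 1.0)},
    1: {3: (1.0, 5.0)},
    2: {3: (4.0, 1.0)},
}
SINK = 3


def train_q(w, episodes=500, alpha=0.5, gamma=1.0, eps=0.2, seed=0):
    """Tabular Q-learning for one fixed preference w = (w_latency, w_energy)."""
    rng = np.random.default_rng(seed)
    w = np.asarray(w, dtype=float)
    Q = {s: {a: 0.0 for a in nbrs} for s, nbrs in LINKS.items()}
    for _ in range(episodes):
        s = 0
        while s != SINK:
            actions = list(Q[s])
            if rng.random() < eps:  # epsilon-greedy exploration
                a = actions[rng.integers(len(actions))]
            else:
                a = max(actions, key=Q[s].get)
            # Scalarized reward: negative preference-weighted link cost.
            reward = -float(np.dot(w, LINKS[s][a]))
            future = 0.0 if a == SINK else max(Q[a].values())
            Q[s][a] += alpha * (reward + gamma * future - Q[s][a])
            s = a
    return Q


def interpolated_next_hop(node, w_latency, anchors):
    """Greedy next hop for an unseen preference.

    anchors: list of (latency_weight, Q) pairs sorted by latency_weight.
    The Q-values of the two nearest anchors are blended linearly
    (an assumed stand-in for the paper's interpolation policy).
    """
    weights = [w for w, _ in anchors]
    hi = int(np.searchsorted(weights, w_latency))
    hi = min(max(hi, 1), len(anchors) - 1)
    (w_lo, q_lo), (w_hi, q_hi) = anchors[hi - 1], anchors[hi]
    t = (w_latency - w_lo) / (w_hi - w_lo)
    blended = {a: (1 - t) * q_lo[node][a] + t * q_hi[node][a] for a in q_lo[node]}
    return max(blended, key=blended.get)


# Learn two anchor preferences once: energy-only and latency-only.
anchors = [(0.0, train_q((0.0, 1.0))), (1.0, train_q((1.0, 0.0)))]

# Reuse them for preferences that were never trained on.
print(interpolated_next_hop(0, 0.8, anchors))  # latency-heavy preference
print(interpolated_next_hop(0, 0.2, anchors))  # energy-heavy preference
```

In this toy setting, a latency-heavy preference should select the hop toward node 1 and an energy-heavy preference the hop toward node 2, without retraining for either. A real deployment would need per-node (distributed) tables, more anchor preferences, and more than two objectives.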
