Shuangyi Yan

Multi-Agentic AI for Fairness-Aware and Accelerated Multi-modal Large Model Inference in Real-world Mobile Edge Networks

Feb 06, 2026

Haiyuan Li, Hari Madhukumar, Shuangyi Yan, Yulei Wu, Dimitra Simeonidou

Abstract: Generative AI (GenAI) has transformed applications in natural language processing and content creation, yet centralized inference remains hindered by high latency, limited customizability, and privacy concerns. Deploying large models (LMs) in mobile edge networks emerges as a promising solution. However, it also poses new challenges, including heterogeneous multi-modal LMs with diverse resource demands and inference speeds, varied prompt/output modalities that complicate orchestration, and resource-limited infrastructure ill-suited for concurrent LM execution. In response, we propose a Multi-Agentic AI framework for latency- and fairness-aware multi-modal LM inference in mobile edge networks. Our solution includes a long-term planning agent, a short-term prompt scheduling agent, and multiple on-node LM deployment agents, all powered by foundation language models. These agents cooperatively optimize prompt routing and LM deployment through natural language reasoning over runtime telemetry and historical experience. To evaluate its performance, we further develop a city-wide testbed that supports network monitoring, containerized LM deployment, intra-server resource management, and inter-server communications. Experiments demonstrate that our solution reduces average latency by over 80% and improves fairness (Normalized Jain index) to 0.90 compared to other baselines. Moreover, our solution adapts quickly without fine-tuning, offering a generalizable solution for optimizing GenAI services in edge environments.
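The fairness figure in the abstract is a normalized Jain index. As a rough illustration only, the sketch below computes the standard Jain fairness index, J = (sum x)^2 / (n * sum x^2), and one common rescaling of it to the range [0, 1]. The abstract does not define the normalization the authors use, so the rescaling, the function names, and the sample values here are all assumptions, not the paper's method.

```python
from typing import Sequence


def jain_index(values: Sequence[float]) -> float:
    """Standard Jain fairness index: (sum x)^2 / (n * sum x^2), in [1/n, 1]."""
    n = len(values)
    if n == 0:
        raise ValueError("values must be non-empty")
    total = sum(values)
    squares = sum(v * v for v in values)
    if squares == 0:
        # All allocations are zero: treat as perfectly equal (a convention).
        return 1.0
    return (total * total) / (n * squares)


def normalized_jain_index(values: Sequence[float]) -> float:
    """Assumed normalization: rescale J from [1/n, 1] to [0, 1]."""
    n = len(values)
    if n == 1:
        return 1.0
    j = jain_index(values)
    return (j - 1.0 / n) / (1.0 - 1.0 / n)


if __name__ == "__main__":
    # Hypothetical per-model throughput values, for illustration only.
    sample = [4.0, 5.0, 6.0, 5.0]
    print(jain_index(sample), normalized_jain_index(sample))
```

Under this assumed normalization, a value of 1 means every entry is equal, and a value of 0 means a single entry receives everything.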

From Hype to Reality: The Road Ahead of Deploying DRL in 6G Networks

Oct 30, 2024

Haiyuan Li, Hari Madhukumar, Peizheng Li, Yiran Teng, Shuangyi Yan, Dimitra Simeonidou

Abstract: The industrial landscape is rapidly evolving with the advent of 6G applications, which demand massive connectivity, high computational capacity, and ultra-low latency. These requirements present new challenges, which can no longer be efficiently addressed by conventional strategies. In response, this article underscores the transformative potential of Deep Reinforcement Learning (DRL) for 6G, highlighting its advantages over classic machine learning solutions in meeting the demands of 6G. The necessity of DRL is further validated through three DRL applications in an end-to-end communication procedure, including wireless access control, baseband function placement, and network slicing coordination. However, DRL-based network management initiatives are far from mature. We extend the discussion to identify the challenges of applying DRL in practical networks and explore potential solutions along with their respective limitations. In the end, these insights are validated through a practical DRL deployment in managing network slices on the testbed.

DRL-Assisted Dynamic QoT-Aware Service Provisioning in Multi-Band Elastic Optical Networks

Aug 06, 2024

Yiran Teng, Carlos Natalino, Farhad Arpanaei, Alfonso Sánchez-Macián, Paolo Monti, Shuangyi Yan, Dimitra Simeonidou

Abstract: We propose a DRL-assisted approach for service provisioning in multi-band elastic optical networks. Our simulation environment uses an accurate QoT estimator based on the GN/EGN model. Results show that the proposed approach reduces request blocking by 50% compared with heuristics from the literature.

* This paper has been accepted by the 50th European Conference on Optical Communications (ECOC 2024)
