Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Erik Elmroth

Trust-Aware Routing for Distributed Generative AI Inference at the Edge

Mar 30, 2026

Chanh Nguyen, Erik Elmroth

Abstract:Emerging deployments of Generative AI increasingly execute inference across decentralized and heterogeneous edge devices rather than on a single trusted server. In such environments, a single device failure or misbehavior can disrupt the entire inference process, making traditional best-effort peer-to-peer routing insufficient. Coordinating distributed generative inference therefore requires mechanisms that explicitly account for reliability, performance variability, and trust among participating peers. In this paper, we present G-TRAC, a trust-aware coordination framework that integrates algorithmic path selection with system-level protocol design to ensure robust distributed inference. First, we formulate the routing problem as a \textit{Risk-Bounded Shortest Path} computation and introduce a polynomial-time solution that combines trust-floor pruning with Dijkstra's search, achieving sub-millisecond median routing latency at practical edge scales, and remaining below 10 ms at larger scales. Second, to operationally support the routing logic in dynamic environments, the framework employs a \textit{Hybrid Trust Architecture} that maintains global reputation state at stable anchors while disseminating lightweight updates to edge peers via background synchronization. Experimental evaluation on a heterogeneous testbed of commodity devices demonstrates that G-TRAC significantly improves inference completion rates, effectively isolates unreliable peers, and sustains robust execution even under node failures and network partitions.

* 11 pages, 10 figures. Preprint accepted at the 22nd Annual International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT 2026)

Via

Access Paper or Ask Questions

Speculative Policy Orchestration: A Latency-Resilient Framework for Cloud-Robotic Manipulation

Mar 19, 2026

Chanh Nguyen, Shutong Jin, Florian T. Pokorny, Erik Elmroth

Abstract:Cloud robotics enables robots to offload high-dimensional motion planning and reasoning to remote servers. However, for continuous manipulation tasks requiring high-frequency control, network latency and jitter can severely destabilize the system, causing command starvation and unsafe physical execution. To address this, we propose Speculative Policy Orchestration (SPO), a latency-resilient cloud-edge framework. SPO utilizes a cloud-hosted world model to pre-compute and stream future kinematic waypoints to a local edge buffer, decoupling execution frequency from network round-trip time. To mitigate unsafe execution caused by predictive drift, the edge node employs an $ε$-tube verifier that strictly bounds kinematic execution errors. The framework is coupled with an Adaptive Horizon Scaling mechanism that dynamically expands or shrinks the speculative pre-fetch depth based on real-time tracking error. We evaluate SPO on continuous RLBench manipulation tasks under emulated network delays. Results show that even when deployed with learned models of modest accuracy, SPO reduces network-induced idle time by over 60% compared to blocking remote inference. Furthermore, SPO discards approximately 60% fewer cloud predictions than static caching baselines. Ultimately, SPO enables fluid, real-time cloud-robotic control while maintaining bounded physical safety.

* 9 pages, 7 figures, conference submission

Via

Access Paper or Ask Questions

A Cost-Aware Approach to Adversarial Robustness in Neural Networks

Sep 11, 2024

Charles Meyers, Mohammad Reza Saleh Sedghpour, Tommy Löfstedt, Erik Elmroth

Figure 1 for A Cost-Aware Approach to Adversarial Robustness in Neural Networks

Figure 2 for A Cost-Aware Approach to Adversarial Robustness in Neural Networks

Figure 3 for A Cost-Aware Approach to Adversarial Robustness in Neural Networks

Figure 4 for A Cost-Aware Approach to Adversarial Robustness in Neural Networks

Abstract:Considering the growing prominence of production-level AI and the threat of adversarial attacks that can evade a model at run-time, evaluating the robustness of models to these evasion attacks is of critical importance. Additionally, testing model changes likely means deploying the models to (e.g. a car or a medical imaging device), or a drone to see how it affects performance, making un-tested changes a public problem that reduces development speed, increases cost of development, and makes it difficult (if not impossible) to parse cause from effect. In this work, we used survival analysis as a cloud-native, time-efficient and precise method for predicting model performance in the presence of adversarial noise. For neural networks in particular, the relationships between the learning rate, batch size, training time, convergence time, and deployment cost are highly complex, so researchers generally rely on benchmark datasets to assess the ability of a model to generalize beyond the training data. To address this, we propose using accelerated failure time models to measure the effect of hardware choice, batch size, number of epochs, and test-set accuracy by using adversarial attacks to induce failures on a reference model architecture before deploying the model to the real world. We evaluate several GPU types and use the Tree Parzen Estimator to maximize model robustness and minimize model run-time simultaneously. This provides a way to evaluate the model and optimise it in a single step, while simultaneously allowing us to model the effect of model parameters on training time, prediction time, and accuracy. Using this technique, we demonstrate that newer, more-powerful hardware does decrease the training time, but with a monetary and power cost that far outpaces the marginal gains in accuracy.

Via

Access Paper or Ask Questions

A Systematic Approach to Robustness Modelling for Deep Convolutional Neural Networks

Jan 24, 2024

Charles Meyers, Mohammad Reza Saleh Sedghpour, Tommy Löfstedt, Erik Elmroth

Figure 1 for A Systematic Approach to Robustness Modelling for Deep Convolutional Neural Networks

Figure 2 for A Systematic Approach to Robustness Modelling for Deep Convolutional Neural Networks

Figure 3 for A Systematic Approach to Robustness Modelling for Deep Convolutional Neural Networks

Figure 4 for A Systematic Approach to Robustness Modelling for Deep Convolutional Neural Networks

Abstract:Convolutional neural networks have shown to be widely applicable to a large number of fields when large amounts of labelled data are available. The recent trend has been to use models with increasingly larger sets of tunable parameters to increase model accuracy, reduce model loss, or create more adversarially robust models -- goals that are often at odds with one another. In particular, recent theoretical work raises questions about the ability for even larger models to generalize to data outside of the controlled train and test sets. As such, we examine the role of the number of hidden layers in the ResNet model, demonstrated on the MNIST, CIFAR10, CIFAR100 datasets. We test a variety of parameters including the size of the model, the floating point precision, and the noise level of both the training data and the model output. To encapsulate the model's predictive power and computational cost, we provide a method that uses induced failures to model the probability of failure as a function of time and relate that to a novel metric that allows us to quickly determine whether or not the cost of training a model outweighs the cost of attacking it. Using this approach, we are able to approximate the expected failure rate using a small number of specially crafted samples rather than increasingly larger benchmark datasets. We demonstrate the efficacy of this technique on both the MNIST and CIFAR10 datasets using 8-, 16-, 32-, and 64-bit floating-point numbers, various data pre-processing techniques, and several attacks on five configurations of the ResNet model. Then, using empirical measurements, we examine the various trade-offs between cost, robustness, latency, and reliability to find that larger models do not significantly aid in adversarial robustness despite costing significantly more to train.

Via

Access Paper or Ask Questions

Resource Management for Blockchain-enabled Federated Learning: A Deep Reinforcement Learning Approach

May 01, 2020

Nguyen Quang Hieu, Tran The Anh, Nguyen Cong Luong, Dusit Niyato, Dong In Kim, Erik Elmroth

Figure 1 for Resource Management for Blockchain-enabled Federated Learning: A Deep Reinforcement Learning Approach

Figure 2 for Resource Management for Blockchain-enabled Federated Learning: A Deep Reinforcement Learning Approach

Figure 3 for Resource Management for Blockchain-enabled Federated Learning: A Deep Reinforcement Learning Approach

Figure 4 for Resource Management for Blockchain-enabled Federated Learning: A Deep Reinforcement Learning Approach

Abstract:Blockchain-enabled Federated Learning (BFL) enables mobile devices to collaboratively train neural network models required by a Machine Learning Model Owner (MLMO) while keeping data on the mobile devices. Then, the model updates are stored in the blockchain in a decentralized and reliable manner. However, the issue of BFL is that the mobile devices have energy and CPU constraints that may reduce the system lifetime and training efficiency. The other issue is that the training latency may increase due to the blockchain mining process. To address these issues, the MLMO needs to (i) decide how much data and energy that the mobile devices use for the training and (ii) determine the block generation rate to minimize the system latency, energy consumption, and incentive cost while achieving the target accuracy for the model. Under the uncertainty of the BFL environment, it is challenging for the MLMO to determine the optimal decisions. We propose to use the Deep Reinforcement Learning (DRL) to derive the optimal decisions for the MLMO.

Via

Access Paper or Ask Questions