Zhanwei Wang

Multi-SPIN: Multi-Access Speculative Inference for Cooperative Token Generation at the Edge

Jun 03, 2026

Haotian Zheng, Zhanwei Wang, Mingyao Cui, Chang Cai, Hongyang Du, Kaibin Huang

Abstract:Speculative inference (SPIN) was originally developed as an efficient architecture to accelerate Large Language Models (LLMs). In this work, we propose its distributed deployment to enable cooperative token generation in a multiuser edge system; its advantage is to effectively balance computational loads between resource-constrained devices and servers. The resulting architecture, termed Multi-access SPIN (Multi-SPIN), utilizes on-device small language models to generate and upload candidate token drafts, while an edge server operates the LLM to verify them in parallel batches. Given the severe heterogeneity in users' computation and communication capabilities, the draft length emerges as a critical control variable that influences node-level computation loads and multi-access latency, thereby governing the sum token goodput. Consequently, considering frequency-division multiple access, we investigate the problem of multi-access draft control, a joint optimization of draft-length control and bandwidth allocation to maximize sum token goodput. We examine two cases: (1) homogeneous draft lengths across users to facilitate server-side batching, and (2) heterogeneous draft lengths to introduce a new dimension for goodput enhancement. By developing decomposition methods, we reduce these complex optimizations into tractable sub-problems, which allow efficient draft control algorithms to be derived in closed form. Our analysis shows that the optimal bandwidth allocation compensates users with weaker computation-and-communication capabilities in the homogeneous case due to the batching synchronization requirements, whereas its heterogeneous-case counterpart rewards users with higher acceptance rates by relaxing such requirements. Experiments using Llama-2 and Qwen3.5 model pairs across diverse tasks demonstrate that Multi-SPIN improves goodput by up to 88% over heterogeneity-agnostic baselines.
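The role of draft length as a control variable can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes each drafted token is accepted independently with probability `alpha` (the standard speculative-decoding approximation), and the per-round latency model and the parameters `t_draft`, `t_upload`, and `t_verify` are hypothetical placeholders for on-device drafting, uplink transmission, and server-side verification times.

```python
def expected_tokens_per_round(alpha: float, draft_len: int) -> float:
    """Expected tokens produced by one draft-verify round.

    Assumes i.i.d. acceptance with probability alpha; the count includes
    the one token the verifier always contributes.
    """
    if alpha >= 1.0:
        return float(draft_len + 1)
    return (1.0 - alpha ** (draft_len + 1)) / (1.0 - alpha)


def goodput(alpha: float, draft_len: int,
            t_draft: float, t_upload: float, t_verify: float) -> float:
    """Tokens per second under an assumed (hypothetical) latency model:
    drafting and uploading cost scale with draft length, verification is fixed."""
    round_latency = draft_len * (t_draft + t_upload) + t_verify
    return expected_tokens_per_round(alpha, draft_len) / round_latency


def best_draft_len(alpha: float, t_draft: float, t_upload: float,
                   t_verify: float, max_len: int = 16) -> int:
    """Exhaustive search over draft lengths 1..max_len for one user."""
    return max(range(1, max_len + 1),
               key=lambda n: goodput(alpha, n, t_draft, t_upload, t_verify))


if __name__ == "__main__":
    for alpha in (0.5, 0.9):
        n = best_draft_len(alpha, t_draft=0.01, t_upload=0.005, t_verify=0.1)
        print(f"alpha={alpha}: best draft length {n}")
```

Under these assumptions, a user with a higher acceptance rate benefits from a longer draft, which hints at why a single shared draft length can be suboptimal when users differ.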

Rydberg Atomic Receivers for Multi-Band Communications and Sensing

May 30, 2025

Mingyao Cui, Qunsong Zeng, Zhanwei Wang, Kaibin Huang

Abstract:Harnessing multi-level electron transitions, Rydberg Atomic Receivers (RAREs) can detect wireless signals across a wide range of frequency bands, from Megahertz to Terahertz, enabling multi-band communications and sensing (C&S). Current research on multi-band RAREs primarily focuses on experimental demonstrations and lacks an interpretable model that mathematically characterizes their mechanisms. This leaves the multi-band RARE as a black box, posing challenges for its practical C&S applications. To fill this gap, this paper investigates the underlying mechanism of multi-band RAREs and explores their optimal performance. For the first time, the closed-form expression of the transfer function of a multi-band RARE is derived by solving the quantum response of Rydberg atoms excited by multi-band signals. The function reveals that a multi-band RARE simultaneously serves as both a multi-band atomic mixer for down-converting multi-band signals and a multi-band atomic amplifier that reflects its sensitivity to each band. Further analysis of the atomic amplifier reveals that the gain factor at each frequency band can be decoupled into a global gain term and a Rabi attention term. The former determines the overall sensitivity of a RARE to all frequency bands of wireless signals. The latter influences the allocation of the overall sensitivity to each frequency band, representing a unique attention mechanism of multi-band RAREs. The optimal design of the global gain is provided to maximize the overall sensitivity of multi-band RAREs. Subsequently, the optimal Rabi attentions are also derived to maximize the practical multi-band C&S performance. Numerical results confirm the effectiveness of the derived transfer function and the superiority of multi-band RAREs.

* 13 pages, 7 figures, ongoing work

Ultra-Low-Latency Edge Intelligent Sensing: A Source-Channel Tradeoff and Its Application to Coding Rate Adaptation

Mar 06, 2025

Qunsong Zeng, Jianhao Huang, Zhanwei Wang, Kaibin Huang, Kin K. Leung

Abstract:The forthcoming sixth-generation (6G) mobile network is set to merge edge artificial intelligence (AI) and integrated sensing and communication (ISAC) extensively, giving rise to the new paradigm of edge intelligent sensing (EI-Sense). This paradigm leverages ubiquitous edge devices for environmental sensing and deploys AI algorithms at edge servers to interpret the observations via remote inference on wirelessly uploaded features. A significant challenge arises in designing EI-Sense systems for 6G mission-critical applications, which demand high performance under stringent latency constraints. To tackle this challenge, we focus on the end-to-end (E2E) performance of EI-Sense and characterize a source-channel tradeoff that balances source distortion and channel reliability. In this work, we establish a theoretical foundation for the source-channel tradeoff by quantifying the effects of source coding on feature discriminant gains and channel reliability on packet loss. Building on this foundation, we design the coding rate control by optimizing the tradeoff to minimize the E2E sensing error probability, leading to a low-complexity algorithm for ultra-low-latency EI-Sense. Finally, we validate our theoretical analysis and proposed coding rate control algorithm through extensive experiments on both synthetic and real datasets, demonstrating the sensing performance gain of our approach with respect to traditional reliability-centric methods.

Integrated Sensing and Edge AI: Realizing Intelligent Perception in 6G

Jan 12, 2025

Zhiyan Liu, Xu Chen, Hai Wu, Zhanwei Wang, Xianhao Chen, Dusit Niyato, Kaibin Huang

Abstract:Sensing and edge artificial intelligence (AI) are envisioned as two essential and interconnected functions in sixth-generation (6G) mobile networks. On the one hand, sensing-empowered applications rely on powerful AI models to extract features and understand semantics from ubiquitous wireless sensors. On the other hand, the massive amount of sensory data serves as the fuel to continuously refine edge AI models. This deep integration of sensing and edge AI has given rise to a new task-oriented paradigm known as integrated sensing and edge AI (ISEA), which features a holistic design approach to communication, AI computation, and sensing for optimal sensing-task performance. In this article, we present a comprehensive survey of ISEA. We first provide technical preliminaries for sensing, edge AI, and new communication paradigms in ISEA. Then, we study several use cases of ISEA to demonstrate its practical relevance and introduce current standardization and industrial progress. Next, the design principles, metrics, tradeoffs, and architectures of ISEA are established, followed by a thorough overview of ISEA techniques, including digital air interface, over-the-air computation, and advanced signal processing. Its interplay with various 6G advancements, e.g., new physical-layer and networking techniques, is presented. Finally, we present future research opportunities in ISEA, including the integration of foundation models, convergence of ISEA and integrated sensing and communications (ISAC), and ultra-low-latency ISEA.

Spectrum Breathing: Protecting Over-the-Air Federated Learning Against Interference

May 10, 2023

Zhanwei Wang, Kaibin Huang, Yonina C. Eldar

Abstract:Federated Learning (FL) is a widely embraced paradigm for distilling artificial intelligence from distributed mobile data. However, the deployment of FL in mobile networks can be compromised by exposure to interference from neighboring cells or jammers. Existing interference mitigation techniques require multi-cell cooperation or at least interference channel state information, which is expensive in practice. On the other hand, power control that treats interference as noise may not be effective due to limited power budgets, and this mechanism can also trigger countermeasures by interference sources. As a practical approach for protecting FL against interference, we propose Spectrum Breathing, which cascades stochastic-gradient pruning and spread spectrum to suppress interference without bandwidth expansion. The approach exploits the graceful degradation of learning speed under pruning, at the cost of higher learning latency. We synchronize the two operations such that their levels are controlled by the same parameter, Breathing Depth. To optimally control the parameter, we develop a martingale-based approach to convergence analysis of Over-the-Air FL with spectrum breathing, termed AirBreathing FL. We show a performance tradeoff between gradient-pruning and interference-induced error as regulated by the breathing depth. Given the receive signal-to-interference ratio (SIR) and model size, the optimization of the tradeoff yields two schemes for controlling the breathing depth, which can be either fixed or adaptive to channels and the learning process. As shown by experiments, in scenarios where traditional Over-the-Air FL fails to converge in the presence of strong interference, AirBreathing FL with either fixed or adaptive breathing depth ensures convergence, with the adaptive scheme achieving close-to-ideal performance.
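The cascade of pruning and spreading under one shared parameter can be sketched for a single link as follows. This is only an illustration of the general idea, not the paper's scheme: it assumes uniformly random pruning, random ±1 spreading codes known to the receiver, and no channel fading or multi-device aggregation. The function names and the parameter `depth` (standing in for the breathing depth) are hypothetical.

```python
import random


def breathe_transmit(grad, depth, rng):
    """Keep len(grad)//depth randomly chosen coefficients, then spread each
    kept coefficient over `depth` chips using a random +/-1 code.

    The chip count equals len(grad) when depth divides it, so the
    transmission uses no more symbols than sending the full gradient.
    """
    kept = sorted(rng.sample(range(len(grad)), len(grad) // depth))
    codes = [[rng.choice((-1.0, 1.0)) for _ in range(depth)] for _ in kept]
    chips = [grad[i] * c for i, code in zip(kept, codes) for c in code]
    return kept, codes, chips


def breathe_receive(rx, codes, depth):
    """Despread: correlate each block of `depth` received chips with its code."""
    recovered = []
    for k, code in enumerate(codes):
        block = rx[k * depth:(k + 1) * depth]
        recovered.append(sum(s * c for s, c in zip(block, code)) / depth)
    return recovered


if __name__ == "__main__":
    rng = random.Random(0)
    grad = [0.5, -1.0, 2.0, 0.25, -0.75, 1.5, -2.0, 1.0]
    kept, codes, chips = breathe_transmit(grad, depth=4, rng=rng)
    rx = [x + rng.gauss(0.0, 1.0) for x in chips]  # additive interference
    print(kept, breathe_receive(rx, codes, depth=4))
```

In this toy setting, despreading averages the interference over `depth` chips, so a larger depth suppresses more interference while discarding more gradient coefficients, which mirrors the tradeoff described in the abstract.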
