Abstract:Distracted driving contributes to fatal crashes worldwide. To address this, researchers are using driver activity recognition (DAR) with impulse radio ultra-wideband (IR-UWB) radar, which offers advantages such as interference resistance, low power consumption, and privacy preservation. However, two challenges limit its adoption: the lack of large-scale real-world UWB datasets covering diverse distracted driving behaviors, and the difficulty of adapting fixed-input Vision Transformers (ViTs) to UWB radar data with non-standard dimensions. This work addresses both challenges. We present the ALERT dataset, which contains 10,220 radar samples of seven distracted driving activities collected in real driving conditions. We also propose the input-size-agnostic Vision Transformer (ISA-ViT), a framework designed for radar-based DAR. The proposed method resizes UWB data to meet ViT input requirements while preserving radar-specific information such as Doppler shifts and phase characteristics. By adjusting patch configurations and leveraging pre-trained positional embedding vectors (PEVs), ISA-ViT overcomes the limitations of naive resizing approaches. In addition, a domain fusion strategy combines range- and frequency-domain features to further improve classification performance. Comprehensive experiments demonstrate that ISA-ViT achieves a 22.68% accuracy improvement over an existing ViT-based approach for UWB-based DAR. By publicly releasing the ALERT dataset and detailing our input-size-agnostic strategy, this work facilitates the development of more robust and scalable distracted driving detection systems for real-world deployment.
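
To make the positional-embedding adaptation concrete, the following PyTorch-style sketch shows one common way to resize pre-trained positional embeddings to a new patch grid via 2-D interpolation. It is a minimal illustration under assumed shapes (a 14x14 pre-trained grid and a hypothetical 8x24 radar patch grid), not the released ISA-ViT code.

```python
# Minimal sketch: adapting pre-trained ViT positional embeddings to a
# non-standard UWB radar input size by 2-D interpolation (assumed approach;
# the actual ISA-ViT implementation may differ).
import torch
import torch.nn.functional as F

def resize_positional_embeddings(pos_embed: torch.Tensor,
                                 old_grid: tuple, new_grid: tuple) -> torch.Tensor:
    """pos_embed: (1, 1 + old_h*old_w, dim) with a leading [CLS] token."""
    cls_token, patch_pos = pos_embed[:, :1], pos_embed[:, 1:]
    dim = patch_pos.shape[-1]
    old_h, old_w = old_grid
    new_h, new_w = new_grid
    # Reshape to a 2-D grid, interpolate, and flatten back.
    patch_pos = patch_pos.reshape(1, old_h, old_w, dim).permute(0, 3, 1, 2)
    patch_pos = F.interpolate(patch_pos, size=(new_h, new_w),
                              mode="bicubic", align_corners=False)
    patch_pos = patch_pos.permute(0, 2, 3, 1).reshape(1, new_h * new_w, dim)
    return torch.cat([cls_token, patch_pos], dim=1)

# Example: a 14x14 ImageNet-pre-trained grid adapted to a hypothetical
# 8x24 grid produced by patchifying a range-Doppler map.
pos = torch.randn(1, 1 + 14 * 14, 768)
new_pos = resize_positional_embeddings(pos, (14, 14), (8, 24))
print(new_pos.shape)  # torch.Size([1, 193, 768])
```
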
Abstract:How can we explain the influence of training data on black-box models? Influence functions (IFs) offer a post-hoc solution by utilizing gradients and Hessians. However, computing the Hessian for an entire dataset is resource-intensive, necessitating a feasible alternative. A common approach involves randomly sampling a small subset of the training data, but this method often results in highly inconsistent IF estimates due to the high variance in sample configurations. To address this, we propose two advanced sampling techniques based on features and logits. These samplers select a small yet representative subset of the entire dataset by considering the stochastic distribution of features or logits, thereby enhancing the accuracy of IF estimations. We validate our approach through class removal experiments, a typical application of IFs, using the F1-score to measure how effectively the model forgets the removed class while maintaining inference consistency on the remaining classes. Our method reduces computation time by 30.1% and memory usage by 42.2%, or improves the F1-score by 2.5% compared to the baseline.
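
For reference, the influence of a training point $z$ on the loss at a test point $z_{\mathrm{test}}$ is commonly estimated as below (following the standard influence-function formulation); approximating the Hessian over a sampled subset $S$ rather than the full training set is our reading of the setup being improved here:
\[
\mathcal{I}(z, z_{\mathrm{test}}) = -\nabla_{\theta} L(z_{\mathrm{test}}, \hat{\theta})^{\top} H_{\hat{\theta}}^{-1} \nabla_{\theta} L(z, \hat{\theta}),
\qquad
H_{\hat{\theta}} \approx \frac{1}{|S|} \sum_{z_i \in S} \nabla_{\theta}^{2} L(z_i, \hat{\theta}),
\]
so the quality of the estimate hinges on how representative $S$ is, which is exactly what the feature- and logit-based samplers target.
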
Abstract:Large language models (LLMs) and large multimodal models (LMMs) have achieved unprecedented breakthroughs, showcasing remarkable capabilities in natural language understanding, generation, and complex reasoning. This transformative potential has positioned them as key enablers for 6G autonomous communications among machines, vehicles, and humanoids. In this article, we provide an overview of task-oriented autonomous communications with LLMs/LMMs, focusing on multimodal sensing integration, adaptive reconfiguration, and prompt/fine-tuning strategies for wireless tasks. We demonstrate the framework through three case studies: LMM-based traffic control, LLM-based robot scheduling, and LMM-based environment-aware channel estimation. From the experimental results, we show that the proposed LLM/LMM-aided autonomous systems significantly outperform conventional and discriminative deep learning (DL) model-based techniques, maintaining robustness under dynamic objectives, varying input parameters, and heterogeneous multimodal conditions where conventional static optimization degrades.
Abstract:This paper explores an integrated sensing and communication (ISAC) network empowered by multiple active simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RISs). A base station (BS) furnishes downlink communication to multiple users while concurrently interrogating a sensing target. We jointly optimize the BS transmit beamformer and the reflection/transmission coefficients of every active STAR-RIS in order to maximize the aggregate communication sum-rate, subject to (i) a stringent sensing signal-to-interference-plus-noise ratio (SINR) requirement, (ii) an upper bound on the leakage of confidential information, and (iii) individual hardware and total power constraints at both the BS and the STAR-RISs. The resulting highly non-convex program is tackled with an efficient alternating optimization (AO) framework. First, the original problem is recast into an equivalent yet more tractable representation and partitioned into subproblems. The BS beamformer is updated in closed form via the Karush-Kuhn-Tucker (KKT) conditions, whereas the STAR-RIS reflection and transmission vectors are refined through successive convex approximation (SCA), with the resulting subproblem relaxed into a semidefinite program via semidefinite relaxation (SDR). Comprehensive simulations demonstrate that the proposed algorithm delivers substantial sum-rate gains over passive-RIS and single STAR-RIS baselines while rigorously meeting the prescribed sensing and security constraints.
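
In schematic form (our notation, not necessarily the paper's exact constraint expressions), the problem described above reads:
\[
\max_{\mathbf{W},\,\{\boldsymbol{\theta}_m\}} \ \sum_{k} \log_2\!\bigl(1 + \mathrm{SINR}_k(\mathbf{W}, \{\boldsymbol{\theta}_m\})\bigr)
\quad \text{s.t.} \quad
\mathrm{SINR}_{\mathrm{sense}} \ge \Gamma_{\mathrm{s}}, \;\;
R_{\mathrm{leak}} \le \eta, \;\;
\|\mathbf{W}\|_F^2 \le P_{\mathrm{BS}}, \;\;
P_m \le P_{\mathrm{RIS},m} \ \forall m,
\]
where $\mathbf{W}$ is the BS beamformer, $\boldsymbol{\theta}_m$ collects the reflection/transmission coefficients of the $m$-th active STAR-RIS, $\Gamma_{\mathrm{s}}$ is the sensing SINR threshold, $\eta$ bounds the information leakage, and the last two constraints cap the BS transmit power and the per-surface amplification power.
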




Abstract:Underwater acoustic (UWA) communications generally rely on cognitive radio (CR)-based ad-hoc networks due to challenges such as long propagation delay, limited channel resources, and high attenuation. To address the constraints of limited frequency resources, UWA communications have recently incorporated orthogonal frequency division multiple access (OFDMA), significantly enhancing spectral efficiency (SE) through multiplexing gains. Still, the low propagation speed of UWA signals, combined with the dynamic underwater environment, creates asynchrony in multiple access scenarios. This causes inaccurate spectrum sensing as inter-carrier interference (ICI) increases, which leads to difficulties in resource allocation. As efficient resource allocation is essential for achieving high-quality communication in OFDMA-based CR networks, these challenges degrade communication reliability in UWA systems. To resolve the issue, we propose an end-to-end sensing and resource optimization method using deep reinforcement learning (DRL) in an OFDMA-based UWA-CR network. Through extensive simulations, we confirm that the proposed method is superior to baseline schemes, outperforming other methods by 42.9% in SE and 4.4% in communication success rate.
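
As a rough illustration of the end-to-end formulation, a gym-style environment for DRL-based sensing and allocation might look like the sketch below; the state, action, and reward definitions here are our assumptions, not the paper's exact formulation.

```python
# Illustrative sketch only: a gym-style interface for a DRL agent that maps
# raw sensing observations to OFDMA resource decisions. State, action, and
# reward definitions are assumptions, not the paper's exact formulation.
import numpy as np

class UWAOFDMAEnv:
    def __init__(self, n_subcarriers: int = 32, n_power_levels: int = 4):
        self.n_subcarriers = n_subcarriers
        self.n_power_levels = n_power_levels
        self.occupancy = np.zeros(n_subcarriers, dtype=bool)

    def reset(self) -> np.ndarray:
        # State: per-subcarrier sensed energy; occupied subcarriers appear
        # hotter, but delay-induced asynchrony makes the observation noisy.
        self.occupancy = np.random.rand(self.n_subcarriers) < 0.3
        return np.random.rand(self.n_subcarriers) + 1.5 * self.occupancy

    def step(self, action):
        # Action: (subcarrier index, discrete power level) chosen by the policy.
        subcarrier, power_level = action
        collided = bool(self.occupancy[subcarrier])
        snr = 2.0 * (power_level + 1)
        # Reward: spectral efficiency on an idle subcarrier, zero on collision.
        reward = 0.0 if collided else float(np.log2(1.0 + snr))
        return self.reset(), reward, False, {}

env = UWAOFDMAEnv()
state = env.reset()
action = (int(np.argmin(state)), 2)   # greedy baseline: pick the quietest band
state, reward, done, info = env.step(action)
```
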
Abstract:User association, the problem of assigning each user device to a suitable base station, is increasingly crucial as wireless networks become denser and serve more users with diverse service demands. The joint optimization of user association and resource allocation (UARA) is a fundamental issue for future wireless networks, as it plays a pivotal role in enhancing overall network performance, user fairness, and resource efficiency. Given the latency-sensitive nature of emerging network applications, network management favors algorithms that are simple and computationally efficient rather than complex centralized approaches. Thus, distributed pricing-based strategies have gained prominence in the UARA literature, demonstrating practicality and effectiveness across various objective functions, e.g., sum-rate, proportional fairness, max-min fairness, and alpha-fairness. While the alpha-fairness framework allows for flexible adjustment between efficiency and fairness via a single parameter $\alpha$, existing works predominantly assume a homogeneous fairness context, assigning an identical $\alpha$ value to all users. Real-world networks, however, frequently require differentiated user prioritization due to varying application requirements and latency sensitivities. To bridge this gap, we propose a novel heterogeneous alpha-fairness (HAF) objective function that assigns distinct $\alpha$ values to different users, thereby providing enhanced control over the balance between throughput, fairness, and latency across the network. We present a distributed, pricing-based optimization approach utilizing an auxiliary variable framework and provide analytical proof of its convergence to an $\epsilon$-optimal solution, where the optimality gap $\epsilon$ decreases with the number of iterations.
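
Using the standard alpha-fair utility, the heterogeneous objective can be written (in our notation) as:
\[
\max_{\{x_u\}} \ \sum_{u} U_{\alpha_u}(x_u),
\qquad
U_{\alpha}(x) =
\begin{cases}
\dfrac{x^{1-\alpha}}{1-\alpha}, & \alpha \ge 0,\ \alpha \neq 1,\\
\log x, & \alpha = 1,
\end{cases}
\]
where $x_u$ is the long-term throughput of user $u$ and $\alpha_u$ its individual fairness parameter: $\alpha_u = 0$ emphasizes raw throughput for that user, $\alpha_u = 1$ corresponds to proportional fairness, and larger $\alpha_u$ pushes the allocation toward max-min behavior.
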




Abstract:As a key enabler of borderless and ubiquitous connectivity, space-air-ground-sea integrated networks (SAGSINs) are expected to be a cornerstone of 6G wireless communications. However, the multi-tiered and global-scale nature of SAGSINs also amplifies the security vulnerabilities, particularly due to the hidden, passive eavesdroppers distributed throughout the network. In this paper, we introduce a joint optimization framework for multi-hop relaying in SAGSINs that maximizes the minimum user throughput while ensuring a minimum strictly positive secure connection (SPSC) probability. We first derive a closed-form expression for the SPSC probability and incorporate this into a cross-layer optimization framework that jointly optimizes radio resources and relay routes. Specifically, we propose an $\mathcal{O}(1)$ optimal frequency allocation and power splitting strategy that divides transmit power between data transmission and cooperative jamming. We then introduce a Monte-Carlo relay routing algorithm that closely approaches the performance of the numerical upper-bound method. We validate our framework on testbeds built with a real-world dataset. All source code and data for reproducing the numerical experiments will be open-sourced.
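
For context, the SPSC probability of a single link is defined in the usual way as the probability that the secrecy capacity is positive; the paper's closed form additionally accounts for the SAGSIN channel statistics and the data/jamming power split, which are omitted here:
\[
P_{\mathrm{SPSC}} = \Pr\{C_s > 0\}
= \Pr\bigl\{\log_2(1+\gamma_D) - \log_2(1+\gamma_E) > 0\bigr\}
= \Pr\{\gamma_D > \gamma_E\},
\]
where $\gamma_D$ and $\gamma_E$ denote the instantaneous SNRs at the legitimate receiver and the eavesdropper, respectively.
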




Abstract:With the recent advancements in deep learning, semantic communication, which transmits only task-oriented features, has rapidly emerged. However, since feature extraction relies on learning-based models, its performance fundamentally depends on the training dataset or task. For practical scenarios, it is essential to design a model that demonstrates robust performance regardless of the dataset or task. In this correspondence, we propose a novel text transmission model that selects and transmits only a few characters and recovers the missing characters at the receiver using a large language model (LLM). Additionally, we propose a novel importance character extractor (ICE), which selects the transmitted characters so as to enhance LLM recovery performance. Simulations demonstrate that the proposed filter selection by ICE outperforms random filter selection, in which the transmitted characters are chosen at random. Moreover, the proposed model exhibits robust performance across different datasets and tasks and outperforms traditional bit-based communication in low signal-to-noise ratio conditions.
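
A minimal sketch of the transmit-and-recover pipeline is given below; the heuristic importance score, the masking format, and the llm_complete stub are illustrative assumptions standing in for the learned ICE and the actual LLM call.

```python
# Illustrative sketch only: heuristic character selection plus LLM-based
# recovery. The scoring rule and prompt are assumptions, not the paper's ICE.
def select_characters(text: str, keep_ratio: float = 0.4) -> str:
    # Toy importance score: favor word-initial and uppercase characters.
    scores = []
    for i, ch in enumerate(text):
        boundary = (i == 0) or text[i - 1] == " "
        scores.append((2.0 if boundary else 1.0) + (0.5 if ch.isupper() else 0.0))
    k = max(1, int(len(text) * keep_ratio))
    keep = set(sorted(range(len(text)), key=lambda i: -scores[i])[:k])
    # Unselected characters are replaced by '_' and never transmitted.
    return "".join(ch if i in keep else "_" for i, ch in enumerate(text))

def llm_complete(prompt: str) -> str:
    # Placeholder: plug in any instruction-tuned LLM client here.
    raise NotImplementedError

def recover(masked: str) -> str:
    prompt = ("Reconstruct the original sentence. Underscores mark missing "
              f"characters:\n{masked}")
    return llm_complete(prompt)

masked = select_characters("semantic communication with large language models")
print(masked)  # roughly 60% of the characters are dropped before transmission
```
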
Abstract:Deep learning (DL) has made notable progress in addressing complex radio access network control challenges that conventional analytic methods have struggled to solve. However, DL has shown limitations in solving constrained NP-hard problems often encountered in network optimization, such as those involving quality of service (QoS) requirements or discrete variables like user indices. Current solutions rely on domain-specific architectures or heuristic techniques, and a general DL approach for constrained optimization remains undeveloped. Moreover, even minor changes in communication objectives demand time-consuming retraining, limiting their adaptability to dynamic environments where task objectives, constraints, environmental factors, and communication scenarios frequently change. To address these challenges, we propose the large language model resource allocation optimizer (LLM-RAO), a novel approach that harnesses the capabilities of LLMs to address the complex resource allocation problem while adhering to QoS constraints. By employing a prompt-based tuning strategy to flexibly convey ever-changing task descriptions and requirements to the LLM, LLM-RAO demonstrates robust performance and seamless adaptability in dynamic environments without requiring extensive retraining. Simulation results reveal that LLM-RAO achieves up to a 40% performance enhancement compared to conventional DL methods and up to an 80% improvement over analytical approaches. Moreover, in scenarios with fluctuating communication objectives, LLM-RAO attains up to 2.9 times the performance of traditional DL-based networks.
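
The prompt-based interface can be sketched as follows; the prompt wording, the JSON reply format, and the query_llm stub are our assumptions rather than the paper's exact design.

```python
# Illustrative sketch in the spirit of a prompt-driven resource allocator.
# The prompt wording, JSON schema, and `query_llm` stub are assumptions.
import json

def build_prompt(channel_gains, qos_min_rates, total_power):
    return (
        "You allocate downlink transmit power to users.\n"
        f"Channel gains: {channel_gains}\n"
        f"Minimum rate per user (bps/Hz): {qos_min_rates}\n"
        f"Total power budget: {total_power} W\n"
        "Objective: maximize the sum rate subject to the per-user QoS "
        "constraints and the power budget.\n"
        'Reply with JSON only, e.g. {"power": [p1, p2, ...]}.'
    )

def query_llm(prompt: str) -> str:
    # Placeholder for any chat-completion API call; not part of the paper.
    raise NotImplementedError

def allocate(channel_gains, qos_min_rates, total_power):
    reply = query_llm(build_prompt(channel_gains, qos_min_rates, total_power))
    powers = json.loads(reply)["power"]
    assert sum(powers) <= total_power + 1e-6   # sanity-check the budget
    return powers
```

Changing objectives or constraints then amounts to editing the prompt rather than retraining a model, which is the adaptability mechanism the abstract highlights.
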




Abstract:As 6G and beyond networks grow increasingly complex and interconnected, federated learning (FL) emerges as an indispensable paradigm for securely and efficiently leveraging decentralized edge data for AI. By virtue of the superposition property of communication signals, over-the-air FL (OtA-FL) achieves constant communication overhead irrespective of the number of edge devices (EDs). However, training neural networks over the air still incurs substantial communication costs, as the number of transmitted symbols equals the number of trainable parameters. To alleviate this issue, the most straightforward approach is to reduce the number of transmitted symbols by 1) gradient compression and 2) gradient sparsification. Unfortunately, these methods are incompatible with OtA-FL due to the loss of its superposition property. In this work, we introduce federated zeroth-order estimation (Fed-ZOE), an efficient framework inspired by the randomized gradient estimator (RGE) commonly used in zeroth-order optimization (ZOO). In Fed-ZOE, EDs perform local weight updates as in standard FL, but instead of transmitting the full local model update vectors, they send compressed representations in the form of several scalar-valued inner products between the local model update vectors and random vectors. These scalar values enable the parameter server (PS) to reconstruct the gradient using the RGE trick at greatly reduced overhead while preserving the superposition property. Unlike conventional ZOO, which leverages RGE for step-wise gradient descent, Fed-ZOE compresses the local model update vectors only before transmission, thereby achieving higher accuracy and computational efficiency. Numerical evaluations using ResNet-18 on datasets such as CIFAR-10, TinyImageNet, SVHN, CIFAR-100, and Brain-CT demonstrate that Fed-ZOE achieves performance comparable to OtA-FL while drastically reducing communication costs.
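
The inner-product compression and RGE-style reconstruction can be sketched as follows; this is a single-device view, and the shared-seed probe generation and shapes below are assumptions about the scheme.

```python
# Sketch of RGE-style update compression (single-device view; shared-seed
# probe generation and the shapes below are assumptions about the scheme).
import numpy as np

def compress_update(delta_w: np.ndarray, num_probes: int, seed: int) -> np.ndarray:
    """ED side: project the local model update onto shared random directions."""
    rng = np.random.default_rng(seed)                    # seed shared with the PS
    probes = rng.standard_normal((num_probes, delta_w.size))
    return probes @ delta_w                              # K scalar inner products

def reconstruct_update(scalars: np.ndarray, dim: int, seed: int) -> np.ndarray:
    """PS side: unbiased estimate of the update from the K received scalars."""
    rng = np.random.default_rng(seed)
    probes = rng.standard_normal((len(scalars), dim))
    # E[<dw, v> v] = dw for v ~ N(0, I), so averaging the projections is unbiased.
    return probes.T @ scalars / len(scalars)

dim, num_probes = 10_000, 256
delta_w = np.random.randn(dim)
scalars = compress_update(delta_w, num_probes, seed=42)  # only 256 symbols sent
estimate = reconstruct_update(scalars, dim, seed=42)
print(np.corrcoef(delta_w, estimate)[0, 1])              # positive but noisy for K << dim
```

Because every device projects onto the same probe directions, the over-the-air sum of the transmitted scalars equals the projection of the summed updates, which is how the superposition property is preserved.
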