In this paper, a semantic-aware joint communication and computation resource allocation framework is proposed for mobile edge computing (MEC) systems. In the considered system, each terminal device (TD) has a computation task that must be executed by offloading it to the MEC server. To further reduce the transmission burden, each TD sends the small-size semantic information extracted from its task to the server instead of the large-size raw data. A joint optimization problem of the semantic-aware division factor, communication, and computation resource management is formulated. The problem aims to minimize the maximum execution delay over all TDs while satisfying energy consumption constraints. The original non-convex problem is transformed into a convex one via geometric programming, and the optimal solution is obtained by an alternating optimization algorithm. Moreover, a closed-form optimal solution for the semantic extraction factor is derived. Simulation results show that the proposed algorithm yields up to 37.10% delay reduction compared with a benchmark algorithm without semantic-aware allocation. Furthermore, small semantic extraction factors are preferred for large task sizes and poor channel conditions.
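As a rough illustration of the delay model above, the following sketch (hypothetical variable names, a toy coordinate-descent heuristic standing in for the paper's geometric-programming-based alternating optimization and closed-form extraction factor) shows how each TD's delay splits into uplink transmission of the extracted data and server-side computation, and how the maximum delay can be reduced by re-splitting the server CPU budget and shrinking the extraction factor of the slowest TD.

```python
import numpy as np

def delay(rho, D, rate, cycles_per_bit, f):
    """Delay of one TD: transmit rho*D bits, then process them at the server."""
    t_tx = rho * D / rate                  # uplink transmission delay
    t_cmp = rho * D * cycles_per_bit / f   # server computation delay
    return t_tx + t_cmp

def min_max_delay(D, rate, cycles_per_bit, F_total, rho, iters=50):
    """Toy coordinate-descent stand-in for the paper's alternating optimization."""
    for _ in range(iters):
        # computation step: split the server CPU budget in proportion to each
        # TD's offloaded load, which equalizes the computation delays
        load = rho * D * cycles_per_bit
        f = F_total * load / load.sum()
        d = delay(rho, D, rate, cycles_per_bit, f)
        # semantic step: shrink the extraction factor of the slowest TD
        # (a heuristic placeholder for the paper's closed-form optimal factor)
        k = int(np.argmax(d))
        rho[k] = max(0.1, rho[k] * 0.95)
    return d.max(), rho, f

# Example with hypothetical numbers: 3 TDs, 2 Mbit tasks, 1-3 Mbit/s rates.
D = np.array([2e6, 2e6, 2e6])
rate = np.array([1e6, 2e6, 3e6])
rho = np.array([1.0, 1.0, 1.0])
print(min_max_delay(D, rate, cycles_per_bit=np.array([100.0] * 3),
                    F_total=3e9, rho=rho))
```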
In this paper, the problem of semantic information extraction for resource-constrained text data transmission is studied. In the considered model, a sequence of text data needs to be transmitted within a communication-resource-constrained network that only allows limited data transmission. Thus, at the transmitter, semantic information is first extracted from the original text data using natural language processing techniques and is then captured in a knowledge graph. An additional probability dimension is introduced in this graph to capture the importance of each piece of information. This semantic information extraction problem is posed within an optimization framework whose goal is to extract the most important semantic information for transmission. To find an optimal solution to this problem, a solution based on Floyd's algorithm, coupled with an efficient sorting mechanism, is proposed. Numerical results demonstrate the effectiveness of the proposed algorithm with regard to two novel performance metrics: semantic uncertainty and semantic similarity.
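A minimal sketch of the selection step, under assumed data structures (this illustrates the general idea, not the authors' exact algorithm): Floyd-Warshall computes all-pairs path costs over the knowledge graph, for example semantic distances between concepts, and a simple sort then keeps the most important triples that fit the transmission budget.

```python
import numpy as np

def floyd_warshall(cost):
    """All-pairs shortest-path costs for a dense pairwise cost matrix."""
    d = cost.copy()
    n = d.shape[0]
    for k in range(n):
        # relax every pair (i, j) through intermediate node k
        d = np.minimum(d, d[:, k:k + 1] + d[k:k + 1, :])
    return d

def select_triples(triples, budget):
    """triples: list of (head, relation, tail, importance, size_in_bits)."""
    ranked = sorted(triples, key=lambda t: t[3], reverse=True)  # most important first
    chosen, used = [], 0
    for t in ranked:
        if used + t[4] <= budget:
            chosen.append(t)
            used += t[4]
    return chosen
```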
Timely data collection and processing is crucial for mobile crowd integrated sensing, communication, and computation~(ISCC) systems, which support applications such as smart homes and connected cars and require numerous integrated sensing and communication~(ISAC) devices to sense targets and offload the data to the base station~(BS) for further processing. However, as the number of ISAC devices grows, the devices interact intensively during data collection and processing because they share common network resources. In this paper, we consider the environment sensing problem in large-scale mobile crowd ISCC systems and propose an efficient waveform precoding design algorithm based on the mean field game~(MFG). Specifically, to handle the complex interactions among large-scale ISAC devices, we first use the MFG method to transform the influence of other ISAC devices into a mean field term and derive the Fokker-Planck-Kolmogorov equation, which models the evolution of the system state. Then, we derive the cost function based on the mean field term and reformulate the waveform precoding design problem. Next, we use the G-prox primal-dual hybrid gradient algorithm to solve the reformulated problem and analyze the computational complexity of the proposed algorithm. Finally, simulation results demonstrate that the proposed algorithm effectively handles the interactions among large-scale ISAC devices in the ISCC process. In addition, compared with other baselines, the proposed waveform precoding design algorithm improves communication performance and reduces the cost function.
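For reference, a generic form of the Fokker-Planck-Kolmogorov equation used in mean-field-game models (illustrative notation, not the paper's): with $m(t,s)$ the distribution of device states $s$, $a(t,s)$ the control-induced drift, and $\sigma$ the diffusion coefficient,
\[
\frac{\partial m(t,s)}{\partial t}
  + \nabla_{s} \cdot \bigl( m(t,s)\, a(t,s) \bigr)
  - \frac{\sigma^{2}}{2} \Delta_{s} m(t,s) = 0 .
\]
The mean field term entering each device's cost is then a functional of $m(t,s)$ rather than of every other device's individual state, which is what removes the pairwise interactions.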
Large language models (LLMs) have revolutionized natural language processing tasks. However, their practical deployment is hindered by their immense memory and computation requirements. Although recent post-training quantization (PTQ) methods are effective in reducing memory footprint and improving the computational efficiency of LLMs, they rely on hand-crafted quantization parameters, which leads to low performance and fails to handle extremely low-bit quantization. To tackle this issue, we introduce an Omnidirectionally calibrated Quantization (OmniQuant) technique for LLMs, which achieves good performance in diverse quantization settings while maintaining the computational efficiency of PTQ by efficiently optimizing various quantization parameters. OmniQuant comprises two innovative components: Learnable Weight Clipping (LWC) and Learnable Equivalent Transformation (LET). LWC modulates the extreme values of weights by optimizing the clipping threshold. Meanwhile, LET tackles activation outliers by shifting the challenge of quantization from activations to weights through a learnable equivalent transformation. Operating within a differentiable framework using block-wise error minimization, OmniQuant can optimize the quantization process efficiently for both weight-only and weight-activation quantization. For instance, the LLaMA-2 model family, with sizes of 7-70B, can be processed with OmniQuant on a single A100-40G GPU within 1-16 hours using 128 samples. Extensive experiments validate OmniQuant's superior performance across diverse quantization configurations such as W4A4, W6A6, W4A16, W3A16, and W2A16. Additionally, OmniQuant demonstrates effectiveness in instruction-tuned models and delivers notable improvements in inference speed and memory reduction on real devices. Code and models are available at \url{https://github.com/OpenGVLab/OmniQuant}.
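A minimal PyTorch sketch of learnable weight clipping in the spirit of LWC (the parameterization below is an assumption for illustration, not OmniQuant's exact implementation): sigmoid-activated scalars shrink the min/max range used for uniform quantization, and a straight-through estimator lets gradients reach those scalars so they can be optimized by block-wise error minimization.

```python
import torch

def round_ste(x):
    # straight-through rounding: forward = round, backward = identity
    return (torch.round(x) - x).detach() + x

def lwc_fake_quant(w, gamma_raw, beta_raw, n_bits=4):
    """Fake-quantize weights with learnable clipping strengths (illustrative)."""
    gamma = torch.sigmoid(gamma_raw)                  # learnable upper-clip strength
    beta = torch.sigmoid(beta_raw)                    # learnable lower-clip strength
    w_max = gamma * w.max()
    w_min = beta * w.min()
    step = (w_max - w_min).clamp(min=1e-8) / (2 ** n_bits - 1)
    zero = round_ste(-w_min / step)
    q = torch.clamp(round_ste(w / step) + zero, 0, 2 ** n_bits - 1)
    return (q - zero) * step                          # de-quantized weights

# Toy block-wise error minimization step on a random weight matrix.
w = torch.randn(128, 128)
gamma_raw = torch.zeros(1, requires_grad=True)
beta_raw = torch.zeros(1, requires_grad=True)
w_hat = lwc_fake_quant(w, gamma_raw, beta_raw, n_bits=4)
loss = (w_hat - w).pow(2).mean()                      # reconstruction error
loss.backward()                                       # gradients reach gamma_raw, beta_raw
```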
Recently, big artificial intelligence (AI) models represented by ChatGPT have brought about an incredible revolution. With a pre-trained big AI model (BAIM) in a given field, numerous downstream tasks can be accomplished with only few-shot or even zero-shot learning while exhibiting state-of-the-art performance. As widely envisioned, big AI models are expected to rapidly penetrate major intelligent services and applications, and to run at low unit cost and high flexibility. In 6G wireless networks, to fully enable intelligent communication, sensing, and computing, apart from providing other intelligent wireless services and applications, it is of vital importance to design and deploy certain wireless BAIMs (wBAIMs). However, investigations into architecture design and system evaluation for wBAIMs are still lacking. In this paper, we provide a comprehensive discussion as well as some in-depth prospects on the demand, design, and deployment aspects of wBAIMs. We argue that wBAIMs will be a key ingredient of 6G wireless networks for building highly efficient, sustainable, versatile, and extensible wireless intelligence that supports numerous promising visions. Then, we present the core characteristics and principles that guide the design of wBAIMs and discuss the key aspects of developing wBAIMs by identifying the differences between existing BAIMs and the emerging wBAIMs. Finally, related research directions and potential solutions are outlined.
Self-supervised contrastive learning (SSCL) has achieved significant milestones in remote sensing image (RSI) understanding. Its essence lies in designing an unsupervised instance discrimination pretext task to extract, from a large number of unlabeled images, image features that are beneficial for downstream tasks. However, existing instance-discrimination-based SSCL suffers from two limitations when applied to the RSI semantic segmentation task: 1) the positive sample confounding issue; and 2) a feature adaptation bias, which arises because semantic segmentation requires pixel-level or object-level rather than image-level features. In this study, we observe that the discrimination information can be mapped to specific regions in an RSI through the gradient of the unsupervised contrastive loss, and these regions tend to contain singular ground objects. Based on this, we propose contrastive learning with a Gradient-guided Sampling Strategy (GraSS) for RSI semantic segmentation. GraSS consists of two stages: Instance Discrimination warm-up (ID warm-up) and Gradient-guided Sampling contrastive training (GS training). The ID warm-up aims to provide initial discrimination information to the contrastive loss gradients. The GS training stage utilizes the discrimination information contained in the contrastive loss gradients and adaptively selects regions in RSI patches that contain more singular ground objects, in order to construct new positive and negative samples. Experimental results on three open datasets demonstrate that GraSS effectively enhances the performance of SSCL in high-resolution RSI semantic segmentation. Compared with seven baseline methods from five different types of SSCL, GraSS achieves an average improvement of 1.57\% and a maximum improvement of 3.58\% in terms of mean intersection over union. The source code is available at https://github.com/GeoX-Lab/GraSS
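A minimal PyTorch sketch (illustrative names, not the authors' implementation) of the gradient-guided idea: the magnitude of the contrastive-loss gradient with respect to an input patch serves as a saliency map, and the highest-scoring pixels define the regions from which new positive and negative samples could be drawn.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.2):
    """Standard InfoNCE loss between two batches of embeddings."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau
    labels = torch.arange(z1.size(0))
    return F.cross_entropy(logits, labels)

def gradient_guided_mask(encoder, x1, x2, keep_ratio=0.25):
    """Boolean per-pixel mask marking the most discrimination-relevant regions."""
    x1 = x1.clone().requires_grad_(True)
    loss = info_nce(encoder(x1), encoder(x2))
    grad, = torch.autograd.grad(loss, x1)
    saliency = grad.abs().sum(dim=1)                             # (B, H, W) map
    thresh = torch.quantile(saliency.flatten(1), 1 - keep_ratio, dim=1)
    return saliency >= thresh.view(-1, 1, 1)                     # (B, H, W) bool

# Toy usage with a tiny stand-in encoder.
encoder = torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3, padding=1),
                              torch.nn.AdaptiveAvgPool2d(1),
                              torch.nn.Flatten())
x1, x2 = torch.randn(4, 3, 64, 64), torch.randn(4, 3, 64, 64)
mask = gradient_guided_mask(encoder, x1, x2)                     # (4, 64, 64)
```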
Adaptive rate control for deep joint source and channel coding (JSCC) is considered an effective approach to transmit sufficient information in scenarios with limited communication resources. We propose a deep JSCC scheme for wireless image transmission with entropy-aware adaptive rate control, using a single deep neural network to support multiple rates and automatically adjust the rate based on the feature maps of the input image, their entropy, and the channel condition. In particular, we maximize the entropy of the feature maps during training to increase the average information carried by each symbol transmitted over the channel. We further decide which feature maps should be activated based on their entropy, which improves the efficiency of the transmitted symbols. We also propose a pruning module that removes less important pixels in the activated feature maps to further improve transmission efficiency. The experimental results demonstrate that our proposed scheme learns an effective rate control strategy that reduces the required channel bandwidth while preserving the quality of the received images.
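A minimal sketch of entropy-aware feature-map selection under assumed interfaces (the histogram-based entropy estimator and all names are illustrative, not the paper's exact mechanism): estimate each feature map's entropy from a histogram of its values, then keep only the highest-entropy maps that fit the current bandwidth budget.

```python
import torch

def feature_map_entropy(fmap, n_bins=16):
    """fmap: (C, H, W) -> per-channel entropy in bits, via value histograms."""
    flat = fmap.view(fmap.size(0), -1)
    ent = []
    for c in range(flat.size(0)):
        hist = torch.histc(flat[c], bins=n_bins)
        p = hist / hist.sum()
        p = p[p > 0]
        ent.append(-(p * p.log2()).sum())
    return torch.stack(ent)

def select_active_maps(fmap, max_maps):
    """Zero out all but the max_maps highest-entropy channels."""
    ent = feature_map_entropy(fmap)
    keep = ent.argsort(descending=True)[:max_maps]
    mask = torch.zeros(fmap.size(0), dtype=torch.bool)
    mask[keep] = True
    return fmap * mask.view(-1, 1, 1), mask

fmap = torch.randn(32, 16, 16)                  # 32 candidate feature maps
active, mask = select_active_maps(fmap, max_maps=8)
```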
Precoding design for the downlink of multiuser multiple-input multiple-output (MU-MIMO) systems is a fundamental problem. In this paper, we aim to maximize the weighted sum rate (WSR) while satisfying both per-user quality-of-service (QoS) constraints and per-antenna power constraints (PAPCs) in the downlink MU-MIMO system. To solve the problem, we reformulate it into an equivalent problem using the well-known weighted minimum mean square error (WMMSE) framework, which can be tackled by iteratively solving three subproblems. Since the precoding matrices are coupled across the QoS constraints and PAPCs, we adopt the alternating direction method of multipliers (ADMM) to obtain a distributed solution. Simulation results validate the effectiveness of the proposed algorithm.
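For reference, the standard WMMSE equivalence that underlies such reformulations (illustrative notation, omitting the QoS constraints and PAPCs): with receive filters $\mathbf{U}_k$, weight matrices $\mathbf{W}_k \succeq 0$, precoders $\mathbf{V}_k$, and user priorities $\alpha_k$, WSR maximization is equivalent to
\[
\min_{\{\mathbf{U}_k,\mathbf{W}_k,\mathbf{V}_k\}}
  \sum_{k} \alpha_k \Bigl( \operatorname{Tr}\bigl(\mathbf{W}_k \mathbf{E}_k\bigr) - \log\det \mathbf{W}_k \Bigr),
\]
where the MSE matrix is
\[
\mathbf{E}_k = \bigl(\mathbf{I}-\mathbf{U}_k^{H}\mathbf{H}_k\mathbf{V}_k\bigr)\bigl(\mathbf{I}-\mathbf{U}_k^{H}\mathbf{H}_k\mathbf{V}_k\bigr)^{H}
  + \sum_{j\neq k}\mathbf{U}_k^{H}\mathbf{H}_k\mathbf{V}_j\mathbf{V}_j^{H}\mathbf{H}_k^{H}\mathbf{U}_k
  + \sigma^{2}\mathbf{U}_k^{H}\mathbf{U}_k ,
\]
and the three subproblems correspond to alternating updates of $\mathbf{U}_k$, $\mathbf{W}_k$, and $\mathbf{V}_k$ subject to the power constraints.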
In conventional distributed learning over a network, multiple agents collaboratively build a common machine learning model. However, due to the underlying non-i.i.d. data distribution among agents, a single unified learning model can be inefficient for individual agents when processing their locally accessible data. To address this problem, we propose a graph-attention-based personalized training algorithm (GATTA) for distributed deep learning. GATTA enables each agent to train its local personalized model while exploiting its correlation with neighboring nodes and utilizing their useful information for aggregation. In particular, the personalized model in each agent is composed of a global part and a node-specific part. By treating each agent as a node in a graph and its node-specific parameters as node features, GATTA inherits the benefits of the graph attention mechanism. Namely, instead of aggregation based on averaging, it learns specific weights for different neighboring nodes without requiring prior knowledge about the graph structure or the neighboring nodes' data distribution. Furthermore, relying on the weight-learning procedure, we develop a communication-efficient GATTA by skipping the transmission of information with small aggregation weights. Additionally, we theoretically analyze the convergence properties of GATTA for non-convex loss functions. Numerical results validate the excellent performance of the proposed algorithms in terms of convergence and communication cost.
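A minimal PyTorch sketch (illustrative names, not the paper's exact architecture) of attention-weighted aggregation of node-specific parameters: each node scores its neighbors' parameter vectors against its own, softmax-normalizes the scores, and mixes the neighbors' parameters accordingly instead of plain averaging; neighbors whose weight falls below a threshold can skip transmission, mirroring the communication-efficient variant.

```python
import torch
import torch.nn.functional as F

def attention_aggregate(own, neighbours, att_vec, skip_below=0.05):
    """own: (d,) local node-specific parameters; neighbours: (N, d); att_vec: (2d,)."""
    pairs = torch.cat([own.expand_as(neighbours), neighbours], dim=1)   # (N, 2d)
    scores = F.leaky_relu(pairs @ att_vec)                              # (N,)
    alpha = torch.softmax(scores, dim=0)
    active = alpha >= skip_below               # neighbours worth receiving from
    alpha = alpha * active
    alpha = alpha / alpha.sum().clamp(min=1e-8)
    return alpha @ neighbours                  # aggregated node-specific part

own = torch.randn(16)
neighbours = torch.randn(5, 16)
att_vec = torch.randn(32, requires_grad=True)  # learnable attention vector
agg = attention_aggregate(own, neighbours, att_vec)
```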