Abstract: Diffusion models (DMs) have achieved remarkable success across various domains owing to their strong generative and denoising capabilities. Meanwhile, semantic communication based on neural joint source-channel coding (JSCC) has emerged as a promising paradigm for robust and efficient image transmission. However, severe channel noise can still distort the transmitted semantic symbols, resulting in significant performance degradation. Applying DMs to digital semantic symbols, particularly in vector quantization (VQ)-based systems, is fundamentally challenging because the Markov assumption does not hold for the symbol transition dynamics. To address this issue, we introduce SSCDM, a semantic symbol correcting diffusion model whose discrete-time transition dynamics are constructed using solutions from continuous-time Markov chain theory. Furthermore, to promote synergy between DMs and JSCC, our DM structure embeds discrete symbols into a latent feature space using a learned VQ codebook, and a self-organizing map-based loss is incorporated during codebook learning to bring neighboring digital symbols geometrically closer, thereby promoting topology-preserving semantic representations. Experimental results show that the proposed method significantly improves image reconstruction quality and outperforms previous symbol-level denoising techniques in low signal-to-noise ratio scenarios across different datasets.
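As a rough illustration of how discrete-time transition matrices can be read off a continuous-time Markov chain solution, the sketch below uses the simplest possible case: a uniform rate matrix Q = rate * (J/K - I) over K symbols, whose solution P(t) = exp(tQ) has a well-known closed form. The uniform rate matrix, the codebook size `K`, the value of `RATE`, and the function names are all assumptions made for illustration; the abstract does not specify the rate matrix that SSCDM actually uses.

```python
import math


def uniform_ctmc_transition(num_symbols, rate, t):
    """Closed-form P(t) = exp(tQ) for the uniform rate matrix Q = rate * (J/K - I)."""
    stay = math.exp(-rate * t)           # probability that no resampling event occurred by time t
    spread = (1.0 - stay) / num_symbols  # remaining mass, spread uniformly over all K symbols
    return [
        [stay * (i == j) + spread for j in range(num_symbols)]
        for i in range(num_symbols)
    ]


def matmul(a, b):
    """Plain square-matrix product, used only to check the semigroup property."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


# Hypothetical codebook size and jump rate (not taken from the paper).
K, RATE = 4, 0.7

# Discrete-time steps are obtained by sampling the continuous-time solution.
P_01 = uniform_ctmc_transition(K, RATE, 0.1)
P_02 = uniform_ctmc_transition(K, RATE, 0.2)
P_03 = uniform_ctmc_transition(K, RATE, 0.3)
```

Because the matrices come from one continuous-time chain, composing the steps of length 0.1 and 0.2 reproduces the step of length 0.3 (the Chapman-Kolmogorov property), which is what makes such a family usable as forward transition dynamics for a discrete diffusion process.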
Abstract: Recent advances in deep learning (DL)-based joint source-channel coding (JSCC) have enabled efficient semantic communication in dynamic wireless environments. Among these approaches, vector quantization (VQ)-based JSCC effectively maps high-dimensional semantic feature vectors into compact codeword indices for digital modulation. However, existing methods, including universal JSCC (uJSCC), rely on fixed, modulation-specific encoders, decoders, and codebooks, limiting adaptability to fine-grained signal-to-noise ratio (SNR) variations. We propose an extended universal JSCC (euJSCC) framework that achieves SNR- and modulation-adaptive transmission within a single model. euJSCC employs a hypernetwork-based normalization layer for fine-grained feature vector normalization and a dynamic codebook generation (DCG) network that refines modulation-specific base codebooks according to block-wise SNR. To handle block fading channels, which consist of multiple coherence blocks, an inner-outer encoder-decoder architecture is adopted, where the outer encoder and decoder capture long-term channel statistics, and the inner encoder and decoder refine feature vectors to align with block-wise codebooks. A two-phase training strategy, i.e., pretraining on additive white Gaussian noise (AWGN) channels followed by finetuning on block fading channels, ensures stable convergence. Experiments on image transmission demonstrate that euJSCC consistently outperforms state-of-the-art channel-adaptive digital JSCC schemes under both block fading and AWGN channels.
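The VQ step mentioned above, mapping feature vectors to codeword indices that can then be carried by digital modulation, can be sketched as a nearest-codeword search. This is a minimal sketch under stated assumptions: the 4-entry codebook, the example feature vectors, and the function names are hypothetical, and the sketch does not model the hypernetwork-based normalization or the DCG network.

```python
def quantize(features, codebook):
    """Return, for each feature vector, the index of its nearest codeword
    under squared Euclidean distance."""
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    return [
        min(range(len(codebook)), key=lambda k: sq_dist(f, codebook[k]))
        for f in features
    ]


def index_to_bits(index, num_bits):
    """Most-significant-bit-first binary label of a codeword index."""
    return [(index >> shift) & 1 for shift in range(num_bits - 1, -1, -1)]


# Hypothetical 4-entry codebook: each index fits in 2 bits,
# i.e. one symbol of a 4-ary modulation such as QPSK.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
features = [[0.9, 0.1], [0.2, 0.8], [0.1, -0.2]]

indices = quantize(features, codebook)
bits = [index_to_bits(i, 2) for i in indices]
```

With a codebook of size M, each index occupies log2(M) bits, which is why the codebook size is naturally tied to the modulation order.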




Abstract: From the perspective of joint source-channel coding (JSCC), there has been significant research on utilizing semantic communication, which inherently possesses analog characteristics, within digital device environments. However, a single-model approach that operates modulation-agnostically across various digital modulation orders has not yet been established. This article presents the first attempt at such an approach by proposing a universal joint source-channel coding (uJSCC) system that utilizes a single-model encoder-decoder pair and trained vector quantization (VQ) codebooks. To support various modulation orders within a single model, every neural network (NN)-based module in the uJSCC system operates according to a modulation order selected from signal-to-noise ratio (SNR) boundaries. To address the challenge of unequal output statistics from shared parameters across NN layers, we integrate multiple batch normalization (BN) layers, selected based on modulation order, after each NN layer. This integration has minimal impact on the overall model size. Through a comprehensive series of experiments, we validate that this modulation-agnostic semantic communication framework outperforms existing digital semantic communication approaches in terms of model complexity, communication efficiency, and task effectiveness.
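The two mechanisms described above, choosing a modulation order from SNR boundaries and then selecting the matching normalization parameters after a shared layer, can be sketched as follows. This is only an illustrative sketch: the boundary values, the list of modulation orders, the per-order statistics, and the function names are hypothetical, and the normalization is shown in inference mode with fixed statistics rather than as a trainable BN layer.

```python
from bisect import bisect_right
import math

# Hypothetical SNR boundaries (dB) and the modulation orders they separate.
SNR_BOUNDARIES_DB = [5.0, 12.0, 20.0]
MODULATION_ORDERS = [2, 4, 16, 64]   # e.g. BPSK, QPSK, 16-QAM, 64-QAM

# One set of normalization parameters per modulation order (hypothetical values).
# The layer weights are shared; only these small parameter sets are duplicated.
NORM_PARAMS = {
    2:  {"mean": 0.0, "var": 1.0, "gamma": 1.0, "beta": 0.0},
    4:  {"mean": 0.5, "var": 4.0, "gamma": 1.0, "beta": 0.0},
    16: {"mean": 0.2, "var": 2.0, "gamma": 1.5, "beta": 0.1},
    64: {"mean": 0.1, "var": 0.5, "gamma": 0.8, "beta": 0.0},
}
EPS = 1e-5


def select_modulation_order(snr_db):
    """Pick the modulation order whose SNR interval contains snr_db."""
    return MODULATION_ORDERS[bisect_right(SNR_BOUNDARIES_DB, snr_db)]


def switchable_norm(values, order):
    """Normalize shared-layer outputs with the parameters tied to `order`."""
    p = NORM_PARAMS[order]
    scale = p["gamma"] / math.sqrt(p["var"] + EPS)
    return [scale * (v - p["mean"]) + p["beta"] for v in values]


order = select_modulation_order(8.3)           # falls between 5 dB and 12 dB
normalized = switchable_norm([0.5, 2.5], order)
```

Since each normalization branch holds only a few parameters per channel, duplicating it per modulation order adds little to the model size, which is consistent with the abstract's claim of minimal impact.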