Abstract:Kolmogorov-Arnold Networks (KANs) have recently emerged as a powerful architecture for various machine learning applications. However, their unique structure raises significant concerns regarding their computational overhead. Existing studies primarily evaluate KAN complexity in terms of Floating-Point Operations (FLOPs) required for GPU-based training and inference. However, in many latency-sensitive and power-constrained deployment scenarios, such as neural network-driven non-linearity mitigation in optical communications or channel state estimation in wireless communications, training is performed offline and dedicated hardware accelerators are preferred over GPUs for inference. Recent hardware implementation studies report KAN complexity using platform-specific resource consumption metrics, such as Look-Up Tables, Flip-Flops, and Block RAMs. However, these metrics require a full hardware design and synthesis stage that limits their utility for early-stage architectural decisions and cross-platform comparisons. To address this, we derive generalized, platform-independent formulae for evaluating the hardware inference complexity of KANs in terms of Real Multiplications (RM), Bit Operations (BOP), and Number of Additions and Bit-Shifts (NABS). We extend our analysis across multiple KAN variants, including B-spline, Gaussian Radial Basis Function (GRBF), Chebyshev, and Fourier KANs. The proposed metrics can be computed directly from the network structure and enable a fair and straightforward inference complexity comparison between KAN and other neural network architectures.
Abstract:Diffusion Model (DM) based Semantic Image Communication (SIC) systems face significant challenges, such as slow inference speed and generation randomness, that limit their reliability and practicality. To overcome these issues, we propose a novel SIC framework inspired by Stable Cascade, where extremely compact latent image embeddings are used as conditioning to the diffusion process. Our approach drastically reduces the data transmission overhead, compressing the transmitted embedding to just 0.29% of the original image size. It outperforms three benchmark approaches - the diffusion SIC model conditioned on segmentation maps (GESCO), the recent Stable Diffusion (SD)-based SIC framework (Img2Img-SC), and the conventional JPEG2000 + LDPC coding - by achieving superior reconstruction quality under noisy channel conditions, as validated across multiple metrics. Notably, it also delivers significant computational efficiency, enabling over 3x faster reconstruction for 512 x 512 images and more than 16x faster for 1024 x 1024 images as compared to the approach adopted in Img2Img-SC.