Abstract:Learning-based semantic communication (SemCom) has recently emerged as a promising paradigm for improving the transmission efficiency of wireless networks. However, existing methods typically rely on extensive end-to-end training, which is both inflexible and computationally expensive in dynamic wireless environments. Moreover, they fail to exploit redundancy across multiple transmissions of semantically similar content, limiting overall efficiency. To overcome these limitations, we propose a channel-aware generative adversarial network (GAN) inversion-based joint source-channel coding (CAGI-JSCC) framework that enables training-free SemCom by leveraging a pre-trained SemanticStyleGAN model. By explicitly incorporating wireless channel characteristics into the GAN inversion process, CAGI-JSCC adapts to varying channel conditions without additional training. Furthermore, we introduce a cache-enabled dynamic codebook (CDC) that caches disentangled semantic components at both the transmitter and receiver, allowing the system to reuse previously transmitted content. This semantic-level caching continuously reduces redundant transmissions as experience accumulates. Extensive experiments on image transmission demonstrate the effectiveness of the proposed framework. In particular, our system achieves, at an average bandwidth compression ratio (BCR) of 1/224 and as low as 1/1024 for a single image, perceptual quality comparable to baselines operating at a BCR of 1/128.
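The cache-enabled dynamic codebook idea above can be sketched in a few lines. The names below (`SemanticCache`, `encode`, the cosine-similarity matching rule, and the threshold value) are illustrative assumptions rather than the paper's actual interface; the point is only that a semantic component similar to one already cached is replaced by a short index.

```python
import numpy as np

class SemanticCache:
    """Toy cache-enabled dynamic codebook (illustrative, not the paper's API).

    Both ends keep a synchronized list of previously transmitted semantic
    component vectors. A component that is already cached is sent as a short
    index instead of the full latent vector.
    """

    def __init__(self, match_threshold=0.95):
        self.entries = []                 # cached component vectors
        self.match_threshold = match_threshold

    def _best_match(self, v):
        best_idx, best_sim = -1, -1.0
        for i, c in enumerate(self.entries):
            # cosine similarity between the new component and a cached one
            sim = float(v @ c / (np.linalg.norm(v) * np.linalg.norm(c) + 1e-12))
            if sim > best_sim:
                best_idx, best_sim = i, sim
        return best_idx, best_sim

    def encode(self, v):
        """Return ('index', i) if a similar component is cached; otherwise
        cache the component and return ('vector', v)."""
        idx, sim = self._best_match(v)
        if sim >= self.match_threshold:
            return ('index', idx)
        self.entries.append(v.copy())
        return ('vector', v)
```

On the first transmission a component costs the full latent vector; any sufficiently similar later component costs only an index, so the effective bandwidth compression ratio improves as the cache fills.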
Abstract:Wireless communication systems exhibit structural and functional similarities to neural networks: signals propagate through cascaded elements, interact with the environment, and undergo transformations. Building upon this perspective, we introduce a unified paradigm, termed \textit{wireless physical neural networks (WPNNs)}, in which components of a wireless network, such as transceivers, relays, backscatter devices, and intelligent surfaces, are interpreted as computational layers within a learning architecture. By treating the wireless propagation environment and network elements as differentiable operators, new opportunities arise for joint communication-computation designs, where system optimization can be achieved through learning-based methods applied directly to the physical network. This approach may operate independently of, or in conjunction with, conventional digital neural layers, enabling hybrid communication learning pipelines. In this article, we outline representative architectures that embody this viewpoint and discuss the algorithmic and training considerations required to leverage the wireless medium as a computational resource. Through numerical examples, we highlight the potential performance gains in processing, adaptability, efficiency, and end-to-end optimization, demonstrating the promise of reconfiguring wireless systems as learning networks in next-generation communication frameworks.
Abstract:Modern video codecs and learning-based approaches struggle with semantic reconstruction at extremely low bit-rates due to their reliance on low-level spatiotemporal redundancies. Generative models, especially diffusion models, offer a new paradigm for video compression by leveraging high-level semantic understanding and powerful visual synthesis. This paper proposes a video compression framework that integrates generative priors to drastically reduce the bit-rate while maintaining reconstruction fidelity. Specifically, our method compresses high-level semantic representations of the video and then uses a conditional diffusion model to reconstruct frames from these semantics. To further improve compression, we characterize motion information with global camera trajectories and foreground segmentation: background motion is compactly represented by camera pose parameters, while foreground dynamics are captured by sparse segmentation masks. This significantly boosts compression efficiency, enabling decent video reconstruction at extremely low bit-rates.
Abstract:The rapid growth of data traffic and the emerging AI-native wireless architectures in NextG cellular systems place new demands on the fronthaul links of Cloud Radio Access Networks (C-RAN). In this paper, we investigate neural compression techniques for the Common Public Radio Interface (CPRI), aiming to reduce the fronthaul bandwidth while preserving signal quality. We introduce two deep learning-based compression algorithms designed to optimize the transformation of wireless signals into bit sequences for CPRI transmission. The first algorithm utilizes a non-linear transformation coupled with scalar/vector quantization based on a learned codebook. The second algorithm generates a latent vector transformed into a variable-length output bit sequence via arithmetic encoding, guided by the predicted probability distribution of each latent element. Novel techniques such as a shared weight model for storage-limited devices and a successive refinement model for managing multiple CPRI links with varying Quality of Service (QoS) are proposed. Extensive simulation results demonstrate notable Error Vector Magnitude (EVM) gains with improved rate-distortion performance for both algorithms compared to traditional methods. The proposed solutions are robust to variations in channel conditions, modulation formats, and noise levels, highlighting their potential for enabling efficient and scalable fronthaul in NextG AI-native networks as well as aligning with the current 3GPP research directions.
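The codebook-based quantization step of the first algorithm above can be illustrated with a minimal sketch. The function below is a hypothetical stand-in: in the actual scheme the codebook is learned jointly with the non-linear transform, whereas here it is simply given.

```python
import numpy as np

def vq_quantize(latents, codebook):
    """Nearest-neighbour vector quantization of latent rows against a codebook.

    latents:  (N, d) array of transform outputs
    codebook: (K, d) array of (learned) codewords
    Returns (indices, reconstruction); only the indices need to cross the
    fronthaul link, i.e. log2(K) bits per latent vector.
    """
    # squared distances between every latent and every codeword
    d2 = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    idx = d2.argmin(axis=1)
    return idx, codebook[idx]
```

The receiver side only needs the shared codebook and the transmitted indices to recover the (quantized) latents before the inverse transform.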
Abstract:A novel over-the-air machine learning framework over multi-hop multiple-input multiple-output (MIMO) networks is proposed. The core idea is to imitate fully connected (FC) neural network layers using multiple MIMO channels by carefully designing the precoding matrices at the transmitting nodes. A neural network, dubbed PrototypeNet, is employed, consisting of multiple FC layers in which the number of neurons in each layer equals the number of antennas at the corresponding terminal. To achieve satisfactory performance, we train PrototypeNet with noise injection, using a customized loss function that combines the classification error with a penalty on the power of the latent vectors to satisfy the transmit power constraints. The precoding matrices for each hop are then obtained by solving an optimization problem. We also propose a multiple-block extension for the case when the number of antennas is limited. Numerical results verify that the proposed over-the-air transmission scheme achieves satisfactory classification accuracy under a power constraint. The results also show that higher classification accuracy can be achieved with an increasing number of hops at a modest signal-to-noise ratio (SNR).
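A minimal numerical sketch of the core idea, under the simplifying assumptions that the channel matrix is known, square, and full rank, and ignoring the power constraint and noise: the precoder is chosen so that the over-the-air cascade H P reproduces a desired FC-layer weight W. The paper instead solves a constrained optimization problem; this is only the unconstrained least-squares special case.

```python
import numpy as np

rng = np.random.default_rng(0)

# Desired FC-layer weight (one PrototypeNet layer) and a known MIMO channel.
n = 4
W = rng.standard_normal((n, n))   # target layer weight
H = rng.standard_normal((n, n))   # channel matrix (assumed known, full rank)

# Choose the precoder so that the over-the-air map H @ P matches W
# (least-squares solution; exact here because H is square and full rank).
P = np.linalg.lstsq(H, W, rcond=None)[0]

x = rng.standard_normal(n)        # layer input transmitted over the air
y = H @ P @ x                     # received signal equals W @ x (pre-activation)
```

Cascading one such hop per FC layer, with the nonlinearity applied at each receiving terminal, is what lets the multi-hop network emulate the trained PrototypeNet.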




Abstract:We propose a novel integrated sensing and communication (ISAC) system, where the base station (BS) passively senses the channel parameters using the information carrying signals from a user. To simultaneously guarantee decoding and sensing performance, the user adopts sparse regression codes (SPARCs) with cyclic redundancy check (CRC) to transmit its information bits. The BS generates an initial coarse channel estimation of the parameters after receiving the pilot signal. Then, a novel iterative decoding and parameter sensing algorithm is proposed, where the correctly decoded codewords indicated by the CRC bits are utilized to improve the sensing and channel estimation performance at the BS. In turn, the improved estimate of the channel parameters leads to better decoding performance. Simulation results show the effectiveness of the proposed iterative decoding and sensing algorithm, where both the sensing and the communication performance are significantly improved within a few iterations. Extensive ablation studies concerning different channel estimation methods and the number of CRC bits are carried out for a comprehensive evaluation of the proposed scheme.
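The iterative decode-and-sense loop can be sketched for a toy flat-fading channel with BPSK symbols. Everything below is a simplified assumption: a genie comparison against the true bits stands in for the CRC check, and scalar least-squares estimation stands in for the paper's SPARC decoding and parameter sensing.

```python
import numpy as np

rng = np.random.default_rng(1)

# Flat-fading channel, BPSK symbols; a genie check stands in for the CRC.
h_true = 0.8 + 0.3j
pilots = np.array([1.0, -1.0, 1.0, -1.0])
data = rng.choice([-1.0, 1.0], size=200)
noise = 0.1 * (rng.standard_normal(204) + 1j * rng.standard_normal(204))
rx = h_true * np.concatenate([pilots, data]) + noise

# 1) coarse LS channel estimate from the pilot signal only
h_hat = (rx[:4] @ pilots.conj()) / (pilots @ pilots.conj())

# 2) iterate: decode with the current estimate, then treat the verified
#    symbols as extra pilots to refine the estimate
for _ in range(3):
    decoded = np.sign((rx[4:] / h_hat).real)          # equalize + hard decision
    ok = decoded == data                               # stand-in for CRC pass
    ref = np.concatenate([pilots, decoded[ok]])
    obs = np.concatenate([rx[:4], rx[4:][ok]])
    h_hat = (obs @ ref.conj()) / (ref @ ref.conj())    # refined LS estimate
```

Each pass enlarges the effective pilot set with CRC-verified symbols, which is why both the channel estimate and the subsequent decoding improve within a few iterations.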




Abstract:We propose a novel deep learning-based method to design a coded waveform for an integrated sensing and communication (ISAC) system based on orthogonal frequency-division multiplexing (OFDM). Our ultimate goal is to design a coded waveform that is capable of providing satisfactory sensing performance for the target while maintaining high communication quality measured in terms of the bit error rate (BER). The proposed LISAC scheme provides an improved waveform design with the assistance of deep neural networks for the encoding and decoding of the information bits. In particular, the transmitter, parameterized by a recurrent neural network (RNN), encodes the input bit sequence into the transmitted waveform for both sensing and communications. The receiver employs an RNN-based decoder to decode the information bits, while the transmitter senses the target via maximum likelihood detection. We optimize the system considering both the communication and sensing performance. Simulation results show that the proposed LISAC waveform achieves a better trade-off curve compared to existing alternatives.




Abstract:The restless multi-armed bandit (RMAB) framework is a popular model with applications across a wide variety of fields. However, its solution is hindered by the exponentially growing state space (with respect to the number of arms) and the combinatorial action space, making traditional reinforcement learning methods infeasible for large-scale instances. In this paper, we propose GINO-Q, a three-timescale stochastic approximation algorithm designed to learn an asymptotically optimal index policy for RMABs. GINO-Q mitigates the curse of dimensionality by decomposing the RMAB into a series of subproblems, each with the same dimension as a single arm, ensuring that complexity increases linearly with the number of arms. Unlike recently developed Whittle-index-based algorithms, GINO-Q does not require RMABs to be indexable, enhancing its flexibility and applicability. Our experimental results demonstrate that GINO-Q consistently learns near-optimal policies, even for non-indexable RMABs where Whittle-index-based algorithms perform poorly, and it converges significantly faster than existing baselines.
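The flavour of the per-arm decomposition above can be illustrated with a toy index policy. The sketch below assumes per-arm Q-tables have already been learned on the single-arm subproblems; GINO-Q's actual three-timescale stochastic approximation and its index construction are not reproduced here, only the way linear-size per-arm solutions combine into a policy over the exponentially large joint state space.

```python
import numpy as np

def index_policy(q_tables, states, budget):
    """Select `budget` arms to activate using a per-arm index.

    q_tables: list of (num_states, 2) arrays, one per arm, learned on the
              single-arm subproblems (action 0 = passive, action 1 = active).
    states:   current state of each arm.
    Returns the indices of the arms to activate this step.
    """
    # index of an arm = advantage of activating it in its current state
    idx = np.array([q[s, 1] - q[s, 0] for q, s in zip(q_tables, states)])
    return np.argsort(-idx)[:budget]
```

Because each table covers only one arm's state space, storage and learning cost grow linearly with the number of arms, while the combinatorial activation decision reduces to a top-`budget` selection.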




Abstract:We consider collaborative inference at the wireless edge, where each client's model is trained independently on their local datasets. Clients are queried in parallel to make an accurate decision collaboratively. In addition to maximizing the inference accuracy, we also want to ensure the privacy of local models. To this end, we leverage the superposition property of the multiple access channel to implement bandwidth-efficient multi-user inference methods. Specifically, we propose different methods for ensemble and multi-view classification that exploit over-the-air computation. We show that these schemes perform better than their orthogonal counterparts with statistically significant differences while using fewer resources and providing privacy guarantees. We also provide experimental results verifying the benefits of the proposed over-the-air multi-user inference approach and perform an ablation study to demonstrate the effectiveness of our design choices. We share the source code of the framework publicly on GitHub to facilitate further research and reproducibility.
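A minimal sketch of how the superposition property yields bandwidth-efficient ensembling, with illustrative names and a simple additive-noise stand-in for the multiple access channel: all clients transmit analog logit vectors in the same channel use, so the receiver directly observes their sum rather than each client's output separately.

```python
import numpy as np

rng = np.random.default_rng(2)

def ota_ensemble(client_logits, noise_std=0.1):
    """Over-the-air ensembling: clients transmit their (analog) logit vectors
    simultaneously; the multiple access channel adds them up, so the receiver
    observes the ensemble sum in a single channel use."""
    superposed = np.sum(client_logits, axis=0)                    # channel sum
    received = superposed + noise_std * rng.standard_normal(superposed.shape)
    return int(np.argmax(received))                               # ensemble decision
```

Compared with orthogonal access, the number of channel uses no longer grows with the number of clients, and the receiver only ever sees the aggregate, which is also what provides a degree of privacy for the individual local models.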




Abstract:Federated learning (FL) enables a large number of clients, possibly mobile devices, to collaborate on training a generalized machine learning model, benefiting from a larger pool of local samples that are never shared, which offers a degree of privacy to the collaborating clients. However, due to the participation of a large number of clients, it is often difficult to profile and verify each client, which leads to a security threat in which malicious participants may hamper the accuracy of the trained model by conveying poisoned models during the training. Hence, the aggregation framework at the parameter server also needs to minimize the detrimental effects of these malicious clients. A plethora of attack and defence strategies have been analyzed in the literature. However, the Byzantine problem is often analyzed solely from the outlier-detection perspective, oblivious to the topology of neural networks (NNs). In this work, we argue that stronger attacks can be designed by extracting certain side information specific to the NN topology. Hence, inspired by sparse neural networks, we introduce a hybrid sparse Byzantine attack composed of two parts: one exhibiting a sparse nature and attacking only certain NN locations with higher sensitivity, and the other being more silent but accumulating over time, where each part ideally targets a different type of defence mechanism, and together they form a strong but imperceptible attack. Finally, we show through extensive simulations that the proposed hybrid Byzantine attack is effective against 8 different defence methods.
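The two-part structure of the attack can be sketched as follows; the function name, the spike/drift parameterization, and the sensitivity scores are illustrative assumptions rather than the paper's exact construction.

```python
import numpy as np

def hybrid_attack(honest_update, sensitivity, round_t,
                  k=10, spike=5.0, drift=0.01):
    """Toy two-part poisoned update (illustrative only).

    Part 1: a sparse spike on the k most sensitive coordinates, exploiting
            side information about the NN topology.
    Part 2: a small bias on every coordinate that accumulates over rounds,
            staying under outlier-detection thresholds at any single round.
    """
    poisoned = honest_update.copy()
    top = np.argsort(-sensitivity)[:k]      # most sensitive NN locations
    poisoned[top] += spike                  # sparse, high-impact part
    poisoned += drift * round_t             # silent, accumulating part
    return poisoned
```

The intuition is that magnitude-based defences must catch the sparse spikes while history-based defences must catch the slow drift, and a defence tuned to one part tends to miss the other.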