Hyeji Kim

LASER: Linear Compression in Wireless Distributed Optimization

Oct 19, 2023
Ashok Vardhan Makkuva, Marco Bondaschi, Thijs Vogels, Martin Jaggi, Hyeji Kim, Michael C. Gastpar

Data-parallel SGD is the de facto algorithm for distributed optimization, especially for large-scale machine learning. Despite its merits, the communication bottleneck is one of its persistent issues. Most compression schemes that alleviate it either assume noiseless communication links or fail to achieve good performance on practical tasks. In this paper, we close this gap and introduce LASER: LineAr CompreSsion in WirEless DistRibuted Optimization. LASER capitalizes on the inherent low-rank structure of gradients and transmits them efficiently over noisy channels. While enjoying theoretical guarantees similar to those of classical SGD, LASER shows consistent gains over baselines on a variety of practical benchmarks. In particular, it outperforms state-of-the-art compression schemes on challenging computer vision and GPT language modeling tasks. On the latter, we obtain a $50$-$64\%$ improvement in perplexity over our baselines for noisy channels.
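The low-rank transmission idea the abstract describes can be illustrated with a PowerSGD-style subspace-iteration sketch (illustrative only, not the paper's implementation): instead of sending an m×n gradient matrix, send the two thin factors of a rank-r approximation.

```python
import numpy as np

def rank_r_compress(grad, r=1, iters=2):
    """Approximate a gradient matrix by a rank-r factorization via
    subspace (power) iteration, so only the thin factors P (m x r)
    and Q (n x r) need to be transmitted instead of the full m x n matrix."""
    m, n = grad.shape
    rng = np.random.default_rng(0)
    Q = rng.standard_normal((n, r))
    for _ in range(iters):
        P = grad @ Q            # m x r: project onto the current subspace
        P, _ = np.linalg.qr(P)  # orthonormalize the columns
        Q = grad.T @ P          # n x r: refine the subspace
    return P, Q                 # reconstruction: P @ Q.T

grad = np.outer(np.arange(4.0), np.ones(3))  # an exactly rank-1 gradient
P, Q = rank_r_compress(grad, r=1)
print(np.allclose(P @ Q.T, grad))            # rank-1 input is recovered exactly
```

The communication saving is the point: for an m×n gradient, the two factors cost (m+n)·r entries instead of m·n.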

Task-aware Distributed Source Coding under Dynamic Bandwidth

May 24, 2023
Po-han Li, Sravan Kumar Ankireddy, Ruihan Zhao, Hossein Nourkhiz Mahjoub, Ehsan Moradi-Pari, Ufuk Topcu, Sandeep Chinchali, Hyeji Kim

Efficient compression of correlated data is essential to minimize communication overhead in multi-sensor networks. In such networks, each sensor independently compresses its data and transmits it to a central node due to limited communication bandwidth. A decoder at the central node decompresses the data and passes it to a pre-trained machine learning-based task to generate the final output. Thus, it is important to compress the features that are relevant to the task. Additionally, the final performance depends heavily on the total available bandwidth. In practice, available bandwidth commonly varies, and higher bandwidth results in better task performance. We design a novel distributed compression framework composed of independent encoders and a joint decoder, which we call neural distributed principal component analysis (NDPCA). NDPCA flexibly compresses data from multiple sources to any available bandwidth with a single model, reducing computing and storage overhead. NDPCA achieves this by learning low-rank task representations and efficiently distributing bandwidth among sensors, thus providing a graceful trade-off between performance and bandwidth. Experiments show that NDPCA improves the success rate of multi-view robotic arm manipulation by 9% and the accuracy of object detection tasks on satellite imagery by 14%, compared to an autoencoder with uniform bandwidth allocation.
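The bandwidth-allocation idea can be sketched with plain PCA in place of the learned representations (a toy stand-in, not NDPCA itself): pool the singular values of every sensor's data and give each sensor as many components as it contributes to the top of the pooled list.

```python
import numpy as np

def allocate_ranks(sensor_data, total_rank):
    """Pool every sensor's singular values, keep the total_rank largest,
    and count how many each sensor contributed. Sensors whose data
    carries more energy get more of the shared bandwidth budget."""
    pooled = []
    for i, X in enumerate(sensor_data):
        s = np.linalg.svd(X - X.mean(axis=0), compute_uv=False)
        pooled += [(sv, i) for sv in s]
    pooled.sort(reverse=True)
    ranks = [0] * len(sensor_data)
    for _, i in pooled[:total_rank]:
        ranks[i] += 1
    return ranks

# A high-variance sensor and a near-silent one: the budget goes to the first.
strong = np.array([[10., 0, 0], [0, 5, 0], [-10, 0, 0], [0, -5, 0]])
weak = 0.01 * strong
print(allocate_ranks([strong, weak], total_rank=2))   # → [2, 0]
```

With a larger budget the weak sensor starts receiving components too, which is the graceful performance-bandwidth trade-off the abstract refers to.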

TinyTurbo: Efficient Turbo Decoders on Edge

Sep 30, 2022
S Ashwin Hebbar, Rajesh K Mishra, Sravan Kumar Ankireddy, Ashok V Makkuva, Hyeji Kim, Pramod Viswanath

In this paper, we introduce TINYTURBO, a neural-augmented decoder for Turbo codes. TINYTURBO has complexity comparable to the classical max-log-MAP algorithm but much better reliability than the max-log-MAP baseline, performing close to the MAP algorithm. We show that TINYTURBO exhibits strong robustness on a variety of practical channels of interest, such as the EPA and EVA channels included in the LTE standards. We also show that TINYTURBO generalizes strongly across different rates, blocklengths, and trellises. We verify the reliability and efficiency of TINYTURBO via over-the-air experiments.

* "TinyTurbo: Efficient Turbo Decoders on Edge," 2022 IEEE International Symposium on Information Theory (ISIT), 2022, pp. 2797-2802  
* 10 pages, 6 figures. Published at the 2022 IEEE International Symposium on Information Theory (ISIT) 
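The gap TINYTURBO targets sits in the core recursion of MAP versus max-log-MAP decoding: MAP uses the exact Jacobian logarithm, max-log drops its correction term, and TINYTURBO-style decoders reintroduce learnable scaling of the exchanged extrinsic LLRs. A minimal illustration (the 0.7 weight is hypothetical, not a trained value):

```python
import math

def jacobian_log(a, b):
    """Exact log(e^a + e^b), as used by full MAP decoding."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def max_log(a, b):
    """max-log-MAP approximation: drop the correction term."""
    return max(a, b)

def scaled_extrinsic(llr, weight=0.7):
    """TinyTurbo-style fix (illustrative weight): damp the extrinsic LLR
    exchanged between the component decoders to offset the overconfidence
    introduced by the max-log approximation."""
    return weight * llr

a, b = 2.0, 1.0
print(jacobian_log(a, b) - max_log(a, b))   # correction term ≈ 0.313
```

The neural-augmented decoder trains a small set of such weights instead of implementing the full Jacobian logarithm, which keeps the complexity at the max-log level.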

Neural Augmented Min-Sum Decoding of Short Block Codes for Fading Channels

May 21, 2022
Sravan Kumar Ankireddy, Hyeji Kim

In the decoding of linear block codes, it has been shown that noticeable gains in bit error rate can be achieved by introducing learnable parameters into the Belief Propagation (BP) decoder. Despite the success of these methods, two key problems remain open. The first is the lack of analysis for channels other than AWGN. The second is the interpretation of the learned weights and their effect on the reliability of the BP decoder. In this work, we aim to bridge these gaps by looking at non-AWGN channels such as the Extended Typical Urban (ETU) channel. We study the effect of entangling the weights and how performance holds across different channel settings for the min-sum version of the BP decoder. We show that while entanglement causes little degradation on the AWGN channel, a significant loss is observed on more complex channels. We also provide insights into the learned weights and their connection to the structure of the underlying code. Finally, we evaluate our algorithm over the air using software-defined radios.
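The check-node update of a weighted min-sum decoder, the object being learned here, can be sketched as follows (a single shared weight stands in for the "entangled" case; in the fully neural variant each edge would carry its own trained weight):

```python
import numpy as np

def weighted_min_sum_check(llrs, weight=0.8):
    """Check-node update of a weighted (neural) min-sum decoder: each
    outgoing message is the product of the signs of the other incoming
    LLRs times their minimum magnitude, scaled by a learnable weight.
    The 0.8 default is illustrative, not a trained value."""
    llrs = np.asarray(llrs, dtype=float)
    out = np.empty_like(llrs)
    for k in range(len(llrs)):
        others = np.delete(llrs, k)     # all incoming messages except edge k
        out[k] = weight * np.prod(np.sign(others)) * np.abs(others).min()
    return out

print(weighted_min_sum_check([2.0, -3.0, 4.0], weight=1.0))  # → [-3.  2. -2.]
```

Training per-edge weights (rather than one entangled scalar) is what lets the decoder adapt to the code structure and to non-AWGN channel statistics.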

Turbo Autoencoder with a Trainable Interleaver

Nov 22, 2021
Karl Chahine, Yihan Jiang, Pooja Nuti, Hyeji Kim, Joonyoung Cho

A critical aspect of reliable communication is the design of codes that can be decoded robustly and computationally efficiently under noisy conditions. Advances in the design of reliable codes have been driven by coding theory and have been sporadic. Recently, it has been shown that channel codes comparable to modern codes can be learned solely via deep learning. In particular, the Turbo Autoencoder (TURBOAE), introduced by Jiang et al., has been shown to achieve the reliability of Turbo codes on Additive White Gaussian Noise channels. In this paper, we focus on applying the idea of TURBOAE to various practical channels, such as fading channels and chirp noise channels. We introduce TURBOAE-TI, a novel neural architecture that combines TURBOAE with a trainable interleaver design. We develop a carefully designed training procedure and a novel interleaver penalty function that are crucial for learning the interleaver and TURBOAE jointly. We demonstrate that TURBOAE-TI outperforms TURBOAE and LTE Turbo codes on several channels of interest. We also provide an interpretation analysis to better understand TURBOAE-TI.

DeepIC: Coding for Interference Channels via Deep Learning

Aug 13, 2021
Karl Chahine, Nanyang Ye, Hyeji Kim

The two-user interference channel is a model for multiple one-to-one communications, where two transmitters wish to communicate with their corresponding receivers via a shared wireless medium. The two most common and simple coding schemes are time division (TD) and treating interference as noise (TIN). Interestingly, it has been shown that there exists an asymptotic scheme, the Han-Kobayashi scheme, that performs better than TD and TIN. However, the Han-Kobayashi scheme has impractically high complexity and is designed for asymptotic settings, which leads to a gap between information theory and practice. In this paper, we focus on designing practical codes for interference channels. As it is challenging to analytically design practical codes with feasible complexity, we apply deep learning to learn codes for interference channels. We demonstrate that DeepIC, a convolutional neural network-based code with an iterative decoder, outperforms TD and TIN by a significant margin for two-user additive white Gaussian noise channels with a moderate amount of interference.
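The TD-versus-TIN comparison comes down to two textbook sum-rate formulas for the symmetric Gaussian interference channel; a quick numeric sketch (rates in bits per channel use; this is the baseline arithmetic, not DeepIC itself):

```python
import math

def sum_rate_td(snr):
    """Time division: each user is active half the time and may double
    its power in its own slot, so the sum rate is log2(1 + 2*SNR)."""
    return math.log2(1 + 2 * snr)

def sum_rate_tin(snr, inr):
    """Treating interference as noise: the other user's signal simply
    raises the noise floor by INR."""
    return 2 * math.log2(1 + snr / (1 + inr))

# Which simple scheme wins depends on the interference strength.
print(sum_rate_td(10) > sum_rate_tin(10, inr=5))    # strong interference: TD wins
print(sum_rate_tin(10, inr=0.1) > sum_rate_td(10))  # weak interference: TIN wins
```

The moderate-interference regime, where neither baseline is clearly better, is exactly where the learned DeepIC codes are reported to gain the most.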

A Channel Coding Benchmark for Meta-Learning

Jul 15, 2021
Rui Li, Ondrej Bohdal, Rajesh Mishra, Hyeji Kim, Da Li, Nicholas Lane, Timothy Hospedales

Meta-learning provides a popular and effective family of methods for data-efficient learning of new tasks. However, several important issues in meta-learning have proven hard to study thus far. For example, performance degrades in real-world settings where meta-learners must learn from a wide and potentially multi-modal distribution of training tasks, and when distribution shift exists between meta-train and meta-test task distributions. These issues are typically hard to study since the shape of task distributions, and the shift between them, are not straightforward to measure or control in standard benchmarks. We propose the channel coding problem as a benchmark for meta-learning. Channel coding is an important practical application where task distributions naturally arise and fast adaptation to new tasks is practically valuable. We use this benchmark to study several aspects of meta-learning, including the impact of task distribution breadth and shift, both of which can be controlled in the coding problem. Going forward, this benchmark provides a tool for the community to study the capabilities and limitations of meta-learning and to drive research on practically robust and effective meta-learners.

Neural Distributed Source Coding

Jun 05, 2021
Jay Whang, Anish Acharya, Hyeji Kim, Alexandros G. Dimakis

Distributed source coding is the task of encoding an input in the absence of correlated side information that is available only to the decoder. Remarkably, Slepian and Wolf showed in 1973 that an encoder with no access to the correlated side information can asymptotically achieve the same compression rate as when the side information is available at both the encoder and the decoder. While there is significant prior work on this topic in information theory, practical distributed source coding has been limited to synthetic datasets and specific correlation structures. Here we present a general framework for lossy distributed source coding that is agnostic to the correlation structure and can scale to high dimensions. Rather than relying on hand-crafted source modeling, our method utilizes a powerful conditional deep generative model to learn the distributed encoder and decoder. We evaluate our method on realistic high-dimensional datasets and show substantial improvements in distributed compression performance.
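The Slepian-Wolf result cited above is easy to state numerically for a binary source: if Y is X passed through a binary symmetric channel with crossover probability p, the encoder can compress X down to H(X|Y) = h2(p) bits per symbol without ever seeing Y. A small check:

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# X ~ Bernoulli(1/2); Y equals X flipped with probability p = 0.1.
p = 0.1
rate_slepian_wolf = h2(p)   # ≈ 0.469 bits: achievable with decoder-only side info
rate_no_side_info = 1.0     # H(X) for a fair bit: no side information at all
print(rate_slepian_wolf, rate_no_side_info)
```

The gap between those two rates is the saving that practical distributed codes, including the learned ones in this paper, try to realize at finite blocklengths.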

Deepcode and Modulo-SK are Designed for Different Settings

Aug 18, 2020
Hyeji Kim, Yihan Jiang, Sreeram Kannan, Sewoong Oh, Pramod Viswanath

We respond to [1], which claimed that the "Modulo-SK scheme outperforms Deepcode [2]". We demonstrate that this statement is not true: the two schemes are designed and evaluated for entirely different settings. Deepcode is designed and evaluated for the AWGN channel with (potentially delayed) uncoded output feedback. Modulo-SK is evaluated on the AWGN channel with coded feedback and unit delay. [1] also claimed an implementation of the Schalkwijk and Kailath (SK) scheme [3] that is numerically stable for any number of information bits and iterations. However, we observe that while their implementation does marginally improve over ours, it still suffers from a fundamental precision issue. Finally, we show that Deepcode dominates the optimized performance of SK, over a natural choice of parameterizations, when the feedback is noisy.

BRP-NAS: Prediction-based NAS using GCNs

Aug 10, 2020
Łukasz Dudziak, Thomas Chau, Mohamed S. Abdelfattah, Royson Lee, Hyeji Kim, Nicholas D. Lane

Neural architecture search (NAS) enables researchers to automatically explore broad design spaces in order to improve the efficiency of neural networks. This efficiency is especially important for on-device deployment, where improvements in accuracy must be balanced against the computational demands of a model. In practice, the performance metrics of a model are computationally expensive to obtain. Previous work uses a proxy (e.g., the number of operations) or layer-wise measurements of neural network layers to estimate end-to-end hardware performance, but the imprecise prediction diminishes the quality of NAS. To address this problem, we propose BRP-NAS, an efficient hardware-aware NAS enabled by an accurate performance predictor based on a graph convolutional network (GCN). Moreover, we investigate prediction quality on different metrics and show that the sample efficiency of predictor-based NAS can be improved by considering binary relations of models and an iterative data selection strategy. We show that our proposed method outperforms all prior methods on both NAS-Bench-101 and NAS-Bench-201. Finally, to raise awareness of the fact that accurate latency estimation is not a trivial task, we release LatBench, a latency dataset of NAS-Bench-201 models run on a broad range of devices.
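The binary-relation idea can be sketched independently of the GCN: rather than regressing absolute accuracy or latency, train a pairwise predictor and rank candidates by the comparisons they win. In the sketch below the `better` oracle is a hypothetical stand-in for the learned pairwise predictor:

```python
def rank_by_binary_relations(models, better):
    """Order models by how many pairwise comparisons each one wins,
    using only a binary predicate better(a, b) -> bool."""
    wins = {m: sum(better(m, other) for other in models if other != m)
            for m in models}
    return sorted(models, key=lambda m: wins[m], reverse=True)

# Toy oracle: lower "latency" is better.
candidates = [7, 3, 5]
ranking = rank_by_binary_relations(candidates, better=lambda a, b: a < b)
print(ranking)   # → [3, 5, 7]
```

For NAS, only the ranking matters when selecting the next candidates to evaluate, which is why a pairwise predictor can be more sample-efficient than an accurate absolute regressor.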