
Shay Vargaftik

VMware Research

HACK: Homomorphic Acceleration via Compression of the Key-Value Cache for Disaggregated LLM Inference

Feb 05, 2025

Lucy: Think and Reason to Solve Text-to-SQL

Jul 06, 2024

Beyond Throughput and Compression Ratios: Towards High End-to-end Utility of Gradient Compression

Jul 01, 2024

Optimal and Near-Optimal Adaptive Vector Quantization

Feb 05, 2024

THC: Accelerating Distributed Deep Learning Using Tensor Homomorphic Compression

Feb 16, 2023

DoCoFL: Downlink Compression for Cross-Device Federated Learning

Feb 01, 2023

ScionFL: Secure Quantized Aggregation for Federated Learning

Oct 13, 2022

QUIC-FL: Quick Unbiased Compression for Federated Learning

May 28, 2022

Automating In-Network Machine Learning

May 18, 2022

IIsy: Practical In-Network Classification

May 17, 2022