Eldar Kurtic

Statistically-Lossless Quantization of Large Language Models

May 04, 2026

GSQ: Highly-Accurate Low-Precision Scalar Quantization for LLMs via Gumbel-Softmax Sampling

Apr 20, 2026

DarwinLM: Evolutionary Structured Pruning of Large Language Models

Feb 11, 2025

"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization

Nov 04, 2024

EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search

Oct 18, 2024

Mathador-LM: A Dynamic Benchmark for Mathematical Reasoning on Large Language Models

Jun 18, 2024

MicroAdam: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence

May 24, 2024

Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment

May 06, 2024

How to Prune Your Language Model: Recovering Accuracy on the "Sparsity May Cry" Benchmark

Dec 21, 2023

Sparse Fine-tuning for Inference Acceleration of Large Language Models

Oct 13, 2023