Elad Hoffer

AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation

Mar 25, 2025

DropCompute: simple and more robust distributed synchronous training via compute variance reduction

Jun 18, 2023

Energy awareness in low precision neural networks

Feb 06, 2022

Logarithmic Unbiased Quantization: Practical 4-bit Training in Deep Learning

Dec 19, 2021

Task Agnostic Continual Learning Using Online Variational Bayes with Fixed-Point Updates

Oct 01, 2020

Neural gradients are lognormally distributed: understanding sparse and quantized training

Jun 17, 2020

The Knowledge Within: Methods for Data-Free Model Compression

Dec 03, 2019

At Stability's Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks?

Sep 26, 2019

Mix & Match: training convnets with mixed image sizes for improved accuracy, speed and scale resiliency

Aug 12, 2019

Augment your batch: better training with larger batches

Jan 27, 2019