Jacob Nielsen

Continual Quantization-Aware Pre-Training: When to transition from 16-bit to 1.58-bit pre-training for BitNet language models?

Feb 17, 2025

FlexDeMo: Decoupled Momentum Optimization for Fully and Hybrid Sharded Training

Feb 10, 2025

When are 1.58 bits enough? A Bottom-up Exploration of BitNet Quantization

Nov 08, 2024

Multiview Aerial Visual Recognition (MAVREC): Can Multi-view Improve Aerial Visual Perception?

Dec 07, 2023