Maksim Velikanov

Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance

Jul 30, 2025
Via arXiv

Falcon Mamba: The First Competitive Attention-free 7B Language Model

Oct 07, 2024
Via arXiv

SGD with memory: fundamental properties and stochastic acceleration

Oct 05, 2024
Via arXiv

Falcon2-11B Technical Report

Jul 20, 2024
Via arXiv

Generalization error of spectral algorithms

Mar 18, 2024
Via arXiv

Efficient Conformal Prediction under Data Heterogeneity

Dec 25, 2023
Via arXiv

Comparing the robustness of modern no-reference image- and video-quality metrics to adversarial attacks

Oct 10, 2023
Via arXiv

A view of mini-batch SGD via generating functions: conditions of convergence, phase transitions, benefit from negative momenta

Jun 22, 2022
Via arXiv

Embedded Ensembles: Infinite Width Limit and Operating Regimes

Feb 24, 2022
Via arXiv

Tight Convergence Rate Bounds for Optimization Under Power Law Spectral Conditions

Feb 02, 2022
Via arXiv