
Aleksandar Botev

RecurrentGemma: Moving Past Transformers for Efficient Open Language Models

Apr 11, 2024

Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

Feb 29, 2024

Applications of flow models to the generation of correlated lattice QCD ensembles

Jan 19, 2024

Normalizing flows for lattice gauge theory in arbitrary space-time dimension

May 03, 2023

Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation

Feb 20, 2023

Aspects of scaling and scalability for flow-based sampling of lattice QCD

Nov 14, 2022

Deep Learning without Shortcuts: Shaping the Kernel with Tailored Rectifiers

Mar 15, 2022

SyMetric: Measuring the Quality of Learnt Hamiltonian Dynamics Inferred from Vision

Nov 10, 2021

Which priors matter? Benchmarking models for learning latent dynamics

Nov 09, 2021

Better, Faster Fermionic Neural Networks

Nov 13, 2020