Acyr Locatelli

Efficient Benchmarking Is Just Feature Selection and Multiple Regression

May 25, 2026
Via arXiv

Tiny Aya: Bridging Scale and Multilingual Depth

Mar 12, 2026
Via arXiv

One Tokenizer To Rule Them All: Emergent Language Plasticity via Multilingual Tokenizers

Jun 12, 2025
Via arXiv

Aya Vision: Advancing the Frontier of Multilingual Multimodality

May 13, 2025
Via arXiv

Command A: An Enterprise-Ready Large Language Model

Apr 01, 2025
Via arXiv

Rope to Nope and Back Again: A New Hybrid Attention Strategy

Jan 30, 2025
Via arXiv

Aya Expanse: Combining Research Breakthroughs for a New Multilingual Frontier

Dec 05, 2024
Via arXiv

Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models

Nov 19, 2024
Via arXiv

Understanding Likelihood Over-optimisation in Direct Alignment Algorithms

Oct 15, 2024
Via arXiv

Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts

Aug 28, 2024
Via arXiv