Pierre Ablin

Ecole normale supérieure, Paris, France

The Design Space of Tri-Modal Masked Diffusion Models

Feb 25, 2026

LaCy: What Small Language Models Can and Should Learn is Not Just a Question of Loss

Feb 13, 2026

Completed Hyperparameter Transfer across Modules, Width, Depth, Batch and Duration

Dec 26, 2025

Learning Unmasking Policies for Diffusion Language Models

Dec 12, 2025

The Data-Quality Illusion: Rethinking Classifier-Based Quality Filtering for LLM Pretraining

Oct 02, 2025

The Geometries of Truth Are Orthogonal Across Tasks

Jun 10, 2025

Identifiable Multi-View Causal Discovery Without Non-Gaussianity

Feb 28, 2025

Scaling Laws for Forgetting during Finetuning with Pretraining Data Injection

Feb 09, 2025

Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging

Feb 03, 2025

A Unified Perspective on the Dynamics of Deep Transformers

Jan 30, 2025