Pierre Ablin

Ecole normale supérieure, Paris, France

Completed Hyperparameter Transfer across Modules, Width, Depth, Batch and Duration

Dec 26, 2025

Learning Unmasking Policies for Diffusion Language Models

Dec 12, 2025

The Data-Quality Illusion: Rethinking Classifier-Based Quality Filtering for LLM Pretraining

Oct 02, 2025

The Geometries of Truth Are Orthogonal Across Tasks

Jun 10, 2025

Identifiable Multi-View Causal Discovery Without Non-Gaussianity

Feb 28, 2025

Scaling Laws for Forgetting during Finetuning with Pretraining Data Injection

Feb 09, 2025

Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging

Feb 03, 2025

A Unified Perspective on the Dynamics of Deep Transformers

Jan 30, 2025

MVICAD2: Multi-View Independent Component Analysis with Delays and Dilations

Jan 13, 2025

Sparse Repellency for Shielded Generation in Text-to-image Diffusion Models

Oct 10, 2024