
Pierre Ablin

École normale supérieure, Paris, France

MVICAD2: Multi-View Independent Component Analysis with Delays and Dilations

Jan 13, 2025

Sparse Repellency for Shielded Generation in Text-to-image Diffusion Models

Oct 10, 2024

Dynamic Gradient Alignment for Online Data Mixing

Oct 03, 2024

Theory, Analysis, and Best Practices for Sigmoid Self-Attention

Sep 06, 2024

The AdEMAMix Optimizer: Better, Faster, Older

Sep 05, 2024

Optimization without retraction on the random generalized Stiefel manifold

May 02, 2024

Enhancing Hypergradients Estimation: A Study of Preconditioning and Reparameterization

Feb 26, 2024

Careful with that Scalpel: Improving Gradient Surgery with an EMA

Feb 05, 2024

Specialized Language Models with Cheap Inference from Limited Domain Data

Feb 02, 2024

Understanding the Regularity of Self-Attention with Optimal Transport

Dec 22, 2023