Eugene Belilovsky

MILA

Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis

Jul 07, 2024

Controlling Forgetting with Test-Time Data in Continual Learning

Jun 19, 2024

PETRA: Parallel End-to-end Training with Reversible Architectures

Jun 04, 2024

From Feature Visualization to Visual Circuits: Effect of Adversarial Model Manipulation

Jun 03, 2024

ACCO: Accumulate while you Communicate, Hiding Communications in Distributed LLM Training

Jun 03, 2024

Temporally Consistent Object Editing in Videos using Extended Attention

Jun 01, 2024

μLO: Compute-Efficient Meta-Generalization of Learned Optimizers

May 31, 2024

WASH: Train your Ensemble with Communication-Efficient Weight Shuffling, then Average

May 27, 2024

AdaFisher: Adaptive Second Order Optimization via Fisher Information

May 26, 2024

Simple and Scalable Strategies to Continually Pre-train Large Language Models

Mar 26, 2024