Eugene Belilovsky

MILA

Incentivizing Permissionless Distributed Learning of LLMs

May 27, 2025

Continual Pre-training of MoEs: How robust is your router?

Mar 06, 2025

Beyond Cosine Decay: On the effectiveness of Infinite Learning Rate Schedule for Continual Pre-training

Mar 06, 2025

FairDropout: Using Example-Tied Dropout to Enhance Generalization of Minority Groups

Feb 10, 2025

Non-Uniform Parameter-Wise Model Merging

Dec 20, 2024

Sketch-guided Cage-based 3D Gaussian Splatting Deformation

Nov 19, 2024

Towards motion from video diffusion models

Nov 19, 2024

Not Only the Last-Layer Features for Spurious Correlations: All Layer Deep Feature Reweighting

Sep 23, 2024

Accelerating Training with Neuron Interaction and Nowcasting Networks

Sep 06, 2024

Harmony in Diversity: Merging Neural Networks with Canonical Correlation Analysis

Jul 07, 2024