Eugene Belilovsky

Mila

Channel-Selective Normalization for Label-Shift Robust Test-Time Adaptation

Feb 07, 2024

Model Breadcrumbs: Scaling Multi-Task Model Merging with Sparse Masks

Dec 11, 2023

Can We Learn Communication-Efficient Optimizers?

Dec 02, 2023

DragD3D: Vertex-based Editing for Realistic Mesh Deformations using 2D Diffusion Priors

Oct 06, 2023

Continual Pre-Training of Large Language Models: How to (re)warm your model?

Aug 08, 2023

$\textbf{A}^2\textbf{CiD}^2$: Accelerating Asynchronous Communication in Decentralized Deep Learning

Jun 14, 2023

Adversarial Attacks on the Interpretation of Neuron Activation Maximization

Jun 12, 2023

Can Forward Gradient Match Backpropagation?

Jun 12, 2023

Guiding The Last Layer in Federated Learning with Pre-Trained Models

Jun 06, 2023

Re-Weighted Softmax Cross-Entropy to Control Forgetting in Federated Learning

Apr 11, 2023