Matteo Pagliardini

DenseFormer: Enhancing Information Flow in Transformers via Depth Weighted Averaging

Feb 04, 2024
Via arXiv

MEDITRON-70B: Scaling Medical Pretraining for Large Language Models

Nov 27, 2023
Via arXiv

DoGE: Domain Reweighting with Generalization Estimation

Oct 23, 2023
Via arXiv

CoTFormer: More Tokens With Attention Make Up For Less Depth

Oct 16, 2023
Via arXiv

Faster Causal Attention Over Large Sequences Through Sparse Flash Attention

Jun 01, 2023
Via arXiv

Revisiting the ACVI Method for Constrained Variational Inequalities

Oct 27, 2022
Via arXiv

Improving Generalization via Uncertainty Driven Perturbations

Feb 28, 2022
Via arXiv

Agree to Disagree: Diversity through Disagreement for Better Transferability

Feb 09, 2022
Via arXiv

The Peril of Popular Deep Learning Uncertainty Estimation Methods

Dec 09, 2021
Via arXiv

Taming GANs with Lookahead

Jun 25, 2020
Via arXiv