Dara Bahri

A Universal Class of Sharpness-Aware Minimization Algorithms

Jun 06, 2024
Via arXiv

Sharpness-Aware Minimization Leads to Low-Rank Features

May 25, 2023
Via arXiv

Is margin all you need? An extensive empirical study of active learning on tabular data

Oct 07, 2022
Via arXiv

Confident Adaptive Language Modeling

Jul 14, 2022
Via arXiv

Unifying Language Learning Paradigms

May 10, 2022
Via arXiv

ED2LM: Encoder-Decoder to Language Model for Faster Document Re-ranking Inference

Apr 25, 2022
Via arXiv

Transformer Memory as a Differentiable Search Index

Feb 16, 2022
Via arXiv

ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning

Nov 22, 2021
Via arXiv

Sharpness-Aware Minimization Improves Language Model Generalization

Oct 16, 2021
Via arXiv

Charformer: Fast Character Transformers via Gradient-based Subword Tokenization

Jul 02, 2021
Via arXiv