Ari Morcos

The Finetuner's Fallacy: When to Pretrain with Your Finetuning Data

Mar 17, 2026
Via arXiv

ÜberWeb: Insights from Multilingual Curation for a 20-Trillion-Token Dataset

Feb 16, 2026
Via arXiv

DatBench: Discriminative, Faithful, and Efficient VLM Evaluations

Jan 05, 2026
Via arXiv

Luxical: High-Speed Lexical-Dense Text Embeddings

Dec 11, 2025
Via arXiv

SIEVE: Multimodal Dataset Pruning Using Image Captioning Models

Oct 03, 2023
Via arXiv

Stable and low-precision training for large-scale vision-language models

Apr 25, 2023
Via arXiv

Objectives Matter: Understanding the Impact of Self-Supervised Objectives on Vision Transformer Representations

Apr 25, 2023
Via arXiv

A Cookbook of Self-Supervised Learning

Apr 24, 2023
Via arXiv

The Robustness Limits of SoTA Vision Models to Natural Variation

Oct 24, 2022
Via arXiv

Robust Self-Supervised Learning with Lie Groups

Oct 24, 2022
Via arXiv