Ari S. Morcos

Effective pruning of web-scale datasets based on complexity of concept clusters

Jan 09, 2024
Amro Abbas, Evgenia Rusak, Kushal Tirumala, Wieland Brendel, Kamalika Chaudhuri, Ari S. Morcos

Decoding Data Quality via Synthetic Corruptions: Embedding-guided Pruning of Code Data

Dec 05, 2023
Yu Yang, Aaditya K. Singh, Mostafa Elhoushi, Anas Mahmoud, Kushal Tirumala, Fabian Gloeckle, Baptiste Rozière, Carole-Jean Wu, Ari S. Morcos, Newsha Ardalani

D4: Improving LLM Pretraining via Document De-Duplication and Diversification

Aug 23, 2023
Kushal Tirumala, Daniel Simig, Armen Aghajanyan, Ari S. Morcos

PUG: Photorealistic and Semantically Controllable Synthetic Data for Representation Learning

Aug 08, 2023
Florian Bordes, Shashank Shekhar, Mark Ibrahim, Diane Bouchacourt, Pascal Vincent, Ari S. Morcos

SemDeDup: Data-efficient learning at web-scale through semantic deduplication

Mar 22, 2023
Amro Abbas, Kushal Tirumala, Dániel Simig, Surya Ganguli, Ari S. Morcos

Emergence of Maps in the Memories of Blind Navigation Agents

Jan 30, 2023
Erik Wijmans, Manolis Savva, Irfan Essa, Stefan Lee, Ari S. Morcos, Dhruv Batra

Beyond neural scaling laws: beating power law scaling via data pruning

Jun 29, 2022
Ben Sorscher, Robert Geirhos, Shashank Shekhar, Surya Ganguli, Ari S. Morcos

Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time

Mar 10, 2022
Mitchell Wortsman, Gabriel Ilharco, Samir Yitzhak Gadre, Rebecca Roelofs, Raphael Gontijo-Lopes, Ari S. Morcos, Hongseok Namkoong, Ali Farhadi, Yair Carmon, Simon Kornblith, Ludwig Schmidt

Grounding inductive biases in natural images: invariance stems from variations in data

Jun 09, 2021
Diane Bouchacourt, Mark Ibrahim, Ari S. Morcos
