Niklas Muennighoff

Shammie

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Sep 25, 2024
Via arXiv

OLMoE: Open Mixture-of-Experts Language Models

Sep 03, 2024
Via arXiv

Consent in Crisis: The Rapid Decline of the AI Data Commons

Jul 24, 2024
Via arXiv

OpenDevin: An Open Platform for AI Software Developers as Generalist Agents

Jul 23, 2024
Via arXiv

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies

Jul 18, 2024
Via arXiv

BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval

Jul 16, 2024
Via arXiv

RegMix: Data Mixture as Regression for Language Model Pre-training

Jul 01, 2024
Via arXiv

BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

Jun 26, 2024
Via arXiv

DataComp-LM: In search of the next generation of training sets for language models

Jun 18, 2024
Via arXiv

SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages

Jun 14, 2024
Via arXiv