Mike Lewis

Jack

Law of the Weakest Link: Cross Capabilities of Large Language Models

Sep 30, 2024
Via arXiv

The Llama 3 Herd of Models

Jul 31, 2024
Via arXiv

MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts

Jul 31, 2024
Via arXiv

Lory: Fully Differentiable Mixture-of-Experts for Autoregressive Language Model Pre-training

May 06, 2024
Via arXiv

In-Context Pretraining: Language Modeling Beyond Document Boundaries

Oct 20, 2023
Via arXiv

RA-DIT: Retrieval-Augmented Dual Instruction Tuning

Oct 08, 2023
Via arXiv

Contrastive Decoding Improves Reasoning in Large Language Models

Sep 29, 2023
Via arXiv

Efficient Streaming Language Models with Attention Sinks

Sep 29, 2023
Via arXiv

Effective Long-Context Scaling of Foundation Models

Sep 27, 2023
Via arXiv

Self-Alignment with Instruction Backtranslation

Aug 14, 2023
Via arXiv