Mostofa Patwary

LLM Pruning and Distillation in Practice: The Minitron Approach

Aug 21, 2024

Compact Language Models via Pruning and Knowledge Distillation

Jul 19, 2024

Reuse, Don't Retrain: A Recipe for Continued Pretraining of Language Models

Jul 09, 2024

Data, Data Everywhere: A Guide for Pretraining Dataset Construction

Jul 08, 2024

Nemotron-4 340B Technical Report

Jun 17, 2024

StarCoder 2 and The Stack v2: The Next Generation

Feb 29, 2024

Nemotron-4 15B Technical Report

Feb 27, 2024

Adding Instructions during Pretraining: Effective Way of Controlling Toxicity in Language Models

Feb 14, 2023

Evaluating Parameter Efficient Learning for Generation

Oct 25, 2022

Context Generation Improves Open Domain Question Answering

Oct 12, 2022