Kushal Tirumala

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Aug 20, 2024
Via arXiv

Brevity is the soul of wit: Pruning long files for code generation

Jun 29, 2024
Via arXiv

An Introduction to Vision-Language Modeling

May 27, 2024
Via arXiv

Text Quality-Based Pruning for Efficient Training of Language Models

Apr 26, 2024
Via arXiv

The Unreasonable Ineffectiveness of the Deeper Layers

Mar 26, 2024
Via arXiv

Effective pruning of web-scale datasets based on complexity of concept clusters

Jan 09, 2024
Via arXiv

Decoding Data Quality via Synthetic Corruptions: Embedding-guided Pruning of Code Data

Dec 05, 2023
Via arXiv

D4: Improving LLM Pretraining via Document De-Duplication and Diversification

Aug 23, 2023
Via arXiv

SemDeDup: Data-efficient learning at web-scale through semantic deduplication

Mar 22, 2023
Via arXiv

Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models

May 22, 2022
Via arXiv