Elie Bakouch

The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text

Jun 05, 2025
Via arXiv

SmolVLM: Redefining small and efficient multimodal models

Apr 07, 2025
Via arXiv

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Feb 04, 2025
Via arXiv

Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations

May 29, 2024
Via arXiv