Sebastian Ruder

The Sparse Frontier: Sparse Attention Trade-offs in Transformer LLMs

Apr 24, 2025
Via arXiv

A Post-trainer's Guide to Multilingual Training Data: Uncovering Cross-lingual Transfer Dynamics

Apr 23, 2025
Via arXiv

AL-QASIDA: Analyzing LLM Quality and Accuracy Systematically in Dialectal Arabic

Dec 05, 2024
Via arXiv

M-RewardBench: Evaluating Reward Models in Multilingual Settings

Oct 20, 2024
Via arXiv

BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts

Aug 15, 2024
Via arXiv

How Does Quantization Affect Multilingual LLMs?

Jul 03, 2024
Via arXiv

LLM See, LLM Do: Guiding Data Generation to Target Non-Differentiable Objectives

Jul 01, 2024
Via arXiv

Understanding and Mitigating Language Confusion in LLMs

Jun 28, 2024
Via arXiv

SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages

Jun 14, 2024
Via arXiv

Aya 23: Open Weight Releases to Further Multilingual Progress

May 23, 2024
Via arXiv