Luca Soldaini

Amazon Alexa Search

DataDecide: How to Predict Best Pretraining Data with Small Experiments

Apr 15, 2025, via arXiv

OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens

Apr 09, 2025, via arXiv

Automatic Detection of Research Values from Scientific Abstracts Across Computer Science Subfields

Feb 26, 2025, via arXiv

olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models

Feb 25, 2025, via arXiv

mFollowIR: a Multilingual Benchmark for Instruction Following in Retrieval

Jan 31, 2025, via arXiv

DrawEduMath: Evaluating Vision Language Models with Expert-Annotated Students' Hand-Drawn Math Images

Jan 24, 2025, via arXiv

2 OLMo 2 Furious

Dec 31, 2024, via arXiv

Establishing Task Scaling Laws via Compute-Efficient Model Ladders

Dec 05, 2024, via arXiv

TÜLU 3: Pushing Frontiers in Open Language Model Post-Training

Nov 22, 2024, via arXiv

OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs

Nov 21, 2024, via arXiv