Frederic Sala

LiveResearchBench: A Live Benchmark for User-Centric Deep Research in the Wild

Oct 16, 2025
Via arXiv

Test-time Scaling Techniques in Theoretical Physics -- A Comparison of Methods on the TPBench Dataset

Jun 25, 2025
Via arXiv

Time To Impeach LLM-as-a-Judge: Programs are the Future of Evaluation

Jun 12, 2025
Via arXiv

Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning

Jun 05, 2025
Via arXiv

R&B: Domain Regrouping and Data Mixture Balancing for Efficient Foundation Model Training

May 01, 2025
Via arXiv

COSMOS: Predictable and Cost-Effective Adaptation of LLMs

Apr 30, 2025
Via arXiv

TARDIS: Mitigating Temporal Misalignment via Representation Steering

Mar 25, 2025
Via arXiv

Personalize Your LLM: Fake it then Align it

Mar 05, 2025
Via arXiv

Tabby: Tabular Data Synthesis with Language Models

Mar 04, 2025
Via arXiv

Theoretical Physics Benchmark (TPBench) -- a Dataset and Study of AI Reasoning Capabilities in Theoretical Physics

Feb 19, 2025
Via arXiv