Seraphina Goldfarb-Tarrant

The Multilingual Divide and Its Impact on Global AI Safety

May 27, 2025
Via arXiv

MAPS: A Multilingual Benchmark for Global Agent Performance and Security

May 21, 2025
Via arXiv

Command A: An Enterprise-Ready Large Language Model

Apr 01, 2025
Via arXiv

Safer or Luckier? LLMs as Safety Evaluators Are Not Robust to Artifacts

Mar 12, 2025
Via arXiv

Who Does the Giant Number Pile Like Best: Analyzing Fairness in Hiring Contexts

Jan 08, 2025
Via arXiv

Mix Data or Merge Models? Optimizing for Diverse Multi-Task Learning

Oct 14, 2024
Via arXiv

The Multilingual Alignment Prism: Aligning Global and Local Preferences to Reduce Harm

Jun 26, 2024
Via arXiv

A SMART Mnemonic Sounds like "Glue Tonic": Mixing LLMs with Student Feedback to Make Mnemonic Learning Stick

Jun 21, 2024
Via arXiv

MultiContrievers: Analysis of Dense Retrieval Representations

Feb 24, 2024
Via arXiv

This Prompt is Measuring <MASK>: Evaluating Bias Evaluation in Language Models

May 22, 2023
Via arXiv