Abstract:We present our shared task on evaluating the adaptability of LLMs and NLP systems across multiple languages and cultures. The task data consist of an extended version of our manually constructed BLEnD benchmark (Myung et al. 2024), covering more than 30 language-culture pairs, predominantly representing low-resource languages spoken across multiple continents. As the task is designed strictly for evaluation, participants were not permitted to use the data for training, fine-tuning, few-shot learning, or any other form of model modification. Our task includes two tracks: (a) Short-Answer Questions (SAQ) and (b) Multiple-Choice Questions (MCQ). Participants were required to predict labels and were allowed to submit any NLP system and adopt diverse modelling strategies, provided that the benchmark was used solely for evaluation. The task attracted more than 140 registered participants, and we received final submissions from 62 teams, along with 19 system description papers. We report the results and present an analysis of the best-performing systems and the most commonly adopted approaches. Furthermore, we discuss shared insights into open questions and challenges related to evaluation, misalignment, and methodological perspectives on model behaviour in low-resource languages and for under-represented cultures.



Abstract:In earlier work, we have shown that articulation rate in Swedish child-directed speech (CDS) increases as a function of the age of the child, even when utterance length and differences in articulation rate between subjects are controlled for. In this paper we show on utterance level in spontaneous Swedish speech that i) for the youngest children, articulation rate in CDS is lower than in adult-directed speech (ADS), ii) there is a significant negative correlation between articulation rate and surprisal (the negative log probability) in ADS, and iii) the increase in articulation rate in Swedish CDS as a function of the age of the child holds, even when surprisal along with utterance length and differences in articulation rate between speakers are controlled for. These results indicate that adults adjust their articulation rate to make it fit the linguistic capacity of the child.