Catherine Arnett

Different Tokenization Schemes Lead to Comparable Performance in Spanish Number Agreement

Mar 20, 2024
Catherine Arnett, Pamela D. Rivière, Tyler A. Chang, Sean Trott

A Bit of a Problem: Measurement Disparities in Dataset Sizes Across Languages

Mar 01, 2024
Catherine Arnett, Tyler A. Chang, Benjamin K. Bergen

When Is Multilinguality a Curse? Language Modeling for 250 High- and Low-Resource Languages

Nov 15, 2023
Tyler A. Chang, Catherine Arnett, Zhuowen Tu, Benjamin K. Bergen

Structural Priming Demonstrates Abstract Grammatical Representations in Multilingual Language Models

Nov 15, 2023
James A. Michaelov, Catherine Arnett, Tyler A. Chang, Benjamin K. Bergen

Crosslingual Structural Priming and the Pre-Training Dynamics of Bilingual Language Models

Oct 11, 2023
Catherine Arnett, Tyler A. Chang, James A. Michaelov, Benjamin K. Bergen
