Taylor Arnold

Cultural Evaluations of Vision-Language Models Have a Lot to Learn from Cultural Theory

May 28, 2025

Srishti Yadav, Lauren Tilton, Maria Antoniak, Taylor Arnold, Jiaang Li, Siddhesh Milind Pawar, Antonia Karamolegkou, Stella Frank, Zhaochong An, Negar Rostamzadeh (+3 more)

Abstract: Modern vision-language models (VLMs) often fail at cultural competency evaluations and benchmarks. Given the diversity of applications built upon VLMs, there is renewed interest in understanding how they encode cultural nuances. While individual aspects of this problem have been studied, we still lack a comprehensive framework for systematically identifying and annotating the nuanced cultural dimensions present in images for VLMs. This position paper argues that foundational methodologies from visual culture studies (cultural studies, semiotics, and visual studies) are necessary for cultural analysis of images. Building upon this review, we propose a set of five frameworks, corresponding to cultural dimensions, that must be considered for a more complete analysis of the cultural competencies of VLMs.

Explainable Search and Discovery of Visual Cultural Heritage Collections with Multimodal Large Language Models

Nov 07, 2024

Taylor Arnold, Lauren Tilton

Abstract: Many cultural institutions have made large digitized visual collections available online, often under permissible re-use licences. Creating interfaces for exploring and searching these collections is difficult, particularly in the absence of granular metadata. In this paper, we introduce a method for using state-of-the-art multimodal large language models (LLMs) to enable an open-ended, explainable search and discovery interface for visual collections. We show how our approach can create novel clustering and recommendation systems that avoid common pitfalls of methods based directly on visual embeddings. Of particular interest is the ability to offer concrete textual explanations of each recommendation without the need to preselect the features of interest. Together, these features can create a digital interface that is more open-ended and flexible while also being better suited to addressing privacy and ethical concerns. Through a case study using a collection of documentary photographs, we provide several metrics showing the efficacy and possibilities of our approach.

* 16 pages, CHR 2024: Computational Humanities Research Conference, December 4 - 6, 2024, Aarhus University, Denmark

Automated Image Color Mapping for a Historic Photographic Collection

Nov 07, 2024

Taylor Arnold, Lauren Tilton

Abstract: In the 1970s, the United States Environmental Protection Agency sponsored Documerica, a large-scale photography initiative to document environmental subjects nation-wide. While over 15,000 digitized public-domain photographs from the collection are available online, most of the images were scanned from damaged copies of the original prints. We present and evaluate a modified histogram matching technique based on the underlying chemistry of the prints for correcting the damaged images by using training data collected from a small set of undamaged prints. The entire set of color-adjusted Documerica images is made available in an open repository.

* 11 pages, CHR 2024: Computational Humanities Research Conference, December 4 - 6, 2024, Aarhus University, Denmark
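
For readers unfamiliar with the baseline that the abstract above builds on, the following is a minimal sketch of classic single-channel histogram matching. It is not the paper's modified, chemistry-based technique, which the abstract does not spell out. The function names `cdf` and `match_histogram`, the flat-list pixel representation, and the 256-level default are assumptions made only for illustration.

```python
from bisect import bisect_left
from itertools import accumulate


def cdf(values, levels=256):
    """Cumulative distribution of integer intensities in [0, levels)."""
    counts = [0] * levels
    for v in values:
        counts[v] += 1
    total = len(values)
    return [c / total for c in accumulate(counts)]


def match_histogram(source, reference, levels=256):
    """Remap source intensities so their distribution approximates the reference.

    Hypothetical helper: `source` and `reference` are flat lists of integer
    intensities for one color channel.
    """
    src_cdf = cdf(source, levels)
    ref_cdf = cdf(reference, levels)
    # For each source level, take the first reference level whose CDF
    # reaches the source level's CDF value.
    lookup = [min(bisect_left(ref_cdf, p), levels - 1) for p in src_cdf]
    return [lookup[v] for v in source]


# A dark channel is pushed toward the brighter reference distribution.
damaged = [0, 0, 1, 1]
undamaged = [10, 10, 20, 20]
corrected = match_histogram(damaged, undamaged)
```

For a color image, a sketch like this would typically be applied to each channel separately, with the reference distribution pooled from the undamaged examples.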

Cross-Discourse and Multilingual Exploration of Textual Corpora with the DualNeighbors Algorithm

Jun 28, 2018

Taylor Arnold, Lauren Tilton

Abstract: Word choice is dependent on the cultural context of writers and their subjects. Different words are used to describe similar actions, objects, and features based on factors such as class, race, gender, geography and political affinity. Exploratory techniques based on locating and counting words may, therefore, lead to conclusions that reinforce culturally inflected boundaries. We offer a new method, the DualNeighbors algorithm, for linking thematically similar documents both within and across discursive and linguistic barriers to reveal cross-cultural connections. Qualitative and quantitative evaluations of this technique are shown as applied to two cultural datasets of interest to researchers across the humanities and social sciences. An open-source implementation of the DualNeighbors algorithm is provided to assist in its application.

* Chosen for oral presentation at 2nd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2018)

Predicting CEFRL levels in learner English on the basis of metrics and full texts

Jun 28, 2018

Taylor Arnold, Nicolas Ballier, Thomas Gaillat, Paula Lissòn

Abstract: This paper analyses the contribution of language metrics and, potentially, of linguistic structures, to classify French learners of English according to levels of the Common European Framework of Reference for Languages (CEFRL). The purpose is to build a model for the prediction of learner levels as a function of language complexity features. We used the EFCAMDAT corpus, a database of one million written assignments by learners. After applying language complexity metrics on the texts, we built a representation matching the language metrics of the texts to their assigned CEFRL levels. Lexical and syntactic metrics were computed with LCA, LSA, and koRpus. Several supervised learning models were built by using Gradient Boosted Trees and Keras Neural Network methods and by contrasting pairs of CEFRL levels. Results show that it is possible to implement pairwise distinctions, especially for levels ranging from A1 to B1 (A1=>A2: 0.916 AUC and A2=>B1: 0.904 AUC). Model explanation reveals significant linguistic features for the predictiveness in the corpus. Word tokens and word types appear to play a significant role in determining levels. This shows that levels are highly dependent on specific semantic profiles.

* Conference paper presented at Conférence sur l'Apprentissage Automatique (CAp) 2018

A Tidy Data Model for Natural Language Processing using cleanNLP

May 03, 2018

Taylor Arnold

Abstract: The package cleanNLP provides a set of fast tools for converting a textual corpus into a set of normalized tables. The underlying natural language processing pipeline utilizes Stanford's CoreNLP library, exposing a number of annotation tasks for text written in English, French, German, and Spanish. Annotators include tokenization, part of speech tagging, named entity recognition, entity linking, sentiment analysis, dependency parsing, coreference resolution, and information extraction.

* The R Journal, 9.2, 248-267 (2017)
* 20 pages; 4 figures

Efficient Implementations of the Generalized Lasso Dual Path Algorithm

Nov 03, 2014

Taylor Arnold, Ryan Tibshirani

Abstract: We consider efficient implementations of the generalized lasso dual path algorithm of Tibshirani and Taylor (2011). We first describe a generic approach that covers any penalty matrix D and any (full column rank) matrix X of predictor variables. We then describe fast implementations for the special cases of trend filtering problems, fused lasso problems, and sparse fused lasso problems, both with X=I and a general matrix X. These specialized implementations offer a considerable improvement over the generic implementation, both in terms of numerical stability and efficiency of the solution path computation. These algorithms are all available for use in the genlasso R package, which can be found in the CRAN repository.
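
For orientation, the generalized lasso problem that the abstract above refers to is usually written as follows. The symbols y (response vector), β (coefficients), and λ (tuning parameter) do not appear in the abstract and are standard notation assumed here; `\operatorname*` requires the amsmath package.

```latex
\hat{\beta}(\lambda) = \operatorname*{arg\,min}_{\beta \in \mathbb{R}^p}
  \; \frac{1}{2} \lVert y - X\beta \rVert_2^2 + \lambda \lVert D\beta \rVert_1
```

As one example of a special case, the one-dimensional fused lasso with X = I takes D to be the first-difference matrix, so the penalty becomes the sum of |β_{i+1} − β_i| over adjacent coordinates.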
