Topic


Auditing Preferences for Brands and Cultures in LLMs

Mar 18, 2026
Via arXiv

Modeling Changing Scientific Concepts with Complex Networks: A Case Study on the Chemical Revolution

Mar 18, 2026

GRAFITE: Generative Regression Analysis Framework for Issue Tracking and Evaluation

Mar 18, 2026

Discovering Decoupled Functional Modules in Large Language Models

Mar 18, 2026

IndicSafe: A Benchmark for Evaluating Multilingual LLM Safety in South Asia

Mar 18, 2026

From Noise to Signal: When Outliers Seed New Topics

Mar 18, 2026

How LLMs Distort Our Written Language

Mar 18, 2026

VeriGrey: Greybox Agent Validation

Mar 18, 2026

The Validity Gap in Health AI Evaluation: A Cross-Sectional Analysis of Benchmark Composition

Mar 18, 2026

Sensi: Learn One Thing at a Time -- Curriculum-Based Test-Time Learning for LLM Game Agents

Mar 18, 2026