Abstract: Volunteer moderators play a crucial role in sustaining online dialogue, but they often disagree about what should or should not be allowed. In this paper, we study the complexity of content moderation with a focus on disagreements between moderators, which we term the ``gray area'' of moderation. Leveraging five years of data comprising 4.3 million moderation log entries from 24 subreddits of varying topics and sizes, we characterize how gray area cases, i.e., disputed cases, differ from undisputed ones. We show that one in seven moderation cases is disputed among moderators, often involving transgressions where users' intent is not directly legible, such as trolling and brigading, as well as tensions around community governance. This is concerning, as almost half of all gray area cases involve automated moderation decisions. Through information-theoretic evaluations, we demonstrate that gray area cases are inherently harder to adjudicate than undisputed cases and show that state-of-the-art language models struggle to adjudicate them. We highlight the key role of expert human moderators in overseeing the moderation process and provide insights into the challenges of current moderation processes and tools.
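
As a rough illustration of the kind of information-theoretic comparison this abstract describes, the sketch below contrasts the entropy of moderators' votes on disputed versus undisputed cases. The field names (`votes`, `disputed`) and the entropy-of-votes measure are illustrative assumptions, not the paper's actual pipeline.

```python
# Minimal sketch (assumed data layout): compare label uncertainty for disputed
# vs. undisputed moderation cases via the Shannon entropy of moderators' votes.
from collections import Counter
from math import log2

def label_entropy(votes):
    """Shannon entropy (bits) of remove/approve votes cast on one case."""
    counts = Counter(votes)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())

# Toy examples standing in for the moderation log entries.
cases = [
    {"votes": ["remove", "approve", "remove"], "disputed": True},
    {"votes": ["remove", "remove", "remove"], "disputed": False},
]

for group in (True, False):
    entropies = [label_entropy(c["votes"]) for c in cases if c["disputed"] == group]
    mean = sum(entropies) / len(entropies)
    print(f"disputed={group}: mean vote entropy = {mean:.2f} bits")
```

Higher mean entropy for disputed cases would indicate that moderators' judgments are intrinsically less predictable there, consistent with the claim that gray area cases are harder to adjudicate.
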
Abstract: Generative search engines are reshaping information access by replacing traditional ranked lists with synthesized answers and references. In parallel, with the growth of Web3 platforms, incentive-driven creator ecosystems have become an essential part of how enterprises build visibility and community by rewarding creators for contributing to shared narratives. However, the extent to which exposure in generative search engine citations is shaped by external attention markets remains uncertain. In this study, we audit citation exposure for 44 Web3 enterprises. First, we show that the creator community around each enterprise is persistent over time. Second, enterprise-specific queries reveal that more popular voices systematically receive greater citation exposure than others. Third, we find that larger follower bases and more concentrated creator cores are associated with higher-ranked exposure. Together, these results show that generative search engine citations exhibit exposure bias toward already prominent voices, which risks entrenching incumbents and narrowing viewpoint diversity.
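
A minimal sketch of one way such an exposure audit could be quantified, assuming per-creator follower counts and citation ranks are available; the data fields and the Spearman-correlation check are illustrative assumptions, not the study's exact methodology.

```python
# Minimal sketch (assumed schema): does creator popularity correlate with
# citation exposure in generative search engine answers?
from scipy.stats import spearmanr

# Each record: one creator cited in response to an enterprise-specific query.
citations = [
    {"followers": 250_000, "citation_rank": 1},
    {"followers": 120_000, "citation_rank": 2},
    {"followers": 35_000,  "citation_rank": 4},
    {"followers": 900,     "citation_rank": 9},
]

followers = [c["followers"] for c in citations]
# Negate the rank so that a larger score means more prominent exposure (rank 1 is best).
exposure = [-c["citation_rank"] for c in citations]

rho, p_value = spearmanr(followers, exposure)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")
```

A strongly positive rank correlation of this kind would be one concrete signature of the exposure bias toward already prominent voices described above.
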




Abstract: Large Language Models (LLMs) are highly sensitive to variations in prompt formulation, which can significantly impact their ability to generate accurate responses. In this paper, we introduce a new task, Prompt Sensitivity Prediction, and a dataset, PromptSET, designed to investigate the effects of slight prompt variations on LLM performance. Using the TriviaQA and HotpotQA datasets as the foundation of our work, we generate prompt variations and evaluate their effectiveness across multiple LLMs. We benchmark the prompt sensitivity prediction task using state-of-the-art methods from related tasks, including LLM-based self-evaluation, text classification, and query performance prediction techniques. Our findings reveal that existing methods struggle to effectively address prompt sensitivity prediction, underscoring the need to understand how information needs should be phrased to elicit accurate LLM responses.
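
To make the notion of prompt sensitivity concrete, the following sketch scores the spread of correctness across paraphrases of one question. The `ask_llm` placeholder and the hand-written variants are assumptions for illustration; PromptSET's actual construction and evaluation are more involved.

```python
# Minimal sketch (assumed setup): quantify prompt sensitivity as the spread of
# per-variant correctness for paraphrases of the same underlying question.
from statistics import pstdev

def ask_llm(prompt: str) -> str:
    """Placeholder for a call to the LLM under evaluation."""
    raise NotImplementedError

def prompt_sensitivity(variants: list[str], gold_answer: str) -> float:
    """Population std. dev. of correctness across variants; 0 = insensitive."""
    scores = [float(gold_answer.lower() in ask_llm(v).lower()) for v in variants]
    return pstdev(scores)

# Hypothetical paraphrases of a TriviaQA-style question.
variants = [
    "Who wrote Pride and Prejudice?",
    "Pride and Prejudice was written by which author?",
    "Name the author of the novel Pride and Prejudice.",
]
# sensitivity = prompt_sensitivity(variants, gold_answer="Jane Austen")
```

Under this framing, the prediction task asks whether such sensitivity can be estimated from the prompt text alone, before any of the variants are actually run through the model.
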