Reddit Corpus


HumanLLM: Towards Personalized Understanding and Simulation of Human Nature

Jan 22, 2026

Introducing OmniGEC: A Silver Multilingual Dataset for Grammatical Error Correction

Sep 18, 2025

Assessing how hyperparameters impact Large Language Models' sarcasm detection performance

Apr 08, 2025

Worse than Zero-shot? A Fact-Checking Dataset for Evaluating the Robustness of RAG Against Misleading Retrievals

Feb 22, 2025

Reddit is all you need: Authorship profiling for Romanian

Oct 13, 2024

A Few Hypocrites: Few-Shot Learning and Subtype Definitions for Detecting Hypocrisy Accusations in Online Climate Change Debates

Sep 25, 2024

LGDE: Local Graph-based Dictionary Expansion

May 13, 2024

Cost-Efficient Subjective Task Annotation and Modeling through Few-Shot Annotator Adaptation

Feb 21, 2024

COMMUNITY-CROSS-INSTRUCT: Unsupervised Instruction Generation for Aligning Large Language Models to Online Communities

Jun 17, 2024

Polarization and Morality: Lexical Analysis of Abortion Discourse on Reddit

Jun 29, 2024