Reddit Corpus


Introducing OmniGEC: A Silver Multilingual Dataset for Grammatical Error Correction

Sep 18, 2025
Via arXiv

Assessing how hyperparameters impact Large Language Models' sarcasm detection performance

Apr 08, 2025
Via arXiv

Worse than Zero-shot? A Fact-Checking Dataset for Evaluating the Robustness of RAG Against Misleading Retrievals

Feb 22, 2025
Via arXiv

Reddit is all you need: Authorship profiling for Romanian

Oct 13, 2024
Via arXiv

A Few Hypocrites: Few-Shot Learning and Subtype Definitions for Detecting Hypocrisy Accusations in Online Climate Change Debates

Sep 25, 2024
Via arXiv

LGDE: Local Graph-based Dictionary Expansion

May 13, 2024
Via arXiv

Polarization and Morality: Lexical Analysis of Abortion Discourse on Reddit

Jun 29, 2024
Via arXiv

COMMUNITY-CROSS-INSTRUCT: Unsupervised Instruction Generation for Aligning Large Language Models to Online Communities

Jun 17, 2024
Via arXiv

Cost-Efficient Subjective Task Annotation and Modeling through Few-Shot Annotator Adaptation

Feb 21, 2024
Via arXiv

Performance evaluation of Reddit Comments using Machine Learning and Natural Language Processing methods in Sentiment Analysis

May 28, 2024
Via arXiv