Abstract:Claim normalization is an integral part of any automatic fact-check verification system. It parses the typically noisy claim data, such as social media posts into normalized claims, which are then fed into downstream veracity classification tasks. The CheckThat! 2025 Task 2 focuses specifically on claim normalization and spans 20 languages under monolingual and zero-shot conditions. Our proposed solution consists of a lightweight \emph{retrieval-first, LLM-backed} pipeline, in which we either dynamically prompt a GPT-4o-mini with in-context examples, or retrieve the closest normalization from the train dataset directly. On the official test set, the system ranks near the top for most monolingual tracks, achieving first place in 7 out of of the 13 languages. In contrast, the system underperforms in the zero-shot setting, highlighting the limitation of the proposed solution.
Abstract:Numerical claims, statements involving quantities, comparisons, and temporal references, pose unique challenges for automated fact-checking systems. In this study, we evaluate modeling strategies for veracity prediction of such claims using the QuanTemp dataset and building our own evidence retrieval pipeline. We investigate three key factors: (1) the impact of more evidences with longer input context windows using ModernBERT, (2) the effect of right-to-left (R2L) tokenization, and (3) their combined influence on classification performance. Contrary to prior findings in arithmetic reasoning tasks, R2L tokenization does not boost natural language inference (NLI) of numerical tasks. A longer context window does also not enhance veracity performance either, highlighting evidence quality as the dominant bottleneck. Our best-performing system achieves competitive macro-average F1 score of 0.57 and places us among the Top-4 submissions in Task 3 of CheckThat! 2025. Our code is available at https://github.com/dsgt-arc/checkthat-2025-numerical.
Abstract:Amyotrophic Lateral Sclerosis (ALS) is characterized as a rapidly progressive neurodegenerative disease that presents individuals with limited treatment options in the realm of medical interventions and therapies. The disease showcases a diverse range of onset patterns and progression trajectories, emphasizing the critical importance of early detection of functional decline to enable tailored care strategies and timely therapeutic interventions. The present investigation, spearheaded by the iDPP@CLEF 2024 challenge, focuses on utilizing sensor-derived data obtained through an app. This data is used to construct various machine learning models specifically designed to forecast the advancement of the ALS Functional Rating Scale-Revised (ALSFRS-R) score, leveraging the dataset provided by the organizers. In our analysis, multiple predictive models were evaluated to determine their efficacy in handling ALS sensor data. The temporal aspect of the sensor data was compressed and amalgamated using statistical methods, thereby augmenting the interpretability and applicability of the gathered information for predictive modeling objectives. The models that demonstrated optimal performance were a naive baseline and ElasticNet regression. The naive model achieved a Mean Absolute Error (MAE) of 0.20 and a Root Mean Square Error (RMSE) of 0.49, slightly outperforming the ElasticNet model, which recorded an MAE of 0.22 and an RMSE of 0.50. Our comparative analysis suggests that while the naive approach yielded marginally better predictive accuracy, the ElasticNet model provides a robust framework for understanding feature contributions.