The rapid spread of misinformation through social media platforms has raised concerns regarding its impact on public opinion. While misinformation is prevalent in other languages, the majority of research in this field has concentrated on the English language. Hence, there is a scarcity of datasets for other languages, including Turkish. To address this concern, we have introduced the FCTR dataset, consisting of 3238 real-world claims. This dataset spans multiple domains and incorporates evidence collected from three Turkish fact-checking organizations. Additionally, we aim to assess the effectiveness of cross-lingual transfer learning for low-resource languages, with a particular focus on Turkish. We demonstrate in-context learning (zero-shot and few-shot) performance of large language models in this context. The experimental results indicate that the dataset has the potential to advance research in the Turkish language.
The rapid dissemination of misinformation through social media increased the importance of automated fact-checking. Furthermore, studies on what deep neural models pay attention to when making predictions have increased in recent years. While significant progress has been made in this field, it has not yet reached a level of reasoning comparable to human reasoning. To address these gaps, we propose a multi-task explainable neural model for misinformation detection. Specifically, this work formulates an explanation generation process of the model's veracity prediction as a text summarization problem. Additionally, the performance of the proposed model is discussed on publicly available datasets and the findings are evaluated with related studies.
This work investigates segmentation approaches for sentiment analysis on informal short texts in Turkish. The two building blocks of the proposed work are segmentation and deep neural network model. Segmentation focuses on preprocessing of text with different methods. These methods are grouped in four: morphological, sub-word, tokenization, and hybrid approaches. We analyzed several variants for each of these four methods. The second stage focuses on evaluation of the neural model for sentiment analysis. The performance of each segmentation method is evaluated under Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) model proposed in the literature for sentiment classification.
The vocabulary mismatch problem is a long-standing problem in information retrieval. Semantic matching holds the promise of solving the problem. Recent advances in language technology have given rise to unsupervised neural models for learning representations of words as well as bigger textual units. Such representations enable powerful semantic matching methods. This survey is meant as an introduction to the use of neural models for semantic matching. To remain focused we limit ourselves to web search. We detail the required background and terminology, a taxonomy grouping the rapidly growing body of work in the area, and then survey work on neural models for semantic matching in the context of three tasks: query suggestion, ad retrieval, and document retrieval. We include a section on resources and best practices that we believe will help readers who are new to the area. We conclude with an assessment of the state-of-the-art and suggestions for future work.