Abstract:Large Language Models (LLMs) have fundamentally transformed the field of natural language processing; however, their vulnerability to biases presents a notable obstacle that threatens both fairness and trust. This review offers an extensive analysis of the bias landscape in LLMs, tracing its roots and expressions across various NLP tasks. Biases are classified into implicit and explicit types, with particular attention given to their emergence from data sources, architectural designs, and contextual deployments. This study advances beyond theoretical analysis by implementing a simulation framework designed to evaluate bias mitigation strategies in practice. The framework integrates multiple approaches including data curation, debiasing during model training, and post-hoc output calibration and assesses their impact in controlled experimental settings. In summary, this work not only synthesizes existing knowledge on bias in LLMs but also contributes original empirical validation through simulation of mitigation strategies.
Abstract:Clustering algorithms are widely used in various applications, with density-based methods such as Density-Based Spatial Clustering of Applications with Noise (DBSCAN) being particularly prominent. These algorithms identify clusters in high-density regions while treating sparser areas as noise. However, reliance on user-defined parameters often poses optimization challenges that require domain expertise. This paper presents a novel density-based clustering method inspired by the concept of selective attention, which minimizes the need for user-defined parameters under standard conditions. Initially, the algorithm operates without requiring user-defined parameters. If parameter adjustment is needed, the method simplifies the process by introducing a single integer parameter that is straightforward to tune. The approach computes a threshold to filter out the most sparsely distributed points and outliers, forms a preliminary cluster structure, and then reintegrates the excluded points to finalize the results. Experimental evaluations on diverse data sets highlight the accessibility and robust performance of the method, providing an effective alternative for density-based clustering tasks.