Abstract: This study investigates a specific form of positional bias, termed the Myopic Trap, in which retrieval models disproportionately attend to the early parts of documents while overlooking relevant information that appears later. To systematically quantify this phenomenon, we propose a semantics-preserving evaluation framework that repurposes existing NLP datasets into position-aware retrieval benchmarks. By evaluating state-of-the-art models across the full retrieval pipeline, including BM25, embedding models, ColBERT-style late-interaction models, and reranker models, we offer a broader empirical perspective on positional bias than prior work. Experimental results show that embedding models and ColBERT-style models exhibit significant performance degradation when query-related content is shifted toward later positions, indicating a pronounced head bias. Notably, under the same training configuration, the ColBERT-style approach shows greater potential for mitigating positional bias than the traditional single-vector approach. In contrast, BM25 and reranker models remain largely unaffected by such perturbations, underscoring their robustness to positional bias. Code and data are publicly available at: www.github.com/NovaSearch-Team/RAG-Retrieval.
Abstract: The advent of internet medicine gives patients unprecedented convenience in searching for and communicating with doctors relevant to their diseases and desired treatments online. However, current doctor recommendation systems fail to fully ensure the professionalism and interpretability of the recommended results. In this work, we formulate doctor recommendation as a ranking task and develop a large language model (LLM)-based pointwise ranking framework. Our framework ranks doctors according to their relevance to specific disease-treatment pairs in a zero-shot setting. The advantage of our framework lies in its ability to generate precise and explainable doctor ranking results. Additionally, we construct DrRank, a new expertise-driven doctor ranking dataset comprising over 38 disease-treatment pairs. Experimental results on the DrRank dataset demonstrate that our framework significantly outperforms the strongest cross-encoder baseline, achieving a gain of +5.45 in NDCG@10 while maintaining affordable latency. Furthermore, we present a comprehensive fairness analysis of our framework from three perspectives: disease, patient gender, and geographical region. Finally, the interpretability of our framework is verified by three human experts, providing further evidence of its reliability for doctor recommendation.
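The NDCG@10 score reported above can be illustrated with a minimal sketch. It assumes graded relevance labels listed in the order a ranker returned the doctors, and it uses the linear-gain form of DCG; the abstract does not state which gain function the authors use, so that choice, along with the function names `dcg_at_k` and `ndcg_at_k`, is an assumption made here for illustration.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the top-k items (linear gain, log2 discount)."""
    # enumerate() is 0-based, so the item at 1-based rank r is discounted by log2(r + 1).
    return sum(rel / math.log2(idx + 2) for idx, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG of the given ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        # No relevant item at all: define the score as 0 to avoid dividing by zero.
        return 0.0
    return dcg_at_k(relevances, k) / ideal


# Hypothetical relevance labels for five ranked doctors (illustrative values only).
ranked_labels = [2, 3, 0, 1, 0]
score = ndcg_at_k(ranked_labels, k=10)
```

A ranking that already places the most relevant doctors first scores 1.0, and any misordering lowers the score, with errors near the top of the list penalized most.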
Abstract: Objective: Recognizing retinal fundus vessel abnormalities is vital to the early diagnosis of ophthalmological diseases and cardiovascular events. However, segmentation results are highly influenced by elusive thin vessels. In this work, we present a synthetic network comprising a symmetric equilibrium generative adversarial network (SEGAN), multi-scale features refine blocks (MSFRB), and an attention mechanism (AM) to enhance vessel segmentation performance, especially for thin vessels. Method: The proposed network has a powerful multi-scale representation capability. First, SEGAN is proposed to construct a symmetric adversarial architecture, which forces the generator to produce more realistic images with local details. Second, MSFRB are devised to prevent high-resolution features from being obscured, thereby preserving multi-scale features. Finally, the AM is employed to encourage the network to concentrate on discriminative features. Results: On the public datasets DRIVE, STARE, and CHASEDB1, we evaluate our network quantitatively and compare it with state-of-the-art works. The ablation experiment shows that SEGAN, MSFRB, and AM all contribute to the performance of our network. Conclusion: The proposed network outperforms other strategies and is effective at segmenting elusive vessels, achieving the highest scores in Sensitivity, G-Mean, Precision, and F1-Score while remaining at the top level in other metrics. Significance: The strong performance and high computational efficiency offer great potential for clinical retinal vessel segmentation. The network could also be used to extract detailed information in other biomedical problems.
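The metrics named in the conclusion can be computed from pixel-level confusion counts, as in the sketch below. It assumes the standard definitions, with G-Mean taken as the geometric mean of sensitivity and specificity; the abstract does not spell out its formulas, and the function name `segmentation_metrics` and the example counts are hypothetical.

```python
import math


def segmentation_metrics(tp, fp, tn, fn):
    """Return common binary segmentation metrics from confusion-matrix counts."""
    # Each ratio falls back to 0.0 when its denominator is zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on vessel pixels
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on background pixels
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    # Assumed definition: geometric mean of sensitivity and specificity.
    g_mean = math.sqrt(sensitivity * specificity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
        "g_mean": g_mean,
    }


# Hypothetical pixel counts, for illustration only.
metrics = segmentation_metrics(tp=8, fp=2, tn=88, fn=2)
```

Because vessel pixels are a small minority of a fundus image, sensitivity and G-Mean reflect thin-vessel recovery more directly than plain accuracy, which is dominated by the background class.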