Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Rongbin Li

MedProbeBench: Systematic Benchmarking at Deep Evidence Integration for Expert-level Medical Guideline

Apr 20, 2026

Jiyao Liu, Jianghan Shen, Sida Song, Tianbin Li, Xiaojia Liu, Rongbin Li, Ziyan Huang, Jiashi Lin, Junzhi Ning, Changkai Ji(+12 more)

Abstract:Recent advances in deep research systems enable large language models to retrieve, synthesize, and reason over large-scale external knowledge. In medicine, developing clinical guidelines critically depends on such deep evidence integration. However, existing benchmarks fail to evaluate this capability in realistic workflows requiring multi-step evidence integration and expert-level judgment. To address this gap, we introduce MedProbeBench, the first benchmark leveraging high-quality clinical guidelines as expert-level references. Medical guidelines, with their rigorous standards in neutrality and verifiability, represent the pinnacle of medical expertise and pose substantial challenges for deep research agents. For evaluation, we propose MedProbe-Eval, a comprehensive evaluation framework featuring: (1) Holistic Rubrics with 1,200+ task-adaptive rubric criteria for comprehensive quality assessment, and (2) Fine-grained Evidence Verification for rigorous validation of evidence precision, grounded in 5,130+ atomic claims. Evaluation of 17 LLMs and deep research agents reveals critical gaps in evidence integration and guideline generation, underscoring the substantial distance between current capabilities and expert-level clinical guideline development. Project: https://github.com/uni-medical/MedProbeBench

Via

Access Paper or Ask Questions

Weighted Diversified Sampling for Efficient Data-Driven Single-Cell Gene-Gene Interaction Discovery

Oct 21, 2024

Yifan Wu, Yuntao Yang, Zirui Liu, Zhao Li, Khushbu Pahwa, Rongbin Li, Wenjin Zheng, Xia Hu, Zhaozhuo Xu

Figure 1 for Weighted Diversified Sampling for Efficient Data-Driven Single-Cell Gene-Gene Interaction Discovery

Figure 2 for Weighted Diversified Sampling for Efficient Data-Driven Single-Cell Gene-Gene Interaction Discovery

Figure 3 for Weighted Diversified Sampling for Efficient Data-Driven Single-Cell Gene-Gene Interaction Discovery

Figure 4 for Weighted Diversified Sampling for Efficient Data-Driven Single-Cell Gene-Gene Interaction Discovery

Abstract:Gene-gene interactions play a crucial role in the manifestation of complex human diseases. Uncovering significant gene-gene interactions is a challenging task. Here, we present an innovative approach utilizing data-driven computational tools, leveraging an advanced Transformer model, to unearth noteworthy gene-gene interactions. Despite the efficacy of Transformer models, their parameter intensity presents a bottleneck in data ingestion, hindering data efficiency. To mitigate this, we introduce a novel weighted diversified sampling algorithm. This algorithm computes the diversity score of each data sample in just two passes of the dataset, facilitating efficient subset generation for interaction discovery. Our extensive experimentation demonstrates that by sampling a mere 1\% of the single-cell dataset, we achieve performance comparable to that of utilizing the entire dataset.

Via

Access Paper or Ask Questions

GPTON: Generative Pre-trained Transformers enhanced with Ontology Narration for accurate annotation of biological data

Oct 12, 2024

Rongbin Li, Wenbo Chen, Jinbo Li, Hanwen Xing, Zhao Li, W. Jim Zheng

Figure 1 for GPTON: Generative Pre-trained Transformers enhanced with Ontology Narration for accurate annotation of biological data

Figure 2 for GPTON: Generative Pre-trained Transformers enhanced with Ontology Narration for accurate annotation of biological data

Abstract:By leveraging GPT-4 for ontology narration, we developed GPTON to infuse structured knowledge into LLMs through verbalized ontology terms, achieving accurate text and ontology annotations for over 68% of gene sets in the top five predictions. Manual evaluations confirm GPTON's robustness, highlighting its potential to harness LLMs and structured knowledge to significantly advance biomedical research beyond gene set annotation.

* 25 pages, 6 figures

Via

Access Paper or Ask Questions