Jiayuan Chen

Sherman

Not Search, But Scan: Benchmarking MLLMs on Scan-Oriented Academic Paper Reasoning

Mar 27, 2026

THEMIS: Towards Holistic Evaluation of MLLMs for Scientific Paper Fraud Forensics

Mar 26, 2026

A Scheduling Framework for Efficient MoE Inference on Edge GPU-NDP Systems

Jan 07, 2026

MedBench v4: A Robust and Scalable Benchmark for Evaluating Chinese Medical Language Models, Multimodal Models, and Intelligent Agents

Nov 19, 2025

TCM-5CEval: Extended Deep Evaluation Benchmark for LLM's Comprehensive Clinical Research Competence in Traditional Chinese Medicine

Nov 17, 2025

MedCalc-Eval and MedCalc-Env: Advancing Medical Calculation Capabilities of Large Language Models

Oct 31, 2025

Building a Human-Verified Clinical Reasoning Dataset via a Human LLM Hybrid Pipeline for Trustworthy Medical AI

May 11, 2025

Benchmarking Chinese Medical LLMs: A Medbench-based Analysis of Performance Gaps and Hierarchical Optimization Strategies

Mar 10, 2025

TCM-3CEval: A Triaxial Benchmark for Assessing Responses from Large Language Models in Traditional Chinese Medicine

Mar 10, 2025

Predictive Modeling with Temporal Graphical Representation on Electronic Health Records

May 07, 2024