Jingqi Tong

LLMEval-Med: A Real-world Clinical Benchmark for Medical LLMs with Physician Validation

Jun 04, 2025 (via arXiv)

Code2Logic: Game-Code-Driven Data Synthesis for Enhancing VLMs General Reasoning

May 20, 2025 (via arXiv)

Predicting Large Language Model Capabilities on Closed-Book QA Tasks Using Only Information Available Prior to Training

Feb 06, 2025 (via arXiv)

LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training

Jun 24, 2024 (via arXiv)

Exploring the Compositional Deficiency of Large Language Models in Mathematical Reasoning

May 05, 2024 (via arXiv)