Tao Gui

LLMEval-Med: A Real-world Clinical Benchmark for Medical LLMs with Physician Validation

Jun 04, 2025
Via arXiv

Compression Hacking: A Supplementary Perspective on Informatics Metric of Language Models from Geometric Distortion

May 23, 2025
Via arXiv

Code2Logic: Game-Code-Driven Data Synthesis for Enhancing VLMs General Reasoning

May 20, 2025
Via arXiv

Two Minds Better Than One: Collaborative Reward Modeling for LLM Alignment

May 19, 2025
Via arXiv

WorldPM: Scaling Human Preference Modeling

May 15, 2025
Via arXiv

A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models

May 12, 2025
Via arXiv

Effective Length Extrapolation via Dimension-Wise Positional Embeddings Manipulation

Apr 26, 2025
Via arXiv

Improving RL Exploration for LLM Reasoning through Retrospective Replay

Apr 19, 2025
Via arXiv

Machine-assisted writing evaluation: Exploring pre-trained language models in analyzing argumentative moves

Mar 25, 2025
Via arXiv

Mitigating Object Hallucinations in MLLMs via Multi-Frequency Perturbations

Mar 19, 2025
Via arXiv