Pan Lu

Learning Human-Perceived Fakeness in AI-Generated Videos via Multimodal LLMs

Sep 26, 2025
Via arXiv

Solving Inequality Proofs with Large Language Models

Jun 09, 2025
Via arXiv

Towards Artificial Intelligence Research Assistant for Expert-Involved Learning

May 03, 2025
Via arXiv

Weak-for-Strong: Training Weak Meta-Agent to Harness Strong Executors

Apr 07, 2025
Via arXiv

Protein Large Language Models: A Comprehensive Survey

Feb 21, 2025
Via arXiv

ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning

Jan 11, 2025
Via arXiv

VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning

Dec 03, 2024
Via arXiv

MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models

Oct 10, 2024
Via arXiv

VDebugger: Harnessing Execution Feedback for Debugging Visual Programs

Jun 19, 2024
Via arXiv

MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding

Jun 13, 2024
Via arXiv