Arman Cohan

AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research

Jul 17, 2025
Via arXiv

Can LLMs Identify Critical Limitations within Scientific Research? A Systematic Evaluation on AI Research Papers

Jul 03, 2025
Via arXiv

SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks

Jul 01, 2025
Via arXiv

SciVer: Evaluating Foundation Models for Multimodal Scientific Claim Verification

Jun 18, 2025
Via arXiv

SUCEA: Reasoning-Intensive Retrieval for Adversarial Fact-checking through Claim Decomposition and Editing

Jun 05, 2025
Via arXiv

MetaFaith: Faithful Natural Language Uncertainty Expression in LLMs

May 30, 2025
Via arXiv

Table-R1: Inference-Time Scaling for Table Reasoning

May 29, 2025
Via arXiv

Judging with Many Minds: Do More Perspectives Mean Less Prejudice?

May 26, 2025
Via arXiv

Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective

May 21, 2025
Via arXiv

Towards Artificial Intelligence Research Assistant for Expert-Involved Learning

May 03, 2025
Via arXiv