Arman Cohan

Herculean: An Agentic Benchmark for Financial Intelligence

May 14, 2026
Via arXiv

Make Each Token Count: Towards Improving Long-Context Performance with KV Cache Eviction

May 10, 2026
Via arXiv

Rethinking Reasoning-Intensive Retrieval: Evaluating and Advancing Retrievers in Agentic Search Systems

May 05, 2026
Via arXiv

REVERE: Reflective Evolving Research Engineer for Scientific Workflows

Mar 21, 2026
Via arXiv

SciMDR: Benchmarking and Advancing Scientific Multimodal Document Reasoning

Mar 12, 2026
Via arXiv

Examining Reasoning LLMs-as-Judges in Non-Verifiable LLM Post-Training

Mar 12, 2026
Via arXiv

RbtAct: Rebuttal as Supervision for Actionable Review Feedback Generation

Mar 10, 2026
Via arXiv

Deconstructing Multimodal Mathematical Reasoning: Towards a Unified Perception-Alignment-Reasoning Paradigm

Mar 09, 2026
Via arXiv

QEDBENCH: Quantifying the Alignment Gap in Automated Evaluation of University-Level Mathematical Proofs

Feb 24, 2026
Via arXiv

References Improve LLM Alignment in Non-Verifiable Domains

Feb 18, 2026
Via arXiv