Arman Cohan

Can LLMs Use Linguistic Uncertainty Markers to Reliably Reflect Intrinsic Confidence?

May 27, 2026

OpenComputer: Verifiable Software Worlds for Computer-Use Agents

May 19, 2026

Time to REFLECT: Can We Trust LLM Judges for Evidence-based Research Agents?

May 18, 2026

Herculean: An Agentic Benchmark for Financial Intelligence

May 14, 2026

Make Each Token Count: Towards Improving Long-Context Performance with KV Cache Eviction

May 10, 2026

Rethinking Reasoning-Intensive Retrieval: Evaluating and Advancing Retrievers in Agentic Search Systems

May 05, 2026

REVERE: Reflective Evolving Research Engineer for Scientific Workflows

Mar 21, 2026

SciMDR: Benchmarking and Advancing Scientific Multimodal Document Reasoning

Mar 12, 2026

Examining Reasoning LLMs-as-Judges in Non-Verifiable LLM Post-Training

Mar 12, 2026

RbtAct: Rebuttal as Supervision for Actionable Review Feedback Generation

Mar 10, 2026