Question Answering


VLAD: A VLM-Augmented Autonomous Driving Framework with Hierarchical Planning and Interpretable Decision Process

Jul 02, 2025
Via arXiv

Frustratingly Simple Retrieval Improves Challenging, Reasoning-Intensive Benchmarks

Jul 02, 2025
Via arXiv

Symbolic or Numerical? Understanding Physics Problem Solving in Reasoning LLMs

Jul 02, 2025
Via arXiv

GAIus: Combining GenAI with Legal Clauses Retrieval for Knowledge-based Assistant

Jul 02, 2025
Via arXiv

SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks

Jul 01, 2025
Via arXiv

A Keyword-Based Technique to Evaluate Broad Question Answer Script

Jun 26, 2025
Via arXiv

Towards Probabilistic Question Answering Over Tabular Data

Jun 25, 2025
Via arXiv

SMMILE: An Expert-Driven Benchmark for Multimodal Medical In-Context Learning

Jun 26, 2025
Via arXiv

OmniEval: A Benchmark for Evaluating Omni-modal Models with Visual, Auditory, and Textual Inputs

Jun 26, 2025
Via arXiv

Can Gradient Descent Simulate Prompting?

Jun 26, 2025
Via arXiv