Shafiq Joty

DashboardQA: Benchmarking Multimodal Agents for Question Answering on Interactive Dashboards

Aug 24, 2025
Via arXiv

The Emergence of Abstract Thought in Large Language Models Beyond Any Language

Jun 11, 2025
Via arXiv

What Makes a Good Natural Language Prompt?

Jun 07, 2025
Via arXiv

Beyond Accuracy: Dissecting Mathematical Reasoning for LLMs Under Reinforcement Learning

Jun 05, 2025
Via arXiv

MAS-ZERO: Designing Multi-Agent Systems with Zero Supervision

May 26, 2025
Via arXiv

Meta-Design Matters: A Self-Design Multi-Agent System

May 21, 2025
Via arXiv

J4R: Learning to Judge with Equivalent Initial State Group Relative Preference Optimization

May 19, 2025
Via arXiv

Learning Auxiliary Tasks Improves Reference-Free Hallucination Detection in Open-Domain Long-Form Generation

May 18, 2025
Via arXiv

Judging the Judges: Can Large Vision-Language Models Fairly Evaluate Chart Comprehension and Reasoning?

May 13, 2025
Via arXiv

SweRank: Software Issue Localization with Code Ranking

May 07, 2025
Via arXiv