Gabriel Stanovsky

Time to Talk: LLM Agents for Asynchronous Group Communication in Mafia Games

Jun 05, 2025
Via arXiv

ReliableEval: A Recipe for Stochastic LLM Evaluation via Method of Moments

May 28, 2025
Via arXiv

Cooking Up Creativity: A Cognitively-Inspired Approach for Enhancing LLM Creativity through Structured Representations

Apr 29, 2025
Via arXiv

More Documents, Same Length: Isolating the Challenge of Multiple Documents in RAG

Mar 06, 2025
Via arXiv

DOVE: A Large-Scale Multi-Dimensional Predictions Dataset Towards Meaningful LLM Evaluation

Mar 04, 2025
Via arXiv

Seeing the Forest for the Trees: A Large Scale, Continuously Updating Meta-Analysis of Frontier LLMs

Feb 26, 2025
Via arXiv

WildFrame: Comparing Framing in Humans and LLMs on Naturally Occurring Texts

Feb 24, 2025
Via arXiv

Trust Me, I'm Wrong: High-Certainty Hallucinations in LLMs

Feb 18, 2025
Via arXiv

Beyond Benchmarks: On The False Promise of AI Regulation

Jan 26, 2025
Via arXiv

Improving Image Captioning by Mimicking Human Reformulation Feedback at Inference-time

Jan 08, 2025
Via arXiv