Jack Parker-Holder

Learning When to Plan: Efficiently Allocating Test-Time Compute for LLM Agents

Sep 03, 2025
Via arXiv

Synthetic Data is Sufficient for Zero-Shot Visual Generalization from Offline Data

Aug 17, 2025
Via arXiv

BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games

Nov 20, 2024
Via arXiv

Open-Endedness is Essential for Artificial Superhuman Intelligence

Jun 06, 2024
Via arXiv

Outliers and Calibration Sets have Diminishing Effect on Quantization of Modern LLMs

Jun 03, 2024
Via arXiv

Video as the New Language for Real-World Decision Making

Feb 27, 2024
Via arXiv

Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts

Feb 26, 2024
Via arXiv

Genie: Generative Interactive Environments

Feb 23, 2024
Via arXiv

Multi-Agent Diagnostics for Robustness via Illuminated Diversity

Jan 24, 2024
Via arXiv

Vision-Language Models as a Source of Rewards

Dec 14, 2023
Via arXiv