Shelby Heinecke

Position: Vector Prompt Interfaces Should Be Exposed to Enable Customization of Large Language Models

Mar 04, 2026
Via arXiv

AudioCapBench: Quick Evaluation on Audio Captioning across Sound, Music, and Speech

Feb 27, 2026
Via arXiv

Robotic VLA Benefits from Joint Learning with Motion Image Diffusion

Dec 19, 2025
Via arXiv

LoCoBench-Agent: An Interactive Benchmark for LLM Agents in Long-Context Software Engineering

Nov 17, 2025
Via arXiv

GeoGNN: Quantifying and Mitigating Semantic Drift in Text-Attributed Graphs

Nov 12, 2025
Via arXiv

Grounded Test-Time Adaptation for LLM Agents

Nov 06, 2025
Via arXiv

ToolLibGen: Scalable Automatic Tool Creation and Aggregation for LLM Reasoning

Oct 09, 2025
Via arXiv

LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering

Sep 11, 2025
Via arXiv

UserBench: An Interactive Gym Environment for User-Centric Agents

Jul 29, 2025
Via arXiv

APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay

Apr 08, 2025
Via arXiv