Kyle Richardson

Shammie

Operads for compositional reasoning in LLMs

Jun 11, 2026
Via arXiv

Operadic consistency: a label-free signal for compositional reasoning failures in LLMs

Jun 11, 2026
Via arXiv

Analytica: Soft Propositional Reasoning for Robust and Scalable LLM-Driven Analysis

Apr 24, 2026
Via arXiv

AstaBench: Rigorous Benchmarking of AI Agents with a Scientific Research Suite

Oct 24, 2025
Via arXiv

ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning

Feb 03, 2025
Via arXiv

Understanding the Logic of Direct Preference Alignment through Logic

Dec 23, 2024
Via arXiv

SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories

Sep 11, 2024
Via arXiv

SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals

Jun 07, 2024
Via arXiv

TimeArena: Shaping Efficient Multitasking Language Agents in a Time-Aware Simulation

Feb 08, 2024
Via arXiv

OLMo: Accelerating the Science of Language Models

Feb 07, 2024
Via arXiv