
Allen Nie


Teaching Large Language Models to Reason through Learning and Forgetting (Apr 15, 2025)

Predicting Long Term Sequential Policy Value Using Softer Surrogates (Dec 30, 2024)

Improving Parallel Program Performance Through DSL-Driven Code Generation with LLM Optimizers (Oct 21, 2024)

EVOLvE: Evaluating and Optimizing LLMs For Exploration (Oct 08, 2024)

Trace is the New AutoDiff -- Unlocking Efficient Optimization of Computational Workflows (Jun 23, 2024)

OPERA: Automatic Offline Policy Evaluation with Re-weighted Aggregates of Multiple Estimators (May 27, 2024)

The Importance of Directional Feedback for LLM-based Optimizers (May 26, 2024)

LLF-Bench: Benchmark for Interactive Learning from Language Feedback (Dec 13, 2023)

MoCa: Measuring Human-Language Model Alignment on Causal and Moral Judgment Tasks (Oct 31, 2023)

Waypoint Transformer: Reinforcement Learning via Supervised Learning with Intermediate Targets (Jun 24, 2023)