Sean Welleck

miniCodeProps: a Minimal Benchmark for Proving Code Properties

Jun 16, 2024
Via arXiv

The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

Jun 09, 2024
Via arXiv

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

May 02, 2024
Via arXiv

Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision

Mar 14, 2024
Via arXiv

STEER: Unified Style Transfer with Expert Reinforcement

Nov 13, 2023
Via arXiv

LLMSTEP: LLM proofstep suggestions in Lean

Oct 27, 2023
Via arXiv

Llemma: An Open Language Model For Mathematics

Oct 16, 2023
Via arXiv

Faith and Fate: Limits of Transformers on Compositionality

Jun 01, 2023
Via arXiv

Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning

May 24, 2023
Via arXiv

Self-Refine: Iterative Refinement with Self-Feedback

Mar 30, 2023
Via arXiv