Jacob Eisenstein

Cost-Optimal Active AI Model Evaluation

Jun 09, 2025
Via arXiv

Don't lie to your friends: Learning what you know from collaborative self-play

Mar 18, 2025
Via arXiv

InfAlign: Inference-aware language model alignment

Dec 27, 2024
Via arXiv

ALTA: Compiler-Based Analysis of Transformers

Oct 23, 2024
Via arXiv

Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning

Oct 10, 2024
Via arXiv

Predicting the Target Word of Game-playing Conversations using a Low-Rank Dialect Adapter for Decoder Models

Aug 31, 2024
Via arXiv

Robust Preference Optimization through Reward Model Distillation

May 29, 2024
Via arXiv

Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment

Apr 18, 2024
Via arXiv

Transforming and Combining Rewards for Aligning Large Language Models

Feb 01, 2024
Via arXiv

Theoretical guarantees on the best-of-n alignment policy

Jan 03, 2024
Via arXiv