Mikhail Yurochkin

Distributional Preference Alignment of LLMs via Optimal Transport

Jun 09, 2024

Igor Melnyk, Youssef Mroueh, Brian Belgodere, Mattia Rigotti, Apoorva Nitsure, Mikhail Yurochkin, Kristjan Greenewald, Jiri Navratil, Jerret Ross

Abstract: Current LLM alignment techniques use pairwise human preferences at a sample level, and as such, they do not imply an alignment on the distributional level. We propose in this paper Alignment via Optimal Transport (AOT), a novel method for distributional preference alignment of LLMs. AOT aligns LLMs on unpaired preference data by making the reward distribution of the positive samples stochastically dominant in the first order on the distribution of negative samples. We introduce a convex relaxation of this first-order stochastic dominance and cast it as an optimal transport problem with a smooth and convex cost. Thanks to the one-dimensional nature of the resulting optimal transport problem and the convexity of the cost, it has a closed-form solution via sorting on empirical measures. We fine-tune LLMs with this AOT objective, which enables alignment by penalizing the violation of the stochastic dominance of the reward distribution of the positive samples on the reward distribution of the negative samples. We analyze the sample complexity of AOT by considering the dual of the OT problem and show that it converges at the parametric rate. Empirically, we show on a diverse set of alignment datasets and LLMs that AOT leads to state-of-the-art models in the 7B family of models when evaluated with Open LLM Benchmarks and AlpacaEval.
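
The sorting step described in this abstract can be illustrated with a minimal sketch. It assumes equal-sized batches of scalar rewards for positive and negative samples and uses a logistic (softplus) penalty as one example of a smooth convex cost. The function names, the `margin` parameter, and the choice of cost are illustrative assumptions, not the authors' implementation.

```python
import math


def softplus(x):
    # Numerically stable log(1 + exp(x)); smooth, convex, and increasing.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sorted_dominance_loss(pos_rewards, neg_rewards, margin=0.0):
    """Penalize violations of first-order dominance of pos over neg.

    Hypothetical helper: sorting both samples pairs their empirical
    quantiles, which is the optimal 1D transport plan for a convex cost
    of the difference.
    """
    if len(pos_rewards) != len(neg_rewards):
        raise ValueError("this sketch assumes equal sample sizes")
    pos_sorted = sorted(pos_rewards)
    neg_sorted = sorted(neg_rewards)
    # A term is large when a negative quantile exceeds the matching
    # positive quantile (plus the assumed margin).
    penalties = [
        softplus(margin + q_neg - q_pos)
        for q_pos, q_neg in zip(pos_sorted, neg_sorted)
    ]
    return sum(penalties) / len(penalties)


pos = [2.1, 0.4, 1.3, 3.0]   # rewards of preferred responses (made-up numbers)
neg = [0.2, 1.0, -0.5, 0.9]  # rewards of rejected responses (made-up numbers)
print(sorted_dominance_loss(pos, neg))
```

Because both lists are sorted independently, the loss does not depend on how positive and negative samples were ordered or paired, which is the sense in which the data can be unpaired.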

Efficient multi-prompt evaluation of LLMs

May 27, 2024

Felipe Maia Polo, Ronald Xu, Lucas Weber, Mírian Silva, Onkar Bhardwaj, Leshem Choshen, Allysson Flavio Melo de Oliveira, Yuekai Sun, Mikhail Yurochkin

Abstract: Most popular benchmarks for comparing LLMs rely on a limited set of prompt templates, which may not fully capture the LLMs' abilities and can affect the reproducibility of results on leaderboards. Many recent works empirically verify prompt sensitivity and advocate for changes in LLM evaluation. In this paper, we consider the problem of estimating the performance distribution across many prompt variants instead of finding a single prompt to evaluate with. We introduce PromptEval, a method for estimating performance across a large set of prompts borrowing strength across prompts and examples to produce accurate estimates under practical evaluation budgets. The resulting distribution can be used to obtain performance quantiles to construct various robust performance metrics (e.g., top 95% quantile or median). We prove that PromptEval consistently estimates the performance distribution and demonstrate its efficacy empirically on three prominent LLM benchmarks: MMLU, BIG-bench Hard, and LMentry. For example, PromptEval can accurately estimate performance quantiles across 100 prompt templates on MMLU with a budget equivalent to two single-prompt evaluations. Our code and data can be found at https://github.com/felipemaiapolo/prompt-eval.
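
The robust metrics mentioned in this abstract (median, upper quantile) are read off a distribution of per-prompt scores. The sketch below only shows that last step on a made-up list of per-prompt accuracies. It does not implement the PromptEval estimator, and `empirical_quantile` is a hypothetical helper using plain linear interpolation.

```python
def empirical_quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, for 0 <= q <= 1."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    ordered = sorted(values)
    position = q * (len(ordered) - 1)          # fractional index into the sorted list
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] * (1.0 - weight) + ordered[upper] * weight


# Made-up accuracies of one model under seven prompt templates.
per_prompt_accuracy = [0.61, 0.58, 0.66, 0.49, 0.63, 0.60, 0.55]
summary = {
    "q05": empirical_quantile(per_prompt_accuracy, 0.05),
    "median": empirical_quantile(per_prompt_accuracy, 0.50),
    "q95": empirical_quantile(per_prompt_accuracy, 0.95),
}
print(summary)
```

Reporting a low quantile alongside the median gives a sense of how badly a model can do under an unlucky prompt template, which a single-prompt score hides.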

A statistical framework for weak-to-strong generalization

May 25, 2024

Seamus Somerstep, Felipe Maia Polo, Moulinath Banerjee, Ya'acov Ritov, Mikhail Yurochkin, Yuekai Sun

Abstract: Modern large language model (LLM) alignment techniques rely on human feedback, but it is unclear whether the techniques fundamentally limit the capabilities of aligned LLMs. In particular, it is unclear whether it is possible to align (stronger) LLMs with superhuman capabilities with (weaker) human feedback without degrading their capabilities. This is an instance of the weak-to-strong generalization problem: using weaker (less capable) feedback to train a stronger (more capable) model. We prove that weak-to-strong generalization is possible by eliciting latent knowledge from pre-trained LLMs. In particular, we cast the weak-to-strong generalization problem as a transfer learning problem in which we wish to transfer a latent concept from a weak model to a strong pre-trained model. We prove that a naive fine-tuning approach suffers from fundamental limitations, but an alternative refinement-based approach suggested by the problem structure provably overcomes the limitations of fine-tuning. Finally, we demonstrate the practical applicability of the refinement approach with three LLM alignment tasks.

Prompt Exploration with Prompt Regression

May 17, 2024

Michael Feffer, Ronald Xu, Yuekai Sun, Mikhail Yurochkin

Abstract: With the advent of democratized usage of large language models (LLMs), there is a growing desire to systematize LLM prompt creation and selection processes beyond iterative trial-and-error. Prior works mostly focus on searching the space of prompts without accounting for relations between prompt variations. Here we propose a framework, Prompt Exploration with Prompt Regression (PEPR), to predict the effect of prompt combinations given results for individual prompt elements as well as a simple method to select an effective prompt for a given use-case. We evaluate our approach with open-source LLMs of different sizes on several different tasks.

Aligners: Decoupling LLMs and Alignment

Mar 11, 2024

Lilian Ngweta, Mayank Agarwal, Subha Maity, Alex Gittens, Yuekai Sun, Mikhail Yurochkin

Abstract: Large Language Models (LLMs) need to be aligned with human expectations to ensure their safety and utility in most applications. Alignment is challenging, costly, and needs to be repeated for every LLM and alignment criterion. We propose to decouple LLMs and alignment by training aligner models that can be used to align any LLM for a given criterion on an as-needed basis, thus also reducing the potential negative impacts of alignment on performance. Our recipe for training the aligner models solely relies on synthetic data generated with a (prompted) LLM and can be easily adjusted for a variety of alignment criteria. We illustrate our method by training an "ethical" aligner and verify its efficacy empirically.

* Tiny Papers Track at the International Conference on Learning Representations (ICLR) 2024

Asymmetry in Low-Rank Adapters of Foundation Models

Feb 27, 2024

Jiacheng Zhu, Kristjan Greenewald, Kimia Nadjahi, Haitz Sáez de Ocáriz Borde, Rickard Brüel Gabrielsson, Leshem Choshen, Marzyeh Ghassemi, Mikhail Yurochkin, Justin Solomon

Abstract: Parameter-efficient fine-tuning optimizes large, pre-trained foundation models by updating a subset of parameters; in this class, Low-Rank Adaptation (LoRA) is particularly effective. Inspired by an effort to investigate the different roles of LoRA matrices during fine-tuning, this paper characterizes and leverages unexpected asymmetry in the importance of low-rank adapter matrices. Specifically, when updating the parameter matrices of a neural network by adding a product $BA$, we observe that the $B$ and $A$ matrices have distinct functions: $A$ extracts features from the input, while $B$ uses these features to create the desired output. Based on this observation, we demonstrate that fine-tuning $B$ is inherently more effective than fine-tuning $A$, and that a random untrained $A$ should perform nearly as well as a fine-tuned one. Using an information-theoretic lens, we also bound the generalization of low-rank adapters, showing that the parameter savings of exclusively training $B$ improves the bound. We support our conclusions with experiments on RoBERTa, BART-Large, LLaMA-2, and ViTs.

* 17 pages, 2 figures, 9 tables
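
The idea of training only $B$ while keeping a random $A$ fixed can be sketched on a toy linear layer. The code below is a pure-Python illustration under simplifying assumptions (a single linear map, squared loss, plain SGD, made-up data); the helper names and hyperparameters are hypothetical and this is not the authors' experimental setup.

```python
import random


def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def adapted_forward(W0, B, A, x):
    """Compute (W0 + B A) x and also return the intermediate features A x."""
    features = matvec(A, x)            # A extracts r features from the input
    base = matvec(W0, x)
    update = matvec(B, features)       # B maps those features to the output
    return [b + u for b, u in zip(base, update)], features


def train_B_only(W0, A, data, lr=0.05, epochs=200):
    """SGD on 0.5 * ||pred - y||^2, updating B while W0 and A stay frozen."""
    d_out, rank = len(W0), len(A)
    B = [[0.0] * rank for _ in range(d_out)]   # zero init: adapter starts as a no-op
    for _ in range(epochs):
        for x, y in data:
            pred, features = adapted_forward(W0, B, A, x)
            error = [p - t for p, t in zip(pred, y)]
            for i in range(d_out):
                for j in range(rank):
                    # dLoss/dB[i][j] = error[i] * features[j]
                    B[i][j] -= lr * error[i] * features[j]
    return B


rng = random.Random(0)
d_in, d_out, rank = 4, 2, 2
W0 = [[0.0] * d_in for _ in range(d_out)]      # stand-in for pretrained weights
A = [[rng.gauss(0.0, 0.5) for _ in range(d_in)] for _ in range(rank)]  # random, never trained
data = [([1.0, 0.0, 0.0, 0.0], [1.0, -1.0])]   # one made-up training example
B = train_B_only(W0, A, data)
print(adapted_forward(W0, B, A, data[0][0])[0])
```

Only the `d_out * rank` entries of `B` receive gradient updates here, which is the parameter saving the abstract refers to.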

tinyBenchmarks: evaluating LLMs with fewer examples

Feb 22, 2024

Felipe Maia Polo, Lucas Weber, Leshem Choshen, Yuekai Sun, Gongjun Xu, Mikhail Yurochkin

Abstract: The versatility of large language models (LLMs) led to the creation of diverse benchmarks that thoroughly test a variety of language models' abilities. These benchmarks consist of tens of thousands of examples, making evaluation of LLMs very expensive. In this paper, we investigate strategies to reduce the number of evaluations needed to assess the performance of an LLM on several key benchmarks. For example, we show that to accurately estimate the performance of an LLM on MMLU, a popular multiple-choice QA benchmark consisting of 14K examples, it is sufficient to evaluate this LLM on 100 curated examples. We release evaluation tools and tiny versions of popular benchmarks: Open LLM Leaderboard, MMLU, HELM, and AlpacaEval 2.0. Our empirical analysis demonstrates that these tools and tiny benchmarks are sufficient to reliably and efficiently reproduce the original evaluation results.

Uncertainty Quantification via Stable Distribution Propagation

Feb 13, 2024

Felix Petersen, Aashwin Mishra, Hilde Kuehne, Christian Borgelt, Oliver Deussen, Mikhail Yurochkin

Abstract: We propose a new approach for propagating stable probability distributions through neural networks. Our method is based on local linearization, which we show to be an optimal approximation in terms of total variation distance for the ReLU non-linearity. This allows propagating Gaussian and Cauchy input uncertainties through neural networks to quantify their output uncertainties. To demonstrate the utility of propagating distributions, we apply the proposed method to predicting calibrated confidence intervals and selective prediction on out-of-distribution data. The results demonstrate a broad applicability of propagating distributions and show the advantages of our method over other approaches such as moment matching.

* Published at ICLR 2024, Code @ https://github.com/Felix-Petersen/distprop
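
As a rough illustration of local linearization for the Gaussian case, the sketch below pushes a mean and a variance through one affine unit followed by a ReLU that is replaced by its tangent at the input mean. It assumes independent input coordinates (a diagonal covariance) for brevity; the function names are hypothetical and this is not the code from the linked repository.

```python
def affine_gaussian(means, variances, weights, bias):
    """Exact mean and variance of w . x + b for independent Gaussian x_i."""
    mean = sum(w * m for w, m in zip(weights, means)) + bias
    variance = sum(w * w * v for w, v in zip(weights, variances))
    return mean, variance


def relu_linearized(mean, variance):
    """Propagate N(mean, variance) through ReLU linearized at the mean."""
    slope = 1.0 if mean > 0.0 else 0.0   # derivative of ReLU at the input mean
    return max(mean, 0.0), slope * slope * variance


# Made-up input uncertainty and weights for a single hidden unit.
pre_mean, pre_var = affine_gaussian([1.0, 2.0], [0.5, 0.25], [2.0, -1.0], 0.5)
print(relu_linearized(pre_mean, pre_var))
```

The affine step is exact for Gaussians; only the ReLU step is an approximation, and it keeps the output in the same distribution family so the next layer can be handled the same way.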

Estimating Fréchet bounds for validating programmatic weak supervision

Dec 07, 2023

Felipe Maia Polo, Mikhail Yurochkin, Moulinath Banerjee, Subha Maity, Yuekai Sun

Abstract: We develop methods for estimating Fréchet bounds on (possibly high-dimensional) distribution classes in which some variables are continuous-valued. We establish the statistical correctness of the computed bounds under uncertainty in the marginal constraints and demonstrate the usefulness of our algorithms by evaluating the performance of machine learning (ML) models trained with programmatic weak supervision (PWS). PWS is a framework for principled learning from weak supervision inputs (e.g., crowdsourced labels, knowledge bases, pre-trained models on related tasks, etc), and it has achieved remarkable success in many areas of science and engineering. Unfortunately, it is generally difficult to validate the performance of ML models trained with PWS due to the absence of labeled data. Our algorithms address this issue by estimating sharp lower and upper bounds for performance metrics such as accuracy/recall/precision/F1 score.
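
The textbook Fréchet inequality for two events gives the flavor of such bounds: for events with probabilities p_a and p_b, the joint probability lies between max(0, p_a + p_b - 1) and min(p_a, p_b). The sketch below applies it to a hypothetical binary classifier for which only the marginal rates of positive predictions and positive labels are known. It is the classical discrete case only, not the paper's estimator, and the function names are assumptions.

```python
def frechet_bounds(p_a, p_b):
    """Classical Frechet bounds on P(A and B) given only P(A) and P(B)."""
    for p in (p_a, p_b):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return max(0.0, p_a + p_b - 1.0), min(p_a, p_b)


def accuracy_bounds(p_pred_pos, p_label_pos):
    """Bounds on binary accuracy when only the two marginals are known.

    accuracy = P(pred=1, label=1) + P(pred=0, label=0)
             = 1 - p_pred_pos - p_label_pos + 2 * P(pred=1, label=1)
    """
    lower_joint, upper_joint = frechet_bounds(p_pred_pos, p_label_pos)
    base = 1.0 - p_pred_pos - p_label_pos
    return base + 2.0 * lower_joint, base + 2.0 * upper_joint


# Made-up marginals: 70% positive predictions, 60% positive labels.
print(accuracy_bounds(0.7, 0.6))
```

Since accuracy is increasing in the joint probability, plugging in the two Fréchet endpoints yields the tightest interval available from the marginals alone.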

Risk Assessment and Statistical Significance in the Age of Foundation Models

Oct 11, 2023

Apoorva Nitsure, Youssef Mroueh, Mattia Rigotti, Kristjan Greenewald, Brian Belgodere, Mikhail Yurochkin, Jiri Navratil, Igor Melnyk, Jerret Ross

Abstract: We propose a distributional framework for assessing socio-technical risks of foundation models with quantified statistical significance. Our approach hinges on a new statistical relative testing based on first and second order stochastic dominance of real random variables. We show that the second order statistics in this test are linked to mean-risk models commonly used in econometrics and mathematical finance to balance risk and utility when choosing between alternatives. Using this framework, we formally develop a risk-aware approach for foundation model selection given guardrails quantified by specified metrics. Inspired by portfolio optimization and selection theory in mathematical finance, we define a metrics portfolio for each model as a means to aggregate a collection of metrics, and perform model selection based on the stochastic dominance of these portfolios. The statistical significance of our tests is backed theoretically by an asymptotic analysis via central limit theorems instantiated in practice via a bootstrap variance estimate. We use our framework to compare various large language models regarding risks related to drifting from instructions and outputting toxic content.
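
First- and second-order stochastic dominance can be checked directly on two equal-sized samples of a higher-is-better metric, as in the minimal sketch below. It compares sorted values (empirical quantiles) for first order and their running sums (integrated quantiles) for second order. This is a deterministic check only: the relative testing and bootstrap significance described in the abstract are not reproduced, and the function names are hypothetical.

```python
from itertools import accumulate


def _check_sizes(x, y):
    if not x or len(x) != len(y):
        raise ValueError("this sketch assumes non-empty, equal-sized samples")


def first_order_dominates(x, y):
    """True if every empirical quantile of x is at least that of y."""
    _check_sizes(x, y)
    return all(a >= b for a, b in zip(sorted(x), sorted(y)))


def second_order_dominates(x, y):
    """True if every integrated empirical quantile of x is at least that of y."""
    _check_sizes(x, y)
    prefix_x = accumulate(sorted(x))   # running sums of the lowest values
    prefix_y = accumulate(sorted(y))
    return all(a >= b for a, b in zip(prefix_x, prefix_y))


steady = [2, 2, 2]    # made-up scores of a consistent model
erratic = [1, 2, 3]   # same mean, more spread
print(first_order_dominates(steady, erratic), second_order_dominates(steady, erratic))
```

The made-up example shows why the second-order check is the risk-aware one: two models with the same mean are not ordered at first order, but the one with the better lower tail is preferred at second order.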
