Virginia Smith

Position: LLM Unlearning Benchmarks are Weak Measures of Progress

Oct 03, 2024

Pratiksha Thaker, Shengyuan Hu, Neil Kale, Yash Maurya, Zhiwei Steven Wu, Virginia Smith

Abstract:Unlearning methods have the potential to improve the privacy and safety of large language models (LLMs) by removing sensitive or harmful information post hoc. The LLM unlearning research community has increasingly turned toward empirical benchmarks to assess the effectiveness of such methods. In this paper, we find that existing benchmarks provide an overly optimistic and potentially misleading view on the effectiveness of candidate unlearning methods. By introducing simple, benign modifications to a number of popular benchmarks, we expose instances where supposedly unlearned information remains accessible, or where the unlearning process has degraded the model's performance on retained information to a much greater extent than indicated by the original benchmark. We identify that existing benchmarks are particularly vulnerable to modifications that introduce even loose dependencies between the forget and retain information. Further, we show that ambiguity in unlearning targets in existing benchmarks can easily lead to the design of methods that overfit to the given test queries. Based on our findings, we urge the community to be cautious when interpreting benchmark results as reliable measures of progress, and we provide several recommendations to guide future LLM unlearning research.

Revisiting Cascaded Ensembles for Efficient Inference

Jul 02, 2024

Steven Kolawole, Don Dennis, Ameet Talwalkar, Virginia Smith

Abstract:A common approach to making machine learning inference more efficient is to use example-specific adaptive schemes, which route or select models for each example at inference time. In this work we study a simple scheme for adaptive inference. We build a cascade of ensembles (CoE), beginning with resource-efficient models and growing to larger, more expressive models, where ensemble agreement serves as a data-dependent routing criterion. This scheme is easy to incorporate into existing inference pipelines, requires no additional training, and can be used to place models across multiple resource tiers--for instance, serving efficient models at the edge and invoking larger models in the cloud only when necessary. In cases where parallel inference is feasible, we show that CoE can improve accuracy relative to the single best model while reducing the average cost of inference by up to 7x, and provides Pareto-dominant solutions in accuracy and efficiency relative to existing adaptive inference baselines. These savings translate to a more than 3x reduction in total monetary cost when performing inference using a heterogeneous cluster of GPUs. Finally, for edge inference scenarios where portions of the cascade reside at the edge vs. in the cloud, CoE can provide a 14x reduction in communication cost and inference latency without sacrificing accuracy.

* ES-FOMO, ICML 2024
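
The routing idea in the abstract above (answer from a cheap ensemble when its members agree, otherwise escalate to a larger tier) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `cascade_predict` function, the `min_agreement` threshold, and the toy threshold "models" are all hypothetical names chosen for this example.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

# Assumption: a "model" is any callable mapping an input to a class label.
Model = Callable[[float], int]


def cascade_predict(
    x: float,
    tiers: Sequence[Sequence[Model]],
    min_agreement: float = 1.0,
) -> Tuple[int, int]:
    """Return (predicted label, index of the tier that answered)."""
    if not tiers or any(len(ensemble) == 0 for ensemble in tiers):
        raise ValueError("need at least one non-empty tier")
    for tier_index, ensemble in enumerate(tiers):
        # Majority vote within the current ensemble.
        votes = Counter(model(x) for model in ensemble)
        label, count = votes.most_common(1)[0]
        agreement = count / len(ensemble)
        is_last_tier = tier_index == len(tiers) - 1
        # Stop early when the ensemble agrees enough; the last tier always answers.
        if agreement >= min_agreement or is_last_tier:
            return label, tier_index
    raise AssertionError("unreachable: the last tier always returns")


# Toy stand-ins for real classifiers (hypothetical, for illustration only).
cheap_tier = [lambda x: int(x > 0), lambda x: int(x > 2)]
large_tier = [lambda x: int(x > 1)]
tiers = [cheap_tier, large_tier]

print(cascade_predict(5.0, tiers))  # cheap models agree, so tier 0 answers
print(cascade_predict(1.0, tiers))  # cheap models disagree, so tier 1 answers
```

Because the sketch only calls models that already exist, it needs no extra training step, which matches the abstract's description; how agreement is actually measured in the paper may differ from this simple vote fraction.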

Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients

Jun 25, 2024

Aashiq Muhamed, Oscar Li, David Woodruff, Mona Diab, Virginia Smith

Abstract:Large language model (LLM) training and finetuning are often bottlenecked by limited GPU memory. While existing projection-based optimization methods address this by projecting gradients into a lower-dimensional subspace to reduce optimizer state memory, they typically rely on dense projection matrices, which can introduce computational and memory overheads. In this work, we propose Grass (GRAdient Structured Sparsification), a novel approach that leverages sparse projections to transform gradients into structured sparse updates. This design not only significantly reduces memory usage for optimizer states but also minimizes gradient memory footprint, computation, and communication costs, leading to substantial throughput improvements. Extensive experiments on pretraining and finetuning tasks demonstrate that Grass achieves performance competitive with full-rank training and existing projection-based methods. Notably, Grass enables half-precision pretraining of a 13B parameter LLaMA model on a single 40GB A100 GPU--a feat infeasible for previous methods--and yields up to a 2x throughput improvement on an 8-GPU system. Code can be found at https://github.com/aashiqmuhamed/GRASS .

RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold

Jun 20, 2024

Amrith Setlur, Saurabh Garg, Xinyang Geng, Naman Garg, Virginia Smith, Aviral Kumar

Abstract:Training on model-generated synthetic data is a promising approach for finetuning LLMs, but it remains unclear when it helps or hurts. In this paper, we investigate this question for math reasoning via an empirical study, followed by building a conceptual understanding of our observations. First, we find that while the typical approach of finetuning a model on synthetic correct or positive problem-solution pairs generated by capable models offers modest performance gains, sampling more correct solutions from the finetuned learner itself followed by subsequent fine-tuning on this self-generated data doubles the efficiency of the same synthetic problems. At the same time, training on model-generated positives can amplify various spurious correlations, resulting in flat or even inverse scaling trends as the amount of data increases. Surprisingly, we find that several of these issues can be addressed if we also utilize negative responses, i.e., model-generated responses that are deemed incorrect by a final answer verifier. Crucially, these negatives must be constructed such that the training can appropriately recover the utility or advantage of each intermediate step in the negative response. With this per-step scheme, we are able to attain consistent gains over only positive data, attaining performance similar to amplifying the amount of synthetic data by 8x. We show that training on per-step negatives can help to unlearn spurious correlations in the positive data, and is equivalent to advantage-weighted reinforcement learning (RL), implying that it inherits robustness benefits of RL over imitating positive data alone.

Jogging the Memory of Unlearned Model Through Targeted Relearning Attack

Jun 19, 2024

Shengyuan Hu, Yiwei Fu, Zhiwei Steven Wu, Virginia Smith

Abstract:Machine unlearning is a promising approach to mitigate undesirable memorization of training data in ML models. However, in this work we show that existing approaches for unlearning in LLMs are surprisingly susceptible to a simple set of targeted relearning attacks. With access to only a small and potentially loosely related set of data, we find that we can 'jog' the memory of unlearned models to reverse the effects of unlearning. We formalize this unlearning-relearning pipeline, explore the attack across three popular unlearning benchmarks, and discuss future directions and guidelines that result from our study.

* 17 pages, 8 figures, 12 tables

Federated LoRA with Sparse Communication

Jun 07, 2024

Kevin Kuo, Arian Raje, Kousik Rajesh, Virginia Smith

Abstract:Low-rank adaptation (LoRA) is a natural method for finetuning in communication-constrained machine learning settings such as cross-device federated learning. Prior work that has studied LoRA in the context of federated learning has focused on improving LoRA's robustness to heterogeneity and privacy. In this work, we instead consider techniques for further improving communication-efficiency in federated LoRA. Unfortunately, we show that centralized ML methods that improve the efficiency of LoRA through unstructured pruning do not transfer well to federated settings. We instead study a simple approach, FLASC, that applies sparsity to LoRA during communication while allowing clients to locally fine-tune the entire LoRA module. Across four common federated learning tasks, we demonstrate that this method matches the performance of dense LoRA with up to 10x less communication. Additionally, despite being designed primarily to target communication, we find that this approach has benefits in terms of heterogeneity and privacy relative to existing approaches tailored to these specific concerns. Overall, our work highlights the importance of considering system-specific constraints when developing communication-efficient finetuning approaches, and serves as a simple and competitive baseline for future work in federated finetuning.

* 12 pages (excluding references), 8 figures
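
The core idea in the abstract above, sparse communication with dense local finetuning, might look roughly like the sketch below. It assumes magnitude-based top-k selection as the sparsifier, which the abstract does not specify; `top_k_sparsify`, `client_round`, `density`, and the flat-list representation of the LoRA parameters are hypothetical choices for illustration, not the paper's code.

```python
from typing import Callable, List


def top_k_sparsify(values: List[float], density: float) -> List[float]:
    """Keep the largest-magnitude fraction `density` of entries; zero the rest."""
    k = max(1, round(density * len(values)))
    ranked = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    keep = set(ranked[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(values)]


def client_round(
    server_lora: List[float],
    local_step: Callable[[List[float]], List[float]],
    density: float,
) -> List[float]:
    """One simulated client round: sparse download, dense training, sparse upload."""
    received = top_k_sparsify(server_lora, density)   # sparse download
    trained = local_step(received)                    # all LoRA entries may change locally
    delta = [t - r for t, r in zip(trained, received)]
    return top_k_sparsify(delta, density)             # sparse upload of the update


# Toy example: "training" nudges every parameter by a different amount.
server_lora = [0.5, -0.01, 0.02, -2.0, 0.03, 1.0]
nudges = [0.3, -0.2, 0.0, 0.05, 0.4, -0.01]
upload = client_round(
    server_lora,
    lambda params: [p + n for p, n in zip(params, nudges)],
    density=0.5,
)
print(upload)
```

Only half of the entries cross the network in each direction here, while the local step is free to touch every entry, including those that arrived as zeros.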

Privacy Amplification for the Gaussian Mechanism via Bounded Support

Mar 07, 2024

Shengyuan Hu, Saeed Mahloujifar, Virginia Smith, Kamalika Chaudhuri, Chuan Guo

Abstract:Data-dependent privacy accounting frameworks such as per-instance differential privacy (pDP) and Fisher information loss (FIL) confer fine-grained privacy guarantees for individuals in a fixed training dataset. These guarantees can be desirable compared to vanilla DP in real-world settings as they tightly upper-bound the privacy leakage for a specific individual in an actual dataset, rather than considering worst-case datasets. While these frameworks are beginning to gain popularity, to date, there is a lack of private mechanisms that can fully leverage advantages of data-dependent accounting. To bridge this gap, we propose simple modifications of the Gaussian mechanism with bounded support, showing that they amplify privacy guarantees under data-dependent accounting. Experiments on model training with DP-SGD show that using bounded support Gaussian mechanisms can provide a reduction of the pDP bound ε by as much as 30% without negative effects on model utility.

* 23 pages, 4 figures
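
As a rough illustration of what "a Gaussian mechanism with bounded support" can mean, the sketch below shows two standard ways to bound the output of additive Gaussian noise: clamping it to an interval, and resampling until it falls inside the interval. Whether these match the paper's exact constructions is an assumption; the function names and the symmetric interval `[-bound, bound]` are hypothetical, and no privacy accounting is performed here.

```python
import random


def rectified_gaussian(value: float, sigma: float, bound: float, rng: random.Random) -> float:
    """Add Gaussian noise, then clamp the result to [-bound, bound]."""
    noisy = value + rng.gauss(0.0, sigma)
    return max(-bound, min(bound, noisy))


def truncated_gaussian(value: float, sigma: float, bound: float, rng: random.Random) -> float:
    """Add Gaussian noise, resampling until the result lies in [-bound, bound]."""
    # Assumes `value` itself lies in [-bound, bound], so acceptance is likely.
    while True:
        noisy = value + rng.gauss(0.0, sigma)
        if -bound <= noisy <= bound:
            return noisy


rng = random.Random(0)  # fixed seed so the demo is repeatable
rectified = [rectified_gaussian(0.5, 1.0, 1.0, rng) for _ in range(5)]
truncated = [truncated_gaussian(0.5, 1.0, 1.0, rng) for _ in range(5)]
print(rectified)
print(truncated)
```

Clamping piles probability mass onto the two endpoints, whereas resampling renormalizes the density inside the interval, so the two variants produce different output distributions even with the same `sigma` and `bound`.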

Many-Objective Multi-Solution Transport

Mar 06, 2024

Ziyue Li, Tian Li, Virginia Smith, Jeff Bilmes, Tianyi Zhou

Abstract:Optimizing the performance of many objectives (instantiated by tasks or clients) jointly with a few Pareto stationary solutions (models) is critical in machine learning. However, previous multi-objective optimization methods often focus on a small number of objectives and cannot scale to many objectives that outnumber the solutions, leading to either subpar performance or ignored objectives. We introduce Many-objective multi-solution Transport (MosT), a framework that finds multiple diverse solutions in the Pareto front of many objectives. Our insight is to seek multiple solutions, each performing as a domain expert and focusing on a specific subset of objectives while collectively covering all of them. MosT formulates the problem as a bi-level optimization of weighted objectives for each solution, where the weights are defined by an optimal transport between the objectives and solutions. Our algorithm ensures convergence to Pareto stationary solutions for complementary subsets of objectives. On a range of applications in federated learning, multi-task learning, and mixture-of-prompt learning for LLMs, MosT distinctly outperforms strong baselines, delivering high-quality, diverse solutions that profile the entire Pareto frontier, thus ensuring balanced trade-offs across many objectives.

Guardrail Baselines for Unlearning in LLMs

Mar 05, 2024

Pratiksha Thaker, Yash Maurya, Virginia Smith

Abstract:Recent work has demonstrated that fine-tuning is a promising approach to 'unlearn' concepts from large language models. However, fine-tuning can be expensive, as it requires both generating a set of examples and running iterations of fine-tuning to update the model. In this work, we show that simple guardrail-based approaches such as prompting and filtering can achieve unlearning results comparable to fine-tuning. We recommend that researchers investigate these lightweight baselines when evaluating the performance of more computationally intensive fine-tuning methods. While we do not claim that methods such as prompting or filtering are universal solutions to the problem of unlearning, our work suggests the need for evaluation metrics that can better separate the power of guardrails vs. fine-tuning, and highlights scenarios where guardrails themselves may be advantageous for unlearning, such as in generating examples for fine-tuning or unlearning when only API access is available.

* Preliminary work, accepted to ICLR workshop SeT-LLM 2024

Attacking LLM Watermarks by Exploiting Their Strengths

Feb 25, 2024

Qi Pang, Shengyuan Hu, Wenting Zheng, Virginia Smith

Abstract:Advances in generative models have made it possible for AI-generated text, code, and images to mirror human-generated content in many applications. Watermarking, a technique that aims to embed information in the output of a model to verify its source, is useful for mitigating misuse of such AI-generated content. However, existing watermarking schemes remain surprisingly susceptible to attack. In particular, we show that desirable properties shared by existing LLM watermarking systems such as quality preservation, robustness, and public detection APIs can in turn make these systems vulnerable to various attacks. We rigorously study potential attacks in terms of common watermark design choices, and propose best practices and defenses for mitigation -- establishing a set of practical guidelines for embedding and detection of LLM watermarks.
