Yuxin Wen

Efficient Vision-Language Models by Summarizing Visual Tokens into Compact Registers

Oct 17, 2024

Yuxin Wen, Qingqing Cao, Qichen Fu, Sachin Mehta, Mahyar Najibi

Abstract:Recent advancements in vision-language models (VLMs) have expanded their potential for real-world applications, enabling these models to perform complex reasoning on images. In the widely used fully autoregressive transformer-based models like LLaVA, projected visual tokens are prepended to textual tokens. Visual tokens often significantly outnumber prompt tokens, resulting in increased computational overhead during both training and inference. In this paper, we propose Visual Compact Token Registers (Victor), a method that reduces the number of visual tokens by summarizing them into a smaller set of register tokens. Victor adds a few learnable register tokens after the visual tokens and summarizes the visual information into these registers using the first few layers in the language tower of VLMs. After these few layers, all visual tokens are discarded, significantly improving computational efficiency for both training and inference. Notably, our method is easy to implement and requires a small number of new trainable parameters with minimal impact on model performance. In our experiment, with merely 8 visual registers--about 1% of the original tokens--Victor shows less than a 4% accuracy drop while reducing the total training time by 43% and boosting the inference throughput by 3.3X.
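
The token flow described in this abstract can be sketched as plain sequence bookkeeping. This is a minimal illustration, not the authors' implementation: the function name `forward_with_registers`, the parameter `num_summary_layers`, and the use of identity functions as stand-in layers are all assumptions made for the example. It only shows where the register tokens sit and when the visual tokens are dropped.

```python
def forward_with_registers(visual, registers, text, layers, num_summary_layers):
    """Run `layers` over [visual, registers, text] and drop the visual tokens
    once the first `num_summary_layers` layers have run (hypothetical sketch)."""
    # Registers are placed after the visual tokens and before the text tokens.
    sequence = list(visual) + list(registers) + list(text)
    lengths = []  # sequence length seen by each layer, for inspection
    for index, layer in enumerate(layers):
        if index == num_summary_layers:
            # From this layer on, only registers and text tokens remain.
            sequence = sequence[len(visual):]
        lengths.append(len(sequence))
        sequence = layer(sequence)
    return sequence, lengths


# Toy usage: identity "layers" stand in for transformer blocks.
identity = lambda seq: seq
visual = [f"v{i}" for i in range(16)]
registers = ["r0", "r1"]
text = ["t0", "t1", "t2"]
output, lengths = forward_with_registers(
    visual, registers, text, [identity] * 4, num_summary_layers=2
)
```

In this toy run the first two layers see all 21 tokens and the remaining layers see only the 5 register and text tokens, which is where the efficiency gain would come from.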

Detecting, Explaining, and Mitigating Memorization in Diffusion Models

Jul 31, 2024

Yuxin Wen, Yuchen Liu, Chen Chen, Lingjuan Lyu

Abstract:Recent breakthroughs in diffusion models have exhibited exceptional image-generation capabilities. However, studies show that some outputs are merely replications of training data. Such replications present potential legal challenges for model owners, especially when the generated content contains proprietary information. In this work, we introduce a straightforward yet effective method for detecting memorized prompts by inspecting the magnitude of text-conditional predictions. Our proposed method seamlessly integrates without disrupting sampling algorithms, and delivers high accuracy even at the first generation step, with a single generation per prompt. Building on our detection strategy, we unveil an explainable approach that shows the contribution of individual words or tokens to memorization. This offers an interactive medium for users to adjust their prompts. Moreover, we propose two strategies to mitigate memorization by leveraging the magnitude of text-conditional predictions, either through minimization during inference or filtering during training. These proposed strategies effectively counteract memorization while maintaining high generation quality. Code is available at https://github.com/YuxinWenRick/diffusion_memorization.

* 16 pages, 9 figures, accepted as oral presentation in ICLR 2024

Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs

Jun 14, 2024

Abhimanyu Hans, Yuxin Wen, Neel Jain, John Kirchenbauer, Hamid Kazemi, Prajwal Singhania, Siddharth Singh, Gowthami Somepalli, Jonas Geiping, Abhinav Bhatele (+1 more)

Abstract:Large language models can memorize and repeat their training data, causing privacy and copyright risks. To mitigate memorization, we introduce a subtle modification to the next-token training objective that we call the goldfish loss. During training, a randomly sampled subset of tokens is excluded from the loss computation. These dropped tokens are not memorized by the model, which prevents verbatim reproduction of a complete chain of tokens from the training set. We run extensive experiments training billion-scale Llama-2 models, both pre-trained and trained from scratch, and demonstrate significant reductions in extractable memorization with little to no impact on downstream benchmarks.

* 9.5 pages, 8 figures, and 1 table in the main body. Code available at https://github.com/ahans30/goldfish-loss
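
The idea of excluding a random subset of tokens from the loss can be sketched in a few lines. This is an illustrative sketch only, not the code from the linked repository: the function name `masked_token_loss`, the `drop_prob` parameter, and the independent per-token sampling are assumptions chosen for the example, and the input is a plain list of per-token log-probabilities rather than model output.

```python
import random


def masked_token_loss(token_log_probs, drop_prob=0.25, seed=0):
    """Mean negative log-likelihood over a randomly kept subset of tokens.

    Each position is dropped independently with probability `drop_prob`
    (a hypothetical choice); dropped positions contribute nothing to the loss.
    """
    rng = random.Random(seed)
    keep = [rng.random() >= drop_prob for _ in token_log_probs]
    kept_losses = [-lp for lp, kept in zip(token_log_probs, keep) if kept]
    if not kept_losses:
        # Every token was dropped, so there is nothing to average.
        return 0.0, keep
    return sum(kept_losses) / len(kept_losses), keep


# Toy usage with made-up log-probabilities for a six-token sequence.
log_probs = [-0.5, -1.2, -0.1, -2.3, -0.7, -0.9]
loss, keep_mask = masked_token_loss(log_probs, drop_prob=0.25, seed=0)
```

Because dropped positions never receive a training signal, a model trained this way has no incentive to reproduce them exactly, which is the mechanism the abstract describes.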

GenQA: Generating Millions of Instructions from a Handful of Prompts

Jun 14, 2024

Jiuhai Chen, Rifaa Qadri, Yuxin Wen, Neel Jain, John Kirchenbauer, Tianyi Zhou, Tom Goldstein

Abstract:Most public instruction finetuning datasets are relatively small compared to the closed source datasets used to train industry models. To study questions about finetuning at scale, such as curricula and learning rate cooldown schedules, there is a need for industrial-scale datasets. However, this scale necessitates a data generation process that is almost entirely automated. In this work, we study methods for generating large instruction datasets from a single prompt. With little human oversight, we get LLMs to write diverse sets of instruction examples ranging from simple completion tasks to complex multi-turn dialogs across a variety of subject areas. When finetuning a Llama-3 8B base model, our dataset meets or exceeds both WizardLM and Ultrachat on both knowledge-intensive leaderboard tasks as well as conversational evaluations. We release our dataset, the "generator" prompts that created it, and our finetuned model checkpoints.

* 9.5 pages, 6 Figures, and 3 tables in the main body. Dataset available at https://huggingface.co/datasets/tomg-group-umd/GenQA

Is Synthetic Image Useful for Transfer Learning? An Investigation into Data Generation, Volume, and Utilization

Apr 02, 2024

Yuhang Li, Xin Dong, Chen Chen, Jingtao Li, Yuxin Wen, Michael Spranger, Lingjuan Lyu

Abstract:Synthetic image data generation represents a promising avenue for training deep learning models, particularly in the realm of transfer learning, where obtaining real images within a specific domain can be prohibitively expensive due to privacy and intellectual property considerations. This work delves into the generation and utilization of synthetic images derived from text-to-image generative models in facilitating transfer learning paradigms. Despite the high visual fidelity of the generated images, we observe that their naive incorporation into existing real-image datasets does not consistently enhance model performance due to the inherent distribution gap between synthetic and real images. To address this issue, we introduce a novel two-stage framework called bridged transfer, which initially employs synthetic images for fine-tuning a pre-trained model to improve its transferability and subsequently uses real data for rapid adaptation. In addition, we propose a dataset style inversion strategy to improve the stylistic alignment between synthetic and real images. Our proposed methods are evaluated across 10 different datasets and 5 distinct models, demonstrating consistent improvements, with up to 30% accuracy increase on classification tasks. Intriguingly, we note that the enhancements were not yet saturated, indicating that the benefits may further increase with an expanded volume of synthetic data.

* ICLR24 Score 6865 https://openreview.net/forum?id=CjPt1AC6w0

Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models

Apr 01, 2024

Yuxin Wen, Leo Marchyok, Sanghyun Hong, Jonas Geiping, Tom Goldstein, Nicholas Carlini

Abstract:It is commonplace to produce application-specific models by fine-tuning large pre-trained models using a small bespoke dataset. The widespread availability of foundation model checkpoints on the web poses considerable risks, including the vulnerability to backdoor attacks. In this paper, we unveil a new vulnerability: the privacy backdoor attack. This black-box privacy attack aims to amplify the privacy leakage that arises when fine-tuning a model: when a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model. We conduct extensive experiments on various datasets and models, including both vision-language models (CLIP) and large language models, demonstrating the broad applicability and effectiveness of such an attack. Additionally, we carry out multiple ablation studies with different fine-tuning methods and inference strategies to thoroughly analyze this new threat. Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.

Coercing LLMs to do and reveal anything

Feb 21, 2024

Jonas Geiping, Alex Stein, Manli Shu, Khalid Saifullah, Yuxin Wen, Tom Goldstein

Abstract:It has recently been shown that adversarial attacks on large language models (LLMs) can "jailbreak" the model into making harmful statements. In this work, we argue that the spectrum of adversarial attacks on LLMs is much larger than merely jailbreaking. We provide a broad overview of possible attack surfaces and attack goals. Based on a series of concrete examples, we discuss, categorize and systematize attacks that coerce varied unintended behaviors, such as misdirection, model control, denial-of-service, or data extraction. We analyze these attacks in controlled experiments, and find that many of them stem from the practice of pre-training LLMs with coding capabilities, as well as the continued existence of strange "glitch" tokens in common LLM vocabularies that should be removed for security reasons.

* 32 pages. Implementation available at https://github.com/JonasGeiping/carving

Benchmarking the Robustness of Image Watermarks

Jan 22, 2024

Bang An, Mucong Ding, Tahseen Rabbani, Aakriti Agrawal, Yuancheng Xu, Chenghao Deng, Sicheng Zhu, Abdirisak Mohamed, Yuxin Wen, Tom Goldstein (+1 more)

Abstract:This paper investigates the weaknesses of image watermarking techniques. We present WAVES (Watermark Analysis Via Enhanced Stress-testing), a novel benchmark for assessing watermark robustness, overcoming the limitations of current evaluation methods. WAVES integrates detection and identification tasks, and establishes a standardized evaluation protocol comprised of a diverse range of stress tests. The attacks in WAVES range from traditional image distortions to advanced and novel variations of diffusive and adversarial attacks. Our evaluation examines two pivotal dimensions: the degree of image quality degradation and the efficacy of watermark detection after attacks. We develop a series of Performance vs. Quality 2D plots, varying over several prominent image similarity metrics, which are then aggregated in a heuristically novel manner to paint an overall picture of watermark robustness and attack potency. Our comprehensive evaluation reveals previously undetected vulnerabilities of several modern watermarking algorithms. We envision WAVES as a toolkit for the future development of robust watermarking systems. The project is available at https://wavesbench.github.io/

NEFTune: Noisy Embeddings Improve Instruction Finetuning

Oct 10, 2023

Neel Jain, Ping-yeh Chiang, Yuxin Wen, John Kirchenbauer, Hong-Min Chu, Gowthami Somepalli, Brian R. Bartoldson, Bhavya Kailkhura, Avi Schwarzschild, Aniruddha Saha (+3 more)

Abstract:We show that language model finetuning can be improved, sometimes dramatically, with a simple augmentation. NEFTune adds noise to the embedding vectors during training. Standard finetuning of LLaMA-2-7B using Alpaca achieves 29.79% on AlpacaEval, which rises to 64.69% using noisy embeddings. NEFTune also improves over strong baselines on modern instruction datasets. Models trained with Evol-Instruct see a 10% improvement, with ShareGPT an 8% improvement, and with OpenPlatypus an 8% improvement. Even powerful models further refined with RLHF such as LLaMA-2-Chat benefit from additional training with NEFTune.

* 25 pages, Code is available on Github: https://github.com/neelsjain/NEFTune
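
Adding noise to embedding vectors during training can be sketched as follows. The abstract does not state the noise distribution or its scale, so this sketch assumes uniform noise in [-1, 1] scaled by `alpha / sqrt(seq_len * dim)`; the function name `add_embedding_noise`, the default `alpha`, and the list-of-lists representation are likewise assumptions for illustration, not the API of the linked repository.

```python
import math
import random


def add_embedding_noise(embeddings, alpha=5.0, rng=None, training=True):
    """Return a noisy copy of a (seq_len x dim) list of embedding vectors.

    Assumed scheme: uniform noise in [-1, 1], scaled by alpha / sqrt(seq_len * dim).
    Noise is applied only when `training` is True.
    """
    if not training or not embeddings:
        return [list(row) for row in embeddings]
    rng = rng or random.Random()
    seq_len, dim = len(embeddings), len(embeddings[0])
    scale = alpha / math.sqrt(seq_len * dim)
    return [[x + rng.uniform(-1.0, 1.0) * scale for x in row] for row in embeddings]


# Toy usage: a two-token sequence with three-dimensional embeddings.
clean = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
noisy = add_embedding_noise(clean, alpha=5.0, rng=random.Random(0))
```

At evaluation time the call with `training=False` returns the embeddings unchanged, so the perturbation affects training only.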

Baseline Defenses for Adversarial Attacks Against Aligned Language Models

Sep 04, 2023

Neel Jain, Avi Schwarzschild, Yuxin Wen, Gowthami Somepalli, John Kirchenbauer, Ping-yeh Chiang, Micah Goldblum, Aniruddha Saha, Jonas Geiping, Tom Goldstein

Abstract:As Large Language Models quickly become ubiquitous, it becomes critical to understand their security vulnerabilities. Recent work shows that text optimizers can produce jailbreaking prompts that bypass moderation and alignment. Drawing from the rich body of work on adversarial machine learning, we approach these attacks with three questions: What threat models are practically useful in this domain? How do baseline defense techniques perform in this new domain? How does LLM security differ from computer vision? We evaluate several baseline defense strategies against leading adversarial attacks on LLMs, discussing the various settings in which each is feasible and effective. Particularly, we look at three types of defenses: detection (perplexity based), input preprocessing (paraphrase and retokenization), and adversarial training. We discuss white-box and gray-box settings and discuss the robustness-performance trade-off for each of the defenses considered. We find that the weakness of existing discrete optimizers for text, combined with the relatively high costs of optimization, makes standard adaptive attacks more challenging for LLMs. Future research will be needed to uncover whether more powerful optimizers can be developed, or whether the strength of filtering and preprocessing defenses is greater in the LLMs domain than it has been in computer vision.

* 12 pages
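
The perplexity-based detection mentioned in this abstract can be illustrated with a small filter. This is a sketch under stated assumptions, not the paper's implementation: the function names, the `threshold` value, and the made-up log-probabilities are hypothetical, and in practice the per-token log-probabilities would come from a language model scoring the prompt.

```python
import math


def perplexity(token_log_probs):
    """Perplexity of a sequence from its per-token natural-log probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def is_suspicious(token_log_probs, threshold):
    """Flag a prompt whose perplexity exceeds a chosen threshold (hypothetical)."""
    return perplexity(token_log_probs) > threshold


# Toy usage: fluent text gets high token probabilities, while an optimized
# adversarial suffix tends to look like low-probability gibberish.
fluent = [math.log(0.5)] * 4
gibberish = [math.log(0.01)] * 5
flags = (is_suspicious(fluent, threshold=50.0), is_suspicious(gibberish, threshold=50.0))
```

The threshold trades off false positives on unusual but benign prompts against missed attacks, which relates to the robustness-performance trade-off the abstract discusses.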
