Huishuai Zhang

©Plug-in Authorization for Human Content Copyright Protection in Text-to-Image Model

Apr 18, 2024

Chao Zhou, Huishuai Zhang, Jiang Bian, Weiming Zhang, Nenghai Yu

Abstract: This paper addresses the contentious issue of copyright infringement in images generated by text-to-image models, which has sparked debates among AI developers, content creators, and legal entities. State-of-the-art models create high-quality content without crediting original creators, causing concern in the artistic community. To mitigate this, we propose the ©Plug-in Authorization framework, introducing three operations: addition, extraction, and combination. Addition involves training a ©plug-in for a specific copyright, facilitating proper credit attribution. Extraction allows creators to reclaim copyright from infringing models, and combination enables users to merge different ©plug-ins. These operations act as permits, incentivizing fair use and providing flexibility in authorization. We present innovative approaches, "Reverse LoRA" for extraction and "EasyMerge" for seamless combination. Experiments in artist-style replication and cartoon IP recreation demonstrate the effectiveness of ©plug-ins, offering a valuable solution for human copyright protection in the age of generative AI.

* 20 pages, 6 figures

On the Convergence of Adam under Non-uniform Smoothness: Separability from SGDM and Beyond

Mar 22, 2024

Bohan Wang, Huishuai Zhang, Qi Meng, Ruoyu Sun, Zhi-Ming Ma, Wei Chen

Abstract: This paper aims to clearly distinguish between Stochastic Gradient Descent with Momentum (SGDM) and Adam in terms of their convergence rates. We demonstrate that Adam converges faster than SGDM under the condition of non-uniformly bounded smoothness. Our findings reveal that: (1) in the deterministic setting, Adam can attain the known lower bound for the convergence rate of deterministic first-order optimizers, whereas the convergence rate of Gradient Descent with Momentum (GDM) has a higher-order dependence on the initial function value; (2) in the stochastic setting, Adam's convergence rate upper bound matches the lower bounds of stochastic first-order optimizers, considering both the initial function value and the final error, whereas there are instances where SGDM fails to converge with any learning rate. These insights distinctly differentiate Adam and SGDM regarding their convergence rates. Additionally, by introducing a novel stopping-time-based technique, we further prove that if we consider the minimum gradient norm during iterations, the corresponding convergence rate can match the lower bounds across all problem hyperparameters. The technique can also help prove that Adam with a specific hyperparameter scheduler is parameter-agnostic, and may therefore be of independent interest.
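
The contrast between the two update rules can be made concrete with a toy sketch. The objective f(x) = x^4, the starting point, the learning rate, and the function names below are all assumptions chosen for illustration; they are not taken from the paper, and a single fixed learning rate says nothing about the paper's "any learning rate" result. The example only shows that heavy-ball momentum takes steps proportional to the raw gradient, while Adam's normalized step stays close to the learning rate when curvature grows with |x|.

```python
import math


def grad(x):
    """Gradient of the toy objective f(x) = x**4 (curvature 12*x**2 grows with |x|)."""
    return 4.0 * x * x * x


def run_gdm(x0, lr, steps, beta=0.9):
    """Heavy-ball gradient descent with momentum: m <- beta*m + g, x <- x - lr*m."""
    x, m = x0, 0.0
    for _ in range(steps):
        m = beta * m + grad(x)
        x = x - lr * m
    return x


def run_adam(x0, lr, steps, beta1=0.9, beta2=0.999, eps=1e-8):
    """Standard Adam update with bias correction."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g          # first-moment estimate
        v = beta2 * v + (1.0 - beta2) * g * g      # second-moment estimate
        m_hat = m / (1.0 - beta1 ** t)             # bias-corrected moments
        v_hat = v / (1.0 - beta2 ** t)
        x = x - lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


if __name__ == "__main__":
    # Same start and learning rate for both optimizers (illustrative values).
    print("GDM :", run_gdm(10.0, 0.01, 5))
    print("Adam:", run_adam(10.0, 0.01, 5))
```

With these illustrative settings, the momentum iterate overshoots and grows in magnitude within a few steps, while the Adam iterate moves slowly toward the minimizer.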

Differentially Private Synthetic Data via Foundation Model APIs 2: Text

Mar 04, 2024

Chulin Xie, Zinan Lin, Arturs Backurs, Sivakanth Gopi, Da Yu, Huseyin A Inan, Harsha Nori, Haotian Jiang, Huishuai Zhang, Yin Tat Lee(+2 more)

Abstract: Text data has become extremely valuable due to the emergence of machine learning algorithms that learn from it. A lot of high-quality text data generated in the real world is private and therefore cannot be shared or used freely due to privacy concerns. Generating synthetic replicas of private text data with a formal privacy guarantee, i.e., differential privacy (DP), offers a promising and scalable solution. However, existing methods necessitate DP finetuning of large language models (LLMs) on private data to generate DP synthetic data. This approach is not viable for proprietary LLMs (e.g., GPT-3.5) and also demands considerable computational resources for open-source LLMs. Lin et al. (2024) recently introduced the Private Evolution (PE) algorithm to generate DP synthetic images with only API access to diffusion models. In this work, we propose an augmented PE algorithm, named Aug-PE, that applies to the complex setting of text. We use API access to an LLM and generate DP synthetic text without any model training. We conduct comprehensive experiments on three benchmark datasets. Our results demonstrate that Aug-PE produces DP synthetic text that yields competitive utility with the SOTA DP finetuning baselines. This underscores the feasibility of relying solely on API access of LLMs to produce high-quality DP synthetic texts, thereby facilitating more accessible routes to privacy-preserving LLM applications. Our code and data are available at https://github.com/AI-secure/aug-pe.

Exploring Transferability for Randomized Smoothing

Dec 14, 2023

Kai Qiu, Huishuai Zhang, Zhirong Wu, Stephen Lin

Abstract: Training foundation models on extensive datasets and then finetuning them on specific tasks has emerged as the mainstream approach in artificial intelligence. However, model robustness, which is a critical aspect for safety, is often optimized for each specific task rather than at the pretraining stage. In this paper, we propose a method for pretraining certifiably robust models that can be readily finetuned for adaptation to a particular task. A key challenge is dealing with the compromise between semantic learning and robustness. We address this with a simple yet highly effective strategy based on significantly broadening the pretraining data distribution, which is shown to greatly benefit finetuning for downstream tasks. Through pretraining on a mixture of clean and various noisy images, we find that surprisingly strong certified accuracy can be achieved even when finetuning on only clean images. Furthermore, this strategy requires just a single model to deal with various noise levels, thus substantially reducing computational costs in relation to previous works that employ multiple models. Despite using just one model, our method can still yield results that are on par with, or even superior to, existing multi-model methods.
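
For readers unfamiliar with the technique named in the title, the following is a minimal sketch of the prediction step of standard randomized smoothing: classify many Gaussian-perturbed copies of the input and return the majority vote. It is not the paper's pretraining method and omits the certification step entirely. `toy_classifier`, `smoothed_predict`, and all parameter values are hypothetical names and settings introduced here for illustration.

```python
import random
from collections import Counter


def smoothed_predict(base_classifier, x, sigma, n_samples=100, seed=0):
    """Majority vote of base_classifier over Gaussian-perturbed copies of x."""
    rng = random.Random(seed)  # local generator so results are reproducible
    votes = Counter()
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        votes[base_classifier(noisy)] += 1
    return votes.most_common(1)[0][0]


def toy_classifier(x):
    """Hypothetical stand-in for a trained network: class 1 if features sum to > 0."""
    return 1 if sum(x) > 0 else 0


if __name__ == "__main__":
    print(smoothed_predict(toy_classifier, [1.0, 1.0], sigma=0.1))
```

In this setup each noise level `sigma` normally calls for a base classifier trained under matching noise, which is the multi-model cost the abstract says a single broadly pretrained model avoids.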

Large Catapults in Momentum Gradient Descent with Warmup: An Empirical Study

Nov 25, 2023

Prin Phunyaphibarn, Junghyun Lee, Bohan Wang, Huishuai Zhang, Chulhee Yun

Abstract: Although gradient descent with momentum is widely used in modern deep learning, a concrete understanding of its effects on the training trajectory remains elusive. In this work, we empirically show that momentum gradient descent with a large learning rate and learning rate warmup displays large catapults, driving the iterates towards flatter minima than those found by gradient descent. We then provide empirical evidence and theoretical intuition that the large catapult is caused by momentum "amplifying" the self-stabilization effect (Damian et al., 2023).

* 19 pages, 14 figures. Accepted to the NeurIPS 2023 M3L Workshop (oral). The first two authors contributed equally

On the Generalization Properties of Diffusion Models

Nov 14, 2023

Puheng Li, Zhong Li, Huishuai Zhang, Jiang Bian

Abstract: Diffusion models are a class of generative models that serve to establish a stochastic transport map between an empirically observed, yet unknown, target distribution and a known prior. Despite their remarkable success in real-world applications, a theoretical understanding of their generalization capabilities remains underdeveloped. This work embarks on a comprehensive theoretical exploration of the generalization attributes of diffusion models. We establish theoretical estimates of the generalization gap that evolves in tandem with the training dynamics of score-based diffusion models, suggesting a polynomially small generalization error ($O(n^{-2/5}+m^{-4/5})$) in both the sample size $n$ and the model capacity $m$, evading the curse of dimensionality (i.e., not exponentially large in the data dimension) when early-stopped. Furthermore, we extend our quantitative analysis to a data-dependent scenario, wherein target distributions are portrayed as a succession of densities with progressively increasing distances between modes. This precisely elucidates the adverse effect of "modes shift" in ground truths on the model generalization. Moreover, these estimates are not solely theoretical constructs but have also been confirmed through numerical simulations. Our findings contribute to the rigorous understanding of diffusion models' generalization properties and provide insights that may guide practical applications.

* 42 pages, 11 figures

FD-Align: Feature Discrimination Alignment for Fine-tuning Pre-Trained Models in Few-Shot Learning

Nov 01, 2023

Kun Song, Huimin Ma, Bochao Zou, Huishuai Zhang, Weiran Huang

Abstract: Due to the limited availability of data, existing few-shot learning methods trained from scratch fail to achieve satisfactory performance. In contrast, large-scale pre-trained models such as CLIP demonstrate remarkable few-shot and zero-shot capabilities. To enhance the performance of pre-trained models for downstream tasks, fine-tuning the model on downstream data is frequently necessary. However, fine-tuning the pre-trained model leads to a decrease in its generalizability in the presence of distribution shift, while the limited number of samples in few-shot learning makes the model highly susceptible to overfitting. Consequently, existing fine-tuning methods for few-shot learning primarily focus on fine-tuning the model's classification head or introducing additional structure. In this paper, we introduce a fine-tuning approach termed Feature Discrimination Alignment (FD-Align). Our method aims to bolster the model's generalizability by preserving the consistency of spurious features across the fine-tuning process. Extensive experimental results validate the efficacy of our approach for both ID and OOD tasks. Once fine-tuned, the model can seamlessly integrate with existing methods, leading to performance improvements. Our code can be found at https://github.com/skingorz/FD-Align.

* Accepted by NeurIPS 2023

Closing the Gap Between the Upper Bound and the Lower Bound of Adam's Iteration Complexity

Oct 27, 2023

Bohan Wang, Jingwen Fu, Huishuai Zhang, Nanning Zheng, Wei Chen

Abstract: Recently, Arjevani et al. [1] established a lower bound of iteration complexity for first-order optimization under an $L$-smooth condition and a bounded noise variance assumption. However, a thorough review of existing literature on Adam's convergence reveals a noticeable gap: none of the existing results meets the above lower bound. In this paper, we close the gap by deriving a new convergence guarantee of Adam, with only an $L$-smooth condition and a bounded noise variance assumption. Our results remain valid across a broad spectrum of hyperparameters. In particular, with properly chosen hyperparameters, we derive an upper bound of the iteration complexity of Adam and show that it meets the lower bound for first-order optimizers. To the best of our knowledge, this is the first work to establish such a tight upper bound for Adam's convergence. Our proof utilizes novel techniques to handle the entanglement between momentum and adaptive learning rate and to convert the first-order term in the Descent Lemma to the gradient norm, which may be of independent interest.

* Accepted by NeurIPS 2023

DiffKendall: A Novel Approach for Few-Shot Learning with Differentiable Kendall's Rank Correlation

Jul 28, 2023

Kaipeng Zheng, Huishuai Zhang, Weiran Huang

Abstract: Few-shot learning aims to adapt models trained on the base dataset to novel tasks whose categories the model has not seen before. This often leads to a relatively uniform distribution of feature values across channels on novel classes, posing challenges in determining channel importance for novel tasks. Standard few-shot learning methods employ geometric similarity metrics such as cosine similarity and negative Euclidean distance to gauge the semantic relatedness between two features. However, features with high geometric similarities may carry distinct semantics, especially in the context of few-shot learning. In this paper, we demonstrate that the importance ranking of feature channels is a more reliable indicator for few-shot learning than geometric similarity metrics. We observe that replacing the geometric similarity metric with Kendall's rank correlation only during inference improves the performance of few-shot learning across a wide range of datasets with different domains. Furthermore, we propose a carefully designed differentiable loss for meta-training to address the non-differentiability issue of Kendall's rank correlation. Extensive experiments demonstrate that the proposed rank-correlation-based approach substantially enhances few-shot learning performance.
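
As a rough illustration of the quantities involved, the sketch below computes Kendall's rank correlation (the tau-a variant) between two feature vectors, together with a smooth surrogate that replaces the sign function with `tanh`. The surrogate is one common smoothing choice and is an assumption here; the abstract does not specify the form of the paper's differentiable loss. The function names and the sharpness parameter `alpha` are hypothetical.

```python
import math
from itertools import combinations


def _sign(a):
    """Return -1, 0, or 1 according to the sign of a."""
    return (a > 0) - (a < 0)


def kendall_tau(x, y):
    """Kendall's tau-a: mean agreement in pairwise ordering between x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    total = 0
    for i, j in combinations(range(n), 2):
        total += _sign(x[i] - x[j]) * _sign(y[i] - y[j])
    return total / (n * (n - 1) / 2)


def soft_kendall_tau(x, y, alpha=10.0):
    """Smooth surrogate (assumed form): sign(.) replaced by tanh(alpha * .)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    total = 0.0
    for i, j in combinations(range(n), 2):
        total += math.tanh(alpha * (x[i] - x[j])) * math.tanh(alpha * (y[i] - y[j]))
    return total / (n * (n - 1) / 2)


if __name__ == "__main__":
    a, b = [0.1, 0.5, 0.9], [1.0, 2.0, 3.0]
    print(kendall_tau(a, b), soft_kendall_tau(a, b, alpha=50.0))
```

The hard version depends only on the ordering of channel values, so it has zero gradient almost everywhere; the surrogate approaches it as `alpha` grows while remaining differentiable.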

FILM: How can Few-Shot Image Classification Benefit from Pre-Trained Language Models?

Jul 09, 2023

Zihao Jiang, Yunkai Dang, Dong Pang, Huishuai Zhang, Weiran Huang

Abstract: Few-shot learning aims to train models that can be generalized to novel classes with only a few samples. Recently, a line of work has been proposed to enhance few-shot learning with accessible semantic information from class names. However, these works focus on improving existing modules such as visual prototypes and feature extractors of the standard few-shot learning framework. This limits the full potential use of semantic information. In this paper, we propose a novel few-shot learning framework that uses pre-trained language models based on contrastive learning. To address the challenge of alignment between visual features and textual embeddings obtained from a text-based pre-trained language model, we carefully design the textual branch of our framework and introduce a metric module to generalize the cosine similarity. For better transferability, we let the metric module adapt to different few-shot tasks and adopt MAML to train the model via bi-level optimization. Moreover, we conduct extensive experiments on multiple benchmarks to demonstrate the effectiveness of our method.
