Abstract:Diffusion models have demonstrated remarkable success in high-fidelity image synthesis and prompt-guided generative modeling. However, ensuring adequate diversity in generated samples of prompt-guided diffusion models remains a challenge, particularly when the prompts span a broad semantic spectrum and the diversity of generated data needs to be evaluated in a prompt-aware fashion across semantically similar prompts. Recent methods have introduced guidance via diversity measures to encourage more varied generations. In this work, we extend the diversity measure-based approaches by proposing the Scalable Prompt-Aware R\'eny Kernel Entropy Diversity Guidance (SPARKE) method for prompt-aware diversity guidance. SPARKE utilizes conditional entropy for diversity guidance, which dynamically conditions diversity measurement on similar prompts and enables prompt-aware diversity control. While the entropy-based guidance approach enhances prompt-aware diversity, its reliance on the matrix-based entropy scores poses computational challenges in large-scale generation settings. To address this, we focus on the special case of Conditional latent RKE Score Guidance, reducing entropy computation and gradient-based optimization complexity from the $O(n^3)$ of general entropy measures to $O(n)$. The reduced computational complexity allows for diversity-guided sampling over potentially thousands of generation rounds on different prompts. We numerically test the SPARKE method on several text-to-image diffusion models, demonstrating that the proposed method improves the prompt-aware diversity of the generated data without incurring significant computational costs. We release our code on the project page: https://mjalali.github.io/SPARKE
Abstract:Federated learning (FL) algorithms commonly aim to maximize clients' accuracy by training a model on their collective data. However, in several FL applications, the model's decisions should meet a group fairness constraint to be independent of sensitive attributes such as gender or race. While such group fairness constraints can be incorporated into the objective function of the FL optimization problem, in this work, we show that such an approach would lead to suboptimal classification accuracy in an FL setting with heterogeneous client distributions. To achieve an optimal accuracy-group fairness trade-off, we propose the Personalized Federated Learning for Client-Level Group Fairness (pFedFair) framework, where clients locally impose their fairness constraints over the distributed training process. Leveraging the image embedding models, we extend the application of pFedFair to computer vision settings, where we numerically show that pFedFair achieves an optimal group fairness-accuracy trade-off in heterogeneous FL settings. We present the results of several numerical experiments on benchmark and synthetic datasets, which highlight the suboptimality of non-personalized FL algorithms and the improvements made by the pFedFair method.
Abstract:Fair supervised learning algorithms assigning labels with little dependence on a sensitive attribute have attracted great attention in the machine learning community. While the demographic parity (DP) notion has been frequently used to measure a model's fairness in training fair classifiers, several studies in the literature suggest potential impacts of enforcing DP in fair learning algorithms. In this work, we analytically study the effect of standard DP-based regularization methods on the conditional distribution of the predicted label given the sensitive attribute. Our analysis shows that an imbalanced training dataset with a non-uniform distribution of the sensitive attribute could lead to a classification rule biased toward the sensitive attribute outcome holding the majority of training data. To control such inductive biases in DP-based fair learning, we propose a sensitive attribute-based distributionally robust optimization (SA-DRO) method improving robustness against the marginal distribution of the sensitive attribute. Finally, we present several numerical results on the application of DP-based learning methods to standard centralized and distributed learning problems. The empirical findings support our theoretical results on the inductive biases in DP-based fair learning algorithms and the debiasing effects of the proposed SA-DRO method.