Abstract:This paper introduces a novel Multi-Agent Cooperative Learning (MACL) framework to address cross-modal alignment collapse in vision-language models when handling out-of-distribution (OOD) concepts. Four core agents, including image, text, name, and coordination agents, collaboratively mitigate modality imbalance through structured message passing. The proposed framework enables multi-agent feature space name learning, incorporates a context exchange enhanced few-shot learning algorithm, and adopts an adaptive dynamic balancing mechanism to regulate inter-agent contributions. Experiments on the VISTA-Beyond dataset demonstrate that MACL significantly improves performance in both few-shot and zero-shot settings, achieving 1-5% precision gains across diverse visual domains.
Abstract:Despite the generative model's groundbreaking success, the need to study its implications for privacy and utility becomes more urgent. Although many studies have demonstrated the privacy threats brought by GANs, no existing survey has systematically categorized the privacy and utility perspectives of GANs and VAEs. In this article, we comprehensively study privacy-preserving generative models, articulating the novel taxonomies for both privacy and utility metrics by analyzing 100 research publications. Finally, we discuss the current challenges and future research directions that help new researchers gain insight into the underlying concepts.