David Doermann

Federated Learning with Privacy-Preserving Ensemble Attention Distillation

Oct 16, 2022

Xuan Gong, Liangchen Song, Rishi Vedula, Abhishek Sharma, Meng Zheng, Benjamin Planche, Arun Innanje, Terrence Chen, Junsong Yuan, David Doermann(+1 more)

Abstract:Federated Learning (FL) is a machine learning paradigm where many local nodes collaboratively train a central model while keeping the training data decentralized. This is particularly relevant for clinical applications since patient data are usually not allowed to be transferred out of medical facilities, leading to the need for FL. Existing FL methods typically share model parameters or employ co-distillation to address the issue of unbalanced data distribution. However, they also require numerous rounds of synchronized communication and, more importantly, suffer from a privacy leakage risk. We propose a privacy-preserving FL framework leveraging unlabeled public data for one-way offline knowledge distillation in this work. The central model is learned from local knowledge via ensemble attention distillation. Our technique uses decentralized and heterogeneous local data like existing FL approaches, but more importantly, it significantly reduces the risk of privacy leakage. We demonstrate that our method achieves very competitive performance with more robust privacy preservation based on extensive experiments on image classification, segmentation, and reconstruction tasks.

PREF: Predictability Regularized Neural Motion Fields

Sep 21, 2022

Liangchen Song, Xuan Gong, Benjamin Planche, Meng Zheng, David Doermann, Junsong Yuan, Terrence Chen, Ziyan Wu

Abstract:Knowing the 3D motions in a dynamic scene is essential to many vision applications. Recent progress is mainly focused on estimating the activity of some specific elements like humans. In this paper, we leverage a neural motion field for estimating the motion of all points in a multiview setting. Modeling the motion from a dynamic scene with multiview data is challenging due to the ambiguities in points of similar color and points with time-varying color. We propose to regularize the estimated motion to be predictable. If the motion from previous frames is known, then the motion in the near future should be predictable. Therefore, we introduce a predictability regularization by first conditioning the estimated motion on latent embeddings, then by adopting a predictor network to enforce predictability on the embeddings. The proposed framework PREF (Predictability REgularized Fields) achieves on par or better results than state-of-the-art neural motion field-based dynamic scene representation methods, while requiring no prior knowledge of the scene.

* Accepted at ECCV 2022 (oral). Paper + supplementary material
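The predictability idea in the PREF abstract (motion embeddings from previous frames should let a predictor anticipate the next one) can be illustrated with a toy sketch. This is not the authors' implementation: the two-frame history, the `linear_extrapolation` stand-in for the learned predictor network, and the mean-squared-error penalty are all assumptions made for illustration.

```python
def predictability_loss(embeddings, predictor):
    """Mean squared error between each embedding and its prediction
    from the two preceding embeddings (hypothetical history length)."""
    total, count = 0.0, 0
    for t in range(2, len(embeddings)):
        predicted = predictor(embeddings[t - 2], embeddings[t - 1])
        total += sum((p - z) ** 2 for p, z in zip(predicted, embeddings[t]))
        count += len(embeddings[t])
    return total / count if count else 0.0


def linear_extrapolation(prev2, prev1):
    """Stand-in predictor (an assumption): continue the last step linearly."""
    return [2 * b - a for a, b in zip(prev2, prev1)]


# Per-frame latent embeddings; the values are made up for the example.
smooth = [[0.0, 0.0], [1.0, 0.5], [2.0, 1.0], [3.0, 1.5]]
jerky = [[0.0, 0.0], [1.0, 0.5], [0.0, 2.0], [3.0, -1.0]]

print(predictability_loss(smooth, linear_extrapolation))
print(predictability_loss(jerky, linear_extrapolation))
```

A sequence that evolves smoothly incurs no penalty under this stand-in predictor, while an erratic one is penalized, which is the kind of pressure a predictability regularizer would put on the estimated motion.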

Preserving Privacy in Federated Learning with Ensemble Cross-Domain Knowledge Distillation

Sep 10, 2022

Xuan Gong, Abhishek Sharma, Srikrishna Karanam, Ziyan Wu, Terrence Chen, David Doermann, Arun Innanje

Abstract:Federated Learning (FL) is a machine learning paradigm where local nodes collaboratively train a central model while the training data remains decentralized. Existing FL methods typically share model parameters or employ co-distillation to address the issue of unbalanced data distribution. However, they suffer from communication bottlenecks. More importantly, they risk privacy leakage. In this work, we develop a privacy-preserving and communication-efficient method in an FL framework with one-shot offline knowledge distillation using unlabeled, cross-domain public data. We propose a quantized and noisy ensemble of local predictions from completely trained local models for stronger privacy guarantees without sacrificing accuracy. Based on extensive experiments on image classification and text classification tasks, we show that our privacy-preserving method outperforms baseline FL algorithms in both accuracy and communication efficiency.

* Accepted at AAAI 2022
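The "quantized and noisy ensemble of local predictions" mentioned in the abstract above can be pictured with a minimal sketch. The abstract does not give the quantization scheme or the noise distribution, so the evenly spaced quantization levels, the Gaussian noise, and the names `quantize` and `noisy_quantized_ensemble` are assumptions for illustration only.

```python
import random


def quantize(probs, levels=4):
    """Snap each probability to the nearest of `levels` evenly spaced values in [0, 1]."""
    step = 1.0 / (levels - 1)
    return [round(p / step) * step for p in probs]


def noisy_quantized_ensemble(local_probs, levels=4, noise_scale=0.05, seed=0):
    """Combine per-node class probabilities for one public sample into a soft label."""
    rng = random.Random(seed)
    num_classes = len(local_probs[0])
    totals = [0.0] * num_classes
    for probs in local_probs:
        coarse = quantize(probs, levels)
        for c in range(num_classes):
            # Gaussian noise is an assumption; it blurs each node's contribution.
            totals[c] += coarse[c] + rng.gauss(0.0, noise_scale)
    clipped = [max(t, 0.0) for t in totals]  # noise can push a total below zero
    norm = sum(clipped)
    if norm == 0.0:
        return [1.0 / num_classes] * num_classes
    return [t / norm for t in clipped]


# Predictions of three hypothetical local models on one unlabeled public sample.
local_predictions = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.8, 0.1, 0.1],
]
print(noisy_quantized_ensemble(local_predictions))
```

In a one-shot distillation setup, a soft label like this would serve as the training target for the central model on that public sample, so only coarse, perturbed predictions ever leave the local nodes.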

Self-supervised Human Mesh Recovery with Cross-Representation Alignment

Sep 10, 2022

Xuan Gong, Meng Zheng, Benjamin Planche, Srikrishna Karanam, Terrence Chen, David Doermann, Ziyan Wu

Abstract:Fully supervised human mesh recovery methods are data-hungry and have poor generalizability due to the limited availability and diversity of 3D-annotated benchmark datasets. Recent progress in self-supervised human mesh recovery has been made using synthetic-data-driven training paradigms where the model is trained on synthetic pairs of 2D representations (e.g., 2D keypoints and segmentation masks) and 3D meshes. However, synthetic dense correspondence maps (i.e., IUV) have rarely been explored, since the domain gap between synthetic training data and real testing data is hard to address for 2D dense representations. To alleviate this domain gap on IUV, we propose cross-representation alignment utilizing the complementary information from the robust but sparse representation (2D keypoints). Specifically, the alignment errors between the initial mesh estimation and both 2D representations are fed into the regressor and dynamically corrected in the subsequent mesh regression. This adaptive cross-representation alignment explicitly learns from the deviations and captures complementary information: robustness from the sparse representation and richness from the dense representation. We conduct extensive experiments on multiple standard benchmark datasets and demonstrate competitive results, helping take a step towards reducing the annotation effort needed to produce state-of-the-art models in human mesh estimation.

* Accepted at ECCV 2022

Confidence Dimension for Deep Learning based on Hoeffding Inequality and Relative Evaluation

Mar 17, 2022

Runqi Wang, Linlin Yang, Baochang Zhang, Wentao Zhu, David Doermann, Guodong Guo

Abstract:Research on the generalization ability of deep neural networks (DNNs) has recently attracted a great deal of attention. However, due to their complex architectures and large numbers of parameters, measuring the generalization ability of specific DNN models remains an open challenge. In this paper, we propose to use multiple factors to measure and rank the relative generalization of DNNs based on a new concept of confidence dimension (CD). Furthermore, we provide a feasible framework in our CD to theoretically calculate the upper bound of generalization based on the conventional Vapnik-Chervonenkis dimension (VC-dimension) and Hoeffding's inequality. Experimental results on image classification and object detection demonstrate that our CD can reflect the relative generalization ability for different DNNs. In addition to full-precision DNNs, we also analyze the generalization ability of binary neural networks (BNNs), whose generalization ability remains an unsolved problem. Our CD yields a consistent and reliable measure and ranking for both full-precision DNNs and BNNs on all the tasks.
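For readers unfamiliar with Hoeffding's inequality, the sketch below computes the textbook generalization gap it implies. It is not the confidence dimension proposed in the paper. For a loss bounded in [0, 1] and a finite set of M hypotheses, the inequality combined with a union bound gives, with probability at least 1 - delta, a gap between training and test error of at most sqrt(ln(2M / delta) / (2N)) for N independent samples. The function name `hoeffding_gap` is hypothetical.

```python
import math


def hoeffding_gap(num_samples, delta=0.05, num_hypotheses=1):
    """Upper bound on |test error - training error| that holds with
    probability at least 1 - delta, for a loss bounded in [0, 1]."""
    if num_samples <= 0 or num_hypotheses < 1 or not 0.0 < delta < 1.0:
        raise ValueError("need num_samples > 0, num_hypotheses >= 1, 0 < delta < 1")
    return math.sqrt(math.log(2.0 * num_hypotheses / delta) / (2.0 * num_samples))


# The bound tightens as the number of samples grows.
for n in (1_000, 10_000, 100_000):
    print(n, round(hoeffding_gap(n), 4))
```

The bound shrinks at a rate of 1 / sqrt(N) and grows only logarithmically with the number of hypotheses, which is why bounds of this family are usually paired with a capacity measure such as the VC-dimension when the hypothesis set is infinite.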

Associative Adversarial Learning Based on Selective Attack

Jan 04, 2022

Runqi Wang, Xiaoyue Duan, Baochang Zhang, Song Xue, Wentao Zhu, David Doermann, Guodong Guo

Abstract:A human's attention can intuitively adapt to corrupted areas of an image by recalling a similar uncorrupted image they have previously seen. This observation motivates us to improve the attention of adversarial images by considering their clean counterparts. To accomplish this, we introduce Associative Adversarial Learning (AAL) into adversarial learning to guide a selective attack. We formulate the intrinsic relationship between attention and attack (perturbation) as a coupling optimization problem to improve their interaction. This leads to an attention backtracking algorithm that can effectively enhance the attention's adversarial robustness. Our method is generic and can be used to address a variety of tasks by simply choosing different kernels for the associative attention that select other regions for a specific attack. Experimental results show that the selective attack improves the model's performance. We show that our method improves the recognition accuracy of adversarial training on ImageNet by 8.32% compared with the baseline. It also increases object detection mAP on PascalVOC by 2.02% and recognition accuracy of few-shot learning on miniImageNet by 1.63%.

Learning Robot Swarm Tactics over Complex Adversarial Environments

Sep 13, 2021

Amir Behjat, Hemanth Manjunatha, Prajit KrisshnaKumar, Apurv Jani, Leighton Collins, Payam Ghassemi, Joseph Distefano, David Doermann, Karthik Dantu, Ehsan Esfahani(+1 more)

Abstract:To accomplish complex swarm robotic missions in the real world, one needs to plan and execute a combination of single robot behaviors, group primitives such as task allocation, path planning, and formation control, and mission-specific objectives such as target search and group coverage. Most such missions are designed manually by teams of robotics experts. Recent work in automated approaches to learning swarm behavior has been limited to individual primitives with sparse work on learning complete missions. This paper presents a systematic approach to learn tactical mission-specific policies that compose primitives in a swarm to accomplish the mission efficiently using neural networks with special input and output encoding. To learn swarm tactics in an adversarial environment, we employ a combination of 1) map-to-graph abstraction, 2) input/output encoding via Pareto filtering of points of interest and clustering of robots, and 3) learning via neuroevolution and policy gradient approaches. We illustrate this combination as critical to providing tractable learning, especially given the computational cost of simulating swarm missions of this scale and complexity. Successful mission completion outcomes are demonstrated with up to 60 robots. In addition, a close match in the performance statistics in training and testing scenarios shows the potential generalizability of the proposed framework.

* Accepted to IEEE International Symposium on Multi-Robot and Multi-Agent Systems 2021
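The abstract above lists Pareto filtering of points of interest as part of the input/output encoding. A generic Pareto filter, which keeps only the candidates not dominated by any other, might look like the sketch below. The two objectives (distance and threat level, both to be minimized) and the sample values are hypothetical; the abstract does not state which criteria the authors use.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and strictly
    better in at least one (lower values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_filter(points):
    """Return the non-dominated points, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Each point of interest is scored as (distance, threat); values are made up.
candidates = [(2, 9), (4, 4), (9, 1), (5, 5), (10, 10)]
print(pareto_filter(candidates))
```

Filtering this way shrinks the set of points a policy has to consider while keeping every candidate that represents a distinct trade-off, which fits the abstract's emphasis on making learning tractable.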

Semantic Text-to-Face GAN -ST^2FG

Jul 22, 2021

Manan Oza, Sukalpa Chanda, David Doermann

Abstract:Faces generated using generative adversarial networks (GANs) have reached unprecedented realism. These faces, also known as "Deep Fakes", appear as realistic photographs with very few pixel-level distortions. While some work has enabled the training of models that lead to the generation of specific properties of the subject, generating a facial image based on a natural language description has not been fully explored. For security and criminal identification, the ability to provide a GAN-based system that works like a sketch artist would be incredibly useful. In this paper, we present a novel approach to generate facial images from semantic text descriptions. The learned model is provided with a text description and an outline of the type of face, which the model uses to sketch the features. Our models are trained using an Affine Combination Module (ACM) mechanism to combine the text embedding from BERT and the GAN latent space using a self-attention matrix. This avoids the loss of features due to inadequate "attention", which may happen if the text embedding and latent vector are simply concatenated. Our approach can generate images that align closely with exhaustive textual descriptions of faces, including many fine-grained facial features, which helps produce better images. The proposed method is also capable of making incremental changes to a previously generated image if it is provided with additional textual descriptions or sentences.

* arXiv admin note: text overlap with arXiv:2010.12136 by other authors

Two-Stream Consensus Network: Submission to HACS Challenge 2021 Weakly-Supervised Learning Track

Jul 11, 2021

Yuanhao Zhai, Le Wang, David Doermann, Junsong Yuan

Abstract:This technical report presents our solution to the HACS Temporal Action Localization Challenge 2021, Weakly-Supervised Learning Track. The goal of weakly-supervised temporal action localization is to temporally locate and classify actions of interest in untrimmed videos given only video-level labels. We adopt the two-stream consensus network (TSCN) as the main framework in this challenge. The TSCN consists of a two-stream base model training procedure and a pseudo ground truth learning procedure. The base model training encourages each model to make reliable predictions based on a single modality (i.e., RGB or optical flow). The predictions of the two streams are then fused to generate a pseudo ground truth, which is in turn used as supervision to train the base models. On the HACS v1.1.1 dataset, without fine-tuning the feature-extraction I3D models, our method achieves 22.20% on the validation set and 21.68% on the testing set in terms of average mAP. Our solution ranked 2nd in this challenge, and we hope our method can serve as a baseline for future academic research.

* Second place solution to the HACS Weakly-Supervised Temporal Action Localization Challenge 2021. arXiv admin note: text overlap with arXiv:2010.11594
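The pseudo ground truth step described in the abstract above (fusing the predictions of the RGB and optical-flow streams, then using the result as supervision) can be sketched in a simplified form. The weighted average, the fixed threshold, and the name `fuse_and_threshold` are assumptions for illustration; the abstract does not specify the actual fusion rule.

```python
def fuse_and_threshold(rgb_scores, flow_scores, weight=0.5, threshold=0.5):
    """Fuse per-snippet action scores from two streams and binarize them
    into frame-level pseudo labels (1 = action, 0 = background)."""
    if len(rgb_scores) != len(flow_scores):
        raise ValueError("both streams must score the same snippets")
    fused = [weight * r + (1.0 - weight) * f
             for r, f in zip(rgb_scores, flow_scores)]
    pseudo_labels = [1 if s >= threshold else 0 for s in fused]
    return fused, pseudo_labels


# Hypothetical per-snippet scores for one untrimmed video.
rgb = [0.1, 0.8, 0.9, 0.3]
flow = [0.2, 0.6, 0.7, 0.6]
fused, pseudo = fuse_and_threshold(rgb, flow)
print(pseudo)
```

In an iterative scheme like the one the abstract outlines, labels of this kind would be fed back as training targets for each single-modality model, so that the two streams gradually agree on where actions occur.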

Cogradient Descent for Dependable Learning

Jun 20, 2021

Runqi Wang, Baochang Zhang, Li'an Zhuo, Qixiang Ye, David Doermann

Abstract:Conventional gradient descent methods compute the gradients for multiple variables through the partial derivative. Treating the coupled variables independently while ignoring their interaction, however, leads to insufficient optimization for bilinear models. In this paper, we propose a dependable learning method based on the Cogradient Descent (CoGD) algorithm to address the bilinear optimization problem, providing a systematic way to coordinate the gradients of coupled variables based on a kernelized projection function. CoGD is introduced to solve bilinear problems when one variable has a sparsity constraint, as often occurs in modern learning paradigms. CoGD can also be used to decompose the association of features and weights, which further generalizes our method to better train convolutional neural networks (CNNs) and improve the model capacity. CoGD is applied to representative bilinear problems, including image reconstruction, image inpainting, network pruning and CNN training. Extensive experiments show that CoGD improves on the state of the art by significant margins. Code is available at https://github.com/bczhangbczhang/CoGD.

* arXiv admin note: substantial text overlap with arXiv:2006.09142
