Lawrence Carin

Duke University

Improving Zero-shot Voice Style Transfer via Disentangled Representation Learning

Mar 17, 2021

Siyang Yuan, Pengyu Cheng, Ruiyi Zhang, Weituo Hao, Zhe Gan, Lawrence Carin

Abstract: Voice style transfer, also called voice conversion, seeks to modify one speaker's voice to generate speech as if it came from another (target) speaker. Previous works have made progress on voice conversion with parallel training data and pre-known speakers. However, zero-shot voice style transfer, which learns from non-parallel data and generates voices for previously unseen speakers, remains a challenging problem. We propose a novel zero-shot voice transfer method via disentangled representation learning. The proposed method first encodes speaker-related style and voice content of each input voice into separated low-dimensional embedding spaces, and then transfers to a new voice by combining the source content embedding and target style embedding through a decoder. With information-theoretic guidance, the style and content embedding spaces are representative and (ideally) independent of each other. On real-world VCTK datasets, our method outperforms other baselines and obtains state-of-the-art results in terms of transfer accuracy and voice naturalness for voice style transfer experiments under both many-to-many and zero-shot setups.

* To appear in ICLR 2021

FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders

Mar 11, 2021

Pengyu Cheng, Weituo Hao, Siyang Yuan, Shijing Si, Lawrence Carin

Abstract: Pretrained text encoders, such as BERT, have been applied increasingly in various natural language processing (NLP) tasks, and have recently demonstrated significant performance gains. However, recent studies have demonstrated the existence of social bias in these pretrained NLP models. Although prior works have made progress on word-level debiasing, improved sentence-level fairness of pretrained encoders still lacks exploration. In this paper, we propose the first neural debiasing method for a pretrained sentence encoder, which transforms the pretrained encoder outputs into debiased representations via a fair filter (FairFil) network. To learn the FairFil, we introduce a contrastive learning framework that not only minimizes the correlation between filtered embeddings and bias words but also preserves rich semantic information of the original sentences. On real-world datasets, our FairFil effectively reduces the bias degree of pretrained text encoders, while continuously showing desirable performance on downstream tasks. Moreover, our post-hoc method does not require any retraining of the text encoders, further enlarging FairFil's application space.

* Accepted by the 9th International Conference on Learning Representations (ICLR 2021)

Efficient Continual Adaptation for Generative Adversarial Networks

Mar 06, 2021

Sakshi Varshney, Vinay Kumar Verma, Lawrence Carin, Piyush Rai

Abstract: We present a continual learning approach for generative adversarial networks (GANs), by designing and leveraging parameter-efficient feature map transformations. Our approach is based on learning a set of global and task-specific parameters. The global parameters are fixed across tasks whereas the task-specific parameters act as local adapters for each task, and help in efficiently transforming the previous task's feature map to the new task's feature map. Moreover, we propose an element-wise residual bias in the transformed feature space which highly stabilizes GAN training. In contrast to the recent approaches for continual GANs, we do not rely on memory replay, regularization towards previous tasks' parameters, or expensive weight transformations. Through extensive experiments on challenging and diverse datasets, we show that the feature-map transformation based approach outperforms state-of-the-art continual GAN methods, with substantially fewer parameters, and also generates high-quality samples that can be used in generative replay based continual learning of discriminative tasks.

* Under Submission

Meta-Learned Attribute Self-Gating for Continual Generalized Zero-Shot Learning

Feb 23, 2021

Vinay Kumar Verma, Kevin Liang, Nikhil Mehta, Lawrence Carin

Abstract: Zero-shot learning (ZSL) has been shown to be a promising approach to generalizing a model to categories unseen during training by leveraging class attributes, but challenges still remain. Recently, methods using generative models to combat bias towards classes seen during training have pushed the state of the art of ZSL, but these generative models can be slow or computationally expensive to train. Additionally, while many previous ZSL methods assume a one-time adaptation to unseen classes, in reality, the world is always changing, necessitating a constant adjustment for deployed models. Models unprepared to handle a sequential stream of data are likely to experience catastrophic forgetting. We propose a meta-continual zero-shot learning (MCZSL) approach to address both these issues. In particular, by pairing self-gating of attributes and scaled class normalization with meta-learning based training, we are able to outperform state-of-the-art results while being able to train our models substantially faster (>100×) than expensive generative-based approaches. We demonstrate this by performing experiments on five standard ZSL datasets (CUB, aPY, AWA1, AWA2 and SUN) in both generalized zero-shot learning and generalized continual zero-shot learning settings.

* Under Review

FLOP: Federated Learning on Medical Datasets using Partial Networks

Feb 10, 2021

Qian Yang, Jianyi Zhang, Weituo Hao, Gregory Spell, Lawrence Carin

Abstract: The outbreak of COVID-19 disease due to the novel coronavirus has caused a shortage of medical resources. To aid and accelerate the diagnosis process, automatic diagnosis of COVID-19 via deep learning models has recently been explored by researchers across the world. While different data-driven deep learning models have been developed to assist the diagnosis of COVID-19, the data itself is still scarce due to patient privacy concerns. Federated Learning (FL) is a natural solution because it allows different organizations to cooperatively learn an effective deep learning model without sharing raw data. However, recent studies show that FL still lacks privacy protection and may cause data leakage. We investigate this challenging problem by proposing a simple yet effective algorithm, named Federated Learning on Medical Datasets using Partial Networks (FLOP), that shares only a partial model between the server and clients. Extensive experiments on benchmark data and real-world healthcare tasks show that our approach achieves comparable or better performance while reducing the privacy and security risks. Of particular interest, we conduct experiments on the COVID-19 dataset and find that our FLOP algorithm can allow different hospitals to collaboratively and effectively train a partially shared model without sharing local patients' data.
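
The idea of sharing only a partial model can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which layers are shared, so the split into a shared `feature_extractor` and a private `classifier_head` is an assumption made for illustration, as are all function and variable names.

```python
from typing import Dict, List, Set

Params = Dict[str, List[float]]

# Assumption: which layers are shared is not stated in the abstract.
# Here only "feature_extractor" travels between server and clients.
SHARED_KEYS: Set[str] = {"feature_extractor"}


def average_shared(client_params: List[Params], shared_keys: Set[str]) -> Params:
    """Server step: average only the shared layers across clients."""
    averaged: Params = {}
    for key in shared_keys:
        columns = zip(*(params[key] for params in client_params))
        averaged[key] = [sum(col) / len(client_params) for col in columns]
    return averaged


def apply_shared(local: Params, shared: Params) -> Params:
    """Client step: overwrite shared layers, leave private layers untouched."""
    updated = {name: list(values) for name, values in local.items()}
    for key, values in shared.items():
        updated[key] = list(values)
    return updated


# Two hypothetical hospitals with toy parameters.
clients = [
    {"feature_extractor": [1.0, 2.0], "classifier_head": [0.1]},
    {"feature_extractor": [3.0, 4.0], "classifier_head": [0.9]},
]
shared = average_shared(clients, SHARED_KEYS)
clients = [apply_shared(c, shared) for c in clients]
```

In this sketch the private layers never leave the client, so the server only ever sees part of each model.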

Reinforcement Learning for Flexibility Design Problems

Jan 18, 2021

Yehua Wei, Lei Zhang, Ruiyi Zhang, Shijing Si, Hao Zhang, Lawrence Carin

Abstract: Flexibility design problems are a class of problems that appear in strategic decision-making across industries, where the objective is to design a (e.g., manufacturing) network that affords flexibility and adaptivity. The underlying combinatorial nature and stochastic objectives make flexibility design problems challenging for standard optimization methods. In this paper, we develop a reinforcement learning (RL) framework for flexibility design problems. Specifically, we carefully design mechanisms with noisy exploration and variance reduction to ensure empirical success and show the unique advantage of RL in terms of fast adaptation. Empirical results show that the RL-based method consistently finds better solutions compared to classical heuristics.

What Makes Good In-Context Examples for GPT-3?

Jan 17, 2021

Jiachang Liu, Dinghan Shen, Yizhe Zhang, Bill Dolan, Lawrence Carin, Weizhu Chen

Abstract: GPT-3 has attracted lots of attention due to its superior performance across a wide range of NLP tasks, especially with its powerful and versatile in-context few-shot learning ability. Despite its success, we found that the empirical results of GPT-3 depend heavily on the choice of in-context examples. In this work, we investigate whether there are more effective strategies for judiciously selecting in-context examples (relative to random sampling) that better leverage GPT-3's few-shot capabilities. Inspired by the recent success of leveraging a retrieval module to augment large-scale neural network models, we propose to retrieve examples that are semantically similar to a test sample to formulate its corresponding prompt. Intuitively, the in-context examples selected with such a strategy may serve as more informative inputs to unleash GPT-3's extensive knowledge. We evaluate the proposed approach on several natural language understanding and generation benchmarks, where the retrieval-based prompt selection approach consistently outperforms the random baseline. Moreover, it is observed that the sentence encoders fine-tuned on task-related datasets yield even more helpful retrieval results. Notably, significant gains are observed on tasks such as table-to-text generation (41.9% on the ToTTo dataset) and open-domain question answering (45.5% on the NQ dataset). We hope our investigation could help understand the behaviors of GPT-3 and large-scale pre-trained LMs in general and enhance their few-shot capabilities.
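
Retrieval-based prompt selection, as described in the abstract, can be sketched roughly as follows. This is a minimal illustration, not the paper's code: the two-dimensional embeddings stand in for the output of a sentence encoder, and the `Q:`/`A:` prompt template, the ordering of examples, and all names are assumptions.

```python
import math
from typing import List, Sequence, Tuple

Example = Tuple[Sequence[float], str, str]  # (embedding, question, answer)


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_examples(query_vec: Sequence[float], pool: List[Example], k: int) -> List[Tuple[str, str]]:
    """Return the k pool examples most similar to the query, most similar first."""
    ranked = sorted(pool, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [(question, answer) for _, question, answer in ranked[:k]]


def build_prompt(examples: List[Tuple[str, str]], question: str) -> str:
    """Concatenate the selected examples, then the unanswered test question."""
    blocks = [f"Q: {q}\nA: {a}" for q, a in examples]
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


# Toy embeddings; a real system would obtain these from a sentence encoder.
pool = [
    ([1.0, 0.0], "What is the capital of France?", "Paris"),
    ([0.9, 0.1], "What is the capital of Italy?", "Rome"),
    ([0.0, 1.0], "Who wrote Hamlet?", "Shakespeare"),
]
query_vec = [0.8, 0.0]  # hypothetical embedding of the test question
examples = select_examples(query_vec, pool, k=2)
prompt = build_prompt(examples, "What is the capital of Spain?")
```

The random baseline mentioned in the abstract would correspond to replacing `select_examples` with a random sample from `pool`.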

Learning Graphons via Structured Gromov-Wasserstein Barycenters

Dec 17, 2020

Hongteng Xu, Dixin Luo, Lawrence Carin, Hongyuan Zha

Abstract: We propose a novel and principled method to learn a nonparametric graph model called a graphon, which is defined in an infinite-dimensional space and represents arbitrary-size graphs. Based on the weak regularity lemma from the theory of graphons, we leverage a step function to approximate a graphon. We show that the cut distance of graphons can be relaxed to the Gromov-Wasserstein distance of their step functions. Accordingly, given a set of graphs generated by an underlying graphon, we learn the corresponding step function as the Gromov-Wasserstein barycenter of the given graphs. Furthermore, we develop several enhancements and extensions of the basic algorithm, e.g., the smoothed Gromov-Wasserstein barycenter for guaranteeing the continuity of the learned graphons and the mixed Gromov-Wasserstein barycenters for learning multiple structured graphons. The proposed approach overcomes drawbacks of prior state-of-the-art methods, and outperforms them on both synthetic and real-world data. The code is available at https://github.com/HongtengXu/SGWB-Graphon.

* AAAI 2021
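
To make the step-function approximation mentioned in the abstract concrete, here is a small sketch that builds a K-by-K step function from a known graphon by averaging over an equal partition of the unit interval. It shows only the representation, not the Gromov-Wasserstein barycenter learning procedure; the example graphon W(x, y) = xy, the midpoint-grid averaging, and all names are assumptions chosen for illustration.

```python
from typing import Callable, List

Graphon = Callable[[float, float], float]


def step_function(graphon: Graphon, k: int, grid: int = 20) -> List[List[float]]:
    """Approximate a graphon by a k x k matrix of block averages.

    Each block covers [i/k, (i+1)/k) x [j/k, (j+1)/k); its value is the
    mean of the graphon over a grid x grid set of midpoints in that block.
    """
    blocks = []
    for i in range(k):
        row = []
        for j in range(k):
            total = 0.0
            for a in range(grid):
                for b in range(grid):
                    x = (i + (a + 0.5) / grid) / k
                    y = (j + (b + 0.5) / grid) / k
                    total += graphon(x, y)
            row.append(total / (grid * grid))
        blocks.append(row)
    return blocks


def evaluate(blocks: List[List[float]], x: float, y: float) -> float:
    """Evaluate the step function at a point (x, y) in [0, 1]^2."""
    k = len(blocks)
    i = min(int(x * k), k - 1)  # clamp so that x = 1.0 falls in the last block
    j = min(int(y * k), k - 1)
    return blocks[i][j]


def product_graphon(x: float, y: float) -> float:
    """Hypothetical example graphon: edge probability x * y."""
    return x * y


blocks = step_function(product_graphon, k=4)
value = evaluate(blocks, 0.9, 0.9)  # looks up the last block
```

Increasing `k` gives a finer partition and a closer approximation to the underlying graphon.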

Wasserstein Contrastive Representation Distillation

Dec 15, 2020

Liqun Chen, Zhe Gan, Dong Wang, Jingjing Liu, Ricardo Henao, Lawrence Carin

Abstract: The primary goal of knowledge distillation (KD) is to encapsulate the information of a model learned from a teacher network into a student network, with the latter being more compact than the former. Existing work, e.g., using Kullback-Leibler divergence for distillation, may fail to capture important structural knowledge in the teacher network and often lacks the ability for feature generalization, particularly in situations when teacher and student are built to address different classification tasks. We propose Wasserstein Contrastive Representation Distillation (WCoRD), which leverages both primal and dual forms of Wasserstein distance for KD. The dual form is used for global knowledge transfer, yielding a contrastive learning objective that maximizes the lower bound of mutual information between the teacher and the student networks. The primal form is used for local contrastive knowledge transfer within a mini-batch, effectively matching the distributions of features between the teacher and the student networks. Experiments demonstrate that the proposed WCoRD method outperforms state-of-the-art approaches on privileged information distillation, model compression and cross-modal transfer.
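
For context, the Kullback-Leibler baseline that the abstract contrasts against can be sketched as below. This is the conventional KD loss, not WCoRD itself; the temperature value and the squared-temperature scaling follow common practice and are assumptions here, as are the toy logits.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_distillation_loss(
    teacher_logits: Sequence[float],
    student_logits: Sequence[float],
    temperature: float = 2.0,
) -> float:
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Toy logits for a three-class problem.
teacher = [2.0, 1.0, 0.1]
matched_loss = kl_distillation_loss(teacher, teacher)
mismatched_loss = kl_distillation_loss(teacher, [0.1, 1.0, 2.0])
```

The loss is zero when the student reproduces the teacher's output distribution and grows as the two diverge. It compares class probabilities only, which is the limitation the abstract points to when it mentions structural knowledge.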

Proactive Pseudo-Intervention: Causally Informed Contrastive Learning For Interpretable Vision Models

Dec 06, 2020

Dong Wang, Yuewei Yang, Chenyang Tao, Fanjie Kong, Ricardo Henao, Lawrence Carin

Abstract: Deep neural networks have shown significant promise in comprehending complex visual signals, delivering performance on par with or even superior to that of human experts. However, these models often lack a mechanism for interpreting their predictions, and in some cases, particularly when the sample size is small, existing deep learning solutions tend to capture spurious correlations that compromise model generalizability on unseen inputs. In this work, we propose a contrastive causal representation learning strategy that leverages proactive interventions to identify causally-relevant image features, called Proactive Pseudo-Intervention (PPI). This approach is complemented with a causal salience map visualization module, i.e., Weight Back Propagation (WBP), that identifies important pixels in the raw input image, which greatly facilitates the interpretability of predictions. To validate its utility, our model is benchmarked extensively on both standard natural images and challenging medical image datasets. We show this new contrastive causal representation learning model consistently improves model performance relative to competing solutions, particularly for out-of-domain predictions or when dealing with data integration from heterogeneous sources. Further, our causal saliency maps are more succinct and meaningful relative to their non-causal counterparts.
