Liqun Chen

Weakly supervised cross-domain alignment with optimal transport

Aug 14, 2020

Siyang Yuan, Ke Bai, Liqun Chen, Yizhe Zhang, Chenyang Tao, Chunyuan Li, Guoyin Wang, Ricardo Henao, Lawrence Carin

Abstract: Cross-domain alignment between image objects and text sequences is key to many visual-language tasks, and it poses a fundamental challenge to both computer vision and natural language processing. This paper investigates a novel approach for the identification and optimization of fine-grained semantic similarities between image and text entities, under a weakly-supervised setup, improving performance over state-of-the-art solutions. Our method builds upon recent advances in optimal transport (OT) to resolve the cross-domain matching problem in a principled manner. Formulated as a drop-in regularizer, the proposed OT solution can be efficiently computed and used in combination with other existing approaches. We present empirical evidence to demonstrate the effectiveness of our approach, showing how it enables simpler model architectures to outperform or be comparable with more sophisticated designs on a range of vision-language tasks.

* Accepted to BMVC 2020 (Oral)

Graph Optimal Transport for Cross-Domain Alignment

Jun 29, 2020

Liqun Chen, Zhe Gan, Yu Cheng, Linjie Li, Lawrence Carin, Jingjing Liu

Abstract: Cross-domain alignment between two sets of entities (e.g., objects in an image, words in a sentence) is fundamental to both computer vision and natural language processing. Existing methods mainly focus on designing advanced attention mechanisms to simulate soft alignment, with no training signals to explicitly encourage alignment. The learned attention matrices are also dense and lack interpretability. We propose Graph Optimal Transport (GOT), a principled framework that germinates from recent advances in Optimal Transport (OT). In GOT, cross-domain alignment is formulated as a graph matching problem, by representing entities as a dynamically constructed graph. Two types of OT distances are considered: (i) Wasserstein distance (WD) for node (entity) matching; and (ii) Gromov-Wasserstein distance (GWD) for edge (structure) matching. Both WD and GWD can be incorporated into existing neural network models, effectively acting as a drop-in regularizer. The inferred transport plan also yields sparse and self-normalized alignment, enhancing the interpretability of the learned model. Experiments show consistent outperformance of GOT over baselines across a wide range of tasks, including image-text retrieval, visual question answering, image captioning, machine translation, and text summarization.

* ICML 2020
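
The node-matching (Wasserstein distance) term described in this abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: it uses entropy-regularized Sinkhorn iterations as a stand-in solver, and it assumes uniform weights over entities and a cosine cost. The function names and the `eps` and `n_iters` parameters are hypothetical, chosen only for illustration.

```python
import numpy as np


def cosine_cost(x, y):
    """Pairwise cosine distance between rows of x (n, d) and y (m, d)."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    return 1.0 - x @ y.T


def sinkhorn_plan(cost, eps=0.1, n_iters=200):
    """Approximate OT plan between uniform marginals (assumed weights).

    Uses Sinkhorn iterations on the Gibbs kernel exp(-cost / eps);
    this is a stand-in solver, not necessarily the one used in the paper.
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)  # uniform mass on the first set
    b = np.full(m, 1.0 / m)  # uniform mass on the second set
    kernel = np.exp(-cost / eps)
    u = np.ones(n)
    for _ in range(n_iters):
        v = b / (kernel.T @ u)  # rescale to match column marginals
        u = a / (kernel @ v)    # rescale to match row marginals
    return u[:, None] * kernel * v[None, :]


def ot_alignment_loss(x, y, eps=0.1, n_iters=200):
    """Return (transport cost, transport plan) for two embedding sets."""
    cost = cosine_cost(x, y)
    plan = sinkhorn_plan(cost, eps, n_iters)
    return float(np.sum(plan * cost)), plan
```

In a training setup, a loss of this kind would typically be added to the task objective with a weighting coefficient, and the returned plan can be read as a soft alignment between the two entity sets.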

Graph-Driven Generative Models for Heterogeneous Multi-Task Learning

Nov 20, 2019

Wenlin Wang, Hongteng Xu, Zhe Gan, Bai Li, Guoyin Wang, Liqun Chen, Qian Yang, Wenqi Wang, Lawrence Carin

Abstract: We propose a novel graph-driven generative model that unifies multiple heterogeneous learning tasks into the same framework. The proposed model is based on the fact that heterogeneous learning tasks, which correspond to different generative processes, often rely on data with a shared graph structure. Accordingly, our model combines a graph convolutional network (GCN) with multiple variational autoencoders, thus embedding the nodes of the graph (i.e., samples for the tasks) in a uniform manner while specializing their organization and usage to different tasks. With a focus on healthcare applications (tasks), including clinical topic modeling, procedure recommendation and admission-type prediction, we demonstrate that our method successfully leverages information across different tasks, boosting performance in all tasks and outperforming existing state-of-the-art approaches.

* Accepted by AAAI-2020

Improving Textual Network Learning with Variational Homophilic Embeddings

Sep 30, 2019

Wenlin Wang, Chenyang Tao, Zhe Gan, Guoyin Wang, Liqun Chen, Xinyuan Zhang, Ruiyi Zhang, Qian Yang, Ricardo Henao, Lawrence Carin

Abstract: The performance of many network learning applications crucially hinges on the success of network embedding algorithms, which aim to encode rich network information into low-dimensional vertex-based vector representations. This paper considers a novel variational formulation of network embeddings, with special focus on textual networks. Different from most existing methods that optimize a discriminative objective, we introduce Variational Homophilic Embedding (VHE), a fully generative model that learns network embeddings by modeling the semantic (textual) information with a variational autoencoder, while accounting for the structural (topology) information through a novel homophilic prior design. Homophilic vertex embeddings encourage similar embedding vectors for related (connected) vertices. The proposed VHE promises better generalization for downstream tasks, robustness to incomplete observations, and the ability to generalize to unseen vertices. Extensive experiments on real-world networks, for multiple tasks, demonstrate that the proposed method consistently achieves superior performance relative to competing state-of-the-art approaches.

* Accepted to NeurIPS 2019

LMVP: Video Predictor with Leaked Motion Information

Jun 24, 2019

Dong Wang, Yitong Li, Wei Cao, Liqun Chen, Qi Wei, Lawrence Carin

Abstract: We propose a Leaked Motion Video Predictor (LMVP) to predict future frames by capturing the spatial and temporal dependencies from given inputs. The motion is modeled by a newly proposed component, motion guider, which plays the role of both learner and teacher. Specifically, it learns the temporal features from real data and guides the generator to predict future frames. The spatial consistency in video is modeled by an adaptive filtering network. To further ensure the spatio-temporal consistency of the prediction, a discriminator is also adopted to distinguish the real and generated frames. Further, the discriminator leaks information to the motion guider and the generator to help the learning of motion. The proposed LMVP can effectively learn the static and temporal features in videos without the need for human labeling. Experiments on synthetic and real data demonstrate that LMVP can yield state-of-the-art results.

Improving Textual Network Embedding with Global Attention via Optimal Transport

Jun 05, 2019

Liqun Chen, Guoyin Wang, Chenyang Tao, Dinghan Shen, Pengyu Cheng, Xinyuan Zhang, Wenlin Wang, Yizhe Zhang, Lawrence Carin

Abstract: Constituting highly informative network embeddings is an important tool for network analysis. It encodes network topology, along with other useful side information, into low-dimensional node-based feature representations that can be exploited by statistical modeling. This work focuses on learning context-aware network embeddings augmented with text data. We reformulate the network-embedding problem, and present two novel strategies to improve over traditional attention mechanisms: (i) a content-aware sparse attention module based on optimal transport, and (ii) a high-level attention parsing module. Our approach yields naturally sparse and self-normalized relational inference. It can capture long-term interactions between sequences, thus addressing the challenges faced by existing textual network embedding schemes. Extensive experiments are conducted to demonstrate our model can consistently outperform alternative state-of-the-art methods.

Towards Generating Long and Coherent Text with Multi-Level Latent Variable Models

Feb 01, 2019

Dinghan Shen, Asli Celikyilmaz, Yizhe Zhang, Liqun Chen, Xin Wang, Jianfeng Gao, Lawrence Carin

Abstract: Variational autoencoders (VAEs) have received much attention recently as an end-to-end architecture for text generation with latent variables. In this paper, we investigate several multi-level structures to learn a VAE model to generate long and coherent text. In particular, we use a hierarchy of stochastic layers between the encoder and decoder networks to generate more informative latent codes. We also investigate a multi-level decoder structure to learn a coherent long-term structure by generating intermediate sentence representations as high-level plan vectors. Empirical results demonstrate that a multi-level VAE model produces more coherent and less repetitive long text compared to the standard VAE models and can further mitigate the posterior-collapse issue.

Improving Sequence-to-Sequence Learning via Optimal Transport

Jan 18, 2019

Liqun Chen, Yizhe Zhang, Ruiyi Zhang, Chenyang Tao, Zhe Gan, Haichao Zhang, Bai Li, Dinghan Shen, Changyou Chen, Lawrence Carin

Abstract: Sequence-to-sequence models are commonly trained via maximum likelihood estimation (MLE). However, standard MLE training considers a word-level objective, predicting the next word given the previous ground-truth partial sentence. This procedure focuses on modeling local syntactic patterns, and may fail to capture long-range semantic structure. We present a novel solution to alleviate these issues. Our approach imposes global sequence-level guidance via new supervision based on optimal transport, enabling the overall characterization and preservation of semantic features. We further show that this method can be understood as a Wasserstein gradient flow trying to match our model to the ground truth sequence distribution. Extensive experiments are conducted to validate the utility of the proposed approach, showing consistent improvements over a wide variety of NLP tasks, including machine translation, abstractive text summarization, and image captioning.

Sequence Generation with Guider Network

Nov 02, 2018

Ruiyi Zhang, Changyou Chen, Zhe Gan, Wenlin Wang, Liqun Chen, Dinghan Shen, Guoyin Wang, Lawrence Carin

Abstract: Sequence generation with reinforcement learning (RL) has received significant attention recently. However, a challenge with such methods is the sparse-reward problem in the RL training process, in which a scalar guiding signal is often only available after an entire sequence has been generated. This type of sparse reward tends to ignore the global structural information of a sequence, causing generation of sequences that are semantically inconsistent. In this paper, we present a model-based RL approach to overcome this issue. Specifically, we propose a novel guider network to model the sequence-generation environment, which can assist next-word prediction and provide intermediate rewards for generator optimization. Extensive experiments show that the proposed method leads to improved performance for both unconditional and conditional sequence-generation tasks.

Adversarial Text Generation via Feature-Mover's Distance

Sep 17, 2018

Liqun Chen, Shuyang Dai, Chenyang Tao, Dinghan Shen, Zhe Gan, Haichao Zhang, Yizhe Zhang, Lawrence Carin

Abstract: Generative adversarial networks (GANs) have achieved significant success in generating real-valued data. However, the discrete nature of text hinders the application of GAN to text-generation tasks. Instead of using the standard GAN objective, we propose to improve text-generation GAN via a novel approach inspired by optimal transport. Specifically, we consider matching the latent feature distributions of real and synthetic sentences using a novel metric, termed the feature-mover's distance (FMD). This formulation leads to a highly discriminative critic and easy-to-optimize objective, overcoming the mode-collapsing and brittle-training problems in existing methods. Extensive experiments are conducted on a variety of tasks to evaluate the proposed model empirically, including unconditional text generation, style transfer from non-parallel text, and unsupervised cipher cracking. The proposed model yields superior performance, demonstrating wide applicability and effectiveness.
