
Ziliang Chen

LAW-Diffusion: Complex Scene Generation by Diffusion with Layouts

Aug 13, 2023
Binbin Yang, Yi Luo, Ziliang Chen, Guangrun Wang, Xiaodan Liang, Liang Lin

Thanks to the rapid development of diffusion models, unprecedented progress has been witnessed in image synthesis. Prior works mostly rely on pre-trained linguistic models, but text is often too abstract to properly specify all the spatial properties of an image, e.g., the layout configuration of a scene, leading to sub-optimal results for complex scene generation. In this paper, we achieve accurate complex scene generation by proposing a semantically controllable Layout-AWare diffusion model, termed LAW-Diffusion. Distinct from previous Layout-to-Image (L2I) methods that only explore category-aware relationships, LAW-Diffusion introduces a spatial dependency parser to encode the location-aware semantic coherence across objects as a layout embedding, and produces scenes with perceptually harmonious object styles and contextual relations. Specifically, we instantiate each object's regional semantics as an object region map and leverage a location-aware cross-object attention module to capture the spatial dependencies among these disentangled representations. We further propose an adaptive guidance schedule for the layout guidance to mitigate the trade-off between regional semantic alignment and the texture fidelity of generated objects. Moreover, LAW-Diffusion allows for instance reconfiguration while preserving the other regions of a synthesized image, by introducing a layout-aware latent grafting mechanism that recomposes its local regional semantics. To better verify the plausibility of generated scenes, we propose a new evaluation metric for the L2I task, dubbed Scene Relation Score (SRS), which measures how well images preserve rational and harmonious relations among contextual objects. Comprehensive experiments demonstrate that LAW-Diffusion yields state-of-the-art generative performance, especially with coherent object relations.
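
As a rough illustration of the region-map idea, here is a minimal PyTorch sketch that rasterizes each object's class embedding into its own region map and fuses the maps with cross-object attention. The module and argument names (SpatialDependencyParser, dim, size) are our inventions for illustration, not the paper's API.

```python
import torch
import torch.nn as nn

class SpatialDependencyParser(nn.Module):
    """Hypothetical sketch: per-object region maps + cross-object attention."""
    def __init__(self, num_classes, dim=64, heads=4):
        super().__init__()
        self.class_emb = nn.Embedding(num_classes, dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, boxes, labels, size=32):
        # boxes: (N, 4) normalized (x0, y0, x1, y1); labels: (N,) class ids.
        n, dim = labels.numel(), self.class_emb.embedding_dim
        emb = self.class_emb(labels)                       # (N, dim)
        maps = torch.zeros(n, dim, size, size)
        for i, (x0, y0, x1, y1) in enumerate(boxes):
            # Rasterize each object's semantics into its own region map.
            xs, xe = int(x0 * size), max(int(x1 * size), int(x0 * size) + 1)
            ys, ye = int(y0 * size), max(int(y1 * size), int(y0 * size) + 1)
            maps[i, :, ys:ye, xs:xe] = emb[i].view(dim, 1, 1)
        # Cross-object attention over the disentangled region representations.
        tokens = maps.flatten(2).mean(-1).unsqueeze(0)     # (1, N, dim)
        fused, _ = self.attn(tokens, tokens, tokens)
        layout = maps.sum(0, keepdim=True)                 # (1, dim, size, size)
        return layout, fused

parser = SpatialDependencyParser(num_classes=10)
layout, fused = parser(torch.tensor([[0.1, 0.1, 0.5, 0.6],
                                     [0.4, 0.2, 0.9, 0.8]]),
                       torch.tensor([3, 7]))
```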

Open Set Domain Adaptation By Novel Class Discovery

Mar 07, 2022
Jingyu Zhuang, Ziliang Chen, Pengxu Wei, Guanbin Li, Liang Lin

In Open Set Domain Adaptation (OSDA), large numbers of target samples are drawn from implicit categories that never appear in the source domain. Because their specific classes are unknown, existing methods indiscriminately regard them as a single "unknown" class. We challenge this broadly adopted practice, which may cause unexpected detrimental effects because the decision boundaries between the implicit categories are fully ignored. Instead, we propose the Self-supervised Class-Discovering Adapter (SCDA), which attempts to achieve OSDA by gradually discovering those implicit classes, then incorporating them to restructure the classifier and update the domain-adaptive features iteratively. SCDA performs two alternating steps to achieve implicit class discovery and self-supervised OSDA, respectively. By jointly optimizing the two tasks, SCDA achieves the state of the art in OSDA and shows competitive performance in unearthing the implicit target classes.
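
A minimal sketch of SCDA's alternation under stated assumptions: the feature extractor f is an nn.Module, the k-means discovery step, the unweighted loss sum, and all function names are our stand-ins for the paper's actual procedure.

```python
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

def scda_round(f, src_loader, tgt_data, num_known, num_new, lr=1e-3):
    """One alternation: discover implicit target classes by clustering, then
    restructure the classifier head and adapt with the pseudo-labels."""
    # Step 1: self-supervised class discovery on current target features.
    with torch.no_grad():
        feats = f(tgt_data)                                  # (N, D)
    pseudo = KMeans(n_clusters=num_new, n_init=10).fit_predict(feats.cpu().numpy())
    pseudo_labels = torch.as_tensor(pseudo).long() + num_known  # offset past known ids

    # Step 2: restructure the head to cover known + discovered classes.
    head = nn.Linear(feats.size(1), num_known + num_new)
    opt = torch.optim.SGD(list(f.parameters()) + list(head.parameters()), lr=lr)
    ce = nn.CrossEntropyLoss()
    for xs, ys in src_loader:                                # labeled source batches
        loss = ce(head(f(xs)), ys) + ce(head(f(tgt_data)), pseudo_labels)
        opt.zero_grad(); loss.backward(); opt.step()
    return head, pseudo_labels
```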

Graph-Evolving Meta-Learning for Low-Resource Medical Dialogue Generation

Dec 22, 2020
Shuai Lin, Pan Zhou, Xiaodan Liang, Jianheng Tang, Ruihui Zhao, Ziliang Chen, Liang Lin

Human doctors with well-structured medical knowledge can diagnose a disease via only a few conversations with patients about symptoms. In contrast, existing knowledge-grounded dialogue systems often require a large number of dialogue instances to learn, as they fail to capture the correlations between different diseases and neglect the diagnostic experience shared among them. To address this issue, we propose a more natural and practical paradigm, i.e., low-resource medical dialogue generation, which transfers diagnostic experience from source diseases to target ones with only a handful of data for adaptation. It capitalizes on a commonsense knowledge graph to characterize prior disease-symptom relations. Besides, we develop a Graph-Evolving Meta-Learning (GEML) framework that learns to evolve the commonsense graph for reasoning about disease-symptom correlations in a new disease, which effectively alleviates the need for a large number of dialogues. More importantly, by dynamically evolving disease-symptom graphs, GEML also addresses the real-world challenge that the disease-symptom correlations of each disease may vary or evolve with more diagnostic cases. Extensive experimental results on the CMDD dataset and our newly collected Chunyu dataset testify to the superiority of our approach over state-of-the-art baselines. Besides, GEML can generate an enriched dialogue-sensitive knowledge graph in an online manner, which could benefit other tasks grounded on knowledge graphs.
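
To make the graph-evolving idea concrete, a toy sketch that strengthens disease-symptom edges observed in new dialogues so the graph can support a target disease with few cases; the decay/update rule here is our simplification, not GEML's meta-learned operator.

```python
from collections import defaultdict

class DiseaseSymptomGraph:
    """Toy evolving graph: (disease, symptom) edge weights updated online."""
    def __init__(self):
        self.weight = defaultdict(float)   # (disease, symptom) -> edge weight

    def evolve(self, disease, dialogue_symptoms, lr=0.1):
        # Decay the disease's existing edges, then reinforce observed ones.
        for (d, s), w in list(self.weight.items()):
            if d == disease:
                self.weight[(d, s)] = (1 - lr) * w
        for s in dialogue_symptoms:
            self.weight[(disease, s)] += lr

    def top_symptoms(self, disease, k=3):
        edges = [(s, w) for (d, s), w in self.weight.items() if d == disease]
        return sorted(edges, key=lambda e: -e[1])[:k]

g = DiseaseSymptomGraph()
g.evolve("enteritis", ["diarrhea", "abdominal pain"])
g.evolve("enteritis", ["diarrhea", "fever"])
print(g.top_symptoms("enteritis"))   # diarrhea ranks first after two dialogues
```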

* Accepted by AAAI 2021 

Learning Reinforced Agents with Counterfactual Simulation for Medical Automatic Diagnosis

Mar 14, 2020
Junfan Lin, Ziliang Chen, Xiaodan Liang, Keze Wang, Liang Lin

Medical automatic diagnosis (MAD) aims to learn an agent that mimics the behavior of a human doctor, i.e., inquiring about symptoms and informing diseases. Due to medical ethics concerns, it is impractical to directly apply reinforcement learning techniques to MAD, e.g., training a reinforced agent with a human patient. Developing a patient simulator from collected patient-doctor dialogue records has been proposed as a promising alternative. However, most existing works overlook the causal relationship between patient symptoms and disease diagnoses. For example, these simulators simply generate a "not-sure" response to any inquiry (i.e., symptom) that was not observed in the dialogue record. As a result, the MAD agent is usually trained without exploiting counterfactual reasoning beyond the factual observations. To address this problem, this paper presents a propensity-based patient simulator (PBPS), which facilitates the training of MAD agents by generating informative counterfactual answers along with the disease diagnosis. Specifically, PBPS estimates the propensity score of each record via patient-doctor dialogue reasoning and can thus generate counterfactual answers by searching across records: a symptom unrecorded for one patient can be found in the records of other patients through propensity score matching. A progressive assurance agent (P2A) can then be trained with PBPS; it includes two separate yet cooperative branches accounting for symptom-inquiry and disease-diagnosis actions, respectively. The disease-diagnosis branch predicts the confidence of each disease and drives the symptom-inquiry branch to enhance that confidence, and the two branches are jointly optimized, benefiting from each other.
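
A minimal sketch of the propensity-score-matching idea: when a patient's record lacks the inquired symptom, borrow the answer from the record with the closest propensity score. The record layout and scalar scores below are our mock-up; PBPS derives its scores from patient-doctor dialogue reasoning.

```python
def counterfactual_answer(query_symptom, patient, records):
    """If the patient's record lacks the symptom, borrow the answer from the
    record with the closest propensity score that does contain it."""
    if query_symptom in patient["answers"]:
        return patient["answers"][query_symptom]
    candidates = [r for r in records if query_symptom in r["answers"]]
    if not candidates:
        return "not-sure"                      # fall back to the old behavior
    match = min(candidates,
                key=lambda r: abs(r["propensity"] - patient["propensity"]))
    return match["answers"][query_symptom]

records = [
    {"propensity": 0.2, "answers": {"cough": "yes"}},
    {"propensity": 0.8, "answers": {"cough": "no", "fever": "yes"}},
]
patient = {"propensity": 0.75, "answers": {"headache": "yes"}}
print(counterfactual_answer("fever", patient, records))   # -> "yes"
```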

* Submitted to TPAMI 2020. In the experiments, our trained agent achieves a new state of the art under various experimental settings and possesses the advantages of sample efficiency and robustness compared to existing MAD methods 

Meta R-CNN : Towards General Solver for Instance-level Low-shot Learning

Sep 28, 2019
Xiaopeng Yan, Ziliang Chen, Anni Xu, Xiaoxi Wang, Xiaodan Liang, Liang Lin

Resembling the rapid learning capability of humans, low-shot learning empowers vision systems to understand new concepts by training with few samples. Leading approaches derive from meta-learning on images containing a single visual object. Obfuscated by complex backgrounds and multiple objects in one image, such approaches can hardly advance research on low-shot object detection/segmentation. In this work, we present a flexible and general methodology for these tasks. Our work extends Faster/Mask R-CNN by applying meta-learning over RoI (Region-of-Interest) features instead of a full-image feature. This simple design disentangles multi-object information from the background, without bells and whistles, turning Faster/Mask R-CNN into a meta-learner for these tasks. Specifically, we introduce a Predictor-head Remodeling Network (PRN) that shares its main backbone with Faster/Mask R-CNN. PRN receives images containing low-shot objects with their bounding boxes or masks and infers class-attentive vectors. These vectors apply channel-wise soft attention to the RoI features, remodeling the R-CNN predictor heads to detect or segment the objects consistent with the classes the vectors represent. In our experiments, Meta R-CNN yields the state of the art in low-shot object detection and improves low-shot object segmentation with Mask R-CNN.
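
A small PyTorch sketch of the channel-wise soft attention described above: a class-attentive vector (as the PRN would produce) gates pooled RoI features before a predictor head. The shapes and the module name AttentiveHead are our assumptions for illustration.

```python
import torch
import torch.nn as nn

class AttentiveHead(nn.Module):
    """Sketch: class-attentive vector gates RoI features channel-wise."""
    def __init__(self, channels=256, num_outputs=2):
        super().__init__()
        self.predictor = nn.Linear(channels, num_outputs)  # e.g. class logits

    def forward(self, roi_feats, class_vec):
        # roi_feats: (R, C, H, W) pooled RoI features; class_vec: (C,) from PRN.
        attn = torch.sigmoid(class_vec).view(1, -1, 1, 1)  # channel-wise gate
        gated = roi_feats * attn                           # remodel per class
        pooled = gated.mean(dim=(2, 3))                    # (R, C)
        return self.predictor(pooled)

head = AttentiveHead()
scores = head(torch.randn(8, 256, 7, 7), torch.randn(256))
print(scores.shape)   # torch.Size([8, 2])
```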

* Published in ICCV-2019. Project: https://yanxp.github.io/metarcnn.html 

Graph Neural Reasoning May Fail in Certifying Boolean Unsatisfiability

Sep 27, 2019
Ziliang Chen, Zhanfu Yang

It is feasible and practically valuable to bridge the characteristics of graph neural networks (GNNs) and logical reasoning. Despite considerable efforts and successes in solving Boolean satisfiability (SAT), GNN-based solvers for more complex predicate-logic formulae remain a mystery. In this work, we conjecture, with supporting evidence, that generally-defined GNNs have several limitations in certifying unsatisfiability (UNSAT) of Boolean formulae. This implies that GNNs may fail to learn logical reasoning tasks that contain proving UNSAT as a sub-problem, as most predicate-logic formulae do.
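
To make the setting concrete, here is one message-passing round on a clause-variable incidence graph (NeuroSAT-style), the kind of generally-defined GNN whose UNSAT-certifying power the paper questions; literal polarity is omitted for brevity.

```python
import torch

def message_round(var_h, clause_h, adj):
    # adj: (num_clauses, num_vars) 0/1 incidence matrix of the CNF formula.
    clause_in = adj @ var_h            # aggregate variable states into clauses
    var_in = adj.t() @ clause_h        # aggregate clause states back into variables
    return torch.tanh(var_in), torch.tanh(clause_in)

# (x1 or x2) and (not x1 or x3): incidence only, ignoring polarity.
adj = torch.tensor([[1., 1., 0.], [1., 0., 1.]])
var_h, clause_h = torch.randn(3, 8), torch.randn(2, 8)
var_h, clause_h = message_round(var_h, clause_h, adj)
```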

* 6 pages 

Multivariate-Information Adversarial Ensemble for Scalable Joint Distribution Matching

Jul 08, 2019
Ziliang Chen, Zhanfu Yang, Xiaoxi Wang, Xiaodan Liang, Xiaopeng Yan, Guanbin Li, Liang Lin

A broad range of cross-$m$-domain generation research boils down to matching a joint distribution with deep generative models (DGMs). Existing algorithms excel in pairwise domains but, as $m$ increases, struggle to scale to fit a joint distribution. In this paper, we propose a domain-scalable DGM, i.e., MMI-ALI, for $m$-domain joint distribution matching. As an $m$-domain ensemble of ALIs \cite{dumoulin2016adversarially}, MMI-ALI is adversarially trained to maximize the Multivariate Mutual Information (MMI) w.r.t. the joint variables of each pair of domains and their shared feature. The negative MMIs are upper bounded by a series of feasible losses that provably lead to matching the $m$-domain joint distribution. MMI-ALI scales linearly as $m$ increases and thus strikes the right balance between efficacy and scalability. We evaluate MMI-ALI in diverse, challenging $m$-domain scenarios and verify its superiority.
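
A schematic of the ensemble's pairwise structure in PyTorch: $m$ per-domain encoder/decoder pairs whose losses are summed over every domain pair, so the number of networks grows linearly in $m$. The pair term below is a readability placeholder, not the paper's MMI bound.

```python
import itertools
import torch
import torch.nn as nn

m, dim, z_dim = 3, 16, 8
encoders = nn.ModuleList(nn.Linear(dim, z_dim) for _ in range(m))
decoders = nn.ModuleList(nn.Linear(z_dim, dim) for _ in range(m))

def ensemble_loss(batches):
    # batches[i]: samples from domain i, shape (B, dim).
    total = 0.0
    for i, j in itertools.combinations(range(m), 2):
        z_i, z_j = encoders[i](batches[i]), encoders[j](batches[j])
        # Placeholder pair term: align shared features and cross-reconstruct.
        total = total + (z_i.mean(0) - z_j.mean(0)).pow(2).sum()
        total = total + (decoders[j](z_i) - batches[j]).pow(2).mean()
    return total

loss = ensemble_loss([torch.randn(4, dim) for _ in range(m)])
```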

* ICML-19 

Blending-target Domain Adaptation by Adversarial Meta-Adaptation Networks

Jul 08, 2019
Ziliang Chen, Jingyu Zhuang, Xiaodan Liang, Liang Lin

(Unsupervised) Domain Adaptation (DA) seeks to classify target instances when provided with only labeled source and unlabeled target examples for training. Learning domain-invariant features helps achieve this goal, but it presumes that the unlabeled samples are drawn from a single or multiple explicit target domains (multi-target DA). In this paper, we consider a more realistic transfer scenario: the target domain comprises multiple sub-targets implicitly blended with each other, so that learners cannot identify which sub-target each unlabeled sample belongs to. This Blending-target Domain Adaptation (BTDA) scenario commonly appears in practice and threatens the validity of most existing DA algorithms, due to the domain gaps and categorical misalignments among these hidden sub-targets. To reap the transfer performance gains in this new scenario, we propose the Adversarial Meta-Adaptation Network (AMEAN). AMEAN entails two adversarial transfer learning processes. The first is a conventional adversarial transfer that bridges the source and mixed target domains. To circumvent intra-target category misalignment, the second process presents as "learning to adapt": it deploys an unsupervised meta-learner that receives target data and ongoing feature-learning feedback to discover target clusters as "meta-sub-target" domains. These meta-sub-targets automatically design the meta-sub-target DA loss, which empirically eliminates the implicit category mismatching in the mixed target. We evaluate AMEAN and a variety of DA algorithms on three benchmarks under the BTDA setup. Empirical results show that BTDA is quite a challenging transfer setup for most existing DA algorithms, yet AMEAN significantly outperforms these state-of-the-art baselines and effectively restrains the negative transfer effects in BTDA.
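
A rough sketch of the "learning to adapt" step under stated assumptions: cluster the mixed target on current features to obtain meta-sub-targets, then give each its own alignment term. K-means and the moment-matching loss are our stand-ins for AMEAN's meta-learner and auto-designed DA loss.

```python
import torch
from sklearn.cluster import KMeans

def meta_sub_target_loss(f, src_x, tgt_x, k=3):
    """f: feature extractor; src_x/tgt_x: source and mixed-target batches."""
    src_z, tgt_z = f(src_x), f(tgt_x)
    # Discover hidden sub-targets from the ongoing feature learning.
    assign = KMeans(n_clusters=k, n_init=10).fit_predict(
        tgt_z.detach().cpu().numpy())
    assign = torch.as_tensor(assign)
    loss = 0.0
    for c in range(k):
        sub = tgt_z[assign == c]
        if len(sub):                      # simple moment matching per sub-target
            loss = loss + (src_z.mean(0) - sub.mean(0)).pow(2).sum()
    return loss
```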

* CVPR-19 (oral). Code is available at http://github.com/zjy526223908/BTDA 

Improved Hard Example Mining by Discovering Attribute-based Hard Person Identity

May 09, 2019
Xiao Wang, Ziliang Chen, Rui Yang, Bin Luo, Jin Tang

In this paper, we propose Hard Person Identity Mining (HPIM), which refines hard example mining to improve exploration efficacy in person re-identification. It is motivated by the following observation: the more attributes some people share, the more difficult it is to separate their identities. Based on this observation, we develop HPIM via a transferred attribute describer, a deep multi-attribute classifier trained on noisy source person-attribute datasets. We encode each image in the target person re-ID dataset into an attribute probabilistic description. Then, in the attribute-code space, we treat each person as a distribution that generates view-specific attribute codes under different practical scenarios. We estimate person-specific statistical moments from the zeroth to higher orders, which are further used to calculate the central moment discrepancies between persons. Such discrepancies are the basis for choosing hard identities to organize proper mini-batches, without concern for the person representation changing during metric learning. HPIM thus serves as a complementary tool to hard example mining, helping to explore the global, rather than local, hard example constraint in mini-batches built from randomly sampled identities. Extensive experiments on two person re-identification benchmarks validate the effectiveness of our proposed algorithm.
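
A minimal sketch of the batch-construction idea: summarize each identity by central moments of its attribute codes, then batch an anchor with the identities whose moment discrepancy is smallest (i.e., hardest to separate). The moment orders and the batching rule are our simplifications.

```python
import torch

def central_moments(codes, orders=(1, 2, 3)):
    # codes: (n_views, n_attrs) attribute probabilities for one identity.
    mu = codes.mean(0)
    feats = [mu] + [((codes - mu) ** k).mean(0) for k in orders[1:]]
    return torch.cat(feats)

def hard_identity_batch(id_codes, anchor, batch_size=4):
    # id_codes: dict mapping identity -> (n_views, n_attrs) tensor.
    a = central_moments(id_codes[anchor])
    gaps = {pid: (central_moments(c) - a).norm().item()
            for pid, c in id_codes.items() if pid != anchor}
    hard = sorted(gaps, key=gaps.get)[:batch_size - 1]   # smallest gap = hardest
    return [anchor] + hard
```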

Graph Neural Reasoning for 2-Quantified Boolean Formula Solvers

Apr 27, 2019
Zhanfu Yang, Fei Wang, Ziliang Chen, Guannan Wei, Tiark Rompf

In this paper, we investigate the feasibility of learning GNN (Graph Neural Network)-based solvers and GNN-based heuristics for specified QBF (Quantified Boolean Formula) problems. We design and evaluate several GNN architectures for 2QBF formulae and conjecture that GNNs have limitations in learning 2QBF solvers. We then show how to learn a heuristic for a CEGAR 2QBF solver. We further explore generalizing GNN-based heuristics to larger unseen instances and uncover some interesting challenges. In summary, this paper provides a comprehensive survey of applying GNN embeddings to specified QBF solvers and aims to offer guidance in applying ML to more complicated symbolic reasoning problems.
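
For context, a tiny brute-force CEGAR loop for 2QBF (exists-X forall-Y phi); the naive counterexample search marked below is the slot a learned GNN heuristic would fill. The encoding of phi as a Python predicate is our illustration.

```python
from itertools import product

def cegar_2qbf(phi, n_x, n_y):
    """phi(x, y) -> bool over 0/1 tuples. Returns a winning x, or None (UNSAT)."""
    banned = []                                     # collected counterexamples
    for x in product((0, 1), repeat=n_x):
        if all(phi(x, y) for y in banned):          # survives past refutations
            cex = next((y for y in product((0, 1), repeat=n_y)
                        if not phi(x, y)), None)    # GNN heuristic goes here
            if cex is None:
                return x                            # no y refutes x: SAT
            banned.append(cex)
    return None                                     # every candidate refuted

# exists x . forall y . (x1 or y1) and (x1 or not y1)  => x1 must be 1
print(cegar_2qbf(lambda x, y: (x[0] or y[0]) and (x[0] or not y[0]), 1, 1))
```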

* 5 pages 