Jianping Shi

Understanding the wiring evolution in differentiable neural architecture search

Sep 02, 2020
Sirui Xie, Shoukang Hu, Xinjiang Wang, Chunxiao Liu, Jianping Shi, Xunying Liu, Dahua Lin

Controversy exists on whether differentiable neural architecture search methods discover wiring topology effectively. To understand how wiring topology evolves, we study the underlying mechanism of several existing differentiable NAS frameworks. Our investigation is motivated by three observed searching patterns of differentiable NAS: 1) they search by growing instead of pruning; 2) wider networks are preferred over deeper ones; 3) no edges are selected in bi-level optimization. To dissect these phenomena, we propose a unified view of the search algorithms of existing frameworks, recasting the global optimization as local cost minimization. Based on this reformulation, we conduct empirical and theoretical analyses, revealing implicit inductive biases in the cost's assignment mechanism and evolution dynamics that cause the observed phenomena. These biases indicate strong discrimination towards certain topologies. To this end, we pose questions that future differentiable methods for neural wiring discovery need to confront, hoping to evoke discussion and rethinking of how much bias has been enforced implicitly in existing NAS methods.

Improving Semantic Segmentation via Decoupled Body and Edge Supervision

Aug 18, 2020
Xiangtai Li, Xia Li, Li Zhang, Guangliang Cheng, Jianping Shi, Zhouchen Lin, Shaohua Tan, Yunhai Tong

Existing semantic segmentation approaches either aim to improve the object's inner consistency by modeling the global context, or refine object details along their boundaries by multi-scale feature fusion. In this paper, a new paradigm for semantic segmentation is proposed. Our insight is that appealing performance of semantic segmentation requires explicitly modeling the object body and edge, which correspond to the low- and high-frequency components of the image, respectively. To do so, we first warp the image feature by learning a flow field to make the object part more consistent. The resulting body feature and the residual edge feature are further optimized under decoupled supervision by explicitly sampling pixels from the different parts (body or edge). We show that the proposed framework with various baselines or backbone networks leads to better object inner consistency and object boundaries. Extensive experiments on four major road scene semantic segmentation benchmarks, including Cityscapes, CamVid, KITTI and BDD, show that our proposed approach establishes a new state of the art while retaining high efficiency in inference. In particular, we achieve 83.7% mIoU on Cityscapes with only fine-annotated data. Code and models are made available to foster further research (https://github.com/lxtGH/DecoupleSegNets).

* accepted by ECCV 2020
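
For readers unfamiliar with the mIoU figure quoted above, the metric can be sketched as follows. This is a minimal illustration on flat label lists, not the benchmark's official evaluation script; the function name `mean_iou` is an assumption, and benchmark tools typically accumulate intersections and unions over the whole dataset before averaging over classes.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes that occur in pred or target.

    `pred` and `target` are flat sequences of integer class labels.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

Here each class contributes the ratio of correctly labeled pixels to all pixels labeled with that class in either the prediction or the ground truth, so a class with few pixels weighs as much as a large one.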

TSIT: A Simple and Versatile Framework for Image-to-Image Translation

Jul 25, 2020
Liming Jiang, Changxu Zhang, Mingyang Huang, Chunxiao Liu, Jianping Shi, Chen Change Loy

We introduce a simple and versatile framework for image-to-image translation. We unearth the importance of normalization layers, and provide a carefully designed two-stream generative model with newly proposed feature transformations in a coarse-to-fine fashion. This allows multi-scale semantic structure information and style representation to be effectively captured and fused by the network, permitting our method to scale to various tasks in both unsupervised and supervised settings. No additional constraints (e.g., cycle consistency) are needed, contributing to a very clean and simple method. Multi-modal image synthesis with arbitrary style control is made possible. A systematic study compares the proposed method with several state-of-the-art task-specific baselines, verifying its effectiveness in both perceptual quality and quantitative evaluations.

* ECCV 2020 (Spotlight). Table 2 is updated. GitHub: https://github.com/EndlessSora/TSIT

Search What You Want: Barrier Panelty NAS for Mixed Precision Quantization

Jul 20, 2020
Haibao Yu, Qi Han, Jianbo Li, Jianping Shi, Guangliang Cheng, Bin Fan

Emerging hardware can support inference with mixed precision CNN models that assign different bitwidths to different layers. Learning to find an optimal mixed precision model that preserves accuracy and satisfies specific constraints on model size and computation is extremely challenging due to the difficulty of training a mixed precision model and the huge space of all possible bit quantizations. In this paper, we propose a novel soft Barrier Penalty based NAS (BP-NAS) for mixed precision quantization, which ensures all the searched models are inside the valid domain defined by the complexity constraint, and thus can return an optimal model under the given constraint by conducting the search only once. The proposed soft Barrier Penalty is differentiable and imposes very large losses on models outside the valid domain while imposing almost no punishment on models inside it, thus constraining the search to the feasible domain. In addition, a differentiable Prob-1 regularizer is proposed to ensure learning with NAS is reasonable. A distribution reshaping training strategy is also used to make training more stable. BP-NAS sets a new state of the art on both classification (CIFAR-10, ImageNet) and detection (COCO), surpassing all the efficient mixed precision methods designed manually and automatically. In particular, BP-NAS achieves higher mAP (up to 2.7% mAP improvement) together with lower bit computation cost compared with the existing best mixed precision model on COCO detection.

* ECCV 2020
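
The soft barrier idea can be illustrated with a small sketch. The abstract does not give the penalty's formula, so the function below is a hypothetical stand-in: a scaled softplus that is differentiable, stays near zero while the model cost is below the budget, and grows steeply once the cost exceeds it. The names `soft_barrier_penalty`, `sharpness`, `scale`, and `expected_cost` are assumptions made for illustration.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def soft_barrier_penalty(cost: float, budget: float,
                         sharpness: float = 50.0, scale: float = 100.0) -> float:
    """Hypothetical smooth penalty: close to 0 inside the budget, large outside.

    Illustrative stand-in only, not the formula from the paper.
    """
    # Relative constraint violation; negative when the model is feasible.
    violation = (cost - budget) / budget
    return scale * softplus(sharpness * violation) / sharpness


def expected_cost(bit_probs, layer_sizes):
    """Expected model size in bits when each layer chooses a bitwidth
    according to a probability distribution over candidate bitwidths."""
    total = 0.0
    for probs, size in zip(bit_probs, layer_sizes):
        total += size * sum(bits * p for bits, p in probs.items())
    return total
```

Because the expected cost is a smooth function of the bitwidth probabilities, adding such a penalty to the task loss would let gradient descent push the search back toward the feasible region.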

TPNet: Trajectory Proposal Network for Motion Prediction

Apr 26, 2020
Liangji Fang, Qinhong Jiang, Jianping Shi, Bolei Zhou

Accurately predicting the motion of surrounding traffic agents such as pedestrians, vehicles, and cyclists is crucial for autonomous driving. Recent data-driven motion prediction methods have attempted to learn to directly regress the exact future position or its distribution from massive amounts of trajectory data. However, it remains difficult for these methods to provide multimodal predictions as well as integrate physical constraints such as traffic rules and movable areas. In this work we propose a novel two-stage motion prediction framework, Trajectory Proposal Network (TPNet). TPNet first generates a candidate set of future trajectories as hypothesis proposals, then makes the final predictions by classifying and refining the proposals that meet the physical constraints. By steering the proposal generation process, safe and multimodal predictions are realized. Thus this framework effectively mitigates the complexity of the motion prediction problem while ensuring multimodal output. Experiments on four large-scale trajectory prediction datasets, i.e., the ETH, UCY, Apollo and Argoverse datasets, show that TPNet achieves state-of-the-art results both quantitatively and qualitatively.
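
The propose-then-select structure described above can be sketched in a few lines. The abstract does not detail how proposals are generated or scored, so everything here is a simplified assumption: proposals are straight lines to hand-picked end points, the physical constraint is a user-supplied `is_movable` predicate, and the learned classifier is replaced by a plain `score` function.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Trajectory = List[Point]


def generate_proposals(start: Point, end_points: List[Point],
                       steps: int = 5) -> List[Trajectory]:
    """Stage 1 (simplified): one straight-line proposal per candidate end point."""
    proposals = []
    for ex, ey in end_points:
        traj = [(start[0] + (ex - start[0]) * t / steps,
                 start[1] + (ey - start[1]) * t / steps)
                for t in range(1, steps + 1)]
        proposals.append(traj)
    return proposals


def filter_by_constraint(proposals: List[Trajectory],
                         is_movable: Callable[[Point], bool]) -> List[Trajectory]:
    """Keep only proposals whose every point lies in the movable area."""
    return [traj for traj in proposals if all(is_movable(p) for p in traj)]


def select_best(proposals: List[Trajectory],
                score: Callable[[Trajectory], float]) -> Trajectory:
    """Stage 2 (simplified): rank the surviving proposals with a score function."""
    return max(proposals, key=score)
```

Keeping several high-scoring proposals instead of a single best one would give a multimodal output in this simplified setting.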

Uncertainty-Aware Consistency Regularization for Cross-Domain Semantic Segmentation

Apr 19, 2020
Qianyu Zhou, Zhengyang Feng, Guangliang Cheng, Xin Tan, Jianping Shi, Lizhuang Ma

Unsupervised domain adaptation (UDA) aims to adapt existing models of the source domain to a new target domain with only unlabeled data. The main challenge of UDA lies in how to reduce the domain gap between the source domain and the target domain. Existing approaches to cross-domain semantic segmentation usually apply consistency regularization to the target predictions of a student model and a teacher model under different perturbations. However, previous works do not consider the reliability of the predicted target samples, which could harm the learning process by generating unreasonable guidance for the student model. In this paper, we propose an uncertainty-aware consistency regularization method to tackle this issue for semantic segmentation. By exploiting the latent uncertainty information of the target samples, more meaningful and reliable knowledge from the teacher model can be transferred to the student model. The experimental evaluation shows that the proposed method outperforms state-of-the-art methods by around 3% to 5% on two domain adaptation benchmarks, i.e., GTAV → Cityscapes and SYNTHIA → Cityscapes.
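
A minimal sketch of the idea, assuming that uncertainty is measured by the entropy of the teacher's prediction and that unreliable pixels are simply dropped from the consistency loss. Both choices, as well as the names `uncertainty_aware_consistency` and `max_entropy`, are illustrative assumptions and not necessarily the paper's formulation.

```python
import math


def entropy(probs):
    """Shannon entropy of a discrete distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def uncertainty_aware_consistency(student, teacher, max_entropy):
    """Mean squared error between student and teacher predictions,
    averaged only over pixels whose teacher entropy is at most `max_entropy`.

    `student` and `teacher` are lists of per-pixel class distributions.
    Thresholding on entropy is an assumed choice for this sketch.
    """
    total, kept = 0.0, 0
    for s, t in zip(student, teacher):
        if entropy(t) > max_entropy:
            continue  # the teacher is unsure here, so skip this pixel
        total += sum((si - ti) ** 2 for si, ti in zip(s, t)) / len(t)
        kept += 1
    return total / kept if kept else 0.0
```

With a low threshold, only pixels where the teacher is confident guide the student; raising the threshold admits noisier targets.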

Semi-Supervised Semantic Segmentation via Dynamic Self-Training and Class-Balanced Curriculum

Apr 18, 2020
Zhengyang Feng, Qianyu Zhou, Guangliang Cheng, Xin Tan, Jianping Shi, Lizhuang Ma

In this work, we propose a novel and concise approach for semi-supervised semantic segmentation. The major challenge of this task lies in how to exploit unlabeled data efficiently and thoroughly. Previous state-of-the-art methods utilize unlabeled data by GAN-based self-training or consistency regularization. However, these methods either suffer from noisy self-supervision and class imbalance, resulting in a low unlabeled data utilization rate, or do not consider the apparent link between self-training and consistency regularization. Our method, Dynamic Self-Training and Class-Balanced Curriculum (DST-CBC), exploits inter-model disagreement by prediction confidence to construct a dynamic loss robust against pseudo label noise, enabling it to extend pseudo labeling to a class-balanced curriculum learning process. We further show that our method implicitly includes consistency regularization. Thus, DST-CBC not only exploits unlabeled data efficiently, but also thoroughly utilizes all unlabeled data. Without using adversarial training or any kind of modification to the network architecture, DST-CBC outperforms existing methods on different datasets across all labeled ratios, bringing semi-supervised learning yet another step closer to matching the performance of fully-supervised learning for semantic segmentation. Our code and data splits are available at: https://github.com/voldemortX/DST-CBC.

* Code is available at https://github.com/voldemortX/DST-CBC
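
To make the class-balanced curriculum concrete, here is a hedged sketch of one plausible selection rule: pseudo labels are ranked by confidence within each class, and a fraction `keep_ratio` of every class is kept. The function name, the tuple layout, and the rule itself are assumptions for illustration; the linked repository is the reference for the actual procedure.

```python
from collections import defaultdict


def class_balanced_selection(predictions, keep_ratio):
    """Select pseudo labels per class rather than with one global threshold.

    `predictions` is a list of (sample_id, predicted_class, confidence).
    For every class, the `keep_ratio` most confident samples are kept (at
    least one), so rare or hard classes are not crowded out by easy ones.
    """
    by_class = defaultdict(list)
    for sample_id, cls, conf in predictions:
        by_class[cls].append((conf, sample_id))

    selected = {}
    for cls, items in by_class.items():
        items.sort(reverse=True)  # most confident first
        n_keep = max(1, int(len(items) * keep_ratio))
        for _conf, sample_id in items[:n_keep]:
            selected[sample_id] = cls
    return selected
```

Raising `keep_ratio` from round to round would mimic a curriculum that ends up using all unlabeled samples, whereas a single global confidence threshold could discard an entire low-confidence class.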

AdaStereo: A Simple and Efficient Approach for Adaptive Stereo Matching

Apr 09, 2020
Xiao Song, Guorun Yang, Xinge Zhu, Hui Zhou, Zhe Wang, Jianping Shi

In this paper, we attempt to solve the domain adaptation problem for deep stereo matching networks. Instead of resorting to black-box structures or layers to find implicit connections across domains, we focus on investigating adaptation gaps for stereo matching. Through visual inspection and extensive experiments, we conclude that low-level alignment is crucial for adaptive stereo matching, since the main gaps across domains lie in the inconsistent input color and cost volume distributions. Correspondingly, we design a bottom-up domain adaptation method, in which two particular approaches are proposed, i.e., color transfer and cost regularization, that can be easily integrated into existing stereo matching models. The color transfer enables transferring a large amount of synthetic data into the same color space as the target domain during training. The cost regularization further constrains both the lower-layer features and cost volumes to domain-invariant distributions. Although our proposed strategies are simple and have no parameters to learn, they improve the generalization ability of existing disparity networks by a large margin. We conduct experiments across multiple datasets, including Scene Flow, KITTI, Middlebury, ETH3D and DrivingStereo. Without bells and whistles, our synthetic-data pretrained models achieve state-of-the-art cross-domain performance compared to previous domain-invariant methods, and even outperform state-of-the-art disparity networks fine-tuned with target domain ground truths on multiple stereo matching benchmarks.

* 18 pages, 5 figures, 5 tables
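
One parameter-free way to realize a color transfer of this kind is to match per-channel mean and standard deviation between a source (synthetic) image and a target-domain image. The abstract does not state which transfer AdaStereo uses, so the sketch below is an assumption, and `transfer_channel` and `color_transfer` are hypothetical names.

```python
import statistics


def transfer_channel(source, target):
    """Shift and scale `source` values so their mean and standard deviation
    match those of `target` (one color channel, flattened to a list)."""
    src_mean, tgt_mean = statistics.fmean(source), statistics.fmean(target)
    src_std, tgt_std = statistics.pstdev(source), statistics.pstdev(target)
    if src_std == 0.0:
        # Flat channel: only the mean can be matched.
        return [tgt_mean for _ in source]
    gain = tgt_std / src_std
    return [(v - src_mean) * gain + tgt_mean for v in source]


def color_transfer(source_image, target_image):
    """Apply the per-channel statistic matching to every channel.

    Images are dicts mapping a channel name to a flat list of pixel values.
    """
    return {name: transfer_channel(source_image[name], target_image[name])
            for name in source_image}
```

The transform has no learned parameters: it is fully determined by the statistics of the two images, which is consistent with the "no parameters to learn" property claimed above.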
