Yong Guo

Automatic Subspace Evoking for Efficient Neural Architecture Search

Oct 31, 2022

Yaofo Chen, Yong Guo, Daihai Liao, Fanbing Lv, Hengjie Song, Mingkui Tan

Abstract: Neural Architecture Search (NAS) aims to automatically find effective architectures from a predefined search space. However, the search space is often extremely large, so searching it directly is non-trivial and very time-consuming. To address these issues, in each search step, we seek to limit the search space to a small but effective subspace to boost both the search performance and search efficiency. To this end, we propose a novel Neural Architecture Search method via Automatic Subspace Evoking (ASE-NAS) that finds promising architectures in automatically evoked subspaces. Specifically, we first perform a global search, i.e., automatic subspace evoking, to evoke/find a good subspace from a set of candidates. Then, we perform a local search within the evoked subspace to find an effective architecture. More critically, we further boost search performance by taking well-designed/searched architectures as the initial candidate subspaces. Extensive experiments show that our ASE-NAS not only greatly reduces the search cost but also finds better architectures than state-of-the-art methods in various benchmark search spaces.
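
The two-stage procedure described in this abstract (a global step that picks a subspace, then a local search inside it) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: architectures are integer tuples, `score` is a synthetic stand-in for validation accuracy, `subspace` is a one-step neighbourhood, and the global step is a plain argmax rather than the learned subspace evoking the paper proposes.

```python
import random

# Hypothetical "ideal" architecture, used only by the toy score below.
TARGET = (3, 5, 2)

def score(arch):
    """Synthetic quality measure: higher is better, 0 is the optimum."""
    return -sum((a - t) ** 2 for a, t in zip(arch, TARGET))

def subspace(centre):
    """All architectures that differ from `centre` by +/-1 in one position."""
    out = []
    for i in range(len(centre)):
        for delta in (-1, 1):
            arch = list(centre)
            arch[i] += delta
            out.append(tuple(arch))
    return out

def two_stage_search(candidates, local_steps=20, seed=0):
    rng = random.Random(seed)
    # Global step: choose the most promising candidate as the subspace centre.
    centre = max(candidates, key=score)
    # Local step: random search restricted to the chosen subspace.
    best = centre
    pool = subspace(centre)
    for _ in range(local_steps):
        arch = rng.choice(pool)
        if score(arch) > score(best):
            best = arch
    return centre, best

candidates = [(1, 1, 1), (3, 4, 2), (6, 6, 6)]
centre, best = two_stage_search(candidates)
print("subspace centre:", centre, "best found:", best)
```

The local step only ever evaluates six neighbours here, instead of the full grid of integer tuples, which is the efficiency argument the abstract makes in miniature.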

Pareto-aware Neural Architecture Generation for Diverse Computational Budgets

Oct 14, 2022

Yong Guo, Yaofo Chen, Yin Zheng, Qi Chen, Peilin Zhao, Jian Chen, Junzhou Huang, Mingkui Tan

Abstract: Designing feasible and effective architectures under diverse computational budgets, incurred by different applications/devices, is essential for deploying deep models in real-world applications. To achieve this goal, existing methods often perform an independent architecture search process for each target budget, which is both inefficient and unnecessary. More critically, these independent search processes cannot share their learned knowledge (i.e., the distribution of good architectures) with each other and thus often yield limited search results. To address these issues, we propose a Pareto-aware Neural Architecture Generator (PNAG) which only needs to be trained once and dynamically produces the Pareto optimal architecture for any given budget via inference. To train our PNAG, we learn the whole Pareto frontier by jointly finding multiple Pareto optimal architectures under diverse budgets. Such a joint search algorithm not only greatly reduces the overall search cost but also improves the search results. Extensive experiments on three hardware platforms (i.e., mobile device, CPU, and GPU) show the superiority of our method over existing methods.

* 11 pages, 7 figures, journal version
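
To make the notion of a Pareto frontier over cost and accuracy concrete, here is a minimal sketch. It is not the PNAG generator, which is a trained model; it only shows what "Pareto optimal under a budget" means for a fixed, made-up list of architectures. The names, costs, and accuracies are hypothetical.

```python
def pareto_frontier(archs):
    """archs: list of (name, cost, accuracy).
    Returns the non-dominated entries, sorted by increasing cost."""
    frontier = []
    best_acc = float("-inf")
    # Sort by cost; on ties, put the more accurate architecture first.
    for name, cost, acc in sorted(archs, key=lambda a: (a[1], -a[2])):
        # Keep an entry only if it beats every cheaper (or equal-cost) one.
        if acc > best_acc:
            frontier.append((name, cost, acc))
            best_acc = acc
    return frontier

def best_under_budget(frontier, budget):
    """Most accurate frontier entry whose cost fits the budget, or None."""
    feasible = [a for a in frontier if a[1] <= budget]
    return max(feasible, key=lambda a: a[2]) if feasible else None

# Hypothetical (name, latency in ms, top-1 accuracy in %) triples.
archs = [
    ("A", 10, 70.0),
    ("B", 20, 75.0),
    ("C", 25, 74.0),  # dominated by B: slower and less accurate
    ("D", 40, 78.0),
    ("E", 40, 77.0),  # dominated by D: same cost, less accurate
]
frontier = pareto_frontier(archs)
print([name for name, _, _ in frontier])
print(best_under_budget(frontier, budget=30))
```

A single frontier answers every budget query by lookup, which is the motivation for learning the whole frontier once rather than running one search per budget.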

Improving Fine-tuning of Self-supervised Models with Contrastive Initialization

Jul 30, 2022

Haolin Pan, Yong Guo, Qinyi Deng, Haomin Yang, Yiqun Chen, Jian Chen

Abstract: Self-supervised learning (SSL) has achieved remarkable performance in pretraining models that can be further used in downstream tasks via fine-tuning. However, these self-supervised models may not capture meaningful semantic information since images belonging to the same class are always regarded as negative pairs in the contrastive loss. Consequently, images of the same class are often located far away from each other in the learned feature space, which inevitably hampers the fine-tuning process. To address this issue, we seek to provide a better initialization for the self-supervised models by enhancing the semantic information. To this end, we propose a Contrastive Initialization (COIN) method that breaks the standard fine-tuning pipeline by introducing an extra initialization stage before fine-tuning. Extensive experiments show that, with the enriched semantics, our COIN significantly outperforms existing methods without introducing extra training cost and sets a new state of the art on multiple downstream tasks.

* 22 pages, 4 figures

Towards Lightweight Super-Resolution with Dual Regression Learning

Jul 21, 2022

Yong Guo, Jingdong Wang, Qi Chen, Jiezhang Cao, Zeshuai Deng, Yanwu Xu, Jian Chen, Mingkui Tan

Abstract: Deep neural networks have exhibited remarkable performance in image super-resolution (SR) tasks by learning a mapping from low-resolution (LR) images to high-resolution (HR) images. However, the SR problem is typically ill-posed, and existing methods come with several limitations. First, the possible mapping space of SR can be extremely large since there may exist many different HR images that can be downsampled to the same LR image. As a result, it is hard to directly learn a promising SR mapping from such a large space. Second, it is often inevitable to develop very large models with extremely high computational cost to yield promising SR performance. In practice, one can use model compression techniques to obtain compact models by reducing model redundancy. Nevertheless, it is hard for existing model compression methods to accurately identify the redundant components due to the extremely large SR mapping space. To alleviate the first challenge, we propose a dual regression learning scheme to reduce the space of possible SR mappings. Specifically, in addition to the mapping from LR to HR images, we learn an additional dual regression mapping to estimate the downsampling kernel and reconstruct LR images. In this way, the dual mapping acts as a constraint to reduce the space of possible mappings. To address the second challenge, we propose a lightweight dual regression compression method, based on channel pruning, to reduce model redundancy at both the layer level and the channel level. Specifically, we first develop a channel number search method that minimizes the dual regression loss to determine the redundancy of each layer. Given the searched channel numbers, we further exploit the dual regression manner to evaluate the importance of channels and prune the redundant ones. Extensive experiments show the effectiveness of our method in obtaining accurate and efficient SR models.

* Journal extension of DRN. arXiv admin note: text overlap with arXiv:2003.07018
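
The dual regression constraint described in this abstract can be written as a two-term objective. The notation here is assumed for illustration and is not taken from the listing: $x_i$ is an LR image, $y_i$ its HR counterpart, $P$ the primal LR-to-HR mapping, $D$ the dual HR-to-LR mapping, $\mathcal{L}_P$ and $\mathcal{L}_D$ the corresponding reconstruction losses, and $\lambda$ a weighting factor.

```latex
\mathcal{L}(P, D) \;=\; \sum_{i=1}^{N}
  \Big[\, \mathcal{L}_P\big(P(x_i),\, y_i\big)
  \;+\; \lambda\, \mathcal{L}_D\big(D(P(x_i)),\, x_i\big) \Big]
```

The second term penalizes any super-resolved image $P(x_i)$ that does not map back to the original LR input, which is how the dual mapping narrows the set of admissible SR mappings.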

Improving Corruption and Adversarial Robustness by Enhancing Weak Subnets

Jan 30, 2022

Yong Guo, David Stutz, Bernt Schiele

Abstract: Deep neural networks have achieved great success in many computer vision tasks. However, deep networks have been shown to be very susceptible to corrupted or adversarial images, which often result in significant performance drops. In this paper, we observe that weak subnetwork (subnet) performance is correlated with a lack of robustness against corruptions and adversarial attacks. Based on that observation, we propose a novel robust training method which explicitly identifies and enhances weak subnets (EWS) during training to improve robustness. Specifically, we develop a search algorithm to find particularly weak subnets and propose to explicitly strengthen them via knowledge distillation from the full network. We show that our EWS greatly improves the robustness against corrupted images as well as the accuracy on clean data. Being complementary to many state-of-the-art data augmentation approaches, EWS consistently improves corruption robustness on top of many of these approaches. Moreover, EWS is also able to boost the adversarial robustness when combined with popular adversarial training methods.
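
As a rough illustration of "find a weak subnet, then distill from the full network", the sketch below scores a handful of subnets by the KL divergence between the full network's softmax output and each subnet's output, and picks the largest. This is an assumption for illustration only: the abstract does not state how weakness is measured or how the search is run, and the logits and subnet names below are made up.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions with positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def weakest_subnet(full_logits, subnet_logits):
    """Return the subnet that disagrees most with the full network,
    together with the per-subnet distillation losses."""
    teacher = softmax(full_logits)
    losses = {
        name: kl_divergence(teacher, softmax(logits))
        for name, logits in subnet_logits.items()
    }
    return max(losses, key=losses.get), losses

# Hypothetical logits for one input and three candidate subnets.
full_logits = [2.0, 0.5, -1.0]
subnet_logits = {
    "s1": [2.0, 0.5, -1.0],   # agrees with the full network
    "s2": [1.5, 0.7, -0.8],   # slightly off
    "s3": [-1.0, 0.5, 2.0],   # predicts the opposite class
}
weakest, losses = weakest_subnet(full_logits, subnet_logits)
print(weakest, losses)
```

In a training loop, the loss of the selected subnet would be added to the usual objective so that the weakest part of the network is pulled toward the full network's predictions.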

Content-Aware Convolutional Neural Networks

Jul 23, 2021

Yong Guo, Yaofo Chen, Mingkui Tan, Kui Jia, Jian Chen, Jingdong Wang

Abstract: Convolutional Neural Networks (CNNs) have achieved great success due to the powerful feature learning ability of convolution layers. Specifically, the standard convolution traverses the input images/features using a sliding window scheme to extract features. However, not all the windows contribute equally to the prediction results of CNNs. In practice, the convolutional operation on some of the windows (e.g., smooth windows that contain very similar pixels) can be very redundant and may introduce noise into the computation. Such redundancy may not only deteriorate the performance but also incur unnecessary computational cost. Thus, it is important to reduce the computational redundancy of convolution to improve the performance. To this end, we propose a Content-aware Convolution (CAC) that automatically detects the smooth windows and applies a 1x1 convolutional kernel to replace the original large kernel. In this sense, we are able to effectively avoid the redundant computation on similar pixels. By replacing the standard convolution in CNNs with our CAC, the resultant models yield significantly better performance and lower computational cost than the baseline models with the standard convolution. More critically, we are able to dynamically allocate suitable computation resources according to the data smoothness of different images, making content-aware computation possible. Extensive experiments on various computer vision tasks demonstrate the superiority of our method over existing methods.

* Accepted by Neural Networks
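
The idea of replacing the large kernel with a 1x1 operation on smooth windows can be sketched on a single-channel image. The smoothness test (window variance below a threshold) and the choice of the 1x1 weight (the sum of the large kernel's entries) are assumptions made for this illustration; the abstract does not specify either.

```python
def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def content_aware_conv(image, kernel, point_weight, threshold=1e-3):
    """'Valid' 2D convolution (no padding, stride 1) that uses a single
    multiply on the centre pixel whenever the window is smooth.
    Returns (output, number_of_smooth_windows)."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    out, smooth = [], 0
    for i in range(h - k + 1):
        row = []
        for j in range(w - k + 1):
            window = [image[i + di][j + dj] for di in range(k) for dj in range(k)]
            if variance(window) < threshold:
                # Smooth window: 1x1 path, one multiply instead of k*k.
                row.append(point_weight * image[i + k // 2][j + k // 2])
                smooth += 1
            else:
                # Textured window: full k x k path.
                row.append(sum(kernel[di][dj] * image[i + di][j + dj]
                               for di in range(k) for dj in range(k)))
        out.append(row)
    return out, smooth

# A 4x4 image that is flat on the left and has an edge on the right.
image = [[1, 1, 1, 9] for _ in range(4)]
kernel = [[1 / 9] * 3 for _ in range(3)]           # 3x3 averaging kernel
point_weight = sum(sum(row) for row in kernel)     # assumed 1x1 weight
out, smooth = content_aware_conv(image, kernel, point_weight)
print(out, "smooth windows:", smooth)
```

With this choice of `point_weight`, the 1x1 path gives the same result as the full kernel on a perfectly flat window, so only the amount of computation changes there.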

Improvement of image classification by multiple optical scattering

Jul 12, 2021

Xinyu Gao, Yi Li, Yanqing Qiu, Bangning Mao, Miaogen Chen, Yanlong Meng, Chunliu Zhao, Juan Kang, Yong Guo, Changyu Shen

Abstract: Multiple optical scattering occurs when light propagates in a non-uniform medium. During multiple scattering, images are distorted and the spatial information they carry becomes scrambled. However, the image information is not lost but is present in the form of speckle patterns (SPs). In this study, we built an optical random scattering system based on an LCD and an RGB laser source. We found that image classification can be improved with the help of random scattering, which can be regarded as a feedforward neural network that extracts features from images. Combined with ridge classification deployed on a computer, we achieved excellent classification accuracy, higher than 94%, on a variety of data sets covering medical, agricultural, environmental protection, and other fields. In addition, the proposed optical scattering system has the advantages of high speed, low power consumption, and miniaturization, which make it suitable for deployment in edge computing applications.

AdaXpert: Adapting Neural Architecture for Growing Data

Jul 01, 2021

Shuaicheng Niu, Jiaxiang Wu, Guanghui Xu, Yifan Zhang, Yong Guo, Peilin Zhao, Peng Wang, Mingkui Tan

Abstract: In real-world applications, data often arrive incrementally, and both the data volume and the number of classes may increase dynamically. This poses a critical challenge for learning: given the increasing data volume or number of classes, one has to promptly adjust the neural model capacity to obtain promising performance. Existing methods either ignore the growing nature of data or seek to independently search an optimal architecture for a given dataset, and thus are incapable of promptly adjusting the architectures for the changed data. To address this, we present a neural architecture adaptation method, namely Adaptation eXpert (AdaXpert), to efficiently adjust previous architectures on the growing data. Specifically, we introduce an architecture adjuster to generate a suitable architecture for each data snapshot, based on the previous architecture and the extent of the difference between the current and previous data distributions. Furthermore, we propose an adaptation condition to determine the necessity of adjustment, thereby avoiding unnecessary and time-consuming adjustments. Extensive experiments on two growth scenarios (increasing data volume and number of classes) demonstrate the effectiveness of the proposed method.

* Accepted by ICML 2021
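
An adaptation condition of the kind described in this abstract can be sketched as a threshold on how far the data distribution has moved between two snapshots. The distance used here (total variation between class-label histograms) and the threshold value are assumptions for illustration; the abstract does not say which measure or threshold AdaXpert uses.

```python
from collections import Counter

def label_distribution(labels):
    """Empirical class distribution of a list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return {c: n / total for c, n in counts.items()}

def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    classes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in classes)

def needs_adaptation(prev_labels, curr_labels, threshold=0.2):
    """Return (adjust?, gap): adjust only if the snapshots differ enough."""
    gap = total_variation(label_distribution(prev_labels),
                          label_distribution(curr_labels))
    return gap > threshold, gap

prev = ["cat"] * 50 + ["dog"] * 50
more_of_the_same = prev * 2                  # more data, same class mix
new_class = prev + ["bird"] * 100            # a new class appears

print(needs_adaptation(prev, more_of_the_same))
print(needs_adaptation(prev, new_class))
```

Note that a label histogram cannot detect growth in volume alone, so a fuller condition would also need to account for dataset size; the sketch only shows the skip-or-adjust decision.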

Contrastive Neural Architecture Search with Neural Architecture Comparators

Apr 06, 2021

Yaofo Chen, Yong Guo, Qi Chen, Minli Li, Wei Zeng, Yaowei Wang, Mingkui Tan

Abstract: One of the key steps in Neural Architecture Search (NAS) is to estimate the performance of candidate architectures. Existing methods either directly use the validation performance or learn a predictor to estimate the performance. However, these methods can be either computationally expensive or very inaccurate, which may severely affect the search efficiency and performance. Moreover, as it is very difficult to annotate architectures with accurate performance on specific tasks, learning a promising performance predictor is often non-trivial due to the lack of labeled data. In this paper, we argue that it may not be necessary to estimate the absolute performance for NAS. Instead, we may only need to know whether an architecture is better than a baseline one. However, how to exploit this comparison information as the reward and how to make good use of the limited labeled data remain two great challenges. To address them, we propose a novel Contrastive Neural Architecture Search (CTNAS) method which performs architecture search by taking the comparison results between architectures as the reward. Specifically, we design and learn a Neural Architecture Comparator (NAC) to compute the probability of candidate architectures being better than a baseline one. Moreover, we present a baseline updating scheme to improve the baseline iteratively in a curriculum learning manner. More critically, we theoretically show that learning NAC is equivalent to optimizing the ranking over architectures. Extensive experiments in three search spaces demonstrate the superiority of our CTNAS over existing methods.

* Accepted by CVPR 2021. The code is available at https://github.com/chenyaofo/CTNAS
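
The comparison-as-reward idea can be sketched without the official code linked above. In the toy below, `comparator` is a hypothetical stand-in for a learned comparator: it turns the difference of two hidden quality scores into a probability with a sigmoid. The hidden scores, the candidate names, and the rule "replace the baseline whenever the probability exceeds 0.5" are all assumptions for illustration.

```python
import math

def comparator(score_a, score_b, temperature=1.0):
    """Toy comparator: probability that architecture a beats architecture b."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b) / temperature))

def search_with_baseline(scores, baseline):
    """Visit candidates in insertion order. The reward of each candidate is
    the probability that it beats the current baseline; the baseline is
    replaced whenever a candidate is more likely better than not."""
    rewards = {}
    for name, quality in scores.items():
        p = comparator(quality, scores[baseline])
        rewards[name] = p
        if p > 0.5:
            baseline = name
    return rewards, baseline

# Hypothetical hidden quality of four candidate architectures.
scores = {"a": 0.0, "b": 1.0, "c": 0.5, "d": 2.0}
rewards, final_baseline = search_with_baseline(scores, baseline="a")
print(rewards, final_baseline)
```

Because the baseline keeps improving, later candidates face a harder comparison than earlier ones, which is the curriculum-like effect the abstract attributes to the baseline updating scheme.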

Pareto-Frontier-aware Neural Architecture Generation for Diverse Budgets

Feb 27, 2021

Yong Guo, Yaofo Chen, Yin Zheng, Qi Chen, Peilin Zhao, Jian Chen, Junzhou Huang, Mingkui Tan

Abstract: Designing feasible and effective architectures under diverse computation budgets incurred by different applications/devices is essential for deploying deep models in practice. Existing methods often perform an independent architecture search for each target budget, which is both inefficient and unnecessary. Moreover, repeated independent searches inevitably ignore the common knowledge among different search processes and hamper the search performance. To address these issues, we seek to train a general architecture generator that automatically produces effective architectures for an arbitrary budget merely via model inference. To this end, we propose a Pareto-Frontier-aware Neural Architecture Generator (NAG) which takes an arbitrary budget as input and produces the Pareto optimal architecture for the target budget. We train NAG by learning the Pareto frontier (i.e., the set of Pareto optimal architectures) over model performance and computational cost (e.g., latency). Extensive experiments on three platforms (i.e., mobile, CPU, and GPU) show the superiority of the proposed method over existing NAS methods.

* 8 pages
