Yang Li

Efficient End-to-End AutoML via Scalable Search Space Decomposition

Jun 19, 2022
Yang Li, Yu Shen, Wentao Zhang, Ce Zhang, Bin Cui

End-to-end AutoML, which automatically searches for ML pipelines in a space induced by feature engineering, algorithm/model selection, and hyper-parameter tuning, has attracted intensive interest from both academia and industry. Existing AutoML systems, however, suffer from scalability issues when applied to application domains with large, high-dimensional search spaces. We present VolcanoML, a scalable and extensible framework that facilitates systematic exploration of large AutoML search spaces. VolcanoML introduces and implements basic building blocks that decompose a large search space into smaller ones, and allows users to utilize these building blocks to compose an execution plan for the AutoML problem at hand. VolcanoML further supports a Volcano-style execution model, akin to the one supported by modern database systems, to execute the constructed plan. Our evaluation demonstrates that VolcanoML not only raises the level of expressiveness for search space decomposition in AutoML, but also leads to the discovery of decomposition strategies that are significantly more efficient than the ones employed by state-of-the-art AutoML systems such as auto-sklearn.

* VLDB Journal, 2022
* Extended paper for VolcanoML (Li et al., VLDB 2021)/Mindware. arXiv admin note: substantial text overlap with arXiv:2107.08861

NAFS: A Simple yet Tough-to-beat Baseline for Graph Representation Learning

Jun 17, 2022
Wentao Zhang, Zeang Sheng, Mingyu Yang, Yang Li, Yu Shen, Zhi Yang, Bin Cui

Recently, graph neural networks (GNNs) have shown prominent performance in graph representation learning by leveraging knowledge from both graph structure and node features. However, most of them have two major limitations. First, GNNs can learn higher-order structural information by stacking more layers but cannot handle large depths due to the over-smoothing issue. Second, it is not easy to apply these methods to large graphs due to their expensive computation cost and high memory usage. In this paper, we present node-adaptive feature smoothing (NAFS), a simple non-parametric method that constructs node representations without parameter learning. NAFS first extracts the features of each node together with its neighbors at different hops by feature smoothing, and then adaptively combines the smoothed features. In addition, the constructed node representation can be further enhanced by an ensemble of smoothed features extracted via different smoothing strategies. We conduct experiments on four benchmark datasets in two different application scenarios: node clustering and link prediction. Remarkably, NAFS with feature ensemble outperforms the state-of-the-art GNNs on these tasks and mitigates the aforementioned two limitations of most learning-based GNN counterparts.

* ICML 2022
* 17 pages, 8 figures
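
The two steps named in the abstract, feature smoothing over several hops followed by a per-node adaptive combination, can be sketched roughly as below. This is a minimal illustration and not the authors' implementation: the symmetric normalization with self-loops is a common choice assumed here, and the weighting rule (a softmax over the cosine similarity between each smoothed feature and the raw feature) is a hypothetical stand-in for whatever rule the paper defines. The function names `smooth_features` and `combine_adaptively` are invented for this example.

```python
import numpy as np


def smooth_features(adj, x, num_hops):
    """Return [X, AX, A^2 X, ...] where A is the normalized adjacency.

    Assumption: A = D^{-1/2} (adj + I) D^{-1/2}, a common smoothing operator.
    """
    n = adj.shape[0]
    a = adj + np.eye(n)                      # add self-loops so every degree >= 1
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    a_hat = a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    feats = [x]
    for _ in range(num_hops):
        feats.append(a_hat @ feats[-1])      # one more hop of smoothing
    return feats


def combine_adaptively(feats):
    """Combine per-hop features with per-node softmax weights.

    Assumption: the score is the cosine similarity to the raw features;
    this is an illustrative rule, not the one from the paper.
    """
    x0 = feats[0]
    scores = []
    for f in feats:
        num = (f * x0).sum(axis=1)
        den = np.linalg.norm(f, axis=1) * np.linalg.norm(x0, axis=1) + 1e-12
        scores.append(num / den)
    scores = np.stack(scores, axis=1)                      # shape (n, hops + 1)
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                      # rows sum to 1
    stacked = np.stack(feats, axis=1)                      # shape (n, hops + 1, d)
    return (w[:, :, None] * stacked).sum(axis=1), w
```

No parameters are trained anywhere in this sketch, which matches the non-parametric framing of the abstract; each node simply receives its own mixture of shallow and deep smoothed features.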

DFG-NAS: Deep and Flexible Graph Neural Architecture Search

Jun 17, 2022
Wentao Zhang, Zheyu Lin, Yu Shen, Yang Li, Zhi Yang, Bin Cui

Graph neural networks (GNNs) have been intensively applied to various graph-based applications. Despite their success, manually designing well-behaved GNNs requires immense human expertise, and it is thus inefficient to discover the potentially optimal data-specific GNN architecture. This paper proposes DFG-NAS, a new neural architecture search (NAS) method that enables the automatic search of very deep and flexible GNN architectures. Unlike most existing methods that focus on micro-architectures, DFG-NAS highlights another level of design: the search for macro-architectures on how atomic propagation (P) and transformation (T) operations are integrated and organized into a GNN. To this end, DFG-NAS proposes a novel search space for P-T permutations and combinations based on message-passing dis-aggregation, defines four custom-designed macro-architecture mutations, and employs an evolutionary algorithm to conduct an efficient and effective search. Empirical studies on four node classification tasks demonstrate that DFG-NAS outperforms state-of-the-art manual designs and NAS methods of GNNs.

* ICML 2022
* 13 pages, 7 figures

Level 2 Autonomous Driving on a Single Device: Diving into the Devils of Openpilot

Jun 16, 2022
Li Chen, Tutian Tang, Zhitian Cai, Yang Li, Penghao Wu, Hongyang Li, Jianping Shi, Junchi Yan, Yu Qiao

Equipped with a wide span of sensors, predominant autonomous driving solutions are becoming more modular-oriented for safe system design. Though these sensors have laid a solid foundation, most mass-production solutions to date still fall into the L2 phase. Among these, Comma.ai stands out, claiming that a single $999 aftermarket device with one camera and board inside can handle L2 scenarios. Together with the open-sourced software of the entire system released by Comma.ai, the project is named Openpilot. Is it possible? If so, how is it made possible? With curiosity in mind, we deep-dive into Openpilot and conclude that its key to success is the end-to-end system design instead of a conventional modular framework. The model is called Supercombo, and it can predict the ego vehicle's future trajectory and other road semantics on the fly from monocular input. Unfortunately, the training process and the massive amount of data needed to make all this work are not publicly available. To enable an intensive investigation, we try to reimplement the training details and test the pipeline on public benchmarks. The refactored network proposed in this work is referred to as OP-Deepdive. For a fair comparison of our version to the original Supercombo, we introduce a dual-model deployment scheme to test the driving performance in the real world. Experimental results on nuScenes, Comma2k19, CARLA, and in-house realistic scenarios verify that a low-cost device can indeed achieve most L2 functionalities and be on par with the original Supercombo model. In this report, we would like to share our latest findings, shed some light on the new perspective of end-to-end autonomous driving from an industrial product-level side, and potentially inspire the community to continue improving the performance. Our code and benchmarks are at https://github.com/OpenPerceptionX/Openpilot-Deepdive.

* Tech report. Project page: https://github.com/OpenPerceptionX/Openpilot-Deepdive

Graph Attention Multi-Layer Perceptron

Jun 09, 2022
Wentao Zhang, Ziqi Yin, Zeang Sheng, Yang Li, Wen Ouyang, Xiaosen Li, Yangyu Tao, Zhi Yang, Bin Cui

Graph neural networks (GNNs) have achieved great success in many graph-based applications. However, the enormous size and high sparsity level of graphs hinder their applications under industrial scenarios. Although some scalable GNNs are proposed for large-scale graphs, they adopt a fixed K-hop neighborhood for each node, thus facing the over-smoothing issue when adopting large propagation depths for nodes within sparse regions. To tackle the above issue, we propose a new GNN architecture, Graph Attention Multi-Layer Perceptron (GAMLP), which can capture the underlying correlations between different scales of graph knowledge. We have deployed GAMLP in Tencent with the Angel platform, and we further evaluate GAMLP on both real-world datasets and large-scale industrial datasets. Extensive experiments on these 14 graph datasets demonstrate that GAMLP achieves state-of-the-art performance while enjoying high scalability and efficiency. Specifically, it outperforms GAT by 1.3% regarding predictive accuracy on our large-scale Tencent Video dataset while achieving up to 50× training speedup. Besides, it ranks top-1 on both the leaderboards of the largest homogeneous and heterogeneous graph (i.e., ogbn-papers100M and ogbn-mag) of Open Graph Benchmark.

* In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2022
* 11 pages, 7 figures. arXiv admin note: text overlap with arXiv:2108.10097

Solving the Spike Feature Information Vanishing Problem in Spiking Deep Q Network with Potential Based Normalization

Jun 08, 2022
Yinqian Sun, Yi Zeng, Yang Li

Brain-inspired spiking neural networks (SNNs) have been successfully applied to many pattern recognition domains. Deep SNN-based structures have achieved considerable results in perceptual tasks such as image classification and target detection. However, the application of deep SNNs to reinforcement learning (RL) tasks remains an open problem. Although there have been previous studies on combining SNNs and RL, most of them focus on robotic control problems with shallow networks or use ANN-SNN conversion methods to implement a spiking deep Q network (SDQN). In this work, we mathematically analyze the problem of vanishing spiking signal features in SDQN and propose a potential-based layer normalization (pbLN) method to directly train spiking deep Q networks. Experiments show that, compared with state-of-the-art ANN-SNN conversion methods and other SDQN works, the proposed pbLN spiking deep Q networks (PL-SDQN) achieve better performance on Atari game tasks.

TransBO: Hyperparameter Optimization via Two-Phase Transfer Learning

Jun 06, 2022
Yang Li, Yu Shen, Huaijun Jiang, Wentao Zhang, Zhi Yang, Ce Zhang, Bin Cui

With the extensive application of machine learning models, automatic hyperparameter optimization (HPO) has become increasingly important. Motivated by the tuning behaviors of human experts, it is intuitive to leverage auxiliary knowledge from past HPO tasks to accelerate the current HPO task. In this paper, we propose TransBO, a novel two-phase transfer learning framework for HPO, which simultaneously addresses the complementary nature of source tasks and the dynamics of knowledge aggregation. This framework extracts and aggregates source and target knowledge jointly and adaptively, where the weights can be learned in a principled manner. Extensive experiments, including static and dynamic transfer learning settings and neural architecture search, demonstrate the superiority of TransBO over the state of the art.

* Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2022)
* 9 pages and 2 extra pages of appendix

Transfer Learning based Search Space Design for Hyperparameter Tuning

Jun 06, 2022
Yang Li, Yu Shen, Huaijun Jiang, Tianyi Bai, Wentao Zhang, Ce Zhang, Bin Cui

The tuning of hyperparameters becomes increasingly important as machine learning (ML) models have been extensively applied in data mining applications. Among various approaches, Bayesian optimization (BO) is a successful methodology to tune hyperparameters automatically. While traditional methods optimize each tuning task in isolation, there has been recent interest in speeding up BO by transferring knowledge across previous tasks. In this work, we introduce an automatic method to design the BO search space with the aid of tuning history from past tasks. This simple yet effective approach can be used to endow many existing BO methods with transfer learning capabilities. In addition, it enjoys three advantages: universality, generality, and safeness. Extensive experiments show that our approach considerably boosts BO by designing a promising and compact search space instead of using the entire space, and outperforms the state of the art on a wide range of benchmarks, including machine learning and deep learning tuning tasks, and neural architecture search.

* Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2022)
* 9 pages and 2 extra pages for appendix
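
To make the idea of a compact search space built from past tuning history concrete, the sketch below shrinks a box-shaped space to the region spanned by the best configurations of earlier tasks. This bounding-box rule is one simple, generic approach assumed here for illustration; the abstract does not state that the paper uses it. The function name `compact_box`, the `top_frac` parameter, and the data layout (one `(configs, losses)` pair per past task) are all hypothetical.

```python
import numpy as np


def compact_box(histories, full_lower, full_upper, top_frac=0.2):
    """Shrink a box search space using past tuning histories.

    histories: list of (configs, losses) pairs, one per past task, where
    configs has shape (n_i, d) and losses has shape (n_i,), lower is better.
    Returns per-dimension (lower, upper) bounds clipped to the full space.
    """
    best = []
    for configs, losses in histories:
        k = max(1, int(np.ceil(top_frac * len(losses))))
        idx = np.argsort(losses)[:k]          # indices of the k best trials
        best.append(np.asarray(configs)[idx])
    best = np.vstack(best)
    lower = np.maximum(best.min(axis=0), full_lower)
    upper = np.minimum(best.max(axis=0), full_upper)
    return lower, upper
```

Any BO method could then sample candidates inside the returned bounds rather than the full space, which is the sense in which such a step can be bolted onto existing optimizers.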

Data Encryption based on 9D Complex Chaotic System with Quaternion for Smart Grid

Jun 03, 2022
Fangfang Zhang, Zhe Huang, Lei Kou, Yang Li, Maoyong Cao, Fengying Ma

With the development of the smart grid, the operation and control of the power system is realized through the power communication network. Power production and enterprise management in particular involve a large amount of sensitive information, and the requirements for data security and real-time transmission are steadily rising. In this paper, a new 9D complex chaotic system with quaternion is proposed for the encryption of smart grid data. First, the new system is presented, and its attractors, bifurcation diagram, complexity, and 0-1 test are analyzed. Second, pseudo-random sequences generated by the new chaotic system are used to encrypt power data. Finally, the proposed encryption algorithm is verified with power data and images in the smart grid, which can ensure encryption security and real-time performance. The verification results show that the proposed encryption scheme is technically feasible and available for power data and image encryption in the smart grid.

* Accepted by Chinese Physics B
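
The general pattern described in the abstract, deriving a pseudo-random sequence from a chaotic system and using it to encrypt data, can be illustrated with a toy stream cipher. The sketch below deliberately swaps in the one-dimensional logistic map for the paper's 9D complex chaotic system, whose equations are not given in the abstract. The function names, the parameter `r = 3.99`, the burn-in length, and the byte quantization are all assumptions made for this example, and the construction is for illustration only.

```python
def logistic_keystream(x0, n_bytes, r=3.99, burn_in=100):
    """Generate n_bytes of keystream from the logistic map x -> r*x*(1-x).

    x0 in (0, 1) acts as the secret key. The logistic map is a stand-in
    for the paper's 9D system, which is not reproduced here.
    """
    x = x0
    for _ in range(burn_in):              # discard the initial transient
        x = r * x * (1.0 - x)
    out = bytearray()
    for _ in range(n_bytes):
        x = r * x * (1.0 - x)
        out.append(int(x * 256) & 0xFF)   # quantize the state to one byte
    return bytes(out)


def xor_cipher(data, x0):
    """XOR data with the chaotic keystream; applying it twice decrypts."""
    keystream = logistic_keystream(x0, len(data))
    return bytes(d ^ k for d, k in zip(data, keystream))
```

Because XOR is its own inverse, calling `xor_cipher` on the ciphertext with the same `x0` recovers the original bytes, so a receiver only needs the shared initial condition to decrypt.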

Do Deep Neural Networks Always Perform Better When Eating More Data?

May 30, 2022
Jiachen Yang, Zhuo Zhang, Yicheng Gong, Shukun Ma, Xiaolan Guo, Yue Yang, Shuai Xiao, Jiabao Wen, Yang Li, Xinbo Gao, Wen Lu, Qinggang Meng

Data has now become a shortcoming of deep learning. Researchers in their own fields share the thinking that "deep neural networks might not always perform better when they eat more data," which still lacks experimental validation and a convincing guiding theory. To fill this gap, we design experiments under independent and identically distributed (IID) and out-of-distribution (OOD) settings, which give strong answers. For the purpose of guidance, based on the discussion of results, two theories are proposed. Under the IID condition, the amount of information determines the effectiveness of each sample, while the contribution of samples and the difference between classes determine the amount of sample information and the amount of class information. Under the OOD condition, the cross-domain degree of samples determines their contributions, and the bias-fitting caused by irrelevant elements is a significant factor in cross-domain behavior. The above theories provide guidance from the perspective of data, which can promote a wide range of practical applications of artificial intelligence.
