Frank Hutter

Meta-Learning of Neural Architectures for Few-Shot Learning

Nov 25, 2019
Thomas Elsken, Benedikt Staffler, Jan Hendrik Metzen, Frank Hutter

Recent progress in neural architecture search (NAS) has allowed scaling the automated design of neural architectures to real-world domains such as object detection and semantic segmentation. However, one prerequisite for the application of NAS is large amounts of labeled data and compute resources. This renders its application challenging in few-shot learning scenarios, where many related tasks need to be learned, each with limited amounts of data and compute time. Thus, few-shot learning is typically done with a fixed neural architecture. To improve upon this, we propose MetaNAS, the first method that fully integrates NAS with gradient-based meta-learning. MetaNAS optimizes a meta-architecture along with the meta-weights during meta-training. During meta-testing, architectures can be adapted to a novel task with a few steps of the task optimizer; that is, task adaptation becomes computationally cheap and requires only little data per task. Moreover, MetaNAS is agnostic in that it can be used with arbitrary model-agnostic meta-learning algorithms and arbitrary gradient-based NAS methods. Empirical results on standard few-shot classification benchmarks show that MetaNAS with a combination of DARTS and REPTILE yields state-of-the-art results.
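
As a rough illustration of the gradient-based meta-learning half of this recipe, the sketch below shows only the first-order REPTILE update on a toy scalar problem. It is not MetaNAS: there is no architecture search, and the names `inner_sgd` and `reptile`, the quadratic task loss, the task distribution, and all step sizes are assumptions made up for this example.

```python
import random


def inner_sgd(w, target, steps=5, lr=0.1):
    """Task adaptation: a few SGD steps on the toy loss (w - target)^2."""
    for _ in range(steps):
        grad = 2.0 * (w - target)
        w -= lr * grad
    return w


def reptile(meta_w=0.0, meta_lr=0.1, iterations=500, seed=0):
    """Meta-training: move the meta-weight toward each task-adapted weight."""
    rng = random.Random(seed)
    for _ in range(iterations):
        # Hypothetical task distribution: targets scattered around 3.0.
        target = rng.gauss(3.0, 0.5)
        adapted = inner_sgd(meta_w, target)
        # Reptile update: interpolate toward the adapted weight.
        meta_w += meta_lr * (adapted - meta_w)
    return meta_w


if __name__ == "__main__":
    print("meta-weight after meta-training:", reptile())
```

In this toy setting the meta-weight drifts toward the centre of the task distribution, so a new task needs only a few inner steps to adapt. In MetaNAS, by the abstract's description, the same kind of update would also cover architecture parameters.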

OpenML-Python: an extensible Python API for OpenML

Nov 06, 2019
Matthias Feurer, Jan N. van Rijn, Arlind Kadra, Pieter Gijsbers, Neeratyoy Mallik, Sahithya Ravi, Andreas Müller, Joaquin Vanschoren, Frank Hutter

OpenML is an online platform for open science collaboration in machine learning, used to share datasets and results of machine learning experiments. In this paper we introduce OpenML-Python, a client API for Python, opening up the OpenML platform for a wide range of Python-based tools. It provides easy access to all datasets, tasks and experiments on OpenML from within Python. It also provides functionality to conduct machine learning experiments, upload the results to OpenML, and reproduce results which are stored on OpenML. Furthermore, it comes with a scikit-learn plugin and a plugin mechanism to easily integrate other machine learning libraries written in Python into the OpenML ecosystem. Source code and documentation are available at https://github.com/openml/openml-python/.

Neural Architecture Evolution in Deep Reinforcement Learning for Continuous Control

Oct 28, 2019
Jörg K. H. Franke, Gregor Köhler, Noor Awad, Frank Hutter

Current Deep Reinforcement Learning algorithms still heavily rely on handcrafted neural network architectures. We propose a novel approach to automatically find strong topologies for continuous control tasks while only adding a minor overhead in terms of interactions in the environment. To achieve this, we combine Neuroevolution techniques with off-policy training and propose a novel architecture mutation operator. Experiments on five continuous control benchmarks show that the proposed Actor-Critic Neuroevolution algorithm often outperforms the strong Actor-Critic baseline and is capable of automatically finding topologies in a sample-efficient manner which would otherwise have to be found by expensive architecture search.

* Accepted to NeurIPS'19 MetaLearn Workshop

Probabilistic Rollouts for Learning Curve Extrapolation Across Hyperparameter Settings

Oct 10, 2019
Matilde Gargiani, Aaron Klein, Stefan Falkner, Frank Hutter

We propose probabilistic models that can extrapolate learning curves of iterative machine learning algorithms, such as stochastic gradient descent for training deep networks, based on training data with variable-length learning curves. We study instantiations of this framework based on random forests and Bayesian recurrent neural networks. Our experiments show that these models yield better predictions than state-of-the-art models from the hyperparameter optimization literature when extrapolating the performance of neural networks trained with different hyperparameter settings.

Understanding and Robustifying Differentiable Architecture Search

Sep 20, 2019
Arber Zela, Thomas Elsken, Tonmoy Saikia, Yassine Marrakchi, Thomas Brox, Frank Hutter

Differentiable Architecture Search (DARTS) has attracted a lot of attention due to its simplicity and small search costs achieved by a continuous relaxation and an approximation of the resulting bi-level optimization problem. However, DARTS does not work robustly for new problems: we identify a wide range of search spaces for which DARTS yields degenerate architectures with very poor test performance. We study this failure mode and show that, while DARTS successfully minimizes validation loss, the found solutions generalize poorly when they coincide with high validation loss curvature in the space of architectures. We show that by adding one of various types of regularization we can robustify DARTS to find solutions with smaller Hessian spectrum and with better generalization properties. Based on these observations we propose several simple variations of DARTS that perform substantially more robustly in practice. Our observations are robust across five search spaces on three image classification tasks and also hold for the very different domains of disparity estimation (a dense regression task) and language modelling. We provide our implementation and scripts to facilitate reproducibility.
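
The continuous relaxation mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the three scalar candidate operations and the names `mixed_op` and `discretize` are hypothetical, and the bi-level optimization of the architecture parameters is omitted entirely.

```python
import math


def softmax(alphas):
    """Numerically stable softmax over architecture parameters."""
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate operations on a scalar input (toy stand-ins for
# the convolutions, pooling layers, and skip connections of a real search space).
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}


def mixed_op(x, alphas):
    """Relaxed edge: softmax-weighted sum of all candidate operations."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))


def discretize(alphas):
    """After search, keep only the operation with the largest parameter."""
    names = list(CANDIDATE_OPS)
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return names[best]


if __name__ == "__main__":
    alphas = [0.1, 2.0, -1.0]
    print(mixed_op(1.0, alphas), discretize(alphas))
```

Because `mixed_op` is differentiable in `alphas`, the architecture parameters can be trained by gradient descent on a validation loss. The final `discretize` step is where a relaxed solution is turned into a concrete architecture.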

* 28 pages, 25 figures

MDP Playground: Meta-Features in Reinforcement Learning

Sep 17, 2019
Raghu Rajan, Frank Hutter

Reinforcement Learning (RL) algorithms usually assume their environment to be a Markov Decision Process (MDP). Additionally, they do not try to identify specific features of environments which could help them perform better. Here, we present a few key meta-features of environments (delayed rewards, specific reward sequences, sparsity of rewards, and stochasticity of environments) which may violate the MDP assumptions; adapting to them should help RL agents perform better. While it is very time-consuming to run RL algorithms on standard benchmarks, we define a parameterised collection of fast-to-run toy benchmarks in OpenAI Gym by varying these meta-features. Despite their toy nature and low compute requirements, we show that these benchmarks present substantial difficulties to current RL algorithms. Furthermore, since we can generate environments with a desired value for each of the meta-features, we have fine-grained control over the environments' difficulty and also have the ground truth available for evaluating algorithms. We believe that devising algorithms that can detect such meta-features of environments and adapt to them will be key to creating robust RL algorithms that work in a variety of different real-world problems.
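
To make one of these meta-features concrete, the sketch below builds a tiny environment in which rewards are paid out a configurable number of steps late. It is a hypothetical standalone toy, not the MDP Playground API and not an OpenAI Gym environment: the class name `DelayedRewardChain`, its parameters, and the reward rule are assumptions for illustration only.

```python
from collections import deque


class DelayedRewardChain:
    """Toy chain: action 1 earns reward 1.0, paid out `delay` steps later."""

    def __init__(self, delay=2, horizon=5):
        self.delay = delay
        self.horizon = horizon
        self.reset()

    def reset(self):
        self.t = 0
        # Queue of rewards earned but not yet delivered to the agent.
        self.pending = deque([0.0] * self.delay)
        return self.t

    def step(self, action):
        self.pending.append(1.0 if action == 1 else 0.0)
        reward = self.pending.popleft()  # reward earned `delay` steps ago
        self.t += 1
        done = self.t >= self.horizon
        return self.t, reward, done


def rollout(env, policy):
    """Run one episode and return the sequence of observed rewards."""
    obs, done, rewards = env.reset(), False, []
    while not done:
        obs, reward, done = env.step(policy(obs))
        rewards.append(reward)
    return rewards


if __name__ == "__main__":
    print(rollout(DelayedRewardChain(delay=2), lambda obs: 1))
```

With `delay=0` the agent sees each reward immediately. Larger delays weaken the link between an action and the reward observed right after it, which is the kind of difficulty knob the abstract describes.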

* Submitted to NIPS Deep RL Workshop

Best Practices for Scientific Research on Neural Architecture Search

Sep 05, 2019
Marius Lindauer, Frank Hutter

We describe a set of best practices for the young field of neural architecture search (NAS), which lead to the best practices checklist for NAS available at http://automl.org/nas_checklist.pdf.

Towards Assessing the Impact of Bayesian Optimization's Own Hyperparameters

Aug 19, 2019
Marius Lindauer, Matthias Feurer, Katharina Eggensperger, André Biedenkapp, Frank Hutter

Bayesian Optimization (BO) is a common approach for hyperparameter optimization (HPO) in automated machine learning. Although it is well accepted that HPO is crucial to obtain well-performing machine learning models, tuning BO's own hyperparameters is often neglected. In this paper, we empirically study the impact of optimizing BO's own hyperparameters and the transferability of the found settings using a wide range of benchmarks, including artificial functions, HPO, and HPO combined with neural architecture search. In particular, we show (i) that tuning can improve the any-time performance of different BO approaches, (ii) that optimized BO settings also perform well on similar problems and (iii) partially even on problems from other problem families, and (iv) which BO hyperparameters are most important.

* Accepted at DSO workshop (as part of IJCAI'19)

BOAH: A Tool Suite for Multi-Fidelity Bayesian Optimization & Analysis of Hyperparameters

Aug 16, 2019
Marius Lindauer, Katharina Eggensperger, Matthias Feurer, André Biedenkapp, Joshua Marben, Philipp Müller, Frank Hutter

Hyperparameter optimization and neural architecture search can become prohibitively expensive for regular black-box Bayesian optimization because the training and evaluation of a single model can easily take several hours. To overcome this, we introduce a comprehensive tool suite for effective multi-fidelity Bayesian optimization and the analysis of its runs. The suite, written in Python, provides a simple way to specify complex design spaces, a robust and efficient combination of Bayesian optimization and HyperBand, and a comprehensive analysis of the optimization process and its outcomes.
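
The multi-fidelity idea can be illustrated with successive halving, the subroutine that HyperBand is built on. The sketch below is a generic toy version and does not use the BOAH tool suite or its API: `successive_halving`, `toy_loss`, and the parameters `min_budget` and `eta` are names assumed for this example, and the Bayesian optimization component is left out.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Evaluate all configs cheaply, keep the best 1/eta, raise the budget, repeat.

    `evaluate(config, budget)` must return a loss (lower is better).
    `configs` must contain at least one element.
    """
    budget = min_budget
    survivors = list(configs)
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        keep = max(1, len(ranked) // eta)
        survivors = ranked[:keep]
        budget *= eta  # survivors get a larger budget in the next round
    return survivors[0]


def toy_loss(config, budget):
    """Hypothetical objective: optimum at 0.3, with a bias that shrinks with budget."""
    return (config - 0.3) ** 2 + 1.0 / budget


if __name__ == "__main__":
    candidates = [i / 10 for i in range(9)]  # 0.0, 0.1, ..., 0.8
    print(successive_halving(candidates, toy_loss))
```

Most of the compute goes to the few configurations that survive the cheap early rounds, which is what makes this family of methods attractive when a single full evaluation takes hours.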

Towards White-box Benchmarks for Algorithm Control

Jun 18, 2019
André Biedenkapp, H. Furkan Bozkurt, Frank Hutter, Marius Lindauer

The performance of many algorithms in the fields of hard combinatorial problem solving, machine learning or AI in general depends on tuned hyperparameter configurations. Automated methods have been proposed to relieve users of the tedious and error-prone task of manually searching for performance-optimized configurations across a set of problem instances. However, there is still a lot of untapped potential in adjusting an algorithm's hyperparameters online, since different hyperparameters are potentially optimal at different stages of the algorithm. We formulate the problem of adjusting an algorithm's hyperparameters for a given instance on the fly as a contextual MDP, making reinforcement learning (RL) the prime candidate to solve the resulting algorithm control problem in a data-driven way. Furthermore, inspired by applications of algorithm configuration, we introduce new white-box benchmarks suitable to study algorithm control. We show that on short sequences, algorithm configuration is a valid choice, but that with increasing sequence length a black-box view on the problem quickly becomes infeasible and RL performs better.

* 8 pages, 9 figures
