Alexander Ilin

Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University

End-to-End Learning of Keypoint Representations for Continuous Control from Images

Jun 15, 2021

Rinu Boney, Alexander Ilin, Juho Kannala

Abstract: In many control problems that include vision, optimal controls can be inferred from the location of the objects in the scene. This information can be represented using keypoints, a list of spatial locations in the input image. Previous works show that keypoint representations learned during unsupervised pre-training using encoder-decoder architectures can provide good features for control tasks. In this paper, we show that it is possible to learn efficient keypoint representations end-to-end, without the need for unsupervised pre-training, decoders, or additional losses. Our proposed architecture consists of a differentiable keypoint extractor that feeds the coordinates of the estimated keypoints directly to a soft actor-critic agent. The proposed algorithm yields performance competitive with the state of the art on DeepMind Control Suite tasks.
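
The abstract does not say how the keypoint extractor is made differentiable. A common choice in the keypoint literature is a spatial softmax followed by expected coordinates (soft-argmax), and the minimal sketch below assumes that choice. The function name `soft_argmax_keypoint`, the `temperature` parameter, and the [-1, 1] coordinate convention are illustrative assumptions, not taken from the paper.

```python
import math

def soft_argmax_keypoint(heatmap, temperature=1.0):
    """Return the expected (x, y) location of one 2D feature map, in [-1, 1].

    Hypothetical helper: assumes a spatial softmax turns the feature map into
    a probability map (needs at least 2 rows and 2 columns).
    """
    height, width = len(heatmap), len(heatmap[0])
    # Numerically stable softmax over all spatial positions.
    peak = max(max(row) for row in heatmap)
    weights = [[math.exp((v - peak) / temperature) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)
    # Pixel indices mapped to normalized coordinates in [-1, 1].
    xs = [2.0 * j / (width - 1) - 1.0 for j in range(width)]
    ys = [2.0 * i / (height - 1) - 1.0 for i in range(height)]
    # Expected coordinates under the softmax distribution.
    x = sum(weights[i][j] * xs[j] for i in range(height) for j in range(width)) / total
    y = sum(weights[i][j] * ys[i] for i in range(height) for j in range(width)) / total
    return x, y
```

Because the output is a weighted average rather than a hard argmax, gradients from a downstream loss can flow back into the feature map, which is what allows coordinates to be trained end-to-end with an agent.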

Learning to Play Imperfect-Information Games by Imitating an Oracle Planner

Dec 22, 2020

Rinu Boney, Alexander Ilin, Juho Kannala, Jarno Seppänen

Abstract: We consider learning to play multiplayer imperfect-information games with simultaneous moves and large state-action spaces. Previous attempts to tackle such challenging games have largely focused on model-free learning methods, often requiring hundreds of years of experience to produce competitive agents. Our approach is based on model-based planning. We tackle the problem of partial observability by first building an (oracle) planner that has access to the full state of the environment and then distilling the knowledge of the oracle to a (follower) agent which is trained to play the imperfect-information game by imitating the oracle's choices. We experimentally show that planning with naive Monte Carlo tree search does not perform very well in large combinatorial action spaces. We therefore propose planning with a fixed-depth tree search and decoupled Thompson sampling for action selection. We show that the planner is able to discover efficient playing strategies in the games of Clash Royale and Pommerman and the follower policy successfully learns to implement them by training on a few hundred battles.
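
To illustrate what "decoupled" action selection can look like in a simultaneous-move game, here is a minimal sketch of a single decision node. It assumes win/loss rollout outcomes with a Beta-Bernoulli posterior per action; the class name `DecoupledThompsonNode` and that posterior choice are assumptions made for illustration, and the fixed-depth tree search from the abstract is not shown.

```python
import random

class DecoupledThompsonNode:
    """One decision node where every player keeps its own per-action statistics."""

    def __init__(self, num_actions_per_player):
        # Beta(1, 1) prior per action, stored as [alpha, beta].
        self.stats = [[[1.0, 1.0] for _ in range(n)] for n in num_actions_per_player]

    def select(self, rng):
        """Each player independently samples a value per action and takes the best."""
        joint = []
        for player_stats in self.stats:
            samples = [rng.betavariate(a, b) for a, b in player_stats]
            joint.append(max(range(len(samples)), key=samples.__getitem__))
        return tuple(joint)

    def update(self, joint_action, outcomes):
        """outcomes[p] is 1 if player p won the simulated rollout, else 0."""
        for player, (action, won) in enumerate(zip(joint_action, outcomes)):
            self.stats[player][action][0] += won
            self.stats[player][action][1] += 1 - won
```

Each player's statistics grow with that player's own number of actions rather than with the size of the joint action space, which is the usual motivation for decoupling in large combinatorial settings.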

Regularizing Model-Based Planning with Energy-Based Models

Oct 12, 2019

Rinu Boney, Juho Kannala, Alexander Ilin

Abstract: Model-based reinforcement learning could enable sample-efficient learning by quickly acquiring rich knowledge about the world and using it to improve behaviour without additional data. Learned dynamics models can be directly used for planning actions, but this has been challenging because of inaccuracies in the learned models. In this paper, we focus on planning with learned dynamics models and propose to regularize it using energy estimates of state transitions in the environment. We visually demonstrate the effectiveness of the proposed method and show that off-policy training of an energy estimator can be effectively used to regularize planning with pre-trained dynamics models. Further, we demonstrate that the proposed method enables sample-efficient learning to achieve competitive performance in challenging continuous control tasks such as Half-cheetah and Ant in just a few minutes of experience.

* Conference on Robot Learning 2019
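
The idea of regularizing planning with energy estimates can be sketched with a simple random-shooting planner that subtracts an energy penalty from the predicted return. This is a minimal sketch under stated assumptions: `dynamics`, `reward`, and `energy` are hypothetical callables standing in for learned models, actions are scalars in [-1, 1], and the weight `alpha` is an illustrative parameter. The planner used in the paper may differ.

```python
import random

def plan_with_energy_penalty(state, dynamics, reward, energy, horizon=5,
                             num_candidates=200, alpha=1.0, rng=None):
    """Random-shooting planner: return the first action of the best-scoring sequence."""
    rng = rng or random.Random()
    best_score, best_first_action = float("-inf"), None
    for _ in range(num_candidates):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        s, score = state, 0.0
        for a in actions:
            s_next = dynamics(s, a)
            # Model-predicted reward, minus a penalty on transitions that the
            # energy estimator scores as unlike the observed data.
            score += reward(s_next) - alpha * energy(s, a, s_next)
            s = s_next
        if score > best_score:
            best_score, best_first_action = score, actions[0]
    return best_first_action
```

With `alpha=0` the planner is free to exploit model errors; a positive `alpha` steers it toward transitions the energy estimator considers familiar.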

Regularizing Trajectory Optimization with Denoising Autoencoders

Mar 28, 2019

Rinu Boney, Norman Di Palo, Mathias Berglund, Alexander Ilin, Juho Kannala, Antti Rasmus, Harri Valpola

Abstract: Trajectory optimization with learned dynamics models can often suffer from erroneous predictions of out-of-distribution trajectories. We propose to regularize trajectory optimization by means of a denoising autoencoder that is trained on the same trajectories as the dynamics model. We visually demonstrate the effectiveness of the regularization in gradient-based trajectory optimization for open-loop control of an industrial process. We compare with recent model-based reinforcement learning algorithms on a set of popular motor control tasks to demonstrate that the denoising regularization enables state-of-the-art sample-efficiency. We demonstrate the efficacy of the proposed method in regularizing both gradient-based and gradient-free trajectory optimization.
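
One way to write such a regularized objective, as a sketch under assumed notation rather than the paper's exact formulation: let $f$ be the learned dynamics model, $r$ the reward, $g$ the denoising autoencoder, $x_t$ the state-action input to the autoencoder, and $\alpha$ a penalty weight. All symbols here are assumptions introduced for illustration.

```latex
\max_{a_1,\dots,a_H} \; \sum_{t=1}^{H} r(s_t, a_t)
  \;-\; \alpha \sum_{t=1}^{H} \lVert g(x_t) - x_t \rVert^2,
\qquad s_{t+1} = f(s_t, a_t), \quad x_t = (s_t, a_t)
```

The penalty is small where the autoencoder reconstructs its input well, that is, near the training trajectories, and grows for out-of-distribution inputs, which discourages the optimizer from exploiting model errors there.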

Semi-Supervised and Active Few-Shot Learning with Prototypical Networks

Apr 25, 2018

Rinu Boney, Alexander Ilin

Abstract: We consider the problem of semi-supervised few-shot classification where a classifier needs to adapt to new tasks using a few labeled examples and (potentially many) unlabeled examples. We propose a clustering approach to the problem. The features extracted with Prototypical Networks are clustered using $K$-means with the few labeled examples guiding the clustering process. We note that in many real-world applications the adaptation performance can be significantly improved by requesting the few labels through user feedback. We demonstrate good performance of the active adaptation strategy using image data.
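
The clustering step can be sketched as $K$-means seeded by class prototypes, with the labeled examples kept fixed in their own clusters. This is a minimal sketch that assumes features have already been extracted; the function names and the choice to keep labeled points pinned to their class are illustrative assumptions about how the labels "guide" the clustering.

```python
def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def semi_supervised_kmeans(labeled, unlabeled, num_iters=10):
    """labeled: {class_id: [feature lists]}; unlabeled: [feature lists].

    Returns (prototypes, assignments), where assignments[i] is the class
    assigned to unlabeled[i].
    """
    classes = sorted(labeled)
    # Initial prototypes: mean of the few labeled examples per class.
    prototypes = {c: mean(labeled[c]) for c in classes}
    assignments = []
    for _ in range(num_iters):
        # Assign each unlabeled point to its nearest prototype.
        assignments = [min(classes, key=lambda c: sq_dist(x, prototypes[c]))
                       for x in unlabeled]
        # Recompute prototypes from labeled points plus assigned unlabeled points.
        for c in classes:
            members = labeled[c] + [x for x, a in zip(unlabeled, assignments) if a == c]
            prototypes[c] = mean(members)
    return prototypes, assignments
```

Since every cluster always contains its labeled examples, no cluster can become empty and cluster identities stay tied to class labels.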

Recurrent Ladder Networks

Dec 18, 2017

Isabeau Prémont-Schwarz, Alexander Ilin, Tele Hotloo Hao, Antti Rasmus, Rinu Boney, Harri Valpola

Abstract: We propose a recurrent extension of the Ladder networks whose structure is motivated by the inference required in hierarchical latent variable models. We demonstrate that the recurrent Ladder is able to handle a wide variety of complex learning tasks that benefit from iterative inference and temporal modeling. The architecture shows close-to-optimal results on temporal modeling of video data, competitive results on music modeling, and improved perceptual grouping based on higher order abstractions, such as stochastic textures and motion cues. We present results for fully supervised, semi-supervised, and unsupervised tasks. The results suggest that the proposed architecture and principles are powerful tools for learning a hierarchy of abstractions, learning iterative inference and handling temporal information.

* 9 pages, 9 figures, 7-page appendix, fixed fig 9 (c)

Linear State-Space Model with Time-Varying Dynamics

Oct 03, 2014

Jaakko Luttinen, Tapani Raiko, Alexander Ilin

Abstract: This paper introduces a linear state-space model with time-varying dynamics. The time dependency is obtained by forming the state dynamics matrix as a time-varying linear combination of a set of matrices. The time dependency of the weights in the linear combination is modelled by another linear Gaussian dynamical model, allowing the model to learn how the dynamics of the process change. Previous approaches have used switching models, which have a small set of possible state dynamics matrices; the model selects one of those matrices at each time step, thus jumping between them. Our model instead forms the dynamics as a linear combination, so the changes can be smooth and continuous. The model is motivated by physical processes which are described by linear partial differential equations whose parameters vary in time. An example of such a process could be a temperature field whose evolution is driven by a varying wind direction. The posterior inference is performed using variational Bayesian approximation. The experiments on stochastic advection-diffusion processes and real-world weather processes show that the model with time-varying dynamics can outperform previously introduced approaches.

* Machine Learning and Knowledge Discovery in Databases. Lecture Notes in Computer Science Volume 8725, 2014, pp 338-353
* The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-662-44851-9_22
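
The generative structure described in the abstract can be sketched as a small simulator: mixing weights follow their own linear Gaussian dynamics, and the state dynamics matrix at each step is the weighted combination of a set of base matrices. The names `B`, `C`, `simulate`, and `noise_std` are assumptions for illustration, a single isotropic noise level is used for simplicity, and the variational Bayesian inference is not shown.

```python
import random

def matvec(M, v):
    """Matrix-vector product for list-of-lists matrices."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def simulate(B, C, x0, s0, steps, noise_std=0.0, rng=None):
    """Simulate x_n = (sum_k s_n[k] * B[k]) x_{n-1} + noise, s_n = C s_{n-1} + noise.

    B: list of base dynamics matrices; C: dynamics matrix of the mixing weights.
    """
    rng = rng or random.Random()
    x, s, xs = list(x0), list(s0), []
    dim = len(x)
    for _ in range(steps):
        # The mixing weights evolve with their own linear Gaussian dynamics.
        s = [v + rng.gauss(0.0, noise_std) for v in matvec(C, s)]
        # Dynamics matrix for this step: weighted combination of the base matrices.
        A = [[sum(w * Bk[i][j] for w, Bk in zip(s, B)) for j in range(dim)]
             for i in range(dim)]
        x = [v + rng.gauss(0.0, noise_std) for v in matvec(A, x)]
        xs.append(x)
    return xs
```

For example, with an identity matrix and a 90-degree rotation as base matrices, shifting weight from the first to the second moves the process smoothly from standing still to rotating, in contrast to a switching model that would jump between the two.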
