Andrew Melnik

Behavioral Cloning via Search in Embedded Demonstration Dataset

Jun 15, 2023
Federico Malato, Florian Leopold, Ville Hautamaki, Andrew Melnik

Behavioral cloning uses a dataset of demonstrations to learn a behavioral policy. To overcome various learning and policy-adaptation problems, we propose to use a latent space to index the demonstration dataset, instantly access similar relevant experiences, and copy behavior from these situations. Actions from a selected similar situation can be performed by the agent until the representations of the agent's current situation and of the selected experience diverge in the latent space. Thus, we formulate our control problem as a search problem over a dataset of experts' demonstrations. We test our approach on the BASALT MineRL dataset in the latent representation of a Video PreTraining model and compare it to state-of-the-art Minecraft agents. Our approach effectively recovers meaningful demonstrations and exhibits human-like behavior in the Minecraft environment across a wide variety of scenarios. Experimental results reveal that the performance of our search-based approach is comparable to that of trained models, while allowing zero-shot task adaptation by changing the demonstration examples.
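
The indexing and proximity-search step lends itself to a compact implementation. Below is a minimal, illustrative sketch, assuming the demonstration frames and actions are available as aligned NumPy arrays and using a generic `encoder` callable in place of the Video PreTraining model; the function names, the Euclidean metric, and the brute-force search are assumptions, not the paper's exact implementation.

```python
import numpy as np

def build_latent_index(frames, actions, encoder):
    """Embed every demonstration frame so that similar situations can be
    retrieved instantly. `encoder` stands in for the Video PreTraining model
    (assumption); `frames` and `actions` are aligned image-action pairs."""
    latents = np.stack([encoder(frame) for frame in frames])  # (N, D) index
    return latents, actions

def nearest_frame(latents, query_latent):
    """Proximity search: return the index of the demonstration frame whose
    latent representation is closest (Euclidean) to the query."""
    distances = np.linalg.norm(latents - query_latent, axis=1)
    return int(np.argmin(distances))
```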

Contrastive Language, Action, and State Pre-training for Robot Learning

Apr 21, 2023
Krishan Rana, Andrew Melnik, Niko Sünderhauf

In this paper, we introduce a method for unifying language, action, and state information in a shared embedding space to facilitate a range of downstream tasks in robot learning. Our method, Contrastive Language, Action, and State Pre-training (CLASP), extends the CLIP formulation by incorporating distributional learning, capturing the inherent complexities and one-to-many relationships in behaviour-text alignment. By employing distributional outputs for both text and behaviour encoders, our model effectively associates diverse textual commands with a single behaviour and vice versa. We demonstrate the utility of our method for the following downstream tasks: zero-shot text-behaviour retrieval, captioning unseen robot behaviours, and learning a behaviour prior for language-conditioned reinforcement learning. Our distributional encoders exhibit superior retrieval and captioning performance on unseen datasets, as well as the ability to generate meaningful exploratory behaviours from textual commands, capturing the intricate relationships between language, action, and state. This work represents an initial step towards developing a unified pre-trained model for robotics, with the potential to generalise to a broad range of downstream tasks.
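
As a speculative sketch of the distributional extension described above (the paper's exact objective may differ), a CLIP-style symmetric contrastive loss can be computed over reparameterised samples from Gaussian encoder outputs rather than point embeddings; all tensor names and the temperature value here are assumptions.

```python
import torch
import torch.nn.functional as F

def distributional_clip_loss(text_mu, text_logvar, beh_mu, beh_logvar, temperature=0.07):
    """CLIP-like symmetric contrastive loss over sampled embeddings, so that
    each caption/behaviour is represented by a distribution (illustrative)."""
    # Reparameterised samples from each modality's Gaussian output.
    text_z = text_mu + torch.randn_like(text_mu) * (0.5 * text_logvar).exp()
    beh_z = beh_mu + torch.randn_like(beh_mu) * (0.5 * beh_logvar).exp()
    text_z, beh_z = F.normalize(text_z, dim=-1), F.normalize(beh_z, dim=-1)
    logits = text_z @ beh_z.t() / temperature              # (B, B) similarities
    labels = torch.arange(logits.size(0), device=logits.device)
    # Symmetric cross-entropy: text -> behaviour and behaviour -> text.
    return 0.5 * (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels))
```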

Shape complexity estimation using VAE

Apr 05, 2023
Markus Rothgaenger, Andrew Melnik, Helge Ritter

In this paper, we compare methods for estimating the complexity of two-dimensional shapes and introduce a method that exploits the reconstruction loss of variational autoencoders (VAEs) with different latent-vector sizes. Although the complexity of a shape is not a well-defined attribute, different aspects of it can be estimated. We demonstrate that our method captures some aspects of shape complexity. Code and training details will be made publicly available.
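
A minimal sketch of the complexity estimate as described, assuming a dictionary of VAEs already trained with different latent sizes, each returning a reconstruction of a binary shape image; the loss choice and the model interface are assumptions.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def shape_complexity_profile(shape_img, vaes):
    """Reconstruction loss of a binary shape under VAEs with different latent
    sizes (illustrative). `vaes` maps latent dimension -> trained model that
    returns a reconstruction in [0, 1]; `shape_img` is a (1, H, W) tensor."""
    losses = {}
    for latent_dim, vae in vaes.items():
        recon = vae(shape_img.unsqueeze(0))          # assumed (1, 1, H, W) output
        losses[latent_dim] = F.binary_cross_entropy(recon, shape_img.unsqueeze(0)).item()
    # Intuition: a simple shape reconstructs well even with few latent
    # dimensions, while a complex shape keeps improving with larger latents.
    return losses
```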

Towards Solving Fuzzy Tasks with Human Feedback: A Retrospective of the MineRL BASALT 2022 Competition

Mar 23, 2023
Stephanie Milani, Anssi Kanervisto, Karolis Ramanauskas, Sander Schulhoff, Brandon Houghton, Sharada Mohanty, Byron Galbraith, Ke Chen, Yan Song, Tianze Zhou, Bingquan Yu, He Liu, Kai Guan, Yujing Hu, Tangjie Lv, Federico Malato, Florian Leopold, Amogh Raut, Ville Hautamäki, Andrew Melnik, Shu Ishida, João F. Henriques, Robert Klassert, Walter Laurito, Ellen Novoseller, Vinicius G. Goecks, Nicholas Waytowich, David Watkins, Josh Miller, Rohin Shah

To facilitate research on fine-tuning foundation models from human feedback, we held the MineRL BASALT Competition on Fine-Tuning from Human Feedback at NeurIPS 2022. The BASALT challenge asks teams to compete to develop algorithms that solve tasks with hard-to-specify reward functions in Minecraft. Through this competition, we aimed to promote the development of algorithms that use human feedback as a channel for learning the desired behavior. We describe the competition and provide an overview of the top solutions. We conclude by discussing the impact of the competition and future directions for improvement.

Behavioral Cloning via Search in Video PreTraining Latent Space

Dec 27, 2022
Federico Malato, Florian Leopold, Amogh Raut, Ville Hautamäki, Andrew Melnik

Our aim is to build autonomous agents that can solve tasks in environments like Minecraft. To do so, we use an imitation-learning-based approach: we formulate our control problem as a search problem over a dataset of experts' demonstrations, where the agent copies actions from a similar demonstration trajectory of image-action pairs. We perform a proximity search over the BASALT MineRL dataset in the latent representation of a Video PreTraining model. The agent copies the actions from the expert trajectory as long as the state representations of the agent and of the selected expert trajectory do not diverge; then the proximity search is repeated. Our approach effectively recovers meaningful demonstration trajectories and exhibits human-like behavior in the Minecraft environment.
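
For illustration, a hedged sketch of the resulting control loop, assuming a Gym-style `env`, an `encode` function standing in for the Video PreTraining encoder, and pre-embedded `latents`/`actions` arrays; the divergence threshold and the Euclidean distance are assumptions.

```python
import numpy as np

def search_and_copy(env, encode, latents, actions, threshold=1.0):
    """Follow the nearest expert trajectory, copying its actions, and re-run
    the proximity search once the agent's latent state diverges from it."""
    def search(z):
        return int(np.argmin(np.linalg.norm(latents - z, axis=1)))

    obs, done = env.reset(), False
    ref = search(encode(obs))                        # closest expert frame
    while not done:
        obs, _, done, _ = env.step(actions[ref])     # copy the expert's action
        ref = min(ref + 1, len(actions) - 1)         # advance along the trajectory
        if np.linalg.norm(encode(obs) - latents[ref]) > threshold:
            ref = search(encode(obs))                # diverged: search again
```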

Face Generation and Editing with StyleGAN: A Survey

Dec 18, 2022
Andrew Melnik, Maksim Miasayedzenkau, Dzianis Makarovets, Dzianis Pirshtuk, Eren Akbulut, Dennis Holzmann, Tarek Renusch, Gustav Reichert, Helge Ritter

Our goal with this survey is to provide an overview of state-of-the-art deep learning technologies for face generation and editing. We cover the latest popular architectures and discuss the key ideas that make them work, such as inversion, latent representations, loss functions, training procedures, editing methods, and cross-domain style transfer. We particularly focus on GAN-based architectures that have culminated in the StyleGAN approaches, which allow the generation of high-quality face images and offer rich interfaces for controllable semantic editing while preserving photo quality. We aim to provide an entry point into the field for readers who have basic knowledge of deep learning and are looking for an accessible introduction and overview.
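
One key idea the survey covers, latent-space editing, fits in a few lines: invert a photo into the generator's latent space, move linearly along a learned attribute direction, and re-synthesize. The sketch below is illustrative only; the direction names and generator interface are assumptions.

```python
import numpy as np

def edit_latent(w, direction, alpha):
    """Semantic editing as a linear move in StyleGAN's W space (illustrative):
    `w` is an inverted latent code, `direction` a learned attribute vector
    (e.g. smile or age), and `alpha` controls the edit strength."""
    return w + alpha * direction / np.linalg.norm(direction)

# Assumed usage: invert a photo to `w` with a GAN-inversion encoder, edit,
# then decode with the StyleGAN generator, e.g.
#   edited = generator.synthesis(edit_latent(w, smile_direction, alpha=1.5))
```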

Planning with RL and episodic-memory behavioral priors

Jul 07, 2022
Shivansh Beohar, Andrew Melnik

The practical application of learning agents requires sample-efficient and interpretable algorithms. Learning from behavioral priors is a promising way to bootstrap agents with a better-than-random exploration policy or a safeguard against the pitfalls of early learning. Existing solutions for imitation learning require a large number of expert demonstrations and rely on hard-to-interpret learning methods like deep Q-learning. In this work, we present a planning-based approach that can use these behavioral priors for effective exploration and learning in a reinforcement learning environment, and we demonstrate that curated exploration policies in the form of behavioral priors can help an agent learn faster.
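
A hedged sketch of how such a prior can gate exploration, assuming an episodic-memory object with a lookup method; `prior.lookup`, `policy`, and the mixing rate are hypothetical names, not the paper's interface.

```python
import random

def act_with_prior(state, policy, prior, epsilon=0.5):
    """With probability `epsilon`, replay the action a behavioral prior
    (episodic memory) suggests for this state; otherwise use the RL policy.
    Illustrative: `prior.lookup` and `policy` are assumed interfaces."""
    prior_action = prior.lookup(state)          # None if no similar memory
    if prior_action is not None and random.random() < epsilon:
        return prior_action                     # curated, better-than-random exploration
    return policy(state)                        # fall back to the learning policy
```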

* Published in ICRA 2022 BPRL Workshop 

Solving Learn-to-Race Autonomous Racing Challenge by Planning in Latent Space

Jul 05, 2022
Shivansh Beohar, Fabian Heinrich, Rahul Kala, Helge Ritter, Andrew Melnik

The Learn-to-Race Autonomous Racing Virtual Challenge, hosted on the www.aicrowd.com platform, consisted of two tracks: Single Camera and Multi Camera. Our team, UniTeam, was among the final winners in the Single Camera track. The agent is required to complete a previously unknown F1-style track in the minimum time with the fewest off-road driving violations. In our approach, we used a U-Net architecture for road segmentation, a variational autoencoder for encoding the binary road mask, and a nearest-neighbor search strategy that selects the best action for a given state. Our agent achieved an average speed of 105 km/h on stage 1 (known track) and 73 km/h on stage 2 (unknown track) without any off-road driving violations. Here we present our solution and results.
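
The described pipeline reduces to three steps per frame. A minimal sketch under assumed interfaces (`unet`, `vae.encode`, and the stored `bank_latents`/`bank_actions` arrays are placeholders, not the released code):

```python
import numpy as np

def select_action(camera_frame, unet, vae, bank_latents, bank_actions):
    """Segment the road, compress the binary mask to a latent state, and copy
    the action of the nearest stored state (illustrative)."""
    road_mask = unet(camera_frame) > 0.5                      # binary road mask
    z = vae.encode(road_mask)                                 # compact state
    idx = int(np.argmin(np.linalg.norm(bank_latents - z, axis=1)))
    return bank_actions[idx]                                  # nearest-neighbor action
```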

* Published in SL4AD Workshop, ICML 2022 

Faces: AI Blitz XIII Solutions

Apr 03, 2022
Andrew Melnik, Eren Akbulut, Jannik Sheikh, Kira Loos, Michael Buettner, Tobias Lenze

The AI Blitz XIII Faces challenge, hosted on the www.aicrowd.com platform, consisted of five problems: Sentiment Classification, Age Prediction, Mask Prediction, Face Recognition, and Face De-Blurring. Our team, GLaDOS, took second place. Here we present our solutions and results. Code implementation: https://github.com/ndrwmlnk/ai-blitz-xiii

Traffic4cast at NeurIPS 2021 -- Temporal and Spatial Few-Shot Transfer Learning in Gridded Geo-Spatial Processes

Apr 01, 2022
Christian Eichenberger, Moritz Neun, Henry Martin, Pedro Herruzo, Markus Spanring, Yichao Lu, Sungbin Choi, Vsevolod Konyakhin, Nina Lukashina, Aleksei Shpilman, Nina Wiedemann, Martin Raubal, Bo Wang, Hai L. Vu, Reza Mohajerpoor, Chen Cai, Inhi Kim, Luca Hermes, Andrew Melnik, Riza Velioglu, Markus Vieth, Malte Schilling, Alabi Bojesomo, Hasan Al Marzouqi, Panos Liatsis, Jay Santokhi, Dylan Hillier, Yiming Yang, Joned Sarwar, Anna Jordan, Emil Hewage, David Jonietz, Fei Tang, Aleksandra Gruca, Michael Kopp, David Kreil, Sepp Hochreiter

The IARAI Traffic4cast competitions at NeurIPS 2019 and 2020 showed that neural networks can successfully predict future traffic conditions one hour into the future from GPS probe data simply aggregated into time and space bins. We thus reinterpreted the challenge of forecasting traffic conditions as a movie completion task. U-Nets proved to be the winning architecture, demonstrating an ability to extract relevant features from this complex real-world geo-spatial process. Building on the previous competitions, Traffic4cast 2021 focuses on the question of model robustness and generalizability across time and space. Moving from one city to an entirely different city, or from pre-COVID times to times after COVID hit the world, introduces a clear domain shift. We therefore, for the first time, release data featuring such domain shifts. The competition now covers ten cities over two years, with data compiled from over 10^12 GPS probe points. Winning solutions captured traffic dynamics sufficiently well to cope even with these complex domain shifts. Surprisingly, this seemed to require only the previous hour of traffic dynamics and a static road graph as input.
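
To make the "movie completion" framing concrete, here is a toy sketch of the tensor layout; the grid size, channel count, and frame counts are assumptions based on the released data, and a 1x1 convolution stands in for the winning U-Nets.

```python
import torch
import torch.nn as nn

# One hour of history: 12 five-minute frames with 8 channels on a 495x436 grid
# (assumed), stacked along the channel axis and mapped image-to-image.
past = torch.randn(1, 12 * 8, 495, 436)            # (batch, frames*channels, H, W)
model = nn.Conv2d(12 * 8, 6 * 8, kernel_size=1)    # stand-in for a U-Net
future = model(past)                               # 6 predicted frames, same grid
print(future.shape)                                # torch.Size([1, 48, 495, 436])
```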

* Pre-print under review, submitted to Proceedings of Machine Learning Research 