Daniela Rus

Solving Continuous Control via Q-learning

Oct 22, 2022

Tim Seyde, Peter Werner, Wilko Schwarting, Igor Gilitschenski, Martin Riedmiller, Daniela Rus, Markus Wulfmeier

Abstract: While there has been substantial success in applying actor-critic methods to continuous control, simpler critic-only methods such as Q-learning often remain intractable in the associated high-dimensional action spaces. However, most actor-critic methods come at the cost of added complexity: heuristics for stabilization, compute requirements, and wider hyperparameter search spaces. We show that these issues can be largely alleviated via Q-learning by combining action discretization with value decomposition, framing single-agent control as cooperative multi-agent reinforcement learning (MARL). With bang-bang actions, performance of this critic-only approach matches state-of-the-art continuous actor-critic methods when learning from features or pixels. We extend classical bandit examples from cooperative MARL to provide intuition for how decoupled critics leverage state information to coordinate joint optimization, and demonstrate surprisingly strong performance across a wide variety of continuous control tasks.
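
The combination of bang-bang discretization and value decomposition described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the joint value is the mean of per-dimension utilities, and the names joint_value, greedy_action and td_target are hypothetical. Under that assumption the greedy joint action is found one dimension at a time, instead of by searching all 2^n bang-bang combinations.

```python
from itertools import product

BANG_BANG = (-1.0, 1.0)  # the two extreme actions available in each dimension


def joint_value(utilities, action_idx):
    """Assumed decomposition: joint value is the mean of per-dimension utilities."""
    return sum(u[i] for u, i in zip(utilities, action_idx)) / len(utilities)


def greedy_action(utilities):
    """Per-dimension argmax; cost is linear in the number of action dimensions."""
    idx = tuple(max(range(len(u)), key=u.__getitem__) for u in utilities)
    return idx, [BANG_BANG[i] for i in idx]


def td_target(reward, discount, next_utilities):
    """One-step Q-learning target using the decomposed greedy value."""
    idx, _ = greedy_action(next_utilities)
    return reward + discount * joint_value(next_utilities, idx)


# Made-up utilities for a 3-dimensional action: one [low, high] pair per dimension.
utilities = [[0.2, 0.5], [1.0, -0.3], [0.0, 0.4]]
idx, action = greedy_action(utilities)

# Brute-force check over all 2**3 joint actions gives the same maximizer.
brute = max(product(range(2), repeat=3), key=lambda a: joint_value(utilities, a))
```

Because a mean of independent terms is maximized by maximizing each term, the per-dimension argmax and the brute-force search agree here; that is the property that keeps the critic-only approach tractable as the action dimension grows.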

Interpreting Neural Policies with Disentangled Tree Representations

Oct 13, 2022

Tsun-Hsuan Wang, Wei Xiao, Tim Seyde, Ramin Hasani, Daniela Rus

Abstract: Compact neural networks used in policy learning and closed-loop end-to-end control learn representations from data that encapsulate agent dynamics and potentially the agent-environment's factors of variation. A formal and quantitative understanding and interpretation of these explanatory factors in neural representations is difficult to achieve due to the complex and intertwined correspondence of neural activities with emergent behaviors. In this paper, we design a new algorithm that programmatically extracts tree representations from compact neural policies, in the form of a set of logic programs grounded by the world state. To assess how well networks uncover the dynamics of the task and their factors of variation, we introduce interpretability metrics that measure the disentanglement of learned neural dynamics from a concentration of decisions, mutual information, and modularity perspectives. Moreover, our method allows us to quantify how accurate the extracted decision paths (explanations) are and computes cross-neuron logic conflict. We demonstrate the effectiveness of our approach with several types of compact network architectures on a series of end-to-end learning-to-control tasks.

On the Forward Invariance of Neural ODEs

Oct 10, 2022

Wei Xiao, Tsun-Hsuan Wang, Ramin Hasani, Mathias Lechner, Daniela Rus

Abstract: To ensure robust and trustworthy decision-making, it is highly desirable to enforce constraints over a neural network's parameters and its inputs automatically by back-propagating output specifications. This way, we can guarantee that the network makes reliable decisions under perturbations. Here, we propose a new method for achieving a class of specification guarantees for neural Ordinary Differential Equations (ODEs) by using invariance set propagation. An invariance of a neural ODE is defined as an output specification, such as to satisfy mathematical formulae, physical laws, and system safety. We use control barrier functions to specify the invariance of a neural ODE on the output layer and propagate it back to the input layer. Through the invariance backpropagation, we map output specifications onto constraints on the neural ODE parameters or its input. The satisfaction of the corresponding constraints implies the satisfaction of output specifications. This allows us to achieve output specification guarantees by changing the input or parameters while maximally preserving the model performance. We demonstrate the invariance propagation on a comprehensive series of representation learning tasks, including spiral curve regression, autoregressive modeling of joint physical dynamics, convexity portrait of a function, and safe neural control of collision avoidance for autonomous vehicles.

* 20 pages
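
For readers unfamiliar with control barrier functions, the following toy sketch shows the basic condition on a one-dimensional system. It is only an illustration, not the paper's invariance propagation through a neural ODE: the dynamics x' = u, the barrier h(x) = x_max - x, and the names cbf_filter and simulate are all assumptions chosen for the example.

```python
def cbf_filter(u_nominal, x, x_max, alpha):
    """Enforce h_dot + alpha * h >= 0 for x' = u and h(x) = x_max - x.

    Here h_dot = -u, so the condition reduces to u <= alpha * (x_max - x).
    The nominal input is kept whenever it already satisfies the bound.
    """
    return min(u_nominal, alpha * (x_max - x))


def simulate(x0, u_nominal, x_max=1.0, alpha=2.0, dt=0.05, steps=200):
    """Forward-Euler rollout with the filtered input applied at every step."""
    xs = [x0]
    for _ in range(steps):
        u = cbf_filter(u_nominal, xs[-1], x_max, alpha)
        xs.append(xs[-1] + dt * u)
    return xs


# The nominal input pushes x upward forever; the filter keeps x inside x <= 1.
trajectory = simulate(x0=0.0, u_nominal=1.0)
```

With alpha * dt <= 1 the Euler iterates cannot overshoot x_max, so the set {x : h(x) >= 0} stays forward invariant in this discretized toy example while the state still approaches the boundary.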

PyHopper -- Hyperparameter optimization

Oct 10, 2022

Mathias Lechner, Ramin Hasani, Philipp Neubauer, Sophie Neubauer, Daniela Rus

Abstract: Hyperparameter tuning is a fundamental aspect of machine learning research. Setting up the infrastructure for systematic optimization of hyperparameters can take a significant amount of time. Here, we present PyHopper, a black-box optimization platform designed to streamline the hyperparameter tuning workflow of machine learning researchers. PyHopper's goal is to integrate with existing code with minimal effort and run the optimization process with minimal necessary manual oversight. With simplicity as the primary theme, PyHopper is powered by a single robust Markov-chain Monte-Carlo optimization algorithm that scales to millions of dimensions. Compared to existing tuning packages, focusing on a single algorithm frees the user from having to decide between several algorithms and makes PyHopper easily customizable. PyHopper is publicly available under the Apache-2.0 license at https://github.com/PyHopper/PyHopper.
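
The idea of driving a black-box search with a single sampling-based algorithm can be sketched in a few lines of plain Python. This does not use PyHopper's API and is not its algorithm: it is a greedy random local search, a simpler stand-in for an MCMC sampler, and local_search, toy_objective and the parameter names are hypothetical.

```python
import random


def local_search(objective, init, scale, steps, seed=0):
    """Perturb the current best parameters and keep a candidate only if it scores higher."""
    rng = random.Random(seed)
    best = dict(init)
    best_score = objective(best)
    for _ in range(steps):
        # Gaussian proposal centered on the current best point.
        candidate = {k: v + rng.gauss(0.0, scale[k]) for k, v in best.items()}
        score = objective(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


def toy_objective(params):
    """Stand-in for a validation score; peaks at lr = 0.1, dropout = 0.3."""
    return -(params["lr"] - 0.1) ** 2 - (params["dropout"] - 0.3) ** 2


init = {"lr": 0.5, "dropout": 0.0}
best, best_score = local_search(
    toy_objective, init, scale={"lr": 0.05, "dropout": 0.05}, steps=500
)
```

The objective is treated purely as a black box: the search needs only a function from a parameter dictionary to a score, which is what lets this style of tuner wrap existing training code with little modification.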

Are All Vision Models Created Equal? A Study of the Open-Loop to Closed-Loop Causality Gap

Oct 09, 2022

Mathias Lechner, Ramin Hasani, Alexander Amini, Tsun-Hsuan Wang, Thomas A. Henzinger, Daniela Rus

Abstract: There is an ever-growing zoo of modern neural network models that can efficiently learn end-to-end control from visual observations. These advanced deep models, ranging from convolutional to patch-based networks, have been extensively tested on offline image classification and regression tasks. In this paper, we study these vision architectures with respect to the open-loop to closed-loop causality gap, i.e., offline training followed by an online closed-loop deployment. This causality gap typically emerges in robotics applications such as autonomous driving, where a network is trained to imitate the control commands of a human. In this setting, two situations arise: 1) Closed-loop testing in-distribution, where the test environment shares properties with those of offline training data. 2) Closed-loop testing under distribution shifts and out-of-distribution. Contrary to recently reported results, we show that under proper training guidelines, all vision models perform indistinguishably well on in-distribution deployment, resolving the causality gap. In situation 2, we observe that the causality gap disrupts performance regardless of the choice of the model architecture. Our results imply that the causality gap can be solved in situation 1 with our proposed training guideline with any modern network architecture, whereas achieving out-of-distribution generalization (situation 2) requires further investigation, for instance, on data diversity rather than the model architecture.

Intention Communication and Hypothesis Likelihood in Game-Theoretic Motion Planning

Sep 26, 2022

Makram Chahine, Roya Firoozi, Wei Xiao, Mac Schwager, Daniela Rus

Abstract: Game-theoretic motion planners are a potent solution for controlling systems of multiple highly interactive robots. Most existing game-theoretic planners unrealistically assume a priori objective function knowledge is available to all agents. To address this, we propose a fault-tolerant receding horizon game-theoretic motion planner that leverages inter-agent communication with intention hypothesis likelihood. Specifically, robots communicate their objective function incorporating their intentions. A discrete Bayesian filter is designed to infer the objectives in real-time based on the discrepancy between observed trajectories and the ones from communicated intentions. In simulation, we consider three safety-critical autonomous driving scenarios of overtaking, lane-merging and intersection crossing, to demonstrate our planner's ability to capitalize on alternative intention hypotheses to generate safe trajectories in the presence of faulty transmissions in the communication network.

* This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
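
A discrete Bayesian filter over intention hypotheses, as mentioned in this abstract, can be sketched as follows. This is a minimal illustration rather than the paper's filter: the Gaussian likelihood on the trajectory discrepancy, the value of sigma, the discrepancy numbers, and the name bayes_update are all assumptions.

```python
import math


def bayes_update(prior, discrepancies, sigma=1.0):
    """One filter step: posterior is proportional to prior * exp(-d^2 / (2 sigma^2)).

    discrepancies[i] is the distance between the observed trajectory and the
    trajectory predicted under hypothesis i (assumed Gaussian likelihood).
    """
    weights = [
        p * math.exp(-(d ** 2) / (2 * sigma ** 2))
        for p, d in zip(prior, discrepancies)
    ]
    total = sum(weights)
    return [w / total for w in weights]


# Two hypotheses: [communicated intention is correct, alternative intention].
belief = [0.5, 0.5]

# Made-up per-step discrepancies; the observed motion tracks the communicated intention.
for d_communicated, d_alternative in [(0.2, 1.5), (0.1, 1.8), (0.3, 1.2)]:
    belief = bayes_update(belief, [d_communicated, d_alternative])
```

If a transmission were faulty, the discrepancy under the communicated intention would grow and the belief would shift toward the alternative hypothesis, which a planner could then use in place of the communicated objective.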

Liquid Structural State-Space Models

Sep 26, 2022

Ramin Hasani, Mathias Lechner, Tsun-Hsuan Wang, Makram Chahine, Alexander Amini, Daniela Rus

Abstract: A proper parametrization of state transition matrices of linear state-space models (SSMs) followed by standard nonlinearities enables them to efficiently learn representations from sequential data, establishing the state-of-the-art on a large series of long-range sequence modeling benchmarks. In this paper, we show that we can improve further when the structural SSM such as S4 is given by a linear liquid time-constant (LTC) state-space model. LTC neural networks are causal continuous-time neural networks with an input-dependent state transition module, which makes them learn to adapt to incoming inputs at inference. We show that by using a diagonal plus low-rank decomposition of the state transition matrix introduced in S4, and a few simplifications, the LTC-based structural state-space model, dubbed Liquid-S4, achieves the new state-of-the-art generalization across sequence modeling tasks with long-term dependencies such as image, text, audio, and medical time-series, with an average performance of 87.32% on the Long-Range Arena benchmark. On the full raw Speech Command recognition dataset, Liquid-S4 achieves 96.78% accuracy with a 30% reduction in parameter counts compared to S4. The additional gain in performance is the direct result of Liquid-S4's kernel structure that takes into account the similarities of the input sequence samples during training and inference.

Deep Learning on Home Drone: Searching for the Optimal Architecture

Sep 21, 2022

Alaa Maalouf, Yotam Gurfinkel, Barak Diker, Oren Gal, Daniela Rus, Dan Feldman

Abstract: We suggest the first system that runs real-time semantic segmentation via deep learning on a weak micro-computer such as the Raspberry Pi Zero v2 (whose price was $15) attached to a toy-drone. In particular, since the Raspberry Pi weighs less than 16 grams, and its size is half of a credit card, we could easily attach it to the common commercial DJI Tello toy-drone (<$100, <90 grams, 98 × 92.5 × 41 mm). The result is an autonomous drone (no laptop or human in the loop) that can detect and classify objects in real-time from a video stream of an on-board monocular RGB camera (no GPS or LIDAR sensors). The companion videos demonstrate how this Tello drone scans the lab for people (e.g. for the use of firefighters or security forces) and for an empty parking slot outside the lab. Existing deep learning solutions are either much too slow for real-time computation on such IoT devices, or provide results of impractical quality. Our main challenge was to design a system that takes the best of all worlds among numerous combinations of networks, deep learning platforms/frameworks, compression techniques, and compression ratios. To this end, we provide an efficient searching algorithm that aims to find the optimal combination which results in the best tradeoff between the network running time and its accuracy/performance.

BIMS-PU: Bi-Directional and Multi-Scale Point Cloud Upsampling

Jun 25, 2022

Yechao Bai, Xiaogang Wang, Marcelo H. Ang Jr, Daniela Rus

Abstract: The learning and aggregation of multi-scale features are essential in empowering neural networks to capture the fine-grained geometric details in the point cloud upsampling task. Most existing approaches extract multi-scale features from a point cloud of a fixed resolution, and hence obtain only a limited level of detail. Though an existing approach aggregates a feature hierarchy of different resolutions from a cascade of upsampling sub-networks, the training is complex with expensive computation. To address these issues, we construct a new point cloud upsampling pipeline called BIMS-PU that integrates the feature pyramid architecture with a bi-directional up and downsampling path. Specifically, we decompose the up/downsampling procedure into several up/downsampling sub-steps by breaking the target sampling factor into smaller factors. The multi-scale features are naturally produced in a parallel manner and aggregated using a fast feature fusion method. A supervision signal is simultaneously applied to all upsampled point clouds of different scales. Moreover, we formulate a residual block to ease the training of our model. Extensive quantitative and qualitative experiments on different datasets show that our method achieves superior results to state-of-the-art approaches. Last but not least, we demonstrate that point cloud upsampling can improve robot perception by ameliorating the 3D data quality.

* Accepted to RA-L 2022
* In IEEE Robotics and Automation Letters, vol. 7, no. 3, pp. 7447-7454, July 2022
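
Breaking a target sampling factor into smaller factors, as the abstract describes, can be sketched as below. The paper does not state how factors are chosen, so this sketch assumes a prime factorization; split_factor and scale_schedule are hypothetical names, and only the point counts per scale are computed, not the upsampling itself.

```python
def split_factor(factor):
    """Split a target sampling factor into prime sub-step factors (assumed scheme)."""
    steps, divisor = [], 2
    while factor > 1:
        while factor % divisor == 0:
            steps.append(divisor)
            factor //= divisor
        divisor += 1
    return steps


def scale_schedule(num_points, factor):
    """Point-cloud size after each upsampling sub-step, starting from the input size."""
    sizes = [num_points]
    for step in split_factor(factor):
        sizes.append(sizes[-1] * step)
    return sizes


# A 4x upsampling of 256 points becomes two 2x sub-steps.
sizes = scale_schedule(256, 4)
```

Each intermediate size corresponds to one scale at which features can be produced and at which a supervision signal can be applied, which is how a single target factor yields several resolutions.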

Entangled Residual Mappings

Jun 02, 2022

Mathias Lechner, Ramin Hasani, Zahra Babaiee, Radu Grosu, Daniela Rus, Thomas A. Henzinger, Sepp Hochreiter

Abstract: Residual mappings have been shown to perform representation learning in the first layers and iterative feature refinement in higher layers. This interplay, combined with their stabilizing effect on the gradient norms, enables them to train very deep networks. In this paper, we take a step further and introduce entangled residual mappings to generalize the structure of the residual connections and evaluate their role in iterative learning representations. An entangled residual mapping replaces the identity skip connections with specialized entangled mappings such as orthogonal, sparse, and structural correlation matrices that share key attributes (eigenvalues, structure, and Jacobian norm) with identity mappings. We show that while entangled mappings can preserve the iterative refinement of features across various deep models, they influence the representation learning process in convolutional networks differently than attention-based models and recurrent neural networks. In general, we find that for CNNs and Vision Transformers entangled sparse mapping can help generalization while orthogonal mappings hurt performance. For recurrent networks, orthogonal residual mappings form an inductive bias for time-variant sequences, which degrades accuracy on time-invariant tasks.

* 21 Pages
