Daniel Tanneberg

Neuro-Symbolic Imitation Learning: Discovering Symbolic Abstractions for Skill Learning

Mar 27, 2025

Leon Keller, Daniel Tanneberg, Jan Peters

Abstract: Imitation learning is a popular method for teaching robots new behaviors. However, most existing methods focus on teaching short, isolated skills rather than long, multi-step tasks. To bridge this gap, imitation learning algorithms must learn not only individual skills but also an abstract understanding of how to sequence these skills to perform extended tasks effectively. This paper addresses this challenge by proposing a neuro-symbolic imitation learning framework. Using task demonstrations, the system first learns a symbolic representation that abstracts the low-level state-action space. The learned representation decomposes a task into easier subtasks and allows the system to leverage symbolic planning to generate abstract plans. Subsequently, the system utilizes this task decomposition to learn a set of neural skills capable of refining abstract plans into actionable robot commands. Experimental results in three simulated robotic environments demonstrate that, compared to baselines, our neuro-symbolic approach increases data efficiency, improves generalization capabilities, and facilitates interpretability.

* IEEE International Conference on Robotics and Automation (ICRA) 2025

Tulip Agent -- Enabling LLM-Based Agents to Solve Tasks Using Large Tool Libraries

Jul 31, 2024

Felix Ocker, Daniel Tanneberg, Julian Eggert, Michael Gienger

Abstract: We introduce tulip agent, an architecture for autonomous LLM-based agents with Create, Read, Update, and Delete access to a tool library containing a potentially large number of tools. In contrast to state-of-the-art implementations, tulip agent does not encode the descriptions of all available tools in the system prompt, which counts against the model's context window, or embed the entire prompt for retrieving suitable tools. Instead, the tulip agent can recursively search for suitable tools in its extensible tool library, which is implemented here, by way of example, as a vector store. The tulip agent architecture significantly reduces inference costs, allows using even large tool libraries, and enables the agent to adapt and extend its set of tools. We evaluate the architecture with several ablation studies in a mathematics context and demonstrate its generalizability with an application to robotics. A reference implementation and the benchmark are available at github.com/HRI-EU/tulip_agent.

* 19 pages, 4 figures

Efficient Symbolic Planning with Views

May 06, 2024

Stephan Hasler, Daniel Tanneberg, Michael Gienger

Abstract: Robotic planning systems model spatial relations in detail as these are needed for manipulation tasks. In contrast, other physical attributes of objects and the effect of devices are usually oversimplified and expressed by abstract compound attributes. This limits the ability of planners to find alternative solutions. We propose to break these compound attributes down into a shared set of elementary attributes. This strongly facilitates generalization between different tasks and environments and thus helps to find innovative solutions. On the downside, this generalization comes with an increased complexity of the solution space. Therefore, as the main contribution of the paper, we propose a method that splits the planning problem into a sequence of views, where in each view only an increasing subset of attributes is considered. We show that this view-based strategy offers a good compromise between planning speed and quality of the found plan, and discuss its general applicability and limitations.

* 6 pages

To Help or Not to Help: LLM-based Attentive Support for Human-Robot Group Interactions

Mar 19, 2024

Daniel Tanneberg, Felix Ocker, Stephan Hasler, Joerg Deigmoeller, Anna Belardinelli, Chao Wang, Heiko Wersing, Bernhard Sendhoff, Michael Gienger

Abstract: How can a robot provide unobtrusive physical support within a group of humans? We present Attentive Support, a novel interaction concept for robots to support a group of humans. It combines scene perception, dialogue acquisition, situation understanding, and behavior generation with the common-sense reasoning capabilities of Large Language Models (LLMs). In addition to following user instructions, Attentive Support is capable of deciding when and how to support the humans, and when to remain silent so as not to disturb the group. With a diverse set of scenarios, we show and evaluate the robot's attentive behavior, which supports and helps the humans when required, while not disturbing the group when no help is needed.

* 8 pages, 5 figures

Large Language Models for Multi-Modal Human-Robot Interaction

Jan 26, 2024

Chao Wang, Stephan Hasler, Daniel Tanneberg, Felix Ocker, Frank Joublin, Antonello Ceravola, Joerg Deigmoeller, Michael Gienger

Abstract: This paper presents an innovative large language model (LLM)-based robotic system for enhancing multi-modal human-robot interaction (HRI). Traditional HRI systems relied on complex designs for intent estimation, reasoning, and behavior generation, which were resource-intensive. In contrast, our system empowers researchers and practitioners to regulate robot behavior through three key aspects: providing high-level linguistic guidance, creating "atomics" for actions and expressions the robot can use, and offering a set of examples. Implemented on a physical robot, it demonstrates proficiency in adapting to multi-modal inputs and determining the appropriate manner of action to assist humans with its arms, following researcher-defined guidelines. Simultaneously, it coordinates the robot's lid, neck, and ear movements with speech output to produce dynamic, multi-modal expressions. This showcases the system's potential to revolutionize HRI by shifting from conventional, manual state-and-flow design methods to an intuitive, guidance-based, and example-driven approach.

* 9 pages, 5 figures

CoPAL: Corrective Planning of Robot Actions with Large Language Models

Oct 11, 2023

Frank Joublin, Antonello Ceravola, Pavel Smirnov, Felix Ocker, Joerg Deigmoeller, Anna Belardinelli, Chao Wang, Stephan Hasler, Daniel Tanneberg, Michael Gienger

Abstract: In the pursuit of fully autonomous robotic systems capable of taking over tasks traditionally performed by humans, the complexity of open-world environments poses a considerable challenge. Addressing this challenge, this study contributes to the field of Large Language Models (LLMs) applied to task and motion planning for robots. We propose a system architecture that orchestrates a seamless interplay between multiple cognitive levels, encompassing reasoning, planning, and motion generation. At its core lies a novel replanning strategy that handles physically grounded, logical, and semantic errors in the generated plans. We demonstrate the efficacy of the proposed feedback architecture, particularly its impact on executability, correctness, and time complexity, via empirical evaluation in the context of a simulation and two intricate real-world scenarios: blocks world, barman, and pizza preparation.

Learning Type-Generalized Actions for Symbolic Planning

Aug 09, 2023

Daniel Tanneberg, Michael Gienger

Abstract: Symbolic planning is a powerful technique to solve complex tasks that require long sequences of actions and can equip an intelligent agent with complex behavior. The downside of this approach is the necessity for suitable symbolic representations describing the state of the environment as well as the actions that can change it. Traditionally, such representations are carefully hand-designed by experts for distinct problem domains, which limits their transferability to different problems and environment complexities. In this paper, we propose a novel concept to generalize symbolic actions using a given entity hierarchy and observed similar behavior. In a simulated grid-based kitchen environment, we show that type-generalized actions can be learned from few observations and generalize to novel situations. With an additional on-the-fly generalization mechanism incorporated during planning, unseen task combinations involving longer sequences, novel entities, and unexpected environment behavior can be solved.

* IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2023

Intention estimation from gaze and motion features for human-robot shared-control object manipulation

Aug 18, 2022

Anna Belardinelli, Anirudh Reddy Kondapally, Dirk Ruiken, Daniel Tanneberg, Tomoki Watabe

Abstract: Shared control can help in teleoperated object manipulation by assisting with the execution of the user's intention. To this end, robust and prompt intention estimation is needed, which relies on behavioral observations. Here, an intention estimation framework is presented, which uses natural gaze and motion features to predict the current action and the target object. The system is trained and tested in a simulated environment with pick-and-place sequences produced in a relatively cluttered scene and with both hands, with possible hand-over to the other hand. Validation is conducted across different users and hands, achieving good accuracy and earliness of prediction. An analysis of the predictive power of single features shows the predominance of the grasping trigger and the gaze features in the early identification of the current action. In the current framework, the same probabilistic model can be used for the two hands working in parallel and independently, while a rule-based model is proposed to identify the resulting bimanual action. Finally, limitations and perspectives of this approach to more complex, full-bimanual manipulations are discussed.

* IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2022

Evolutionary Training and Abstraction Yields Algorithmic Generalization of Neural Computers

May 17, 2021

Daniel Tanneberg, Elmar Rueckert, Jan Peters

Abstract: A key feature of intelligent behaviour is the ability to learn abstract strategies that scale and transfer to unfamiliar problems. An abstract strategy solves every sample from a problem class, no matter its representation or complexity -- like algorithms in computer science. Neural networks are powerful models for processing sensory data, discovering hidden patterns, and learning complex functions, but they struggle to learn such iterative, sequential or hierarchical algorithmic strategies. Extending neural networks with external memories has increased their capacities in learning such strategies, but they are still prone to data variations, struggle to learn scalable and transferable solutions, and require massive training data. We present the Neural Harvard Computer (NHC), a memory-augmented network-based architecture that employs abstraction by decoupling algorithmic operations from data manipulations, realized by a split information flow and separated modules. This abstraction mechanism and evolutionary training enable the learning of robust and scalable algorithmic solutions. On a diverse set of 11 algorithms with varying complexities, we show that the NHC reliably learns algorithmic solutions with strong generalization and abstraction: perfect generalization and scaling to arbitrary task configurations and complexities far beyond those seen during training, independent of the data representation and the task domain.

* Nature Machine Intelligence, Vol. 2, December 2020, 753-763

SKID RAW: Skill Discovery from Raw Trajectories

Mar 26, 2021

Daniel Tanneberg, Kai Ploeger, Elmar Rueckert, Jan Peters

Abstract: Integrating robots in complex everyday environments requires a multitude of problems to be solved. One crucial feature among those is to equip robots with a mechanism for teaching them a new task in an easy and natural way. When teaching tasks that involve sequences of different skills, with varying order and number of these skills, it is desirable to only demonstrate full task executions instead of all individual skills. For this purpose, we propose a novel approach that simultaneously learns to segment trajectories into reoccurring patterns and the skills to reconstruct these patterns from unlabelled demonstrations without further supervision. Moreover, the approach learns a skill conditioning that can be used to understand possible sequences of skills, a practical mechanism to be used in, for example, human-robot interactions for a more intelligent and adaptive robot behaviour. The approach, based on Bayesian and variational inference, is evaluated on synthetic and real human demonstrations with varying complexities and dimensionality, showing the successful learning of segmentations and skill libraries from unlabelled data.

* IEEE Robotics and Automation Letters
