Truyen Tran

LaGR-SEQ: Language-Guided Reinforcement Learning with Sample-Efficient Querying

Aug 21, 2023

Thommen George Karimpanal, Laknath Buddhika Semage, Santu Rana, Hung Le, Truyen Tran, Sunil Gupta, Svetha Venkatesh

Abstract: Large language models (LLMs) have recently demonstrated their impressive ability to provide context-aware responses via text. This ability could potentially be used to predict plausible solutions in sequential decision-making tasks pertaining to pattern completion. For example, by observing a partial stack of cubes, LLMs can predict the correct sequence in which the remaining cubes should be stacked by extrapolating the observed patterns (e.g., cube sizes, colors or other attributes) in the partial stack. In this work, we introduce LaGR (Language-Guided Reinforcement learning), which uses this predictive ability of LLMs to propose solutions to tasks that have been partially completed by a primary reinforcement learning (RL) agent, in order to subsequently guide the latter's training. However, as RL training is generally not sample-efficient, deploying this approach would inherently imply that the LLM be repeatedly queried for solutions, a process that can be expensive and infeasible. To address this issue, we introduce SEQ (sample-efficient querying), where we simultaneously train a secondary RL agent to decide when the LLM should be queried for solutions. Specifically, we use the quality of the solutions emanating from the LLM as the reward to train this agent. We show that our proposed framework LaGR-SEQ enables more efficient primary RL training, while simultaneously minimizing the number of queries to the LLM. We demonstrate our approach on a series of tasks and highlight the advantages of our approach, along with its limitations and potential future research directions.

* 18 pages, 11 figures

Persistent-Transient Duality: A Multi-mechanism Approach for Modeling Human-Object Interaction

Jul 24, 2023

Hung Tran, Vuong Le, Svetha Venkatesh, Truyen Tran

Abstract: Humans are highly adaptable, swiftly switching between different modes to progressively handle different tasks, situations and contexts. In human-object interaction (HOI) activities, these modes can be attributed to two mechanisms: (1) the large-scale consistent plan for the whole activity and (2) the small-scale child interactive actions that start and end along the timeline. While neuroscience and cognitive science have confirmed this multi-mechanism nature of human behavior, machine modeling approaches for human motion are trailing behind. Although existing approaches attempt to use gradually morphing structures (e.g., graph attention networks) to model the dynamic HOI patterns, they miss the expeditious and discrete mode-switching nature of human motion. To bridge that gap, this work proposes to model two concurrent mechanisms that jointly control human motion: the Persistent process that runs continually on the global scale, and the Transient sub-processes that operate intermittently on the local context of the human while interacting with objects. These two mechanisms form an interactive Persistent-Transient Duality that synergistically governs the activity sequences. We model this conceptual duality by a parent-child neural network of Persistent and Transient channels with a dedicated neural module for dynamic mechanism switching. The framework is trialed on HOI motion forecasting. On two rich datasets and a wide variety of settings, the model consistently delivers superior performance, proving its suitability for the challenge.

* Accepted at ICCV 2023

Memory-Augmented Theory of Mind Network

Jan 17, 2023

Dung Nguyen, Phuoc Nguyen, Hung Le, Kien Do, Svetha Venkatesh, Truyen Tran

Abstract: Social reasoning necessitates the capacity of theory of mind (ToM), the ability to contextualise and attribute mental states to others without having access to their internal cognitive structure. Recent machine learning approaches to ToM have demonstrated that we can train the observer to read the past and present behaviours of other agents and infer their beliefs (including false beliefs about things that no longer exist), goals, intentions and future actions. Challenges arise when the behavioural space is complex, demanding skilful space navigation for rapidly changing contexts over an extended period. We tackle these challenges by equipping the observer with novel neural memory mechanisms to encode, and hierarchical attention to selectively retrieve, information about others. The memories allow rapid, selective querying of distal related past behaviours of others to deliberatively reason about their current mental state, beliefs and future behaviours. This results in ToMMY, a theory of mind model that learns to reason while making few assumptions about the underlying mental processes. We also construct a new suite of experiments to demonstrate that memories facilitate the learning process and achieve better theory of mind performance, especially for high-demand false-belief tasks that require inferring through multiple steps of changes.

* Accepted for publication at AAAI 2023

Functional Indirection Neural Estimator for Better Out-of-distribution Generalization

Oct 23, 2022

Kha Pham, Hung Le, Man Ngo, Truyen Tran

Abstract: The capacity to achieve out-of-distribution (OOD) generalization is a hallmark of human intelligence and yet remains out of reach for machines. This remarkable capability has been attributed to our ability to form conceptual abstractions and analogies, and to a mechanism known as indirection, which binds two representations and uses one representation to refer to the other. Inspired by these mechanisms, we hypothesize that OOD generalization may be achieved by performing analogy-making and indirection in the functional space instead of the data space as in current methods. To realize this, we design FINE (Functional Indirection Neural Estimator), a neural framework that learns to compose functions that map data input to output on-the-fly. FINE consists of a backbone network and a trainable semantic memory of basis weight matrices. Upon seeing a new input-output data pair, FINE dynamically constructs the backbone weights by mixing the basis weights. The mixing coefficients are indirectly computed through querying a separate corresponding semantic memory using the data pair. We demonstrate empirically that FINE can strongly improve out-of-distribution generalization on IQ tasks that involve geometric transformations. In particular, we train FINE and competing models on IQ tasks using images from the MNIST, Omniglot and CIFAR100 datasets and test on tasks with unseen image classes from the same or different datasets and unseen transformation rules. FINE not only achieves the best performance on all tasks but is also able to adapt to small-scale data scenarios.

* Accepted paper at NeurIPS 2022
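
The weight-mixing step described in the FINE abstract above can be illustrated with a small sketch. This is not the authors' implementation: the abstract only states that the backbone weights are built by mixing basis weight matrices, with coefficients obtained by querying a separate memory. The dot-product scoring, the softmax normalization, and the names `mixing_coefficients`, `mix_basis`, `keys` and `query` are assumptions made purely for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mixing_coefficients(query, keys):
    """Score each memory key against the query (dot product, an assumption)
    and normalize the scores into mixing coefficients."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    return softmax(scores)

def mix_basis(basis, coeffs):
    """Build one weight matrix as a coefficient-weighted sum of basis matrices."""
    rows, cols = len(basis[0]), len(basis[0][0])
    return [
        [sum(c * b[i][j] for c, b in zip(coeffs, basis)) for j in range(cols)]
        for i in range(rows)
    ]

# Two hypothetical 2x2 basis weight matrices: identity and coordinate swap.
basis = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
# Hypothetical memory keys, one per basis matrix.
keys = [[1.0, 0.0], [0.0, 1.0]]
# A query that is equally similar to both keys (stand-in for an encoded data pair).
query = [1.0, 1.0]

coeffs = mixing_coefficients(query, keys)
weights = mix_basis(basis, coeffs)
```

Because the query matches both keys equally, the two basis matrices receive equal coefficients and the constructed weight matrix is their average. In a trained model the keys, the basis matrices and the encoder producing the query would all be learned.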

Momentum Adversarial Distillation: Handling Large Distribution Shifts in Data-Free Knowledge Distillation

Sep 21, 2022

Kien Do, Hung Le, Dung Nguyen, Dang Nguyen, Haripriya Harikumar, Truyen Tran, Santu Rana, Svetha Venkatesh

Abstract: Data-free Knowledge Distillation (DFKD) has attracted attention recently thanks to its appealing capability of transferring knowledge from a teacher network to a student network without using training data. The main idea is to use a generator to synthesize data for training the student. As the generator gets updated, the distribution of synthetic data will change. Such a distribution shift could be large if the generator and the student are trained adversarially, causing the student to forget the knowledge it acquired at previous steps. To alleviate this problem, we propose a simple yet effective method called Momentum Adversarial Distillation (MAD), which maintains an exponential moving average (EMA) copy of the generator and uses synthetic samples from both the generator and the EMA generator to train the student. Since the EMA generator can be considered as an ensemble of the generator's old versions and often undergoes a smaller change in updates compared to the generator, training on its synthetic samples can help the student recall past knowledge and prevent the student from adapting too quickly to new updates of the generator. Our experiments on six benchmark datasets, including large datasets like ImageNet and Places365, demonstrate the superior performance of MAD over competing methods for handling the large distribution shift problem. Our method also compares favorably to existing DFKD methods and even achieves state-of-the-art results in some cases.

* Accepted to NeurIPS 2022
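
The exponential moving average (EMA) copy mentioned in the MAD abstract above follows a standard update rule, sketched below. This is a generic EMA illustration, not the authors' code: the function name `ema_update`, the `decay` value, and the use of flat lists of floats in place of generator network parameters are all assumptions.

```python
def ema_update(ema_params, params, decay=0.9):
    """Standard EMA rule: new_ema = decay * old_ema + (1 - decay) * current."""
    return [decay * e + (1.0 - decay) * p for e, p in zip(ema_params, params)]

# Hypothetical generator parameters; the EMA copy starts as an exact clone.
generator = [0.0, 0.0]
ema_generator = list(generator)

for step in range(3):
    # Stand-in for one adversarial generator update (assumed: each weight moves by +1).
    generator = [w + 1.0 for w in generator]
    # The EMA copy follows the generator, but more slowly.
    ema_generator = ema_update(ema_generator, generator, decay=0.9)
```

After three steps the generator weights have moved to 3.0 while the EMA copy has moved only to about 0.561, which illustrates why samples drawn from the EMA generator change more gradually. In the method described in the abstract, the student would be trained on synthetic samples from both the current generator and its EMA copy.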

Video Dialog as Conversation about Objects Living in Space-Time

Jul 08, 2022

Hoang-Anh Pham, Thao Minh Le, Vuong Le, Tu Minh Phuong, Truyen Tran

Abstract: It would be a technological feat to be able to create a system that can hold a meaningful conversation with humans about what they watch. A setup toward that goal is presented as a video dialog task, where the system is asked to generate natural utterances in response to a question in an ongoing dialog. The task poses great visual, linguistic, and reasoning challenges that cannot be easily overcome without an appropriate representation scheme over video and dialog that supports high-level reasoning. To tackle these challenges, we present a new object-centric framework for video dialog that supports neural reasoning, dubbed COST, which stands for Conversation about Objects in Space-Time. Here, dynamic space-time visual content in videos is first parsed into object trajectories. Given this video abstraction, COST maintains and tracks object-associated dialog states, which are updated upon receiving new questions. Object interactions are dynamically and conditionally inferred for each question, and these serve as the basis for relational reasoning among them. COST also maintains a history of previous answers, and this allows retrieval of relevant object-centric information to enrich the answer-forming process. Language production then proceeds in a step-wise manner, taking into account the context of the current utterance, the existing dialog, and the current question. We evaluate COST on the DSTC7 and DSTC8 benchmarks, demonstrating its competitiveness against state-of-the-art methods.

* Accepted to ECCV 2022, code will be available at https://github.com/hoanganhpham1006/COST

Guiding Visual Question Answering with Attention Priors

May 25, 2022

Thao Minh Le, Vuong Le, Sunil Gupta, Svetha Venkatesh, Truyen Tran

Abstract: The current success of modern visual reasoning systems can arguably be attributed to cross-modality attention mechanisms. However, in deliberative reasoning such as in VQA, attention is unconstrained at each step, and thus may serve as a statistical pooling mechanism rather than a semantic operation intended to select information relevant to inference. This is because at training time, attention is only guided by a very sparse signal (i.e., the answer label) at the end of the inference chain. This causes the cross-modality attention weights to deviate from the desired visual-language bindings. To rectify this deviation, we propose to guide the attention mechanism using explicit linguistic-visual grounding. This grounding is derived by connecting structured linguistic concepts in the query to their referents among the visual objects. Here we learn the grounding from the pairing of questions and images alone, without the need for answer annotation or external grounding supervision. This grounding guides the attention mechanism inside VQA models through a duality of mechanisms: pre-training attention weight calculation and directly guiding the weights at inference time on a case-by-case basis. The resultant algorithm is capable of probing attention-based reasoning models, injecting relevant associative knowledge, and regulating the core reasoning process. This scalable enhancement improves the performance of VQA models, fortifies their robustness to limited access to supervised data, and increases interpretability.

* Preprint, 10 pages

Fast Conditional Network Compression Using Bayesian HyperNetworks

May 13, 2022

Phuoc Nguyen, Truyen Tran, Ky Le, Sunil Gupta, Santu Rana, Dang Nguyen, Trong Nguyen, Shannon Ryan, Svetha Venkatesh

Abstract: We introduce a conditional compression problem and propose a fast framework for tackling it. The problem is how to quickly compress a pretrained large neural network into optimal smaller networks given target contexts, e.g., a context involving only a subset of classes or a context where only limited compute resources are available. To solve this, we propose an efficient Bayesian framework to compress a given large network into a much smaller size tailored to meet each contextual requirement. We employ a hypernetwork to parameterize the posterior distribution of weights given conditional inputs and minimize a variational objective of this Bayesian neural network. To further reduce the network sizes, we propose a new input-output group sparsity factorization of weights to encourage more sparseness in the generated weights. Our methods can quickly generate compressed networks that are significantly smaller than those produced by baseline methods.

* Published as a conference paper at ECML 2021

Persistent-Transient Duality in Human Behavior Modeling

Apr 21, 2022

Hung Tran, Vuong Le, Svetha Venkatesh, Truyen Tran

Abstract: We propose to model the persistent-transient duality in human behavior using a parent-child multi-channel neural network, which features a parent persistent channel that manages the global dynamics and child transient channels that are initiated and terminated on demand to handle detailed interactive actions. The short-lived transient sessions are managed by a proposed Transient Switch. The neural framework is trained to discover the structure of the duality automatically. Our model shows superior performance in human-object interaction motion prediction.

* Accepted at CVPR Precognition Workshop 2022

Learning to Transfer Role Assignment Across Team Sizes

Apr 17, 2022

Dung Nguyen, Phuoc Nguyen, Svetha Venkatesh, Truyen Tran

Figure 1 for Learning to Transfer Role Assignment Across Team Sizes

Figure 2 for Learning to Transfer Role Assignment Across Team Sizes

Figure 3 for Learning to Transfer Role Assignment Across Team Sizes

Figure 4 for Learning to Transfer Role Assignment Across Team Sizes

Abstract: Multi-agent reinforcement learning holds the key to solving complex tasks that demand the coordination of learning agents. However, strong coordination often leads to expensive exploration over the exponentially large state-action space. A powerful approach is to decompose teamwork into roles, which are ideally assigned to agents with the relevant skills. Training agents to adaptively choose and play emerging roles in a team thus allows the team to scale to complex tasks and quickly adapt to changing environments. These promises, however, have not been fully realised by current role-based multi-agent reinforcement learning methods, as they assume either a pre-defined role structure or a fixed team size. We propose a framework to learn role assignment and transfer across team sizes. In particular, we train a role assignment network for small teams by demonstration and transfer the network to larger teams, which continue to learn through interaction with the environment. We demonstrate that re-using the role-based credit assignment structure can foster the learning process of larger reinforcement learning teams to achieve tasks requiring different roles. Our proposal outperforms competing techniques in enriched role-enforcing Prey-Predator games and in new scenarios in the StarCraft II Micro-Management benchmark.

* Accepted for publication at AAMAS 2022
