Thanh Vinh Vo

Hierarchical Molecular Representation Learning via Fragment-Based Self-Supervised Embedding Prediction

Feb 23, 2026

Jiele Wu, Haozhe Ma, Zhihan Guo, Thanh Vinh Vo, Tze Yun Leong

Abstract:Graph self-supervised learning (GSSL) has demonstrated strong potential for generating expressive graph embeddings without the need for human annotations, making it particularly valuable in domains with high labeling costs such as molecular graph analysis. However, existing GSSL methods mostly focus on node- or edge-level information, often ignoring chemically relevant substructures which strongly influence molecular properties. In this work, we propose Graph Semantic Predictive Network (GraSPNet), a hierarchical self-supervised framework that explicitly models both atomic-level and fragment-level semantics. GraSPNet decomposes molecular graphs into chemically meaningful fragments without predefined vocabularies and learns node- and fragment-level representations through multi-level message passing with masked semantic prediction at both levels. This hierarchical semantic supervision enables GraSPNet to learn multi-resolution structural information that is both expressive and transferable. Extensive experiments on multiple molecular property prediction benchmarks demonstrate that GraSPNet learns chemically meaningful representations and consistently outperforms state-of-the-art GSSL methods in transfer learning settings.

* 15 pages (8 pages main text), 8 figures

The Llama 4 Herd: Architecture, Training, Evaluation, and Deployment Notes

Jan 15, 2026

Aaron Adcock, Aayushi Srivastava, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pande, Abhinav Pandey, Abhinav Sharma, Abhishek Kadian, Abhishek Kumawat, Adam Kelsey(+1295 more)

Abstract:This document consolidates publicly reported technical details about Meta's Llama 4 model family. It summarizes (i) released variants (Scout and Maverick) and the broader herd context including the previewed Behemoth teacher model, (ii) architectural characteristics beyond a high-level MoE description covering routed/shared-expert structure, early-fusion multimodality, and long-context design elements reported for Scout (iRoPE and length generalization strategies), (iii) training disclosures spanning pre-training, mid-training for long-context extension, and post-training methodology (lightweight SFT, online RL, and lightweight DPO) as described in release materials, (iv) developer-reported benchmark results for both base and instruction-tuned checkpoints, and (v) practical deployment constraints observed across major serving environments, including provider-specific context limits and quantization packaging. The manuscript also summarizes licensing obligations relevant to redistribution and derivative naming, and reviews publicly described safeguards and evaluation practices. The goal is to provide a compact technical reference for researchers and practitioners who need precise, source-backed facts about Llama 4.

* 15 pages

Causal Policy Learning in Reinforcement Learning: Backdoor-Adjusted Soft Actor-Critic

Jun 05, 2025

Thanh Vinh Vo, Young Lee, Haozhe Ma, Chien Lu, Tze-Yun Leong

Abstract:Hidden confounders that influence both states and actions can bias policy learning in reinforcement learning (RL), leading to suboptimal or non-generalizable behavior. Most RL algorithms ignore this issue, learning policies from observational trajectories based solely on statistical associations rather than causal effects. We propose DoSAC (Do-Calculus Soft Actor-Critic with Backdoor Adjustment), a principled extension of the SAC algorithm that corrects for hidden confounding via causal intervention estimation. DoSAC estimates the interventional policy $\pi(a | \mathrm{do}(s))$ using the backdoor criterion, without requiring access to true confounders or causal labels. To achieve this, we introduce a learnable Backdoor Reconstructor that infers pseudo-past variables (previous state and action) from the current state to enable backdoor adjustment from observational data. This module is integrated into a soft actor-critic framework to compute both the interventional policy and its entropy. Empirical results on continuous control benchmarks show that DoSAC outperforms baselines under confounded settings, with improved robustness, generalization, and policy reliability.

* Preprint
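
The backdoor adjustment that the abstract refers to can be illustrated on a toy discrete problem. The sketch below is not DoSAC itself: it assumes a single observed binary confounder `Z` with made-up probabilities, whereas the paper describes a learned reconstructor of pseudo-past variables. It only shows how the adjusted quantity, the sum over `z` of `P(Y=1 | X=1, z) P(z)`, can differ from the naive conditional `P(Y=1 | X=1)`.

```python
import numpy as np

# Hypothetical probabilities, chosen only for illustration.
p_z = np.array([0.5, 0.5])              # P(Z=z) for z in {0, 1}
p_x1_given_z = np.array([0.2, 0.8])     # P(X=1 | Z=z)
p_y1_given_x1_z = np.array([0.3, 0.7])  # P(Y=1 | X=1, Z=z)

# Naive conditional: weights each stratum by P(z | X=1), so the
# confounder's influence on X leaks into the estimate.
p_z_given_x1 = p_x1_given_z * p_z / (p_x1_given_z * p_z).sum()
naive = float((p_y1_given_x1_z * p_z_given_x1).sum())

# Backdoor adjustment: weights each stratum by the marginal P(z).
adjusted = float((p_y1_given_x1_z * p_z).sum())

print(f"P(Y=1 | X=1)     = {naive:.2f}")
print(f"P(Y=1 | do(X=1)) = {adjusted:.2f}")
```

With these numbers the naive estimate is 0.62 while the adjusted one is 0.50, because `Z` raises both the chance of `X=1` and the chance of `Y=1`.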

Knowledge Sharing and Transfer via Centralized Reward Agent for Multi-Task Reinforcement Learning

Aug 20, 2024

Haozhe Ma, Zhengding Luo, Thanh Vinh Vo, Kuankuan Sima, Tze-Yun Leong

Abstract:Reward shaping is effective in addressing the sparse-reward challenge in reinforcement learning by providing immediate feedback through auxiliary informative rewards. Based on the reward shaping strategy, we propose a novel multi-task reinforcement learning framework that integrates a centralized reward agent (CRA) and multiple distributed policy agents. The CRA functions as a knowledge pool, which aims to distill knowledge from various tasks and distribute it to individual policy agents to improve learning efficiency. Specifically, the shaped rewards serve as a straightforward metric to encode knowledge. This framework not only enhances knowledge sharing across established tasks but also adapts to new tasks by transferring valuable reward signals. We validate the proposed method on both discrete and continuous domains, demonstrating its robustness in multi-task sparse-reward settings and its effective transferability to unseen tasks.

Highly Efficient Self-Adaptive Reward Shaping for Reinforcement Learning

Aug 07, 2024

Haozhe Ma, Zhengding Luo, Thanh Vinh Vo, Kuankuan Sima, Tze-Yun Leong

Abstract:Reward shaping addresses the challenge of sparse rewards in reinforcement learning by constructing denser and more informative reward signals. To achieve self-adaptive and highly efficient reward shaping, we propose a novel method that incorporates success rates derived from historical experiences into shaped rewards. Our approach utilizes success rates sampled from Beta distributions, which dynamically evolve from uncertain to reliable values as more data is collected. Initially, the self-adaptive success rates exhibit more randomness to encourage exploration. Over time, they become more certain to enhance exploitation, thus achieving a better balance between exploration and exploitation. We employ Kernel Density Estimation (KDE) combined with Random Fourier Features (RFF) to derive the Beta distributions, resulting in a computationally efficient implementation in high-dimensional continuous state spaces. This method provides a non-parametric and learning-free approach. The proposed method is evaluated on a wide range of continuous control tasks with sparse and delayed rewards, demonstrating significant improvements in sample efficiency and convergence stability compared to relevant baselines.
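
The shift from uncertain to reliable success rates described in the abstract can be sketched with plain counts. This is a simplified illustration, not the paper's implementation: it assumes success and failure counts are already available and uses a uniform Beta(1, 1) prior, while the paper derives the Beta parameters with KDE and Random Fourier Features. The function name `sample_success_rate` is hypothetical.

```python
import numpy as np

def sample_success_rate(successes, failures, rng, size=1):
    """Draw success rates from Beta(successes + 1, failures + 1).

    The +1 terms encode an assumed uniform prior.
    """
    return rng.beta(successes + 1.0, failures + 1.0, size=size)

rng = np.random.default_rng(0)

# Few observations: samples are widely spread, which favors exploration.
early = sample_success_rate(1, 1, rng, size=10_000)

# Many observations at the same ratio: samples concentrate near 0.5,
# which favors exploitation.
late = sample_success_rate(100, 100, rng, size=10_000)

print(f"early std = {early.std():.3f}, late std = {late.std():.3f}")
```

Both sets of samples have the same mean, but the spread shrinks as evidence accumulates, which is the exploration-to-exploitation behavior the abstract describes.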

Decoupled Prompt-Adapter Tuning for Continual Activity Recognition

Jul 20, 2024

Di Fu, Thanh Vinh Vo, Haozhe Ma, Tze-Yun Leong

Abstract:Action recognition technology plays a vital role in enhancing security through surveillance systems, enabling better patient monitoring in healthcare, providing in-depth performance analysis in sports, and facilitating seamless human-AI collaboration in domains such as manufacturing and assistive technologies. The dynamic nature of data in these areas underscores the need for models that can continuously adapt to new video data without losing previously acquired knowledge, highlighting the critical role of advanced continual action recognition. To address these challenges, we propose Decoupled Prompt-Adapter Tuning (DPAT), a novel framework that integrates adapters for capturing spatial-temporal information and learnable prompts for mitigating catastrophic forgetting through a decoupled training strategy. DPAT uniquely balances the generalization benefits of prompt tuning with the plasticity provided by adapters in pretrained vision models, effectively addressing the challenge of maintaining model performance amidst continuous data evolution without necessitating extensive finetuning. DPAT consistently achieves state-of-the-art performance across several challenging action recognition benchmarks, thus demonstrating the effectiveness of our model in the domain of continual action recognition.

Federated Learning of Causal Effects from Incomplete Observational Data

Aug 24, 2023

Thanh Vinh Vo, Young Lee, Tze-Yun Leong

Abstract:Decentralized and incomplete data sources are prevalent in real-world applications, posing a formidable challenge for causal inference. These sources cannot be consolidated into a single entity owing to privacy constraints, and the presence of missing values within them can potentially introduce bias to the causal estimands. We introduce a new approach for federated causal inference from incomplete data, enabling the estimation of causal effects from multiple decentralized and incomplete data sources. Our approach disentangles the loss function into multiple components, each corresponding to a specific data source with missing values. Our approach accounts for the missing data under the missing at random assumption, while also estimating higher-order statistics of the causal estimands. Our method recovers the conditional distribution of missing confounders given the observed confounders from the decentralized data sources to identify causal effects. Our framework estimates heterogeneous causal effects without the sharing of raw training data among sources, which helps to mitigate privacy risks. The efficacy of our approach is demonstrated through a collection of simulated and real-world instances, illustrating its potential and practicality.

* Preprint

An Adaptive Kernel Approach to Federated Learning of Heterogeneous Causal Effects

Jan 01, 2023

Thanh Vinh Vo, Arnab Bhattacharyya, Young Lee, Tze-Yun Leong

Abstract:We propose a new causal inference framework to learn causal effects from multiple, decentralized data sources in a federated setting. We introduce an adaptive transfer algorithm that learns the similarities among the data sources by utilizing Random Fourier Features to disentangle the loss function into multiple components, each of which is associated with a data source. The data sources may have different distributions; the causal effects are independently and systematically incorporated. The proposed method estimates the similarities among the sources through transfer coefficients, and hence requires no prior information about the similarity measures. The heterogeneous causal effects can be estimated with no sharing of the raw training data among the sources, thus minimizing the risk of privacy leakage. We also provide minimax lower bounds to assess the quality of the parameters learned from the disparate sources. The proposed method is empirically shown to outperform the baselines on decentralized data sources with dissimilar distributions.

* NeurIPS 2022
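
Random Fourier Features, which the abstract uses to split the loss across sources, replace a kernel evaluation with an inner product of explicit random features. The sketch below shows only that generic approximation for an RBF kernel, not the paper's federated algorithm; the function names, the lengthscale of 1.0, and the feature count of 5000 are assumptions made for illustration.

```python
import numpy as np

def rff_features(X, n_features, lengthscale, rng):
    """Map X (n, d) to random features whose inner products
    approximate the RBF kernel exp(-||x - y||^2 / (2 * lengthscale^2))."""
    d = X.shape[1]
    W = rng.normal(scale=1.0 / lengthscale, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def rbf_kernel(X, Y, lengthscale):
    """Exact RBF kernel matrix, used here only as a reference."""
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))

Z = rff_features(X, n_features=5000, lengthscale=1.0, rng=rng)
approx = Z @ Z.T                      # kernel estimate from features
exact = rbf_kernel(X, X, lengthscale=1.0)
max_err = float(np.abs(approx - exact).max())

print(f"max absolute error = {max_err:.4f}")
```

Because the kernel becomes a sum over feature dimensions, quantities built from it can be computed from per-source feature statistics rather than from raw data points, which is what makes the technique attractive in a decentralized setting.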

Adaptive Multi-Source Causal Inference

May 31, 2021

Thanh Vinh Vo, Pengfei Wei, Trong Nghia Hoang, Tze-Yun Leong

Abstract:Data scarcity is a tremendous challenge in causal effect estimation. In this paper, we propose to exploit additional data sources to facilitate estimating causal effects in the target population. Specifically, we leverage additional source datasets which share similar causal mechanisms with the target observations to help infer causal effects of the target population. We propose three levels of knowledge transfer, through modelling the outcomes, treatments, and confounders. To achieve consistent positive transfer, we introduce learnable parametric transfer factors to adaptively control the transfer strength, thereby achieving a fair and balanced knowledge transfer between the sources and the target. The proposed method can infer causal effects in the target population without prior knowledge of data discrepancy between the additional data sources and the target. Experiments on both synthetic and real-world datasets show the effectiveness of the proposed method as compared with recent baselines.

* Preprint

Federated Estimation of Causal Effects from Observational Data

May 31, 2021

Thanh Vinh Vo, Trong Nghia Hoang, Young Lee, Tze-Yun Leong

Abstract:Many modern applications collect data in a federated manner, with data kept locally and undisclosed. To date, most approaches to causal inference require data to be stored in a central repository. We present a novel framework for causal inference with federated data sources. We assess and integrate local causal effects from different private data sources without centralizing them. We then estimate the treatment effects on subjects from observational data using a non-parametric reformulation of the classical potential outcomes framework. We model the potential outcomes as a random function distributed by Gaussian processes, whose defining parameters can be efficiently learned from multiple data sources while respecting privacy constraints. We demonstrate the promise and efficiency of the proposed approach through a set of simulated and real-world benchmark examples.

* Preprint
