Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jiajun Hu

Maximum Entropy Behavior Exploration for Sim2Real Zero-Shot Reinforcement Learning

Mar 26, 2026

Jiajun Hu, Nuria Armengol Urpi, Jin Cheng, Stelian Coros

Abstract:Zero-shot reinforcement learning (RL) algorithms aim to learn a family of policies from a reward-free dataset, and recover optimal policies for any reward function directly at test time. Naturally, the quality of the pretraining dataset determines the performance of the recovered policies across tasks. However, pre-collecting a relevant, diverse dataset without prior knowledge of the downstream tasks of interest remains a challenge. In this work, we study $\textit{online}$ zero-shot RL for quadrupedal control on real robotic systems, building upon the Forward-Backward (FB) algorithm. We observe that undirected exploration yields low-diversity data, leading to poor downstream performance and rendering policies impractical for direct hardware deployment. Therefore, we introduce FB-MEBE, an online zero-shot RL algorithm that combines an unsupervised behavior exploration strategy with a regularization critic. FB-MEBE promotes exploration by maximizing the entropy of the achieved behavior distribution. Additionally, a regularization critic shapes the recovered policies toward more natural and physically plausible behaviors. We empirically demonstrate that FB-MEBE achieves and improved performance compared to other exploration strategies in a range of simulated downstream tasks, and that it renders natural policies that can be seamlessly deployed to hardware without further finetuning. Videos and code available on our website.

Via

Access Paper or Ask Questions

Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization

Jul 21, 2024

Jiajun Hu, Jian Zhang, Lei Qi, Yinghuan Shi, Yang Gao

Figure 1 for Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization

Figure 2 for Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization

Figure 3 for Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization

Figure 4 for Learn to Preserve and Diversify: Parameter-Efficient Group with Orthogonal Regularization for Domain Generalization

Abstract:Domain generalization (DG) aims to avoid the performance degradation of the model when the distribution shift between the limited training data and unseen test data occurs. Recently, foundation models with enormous parameters have been pre-trained with huge datasets, demonstrating strong generalization ability and showing promising direction for solving the DG problem. However, fully Fine-Tuning (FT) the foundation models results in unsatisfactory out-of-distribution accuracy due to the destroyed pre-trained generalized features. Recently, Parameter-Efficient Fine-Tuning (PEFT) alleviates the above problem by fine-tuning a small portion of the model parameters while keeping the rest frozen, which achieves better generalization performance compared to FT. Nevertheless, PEFT still suffers from the issue of overfitting to the training domains. To address the above issue, we propose Parameter-Efficient Group with Orthogonal regularization (PEGO) for vision transformers, which effectively preserves the generalization ability of the pre-trained network and learns more diverse knowledge compared with conventional PEFT. Specifically, we inject a group of trainable Low-Rank Adaptation (LoRA) modules into the pre-trained model and propose an orthogonal regularization loss to enhance the generalization ability of the model. Our framework achieves SOTA performance on five DG benchmarks, while only requiring training a small number of parameters without adding additional testing cost.

Via

Access Paper or Ask Questions

Generative prediction of flow field based on the diffusion model

Jun 30, 2024

Jiajun Hu, Zhen Lu, Yue Yang

Figure 1 for Generative prediction of flow field based on the diffusion model

Figure 2 for Generative prediction of flow field based on the diffusion model

Figure 3 for Generative prediction of flow field based on the diffusion model

Figure 4 for Generative prediction of flow field based on the diffusion model

Abstract:We propose a geometry-to-flow diffusion model that utilizes the input of obstacle shape to predict a flow field past the obstacle. The model is based on a learnable Markov transition kernel to recover the data distribution from the Gaussian distribution. The Markov process is conditioned on the obstacle geometry, estimating the noise to be removed at each step, implemented via a U-Net. A cross-attention mechanism incorporates the geometry as a prompt. We train the geometry-to-flow diffusion model using a dataset of flows past simple obstacles, including the circle, ellipse, rectangle, and triangle. For comparison, the CNN model is trained using the same dataset. Tests are carried out on flows past obstacles with simple and complex geometries, representing interpolation and extrapolation on the geometry condition, respectively. In the test set, challenging scenarios include a cross and characters `PKU'. Generated flow fields show that the geometry-to-flow diffusion model is superior to the CNN model in predicting instantaneous flow fields and handling complex geometries. Quantitative analysis of the model accuracy and divergence in the fields demonstrate the high robustness of the diffusion model, indicating that the diffusion model learns physical laws implicitly.

Via

Access Paper or Ask Questions

Successive Model-Agnostic Meta-Learning for Few-Shot Fault Time Series Prognosis

Nov 04, 2023

Hai Su, Jiajun Hu, Songsen Yu

Figure 1 for Successive Model-Agnostic Meta-Learning for Few-Shot Fault Time Series Prognosis

Figure 2 for Successive Model-Agnostic Meta-Learning for Few-Shot Fault Time Series Prognosis

Figure 3 for Successive Model-Agnostic Meta-Learning for Few-Shot Fault Time Series Prognosis

Figure 4 for Successive Model-Agnostic Meta-Learning for Few-Shot Fault Time Series Prognosis

Abstract:Meta learning is a promising technique for solving few-shot fault prediction problems, which have attracted the attention of many researchers in recent years. Existing meta-learning methods for time series prediction, which predominantly rely on random and similarity matching-based task partitioning, face three major limitations: (1) feature exploitation inefficiency; (2) suboptimal task data allocation; and (3) limited robustness with small samples. To overcome these limitations, we introduce a novel 'pseudo meta-task' partitioning scheme that treats a continuous time period of a time series as a meta-task, composed of multiple successive short time periods. Employing continuous time series as pseudo meta-tasks allows our method to extract more comprehensive features and relationships from the data, resulting in more accurate predictions. Moreover, we introduce a differential algorithm to enhance the robustness of our method across different datasets. Through extensive experiments on several fault and time series prediction datasets, we demonstrate that our approach substantially enhances prediction performance and generalization capability under both few-shot and general conditions.

Via

Access Paper or Ask Questions