Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Manrui Jiang

DRL-STAF: A Deep Reinforcement Learning Framework for State-Aware Forecasting of Complex Multivariate Hidden Markov Processes

May 14, 2026

Manrui Jiang, Jingru Huang, Yong Chen, Chen Zhang

Abstract:Forecasting multivariate hidden Markov processes is challenging due to nonlinear and nonstationary observations, latent state transitions, and cross-sequence dependencies. While deep learning methods achieve strong predictive accuracy, they typically lack explicit state modeling, whereas Hidden Markov Models (HMMs) provide interpretable latent states but struggle with complex nonlinear emissions and scalability. To address these limitations, we propose DRL-STAF, a Deep Reinforcement Learning based STate-Aware Forecasting framework that jointly predicts next-step observations and estimates the corresponding hidden states for complex multivariate hidden Markov processes. Specifically, DRL-STAF models complex nonlinear emissions using deep neural networks and estimates discrete hidden states using reinforcement learning, reducing the reliance on predefined transition structures and enabling flexible adaptation to diverse temporal dynamics. In particular, DRL-STAF mitigates the state-space explosion encountered by typical multivariate HMM-based methods. Extensive experiments demonstrate that DRL-STAF outperforms HMM variants, standalone deep learning models, and existing DL-HMM hybrids in most cases, while also providing reliable hidden-state estimates.

Via

Access Paper or Ask Questions

Combinatorial Bandit Bayesian Optimization for Tensor Outputs

Jan 31, 2026

Jingru Huang, Haijie Xu, Jie Guo, Manrui Jiang, Chen Zhang

Abstract:Bayesian optimization (BO) has been widely used to optimize expensive and black-box functions across various domains. Existing BO methods have not addressed tensor-output functions. To fill this gap, we propose a novel tensor-output BO method. Specifically, we first introduce a tensor-output Gaussian process (TOGP) with two classes of tensor-output kernels as a surrogate model of the tensor-output function, which can effectively capture the structural dependencies within the tensor. Based on it, we develop an upper confidence bound (UCB) acquisition function to select the queried points. Furthermore, we introduce a more complex and practical problem setting, named combinatorial bandit Bayesian optimization (CBBO), where only a subset of the outputs can be selected to contribute to the objective function. To tackle this, we propose a tensor-output CBBO method, which extends TOGP to handle partially observed outputs, and accordingly design a novel combinatorial multi-arm bandit-UCB2 (CMAB-UCB2) criterion to sequentially select both the queried points and the optimal output subset. Theoretical regret bounds for the two methods are established, ensuring their sublinear performance. Extensive synthetic and real-world experiments demonstrate their superiority.

Via

Access Paper or Ask Questions

Function-on-Function Bayesian Optimization

Nov 16, 2025

Jingru Huang, Haijie Xu, Manrui Jiang, Chen Zhang

Figure 1 for Function-on-Function Bayesian Optimization

Figure 2 for Function-on-Function Bayesian Optimization

Figure 3 for Function-on-Function Bayesian Optimization

Figure 4 for Function-on-Function Bayesian Optimization

Abstract:Bayesian optimization (BO) has been widely used to optimize expensive and gradient-free objective functions across various domains. However, existing BO methods have not addressed the objective where both inputs and outputs are functions, which increasingly arise in complex systems as advanced sensing technologies. To fill this gap, we propose a novel function-on-function Bayesian optimization (FFBO) framework. Specifically, we first introduce a function-on-function Gaussian process (FFGP) model with a separable operator-valued kernel to capture the correlations between function-valued inputs and outputs. Compared to existing Gaussian process models, FFGP is modeled directly in the function space. Based on FFGP, we define a scalar upper confidence bound (UCB) acquisition function using a weighted operator-based scalarization strategy. Then, a scalable functional gradient ascent algorithm (FGA) is developed to efficiently identify the optimal function-valued input. We further analyze the theoretical properties of the proposed method. Extensive experiments on synthetic and real-world data demonstrate the superior performance of FFBO over existing approaches.

* 13 pages, 4 figures, conference

Via

Access Paper or Ask Questions