Abstract: Decision trees are widely used for their interpretability and efficiency, but they struggle in regression tasks that require reliable extrapolation and well-calibrated uncertainty. Piecewise-constant leaf predictions are bounded by the training targets and often become overconfident under distribution shift. We propose a single-tree Bayesian model that extends VSPYCT by equipping each leaf with a Gaussian process (GP) predictor. Bayesian oblique splits provide uncertainty-aware partitioning of the input space, while GP leaves model local functional behaviour and enable principled extrapolation beyond the observed target range. We present an efficient inference and prediction scheme that combines posterior sampling of split parameters with GP posterior predictions, and a gating mechanism that activates GP-based extrapolation when inputs fall outside the training support of a leaf. Experiments on benchmark regression tasks show improved predictive performance compared with standard variational oblique trees, and substantial gains in extrapolation scenarios.
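
The gating idea can be illustrated with a minimal one-dimensional sketch. Everything here is an assumption made for illustration and not the paper's implementation: the class name GatedGPLeaf, the linear-plus-bias kernel, the interval test for "training support", and the choice to return the constant leaf mean inside the support. The oblique splits and posterior sampling of split parameters are omitted.

```python
def linear_kernel(a, b):
    # Linear-plus-bias kernel (assumed): lets the GP follow a trend
    # beyond the observed target range.
    return a * b + 1.0


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented copy of A
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


class GatedGPLeaf:
    """Hypothetical leaf: constant inside its support, GP outside."""

    def __init__(self, xs, ys, noise_var=1e-2):
        self.xs = list(xs)
        self.lo, self.hi = min(xs), max(xs)  # assumed notion of support
        n = len(ys)
        self.mean = sum(ys) / n
        self.var = sum((y - self.mean) ** 2 for y in ys) / n
        # Gram matrix with observation noise on the diagonal.
        self.K = [[linear_kernel(a, b) + (noise_var if i == j else 0.0)
                   for j, b in enumerate(self.xs)]
                  for i, a in enumerate(self.xs)]
        self.alpha = solve(self.K, list(ys))  # (K + noise * I)^-1 y

    def predict(self, x):
        """Return (predictive mean, variance) for a scalar input x."""
        if self.lo <= x <= self.hi:
            # Gate closed: piecewise-constant leaf prediction.
            return self.mean, self.var
        # Gate open: GP posterior mean and latent variance.
        k_star = [linear_kernel(x, xi) for xi in self.xs]
        mean = sum(k * a for k, a in zip(k_star, self.alpha))
        v = solve(self.K, k_star)
        var = linear_kernel(x, x) - sum(k * vi for k, vi in zip(k_star, v))
        return mean, var


leaf = GatedGPLeaf([0.0, 1.0, 2.0, 3.0], [0.0, 2.0, 4.0, 6.0])
inside = leaf.predict(1.5)   # constant prediction, bounded by the targets
outside = leaf.predict(5.0)  # GP prediction, may exceed the target range
```

With targets generated by a linear trend, the prediction outside the support follows the trend past the largest training target, which a constant leaf cannot do.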




Abstract: Similarity between occupations is a crucial piece of information when making career decisions. However, the notion of a single and unified occupation similarity measure is more of a limitation than an asset. The goal of the study is to assess multiple explainable occupation similarity measures that can provide different insights into inter-occupation relations. Several such measures are derived using the framework of bipartite graphs. Their viability is assessed on more than 450,000 job transitions occurring in Slovenia in the period between 2012 and 2021. The results support the hypothesis that several similarity measures are plausible and that they present different feasible career paths. The complete implementation and part of the datasets are available at https://repo.ijs.si/pboskoski/bipartite_job_similarity_code.
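
As a rough illustration of how a bipartite graph can yield an explainable similarity, the sketch below scores two occupations by the overlap of their neighbours on the other side of the graph. The second node set (skills), the toy data, and the choice of the Jaccard index are assumptions for illustration; the abstract does not specify which node sets or measures the study uses.

```python
def jaccard_similarity(graph, a, b):
    """Jaccard overlap of the neighbour sets of occupations a and b.

    graph maps each occupation to the set of nodes it is linked to
    on the other side of the bipartite graph. Returns the score and
    the shared neighbours, which serve as the explanation.
    """
    neighbours_a, neighbours_b = graph[a], graph[b]
    shared = neighbours_a & neighbours_b
    union = neighbours_a | neighbours_b
    score = len(shared) / len(union) if union else 0.0
    return score, sorted(shared)


# Hypothetical toy graph: occupation -> linked skills.
graph = {
    "welder": {"metalworking", "blueprint reading", "safety"},
    "machinist": {"metalworking", "blueprint reading", "cnc programming"},
    "nurse": {"patient care", "safety"},
}

score, why = jaccard_similarity(graph, "welder", "machinist")
```

Returning the shared neighbours alongside the score is what makes such a measure explainable: the reader can see which links drive the similarity.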




Abstract: Mathematical modelling of unemployment dynamics attempts to predict the probability of a job seeker finding a job as a function of time. This is typically achieved by using information in unemployment records. These records are right-censored, making survival analysis a suitable approach for parameter estimation. The proposed model uses a deep artificial neural network (ANN) as a non-linear hazard function. Through embedding, high-cardinality categorical features are analysed efficiently. The posterior distribution of the ANN parameters is estimated using a variational Bayes method. The model is evaluated on a time-to-employment data set spanning 2011 to 2020, provided by the Slovenian public employment service. It is used to determine the employment probability over time for each individual on the record. Similar models could be applied to other questions involving multi-dimensional, high-cardinality categorical data with censored records. Such data are often encountered in personal records, for example in medical records.
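
For orientation, the standard survival-analysis log-likelihood for right-censored data is given below. The notation is assumed here and not taken from the paper: $h_\theta(t \mid x)$ is the hazard (here, the ANN output with parameters $\theta$), $t_i$ is the observed duration for individual $i$, and $\delta_i$ is 1 if employment was observed and 0 if the record is censored. The abstract does not state the paper's exact objective, which under a variational Bayes treatment would also involve the approximate posterior.

```latex
\ell(\theta) = \sum_{i=1}^{N} \left[ \delta_i \log h_\theta(t_i \mid x_i)
  - \int_0^{t_i} h_\theta(s \mid x_i)\, ds \right],
\qquad
S_\theta(t \mid x) = \exp\!\left( -\int_0^{t} h_\theta(s \mid x)\, ds \right)
```

Under this formulation the probability of having found employment by time $t$ is $1 - S_\theta(t \mid x)$, and censored records contribute only through the integral term.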




Abstract: Electrochemical impedance spectroscopy is a widely used tool for the characterization of fuel cells and electrochemical conversion systems in general. When it is applied to on-line monitoring in the context of in-field applications, disturbances, drifts and sensor noise may cause severe distortions in the evaluated spectra, especially in the low-frequency part. Failure to account for these random effects can lead to difficulties in interpreting the spectra and to misleading diagnostic reasoning. In the literature, this fact has been largely ignored. In this paper, we propose a computationally efficient approach to quantifying the spectral uncertainty by quantifying the uncertainty of the equivalent circuit model (ECM) parameters by means of the Variational Bayes (VB) approach. To assess the quality of the VB posterior estimates, we compare the results of the VB approach with those obtained with the Markov Chain Monte Carlo (MCMC) algorithm. The MCMC algorithm is expected to return accurate posterior distributions, whereas the VB approach provides approximate distributions. Using simulated and real data, we show that the VB approximations, although slightly over-optimistic, remain close to the more realistic MCMC estimates. A great advantage of the VB method for online monitoring is its low computational load, which is several orders of magnitude lower than that of MCMC. The performance of the VB algorithm is demonstrated on a case of ECM parameter estimation in a 6-cell solid-oxide fuel cell stack. The complete numerical implementation for recreating the results can be found at https://repo.ijs.si/lznidaric/variational-bayes-supplementary-material.
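
The step from parameter uncertainty to spectral uncertainty can be sketched by pushing parameter samples through an ECM. The sketch below is only an illustration under stated assumptions: the circuit (a series resistance r0 followed by one parallel R-C element), the parameter values, and the independent Gaussian draws standing in for an approximate posterior are all hypothetical, and the paper's actual ECM and VB procedure are in the linked repository.

```python
import math
import random
import statistics


def ecm_impedance(freq_hz, r0, r1, c1):
    """Impedance of r0 in series with a parallel r1-c1 element (assumed ECM)."""
    omega = 2.0 * math.pi * freq_hz
    return r0 + r1 / (1.0 + 1j * omega * r1 * c1)


def spectral_band(freqs, means, stds, n_samples=2000, seed=0):
    """Propagate parameter samples to mean and std of |Z| per frequency.

    means and stds describe independent Gaussians over (r0, r1, c1),
    a stand-in for an approximate posterior (assumption).
    """
    rng = random.Random(seed)
    # Draw the parameter samples once and reuse them at every frequency.
    samples = [tuple(rng.gauss(m, s) for m, s in zip(means, stds))
               for _ in range(n_samples)]
    band = []
    for f in freqs:
        mags = [abs(ecm_impedance(f, *p)) for p in samples]
        band.append((f, statistics.mean(mags), statistics.stdev(mags)))
    return band


# Hypothetical parameter values: ohm, ohm, farad.
band = spectral_band(
    freqs=[0.0, 0.1, 1.0, 10.0],
    means=(1.0, 2.0, 0.5),
    stds=(0.01, 0.02, 0.01),
)
```

Each entry of band holds a frequency together with the mean and standard deviation of the impedance magnitude, which gives a simple uncertainty band around the spectrum.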