



Abstract: We provide simple schemes to build Bayesian Neural Networks (BNNs), block by block, inspired by a recent idea of computation skeletons. We show how, by adjusting the types of blocks used within the computation skeleton, we can identify interesting relationships with Deep Gaussian Processes (DGPs), deep kernel learning (DKL), random-feature approximations, and other topics. We give strategies to approximate the posterior for such models via doubly stochastic variational inference, which yields uncertainty estimates. We give a detailed theoretical analysis and point out extensions that may be of independent interest. As a special case, we instantiate our procedure to define a Bayesian {\em additive} neural network, a promising strategy for identifying statistical interactions that has direct benefits for obtaining interpretable models.
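
As a rough illustration of two ideas named in this abstract, the following sketch uses standard notation. The symbols $f_j$, $\theta$, $q$, $N$, and $B$ are assumptions introduced here and are not defined in the abstract, and the paper's own construction may differ. In its simplest form (main effects only), an additive network over inputs $x \in \mathbb{R}^d$ could be written as

$$ f(x) = \sum_{j=1}^{d} f_j(x_j), $$

where each $f_j$ is a small sub-network acting on one coordinate; interaction terms would add components that depend on subsets of coordinates. For a variational posterior $q(\theta)$ over the weights $\theta$ and a dataset of $N$ pairs $(x_i, y_i)$, the usual evidence lower bound and a doubly stochastic estimate of it are

$$ \mathcal{L}(q) = \sum_{i=1}^{N} \mathbb{E}_{q(\theta)}\big[\log p(y_i \mid x_i, \theta)\big] - \mathrm{KL}\big(q(\theta) \,\|\, p(\theta)\big) \;\approx\; \frac{N}{|B|} \sum_{i \in B} \log p(y_i \mid x_i, \hat{\theta}) - \mathrm{KL}\big(q(\theta) \,\|\, p(\theta)\big), \qquad \hat{\theta} \sim q(\theta). $$

The two sources of randomness are the minibatch $B \subset \{1, \dots, N\}$ and the Monte Carlo sample $\hat{\theta}$.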




Abstract: Consider a multivariate time series $(X_t)_{t=0}^{T}$ with $X_t \in \mathbb{R}^d$, which may represent spike train responses for multiple neurons in a brain, crime event data across multiple regions, or many other kinds of data. An important challenge associated with these time series models is to estimate an influence network between the $d$ variables, especially in the high-dimensional setting where the number of variables $d$ is large. Prior work has focused on parametric vector auto-regressive models. However, parametric approaches are somewhat restrictive in practice. In this paper, we use the non-parametric sparse additive model (SpAM) framework to address this challenge. Using a combination of $\beta$- and $\phi$-mixing properties of Markov chains and empirical process techniques for reproducing kernel Hilbert spaces (RKHSs), we provide upper bounds on the mean-squared error in terms of the sparsity $s$, the logarithm of the dimension $\log d$, the number of time points $T$, and the smoothness of the RKHSs. Our rates are sharp up to logarithmic factors in many cases. We also provide numerical experiments that support our theoretical results and display potential advantages of using our non-parametric SpAM framework on a Chicago crime dataset.
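
To make the setting concrete, one way a sparse additive auto-regressive model of this kind could be written is sketched below. This is an additive-noise special case chosen for illustration; the symbols $f_{j,k}$, $w_{t+1,j}$, and $\mathcal{H}$ are assumptions not stated in the abstract, and the paper may work with a more general form.

$$ X_{t+1,j} = \sum_{k=1}^{d} f_{j,k}(X_{t,k}) + w_{t+1,j}, \qquad j = 1, \dots, d, \quad t = 0, \dots, T-1, $$

where each component function $f_{j,k}$ lies in an RKHS $\mathcal{H}$, $w_{t+1,j}$ is zero-mean noise, and for each $j$ at most $s$ of the functions $f_{j,1}, \dots, f_{j,d}$ are non-zero. Under this reading, the influence network has an edge from variable $k$ to variable $j$ exactly when $f_{j,k} \neq 0$, and a parametric vector auto-regressive model corresponds to the linear choice $f_{j,k}(x) = A_{jk}\, x$.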