In this article, a stochastic gradient based online learning algorithm for Extreme Learning Machines (ELM), termed SG-ELM, is developed. A stability criterion based on a Lyapunov approach is used to prove both asymptotic stability of the estimation error and stability of the estimated parameters, making the algorithm suitable for identification of nonlinear dynamic systems. The developed algorithm not only guarantees stability but also reduces the computational demand compared to the OS-ELM approach based on recursive least squares. To demonstrate the effectiveness of the algorithm in a real-world scenario, an advanced combustion engine identification problem is considered. The algorithm is applied to two case studies: online regression learning for system identification of a Homogeneous Charge Compression Ignition (HCCI) engine, and online classification learning (with class imbalance) for identifying the dynamic operating envelope of the HCCI engine. The results indicate that the accuracy of the proposed SG-ELM is comparable to that of the state of the art, while adding stability guarantees and reducing computational effort.
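The general idea of training an ELM online with a gradient step, instead of a recursive least squares update, can be sketched as follows. This is a minimal illustration under stated assumptions, not the SG-ELM algorithm itself: the normalized step below is a standard normalized-LMS rule chosen for simplicity, not the Lyapunov-derived update of the article, and the names `make_hidden_layer`, `sg_elm_fit`, `n_hidden` and `step`, as well as the toy sine target, are hypothetical.

```python
import numpy as np

def make_hidden_layer(n_inputs, n_hidden, rng):
    """Fixed random hidden layer; in an ELM these weights are never trained."""
    W = rng.normal(size=(n_inputs, n_hidden))   # random input weights
    b = rng.normal(size=n_hidden)               # random biases
    return lambda x: np.tanh(x @ W + b)

def sg_elm_fit(X, y, n_hidden=50, step=0.5, seed=0):
    """One online pass over the data, updating only the output weights beta."""
    rng = np.random.default_rng(seed)
    phi = make_hidden_layer(X.shape[1], n_hidden, rng)
    beta = np.zeros(n_hidden)
    for x_k, y_k in zip(X, y):
        h = phi(x_k)                  # hidden activations, shape (n_hidden,)
        err = y_k - h @ beta          # prediction error on the new sample
        # Normalized gradient step (assumed rule, not the article's update).
        beta += step * err * h / (1.0 + h @ h)
    return phi, beta

# Toy usage: learn y = sin(3x) from a stream of samples.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(5000, 1))
y = np.sin(3.0 * X[:, 0])
phi, beta = sg_elm_fit(X, y)
mse = float(np.mean((phi(X) @ beta - y) ** 2))
```

Each update in this sketch costs on the order of `n_hidden` operations, whereas a recursive least squares update maintains an `n_hidden` by `n_hidden` matrix, which is the kind of computational saving the abstract refers to.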
Homogeneous charge compression ignition (HCCI) is a futuristic combustion technology that operates with high fuel efficiency and reduced emissions. HCCI combustion is characterized by complex nonlinear dynamics, which necessitate a model based control approach for automotive application. HCCI engine control is a nonlinear, multi-input multi-output problem with state and actuator constraints, which makes controller design a challenging task. Typical HCCI controllers make use of a first principles based model, which involves a long development time and the cost associated with expert labor and calibration. In this paper, an alternative approach based on machine learning is presented using extreme learning machines (ELM) and nonlinear model predictive control (MPC). A recurrent ELM is used to learn the nonlinear dynamics of the HCCI engine from experimental data and is shown to accurately predict the engine behavior several steps ahead in time, making it suitable for predictive control. Using the ELM engine models, an MPC based control algorithm with a simplified quadratic program update is derived for real time implementation. The operation and effectiveness of the MPC approach are analyzed on a nonlinear HCCI engine model for tracking multiple reference quantities subject to constraints defined by HCCI states, actuators and operational limits.
This paper studies identifiability and convergence behaviors for parameters of multiple types in finite mixtures, and the effects of model fitting with extra mixing components. First, we present a general theory for strong identifiability, which extends the previous work of Nguyen [2013] and Chen [1995] to address a broad range of mixture models and to handle matrix-variate parameters. These models are shown to share the same Wasserstein distance based optimal rates of convergence for the space of mixing distributions: $n^{-1/2}$ under $W_1$ for the exact-fitted setting and $n^{-1/4}$ under $W_2$ for the over-fitted setting, where $n$ is the sample size. This theory, however, is not applicable to several important model classes, including location-scale multivariate Gaussian mixtures, shape-scale Gamma mixtures and location-scale-shape skew-normal mixtures. The second part of this work is devoted to demonstrating that for these "weakly identifiable" classes, algebraic structures of the density family play a fundamental role in determining convergence rates of the model parameters, which display a very rich spectrum of behaviors. For instance, the optimal rate of parameter estimation in an over-fitted location-covariance Gaussian mixture is precisely determined by the order of a solvable system of polynomial equations; these rates deteriorate rapidly as more extra components are added to the model. The established rates for a variety of settings are illustrated by a simulation study.
We present a Bayesian nonparametric framework for multilevel clustering which utilizes group-level context information to simultaneously discover low-dimensional structures of the group contents and partition groups into clusters. Using the Dirichlet process as the building block, our model constructs a product base-measure with a nested structure to accommodate content and context observations at multiple levels. The proposed model possesses properties that link the nested Dirichlet processes (nDP) and the Dirichlet process mixture models (DPM) in an interesting way: integrating out all contents results in the DPM over contexts, whereas integrating out group-specific contexts results in the nDP mixture over content variables. We provide a Polya-urn view of the model and an efficient collapsed Gibbs inference procedure. Extensive experiments on real-world datasets demonstrate the advantage of utilizing context information via our model in both text and image domains.
We propose a general formalism of iterated random functions with semigroup property, under which exact and approximate Bayesian posterior updates can be viewed as specific instances. A convergence theory for iterated random functions is presented. As an application of the general theory we analyze convergence behaviors of exact and approximate message-passing algorithms that arise in a sequential change point detection problem formulated via a latent variable directed graphical model. The sequential inference algorithm and its supporting theory are illustrated by simulated examples.
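To make the idea of a posterior update as an iterated random function concrete, the classical single change point case can be sketched as a recursion $p_n = F(p_{n-1}, x_n)$, where $p_n$ is the posterior probability that the change has occurred by time $n$. This is a minimal sketch under assumptions that are not taken from the paper: a geometric prior with rate `rho` on the change time, unit-variance Gaussian observations whose mean shifts from 0 to `mu`, and the hypothetical function names `posterior_update` and `detect_change`. It does not implement the graphical model or the message-passing algorithms analyzed in the paper.

```python
import numpy as np

def posterior_update(p_prev, x, rho, mu):
    """One step p_n = F(p_{n-1}, x_n) of the exact change-point posterior."""
    # Prior prediction: the change may have happened before, or happens now.
    p_pred = p_prev + (1.0 - p_prev) * rho
    # Likelihood ratio of N(mu, 1) against N(0, 1) at observation x.
    lr = np.exp(mu * x - 0.5 * mu ** 2)
    return p_pred * lr / (p_pred * lr + 1.0 - p_pred)

def detect_change(xs, rho=0.01, mu=3.0, threshold=0.95):
    """Return the first (1-based) time the posterior crosses the threshold."""
    p = 0.0
    for n, x in enumerate(xs, start=1):
        p = posterior_update(p, x, rho, mu)
        if p >= threshold:
            return n
    return None  # no alarm raised

# Toy usage: the mean jumps from 0 to 3 after sample 50 (noise-free for clarity).
xs = np.concatenate([np.zeros(50), np.full(20, 3.0)])
alarm = detect_change(xs)
```

Because each `x` is random in practice, the map applied at every step is itself random, which is what makes the sequence of posteriors an iterated random function.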
Advanced combustion technologies such as homogeneous charge compression ignition (HCCI) engines have a narrow stable operating region defined by complex control strategies such as exhaust gas recirculation (EGR) and variable valve timing, among others. For such systems, it is important to identify the operating envelope, or the boundary of stable operation, for diagnostics and control purposes. Obtaining a good model of the operating envelope using physics becomes intractable owing to engine transient effects. In this paper, a machine learning based approach is employed to identify the stable operating boundary of HCCI combustion directly from experimental data. Owing to the imbalance in class proportions in the data, two approaches are considered. A re-sampling (under-sampling, over-sampling) based approach is used to develop models using existing algorithms, while a cost-sensitive approach is used to modify the learning algorithm without modifying the data set. Support vector machines and the recently developed extreme learning machines are used for model development. Results compared against linear classification methods show that cost-sensitive versions of the ELM and SVM algorithms are well suited to modeling the HCCI operating envelope. The prediction results indicate that the models have the potential to be used for predicting HCCI instability based on sensor measurement history.
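One common way to make an ELM classifier cost-sensitive, without re-sampling the data, is to weight each training sample's squared error by the inverse frequency of its class. The sketch below illustrates that idea on synthetic data. It is an assumption-laden illustration rather than the formulation used in the paper: the weighting scheme, the ridge term, the function name `fit_weighted_elm` and the Gaussian toy data are all hypothetical.

```python
import numpy as np

def fit_weighted_elm(X, t, n_hidden=30, ridge=1e-2, seed=0):
    """Fit an ELM classifier on labels t in {-1, +1}; returns predict(X)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # fixed random input weights
    b = rng.normal(size=n_hidden)                 # fixed random biases
    hidden = lambda Z: np.tanh(Z @ W + b)
    H = hidden(X)
    # Per-sample cost: inverse class frequency, so both classes carry
    # the same total weight regardless of how many samples they have.
    n_pos, n_neg = np.sum(t == 1), np.sum(t == -1)
    c = np.where(t == 1, 1.0 / n_pos, 1.0 / n_neg)
    # Weighted, ridge-regularized least squares for the output weights.
    A = H.T @ (c[:, None] * H) + ridge * np.eye(n_hidden)
    beta = np.linalg.solve(A, H.T @ (c * t))
    return lambda Z: np.where(hidden(Z) @ beta >= 0.0, 1, -1)

# Toy usage: 950 majority samples versus 50 minority samples.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, size=(950, 2)),
               rng.normal(2.0, 1.0, size=(50, 2))])
t = np.concatenate([-np.ones(950), np.ones(50)])
predict = fit_weighted_elm(X, t)
minority_recall = float(np.mean(predict(X[t == 1]) == 1))
```

The data set itself is left untouched here; only the loss changes, which is the distinction the abstract draws between the cost-sensitive and the re-sampling approaches.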
This paper studies convergence behavior of latent mixing measures that arise in finite and infinite mixture models, using transportation distances (i.e., Wasserstein metrics). The relationship between Wasserstein distances on the space of mixing measures and f-divergence functionals such as Hellinger and Kullback-Leibler distances on the space of mixture distributions is investigated in detail using various identifiability conditions. Convergence in Wasserstein metrics for discrete measures implies convergence of individual atoms that provide support for the measures, thereby providing a natural interpretation of convergence of clusters in clustering applications where mixture models are typically employed. Convergence rates of posterior distributions for latent mixing measures are established, for both finite mixtures of multivariate distributions and infinite mixtures based on the Dirichlet process.
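The link between Wasserstein convergence and convergence of atoms can be seen in a small one-dimensional example. For discrete measures on the real line, $W_1$ equals the integral of the absolute difference between the two cumulative distribution functions, which the sketch below computes directly. The one-dimensional restriction and the function name `wasserstein1_discrete` are simplifying assumptions made for illustration; the paper treats general multivariate settings.

```python
import numpy as np

def wasserstein1_discrete(atoms_p, weights_p, atoms_q, weights_q):
    """W_1 between two discrete probability measures on the real line."""
    atoms_p = np.asarray(atoms_p, dtype=float)
    weights_p = np.asarray(weights_p, dtype=float)
    atoms_q = np.asarray(atoms_q, dtype=float)
    weights_q = np.asarray(weights_q, dtype=float)
    # Both CDFs are piecewise constant between consecutive atoms.
    grid = np.sort(np.concatenate([atoms_p, atoms_q]))
    cdf_p = np.array([weights_p[atoms_p <= x].sum() for x in grid[:-1]])
    cdf_q = np.array([weights_q[atoms_q <= x].sum() for x in grid[:-1]])
    # Integrate |F_P - F_Q| over each interval of the grid.
    return float(np.sum(np.abs(cdf_p - cdf_q) * np.diff(grid)))

# Toy usage: shift both atoms of a two-component mixing measure by eps.
distances = {}
for eps in (0.1, 0.01):
    distances[eps] = wasserstein1_discrete(
        [0.0, 1.0], [0.5, 0.5], [0.0 + eps, 1.0 + eps], [0.5, 0.5])
```

In this example the distance equals the shift `eps`, so the mixing measures are close in $W_1$ exactly when their atoms are close.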
We propose a probabilistic formulation that enables sequential detection of multiple change points in a network setting. We present a class of sequential detection rules for certain functionals of change points (minimum among a subset), and prove their asymptotic optimality properties in terms of expected detection delay time. Drawing from graphical model formalism, the sequential detection rules can be implemented by a computationally efficient message-passing protocol which may scale up linearly in network size and in waiting time. The effectiveness of our inference algorithm is demonstrated by simulations.
We consider the problem of analyzing the heterogeneity of clustering distributions for multiple groups of observed data, each of which is indexed by a covariate value, and inferring global clusters arising from observations aggregated over the covariate domain. We propose a novel Bayesian nonparametric method reposing on the formalism of spatial modeling and a nested hierarchy of Dirichlet processes. We provide an analysis of the model properties, relating and contrasting the notions of local and global clusters. We also provide an efficient inference algorithm, and demonstrate the utility of our method in several data examples, including the problem of object tracking and a global clustering analysis of functional data where the functional identity information is not available.