Community detection is a challenging and relevant problem in various disciplines of science and engineering like power systems, gene-regulatory networks, social networks, financial networks, astronomy etc. Furthermore, in many of these applications the underlying system is dynamical in nature and because of the complexity of the systems involved, deriving a mathematical model which can be used for clustering and community detection, is often impossible. Moreover, while clustering dynamical systems, it is imperative that the dynamical nature of the underlying system is taken into account. In this paper, we propose a novel approach for clustering dynamical systems purely from time-series data which inherently takes into account the dynamical evolution of the underlying system. In particular, we define a \emph{distance/similarity} measure between the states of the system which is a function of the influence that the states have on each other, and use the proposed measure for clustering of the dynamical system. For data-driven computation we leverage the Koopman operator framework which takes into account the nonlinearities (if present) of the underlying system, thus making the proposed framework applicable to a wide range of application areas. We illustrate the efficacy of the proposed approach by clustering three different dynamical systems, namely, a linear system, which acts like a proof of concept, the highly non-linear IEEE 39 bus transmission network and dynamic variables obtained from atmospheric data over the Amazon rain forest.
Causal analysis and classification of injury severity applying non-parametric methods for traffic crashes has received limited attention. This study presents a methodological framework for causal inference, using Granger causality analysis, and injury severity classification of traffic crashes, occurring on interstates, with different machine learning techniques including decision trees (DT), random forest (RF), extreme gradient boosting (XGBoost), and deep neural network (DNN). The data used in this study were obtained for traffic crashes on all interstates across the state of Texas from a period of six years between 2014 and 2019. The output of the proposed severity classification approach includes three classes for fatal and severe injury (KA) crashes, non-severe and possible injury (BC) crashes, and property damage only (PDO) crashes. While Granger Causality helped identify the most influential factors affecting crash severity, the learning-based models predicted the severity classes with varying performance. The results of Granger causality analysis identified the speed limit, surface and weather conditions, traffic volume, presence of workzones, workers in workzones, and high occupancy vehicle (HOV) lanes, among others, as the most important factors affecting crash severity. The prediction performance of the classifiers yielded varying results across the different classes. Specifically, while decision tree and random forest classifiers provided the greatest performance for PDO and BC severities, respectively, for the KA class, the rarest class in the data, deep neural net classifier performed superior than all other algorithms, most likely due to its capability of approximating nonlinear models. This study contributes to the limited body of knowledge pertaining to causal analysis and classification prediction of traffic crash injury severity using non-parametric approaches.
Since the increasing outspread of COVID-19 in the U.S., with the highest number of confirmed cases and deaths in the world as of September 2020, most states in the country have enforced travel restrictions resulting in sharp reductions in mobility. However, the overall impact and long-term implications of this crisis to travel and mobility remain uncertain. To this end, this study develops an analytical framework that determines and analyzes the most dominant factors impacting human mobility and travel in the U.S. during this pandemic. In particular, the study uses Granger causality to determine the important predictors influencing daily vehicle miles traveled and utilize linear regularization algorithms, including Ridge and LASSO techniques, to model and predict mobility. State-level time-series data were obtained from various open-access sources for the period starting from March 1, 2020 through June 13, 2020 and the entire data set was divided into two parts for training and testing purposes. The variables selected by Granger causality were used to train the three different reduced order models by ordinary least square regression, Ridge regression, and LASSO regression algorithms. Finally, the prediction accuracy of the developed models was examined on the test data. The results indicate that the factors including the number of new COVID cases, social distancing index, population staying at home, percent of out of county trips, trips to different destinations, socioeconomic status, percent of people working from home, and statewide closure, among others, were the most important factors influencing daily VMT. Also, among all the modeling techniques, Ridge regression provides the most superior performance with the least error, while LASSO regression also performed better than the ordinary least square model.