Deep neural networks are increasingly utilized in mobility prediction tasks, yet their intricate internal workings pose challenges for interpretability, especially in comprehending how various aspects of mobility behavior affect predictions. In this study, we introduce a causal intervention framework to assess the impact of mobility-related factors on neural networks designed for next location prediction -- a task focusing on predicting the immediate next location of an individual. To achieve this, we employ individual mobility models to generate synthetic location visit sequences and control behavior dynamics by intervening in their data generation process. We evaluate the interventional location sequences using mobility metrics and input them into well-trained networks to analyze performance variations. The results demonstrate the effectiveness in producing location sequences with distinct mobility behaviors, thus facilitating the simulation of diverse spatial and temporal changes. These changes result in performance fluctuations in next location prediction networks, revealing impacts of critical mobility behavior factors, including sequential patterns in location transitions, proclivity for exploring new locations, and preferences in location choices at population and individual levels. The gained insights hold significant value for the real-world application of mobility prediction networks, and the framework is expected to promote the use of causal inference for enhancing the interpretability and robustness of neural networks in mobility applications.
Concerns about data privacy are omnipresent, given the increasing usage of digital applications and their underlying business model that includes selling user data. Location data is particularly sensitive since they allow us to infer activity patterns and interests of users, e.g., by categorizing visited locations based on nearby points of interest (POI). On top of that, machine learning methods provide new powerful tools to interpret big data. In light of these considerations, we raise the following question: What is the actual risk that realistic, machine learning based privacy attacks can obtain meaningful semantic information from raw location data, subject to inaccuracies in the data? In response, we present a systematic analysis of two attack scenarios, namely location categorization and user profiling. Experiments on the Foursquare dataset and tracking data demonstrate the potential for abuse of high-quality spatial information, leading to a significant privacy loss even with location inaccuracy of up to 200m. With location obfuscation of more than 1 km, spatial information hardly adds any value, but a high privacy risk solely from temporal information remains. The availability of public context data such as POIs plays a key role in inference based on spatial information. Our findings point out the risks of ever-growing databases of tracking data and spatial context data, which policymakers should consider for privacy regulations, and which could guide individuals in their personal location protection measures.
Despite the strong predictive performance of deep learning models for traffic prediction, their widespread deployment in real-world intelligent transportation systems has been restrained by a lack of interpretability. Uncertainty quantification (UQ) methods provide an approach to induce probabilistic reasoning, improve decision-making and enhance model deployment potential. To gain a comprehensive picture of the usefulness of existing UQ methods for traffic prediction and the relation between obtained uncertainties and city-wide traffic dynamics, we investigate their application to a large-scale image-based traffic dataset spanning multiple cities and time periods. We compare two epistemic and two aleatoric UQ methods on both temporal and spatio-temporal transfer tasks, and find that meaningful uncertainty estimates can be recovered. We further demonstrate how uncertainty estimates can be employed for unsupervised outlier detection on changes in city traffic dynamics. We find that our approach can capture both temporal and spatial effects on traffic behaviour in a representative case study for the city of Moscow. Our work presents a further step towards boosting uncertainty awareness in traffic prediction tasks, and aims to highlight the value contribution of UQ methods to a better understanding of city traffic dynamics.
Detecting travel modes from global navigation satellite system (GNSS) trajectories is essential for understanding individual travel behaviour and a prerequisite for achieving sustainable transport systems. While studies have acknowledged the benefits of incorporating geospatial context information into travel mode detection models, few have summarized context modelling approaches and analyzed the significance of these context features, hindering the development of an efficient model. Here, we identify context representations from related work and propose an analytical pipeline to assess the contribution of geospatial context information for travel mode detection based on a random forest model and the SHapley Additive exPlanation (SHAP) method. Through experiments on a large-scale GNSS tracking dataset, we report that features describing relationships with infrastructure networks, such as the distance to the railway or road network, significantly contribute to the model's prediction. Moreover, features related to the geospatial point entities help identify public transport travel, but most land-use and land-cover features barely contribute to the task. We finally reveal that geospatial contexts have distinct contributions in identifying different travel modes, providing insights into selecting appropriate context information and modelling approaches. The results from this study enhance our understanding of the relationship between movement and geospatial context and guide the implementation of effective and efficient transport mode detection models.
In recent years, car-sharing services have emerged as viable alternatives to private individual mobility, promising more sustainable and resource-efficient, but still comfortable transportation. Research on short-term prediction and optimization methods has improved operations and fleet control of car-sharing services; however, long-term projections and spatial analysis are sparse in the literature. We propose to analyze the average monthly demand in a station-based car-sharing service with spatially-aware learning algorithms that offer high predictive performance as well as interpretability. In particular, we compare the spatially-implicit Random Forest model with spatially-aware methods for predicting average monthly per-station demand. The study utilizes a rich set of socio-demographic, location-based (e.g., POIs), and car-sharing-specific features as input, extracted from a large proprietary car-sharing dataset and publicly available datasets. We show that the global Random Forest model with geo-coordinates as an input feature achieves the highest predictive performance with an R-squared score of 0.87, while local methods such as Geographically Weighted Regression perform almost on par and additionally yield exciting insights into the heterogeneous spatial distributions of factors influencing car-sharing behaviour. Additionally, our study offers effective as well as highly interpretable methods for diagnosing and planning the placement of car-sharing stations.
Accurate activity location prediction is a crucial component of many mobility applications and is particularly required to develop personalized, sustainable transportation systems. Despite the widespread adoption of deep learning models, next location prediction models lack a comprehensive discussion and integration of mobility-related spatio-temporal contexts. Here, we utilize a multi-head self-attentional (MHSA) neural network that learns location transition patterns from historical location visits, their visit time and activity duration, as well as their surrounding land use functions, to infer an individual's next location. Specifically, we adopt point-of-interest data and latent Dirichlet allocation for representing locations' land use contexts at multiple spatial scales, generate embedding vectors of the spatio-temporal features, and learn to predict the next location with an MHSA network. Through experiments on two large-scale GNSS tracking datasets, we demonstrate that the proposed model outperforms other state-of-the-art prediction models, and reveal the contribution of various spatio-temporal contexts to the model's performance. Moreover, we find that the model trained on population data achieves higher prediction performance with fewer parameters than individual-level models due to learning from collective movement patterns. We also reveal mobility conducted in the recent past and one week before has the largest influence on the current prediction, showing that learning from a subset of the historical mobility is sufficient to obtain an accurate location prediction result. We believe that the proposed model is vital for context-aware mobility prediction. The gained insights will help to understand location prediction models and promote their implementation for mobility applications.
Artificial intelligence (AI) is revolutionizing many areas of our lives, leading a new era of technological advancement. Particularly, the transportation sector would benefit from the progress in AI and advance the development of intelligent transportation systems. Building intelligent transportation systems requires an intricate combination of artificial intelligence and mobility analysis. The past few years have seen rapid development in transportation applications using advanced deep neural networks. However, such deep neural networks are difficult to interpret and lack robustness, which slows the deployment of these AI-powered algorithms in practice. To improve their usability, increasing research efforts have been devoted to developing interpretable and robust machine learning methods, among which the causal inference approach recently gained traction as it provides interpretable and actionable information. Moreover, most of these methods are developed for image or sequential data which do not satisfy specific requirements of mobility data analysis. This vision paper emphasizes research challenges in deep learning-based mobility analysis that require interpretability and robustness, summarizes recent developments in using causal inference for improving the interpretability and robustness of machine learning methods, and highlights opportunities in developing causally-enabled machine learning models tailored for mobility analysis. This research direction will make AI in the transportation sector more interpretable and reliable, thus contributing to safer, more efficient, and more sustainable future transportation systems.
Predicting the next visited location of an individual is a key problem in human mobility analysis, as it is required for the personalization and optimization of sustainable transport options. Here, we propose a transformer decoder-based neural network to predict the next location an individual will visit based on historical locations, time, and travel modes, which are behaviour dimensions often overlooked in previous work. In particular, the prediction of the next travel mode is designed as an auxiliary task to help guide the network's learning. For evaluation, we apply this approach to two large-scale and long-term GPS tracking datasets involving more than 600 individuals. Our experiments show that the proposed method significantly outperforms other state-of-the-art next location prediction methods by a large margin (8.05% and 5.60% relative increase in F1-score for the two datasets, respectively). We conduct an extensive ablation study that quantifies the influence of considering temporal features, travel mode information, and the auxiliary task on the prediction results. Moreover, we experimentally determine the performance upper bound when including the next mode prediction in our model. Finally, our analysis indicates that the performance of location prediction varies significantly with the chosen next travel mode by the individual. These results show potential for a more systematic consideration of additional dimensions of travel behaviour in human mobility prediction tasks. The source code of our model and experiments is available at https://github.com/mie-lab/location-mode-prediction.
The IARAI Traffic4cast competitions at NeurIPS 2019 and 2020 showed that neural networks can successfully predict future traffic conditions 1 hour into the future on simply aggregated GPS probe data in time and space bins. We thus reinterpreted the challenge of forecasting traffic conditions as a movie completion task. U-Nets proved to be the winning architecture, demonstrating an ability to extract relevant features in this complex real-world geo-spatial process. Building on the previous competitions, Traffic4cast 2021 now focuses on the question of model robustness and generalizability across time and space. Moving from one city to an entirely different city, or moving from pre-COVID times to times after COVID hit the world thus introduces a clear domain shift. We thus, for the first time, release data featuring such domain shifts. The competition now covers ten cities over 2 years, providing data compiled from over 10^12 GPS probe data. Winning solutions captured traffic dynamics sufficiently well to even cope with these complex domain shifts. Surprisingly, this seemed to require only the previous 1h traffic dynamic history and static road graph as input.
Advances in traffic forecasting technology can greatly impact urban mobility. In the traffic4cast competition, the task of short-term traffic prediction is tackled in unprecedented detail, with traffic volume and speed information available at 5 minute intervals and high spatial resolution. To improve generalization to unknown cities, as required in the 2021 extended challenge, we propose to predict small quadratic city sections, rather than processing a full-city-raster at once. At test time, breaking down the test data into spatially-cropped overlapping snippets improves stability and robustness of the final predictions, since multiple patches covering one cell can be processed independently. With the performance on the traffic4cast test data and further experiments on a validation set it is shown that patch-wise prediction indeed improves accuracy. Further advantages can be gained with a Unet++ architecture and with an increasing number of patches per sample processed at test time. We conclude that our snippet-based method, combined with other successful network architectures proposed in the competition, can leverage performance, in particular on unseen cities. All source code is available at https://github.com/NinaWie/NeurIPS2021-traffic4cast.