Abstract: Data on citywide street-segment traffic volumes are essential for urban planning and sustainable mobility management. Yet such data are available only for a limited subset of streets due to the high costs of sensor deployment and maintenance. Traffic volumes on the remaining network are therefore interpolated based on existing sensor measurements. However, current sensor locations are often determined by administrative priorities rather than by data-driven optimization, leading to biased coverage and reduced estimation performance. This study provides a large-scale, real-world benchmarking of easily implementable, data-driven strategies for optimizing the placement of permanent and temporary traffic sensors, using segment-level data from Berlin (Strava bicycle counts) and Manhattan (taxi counts). It compares spatial placement strategies based on network centrality, spatial coverage, feature coverage, and active learning. In addition, the study examines temporal deployment schemes for temporary sensors. The findings highlight that spatial placement strategies that emphasize even spatial coverage and employ active learning achieve the lowest prediction errors. With only 10 sensors, they reduce the mean absolute error by over 60% in Berlin and 70% in Manhattan compared to alternatives. Temporal deployment choices further improve performance: distributing measurements evenly across weekdays reduces error by an additional 7% in Berlin and 21% in Manhattan. Together, these spatial and temporal principles allow temporary deployments to closely approximate the performance of optimally placed permanent deployments. From a policy perspective, the results indicate that cities can substantially improve data usefulness by adopting data-driven sensor placement strategies, while retaining flexibility in choosing between temporary and permanent deployments.
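As an illustration of what "even spatial coverage" could mean in practice, the following minimal sketch uses greedy farthest-point sampling: each new sensor goes to the candidate segment that is farthest from all sensors chosen so far. This is an assumption made for illustration; the abstract does not say that the study uses this exact procedure. The function name `farthest_point_placement`, the planar coordinates, and the choice of starting segment are all hypothetical.

```python
import math


def farthest_point_placement(coords, k, start=0):
    """Greedily choose k sensor locations that spread out over space.

    coords: list of (x, y) segment midpoints in a planar coordinate system
            (hypothetical input; real data would need projected coordinates).
    k:      number of sensors to place.
    start:  index of the first sensor (an arbitrary choice in this sketch).
    Returns the chosen indices in the order they were selected.
    """
    if not 0 < k <= len(coords):
        raise ValueError("k must be between 1 and the number of candidates")

    chosen = [start]
    # Distance from every candidate to its nearest already-chosen sensor.
    nearest = [math.dist(c, coords[start]) for c in coords]

    while len(chosen) < k:
        # Pick the candidate that is currently worst covered.
        nxt = max(range(len(coords)), key=lambda i: nearest[i])
        chosen.append(nxt)
        # Update coverage distances with the newly placed sensor.
        for i, c in enumerate(coords):
            nearest[i] = min(nearest[i], math.dist(c, coords[nxt]))

    return chosen
```

For four candidate segments on a line at x = 0, 1, 10, and 5, placing three sensors starting from the first segment selects indices 0, 2, and 3: the far end first, then the midpoint between the two.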
Abstract: Reliable street-level traffic volume data, covering multiple modes of transportation, supports urban planning by informing decisions on infrastructure improvements, traffic management, and public transportation. Yet sensors measuring traffic volume are typically sparsely deployed because of their high deployment and maintenance costs. To address this, interpolation methods can estimate traffic volumes at unobserved locations using available data. Graph Neural Networks have shown strong performance in traffic volume forecasting, particularly on highways and major arterial networks. Applying them to urban settings, however, presents unique challenges: urban networks exhibit greater structural diversity, traffic volumes are highly overdispersed with many zeros, the best way to account for spatial dependencies remains unclear, and sensor coverage is often very sparse. We introduce the Graph Neural Network for Urban Interpolation (GNNUI), a novel urban traffic volume estimation approach. GNNUI employs a masking algorithm to learn interpolation, integrates node features to capture functional roles, and uses a loss function tailored to zero-inflated traffic distributions. In addition to the model, we introduce two new open, large-scale urban traffic volume benchmarks covering different transportation modes: Strava cycling data from Berlin and New York City taxi data. GNNUI outperforms recent interpolation methods, some of them graph-based, across metrics (MAE, RMSE, true-zero rate, Kullback-Leibler divergence) and remains robust as sensor coverage drops from 90% to 1%. On Strava, for instance, MAE rises only from 7.1 to 10.5, and on Taxi from 23.0 to 40.4, demonstrating strong performance under extreme data scarcity, which is common in real-world urban settings. We also examine how graph connectivity choices influence model accuracy.
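To make the idea of a loss "tailored to zero-inflated traffic distributions" more concrete, the sketch below computes the negative log-likelihood of a single count under a zero-inflated negative binomial (ZINB) distribution. The abstract does not name the loss GNNUI uses, so the choice of ZINB, the function name `zinb_nll`, and the parameter names `n`, `p`, and `pi` are assumptions made for illustration only.

```python
import math


def zinb_nll(y, n, p, pi):
    """Negative log-likelihood of a count y under a zero-inflated
    negative binomial (an assumed loss, not confirmed by the source).

    y:  observed non-negative integer count.
    n:  dispersion parameter of the negative binomial, n > 0.
    p:  success probability of the negative binomial, 0 < p < 1.
    pi: probability of an extra (structural) zero, 0 <= pi < 1.
    """
    # Log-pmf of the plain negative binomial:
    # Gamma(y + n) / (Gamma(n) * y!) * p^n * (1 - p)^y
    log_nb = (
        math.lgamma(y + n) - math.lgamma(n) - math.lgamma(y + 1)
        + n * math.log(p) + y * math.log(1 - p)
    )
    if y == 0:
        # A zero can come from the inflation component or from the NB itself.
        return -math.log(pi + (1 - pi) * math.exp(log_nb))
    # A positive count can only come from the NB component.
    return -(math.log(1 - pi) + log_nb)
```

The mixture weight `pi` shifts probability mass toward zero without changing the shape of the positive counts, which is why this family is a common choice for overdispersed data with many zeros.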