Abstract: We present IT-DPC-SRI, the first publicly available long-term archive of Italian weather radar precipitation estimates, spanning 16 years (2010--2025). The dataset contains Surface Rainfall Intensity (SRI) observations from the Italian Civil Protection Department's national radar mosaic, harmonized into a coherent Analysis-Ready Cloud-Optimized (ARCO) Zarr datacube. The archive comprises over one million timesteps at temporal resolutions ranging from 15 minutes to 5 minutes, covers a $1200\times1400$ km domain at 1 km spatial resolution, and is compressed from 7 TB to 51 GB on disk. We address the historical fragmentation of Italian radar data, previously scattered across heterogeneous formats (OPERA BUFR, HDF5, GeoTIFF) with varying spatial domains and projections, by reprocessing the entire record into a unified store. The dataset is available as a static versioned snapshot on Zenodo, through cloud-native access on the ECMWF European Weather Cloud, and as a continuously updated live version on the ArcoDataHub platform. This release fills a significant gap in European radar data availability, as Italy does not participate in the EUMETNET OPERA pan-European radar composite. The dataset is released under a CC BY-SA 4.0 license.
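The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a back-of-envelope consistency check: the decimal units, the 365.25-day year, and the 4 bytes per grid cell are assumptions made here for illustration, not properties stated for the archive.

```python
# Figures quoted in the abstract.
YEARS = 16
DOMAIN_KM = (1200, 1400)
CELL_KM = 1
RAW_TB, STORED_GB = 7, 51
TIMESTEPS = 1_000_000  # "over one million", taken as a round lower figure

# Assumptions made for this sketch only (not stated in the abstract).
DAYS_PER_YEAR = 365.25   # average year length
BYTES_PER_VALUE = 4      # e.g. one 32-bit value per grid cell
GB_PER_TB = 1000         # decimal units

# Grid cells in one radar frame at 1 km resolution.
cells_per_frame = (DOMAIN_KM[0] // CELL_KM) * (DOMAIN_KM[1] // CELL_KM)

# Ratio between the uncompressed and the on-disk size.
compression_ratio = RAW_TB * GB_PER_TB / STORED_GB


def steps_at(minutes):
    """Timesteps in YEARS of gap-free data sampled every `minutes` minutes."""
    return int(YEARS * DAYS_PER_YEAR * 24 * 60 / minutes)


# Bounds on the timestep count if the whole record used one cadence.
min_steps, max_steps = steps_at(15), steps_at(5)

# Uncompressed size implied by the assumed bytes per value.
raw_estimate_tb = TIMESTEPS * cells_per_frame * BYTES_PER_VALUE / 1e12

print(f"cells per frame:   {cells_per_frame:,}")
print(f"compression ratio: {compression_ratio:.0f}x")
print(f"timestep bounds:   {min_steps:,} to {max_steps:,}")
print(f"raw size estimate: {raw_estimate_tb:.2f} TB")
```

Under these assumptions, the quoted count of over one million timesteps falls between the all-15-minute and all-5-minute bounds, and the raw size estimate is of the same order as the quoted 7 TB.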




Abstract: In recent years, traditional numerical methods for accurate weather prediction have been increasingly challenged by deep learning methods. Many historical datasets used for short- and medium-range weather forecasting are organized on a regular spatial grid. This arrangement closely resembles images: each weather variable can be visualized as a map or, when the temporal axis is considered, as a video. Several classes of generative models, including Generative Adversarial Networks, Variational Autoencoders, and the more recent Denoising Diffusion Models, have proved applicable to the next-frame prediction problem, and it is thus natural to test their performance on weather prediction benchmarks. Diffusion models are particularly appealing in this context because weather forecasting is intrinsically probabilistic: what we are really interested in modeling is the probability distribution of weather indicators, whose expected value is the most likely prediction. We focus on a subset of the ERA-5 dataset comprising hourly data over Central Europe from 2016 to 2021. On this data, we examine the efficacy of diffusion models for precipitation nowcasting, comparing them against well-established U-Net models documented in the existing literature. Our proposed approach, Generative Ensemble Diffusion (GED), uses a diffusion model to generate a set of possible weather scenarios, which a post-processing network then combines into a single probable prediction. GED substantially outperformed recent deep learning models in overall performance.
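The two-stage structure described here (sample several scenarios, then combine them) can be illustrated with a toy sketch. Everything in it is a hypothetical stand-in: `toy_sampler` replaces a real diffusion model with Gaussian perturbations, and `combine` replaces the post-processing network with a per-cell mean. It shows only the shape of the pipeline, not the authors' implementation.

```python
import random
from statistics import fmean


def toy_sampler(condition, rng):
    """Hypothetical stand-in for one diffusion sampling run.

    Returns a randomly perturbed, non-negative copy of the conditioning
    field. A real model would run a learned reverse-diffusion process.
    """
    return [max(0.0, value + rng.gauss(0.0, 0.1)) for value in condition]


def generate_ensemble(condition, n_members, seed=0):
    """Draw n_members independent scenarios from the same conditioning."""
    rng = random.Random(seed)
    return [toy_sampler(condition, rng) for _ in range(n_members)]


def combine(ensemble):
    """Stand-in for the post-processing network: per-cell ensemble mean."""
    return [fmean(cell_values) for cell_values in zip(*ensemble)]


# Flattened toy precipitation field (arbitrary illustrative values).
last_frame = [0.0, 0.2, 1.5, 0.7]

ensemble = generate_ensemble(last_frame, n_members=15)
prediction = combine(ensemble)
print(prediction)
```

The per-cell mean is used only because it is the simplest aggregator to write down; the abstract states that GED uses a post-processing network for this step.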