Abstract:Foundation models are now increasingly being developed for Earth observation (EO), yet they often rely on stochastic masking that do not explicitly enforce physics constraints; a critical trustworthiness limitation, in particular for predictive models that guide public health decisions. In this work, we propose SpecTM (Spectral Targeted Masking), a physics-informed masking design that encourages the reconstruction of targeted bands from cross-spectral context during pretraining. To achieve this, we developed an adaptable multi-task (band reconstruction, bio-optical index inference, and 8-day-ahead temporal prediction) self-supervised learning (SSL) framework that encodes spectrally intrinsic representations via joint optimization, and evaluated it on a downstream microcystin concentration regression model using NASA PACE hyperspectral imagery over Lake Erie. SpecTM achieves R^2 = 0.695 (current week) and R^2 = 0.620 (8-day-ahead) predictions surpassing all baseline models by (+34% (0.51 Ridge) and +99% (SVR 0.31)) respectively. Our ablation experiments show targeted masking improves predictions by +0.037 R^2 over random masking. Furthermore, it outperforms strong baselines with 2.2x superior label efficiency under extreme scarcity. SpecTM enables physics-informed representation learning across EO domains and improves the interpretability of foundation models.




Abstract:Soil erosion is a significant threat to the environment and long-term land management around the world. Accelerated soil erosion by human activities inflicts extreme changes in terrestrial and aquatic ecosystems, which is not fully surveyed/predicted for the present and probable future at field-scales (30-m). Here, we estimate/predict soil erosion rates by water erosion, (sheet and rill erosion), using three alternative (2.6, 4.5, and 8.5) Shared Socioeconomic Pathway and Representative Concentration Pathway (SSP-RCP) scenarios across the contiguous United States. Field Scale Soil Erosion Model (FSSLM) estimations rely on a high resolution (30-m) G2 erosion model integrated by satellite- and imagery-based estimations of land use and land cover (LULC), gauge observations of long-term precipitation, and scenarios of the Coupled Model Intercomparison Project Phase 6 (CMIP6). The baseline model (2020) estimates soil erosion rates of 2.32 Mg ha 1 yr 1 with current agricultural conservation practices (CPs). Future scenarios with current CPs indicate an increase between 8% to 21% under different combinations of SSP-RCP scenarios of climate and LULC changes. The soil erosion forecast for 2050 suggests that all the climate and LULC scenarios indicate either an increase in extreme events or a change in the spatial location of extremes largely from the southern to the eastern and northeastern regions of the United States.