Abstract:Ancient populations markedly transformed Neotropical forests, yet understanding the long-term effects of ancient human management, particularly at high-resolution scales, remains challenging. In this work we propose a new approach to investigate archaeological areas of influence based on vegetation signatures. It consists of a deep learning model trained on satellite imagery to identify palm trees, followed by a clustering algorithm to identify palm clusters, which are then used to estimate ancient management areas. To assess the palm distribution in relation to past human activity, we applied the proposed approach to unique high-resolution satellite imagery data covering 765 km2 of the Sierra Nevada de Santa Marta, Colombia. With this work, we also release a manually annotated palm tree dataset along with estimated locations of archaeological sites from ground-surveys and legacy records. Results demonstrate how palms were significantly more abundant near archaeological sites showing large infrastructure investment. The extent of the largest palm cluster indicates that ancient human-managed areas linked to major infrastructure sites may be up to two orders of magnitude bigger than indicated by archaeological evidence alone. Our findings suggest that pre-Columbian populations influenced local vegetation fostering conditions conducive to palm proliferation, leaving a lasting ecological footprint. This may have lowered the logistical costs of establishing infrastructure-heavy settlements in otherwise less accessible locations. Overall, this study demonstrates the potential of integrating artificial intelligence approaches with new ecological and archaeological data to identify archaeological areas of interest through vegetation patterns, revealing fine-scale human-environment interactions.
Abstract:When dealing with sensitive data in automated data-driven decision-making, an important concern is to learn predictors with high performance towards a class label, whilst minimising for the discrimination towards some sensitive attribute, like gender or race, induced from biased data. Various hybrid optimisation criteria exist which combine classification performance with a fairness metric. However, while the threshold-free ROC-AUC is the standard for measuring traditional classification model performance, current fair decision tree methods only optimise for a fixed threshold on both the classification task as well as the fairness metric. Moreover, current tree learning frameworks do not allow for fair treatment with respect to multiple categories or multiple sensitive attributes. Lastly, the end-users of a fair model should be able to balance fairness and classification performance according to their specific ethical, legal, and societal needs. In this paper we address these shortcomings by proposing a threshold-independent fairness metric termed uniform demographic parity, and a derived splitting criterion entitled SCAFF -- Splitting Criterion AUC for Fairness -- towards fair decision tree learning, which extends to bagged and boosted frameworks. Compared to the state-of-the-art, our method provides three main advantages: (1) classifier performance and fairness are defined continuously instead of relying upon an, often arbitrary, decision threshold; (2) it leverages multiple sensitive attributes simultaneously, of which the values may be multicategorical; and (3) the unavoidable performance-fairness trade-off is tunable during learning. In our experiments, we demonstrate how SCAFF attains high predictive performance towards the class label and low discrimination with respect to binary, multicategorical, and multiple sensitive attributes, further substantiating our claims.