Raster Forge is a Python library and graphical user interface for raster data manipulation and analysis. The tool is focused on remote sensing applications, particularly in wildfire management. It allows users to import, visualize, and process raster layers for tasks such as image compositing or topographical analysis. For wildfire management, it generates fuel maps using predefined models. Its impact extends from disaster management to hydrological modeling, agriculture, and environmental monitoring. Raster Forge can be a valuable asset for geoscientists and researchers who rely on raster data analysis, enhancing geospatial data processing and visualization across various disciplines.
The integration of data science into Geographic Information Systems (GIS) has facilitated the evolution of these tools into complete spatial analysis platforms. The adoption of machine learning and big data techniques has equipped these platforms with the capacity to handle larger amounts of increasingly complex data, transcending the limitations of more traditional approaches. This work traces the historical and technical evolution of data science and GIS as fields of study, highlighting the critical points of convergence between domains, and underlining the many sectors that rely on this integration. A GIS application is presented as a case study in the disaster management sector where we utilize aerial data from Tr\'oia, Portugal, to emphasize the process of insight extraction from raw data. We conclude by outlining prospects for future research in integration of these fields in general, and the developed application in particular.
Text clustering is an important approach for organising the growing amount of digital content, helping to structure and find hidden patterns in uncategorised data. In this research, we investigated how different textual embeddings - particularly those used in large language models (LLMs) - and clustering algorithms affect how text datasets are clustered. A series of experiments were conducted to assess how embeddings influence clustering results, the role played by dimensionality reduction through summarisation, and embedding size adjustment. Results reveal that LLM embeddings excel at capturing the nuances of structured language, while BERT leads the lightweight options in performance. In addition, we find that increasing embedding dimensionality and summarisation techniques do not uniformly improve clustering efficiency, suggesting that these strategies require careful analysis to use in real-life models. These results highlight a complex balance between the need for nuanced text representation and computational feasibility in text clustering applications. This study extends traditional text clustering frameworks by incorporating embeddings from LLMs, thereby paving the way for improved methodologies and opening new avenues for future research in various types of textual analysis.
This paper highlights and summarizes the most important multispectral indices and associated methodologies for fire management. Various fields of study are examined where multispectral indices align with wildfire prevention and management, including vegetation and soil attribute extraction, water feature mapping, artificial structure identification, and post-fire burnt area estimation. The versatility and effectiveness of multispectral indices in addressing specific issues in wildfire management are emphasized. Fundamental insights for optimizing data extraction are presented. Concrete indices for each task, including the NDVI and the NDWI, are suggested. Moreover, to enhance accuracy and address inherent limitations of individual index applications, the integration of complementary processing solutions and additional data sources like high-resolution imagery and ground-based measurements is recommended. This paper aims to be an immediate and comprehensive reference for researchers and stakeholders working on multispectral indices related to the prevention and management of fires.
Synthetic data is essential for assessing clustering techniques, complementing and extending real data, and allowing for a more complete coverage of a given problem's space. In turn, synthetic data generators have the potential of creating vast amounts of data -- a crucial activity when real-world data is at premium -- while providing a well-understood generation procedure and an interpretable instrument for methodically investigating cluster analysis algorithms. Here, we present \textit{Clugen}, a modular procedure for synthetic data generation, capable of creating multidimensional clusters supported by line segments using arbitrary distributions. \textit{Clugen} is open source, 100\% unit tested and fully documented, and is available for the Python, R, Julia and MATLAB/Octave ecosystems. We demonstrate that our proposal is able to produce rich and varied results in various dimensions, is fit for use in the assessment of clustering algorithms, and has the potential to be a widely used framework in diverse clustering-related research tasks.
This article presents a dataset of 10,917 news articles with hierarchical news categories collected between January 1st 2019, and December 31st 2019. We manually labelled the articles based on a hierarchical taxonomy with 17 first-level and 109 second-level categories. This dataset can be used to train machine learning models for automatically classifying news articles by topic. This dataset can be helpful for researchers working on news structuring, classification, and predicting future events based on released news.
In this paper we present a technique for procedurally generating 3D maps using a set of premade meshes which snap together based on designer-specified visual constraints. The proposed approach avoids size and layout limitations, offering the designer control over the look and feel of the generated maps, as well as immediate feedback on a given map's navigability. A prototype implementation of the method, developed in the Unity game engine, is discussed, and a number of case studies are analyzed. These include a multiplayer game where the method was used, together with a number of illustrative examples which highlight various parameterizations and generation methods. We argue that the technique is designer-friendly and can be used as a map composition method and/or as a prototyping system in 3D level design, opening the door for quality map and level creation in a fraction of the time of a fully human-based approach.
ColorShapeLinks is an AI board game competition framework specially designed for students and educators in videogame development, with openness and accessibility in mind. The competition is based on an arbitrarily-sized version of the Simplexity board game, the motto of which, "simple to learn, complex to master", is curiously also applicable to AI agents. ColorShapeLinks offers graphical and text-based frontends and a completely open and documented development framework built using industry standard tools and following software engineering best practices. ColorShapeLinks is not only a competition, but both a game and a framework which educators and students can extend and use to host their own competitions.