Generating science-ready data from processed data products is one of the major challenges in next-generation radio continuum surveys with the Square Kilometre Array (SKA) and its precursors, owing to the expected data volume and the need for a high degree of automated processing. Source extraction, characterization, and classification are the major stages of this process. In this work we focus on the classification of compact radio sources in the Galactic plane using both radio and infrared images as inputs. To this end, we produced a curated dataset of ~20,000 images of compact sources of different astronomical classes, obtained from past radio and infrared surveys and from novel radio data from pilot surveys carried out with the Australian SKA Pathfinder (ASKAP). Radio spectral index information was also obtained for a subset of the data. We then trained two different classifiers on this dataset. The first model uses gradient-boosted decision trees and is trained on a set of pre-computed features derived from the data, which include radio-infrared colour indices and the radio spectral index. The second model is a convolutional neural network trained directly on multi-channel images. Using a fully supervised procedure, we obtained a high classification accuracy (F1-score > 90%) for separating Galactic objects from the extragalactic background. Individual class discrimination performances, ranging from 60% to 75%, increased by 10% when far-infrared and spectral index information was added, with extragalactic objects, PNe, and HII regions identified with higher accuracies. The implemented tools and trained models have been publicly released and made available to the radio astronomy community for future application to new radio data.
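As an illustration of the F1-score metric quoted in the abstract above, the following minimal sketch computes it for one class from true and predicted labels. The function name `f1_for_class` and the toy labels are hypothetical and are not taken from the released tools; the formula is the standard harmonic mean of precision and recall.

```python
def f1_for_class(y_true, y_pred, positive):
    """Return the F1-score of class `positive` (harmonic mean of precision and recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # No correct detections: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical toy labels, for illustration only.
y_true = ["galactic", "galactic", "extragalactic", "extragalactic"]
y_pred = ["galactic", "extragalactic", "extragalactic", "extragalactic"]
f1_galactic = f1_for_class(y_true, y_pred, "galactic")
```

In this toy case one of the two Galactic sources is recovered with no false positives, so precision is 1 and recall is 0.5.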
Innovative developments in data processing, archiving, analysis, and visualization are now essential to cope with the data deluge expected from next-generation radio astronomy facilities, such as the Square Kilometre Array (SKA) and its precursors. In this context, integrating source extraction and analysis algorithms into data visualization tools could significantly improve and speed up the cataloguing of large-area surveys, boosting astronomer productivity and shortening publication time. To this end, we are developing a visual analytic platform (CIRASA) for advanced source finding and classification, integrating state-of-the-art tools such as the CAESAR source finder and the ViaLactea Visual Analytic (VLVA) and Knowledge Base (VLKB). In this work, we present the project objectives and the platform architecture, focusing on the implemented source finding services.
We present observations of a region of the Galactic plane taken during the Early Science Program of the Australian Square Kilometre Array Pathfinder (ASKAP). In this context, we observed the SCORPIO field at 912 MHz with an incomplete array consisting of 15 commissioned antennas. The resulting map covers a square region of ~40 deg², centred on (l, b) = (343.5°, 0.75°), with a synthesized beam of 24" × 21" and a background rms noise of 150-200 μJy/beam, increasing to 500-600 μJy/beam close to the Galactic plane. A total of 3963 radio sources were detected and characterized in the field using the CAESAR source finder. We obtained differential source counts in agreement with previously published data after correction for source extraction and characterization uncertainties, estimated from simulated data. The ASKAP positional and flux density scale accuracies were also investigated through comparison with previous surveys (MGPS, NVSS) and additional observations of the SCORPIO field, carried out with ATCA at 2.1 GHz and 10" spatial resolution. These allowed us to measure the spectral index for a subset of the catalogued sources and to estimate that at least 8% of the sources in the reported catalogue are resolved. We cross-matched our catalogued sources with different astronomical databases to search for possible counterparts, finding ~150 associations with known Galactic objects. Finally, we explored a multiparametric approach for classifying previously unreported Galactic sources based on their radio-infrared colours.
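The spectral index measurement mentioned above can be illustrated with a two-point estimate between the ASKAP (912 MHz) and ATCA (2.1 GHz) frequencies. This is a minimal sketch, assuming the convention S ∝ ν^α and ignoring measurement uncertainties; the function name `two_point_spectral_index` and the flux densities are hypothetical, and the abstract does not state the fitting procedure actually used.

```python
import math


def two_point_spectral_index(s1, nu1, s2, nu2):
    """Spectral index alpha between two frequencies, assuming S proportional to nu**alpha.

    s1, s2   : flux densities at nu1 and nu2 (same units, must be positive)
    nu1, nu2 : observing frequencies (same units, must be positive and distinct)
    """
    if min(s1, s2, nu1, nu2) <= 0:
        raise ValueError("flux densities and frequencies must be positive")
    if nu1 == nu2:
        raise ValueError("frequencies must differ")
    return math.log(s1 / s2) / math.log(nu1 / nu2)


# Hypothetical source: 10 mJy at 912 MHz and 5 mJy at 2100 MHz.
alpha = two_point_spectral_index(10.0, 912.0, 5.0, 2100.0)
```

A flux density that falls with increasing frequency gives a negative α under this convention, as in the hypothetical source above.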