In this work, we demonstrate continuous-time radar-inertial and lidar-inertial odometry using a Gaussian process motion prior. Using a sparse prior, we demonstrate improved computational complexity during preintegration and interpolation. We use a white-noise-on-acceleration motion prior and treat the gyroscope as a direct measurement of the state while preintegrating accelerometer measurements to form relative velocity factors. Our odometry is implemented using sliding-window batch trajectory estimation. To our knowledge, our work is the first to demonstrate radar-inertial odometry with a spinning mechanical radar using both gyroscope and accelerometer measurements. We improve the performance of our radar odometry by 19\% by incorporating an IMU. Our approach is efficient and we demonstrate real-time performance. Code for this project can be found at: https://github.com/utiasASRL/steam_icp
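As an illustration of the white-noise-on-acceleration prior and the accelerometer preintegration mentioned above, the following is a minimal sketch for a one-dimensional [position, velocity] state. It is not the implementation from the linked repository: the function names and the power spectral density parameter `qc` are hypothetical, and the sketch assumes a vector-space state and ignores gravity, sensor biases, and rotation.

```python
def wnoa_transition(dt):
    """Transition matrix of a constant-velocity (white-noise-on-acceleration)
    prior for a 1-D [position, velocity] state over dt seconds."""
    return [[1.0, dt],
            [0.0, 1.0]]


def wnoa_process_noise(dt, qc):
    """Covariance accumulated over dt for the same prior.
    qc is an assumed power spectral density of the acceleration noise."""
    return [[qc * dt**3 / 3.0, qc * dt**2 / 2.0],
            [qc * dt**2 / 2.0, qc * dt]]


def prior_error(x_prev, x_next, dt):
    """Residual of the motion prior between two consecutive states."""
    phi = wnoa_transition(dt)
    predicted = [phi[0][0] * x_prev[0] + phi[0][1] * x_prev[1],
                 phi[1][0] * x_prev[0] + phi[1][1] * x_prev[1]]
    return [x_next[0] - predicted[0], x_next[1] - predicted[1]]


def preintegrate_velocity(accels, dt):
    """Sum accelerometer samples taken at a fixed period dt into a single
    relative velocity change (gravity, bias, and rotation are ignored)."""
    return sum(a * dt for a in accels)
```

In this simplified setting, a relative velocity factor would compare `preintegrate_velocity(...)` against the difference between the velocities of two estimated states.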
Unlike in a clinical trial, where researchers can determine the minimum number of positive and negative samples required, or in a machine learning study, where the size and class distribution of the validation set are static and known, in a real-world deployment there is little control over the size and distribution of the incoming patient population. As a result, when measured over different time periods, evaluation metrics such as the Area Under the Receiver Operating Characteristic Curve (AUCROC) and the Area Under the Precision-Recall Curve (AUCPR) may not be directly comparable. Therefore, in this study, for binary classifiers running over a long time period, we propose adjusting these performance metrics for sample size and class distribution, so that a fair comparison can be made between two time periods. Note that the number of samples and the class distribution, namely the ratio of positive samples, are two robustness factors that affect the variance of AUCROC. To better estimate the mean of the performance metrics and understand how performance changes over time, we propose a Kalman filter based framework with extrapolated variance adjusted for the total number of samples and the number of positive samples during different time periods. The efficacy of this method is first demonstrated on a synthetic dataset and then applied retrospectively to a 2-day-ahead in-hospital mortality prediction model for COVID-19 patients during 2021 and 2022. Further, we conclude that our prediction model is not significantly affected by the evolution of the disease, improved treatments, or changes in hospital operational plans.
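The idea of a Kalman filter whose measurement variance depends on the number of samples and positives in each period can be sketched as follows. The abstract does not state how the variance is extrapolated, so this sketch substitutes the well-known Hanley-McNeil approximation for the variance of AUCROC and assumes a scalar random-walk state model. The function names and the `process_var` parameter are hypothetical.

```python
def hanley_mcneil_variance(auc, n_pos, n_neg):
    """Hanley-McNeil approximation of the variance of an AUCROC estimate.
    Used here as a stand-in for the variance model of the paper."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc**2 / (1.0 + auc)
    numerator = (auc * (1.0 - auc)
                 + (n_pos - 1) * (q1 - auc**2)
                 + (n_neg - 1) * (q2 - auc**2))
    return numerator / (n_pos * n_neg)


def kalman_track(observations, process_var=1e-4):
    """Filter a sequence of (auc, n_pos, n_neg) tuples, one per time period.
    Returns the filtered (mean, variance) after each period.
    process_var is an assumed random-walk drift variance."""
    mean, var = None, None
    filtered = []
    for auc, n_pos, n_neg in observations:
        meas_var = hanley_mcneil_variance(auc, n_pos, n_neg)
        if mean is None:
            # Initialise from the first period.
            mean, var = auc, meas_var
        else:
            var_pred = var + process_var           # predict step
            gain = var_pred / (var_pred + meas_var)  # Kalman gain
            mean = mean + gain * (auc - mean)      # update step
            var = (1.0 - gain) * var_pred
        filtered.append((mean, var))
    return filtered
```

Under these assumptions, a period with few patients or few positives has a large measurement variance, so a surprising AUCROC value from that period moves the filtered estimate only slightly.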
Purpose: Echo modulation curve (EMC) modeling can provide accurate and reproducible quantification of T2 relaxation times. The standard EMC-T2 mapping framework, however, requires a sufficient number of echoes and cumbersome pixel-wise dictionary-matching steps. This work proposes a deep learning version of EMC-T2 mapping, called DeepEMC-T2 mapping, to efficiently estimate accurate T2 maps from fewer echoes without a dictionary. Methods: DeepEMC-T2 mapping was developed using a modified U-Net to estimate both T2 and proton density (PD) maps directly from multi-echo spin-echo (MESE) images. The modified U-Net employs several new features to improve the accuracy of T2/PD estimation. MESE datasets from 68 subjects were used for training and evaluation of the DeepEMC-T2 mapping technique. Multiple experiments were conducted to evaluate the impact of the proposed new features on DeepEMC-T2 mapping. Results: DeepEMC-T2 mapping achieved T2 estimation errors ranging from 3% to 12% in different T2 ranges and PD estimation errors of 0.8% to 1.7% with 10/7/5/3 echoes, yielding more accurate parameter estimation than standard EMC-T2 mapping. The new features proposed in DeepEMC-T2 mapping enabled improved parameter estimation. The use of a larger echo spacing with fewer echoes can maintain the accuracy of T2 and PD estimations while reducing the number of 180-degree refocusing pulses. Conclusions: DeepEMC-T2 mapping enables simplified, efficient, and accurate T2 quantification directly from MESE images with fewer echoes and without a time-consuming dictionary-matching step. This allows for increased volumetric coverage and/or decreased SAR by reducing the number of 180-degree refocusing pulses.
Graph representation learning (GRL), which encodes graphs with topological structures into low-dimensional embeddings, has made considerable progress recently. Meanwhile, the time-consuming and costly process of manually annotating graph labels has prompted the growth of self-supervised learning (SSL) techniques. As a dominant approach in SSL, contrastive learning (CL) learns discriminative representations by differentiating between positive and negative samples. However, when applied to graph data, it overemphasizes global patterns while neglecting local structures. To tackle this issue, we propose \underline{Local}-aware \underline{G}raph \underline{C}ontrastive \underline{L}earning (\textbf{\methnametrim}), a self-supervised learning framework that complements vanilla contrastive learning by capturing local graph information through masking-based modeling. Extensive experiments validate the superiority of \methname over state-of-the-art methods, demonstrating its promise as a comprehensive graph representation learner.
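To make the phrase "differentiating between positive and negative samples" concrete, the following is a minimal sketch of an InfoNCE-style contrastive loss on plain embedding vectors. It illustrates only the vanilla contrastive objective, not the masking-based component of the proposed framework, and the abstract does not say which contrastive loss is used. The function names and the `temperature` default are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss: small when the anchor is closer to its positive
    than to any of the negatives, large otherwise."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))
```

In a graph setting, the anchor and positive would typically be embeddings of two augmented views of the same graph or node, while the negatives come from other graphs or nodes.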
This work proposes a semantic segmentation network that produces high-quality uncertainty estimates in a single forward pass. We exploit general representations from foundation models and unlabelled datasets through a Masked Image Modeling (MIM) approach, which is robust to augmentation hyper-parameters and simpler than previous techniques. For neural networks used in safety-critical applications, bias in the training data can lead to errors; therefore it is crucial to understand a network's limitations at run time and act accordingly. To this end, we test our proposed method on a number of test domains including the SAX Segmentation benchmark, which includes labelled test data from dense urban, rural and off-road driving domains. The proposed method consistently outperforms uncertainty estimation and Out-of-Distribution (OoD) techniques on this difficult benchmark.
Unmanned Aerial Vehicles (UAVs) have become very popular in the last decade due to advantages such as strong terrain adaptation, low cost, and zero casualties. One of the most interesting advances in this field is the automation of mission planning (task allocation) and real-time replanning, which are highly useful for increasing the autonomy of the vehicle and reducing the operator workload. These automated mission planning and replanning systems require a Human Computer Interface (HCI) that facilitates the visualization and selection of the plans that will be executed by the vehicles. In addition, most missions should be assessed before their real-life execution. This paper extends QGroundControl, an open-source simulation environment for flight control of multiple vehicles, by adding a mission designer that permits the operator to build complex missions with tasks and other scenario items; an interface for automated mission planning and replanning, which works as a test bed for different algorithms; and a Decision Support System (DSS) that helps the operator select a plan. In this work, a complete guide to these systems and some practical use cases are provided.
Automatic workflow composition (AWC) is a relevant problem in automated machine learning (AutoML) that allows finding suitable sequences of preprocessing and prediction models together with their optimal hyperparameters. This problem can be solved using evolutionary algorithms and, in particular, grammar-guided genetic programming (G3P). Current G3P approaches to AWC define a fixed grammar that formally specifies how workflow elements can be combined and which algorithms can be included. In this paper we present \ourmethod, an interactive G3P algorithm that allows users to dynamically modify the grammar to prune the search space and focus on their regions of interest. Our proposal is the first to combine the advantages of a G3P method with ideas from interactive optimisation and human-guided machine learning, an area little explored in the context of AutoML. To evaluate our approach, we present an experimental study in which 20 participants interact with \ourmethod to evolve workflows according to their preferences. Our results confirm that the collaboration between \ourmethod and humans allows us to find high-performance workflows in terms of accuracy that require less tuning time than those found without human intervention.
This paper proposes a method for Acoustic Constrained Segmentation (ACS) in audio recordings of vehicles driven through a production test track, delimiting the boundaries of surface types in the track. ACS is a variant of classical acoustic segmentation where the sequence of labels is known, contiguous and invariable, which is especially useful in this work as the test track has a standard configuration of surface types. The proposed ConvDTW-ACS method utilizes a Convolutional Neural Network for classifying overlapping image chunks extracted from the full audio spectrogram. Then, our custom Dynamic Time Warping algorithm aligns the sequence of predicted probabilities to the sequence of surface types in the track, from which timestamps of the surface type boundaries can be extracted. The method was evaluated on a real-world dataset collected from the Ford Manufacturing Plant in Valencia (Spain), achieving a mean error of 166 milliseconds when delimiting, within the audio, the boundaries of the surfaces in the track. The results demonstrate the effectiveness of the proposed method in accurately segmenting different surface types, which could enable the development of more specialized AI systems to improve the quality inspection process.
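The alignment step described above, in which per-chunk class probabilities are matched to a known, ordered sequence of surface types, can be sketched as a monotonic dynamic-programming alignment. The abstract does not specify the custom Dynamic Time Warping algorithm, so this sketch assumes a simple variant: each chunk either stays in the current segment or advances to the next one, and the local cost is the negative log-probability of the expected class. The function name and the input layout are hypothetical.

```python
import math


def align_to_known_sequence(probs, label_order):
    """Align chunk-level probabilities to a known label sequence.

    probs[t][c] is the predicted probability of class c for chunk t.
    label_order lists the class ids in the order they occur on the track.
    Returns (path, boundaries): the segment index assigned to each chunk
    and the chunk indices at which a new segment starts.
    """
    n_chunks, n_segments = len(probs), len(label_order)
    if n_chunks < n_segments:
        raise ValueError("need at least one chunk per segment")
    inf = float("inf")
    eps = 1e-12  # avoids log(0)
    cost = [[inf] * n_segments for _ in range(n_chunks)]
    back = [[0] * n_segments for _ in range(n_chunks)]

    def local(t, k):
        return -math.log(probs[t][label_order[k]] + eps)

    cost[0][0] = local(0, 0)  # the recording must start in the first segment
    for t in range(1, n_chunks):
        for k in range(min(t, n_segments - 1) + 1):
            stay = cost[t - 1][k]
            advance = cost[t - 1][k - 1] if k > 0 else inf
            if advance < stay:
                cost[t][k] = advance + local(t, k)
                back[t][k] = k - 1
            else:
                cost[t][k] = stay + local(t, k)
                back[t][k] = k

    # Backtrack from the last chunk, which must lie in the last segment.
    path = [0] * n_chunks
    k = n_segments - 1
    for t in range(n_chunks - 1, -1, -1):
        path[t] = k
        k = back[t][k]
    boundaries = [t for t in range(1, n_chunks) if path[t] != path[t - 1]]
    return path, boundaries
```

Boundary timestamps would then follow from multiplying each boundary index by the hop between consecutive chunks, a value that depends on the spectrogram settings and is not given here.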
Health insurance companies have a defined process called prior authorization (PA), a health plan cost-control process that requires doctors and other healthcare professionals to get clearance in advance from a health plan before performing a particular procedure on a patient in order to be eligible for payment coverage. For health insurance companies, approving PA requests is a time-consuming and challenging task. One of the key challenges is validating whether a request meets certain criteria such as age, gender, etc. In this work, we evaluate whether GPT can validate numerous key factors, in turn helping health plans reach a decision drastically faster. We frame it as a question answering task, prompting GPT to answer a question from a patient's electronic health record. We experiment with different conventional prompting techniques and also introduce our own novel prompting technique. Moreover, we report a qualitative human assessment of the natural language generation outputs from our approach. Results show that our method achieves superior performance, with a mean weighted F1 score of 0.61, compared to its standard counterparts.
We describe a framework for using natural language to design state abstractions for imitation learning. Generalizable policy learning in high-dimensional observation spaces is facilitated by well-designed state representations, which can surface important features of an environment and hide irrelevant ones. These state representations are typically manually specified, or derived from other labor-intensive labeling procedures. Our method, LGA (language-guided abstraction), uses a combination of natural language supervision and background knowledge from language models (LMs) to automatically build state representations tailored to unseen tasks. In LGA, a user first provides a (possibly incomplete) description of a target task in natural language; next, a pre-trained LM translates this task description into a state abstraction function that masks out irrelevant features; finally, an imitation policy is trained using a small number of demonstrations and LGA-generated abstract states. Experiments on simulated robotic tasks show that LGA yields state abstractions similar to those designed by humans, but in a fraction of the time, and that these abstractions improve generalization and robustness in the presence of spurious correlations and ambiguous specifications. We illustrate the utility of the learned abstractions on mobile manipulation tasks with a Spot robot.