We present a study using new computational methods, based on a novel combination of machine learning for inferring admixture hidden Markov models and probabilistic model checking, to uncover interaction styles in a mobile app. These styles are then used to inform a redesign, which is implemented, deployed, and then analysed using the same methods. The data sets are logged user traces, collected over two six-month deployments of each version, involving thousands of users and segmented into different time intervals. The methods do not assume tasks or absolute metrics such as measures of engagement, but uncover the styles through unsupervised inference of clusters and analysis with probabilistic temporal logic. For both versions there was a clear distinction between the styles adopted by users during the first day/week/month of usage, and during the second and third months, a result we had not anticipated.
Nonorthogonal multiple access (NOMA) with multi-antenna base station (BS) is a promising technology for next-generation wireless communication, which has high potential in performance and user fairness. Since the performance of NOMA depends on the channel conditions, we can combine NOMA and reconfigurable intelligent surface (RIS), which is a large and passive antenna array and can optimize the wireless channel. However, the high dimensionality makes the RIS optimization a complicated problem. In this work, we propose a machine learning approach to solve the problem of joint optimization of precoding and RIS configuration. We apply the RIS to realize the quasi-degradation of the channel, which allows for optimal precoding in closed form. The neural network architecture RISnet is used, which is designed dedicatedly for RIS optimization. The proposed solution is superior than the works in the literature in terms of performance and computation time.
We propose a novel low-rank tensor method for respiratory motion-resolved multi-echo image reconstruction. The key idea is to construct a 3-way image tensor (space $\times$ echo $\times$ motion state) from the conventional gridding reconstruction of highly undersampled multi-echo k-space raw data, and exploit low-rank tensor structure to separate it from undersampling artifacts. Healthy volunteers and patients with iron overload were recruited and imaged on a 3T clinical MRI system for this study. Results show that our proposed method Successfully reduced severe undersampling artifacts in respiratory motion-state resolved complex source images, as well as subsequent R2* and quantitative susceptibility mapping (QSM). Compared to conventional respiratory motion-resolved compressed sensing (CS) image reconstruction, the proposed method had a reconstruction time at least three times faster, accounting for signal evolution along the echo dimension in the multi-echo data.
We present a generalized linear structural causal model, coupled with a novel data-adaptive linear regularization, to recover causal directed acyclic graphs (DAGs) from time series. By leveraging a recently developed stochastic monotone Variational Inequality (VI) formulation, we cast the causal discovery problem as a general convex optimization. Furthermore, we develop a non-asymptotic recovery guarantee and quantifiable uncertainty by solving a linear program to establish confidence intervals for a wide range of non-linear monotone link functions. We validate our theoretical results and show the competitive performance of our method via extensive numerical experiments. Most importantly, we demonstrate the effectiveness of our approach in recovering highly interpretable causal DAGs over Sepsis Associated Derangements (SADs) while achieving comparable prediction performance to powerful ``black-box'' models such as XGBoost. Thus, the future adoption of our proposed method to conduct continuous surveillance of high-risk patients by clinicians is much more likely.
Slope failures possess destructive power that can cause significant damage to both life and infrastructure. Monitoring slopes prone to instabilities is therefore critical in mitigating the risk posed by their failure. The purpose of slope monitoring is to detect precursory signs of stability issues, such as changes in the rate of displacement with which a slope is deforming. This information can then be used to predict the timing or probability of an imminent failure in order to provide an early warning. In this study, a more objective, statistical-learning algorithm is proposed to detect and characterise the risk of a slope failure, based on spectral analysis of serially correlated displacement time series data. The algorithm is applied to satellite-based interferometric synthetic radar (InSAR) displacement time series data to retrospectively analyse the risk of the 2019 Brumadinho tailings dam collapse in Brazil. Two potential risk milestones are identified and signs of a definitive but emergent risk (27 February 2018 to 26 August 2018) and imminent risk of collapse of the tailings dam (27 June 2018 to 24 December 2018) are detected by the algorithm. Importantly, this precursory indication of risk of failure is detected as early as at least five months prior to the dam collapse on 25 January 2019. The results of this study demonstrate that the combination of spectral methods and second order statistical properties of InSAR displacement time series data can reveal signs of a transition into an unstable deformation regime, and that this algorithm can provide sufficient early warning that could help mitigate catastrophic slope failures.
Manufacturing advanced materials and products with a specific property or combination of properties is often warranted. To achieve that it is crucial to find out the optimum recipe or processing conditions that can generate the ideal combination of these properties. Most of the time, a sufficient number of experiments are needed to generate a Pareto front. However, manufacturing experiments are usually costly and even conducting a single experiment can be a time-consuming process. So, it's critical to determine the optimal location for data collection to gain the most comprehensive understanding of the process. Sequential learning is a promising approach to actively learn from the ongoing experiments, iteratively update the underlying optimization routine, and adapt the data collection process on the go. This paper presents a novel data-driven Bayesian optimization framework that utilizes sequential learning to efficiently optimize complex systems with multiple conflicting objectives. Additionally, this paper proposes a novel metric for evaluating multi-objective data-driven optimization approaches. This metric considers both the quality of the Pareto front and the amount of data used to generate it. The proposed framework is particularly beneficial in practical applications where acquiring data can be expensive and resource intensive. To demonstrate the effectiveness of the proposed algorithm and metric, the algorithm is evaluated on a manufacturing dataset. The results indicate that the proposed algorithm can achieve the actual Pareto front while processing significantly less data. It implies that the proposed data-driven framework can lead to similar manufacturing decisions with reduced costs and time.
Coffee which is prepared from the grinded roasted seeds of harvested coffee cherries, is one of the most consumed beverage and traded commodity, globally. To manually monitor the coffee field regularly, and inform about plant and soil health, as well as estimate yield and harvesting time, is labor-intensive, time-consuming and error-prone. Some recent studies have developed sensors for estimating coffee yield at the time of harvest, however a more inclusive and applicable technology to remotely monitor multiple parameters of the field and estimate coffee yield and quality even at pre-harvest stage, was missing. Following precision agriculture approach, we employed machine learning algorithm YOLO, for image processing of coffee plant. In this study, the latest version of the state-of-the-art algorithm YOLOv7 was trained with 324 annotated images followed by its evaluation with 82 unannotated images as test data. Next, as an innovative approach for annotating the training data, we trained K-means models which led to machine-generated color classes of coffee fruit and could thus characterize the informed objects in the image. Finally, we attempted to develop an AI-based handy mobile application which would not only efficiently predict harvest time, estimate coffee yield and quality, but also inform about plant health. Resultantly, the developed model efficiently analyzed the test data with a mean average precision of 0.89. Strikingly, our innovative semi-supervised method with an mean average precision of 0.77 for multi-class mode surpassed the supervised method with mean average precision of only 0.60, leading to faster and more accurate annotation. The mobile application we designed based on the developed code, was named CoffeApp, which possesses multiple features of analyzing fruit from the image taken by phone camera with in field and can thus track fruit ripening in real time.
Current embedded systems are specifically designed to run multimedia applications. These applications have a big impact on both performance and energy consumption. Both metrics can be optimized selecting the best cache configuration for a target set of applications. Multi-objective optimization may help to minimize both conflicting metrics in an independent manner. In this work, we propose an optimization method that based on Multi-Objective Evolutionary Algorithms, is able to find the best cache configuration for a given set of applications. To evaluate the goodness of candidate solutions, the execution of the optimization algorithm is combined with a static profiling methodology using several well-known simulation tools. Results show that our optimization framework is able to obtain an optimized cache for Mediabench applications. Compared to a baseline cache memory, our design method reaches an average improvement of 64.43\% and 91.69\% in execution time and energy consumption, respectively.
Neural operator learning as a means of mapping between complex function spaces has garnered significant attention in the field of computational science and engineering (CS&E). In this paper, we apply Neural operator learning to the time-of-flight ultrasound computed tomography (USCT) problem. We learn the mapping between time-of-flight (TOF) data and the heterogeneous sound speed field using a full-wave solver to generate the training data. This novel application of operator learning circumnavigates the need to solve the computationally intensive iterative inverse problem. The operator learns the non-linear mapping offline and predicts the heterogeneous sound field with a single forward pass through the model. This is the first time operator learning has been used for ultrasound tomography and is the first step in potential real-time predictions of soft tissue distribution for tumor identification in beast imaging.
Well curated, large-scale corpora of social media posts containing broad public opinion offer an alternative data source to complement traditional surveys. While surveys are effective at collecting representative samples and are capable of achieving high accuracy, they can be both expensive to run and lag public opinion by days or weeks. Both of these drawbacks could be overcome with a real-time, high volume data stream and fast analysis pipeline. A central challenge in orchestrating such a data pipeline is devising an effective method for rapidly selecting the best corpus of relevant documents for analysis. Querying with keywords alone often includes irrelevant documents that are not easily disambiguated with bag-of-words natural language processing methods. Here, we explore methods of corpus curation to filter irrelevant tweets using pre-trained transformer-based models, fine-tuned for our binary classification task on hand-labeled tweets. We are able to achieve F1 scores of up to 0.95. The low cost and high performance of fine-tuning such a model suggests that our approach could be of broad benefit as a pre-processing step for social media datasets with uncertain corpus boundaries.