Split learning (SL) is an emergent distributed learning framework which can mitigate the computation and wireless communication overhead of federated learning. It splits a machine learning model into a device-side model and a server-side model at a cut layer. Devices only train their allocated model and transmit the activations of the cut layer to the server. However, SL can lead to data leakage as the server can reconstruct the input data using the correlation between the input and intermediate activations. Although allocating more layers to a device-side model can reduce the possibility of data leakage, this will lead to more energy consumption for resource-constrained devices and more training time for the server. Moreover, non-iid datasets across devices will reduce the convergence rate leading to increased training time. In this paper, a new personalized SL framework is proposed. For this framework, a novel approach for choosing the cut layer that can optimize the tradeoff between the energy consumption for computation and wireless transmission, training time, and data privacy is developed. In the considered framework, each device personalizes its device-side model to mitigate non-iid datasets while sharing the same server-side model for generalization. To balance the energy consumption for computation and wireless transmission, training time, and data privacy, a multiplayer bargaining problem is formulated to find the optimal cut layer between devices and the server. To solve the problem, the Kalai-Smorodinsky bargaining solution (KSBS) is obtained using the bisection method with the feasibility test. Simulation results show that the proposed personalized SL framework with the cut layer from the KSBS can achieve the optimal sum utilities by balancing the energy consumption, training time, and data privacy, and it is also robust to non-iid datasets.
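The bisection-with-feasibility-test idea behind the KSBS can be sketched as follows. This is a minimal illustration, not the paper's formulation: it assumes a finite list of candidate cut layers, a hypothetical utility vector per candidate (one entry per player), a given disagreement point, and a feasible set in which any utility vector dominated by a candidate counts as attainable. The function name `ksbs_cut_layer` and all numbers are invented for illustration.

```python
from typing import List, Sequence, Tuple


def ksbs_cut_layer(
    utilities: Sequence[Sequence[float]],
    disagreement: Sequence[float],
    tol: float = 1e-6,
) -> Tuple[float, int]:
    """Return (t, index): the largest fraction t of the way from the
    disagreement point to the ideal point that some candidate still
    dominates, and the index of such a candidate (hypothetical setup)."""
    n = len(disagreement)
    # Keep only candidates that are no worse than disagreement for every player.
    rational = [
        k for k, u in enumerate(utilities)
        if all(u[i] >= disagreement[i] for i in range(n))
    ]
    if not rational:
        raise ValueError("no candidate dominates the disagreement point")
    # Ideal (utopia) point: best achievable utility of each player separately.
    ideal = [max(utilities[k][i] for k in rational) for i in range(n)]

    def target(t: float) -> List[float]:
        return [d + t * (b - d) for d, b in zip(disagreement, ideal)]

    def dominating(t: float) -> List[int]:
        # Feasibility test: candidates that meet every player's target at t.
        goal = target(t)
        return [k for k in rational
                if all(utilities[k][i] >= goal[i] for i in range(n))]

    lo, hi = 0.0, 1.0  # t = 0 is feasible because `rational` is non-empty
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if dominating(mid):
            lo = mid
        else:
            hi = mid
    return lo, dominating(lo)[0]


# Hypothetical utilities (player 1, player 2) for three candidate cut layers.
example_utilities = [(0.9, 0.2), (0.6, 0.6), (0.2, 0.9)]
t_star, best_cut = ksbs_cut_layer(example_utilities, disagreement=(0.0, 0.0))
```

With a finite candidate set the same point could be found by direct enumeration; the bisection form is shown only because it mirrors the feasibility-test procedure named in the abstract.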
We address the problem of predicting when a disease will develop, i.e., medical event time (MET), from a patient's electronic health record (EHR). The MET of non-communicable diseases like diabetes is highly correlated with cumulative health conditions, more specifically, with how much time the patient has spent with specific health conditions in the past. The common time-series representation extracts such information from EHR only indirectly because it focuses on detailed dependencies between values in successive observations, not on cumulative information. We propose a novel data representation for EHR called cumulative stay-time representation (CTR), which directly models such cumulative health conditions. We derive a trainable construction of CTR based on neural networks that has the flexibility to fit the target data and the scalability to handle high-dimensional EHR. Numerical experiments using synthetic and real-world datasets demonstrate that CTR alone achieves high prediction performance and that it enhances the performance of existing models when combined with them.
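The notion of "time spent with specific health conditions" can be illustrated with a small sketch. This is not the trainable, neural-network-based CTR construction described above; it is a fixed-bin simplification that assumes a single observed attribute, hand-chosen bin edges, and that each observed value holds until the next observation. The function name and the example numbers are hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence


def cumulative_stay_time(
    times: Sequence[float],
    values: Sequence[float],
    bin_edges: Sequence[float],
) -> List[float]:
    """Accumulate how long the attribute stayed in each value bin.

    Bin 0 is (-inf, edges[0]), bin k is [edges[k-1], edges[k]), and the
    last bin is [edges[-1], +inf). `bin_edges` must be sorted ascending.
    """
    stay = [0.0] * (len(bin_edges) + 1)
    for k in range(len(times) - 1):
        # Assumption: the value observed at times[k] persists until times[k+1].
        duration = times[k + 1] - times[k]
        state = bisect_right(bin_edges, values[k])
        stay[state] += duration
    return stay


# Hypothetical record: observation times (years) and a lab value at each visit.
visit_times = [0.0, 1.0, 3.0, 6.0]
lab_values = [5.0, 7.2, 5.5, 8.0]
stay_times = cumulative_stay_time(visit_times, lab_values, bin_edges=[6.0, 7.0])
```

The resulting vector has one entry per bin and could be fed to any downstream predictor in place of, or alongside, the raw time series.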
We investigate a model for image/video quality assessment that builds a set of codevectors representing basic properties of images, similar to the well-known CORNIA model. We analyze the codebook-building method and propose some modifications to it. We also investigate the algorithm with respect to reducing inference time. Both natural and synthetic images are used for building codebooks, and some analysis of the synthetic images used for codebooks is provided. We demonstrate that quality assessment results may be improved by using synthetic images for codebook construction. We also demonstrate regimes of the algorithm in which real-time execution on a CPU is possible while maintaining sufficiently high correlation with the mean opinion score (MOS). Various pooling strategies are considered, as well as the problem of metric sensitivity to bitrate.
If our aesthetic preferences are affected by fractal geometry of nature, scaling regularities would be expected to appear in all art forms, including music. While a variety of statistical tools have been proposed to analyze time series in sound, no consensus has as yet emerged regarding the most meaningful measure of complexity in music, or how to discern fractal patterns in compositions in the first place. Here we offer a new approach based on self-similarity of the melodic lines recurring at various temporal scales. In contrast to the statistical analyses advanced in recent literature, the proposed method does not depend on averaging within time-windows and is distinctively local. The corresponding definition of the fractal dimension is based on the temporal scaling hierarchy and depends on the tonal contours of the musical motifs. The new concepts are tested on musical 'renditions' of the Cantor Set and Koch Curve, and then applied to a number of carefully selected masterful compositions spanning five centuries of music making.
Event cameras are novel bio-inspired sensors that offer advantages over traditional cameras (low latency, high dynamic range, low power, etc.). Optical flow estimation methods that work on packets of events trade off speed for accuracy, while event-by-event (incremental) methods make strong assumptions and have not been tested on the common benchmarks that quantify progress in the field. For applications on resource-constrained devices, it is important to develop optical flow algorithms that are fast, lightweight, and accurate. This work leverages insights from neuroscience and proposes a novel optical flow estimation scheme based on triplet matching. Experiments on publicly available benchmarks demonstrate its capability to handle complex scenes with results comparable to those of prior packet-based algorithms. In addition, the proposed method achieves the fastest execution time (> 10 kHz) on standard CPUs, as it requires only three events per estimate. We hope that our research opens the door to real-time, incremental motion estimation methods and applications in real-world scenarios.
This research develops, trains, and evaluates a predictive model for imports of vegetable products into Peru using artificial intelligence algorithms, specifically the machine learning models LSTM and PROPHET. The forecast is made with data from Peru's monthly record of imports of vegetable products (in kilograms), collected from 2021 to 2022. As part of applying the training methodology for machine learning algorithms, an appropriate dataset is explored and constructed according to the parameters of a time series. Subsequently, the better-performing model is selected by evaluating the precision of its predicted values, so that they are reliable enough for the model to be considered a useful resource for forecasting imports in Peru.
Climate change, population growth, and water scarcity present unprecedented challenges for agriculture. This project aims to forecast soil moisture using domain knowledge and machine learning for crop management decisions that enable sustainable farming. Traditional methods for predicting hydrological response features require significant computational time and expertise. Recent work has implemented machine learning models as a tool for forecasting hydrological response features, but these models neglect a crucial insight of traditional hydrological modeling: spatially close units can have vastly different hydrological responses. In traditional hydrological modeling, units with similar hydrological properties are grouped together and share model parameters regardless of their spatial proximity. Inspired by this domain knowledge, we have constructed a novel domain-inspired temporal graph convolution neural network. Our approach involves clustering units based on time-varying hydrological properties, constructing graph topologies for each cluster, and forecasting soil moisture using graph convolutions and a gated recurrent neural network. We have trained, validated, and tested our method on field-scale time series data consisting of approximately 99,000 hydrological response units spanning 40 years in a case study in the northeastern United States. Comparison with existing models illustrates the effectiveness of using domain-inspired clustering with time series graph neural networks. The framework is being deployed as part of a pro bono social impact program. The trained models are being deployed on small-holding farms in central Texas.
In classic reinforcement learning algorithms, agents make decisions at discrete and fixed time intervals. The physical duration between one decision and the next becomes a critical hyperparameter. When this duration is too short, the agent needs to make many decisions to achieve its goal, aggravating the problem's difficulty. But when this duration is too long, the agent becomes incapable of controlling the system. Physical systems, however, do not need a constant control frequency. For learning agents, it is desirable to operate with low frequency when possible and high frequency when necessary. We propose a framework called Continuous-Time Continuous-Options (CTCO), where the agent chooses options as sub-policies of variable durations. Such options are time-continuous and can interact with the system at any desired frequency providing a smooth change of actions. The empirical analysis shows that our algorithm is competitive w.r.t. other time-abstraction techniques, such as classic option learning and action repetition, and practically overcomes the difficult choice of the decision frequency.
A virtual or digital tour is a form of virtual reality technology that allows a user to experience a specific location remotely. Currently, these virtual tours are created by following a two-step strategy. First, a photographer captures 360-degree equirectangular images; then, a team of annotators manually links these images for the "walkthrough" user experience. The major challenge in the mass adoption of virtual tours is the time and cost involved in manual annotation/linking of images. Therefore, this paper presents an end-to-end pipeline to automate the generation of 3D virtual tours using equirectangular images for real-estate properties. We propose a novel HSV-based coloring scheme for paper tags that need to be placed at different locations before capturing the equirectangular images with 360-degree cameras. These tags have two characteristics: i) they are numbered, which helps the photographer place the tags in sequence; and ii) they are bi-colored, which allows better learning of the tag detection task (using the YOLOv5 architecture) and the digit recognition task (using a custom MobileNet architecture). Finally, we link/connect all the equirectangular images based on the detected tags. We show the efficiency of the proposed pipeline on a real-world equirectangular image dataset collected from the Housing.com database.
We investigate the effects on authorship identification tasks of a fundamental shift in how to conceive the vectorial representations of documents that are given as input to a supervised learner. In "classic" authorship analysis, a feature vector represents a document, the value of a feature represents (an increasing function of) the relative frequency of the feature in the document, and the class label represents the author of the document. We instead investigate the situation in which a feature vector represents an unordered pair of documents, the value of a feature represents the absolute difference in the relative frequencies (or increasing functions thereof) of the feature in the two documents, and the class label indicates whether the two documents are from the same author or not. This latter (learner-independent) type of representation has occasionally been used before, but has never been studied systematically. We argue that it is advantageous, and that in some cases (e.g., authorship verification) it provides a much larger quantity of information to the training process than the standard representation. The experiments that we carry out on several publicly available datasets (one of which we here make available for the first time) show that feature vectors representing pairs of documents (which we here call Diff-Vectors, or DVs) bring about systematic improvements in the effectiveness of authorship identification tasks, especially when training data are scarce (as is often the case in real-life authorship identification scenarios). Our experiments tackle same-author verification, authorship verification, and closed-set authorship attribution; while DVs are naturally geared to solving the first, we also provide two novel methods for solving the second and third that use a solver for the first as a building block.
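The pair-based representation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: features are plain word relative frequencies over a small hand-picked vocabulary, no increasing function is applied to them, and the helper names (`relative_frequencies`, `diff_vector`, `build_pairs`) are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import List, Sequence, Tuple


def relative_frequencies(tokens: Sequence[str], vocabulary: Sequence[str]) -> List[float]:
    """Relative frequency of each vocabulary item in one document."""
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[w] / total if total else 0.0 for w in vocabulary]


def diff_vector(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Absolute feature-wise difference; symmetric, so the pair is unordered."""
    return [abs(a - b) for a, b in zip(x, y)]


def build_pairs(
    docs: Sequence[str], authors: Sequence[str], vocabulary: Sequence[str]
) -> Tuple[List[List[float]], List[int]]:
    """Turn n labelled documents into n*(n-1)/2 labelled pair vectors.

    Label 1 means "same author", label 0 means "different authors".
    """
    vectors = [relative_frequencies(d.split(), vocabulary) for d in docs]
    X, y = [], []
    for i, j in combinations(range(len(docs)), 2):
        X.append(diff_vector(vectors[i], vectors[j]))
        y.append(int(authors[i] == authors[j]))
    return X, y


# Toy corpus (invented for illustration).
toy_docs = ["the cat the dog", "the the the cat", "a dog a dog"]
toy_authors = ["A", "A", "B"]
toy_vocab = ["the", "a", "dog"]
pair_X, pair_y = build_pairs(toy_docs, toy_authors, toy_vocab)
```

Because every unordered pair of training documents yields one example, n documents produce n(n-1)/2 training vectors, which is one way to read the claim that this representation can supply more information to the learner when documents are scarce. The resulting `pair_X` and `pair_y` can be passed to any binary classifier.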