Augmentative and alternative communication (AAC) is a practical means of communication for people with language disabilities. In this study, we propose PicTalky, which is an AI-based AAC system that helps children with language developmental disabilities to improve their communication skills and language comprehension abilities. PicTalky can process both text and pictograms more accurately by connecting a series of neural-based NLP modules. Moreover, we perform quantitative and qualitative analyses on the essential features of PicTalky. It is expected that those suffering from language problems will be able to express their intentions or desires more easily and improve their quality of life by using this service. We have made the models freely available alongside a demonstration of the Web interface. Furthermore, we implemented robotics AAC for the first time by applying PicTalky to the NAO robot.
Grapevine winter pruning is a complex task, that requires skilled workers to execute it correctly. The complexity makes it time consuming. It is an operation that requires about 80-120 hours per hectare annually, making an automated robotic system that helps in speeding up the process a crucial tool in large-size vineyards. We will describe (a) a novel expert annotated dataset for grapevine segmentation, (b) a state of the art neural network implementation and (c) generation of pruning points following agronomic rules, leveraging the simplified structure of the plant. With this approach, we are able to generate a set of pruning points on the canes, paving the way towards a correct automation of grapevine winter pruning.
Estimating the travel time for a given path is a fundamental problem in many urban transportation systems. However, prior works fail to well capture moving behaviors embedded in paths and thus do not estimate the travel time accurately. To fill in this gap, in this work, we propose a novel neural network framework, namely {\em Deep Image-based Spatio-Temporal network (DeepIST)}, for travel time estimation of a given path. The novelty of DeepIST lies in the following aspects: 1) we propose to plot a path as a sequence of "generalized images" which include sub-paths along with additional information, such as traffic conditions, road network and traffic signals, in order to harness the power of convolutional neural network model (CNN) on image processing; 2) we design a novel two-dimensional CNN, namely {\em PathCNN}, to extract spatial patterns for lines in images by regularization and adopting multiple pooling methods; and 3) we apply a one-dimensional CNN to capture temporal patterns among the spatial patterns along the paths for the estimation. Empirical results show that DeepIST soundly outperforms the state-of-the-art travel time estimation models by 24.37\% to 25.64\% of mean absolute error (MAE) in multiple large-scale real-world datasets.
Surrogate testing techniques have been used widely to investigate the presence of dynamical nonlinearities, an essential ingredient of deterministic chaotic processes. Traditional surrogate testing subscribes to statistical hypothesis testing and investigates potential differences in discriminant statistics between the given empirical sample and its surrogate counterparts. The choice and estimation of the discriminant statistics can be challenging across short time series. Also, conclusion based on a single empirical sample is an inherent limitation. The present study proposes a recurrent neural network classification framework that uses the raw time series obviating the need for discriminant statistic while accommodating multiple time series realizations for enhanced generalizability of the findings. The results are demonstrated on short time series with lengths (L = 32, 64, 128) from continuous and discrete dynamical systems in chaotic regimes, nonlinear transform of linearly correlated noise and experimental data. Accuracy of the classifier is shown to be markedly higher than >> 50% for the processes in chaotic regimes whereas those of nonlinearly correlated noise were around ~50% similar to that of random guess from a one-sample binomial test. These results are promising and elucidate the usefulness of the proposed framework in identifying potential dynamical nonlinearities from short experimental time series.
Intelligent reflecting surface (IRS) has emerged as a key enabling technology to realize smart and reconfigurable radio environment for wireless communications, by digitally controlling the signal reflection via a large number of passive reflecting elements in real time. Different from conventional wireless communication techniques that only adapt to but have no or limited control over dynamic wireless channels, IRS provides a new and cost-effective means to combat the wireless channel impair-ments in a proactive manner. However, despite its great potential, IRS faces new and unique challenges in its efficient integration into wireless communication systems, especially its channel estimation and passive beamforming design under various practical hardware constraints. In this paper, we provide a comprehensive survey on the up-to-date research in IRS-aided wireless communications, with an emphasis on the promising solutions to tackle practical design issues. Furthermore, we discuss new and emerging IRS architectures and applications as well as their practical design problems to motivate future research.
Active learning theories and methods have been extensively studied in classical statistical learning settings. However, deep active learning, i.e., active learning with deep learning models, is usually based on empirical criteria without solid theoretical justification, thus suffering from heavy doubts when some of those fail to provide benefits in applications. In this paper, by exploring the connection between the generalization performance and the training dynamics, we propose a theory-driven deep active learning method (dynamicAL) which selects samples to maximize training dynamics. In particular, we prove that convergence speed of training and the generalization performance is positively correlated under the ultra-wide condition and show that maximizing the training dynamics leads to a better generalization performance. Further on, to scale up to large deep neural networks and data sets, we introduce two relaxations for the subset selection problem and reduce the time complexity from polynomial to constant. Empirical results show that dynamicAL not only outperforms the other baselines consistently but also scales well on large deep learning models. We hope our work inspires more attempts in bridging the theoretical findings of deep networks and practical impacts in deep active learning applications.
Electroencephalography (EEG) is a method of measuring the brain's electrical activity, using non-invasive scalp electrodes. In this article, we propose the Portiloop, a deep learning-based portable and low-cost device enabling the neuroscience community to capture EEG, process it in real time, detect patterns of interest, and respond with precisely-timed stimulation. The core of the Portiloop is a System on Chip composed of an Analog to Digital Converter (ADC) and a Field-Programmable Gate Array (FPGA). After being converted to digital by the ADC, the EEG signal is processed in the FPGA. The FPGA contains an ad-hoc Artificial Neural Network (ANN) with convolutional and recurrent units, directly implemented in hardware. The output of the ANN is then used to trigger the user-defined feedback. We use the Portiloop to develop a real-time sleep spindle stimulating application, as a case study. Sleep spindles are a specific type of transient oscillation ($\sim$2.5 s, 12-16 Hz) that are observed in EEG recordings, and are related to memory consolidation during sleep. We tested the Portiloop's capacity to detect and stimulate sleep spindles in real time using an existing database of EEG sleep recordings. With 71% for both precision and recall as compared with expert labels, the system is able to stimulate spindles within $\sim$300 ms of their onset, enabling experimental manipulation of early the entire spindle. The Portiloop can be extended to detect and stimulate other neural events in EEG. It is fully available to the research community as an open science project.
Massive amounts of satellite data have been gathered over time, holding the potential to unveil a spatiotemporal chronicle of the surface of Earth. These data allow scientists to investigate various important issues, such as land use changes, on a global scale. However, not all land-use phenomena are equally visible on satellite imagery. In particular, the creation of an inventory of the planet's road infrastructure remains a challenge, despite being crucial to analyze urbanization patterns and their impact. Towards this end, this work advances data-driven approaches for the automatic identification of roads based on open satellite data. Given the typical resolutions of these historical satellite data, we observe that there is inherent variation in the visibility of different road types. Based on this observation, we propose two deep learning frameworks that extend state-of-the-art deep learning methods by formalizing road detection as an ordinal classification task. In contrast to related schemes, one of the two models also resorts to satellite time series data that are potentially affected by missing data and cloud occlusion. Taking these time series data into account eliminates the need to manually curate datasets of high-quality image tiles, substantially simplifying the application of such models on a global scale. We evaluate our approaches on a dataset that is based on Sentinel~2 satellite imagery and OpenStreetMap vector data. Our results indicate that the proposed models can successfully identify large and medium-sized roads. We also discuss opportunities and challenges related to the detection of roads and other infrastructure on a global scale.
In recent years, the research landscape of machine learning in medical imaging has changed drastically from supervised to semi-, weakly- or unsupervised methods. This is mainly due to the fact that ground-truth labels are time-consuming and expensive to obtain manually. Generating labels from patient metadata might be feasible but it suffers from user-originated errors which introduce biases. In this work, we propose an unsupervised approach for automatically clustering and categorizing large-scale medical image datasets, with a focus on cardiac MR images, and without using any labels. We investigated the end-to-end training using both class-balanced and imbalanced large-scale datasets. Our method was able to create clusters with high purity and achieved over 0.99 cluster purity on these datasets. The results demonstrate the potential of the proposed method for categorizing unstructured large medical databases, such as organizing clinical PACS systems in hospitals.
Training neural networks using limited annotations is an important problem in the medical domain. Deep Neural Networks (DNNs) typically require large, annotated datasets to achieve acceptable performance which, in the medical domain, are especially difficult to obtain as they require significant time from expert radiologists. Semi-supervised learning aims to overcome this problem by learning segmentations with very little annotated data, whilst exploiting large amounts of unlabelled data. However, the best-known technique, which utilises inferred pseudo-labels, is vulnerable to inaccurate pseudo-labels degrading the performance. We propose a framework based on superpixels - meaningful clusters of adjacent pixels - to improve the accuracy of the pseudo labels and address this issue. Our framework combines superpixels with semi-supervised learning, refining the pseudo-labels during training using the features and edges of the superpixel maps. This method is evaluated on a multimodal magnetic resonance imaging (MRI) dataset for the task of brain tumour region segmentation. Our method demonstrates improved performance over the standard semi-supervised pseudo-labelling baseline when there is a reduced annotator burden and only 5 annotated patients are available. We report DSC=0.824 and DSC=0.707 for the test set whole tumour and tumour core regions respectively.