E-commerce with major online retailers is changing the way people consume. The goal of increasing delivery speed while remaining cost-effective poses significant new challenges for supply chains as they race to satisfy the growing and fast-changing demand. In this paper, we consider a warehouse with a Robotic Mobile Fulfillment System (RMFS), in which a fleet of robots stores and retrieves shelves of items and brings them to human pickers. To adapt to changing demand, uncertainty, and differentiated service (e.g., prime vs. regular), one can dynamically modify the storage allocation of a shelf. The objective is to define a dynamic storage policy to minimise the average cycle time used by the robots to fulfil requests. We propose formulating this system as a Partially Observable Markov Decision Process, and using a Deep Q-learning agent from Reinforcement Learning, to learn an efficient real-time storage policy that leverages repeated experiences and insightful forecasts using simulations. Additionally, we develop a rollout strategy to enhance our method by leveraging more information available at a given time step. Using simulations to compare our method to traditional storage rules used in the industry showed preliminary results up to 14\% better in terms of travelling times.
Transformers are state-of-the-art models for a variety of sequence modeling tasks. At their core is an attention function which models pairwise interactions between the inputs at every timestep. While attention is powerful, it does not scale efficiently to long sequences due to its quadratic time and space complexity in the sequence length. We propose RFA, a linear time and space attention that uses random feature methods to approximate the softmax function, and explore its application in transformers. RFA can be used as a drop-in replacement for conventional softmax attention and offers a straightforward way of learning with recency bias through an optional gating mechanism. Experiments on language modeling and machine translation demonstrate that RFA achieves similar or better performance compared to strong transformer baselines. In the machine translation experiment, RFA decodes twice as fast as a vanilla transformer. Compared to existing efficient transformer variants, RFA is competitive in terms of both accuracy and efficiency on three long text classification datasets. Our analysis shows that RFA's efficiency gains are especially notable on long sequences, suggesting that RFA will be particularly useful in tasks that require working with large inputs, fast decoding speed, or low memory footprints.
This work addresses the problem of reconstructing biomedical signals from their lower dimensional projections. Traditionally Compressed Sensing (CS) based techniques have been employed for this task. These are transductive inversion processes, the problem with these approaches is that the inversion is time-consuming and hence not suitable for real-time applications. With the recent advent of deep learning, Stacked Sparse Denoising Autoencoder (SSDAE) has been used for learning inversion in an inductive setup. The training period for inductive learning is large but is very fast during application -- capable of real-time speed. This work proposes a new approach for inductive learning of the inversion process. It is based on Coupled Analysis Dictionary Learning. Results on Biomedical signal reconstruction show that our proposed approach is very fast and yields result far better than CS and SSDAE.
Recognizing every person's action in a crowded and cluttered environment is a challenging task. In this paper, we propose a real-time action recognition method, Action4D, which gives reliable and accurate results in the real-world settings. We propose to tackle the action recognition problem using a holistic 4D "scan" of a cluttered scene to include every detail about the people and environment. Recognizing multiple people's actions in the cluttered 4D representation is a new problem. In this paper, we propose novel methods to solve this problem. We propose a new method to track people in 4D, which can reliably detect and follow each person in real time. We propose a new deep neural network, the Action4D-Net, to recognize the action of each tracked person. The Action4D-Net's novel structure uses both the global feature and the focused attention to achieve state-of-the-art result. Our real-time method is invariant to camera view angles, resistant to clutter and able to handle crowd. The experimental results show that the proposed method is fast, reliable and accurate. Our method paves the way to action recognition in the real-world applications and is ready to be deployed to enable smart homes, smart factories and smart stores.
In this survey, we analyze the newest machine learning (ML) techniques for optical orthogonal frequency division multiplexing (O-OFDM)-based optical communications. ML has been proposed to mitigate channel and transceiver imperfections. For instance, ML can improve the signal quality under low modulation extinction ratio or can tackle both determinist and stochastic-induced nonlinearities such as parametric noise amplification in long-haul transmission. The proposed ML algorithms for O-OFDM can in particularly tackle inter-subcarrier nonlinear effects such as four-wave mixing and cross-phase modulation. In essence, these ML techniques could be beneficial for any multi-carrier approach (e.g. filter bank modulation). Supervised and unsupervised ML techniques are analyzed in terms of both O-OFDM transmission performance and computational complexity for potential real-time implementation. We indicate the strict conditions under which a ML algorithm should perform classification, regression or clustering. The survey also discusses open research issues and future directions towards the ML implementation.
Intelligent reflecting surface (IRS) is being considered as a prospective candidate for next-generation wireless communication due to its ability to significantly improve coverage and spectral efficiency by controlling the propagation environment. One of the ways IRS increases spectral efficiency is by adjusting phase shifts to perform passive beamforming. In this letter, we integrate the concept of IRS-aided communication to the domain of multi-direction beamforming, whereby multiple receive antennas are selected to convey more information bits than existing spatial modulation (SM) techniques at any specific time. To complement this system, we also propose a successive signal detection (SSD) technique at the receiver. Numerical results show that the proposed design is able to improve the average successful bits transmitted (ASBT) by the system, which outperforms other state-of-the-art methods proposed in the literature.
Invariance to a broad array of image corruptions, such as warping, noise, or color shifts, is an important aspect of building robust models in computer vision. Recently, several new data augmentations have been proposed that significantly improve performance on ImageNet-C, a benchmark of such corruptions. However, there is still a lack of basic understanding on the relationship between data augmentations and test-time corruptions. To this end, we develop a feature space for image transforms, and then use a new measure in this space between augmentations and corruptions called the Minimal Sample Distance to demonstrate there is a strong correlation between similarity and performance. We then investigate recent data augmentations and observe a significant degradation in corruption robustness when the test-time corruptions are sampled to be perceptually dissimilar from ImageNet-C in this feature space. Our results suggest that test error can be improved by training on perceptually similar augmentations, and data augmentations may not generalize well beyond the existing benchmark. We hope our results and tools will allow for more robust progress towards improving robustness to image corruptions.
The material presented in this paper contributes to establishing a basis deemed essential for substantial progress in Automated Deduction. It identifies and studies global features in selected problems and their proofs which offer the potential of guiding proof search in a more direct way. The studied problems are of the wide-spread form of "axiom(s) and rule(s) imply goal(s)". The features include the well-known concept of lemmas. For their elaboration both human and automated proofs of selected theorems are taken into a close comparative consideration. The study at the same time accounts for a coherent and comprehensive formal reconstruction of historical work by {\L}ukasiewicz, Meredith and others. First experiments resulting from the study indicate novel ways of lemma generation to supplement automated first-order provers of various families, strengthening in particular their ability to find short proofs.
There is a lack of scalable quantitative measures of reactivity for functional groups in organic chemistry. Measuring reactivity experimentally is costly and time-consuming and does not scale to the astronomical size of chemical space. In previous quantum chemistry studies, we have introduced Methyl Cation Affinities (MCA*) and Methyl Anion Affinities (MAA*), using a solvation model, as quantitative measures of reactivity for organic functional groups over the broadest range. Although MCA* and MAA* offer good estimates of reactivity parameters, their calculation through Density Functional Theory (DFT) simulations is time-consuming. To circumvent this problem, we first use DFT to calculate MCA* and MAA* for more than 2,400 organic molecules thereby establishing a large dataset of chemical reactivity scores. We then design deep learning methods to predict the reactivity of molecular structures and train them using this curated dataset in combination with different representations of molecular structures. Using ten-fold cross-validation, we show that graph attention neural networks applied to informative input fingerprints produce the most accurate estimates of reactivity, achieving over 91% test accuracy for predicting the MCA* plus-minus 3.0 or MAA* plus-minus 3.0, over 50 orders of magnitude. Finally, we demonstrate the application of these reactivity scores to two tasks: (1) chemical reaction prediction; (2) combinatorial generation of reaction mechanisms. The curated dataset of MCA* and MAA* scores is available through the ChemDB chemoinformatics web portal at www.cdb.ics.uci.edu.
During the last decade or so, there has been an insurgence in the deep learning community to solve health-related issues, particularly breast cancer. Following the Camelyon-16 challenge in 2016, several researchers have dedicated their time to build Convolutional Neural Networks (CNNs) to help radiologists and other clinicians diagnose breast cancer. In particular, there has been an emphasis on Ductal Carcinoma in Situ (DCIS); the clinical term for early-stage breast cancer. Large companies have given their fair share of research into this subject, among these Google Deepmind who developed a model in 2020 that has proven to be better than radiologists themselves to diagnose breast cancer correctly. We found that among the issues which exist, there is a need for an explanatory system that goes through the hidden layers of a CNN to highlight those pixels that contributed to the classification of a mammogram. We then chose an open-source, reasonably successful project developed by Prof. Shen, using the CBIS-DDSM image database to run our experiments on. It was later improved using the Resnet-50 and VGG-16 patch-classifiers, analytically comparing the outcome of both. The results showed that the Resnet-50 one converged earlier in the experiments. Following the research by Montavon and Binder, we used the DeepTaylor Layer-wise Relevance Propagation (LRP) model to highlight those pixels and regions within a mammogram which contribute most to its classification. This is represented as a map of those pixels in the original image, which contribute to the diagnosis and the extent to which they contribute to the final classification. The most significant advantage of this algorithm is that it performs exceptionally well with the Resnet-50 patch classifier architecture.