In the past ten years, the use of 3D Time-of-Flight (ToF) LiDARs in mobile robotics has grown rapidly. Based on our accumulation of relevant research, this article systematically reviews and analyzes the use 3D ToF LiDARs in research and industrial applications. The former includes object detection, robot localization, long-term autonomy, LiDAR data processing under adverse weather conditions, and sensor fusion. The latter encompasses service robots, assisted and autonomous driving, and recent applications performed in response to public health crises. We hope that our efforts can effectively provide readers with relevant references and promote the deployment of existing mature technologies in real-world systems.
While recent automated data augmentation methods lead to state-of-the-art results, their design spaces and the derived data augmentation strategies still incorporate strong human priors. In this work, instead of fixing a set of hand-picked default augmentations alongside the searched data augmentations, we propose a fully automated approach for data augmentation search named Deep AutoAugment (DeepAA). DeepAA progressively builds a multi-layer data augmentation pipeline from scratch by stacking augmentation layers one at a time until reaching convergence. For each augmentation layer, the policy is optimized to maximize the cosine similarity between the gradients of the original and augmented data along the direction with low variance. Our experiments show that even without default augmentations, we can learn an augmentation policy that achieves strong performance with that of previous works. Extensive ablation studies show that the regularized gradient matching is an effective search method for data augmentation policies. Our code is available at: https://github.com/MSU-MLSys-Lab/DeepAA .
Developing neuromorphic intelligence on event-based datasets with spiking neural networks (SNNs) has recently attracted much research attention. However, the limited size of event-based datasets makes SNNs prone to overfitting and unstable convergence. This issue remains unexplored by previous academic works. In an effort to minimize this generalization gap, we propose neuromorphic data augmentation (NDA), a family of geometric augmentations specifically designed for event-based datasets with the goal of significantly stabilizing the SNN training and reducing the generalization gap between training and test performance. The proposed method is simple and compatible with existing SNN training pipelines. Using the proposed augmentation, for the first time, we demonstrate the feasibility of unsupervised contrastive learning for SNNs. We conduct comprehensive experiments on prevailing neuromorphic vision benchmarks and show that NDA yields substantial improvements over previous state-of-the-art results. For example, NDA-based SNN achieves accuracy gain on CIFAR10-DVS and N-Caltech 101 by 10.1% and 13.7%, respectively.
Background and Aim: Each stage of sleep can affect human health, and not getting enough sleep at any stage may lead to sleep disorder like parasomnia, apnea, insomnia, etc. Sleep-related diseases could be diagnosed using Convolutional Neural Network Classifier. However, this classifier has not been successfully implemented into sleep stage classification systems due to high complexity and low accuracy of classification. The aim of this research is to increase the accuracy and reduce the learning time of Convolutional Neural Network Classifier. Methodology: The proposed system used a modified Orthogonal Convolutional Neural Network and a modified Adam optimisation technique to improve the sleep stage classification accuracy and reduce the gradient saturation problem that occurs due to sigmoid activation function. The proposed system uses Leaky Rectified Linear Unit (ReLU) instead of sigmoid activation function as an activation function. Results: The proposed system called Enhanced Sleep Stage Classification system (ESSC) used six different databases for training and testing the proposed model on the different sleep stages. These databases are University College Dublin database (UCD), Beth Israel Deaconess Medical Center MIT database (MIT-BIH), Sleep European Data Format (EDF), Sleep EDF Extended, Montreal Archive of Sleep Studies (MASS), and Sleep Heart Health Study (SHHS). Our results show that the gradient saturation problem does not exist anymore. The modified Adam optimiser helps to reduce the noise which in turn result in faster convergence time. Conclusion: The convergence speed of ESSC is increased along with better classification accuracy compared to the state of art solution.
We consider the problem of extracting features from passive, multi-channel electroencephalogram (EEG) devices for downstream inference tasks related to high-level mental states such as stress and cognitive load. Our proposed method leverages recently developed multi-graph tools and applies them to the time series of graphs implied by the statistical dependence structure (e.g., correlation) amongst the multiple sensors. We compare the effectiveness of the proposed features to traditional band power-based features in the context of three classification experiments and find that the two feature sets offer complementary predictive information. We conclude by showing that the importance of particular channels and pairs of channels for classification when using the proposed features is neuroscientifically valid.
Recent years have witnessed an emerging paradigm shift toward embodied artificial intelligence, in which an agent must learn to solve challenging tasks by interacting with its environment. There are several challenges in solving embodied multimodal tasks, including long-horizon planning, vision-and-language grounding, and efficient exploration. We focus on a critical bottleneck, namely the performance of planning and navigation. To tackle this challenge, we propose a Neural SLAM approach that, for the first time, utilizes several modalities for exploration, predicts an affordance-aware semantic map, and plans over it at the same time. This significantly improves exploration efficiency, leads to robust long-horizon planning, and enables effective vision-and-language grounding. With the proposed Affordance-aware Multimodal Neural SLAM (AMSLAM) approach, we obtain more than $40\%$ improvement over prior published work on the ALFRED benchmark and set a new state-of-the-art generalization performance at a success rate of $23.48\%$ on the test unseen scenes.
The increasing amount of data to be processed on edge devices, such as cameras, has motivated Artificial Intelligence (AI) integration at the edge. Typical image processing methods performed at the edge, such as feature extraction or edge detection, use convolutional filters that are energy, computation, and memory hungry algorithms. But edge devices and cameras have scarce computational resources, bandwidth, and power and are limited due to privacy constraints to send data over to the cloud. Thus, there is a need to process image data at the edge. Over the years, this need has incited a lot of interest in implementing neuromorphic computing at the edge. Neuromorphic systems aim to emulate the biological neural functions to achieve energy-efficient computing. Recently, Oscillatory Neural Networks (ONN) present a novel brain-inspired computing approach by emulating brain oscillations to perform autoassociative memory types of applications. To speed up image edge detection and reduce its power consumption, we perform an in-depth investigation with ONNs. We propose a novel image processing method by using ONNs as a hetero-associative memory (HAM) for image edge detection. We simulate our ONN-HAM solution using first, a Matlab emulator, and then a fully digital ONN design. We show results on gray scale square evaluation maps, also on black and white and gray scale 28x28 MNIST images and finally on black and white 512x512 standard test images. We compare our solution with standard edge detection filters such as Sobel and Canny. Finally, using the fully digital design simulation results, we report on timing and resource characteristics, and evaluate its feasibility for real-time image processing applications. Our digital ONN-HAM solution can process images with up to 120x120 pixels (166 MHz system frequency) respecting real-time camera constraints. This work is the first to explore ONNs as hetero-associative memory for image processing applications.
An Xception model reaches state-of-the-art (SOTA) accuracy on the ESC-50 dataset for audio event detection through knowledge transfer from ImageNet weights, pretraining on AudioSet, and an on-the-fly data augmentation pipeline. This paper presents an ablation study that analyzes which components contribute to the boost in performance and training time. A smaller Xception model is also presented which nears SOTA performance with almost a third of the parameters.
In Neural Architecture Search (NAS), reducing the cost of architecture evaluation remains one of the most crucial challenges. Among a plethora of efforts to bypass training of each candidate architecture to convergence for evaluation, the Neural Tangent Kernel (NTK) is emerging as a promising theoretical framework that can be utilized to estimate the performance of a neural architecture at initialization. In this work, we revisit several at-initialization metrics that can be derived from the NTK and reveal their key shortcomings. Then, through the empirical analysis of the time evolution of NTK, we deduce that modern neural architectures exhibit highly non-linear characteristics, making the NTK-based metrics incapable of reliably estimating the performance of an architecture without some amount of training. To take such non-linear characteristics into account, we introduce Label-Gradient Alignment (LGA), a novel NTK-based metric whose inherent formulation allows it to capture the large amount of non-linear advantage present in modern neural architectures. With minimal amount of training, LGA obtains a meaningful level of rank correlation with the post-training test accuracy of an architecture. Lastly, we demonstrate that LGA, complemented with few epochs of training, successfully guides existing search algorithms to achieve competitive search performances with significantly less search cost. The code is available at: https://github.com/nutellamok/DemystifyingNTK.
Unmanned aerial vehicles (UAVs) can help facilitate cost-effective and flexible service provisioning in future smart cities. Nevertheless, UAV applications generally suffer severe flight time limitations due to constrained onboard battery capacity, causing a necessity of frequent battery recharging or replacement when performing persistent missions. Utilizing wireless mobile chargers, such as vehicles with wireless charging equipment for on-demand self-recharging has been envisioned as a promising solution to address this issue. In this article, we present a comprehensive study of \underline{v}ehicle-assisted \underline{w}ireless rechargeable \underline{U}AV \underline{n}etworks (VWUNs) to promote on-demand, secure, and efficient UAV recharging services. Specifically, we first discuss the opportunities and challenges of deploying VWUNs and review state-of-the-art solutions in this field. We then propose a secure and privacy-preserving VWUN framework for UAVs and ground vehicles based on differential privacy (DP). Within this framework, an online double auction mechanism is developed for optimal charging scheduling, and a two-phase DP algorithm is devised to preserve the sensitive bidding and energy trading information of participants. Experimental results demonstrate that the proposed framework can effectively enhance charging efficiency and security. Finally, we outline promising directions for future research in this emerging field.