Numerous Simultaneous Localization and Mapping (SLAM) algorithms have been presented in last decade using different sensor modalities. However, robust SLAM in extreme weather conditions is still an open research problem. In this paper, RadarSLAM, a full radar based graph SLAM system, is proposed for reliable localization and mapping in large-scale environments. It is composed of pose tracking, local mapping, loop closure detection and pose graph optimization, enhanced by novel feature matching and probabilistic point cloud generation on radar images. Extensive experiments are conducted on a public radar dataset and several self-collected radar sequences, demonstrating the state-of-the-art reliability and localization accuracy in various adverse weather conditions, such as dark night, dense fog and heavy snowfall.
Polarization images are known to be able to capture polarized reflected lights that preserve rich geometric cues of an object, which has motivated its recent applications in reconstructing detailed surface normal of the objects of interest. Meanwhile, inspired by the recent breakthroughs in human shape estimation from a single color image, we attempt to investigate the new question of whether the geometric cues from polarization camera could be leveraged in estimating detailed human body shapes. This has led to the curation of Polarization Human Shape and Pose Dataset (PHSPD)5, our home-grown polarization image dataset of various human shapes and poses.
An integral part of video analysis and surveillance is temporal activity detection, which means to simultaneously recognize and localize activities in long untrimmed videos. Currently, the most effective methods of temporal activity detection are based on deep learning, and they typically perform very well with large scale annotated videos for training. However, these methods are limited in real applications due to the unavailable videos about certain activity classes and the time-consuming data annotation. To solve this challenging problem, we propose a novel task setting called zero-shot temporal activity detection (ZSTAD), where activities that have never been seen in training can still be detected. We design an end-to-end deep network based on R-C3D as the architecture for this solution. The proposed network is optimized with an innovative loss function that considers the embeddings of activity labels and their super-classes while learning the common semantics of seen and unseen activities. Experiments on both the THUMOS14 and the Charades datasets show promising performance in terms of detecting unseen activities.
Chinese calligraphy is a unique art form with great artistic value but difficult to master. In this paper, we formulate the calligraphy writing problem as a trajectory optimization problem, and propose an improved virtual brush model for simulating the real writing process. Our approach is inspired by pseudospectral optimal control in that we parameterize the actuator trajectory for each stroke as a Chebyshev polynomial. The proposed dynamic virtual brush model plays a key role in formulating the objective function to be optimized. Our approach shows excellent performance in drawing aesthetically pleasing characters, and does so much more efficiently than previous work, opening up the possibility to achieve real-time closed-loop control.
With the aggressive growth of smart environments, a large amount of data are generated by edge devices. As a result, content delivery has been quickly pushed to network edges. Compared with classical content delivery networks, edge caches with smaller size usually suffer from more bursty requests, which makes conventional caching algorithms perform poorly in edge networks. This paper aims to propose an effective caching decision policy called PA-Cache that uses evolving deep learning to adaptively learn time-varying content popularity to decide which content to evict when the cache is full. Unlike prior learning-based approaches that either use a small set of features for decision making or require the entire training dataset to be available for learning a fine-tuned but might outdated prediction model, PA-Cache weights a large set of critical features to train the neural network in an evolving manner so as to meet the edge requests with fluctuations and bursts. We demonstrate the effectiveness of PA-Cache through extensive experiments with real-world data traces from a large commercial video-on-demand service provider. The evaluation shows that PA-Cache improves the hit rate in comparison with state-of-the-art methods at a lower computational cost.
Image data has been greatly produced by individuals and commercial vendors in the daily life, and it has been used across various domains, like advertising, medical and traffic analysis. Recently, image data also appears to be greatly important in social utility, like emergency response. However, the privacy concern becomes the biggest obstacle that prevents further exploration of image data, due to that the image could reveal sensitive information, like the personal identity and locations. The recent developed Local Differential Privacy (LDP) brings us a promising solution, which allows the data owners to randomly perturb their input to provide the plausible deniability of the data before releasing. In this paper, we consider a two-party image classification problem, in which data owners hold the image and the untrustworthy data user would like to fit a machine learning model with these images as input. To protect the image privacy, we propose to locally perturb the image representation before revealing to the data user. Subsequently, we analyze how the perturbation satisfies {\epsilon}-LDP and affect the data utility regarding count-based and distance-based machine learning algorithm, and propose a supervised image feature extractor, DCAConv, which produces an image representation with scalable domain size. Our experiments show that DCAConv could maintain a high data utility while preserving the privacy regarding multiple image benchmark datasets.
In machine learning, boosting is one of the most popular methods that designed to combine multiple base learners to a superior one. The well-known Boosted Decision Tree classifier, has been widely adopted in many areas. In the big data era, the data held by individual and entities, like personal images, browsing history and census information, are more likely to contain sensitive information. The privacy concern raises when such data leaves the hand of the owners and be further explored or mined. Such privacy issue demands that the machine learning algorithm should be privacy aware. Recently, Local Differential Privacy is proposed as an effective privacy protection approach, which offers a strong guarantee to the data owners, as the data is perturbed before any further usage, and the true values never leave the hands of the owners. Thus the machine learning algorithm with the private data instances is of great value and importance. In this paper, we are interested in developing the privacy-preserving boosting algorithm that a data user is allowed to build a classifier without knowing or deriving the exact value of each data samples. Our experiments demonstrate the effectiveness of the proposed boosting algorithm and the high utility of the learned classifiers.
Demand for smartwatches has taken off in recent years with new models which can run independently from smartphones and provide more useful features, becoming first-class mobile platforms. One can access online banking or even make payments on a smartwatch without a paired phone. This makes smartwatches more attractive and vulnerable to malicious attacks, which to date have been largely overlooked. In this paper, we demonstrate Snoopy, a password extraction and inference system which is able to accurately infer passwords entered on Android/Apple watches within 20 attempts, just by eavesdropping on motion sensors. Snoopy uses a uniform framework to extract the segments of motion data when passwords are entered, and uses novel deep neural networks to infer the actual passwords. We evaluate the proposed Snoopy system in the real-world with data from 362 participants and show that our system offers a 3-fold improvement in the accuracy of inferring passwords compared to the state-of-the-art, without consuming excessive energy or computational resources. We also show that Snoopy is very resilient to user and device heterogeneity: it can be trained on crowd-sourced motion data (e.g. via Amazon Mechanical Turk), and then used to attack passwords from a new user, even if they are wearing a different model. This paper shows that, in the wrong hands, Snoopy can potentially cause serious leaks of sensitive information. By raising awareness, we invite the community and manufacturers to revisit the risks of continuous motion sensing on smart wearable devices.
For high resolution scene mapping and object recognition, optical technologies such as cameras and LiDAR are the sensors of choice. However, for robust future vehicle autonomy and driver assistance in adverse weather conditions, improvements in automotive radar technology, and the development of algorithms and machine learning for robust mapping and recognition are essential. In this paper, we describe a methodology based on deep neural networks to recognise objects in 300GHz radar images, investigating robustness to changes in range, orientation and different receivers in a laboratory environment. As the training data is limited, we have also investigated the effects of transfer learning. As a necessary first step before road trials, we have also considered detection and classification in multiple object scenes.