Visible light positioning (VLP) is a promising technique because it can provide high-accuracy positioning based on the existing lighting infrastructure. However, existing approaches often require dense lighting distributions. Additionally, due to complicated indoor environments, it is still challenging to develop a robust VLP system. In this work, we propose a loosely coupled multi-sensor fusion method based on VLP and Simultaneous Localization and Mapping (SLAM), using light detection and ranging (LiDAR), odometry, and a rolling shutter camera. Our method can provide accurate and robust robot localization and navigation in LED-shortage or even outage situations. The efficacy of the proposed scheme is verified by extensive real-time experiments. The results show that our proposed scheme can provide an average accuracy of 2 cm, and the average computational time on low-cost embedded platforms is around 50 ms.
Aerial pixel-wise scene perception of the surrounding environment is an important task for UAVs (Unmanned Aerial Vehicles). Previous research works mainly adopt conventional pinhole cameras or fisheye cameras as the imaging device. However, these imaging systems cannot achieve a large Field of View (FoV), small size, and low weight at the same time. To this end, we design a UAV system with a Panoramic Annular Lens (PAL), which has the characteristics of small size, low weight, and a 360-degree annular FoV. A lightweight panoramic annular semantic segmentation neural network model is designed to achieve high-accuracy and real-time scene parsing. In addition, we present the first drone-perspective panoramic scene segmentation dataset, Aerial-PASS, with annotated labels of track, field, and others. A comprehensive variety of experiments shows that the designed system performs satisfactorily in aerial panoramic scene parsing. In particular, our proposed model strikes an excellent trade-off between segmentation performance and inference speed, validated on both public street-scene and our established aerial-scene datasets.
Unmanned aerial vehicles (UAVs) are reaching offshore. In this work, we formulate the novel problem of a marine locomotive quadrotor UAV, which manipulates the surge velocity of a floating buoy by means of a cable. The proposed robotic system can have a variety of novel applications for UAVs where their high speed and maneuverability, as well as their ease of deployment and wide field of vision, give them a superior advantage. In addition, the limited flight time of quadrotor UAVs, a major limitation, is typically addressed through an umbilical power cable, which naturally integrates with the proposed system. A detailed high-fidelity dynamic model is presented for the buoy, UAV, and water environment. In addition, a stable control system design is proposed to manipulate the surge velocity of the buoy within certain constraints that keep the buoy in contact with the water surface. Polar coordinates are used in the controller design process since they outperform traditional Cartesian-based velocity controllers when it comes to ensuring correlated effects on the tracking performance, where each control channel independently affects one control parameter. The system model and controller design are validated in numerical simulation under different wave scenarios.
Robot motion generation methods using machine learning have been studied in recent years. Bilateral control-based imitation learning can imitate human motions using force information. By means of this method, variable speed motion generation that considers physical phenomena such as the inertial force and friction can be achieved. Previous research demonstrated that the complex relationship between the force and speed can be learned by using a neural network model. However, the previous study only focused on a simple reciprocating motion. To learn the complex relationship between the force and speed more accurately, it is necessary to learn multiple actions using many joints. In this paper, we propose a variable speed motion generation method for multiple motions. We considered four types of neural network models for the motion generation and determined the best model for multiple motions at variable speeds. Subsequently, we used the best model to evaluate the reproducibility of the task completion time for the input completion time command. The results revealed that the proposed method could change the task completion time according to the specified completion time command in multiple motions.
SARS-CoV-2, like any other virus, continues to mutate as it spreads, according to an evolutionary process. Unlike any other virus, the number of currently available sequences of SARS-CoV-2 in public databases such as GISAID is already several million. This amount of data has the potential to uncover the evolutionary dynamics of a virus like never before. However, a million is already several orders of magnitude beyond what can be processed by the traditional methods designed to reconstruct a virus's evolutionary history, such as those that build a phylogenetic tree. Hence, new and scalable methods will need to be devised in order to make use of the ever-increasing number of viral sequences being collected. Since identifying variants is an important part of understanding the evolution of a virus, in this paper, we propose an approach based on clustering sequences to identify the current major SARS-CoV-2 variants. Using $k$-mer-based feature vector generation and efficient feature selection methods, our approach is effective in identifying variants, as well as being efficient and scalable to millions of sequences. Such a clustering method allows us to show the relative proportion of each variant over time, giving the rate of spread of each variant in different locations -- something which is important for vaccine development and distribution. We also compute the importance of each amino acid position of the spike protein in identifying a given variant in terms of information gain. Positions of high variant-specific importance tend to agree with those reported by the USA's Centers for Disease Control and Prevention (CDC), further supporting our approach.
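The $k$-mer feature vectors and the information-gain score mentioned in this abstract can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function names `kmer_vector` and `information_gain`, the default k = 3, and the 20-letter amino acid alphabet are all hypothetical choices, and the feature selection and clustering steps are omitted.

```python
import math
from collections import Counter
from itertools import product

# Assumption: the 20 standard amino acid letters; other symbols are ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_vector(sequence, k=3, alphabet=AMINO_ACIDS):
    """Return a fixed-length list of k-mer counts for one sequence.

    The vector has len(alphabet) ** k entries, ordered lexicographically
    by k-mer according to the order of letters in `alphabet`.
    """
    vocabulary = ["".join(p) for p in product(alphabet, repeat=k)]
    # Slide a window of width k over the sequence and count each substring.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return [counts.get(kmer, 0) for kmer in vocabulary]


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(residues, labels):
    """Information gain of one aligned position with respect to variant labels.

    residues[i] is the amino acid of sequence i at the position of interest,
    labels[i] is the (hypothetical) variant label of sequence i.
    """
    total = len(labels)
    groups = {}
    for residue, label in zip(residues, labels):
        groups.setdefault(residue, []).append(label)
    # Expected entropy of the labels after splitting on the residue.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy usage with a two-letter alphabet so the vector stays readable.
toy_vector = kmer_vector("ACAC", k=2, alphabet="AC")  # order: AA, AC, CA, CC
```

A position whose residue perfectly separates two equally frequent variants scores 1 bit, while a position that is identical across all sequences scores 0, which is the sense in which high-scoring positions are informative about a variant.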
Specularity prediction is essential to many computer vision applications because it provides important visual cues that can be used in Augmented Reality (AR), Simultaneous Localisation and Mapping (SLAM), 3D reconstruction and material modeling, thus improving scene understanding. However, it is a challenging task that requires a great deal of information about the scene, including the camera pose, the geometry of the scene, the light sources and the material properties. Our previous work has addressed this task by creating an explicit model using an ellipsoid whose projection fits the specularity image contours for a given camera pose. These ellipsoid-based approaches belong to a family of models called JOint-LIght MAterial Specularity (JOLIMAS), where we have attempted to gradually remove assumptions on the scene such as the geometry of the specular surfaces. However, our most recent approach is still limited to uniformly curved surfaces. This paper builds upon these methods by generalising JOLIMAS to any surface geometry while improving the quality of specularity prediction, without sacrificing computational performance. The proposed method establishes a link between surface curvature and specularity shape in order to lift the geometric assumptions from previous work. Contrary to previous work, our new model is built from a physics-based local illumination model, namely Torrance-Sparrow, providing a better model reconstruction. Specularity prediction using our new model is tested against the most recent JOLIMAS version on both synthetic and real sequences with objects of varying shape curvatures. Our method outperforms previous approaches in specularity prediction, including the real-time setup, as shown in the supplementary material using videos.
In this study, we explore the implications of integrating social distancing with emergency evacuation when a hurricane approaches a major city during the COVID-19 pandemic. Specifically, we compare DNN (Deep Neural Network)-based and non-DNN methods for generating evacuation strategies that minimize evacuation time while allowing for social distancing in rescue vehicles. A central question is whether a DNN-based method provides sufficient extra efficiency to accommodate social distancing in a time-constrained evacuation operation. We formulate the problem as a Capacitated Vehicle Routing Problem and solve it using one non-DNN solution (Sweep Algorithm) and one DNN-based solution (Deep Reinforcement Learning). The DNN-based solution can provide decision-makers with more efficient routing than the non-DNN solution. Although the DNN-based solution can save considerable time in evacuation routing, it does not come close to compensating for the extra time required for social distancing, and its advantage disappears as the vehicle capacity approaches the number of people per household.
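To make the non-DNN baseline concrete, the clustering step of a sweep heuristic for the Capacitated Vehicle Routing Problem can be sketched as below. This is a simplified illustration and not the study's implementation: the function name `sweep_routes`, the (x, y, demand) stop format, and the choice to visit stops in angular order without any further route improvement are assumptions made here. Demand stands for the number of people in a household, and capacity for the seats usable in one vehicle.

```python
import math


def sweep_routes(depot, stops, capacity):
    """Group stops into capacity-feasible routes by sweeping around the depot.

    depot    -- (x, y) of the shelter or vehicle base
    stops    -- list of (x, y, demand) tuples, one per household
    capacity -- maximum total demand one vehicle can carry
    Returns a list of routes; each route is a list of indices into `stops`.
    """
    def polar_angle(index):
        x, y, _ = stops[index]
        return math.atan2(y - depot[1], x - depot[0])

    # Sweep counter-clockwise, starting from the most negative angle.
    order = sorted(range(len(stops)), key=polar_angle)

    routes, current, load = [], [], 0
    for index in order:
        demand = stops[index][2]
        if demand > capacity:
            raise ValueError("a single stop exceeds the vehicle capacity")
        # Close the current route when the next stop would not fit.
        if current and load + demand > capacity:
            routes.append(current)
            current, load = [], 0
        current.append(index)
        load += demand
    if current:
        routes.append(current)
    return routes


# Hypothetical data: four households of two people around a depot at the origin.
households = [(1, 0, 2), (0, 1, 2), (-1, 0, 2), (0, -1, 2)]
normal = sweep_routes((0, 0), households, capacity=4)
distanced = sweep_routes((0, 0), households, capacity=2)
```

In this toy case, halving the usable capacity to model social distancing doubles the number of trips from two to four, which illustrates why routing efficiency alone may not offset the cost of distancing once capacity approaches household size.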
Surveillance and exploration of large environments is a tedious task. In spaces with limited environmental cues, random-like search appears to be an effective approach as it allows the robot to perform online coverage of environments using a simple design. One way to generate random-like scanning is to use nonlinear dynamical systems to impart chaos into the robot's controller. This results in the generation of unpredictable yet deterministic trajectories, allowing the designer to control the system and achieve a high scanning coverage. However, the unpredictability comes at the cost of increased coverage time and lack of scalability, both of which have been ignored by the state-of-the-art chaotic path planners. This study introduces a new scalable technique that helps a robot to steer away from obstacles and cover the entire space in a short period of time. The technique involves coupling and manipulating two chaotic systems to minimize the coverage time and enable online scanning of unknown environments with different properties. This technique resulted in a 49% boost, on average, in the robot's performance compared to the state-of-the-art planners. While ensuring unpredictability in the paths, the overall performance of the chaotic planner remained comparable to optimal systems.
Large, pre-trained transformer models like BERT have achieved state-of-the-art results on document understanding tasks, but most implementations can only consider 512 tokens at a time. For many real-world applications, documents can be much longer, and the segmentation strategies typically used on longer documents miss out on document structure and contextual information, hurting their results on downstream tasks. In our work on legal agreements, we find that visual cues such as layout, style, and placement of text in a document are strong features that are crucial to achieving an acceptable level of accuracy on long documents. We measure the impact of incorporating such visual cues, obtained via computer vision methods, on the accuracy of document understanding tasks including document segmentation, entity extraction, and attribute classification. Our method of segmenting documents based on structural metadata outperforms existing methods on four long-document understanding tasks as measured on the Contract Understanding Atticus Dataset.
In this paper, we present a novel neuroevolutionary method to identify the architecture and hyperparameters of convolutional autoencoders. To the best of our knowledge, this is the first use of a hypervolume indicator in the context of neural architecture search for autoencoders. Results show that images were compressed by a factor of more than 10 while still retaining enough information to achieve image classification for the majority of the tasks. Thus, this new approach can be used to speed up the AutoML pipeline for image compression.
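For readers unfamiliar with the hypervolume indicator, a two-objective version can be sketched as follows. The abstract does not state which objectives are used, so this example assumes two objectives that are both minimized (for instance, a reconstruction error and a latent code size, both hypothetical here); the function name `hypervolume_2d` and the reference point are likewise illustrative and not taken from the paper.

```python
def hypervolume_2d(points, reference):
    """Area dominated by `points` and bounded by `reference` (minimization).

    points    -- iterable of (f1, f2) objective pairs, lower is better
    reference -- (r1, r2) point that is worse than every useful solution
    """
    # Only points strictly better than the reference in both objectives count.
    candidates = sorted(
        p for p in points if p[0] < reference[0] and p[1] < reference[1]
    )
    area = 0.0
    best_second = reference[1]
    # Walk from the best first objective outward; each point that improves the
    # second objective adds a new rectangular slab to the dominated area.
    for first, second in candidates:
        if second < best_second:
            area += (reference[0] - first) * (best_second - second)
            best_second = second
    return area


# Hypothetical population of candidate autoencoders as (error, size) pairs.
front = [(1, 3), (2, 2), (3, 1)]
score = hypervolume_2d(front, reference=(4, 4))
```

A larger value means the set of candidates covers more of the objective space, so a search procedure can compare whole populations with a single number; adding a dominated candidate such as (3, 3) leaves the value unchanged.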