Most business decisions are made with analysis, but some are judgment calls not susceptible to analysis due to time or information constraints. In this article, we present a real-life case study of critical business decision making of PerceptIn, an autonomous driving technology startup. In early years of PerceptIn, PerceptIn had to make a decision on the design of computing systems for its autonomous vehicle products. By providing details on PerceptIn's decision process and the results of the decision, we hope to provide some insights that can be beneficial to entrepreneurs and engineering managers in technology startups.
Compressing massive LiDAR point clouds in real-time is critical to autonomous machines such as drones and self-driving cars. While most of the recent prior work has focused on compressing individual point cloud frames, this paper proposes a novel system that effectively compresses a sequence of point clouds. The idea to exploit both the spatial and temporal redundancies in a sequence of point cloud frames. We first identify a key frame in a point cloud sequence and spatially encode the key frame by iterative plane fitting. We then exploit the fact that consecutive point clouds have large overlaps in the physical space, and thus spatially encoded data can be (re-)used to encode the temporal stream. Temporal encoding by reusing spatial encoding data not only improves the compression rate, but also avoids redundant computations, which significantly improves the compression speed. Experiments show that our compression system achieves 40x to 90x compression rate, significantly higher than the MPEG's LiDAR point cloud compression standard, while retaining high end-to-end application accuracies. Meanwhile, our compression system has a compression speed that matches the point cloud generation rate by today LiDARs and out-performs existing compression systems, enabling real-time point cloud transmission.
Assuming hardware is the major constraint for enabling real-time mobile intelligence, the industry has mainly dedicated their efforts to developing specialized hardware accelerators for machine learning and inference. This article challenges the assumption. By drawing on a recent real-time AI optimization framework CoCoPIE, it maintains that with effective compression-compiler co-design, it is possible to enable real-time artificial intelligence on mainstream end devices without special hardware. CoCoPIE is a software framework that holds numerous records on mobile AI: the first framework that supports all main kinds of DNNs, from CNNs to RNNs, transformer, language models, and so on; the fastest DNN pruning and acceleration framework, up to 180X faster compared with current DNN pruning on other frameworks such as TensorFlow-Lite; making many representative AI applications able to run in real-time on off-the-shelf mobile devices that have been previously regarded possible only with special hardware support; making off-the-shelf mobile devices outperform a number of representative ASIC and FPGA solutions in terms of energy efficiency and/or performance.
Assuming hardware is the major constraint for enabling real-time mobile intelligence, the industry has mainly dedicated their efforts to developing specialized hardware accelerators for machine learning and inference. This article challenges the assumption. By drawing on a recent real-time AI optimization framework CoCoPIE, it maintains that with effective compression-compiler co-design, it is possible to enable real-time artificial intelligence on mainstream end devices without special hardware. CoCoPIE is a software framework that holds numerous records on mobile AI: the first framework that supports all main kinds of DNNs, from CNNs to RNNs, transformer, language models, and so on; the fastest DNN pruning and acceleration framework, up to 180X faster compared with current DNN pruning on other frameworks such as TensorFlow-Lite; making many representative AI applications able to run in real-time on off-the-shelf mobile devices that have been previously regarded possible only with special hardware support; making off-the-shelf mobile devices outperform a number of representative ASIC and FPGA solutions in terms of energy efficiency and/or performance.
E-commerce has evolved with the digital technology revolution over the years. Last-mile logistics service contributes a significant part of the e-commerce experience. In contrast to the traditional last-mile logistics services, smart logistics service with autonomous driving technologies provides a promising solution to reduce the delivery cost and to improve efficiency. However, the traffic conditions in complex traffic environments, such as those in China, are more challenging compared to those in well-developed countries. Many types of moving objects (such as pedestrians, bicycles, electric bicycles, and motorcycles, etc.) share the road with autonomous vehicles, and their behaviors are not easy to track and predict. This paper introduces a technical solution from JD.com, a leading E-commerce company in China, to the autonomous last-mile delivery in complex traffic environments. Concretely, the methodologies in each module of our autonomous vehicles are presented, together with safety guarantee strategies. Up to this point, JD.com has deployed more than 300 self-driving vehicles for trial operations in tens of provinces of China, with an accumulated 715,819 miles and up to millions of on-road testing hours.
In contrast to manned missions, the application of autonomous robots for space exploration missions decreases the safety concerns of the exploration missions while extending the exploration distance since returning transportation is not necessary for robotics missions. In addition, the employment of robots in these missions also decreases mission complexities and costs because there is no need for onboard life support systems: robots can withstand and operate in harsh conditions, for instance, extreme temperature, pressure, and radiation, where humans cannot survive. In this article, we introduce environments on Mars, review the existing autonomous driving techniques deployed on Earth, as well as explore technologies required to enable future commercial autonomous space robotic explorers. Last but not least, we also present that one of the urgent technical challenges for autonomous space explorers, namely, computing power onboard.
Bundle adjustment (BA) is a fundamental optimization technique used in many crucial applications, including 3D scene reconstruction, robotic localization, camera calibration, autonomous driving, space exploration, street view map generation etc. Essentially, BA is a joint non-linear optimization problem, and one which can consume a significant amount of time and power, especially for large optimization problems. Previous approaches of optimizing BA performance heavily rely on parallel processing or distributed computing, which trade higher power consumption for higher performance. In this paper we propose {\pi}-BA, the first hardware-software co-designed BA engine on an embedded FPGA-SoC that exploits custom hardware for higher performance and power efficiency. Specifically, based on our key observation that not all points appear on all images in a BA problem, we designed and implemented a Co-Observation Optimization technique to accelerate BA operations with optimized usage of memory and computation resources. Experimental results confirm that {\pi}-BA outperforms the existing software implementations in terms of performance and power consumption.
In this paper, we present the Trifo Visual Inertial Odometry (Trifo-VIO), a tightly-coupled filtering-based stereo VIO system using both points and lines. Line features help improve system robustness in challenging scenarios when point features cannot be reliably detected or tracked, e.g. low-texture environment or lighting change. In addition, we propose a novel lightweight filtering-based loop closing technique to reduce accumulated drift without global bundle adjustment or pose graph optimization. We formulate loop closure as EKF updates to optimally relocate the current sliding window maintained by the filter to past keyframes. We also present the Trifo Ironsides dataset, a new visual-inertial dataset, featuring high-quality synchronized stereo camera and IMU data from the Ironsides sensor [3] with various motion types and textures and millimeter-accuracy groundtruth. To validate the performance of the proposed system, we conduct extensive comparison with state-of-the-art approaches (OKVIS, VINS-MONO and S-MSCKF) using both the public EuRoC dataset and the Trifo Ironsides dataset.
Autonomous driving is not one single technology but rather a complex system integrating many technologies, which means that teaching autonomous driving is a challenging task. Indeed, most existing autonomous driving classes focus on one of the technologies involved. This not only fails to provide a comprehensive coverage, but also sets a high entry barrier for students with different technology backgrounds. In this paper, we present a modular, integrated approach to teaching autonomous driving. Specifically, we organize the technologies used in autonomous driving into modules. This is described in the textbook we have developed as well as a series of multimedia online lectures designed to provide technical overview for each module. Then, once the students have understood these modules, the experimental platforms for integration we have developed allow the students to fully understand how the modules interact with each other. To verify this teaching approach, we present three case studies: an introductory class on autonomous driving for students with only a basic technology background; a new session in an existing embedded systems class to demonstrate how embedded system technologies can be applied to autonomous driving; and an industry professional training session to quickly bring up experienced engineers to work in autonomous driving. The results show that students can maintain a high interest level and make great progress by starting with familiar concepts before moving onto other modules.
Enabling full robotic workloads with diverse behaviors on mobile systems with stringent resource and energy constraints remains a challenge. In recent years, attempts have been made to deploy single-accelerator-based computing platforms (such as GPU, DSP, or FPGA) to address this challenge, but with little success. The core problem is two-fold: firstly, different robotic tasks require different accelerators, and secondly, managing multiple accelerators simultaneously is overwhelming for developers. In this paper, we propose PIRT, the first robotic runtime framework to efficiently manage dynamic task executions on mobile systems with multiple accelerators as well as on the cloud to achieve better performance and energy savings. With PIRT, we enable a robot to simultaneously perform autonomous navigation with 25 FPS of localization, obstacle detection with 3 FPS, route planning, large map generation, and scene understanding, traveling at a max speed of 5 miles per hour, all within an 11W computing power envelope.