With the advancement of robotics and AI technologies in the past decade, we have now entered the age of autonomous machines. In this new age of information technology, autonomous machines, such as service robots, autonomous drones, delivery robots, and autonomous vehicles, rather than humans, will provide services. In this article, through examining the technical challenges and economic impact of the digital economy, we argue that scalability is both highly necessary from a technical perspective and significantly advantageous from an economic perspective, thus is the key for the autonomy industry to achieve its full potential. Nonetheless, the current development paradigm, dubbed Autonomy 1.0, scales with the number of engineers, instead of with the amount of data or compute resources, hence preventing the autonomy industry to fully benefit from the economies of scale, especially the exponentially cheapening compute cost and the explosion of available data. We further analyze the key scalability blockers and explain how a new development paradigm, dubbed Autonomy 2.0, can address these problems to greatly boost the autonomy industry.
Raman spectroscopy (RS) has been widely used for disease diagnosis, e.g., cardiovascular disease (CVD), owing to its efficiency and component-specific testing capabilities. A series of popular deep learning methods have recently been introduced to learn nuance features from RS for binary classifications and achieved outstanding performance than conventional machine learning methods. However, these existing deep learning methods still confront some challenges in classifying subtypes of CVD. For example, the nuance between subtypes is quite hard to capture and represent by intelligent models due to the chillingly similar shape of RS sequences. Moreover, medical history information is an essential resource for distinguishing subtypes, but they are underutilized. In light of this, we propose a multi-modality multi-scale model called M3S, which is a novel deep learning method with two core modules to address these issues. First, we convert RS data to various resolution images by the Gramian angular field (GAF) to enlarge nuance, and a two-branch structure is leveraged to get embeddings for distinction in the multi-scale feature extraction module. Second, a probability matrix and a weight matrix are used to enhance the classification capacity by combining the RS and medical history data in the multi-modality data fusion module. We perform extensive evaluations of M3S and found its outstanding performance on our in-house dataset, with accuracy, precision, recall, specificity, and F1 score of 0.9330, 0.9379, 0.9291, 0.9752, and 0.9334, respectively. These results demonstrate that the M3S has high performance and robustness compared with popular methods in diagnosing CVD subtypes.
Artificial intelligence-enabled histopathological data analysis has become a valuable assistant to the pathologist. However, existing models lack representation and inference abilities compared with those of pathologists, especially in cancer subtype diagnosis, which is unconvincing in clinical practice. For instance, pathologists typically observe the lesions of a slide from global to local, and then can give a diagnosis based on their knowledge and experience. In this paper, we propose a Data and Knowledge Co-driving (D&K) model to replicate the process of cancer subtype classification on a histopathological slide like a pathologist. Specifically, in the data-driven module, the bagging mechanism in ensemble learning is leveraged to integrate the histological features from various bags extracted by the embedding representation unit. Furthermore, a knowledge-driven module is established based on the Gestalt principle in psychology to build the three-dimensional (3D) expert knowledge space and map histological features into this space for metric. Then, the diagnosis can be made according to the Euclidean distance between them. Extensive experimental results on both public and in-house datasets demonstrate that the D&K model has a high performance and credible results compared with the state-of-the-art methods for diagnosing histopathological subtypes. Code: https://github.com/Dennis-YB/Data-and-Knowledge-Co-driving-for-Cancer-Subtypes-Classification
As Deep Neural Networks (DNNs) are increasingly deployed in safety critical and privacy sensitive applications such as autonomous driving and biometric authentication, it is critical to understand the fault-tolerance nature of DNNs. Prior work primarily focuses on metrics such as Failures In Time (FIT) rate and the Silent Data Corruption (SDC) rate, which quantify how often a device fails. Instead, this paper focuses on quantifying the DNN accuracy given that a transient error has occurred, which tells us how well a network behaves when a transient error occurs. We call this metric Resiliency Accuracy (RA). We show that existing RA formulation is fundamentally inaccurate, because it incorrectly assumes that software variables (model weights/activations) have equal faulty probability under hardware transient faults. We present an algorithm that captures the faulty probabilities of DNN variables under transient faults and, thus, provides correct RA estimations validated by hardware. To accelerate RA estimation, we reformulate RA calculation as a Monte Carlo integration problem, and solve it using importance sampling driven by DNN specific heuristics. Using our lightweight RA estimation method, we show that transient faults lead to far greater accuracy degradation than what todays DNN resiliency tools estimate. We show how our RA estimation tool can help design more resilient DNNs by integrating it with a Network Architecture Search framework.
Infrastructure-Vehicle Cooperative Autonomous Driving (IVCAD) is a new paradigm of autonomous driving, which relies on the cooperation between intelligent roads and autonomous vehicles. This paradigm has been shown to be safer and more efficient compared to the on-vehicle-only autonomous driving paradigm. Our real-world deployment data indicates that the effectiveness of IVCAD is constrained by reliability and performance of commercial communication networks. This paper targets this exact problem, and proposes INTERNEURON, a middleware to achieve high communication reliability between intelligent roads and autonomous vehicles, in the context of IVCAD. Specifically, INTERNEURON dynamically matches IVCAD applications and the underlying communication technologies based on varying communication performance and quality needs. Evaluation results confirm that INTERNEURON reduces deadline violations by more than 95\%, significantly improving the reliability of IVCAD systems.
Factor graph is a graph representing the factorization of a probability distribution function, and has been utilized in many autonomous machine computing tasks, such as localization, tracking, planning and control etc. We are developing an architecture with the goal of using factor graph as a common abstraction for most, if not, all autonomous machine computing tasks. If successful, the architecture would provide a very simple interface of mapping autonomous machine functions to the underlying compute hardware. As a first step of such an attempt, this paper presents our most recent work of developing a factor graph accelerator for LiDAR-Inertial Odometry (LIO), an essential task in many autonomous machines, such as autonomous vehicles and mobile robots. By modeling LIO as a factor graph, the proposed accelerator not only supports multi-sensor fusion such as LiDAR, inertial measurement unit (IMU), GPS, etc., but solves the global optimization problem of robot navigation in batch or incremental modes. Our evaluation demonstrates that the proposed design significantly improves the real-time performance and energy efficiency of autonomous machine navigation systems. The initial success suggests the potential of generalizing the factor graph architecture as a common abstraction for autonomous machine computing, including tracking, planning, and control etc.
This paper is the first to provide a thorough system design overview along with the fusion methods selection criteria of a real-world cooperative autonomous driving system, named Infrastructure-Augmented Autonomous Driving or IAAD. We present an in-depth introduction of the IAAD hardware and software on both road-side and vehicle-side computing and communication platforms. We extensively characterize the IAAD system in the context of real-world deployment scenarios and observe that the network condition that fluctuates along the road is currently the main technical roadblock for cooperative autonomous driving. To address this challenge, we propose new fusion methods, dubbed "inter-frame fusion" and "planning fusion" to complement the current state-of-the-art "intra-frame fusion". We demonstrate that each fusion method has its own benefit and constraint.
Robotic computing has reached a tipping point, with a myriad of robots (e.g., drones, self-driving cars, logistic robots) being widely applied in diverse scenarios. The continuous proliferation of robotics, however, critically depends on efficient computing substrates, driven by real-time requirements, robotic size-weight-and-power constraints, cybersecurity considerations, and dynamically changing scenarios. Within all platforms, FPGA is able to deliver both software and hardware solutions with low power, high performance, reconfigurability, reliability, and adaptivity characteristics, serving as the promising computing substrate for robotic applications. This paper highlights the current progress, design techniques, challenges, and open research challenges in the domain of robotic computing on FPGAs.
The commercialization of autonomous machines is a thriving sector, and likely to be the next major computing demand driver, after PC, cloud computing, and mobile computing. Nevertheless, a suitable computer architecture for autonomous machines is missing, and many companies are forced to develop ad hoc computing solutions that are neither scalable nor extensible. In this article, we analyze the demands of autonomous machine computing, and argue for the promise of dataflow architectures in autonomous machines.
Stereo matching is a critical task for robot navigation and autonomous vehicles, providing the depth estimation of surroundings. Among all stereo matching algorithms, Efficient Large-scale Stereo (ELAS) offers one of the best tradeoffs between efficiency and accuracy. However, due to the inherent iterative process and unpredictable memory access pattern, ELAS can only run at 1.5-3 fps on high-end CPUs and difficult to achieve real-time performance on low-power platforms. In this paper, we propose an energy-efficient architecture for real-time ELAS-based stereo matching on FPGA platform. Moreover, the original computational-intensive and irregular triangulation module is reformed in a regular manner with points interpolation, which is much more hardware-friendly. Optimizations, including memory management, parallelism, and pipelining, are further utilized to reduce memory footprint and improve throughput. Compared with Intel i7 CPU and the state-of-the-art CPU+FPGA implementation, our FPGA realization achieves up to 38.4x and 3.32x frame rate improvement, and up to 27.1x and 1.13x energy efficiency improvement, respectively.