Mixed-precision quantization (MPQ) suffers from time-consuming policy search process (i.e., the bit-width assignment for each layer) on large-scale datasets (e.g., ISLVRC-2012), which heavily limits its practicability in real-world deployment scenarios. In this paper, we propose to search the effective MPQ policy by using a small proxy dataset for the model trained on a large-scale one. It breaks the routine that requires a consistent dataset at model training and MPQ policy search time, which can improve the MPQ searching efficiency significantly. However, the discrepant data distributions bring difficulties in searching for such a transferable MPQ policy. Motivated by the observation that quantization narrows the class margin and blurs the decision boundary, we search the policy that guarantees a general and dataset-independent property: discriminability of feature representations. Namely, we seek the policy that can robustly keep the intra-class compactness and inter-class separation. Our method offers several advantages, i.e., high proxy data utilization, no extra hyper-parameter tuning for approximating the relationship between full-precision and quantized model and high searching efficiency. We search high-quality MPQ policies with the proxy dataset that has only 4% of the data scale compared to the large-scale target dataset, achieving the same accuracy as searching directly on the latter, and improving the MPQ searching efficiency by up to 300 times.
Predicting the demand for electricity with uncertainty helps in planning and operation of the grid to provide reliable supply of power to the consumers. Machine learning (ML)-based demand forecasting approaches can be categorized into (1) sample-based approaches, where each forecast is made independently, and (2) time series regression approaches, where some historical load and other feature information is used. When making a short-to-mid-term electricity demand forecast, some future information is available, such as the weather forecast and calendar variables. However, in existing forecasting models this future information is not fully incorporated. To overcome this limitation of existing approaches, we propose Masked Multi-Step Multivariate Probabilistic Forecasting (MMMPF), a novel and general framework to train any neural network model capable of generating a sequence of outputs, that combines both the temporal information from the past and the known information about the future to make probabilistic predictions. Experiments are performed on a real-world dataset for short-to-mid-term electricity demand forecasting for multiple regions and compared with various ML methods. They show that the proposed MMMPF framework outperforms not only sample-based methods but also existing time-series forecasting models with the exact same base models. Models trainded with MMMPF can also generate desired quantiles to capture uncertainty and enable probabilistic planning for grid of the future.
This paper deals with the development of a localization methodology for autonomous vehicles using only a $3\Dim$ LIDAR sensor. In the context of this paper, localizing a vehicle in a known 3D global map of the environment is essentially to find its global $3\Dim$ pose (position and orientation) within this map. The problem of tracking is then to use sequential LIDAR scan measurement to also estimate other states such as velocity and angular rates, in addition to the global pose of the vehicle. Particle filters are often used in localization and tracking, as in applications of simultaneously localization and mapping. But particle filters become computationally prohibitive with the increase in particles, often required to localize in a large $3\Dim$ map. Further, computing the likelihood of a LIDAR scan for each particle is in itself a computationally expensive task, thus limiting the number of particles that can be used for real time performance. To this end, we propose a hybrid approach that combines the advantages of a particle filter with a global-local scan matching method to better inform the re-sampling stage of the particle filter. Further, we propose to use a pre-computed likelihood grid to speedup the computation of LIDAR scans. Finally, we develop the complete algorithm to extensively leverage parallel processing to achieve near sufficient real-time performance on publicly available KITTI datasets.
Transport maps can ease the sampling of distributions with non-trivial geometries by transforming them into distributions that are easier to handle. The potential of this approach has risen with the development of Normalizing Flows (NF) which are maps parameterized with deep neural networks trained to push a reference distribution towards a target. NF-enhanced samplers recently proposed blend (Markov chain) Monte Carlo methods with either (i) proposal draws from the flow or (ii) a flow-based reparametrization. In both cases, the quality of the learned transport conditions performance. The present work clarifies for the first time the relative strengths and weaknesses of these two approaches. Our study concludes that multimodal targets can reliability be handled with flow-based proposals up to moderately high dimensions. In contrast, methods relying on reparametrization struggle with multimodality but are more robust otherwise in high-dimensional settings and under poor training. To further illustrate the influence of target-proposal adequacy, we also derive a new quantitative bound for the mixing time of the Independent Metropolis-Hastings sampler.
Text-to-image generation models represent the next step of evolution in image synthesis, offering natural means of flexible yet fine-grained control over the result. One emerging area of research is the rapid adaptation of large text-to-image models to smaller datasets or new visual concepts. However, the most efficient method of adaptation, called textual inversion, has a known limitation of long training time, which both restricts practical applications and increases the experiment time for research. In this work, we study the training dynamics of textual inversion, aiming to speed it up. We observe that most concepts are learned at early stages and do not improve in quality later, but standard model convergence metrics fail to indicate that. Instead, we propose a simple early stopping criterion that only requires computing the textual inversion loss on the same inputs for all training iterations. Our experiments on both Latent Diffusion and Stable Diffusion models for 93 concepts demonstrate the competitive performance of our method, speeding adaptation up to 15 times with no significant drops in quality.
This paper introduces a novel and general method to address the problem of using a general-purpose robot manipulator with a parallel gripper to wrap a deformable linear object (DLO), called a rope, around a rigid object, called a rod, autonomously. Our method does not require prior knowledge of the physical and geometrical properties of the objects but enables the robot to use real-time RGB-D perception to determine the wrapping state and feedback control to achieve high-quality results. As such, it provides the robot manipulator with the general capabilities to handle wrapping tasks of different rods or ropes. We tested our method on 6 combinations of 3 different ropes and 2 rods. The result shows that the wrapping quality improved and converged within 5 wraps for all test cases.
Explosive Ordnance Disposal (EOD) suits are widely used to protect human operators to execute emergency tasks such as bomb disposal and neutralization. Current suit designs still need to be improved in terms of wearer comfort, which can be assessed based on the interaction forces at the human-suit contact regions. This paper introduces a simulation-based modeling framework that computes the interaction loads at the human-suit interface based on a wearer's kinematic movement data. The proposed modeling framework consists of three primary components: a) inertial and geometric modeling of the EOD suit, b) state estimation of the wearer's in-suit movement, and c) inverse dynamics analysis to calculate the human-suit interface forces based on the simulated human-suit model and the estimated human movement data. This simulation-based modeling method could be used to complement experimental testing for improving the time and cost efficiency of EOD suit evaluation. The accuracy of the simulated interface load was experimentally benchmarked during three different human tasks (each with three trials), by comparing the predicted interface forces with that measured by commercial pressure sensors.
Recommendation systems are a core feature of social media companies with their uses including recommending organic and promoted contents. Many modern recommendation systems are split into multiple stages - candidate generation and heavy ranking - to balance computational cost against recommendation quality. We focus on the candidate generation phase of a large-scale ads recommendation problem in this paper, and present a machine learning first heterogeneous re-architecture of this stage which we term TwERC. We show that a system that combines a real-time light ranker with sourcing strategies capable of capturing additional information provides validated gains. We present two strategies. The first strategy uses a notion of similarity in the interaction graph, while the second strategy caches previous scores from the ranking stage. The graph based strategy achieves a 4.08% revenue gain and the rankscore based strategy achieves a 1.38% gain. These two strategies have biases that complement both the light ranker and one another. Finally, we describe a set of metrics that we believe are valuable as a means of understanding the complex product trade offs inherent in industrial candidate generation systems.
A new class of iterated linearization-based nonlinear filters, dubbed dynamically iterated filters, is presented. Contrary to regular iterated filters such as the iterated extended Kalman filter (IEKF), iterated unscented Kalman filter (IUKF) and iterated posterior linearization filter (IPLF), dynamically iterated filters also take nonlinearities in the transition model into account. The general filtering algorithm is shown to essentially be a (locally over one time step) iterated Rauch-Tung-Striebel smoother. Three distinct versions of the dynamically iterated filters are especially investigated: analogues to the IEKF, IUKF and IPLF. The developed algorithms are evaluated on 25 different noise configurations of a tracking problem with a nonlinear transition model and linear measurement model, a scenario where conventional iterated filters are not useful. Even in this "simple" scenario, the dynamically iterated filters are shown to have superior root mean-squared error performance as compared to their respective baselines, the EKF and UKF. Particularly, even though the EKF diverges in 22 out of 25 configurations, the dynamically iterated EKF remains stable in 20 out of 25 scenarios, only diverging under high noise.
Collective decision-making is an essential capability of large-scale multi-robot systems to establish autonomy on the swarm level. A large portion of literature on collective decision-making in swarm robotics focuses on discrete decisions selecting from a limited number of options. Here we assign a decentralized robot system with the task of exploring an unbounded environment, finding consensus on the mean of a measurable environmental feature, and aggregating at areas where that value is measured (e.g., a contour line). A unique quality of this task is a causal loop between the robots' dynamic network topology and their decision-making. For example, the network's mean node degree influences time to convergence while the currently agreed-on mean value influences the swarm's aggregation location, hence, also the network structure as well as the precision error. We propose a control algorithm and study it in real-world robot swarm experiments in different environments. We show that our approach is effective and achieves higher precision than a control experiment. We anticipate applications, for example, in containing pollution with surface vehicles.