The current dominant paradigm for robotic manipulation involves two separate stages: manipulator design and control. Because the robot's morphology and how it can be controlled are intimately linked, joint optimization of design and control can significantly improve performance. Existing methods for co-optimization are limited and fail to explore a rich space of designs. The primary reason is the trade-off between the design complexity necessary for contact-rich tasks and the practical constraints of manufacturing, optimization, contact handling, etc. We overcome several of these challenges by building an end-to-end differentiable framework for contact-aware robot design. The two key components of this framework are a novel deformation-based parameterization that allows for the design of articulated rigid robots with arbitrary, complex geometry, and a differentiable rigid body simulator that can handle contact-rich scenarios and computes analytical gradients for a full spectrum of kinematic and dynamic parameters. On multiple manipulation tasks, our framework outperforms existing methods that optimize only for control, optimize only for design using alternate representations, or co-optimize using gradient-free methods.
We present methods for co-designing rigid robots over control and morphology (including discrete topology) with respect to multiple objectives. Previous work has addressed problems in single-objective robot co-design or multi-objective control. However, the joint multi-objective co-design problem is extremely important for generating capable, versatile, algorithmically designed robots. In this work, we present Multi-Objective Graph Heuristic Search, which extends a single-objective graph heuristic search from previous work to enable a highly efficient multi-objective search in a combinatorial design topology space. Core to this approach, we introduce a new universal, multi-objective heuristic function based on graph neural networks that is able to effectively leverage learned information between different task trade-offs. We demonstrate our approach on six combinations of seven terrestrial locomotion and design tasks, including one three-objective example. We compare the captured Pareto fronts across different methods and demonstrate that our multi-objective graph heuristic search quantitatively and qualitatively outperforms other techniques.
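To make the notion of a Pareto front concrete, the following is a minimal sketch of non-dominated filtering, assuming every objective is to be maximized. It is an illustration only and not the search method described above; the function names and the (speed, stability) scores are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the points that no other point dominates (O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (speed, stability) scores for five candidate designs.
designs = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0), (2.0, 2.0)]
front = pareto_front(designs)
print(front)
```

Here (1.5, 3.0) and (2.0, 2.0) are both dominated by (2.0, 4.0), so three designs remain on the front.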
Cloth simulation has wide applications including computer animation, garment design, and robot-assisted dressing. In this work, we present a differentiable cloth simulator whose additional gradient information facilitates cloth-related applications. Our differentiable simulator extends the state-of-the-art cloth simulator based on Projective Dynamics with dry frictional contact governed by the Signorini-Coulomb law. We derive gradients with contact in this forward simulation framework and speed up the computation with Jacobi iteration inspired by previous differentiable simulation work. To the best of our knowledge, we present the first differentiable cloth simulator with the Coulomb law of friction. We demonstrate the efficacy of our simulator in various applications, including system identification, manipulation, inverse design, and a real-to-sim task. Many of our applications have not been demonstrated in previous differentiable cloth simulators. The gradient information from our simulator enables efficient gradient-based task solvers from which we observe a substantial speedup over standard gradient-free methods.
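As a rough illustration of the Signorini-Coulomb conditions mentioned above, the sketch below enforces them on a single contact force: the normal component cannot be adhesive, and the tangential component has magnitude at most mu times the normal component. This is a simple clamp for intuition, not the contact solver of the simulator; the function name and the numeric values are hypothetical.

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]


def clamp_to_friction_cone(normal: float, tangent: Vec2, mu: float) -> Tuple[float, Vec2]:
    """Return a (normal, tangent) force pair that satisfies the Coulomb cone."""
    if normal <= 0.0:
        # Signorini condition: contact cannot pull, so an open contact carries no force.
        return 0.0, (0.0, 0.0)
    limit = mu * normal
    magnitude = math.hypot(tangent[0], tangent[1])
    if magnitude <= limit:
        # Inside the cone: static friction can supply the requested force (sticking).
        return normal, tangent
    # Outside the cone: shrink the tangential force onto the cone boundary (sliding).
    scale = limit / magnitude
    return normal, (tangent[0] * scale, tangent[1] * scale)


# Requested tangential force of magnitude 5 against a limit of 0.25 * 10 = 2.5.
normal_force, friction = clamp_to_friction_cone(10.0, (3.0, 4.0), mu=0.25)
print(normal_force, friction)
```

The clamp preserves the direction of the tangential force and only reduces its magnitude, which corresponds to the sliding case of the Coulomb law.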
Polymers are widely studied materials with diverse properties and applications determined by different molecular structures. It is essential to represent these structures clearly and explore the full space of achievable chemical designs. However, existing approaches are unable to offer comprehensive design models for polymers because of their inherent scale and structural complexity. Here, we present a parametric, context-sensitive grammar designed specifically for the representation and generation of polymers. As a demonstrative example, we implement our grammar for polyurethanes. Using our symbolic hypergraph representation and 14 simple production rules, our PolyGrammar is able to represent and generate all valid polyurethane structures. We also present an algorithm to translate any polyurethane structure from the popular SMILES string format into our PolyGrammar representation. We test the representative power of PolyGrammar by translating a dataset of over 600 polyurethane samples collected from the literature. Furthermore, we show that PolyGrammar can be easily extended to other copolymers and homopolymers such as polyacrylates. By offering a complete, explicit representation scheme and an explainable generative model with validity guarantees, our PolyGrammar takes an important step toward a more comprehensive and practical system for polymer discovery and exploration. As the first bridge between formal languages and chemistry, PolyGrammar also serves as a critical blueprint to inform the design of similar grammars for other chemistries, including organic and inorganic molecules.
We present AutoOED, an Optimal Experiment Design platform powered by automated machine learning to accelerate the discovery of optimal solutions. The platform solves multi-objective optimization problems in a time- and data-efficient manner by automatically guiding the design of experiments to be evaluated. To automate the optimization process, we implement several multi-objective Bayesian optimization algorithms with state-of-the-art performance. AutoOED is open-source and written in Python. The codebase is modular, which facilitates extending and tailoring the code and lets it serve as a testbed for machine learning researchers to easily develop and evaluate their own multi-objective Bayesian optimization algorithms. An intuitive graphical user interface (GUI) is provided to visualize and guide the experiments for users with little or no experience with coding, machine learning, or optimization. Furthermore, a distributed system is integrated to enable parallelized experimental evaluations by independent workers in remote locations. The platform is available at https://autooed.org.
The computational design of soft underwater swimmers is challenging because of the high degrees of freedom in soft-body modeling. In this paper, we present a differentiable pipeline for co-designing a soft swimmer's geometry and controller. Our pipeline unlocks gradient-based algorithms for discovering novel swimmer designs more efficiently than traditional gradient-free solutions. We propose Wasserstein barycenters as a basis for the geometric design of soft underwater swimmers since they are differentiable and can naturally interpolate between bio-inspired base shapes via optimal transport. By combining this design space with differentiable simulation and control, we can efficiently optimize a soft underwater swimmer's performance with fewer simulations than baseline methods. We demonstrate the efficacy of our method on various design problems such as fast, stable, and energy-efficient swimming and demonstrate applicability to multi-objective design.
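To give some intuition for Wasserstein barycenters, the sketch below covers only the one-dimensional special case, where the 2-Wasserstein barycenter of equally sized, equally weighted sample sets is obtained by averaging their sorted samples (that is, their quantiles). Barycenters of 2D or 3D shapes require an optimal transport solver and are not shown; the function name and sample values are hypothetical.

```python
from typing import List, Sequence


def barycenter_1d(sample_sets: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> List[float]:
    """Weighted 2-Wasserstein barycenter of 1D empirical distributions.

    Assumes every sample set has the same size and that weights sum to 1.
    """
    n = len(sample_sets[0])
    if any(len(s) != n for s in sample_sets):
        raise ValueError("all sample sets must have the same size")
    if len(weights) != len(sample_sets) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("need one weight per sample set, summing to 1")
    # In 1D, optimal transport matches the i-th smallest samples to each other,
    # so the barycenter is the weighted mean of the sorted samples.
    ordered = [sorted(s) for s in sample_sets]
    return [sum(w * s[i] for w, s in zip(weights, ordered)) for i in range(n)]


shape_a = [0.0, 1.0, 2.0]
shape_b = [10.0, 14.0, 12.0]
blend = barycenter_1d([shape_a, shape_b], [0.5, 0.5])
print(blend)
```

Because the result is a smooth function of the weights, the weights can serve as continuous design parameters, which is the property that makes this kind of interpolation attractive for gradient-based design.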
Photorealistic editing of portraits is a challenging task as humans are very sensitive to inconsistencies in faces. We present an approach for high-quality intuitive editing of the camera viewpoint and scene illumination in a portrait image. This requires our method to capture and control the full reflectance field of the person in the image. Most editing approaches rely on supervised learning using training data captured with setups such as light and camera stages. Such datasets are expensive to acquire, not readily available and do not capture all the rich variations of in-the-wild portrait images. In addition, most supervised approaches only focus on relighting, and do not allow camera viewpoint editing. Thus, they only capture and control a subset of the reflectance field. Recently, portrait editing has been demonstrated by operating in the generative model space of StyleGAN. While such approaches do not require direct supervision, there is a significant loss of quality when compared to the supervised approaches. In this paper, we present a method which learns from limited supervised training data. The training images only include people in a fixed neutral expression with eyes closed, without much hair or background variations. Each person is captured under 150 one-light-at-a-time conditions and under 8 camera poses. Instead of training directly in the image space, we design a supervised problem which learns transformations in the latent space of StyleGAN. This combines the best of supervised learning and generative adversarial modeling. We show that the StyleGAN prior allows for generalisation to different expressions, hairstyles and backgrounds. This produces high-quality photorealistic results for in-the-wild images and significantly outperforms existing methods. Our approach can edit the illumination and pose simultaneously, and runs at interactive rates.
We present a novel, fast differentiable simulator for soft-body learning and control applications. Existing differentiable soft-body simulators can be classified into two categories based on their time integration methods. Simulators using an explicit time-stepping scheme require tiny time steps to avoid numerical instabilities in gradient computation, and simulators using implicit time integration typically compute gradients by employing the adjoint method to solve the expensive linearized dynamics. Inspired by Projective Dynamics (PD), we present DiffPD, an efficient differentiable soft-body simulator with implicit time integration. The key idea in DiffPD is to speed up backpropagation by exploiting the prefactorized Cholesky decomposition in PD to achieve a super-linear convergence rate. To handle contacts, DiffPD solves contact forces by analyzing a linear complementarity problem (LCP) and its gradients. With the assumption that contacts occur on a small number of nodes, we develop an efficient method for gradient computation by exploring the low-rank structure in the linearized dynamics. We evaluate the performance of DiffPD and observe a speedup of 4-19 times compared to the standard Newton's method in various applications including system identification, inverse design problems, trajectory optimization, and closed-loop control.
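The benefit of a prefactorized Cholesky decomposition can be sketched in a few lines: when the system matrix stays constant, it is factored once and every later solve costs only two triangular substitutions. The code below is a generic pure-Python illustration of that reuse on a small symmetric positive-definite matrix, not the DiffPD backpropagation scheme; the matrix and right-hand sides are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def cholesky(a: Matrix) -> Matrix:
    """Lower-triangular L with L * L^T = a, for symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - partial)
            else:
                low[i][j] = (a[i][j] - partial) / low[j][j]
    return low


def solve_with_factor(low: Matrix, b: List[float]) -> List[float]:
    """Solve (L * L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):  # forward substitution: L y = b
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # backward substitution: L^T x = y
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


system = [[4.0, 2.0], [2.0, 3.0]]
factor = cholesky(system)  # factor once
# Reuse the same factor for several right-hand sides.
solutions = [solve_with_factor(factor, rhs) for rhs in ([6.0, 5.0], [2.0, 1.0])]
print(solutions)
```

For an n-by-n dense matrix the factorization costs on the order of n^3 operations while each substitution pass costs on the order of n^2, so amortizing one factorization over many solves is where the savings come from.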
Parametric computer-aided design (CAD) is a standard paradigm used for the design of manufactured objects. CAD designers perform modeling operations, such as sketch and extrude, to form a construction sequence that makes up a final design. Despite the pervasiveness of parametric CAD and growing interest from the research community, a dataset of human-designed 3D CAD construction sequences has not been available to date. In this paper, we present the Fusion 360 Gallery reconstruction dataset and environment for learning CAD reconstruction. We provide a dataset of 8,625 designs, comprising sequential sketch and extrude modeling operations, together with a complementary environment called the Fusion 360 Gym, to assist with performing CAD reconstruction. We outline a standard CAD reconstruction task, together with evaluation metrics, and present results from a novel method using neurally guided search to recover a construction sequence from raw geometry.
The reflectance field of a face describes the reflectance properties responsible for complex lighting effects including diffuse, specular, inter-reflection and self-shadowing. Most existing methods for estimating the face reflectance from a monocular image assume faces to be diffuse, with very few approaches adding a specular component. This still leaves out important perceptual aspects of reflectance as higher-order global illumination effects and self-shadowing are not modeled. We present a new neural representation for face reflectance where we can estimate all components of the reflectance responsible for the final appearance from a single monocular image. Instead of modeling each component of the reflectance separately using parametric models, our neural representation allows us to generate a basis set of faces in a geometric deformation-invariant space, parameterized by the input light direction, viewpoint and face geometry. We learn to reconstruct this reflectance field of a face just from a monocular image, which can be used to render the face from any viewpoint in any light condition. Our method is trained on a light-stage training dataset, which captures 300 people illuminated with 150 light conditions from 8 viewpoints. We show that our method outperforms existing monocular reflectance reconstruction methods in terms of photorealism, due to better capturing of physical primitives such as sub-surface scattering, specularities, self-shadows and other higher-order effects.