Abstract:Deployable structures inspired by origami offer lightweight, compact, and reconfigurable solutions for robotic and architectural applications. We present a geometric and mechanical framework for Yoshimura-Ori modules that supports a diverse set of metastable states, including newly identified asymmetric "pop-out" and "hyperfolded" configurations. These states are governed by three parameters -- tilt angle, phase shift, and slant height -- and enable discrete, programmable transformations. Using this model, we develop forward and inverse kinematic strategies to stack modules into deployable booms that approximate complex 3D shapes. We validate our approach through mechanical tests and demonstrate a tendon- and pneumatically-actuated Yoshimura Space Crane capable of object manipulation, solar tracking, and high load-bearing performance. A meter-scale solar charging station further illustrates the design's scalability. These results establish Yoshimura-Ori structures as a promising platform for adaptable, multifunctional deployable systems in both terrestrial and space environments.
Abstract:Physical computing has emerged as a powerful tool for performing intelligent tasks directly in the mechanical domain of functional materials and robots, reducing our reliance on the more traditional COMS computers. However, no systematic study explains how mechanical design can influence physical computing performance. This study sheds insights into this question by repurposing an origami-inspired modular robotic manipulator into an adaptive physical reservoir and systematically evaluating its computing capacity with different physical configurations, input setups, and computing tasks. By challenging this adaptive reservoir computer to complete the classical NARMA benchmark tasks, this study shows that its time series emulation performance directly correlates to the Peak Similarity Index (PSI), which quantifies the frequency spectrum correlation between the target output and reservoir dynamics. The adaptive reservoir also demonstrates perception capabilities, accurately extracting its payload weight and orientation information from the intrinsic dynamics. Importantly, such information extraction capability can be measured by the spatial correlation between nodal dynamics within the reservoir body. Finally, by integrating shape memory alloy (SMA) actuation, this study demonstrates how to exploit such computing power embodied in the physical body for practical, robotic operations. This study provides a strategic framework for harvesting computing power from soft robots and functional materials, demonstrating how design parameters and input selection can be configured based on computing task requirements. Extending this framework to bio-inspired adaptive materials, prosthetics, and self-adaptive soft robotic systems could enable next-generation embodied intelligence, where the physical structure can compute and interact with their digital counterparts.
Abstract:Soft robots have become increasingly popular for complex manipulation tasks requiring gentle and safe contact. However, their softness makes accurate control challenging, and high-fidelity sensing is a prerequisite to adequate control performance. To this end, many flexible and embedded sensors have been created over the past decade, but they inevitably increase the robot's complexity and stiffness. This study demonstrates a novel approach that uses simple bending strain gauges embedded inside a modular arm to extract complex information regarding its deformation and working conditions. The core idea is based on physical reservoir computing (PRC): A soft body's rich nonlinear dynamic responses, captured by the inter-connected bending sensor network, could be utilized for complex multi-modal sensing with a simple linear regression algorithm. Our results show that the soft modular arm reservoir can accurately predict body posture (bending angle), estimate payload weight, determine payload orientation, and even differentiate two payloads with only minimal difference in weight -- all using minimal digital computing power.
Abstract:Over the past decades, we have witnessed a rapid emergence of soft and reconfigurable robots thanks to their capability to interact safely with humans and adapt to complex environments. However, their softness makes accurate control very challenging. High-fidelity sensing is critical in improving control performance, especially posture and contact estimation. To this end, traditional camera-based sensors and load cells have limited portability and accuracy, and they will inevitably increase the robot's cost and weight. In this study, instead of using specialized sensors, we only collect distributed pressure data inside a pneumatics-driven soft arm and apply the physical reservoir computing principle to simultaneously predict its kinematic posture (i.e., bending angle) and payload status (i.e., payload mass). Our results show that, with careful readout training, one can obtain accurate bending angle and payload mass predictions via simple, weighted linear summations of pressure readings. In addition, our comparative analysis shows that, to guarantee low prediction errors within 10\%, bending angle prediction requires less training data than payload prediction. This result reveals that balanced linear and nonlinear body dynamics are critical for the physical reservoir to accomplish complex proprioceptive and exteroceptive information perception tasks. Finally, the method of exploring the most efficient readout training methods presented in this paper could be extended to other soft robotic systems to maximize their perception capabilities.
Abstract:This paper documents our characterization study and practices for serving text-to-image requests with stable diffusion models in production. We first comprehensively analyze inference request traces for commercial text-to-image applications. It commences with our observation that add-on modules, i.e., ControlNets and LoRAs, that augment the base stable diffusion models, are ubiquitous in generating images for commercial applications. Despite their efficacy, these add-on modules incur high loading overhead, prolong the serving latency, and swallow up expensive GPU resources. Driven by our characterization study, we present SwiftDiffusion, a system that efficiently generates high-quality images using stable diffusion models and add-on modules. To achieve this, SwiftDiffusion reconstructs the existing text-to-image serving workflow by identifying the opportunities for parallel computation and distributing ControlNet computations across multiple GPUs. Further, SwiftDiffusion thoroughly analyzes the dynamics of image generation and develops techniques to eliminate the overhead associated with LoRA loading and patching while preserving the image quality. Last, SwiftDiffusion proposes specialized optimizations in the backbone architecture of the stable diffusion models, which are also compatible with the efficient serving of add-on modules. Compared to state-of-the-art text-to-image serving systems, SwiftDiffusion reduces serving latency by up to 5x and improves serving throughput by up to 2x without compromising image quality.
Abstract:Compositional Zero-shot Learning (CZSL) aims to identify novel compositions via known attribute-object pairs. The primary challenge in CZSL tasks lies in the significant discrepancies introduced by the complex interaction between the visual primitives of attribute and object, consequently decreasing the classification performance towards novel compositions. Previous remarkable works primarily addressed this issue by focusing on disentangling strategy or utilizing object-based conditional probabilities to constrain the selection space of attributes. Unfortunately, few studies have explored the problem from the perspective of modeling the mechanism of visual primitive interactions. Inspired by the success of vanilla adversarial learning in Cross-Domain Few-Shot Learning, we take a step further and devise a model-agnostic and Primitive-Based Adversarial training (PBadv) method to deal with this problem. Besides, the latest studies highlight the weakness of the perception of hard compositions even under data-balanced conditions. To this end, we propose a novel over-sampling strategy with object-similarity guidance to augment target compositional training data. We performed detailed quantitative analysis and retrieval experiments on well-established datasets, such as UT-Zappos50K, MIT-States, and C-GQA, to validate the effectiveness of our proposed method, and the state-of-the-art (SOTA) performance demonstrates the superiority of our approach. The code is available at https://github.com/lisuyi/PBadv_czsl.
Abstract:Yoshimura origami is a classical folding pattern that has inspired many deployable structure designs. Its applications span from space exploration, kinetic architectures, and soft robots to even everyday household items. However, despite its wide usage, Yoshimura has been fixated on a set of design constraints to ensure its flat-foldability. Through extensive kinematic analysis and prototype tests, this study presents a new Yoshimura that intentionally defies these constraints. Remarkably, one can impart a unique meta-stability by using the Golden Ratio angle to define the triangular facets of a generalized Yoshimura. As a result, when its facets are strategically popped out, a ``Golden Ratio Yoshimura'' boom with $m$ modules can be theoretically reconfigured into $8^m$ geometrically unique and load-bearing shapes. This result not only challenges the existing design norms but also opens up a new avenue to create deployable and versatile structural systems.
Abstract:This study presents a modular, electronics-free, and fully onboard control and actuation approach for SMA-based soft robots to achieve locomotion tasks. This approach exploits the nonlinear mechanics of compliant curved beams and carefully designed mechanical control circuits to create and synchronize rhythmic deformation cycles, mimicking the central pattern generators (CPG) prevalent in animal locomotions. More specifically, the study elucidates a new strategy to amplify the actuation performance of the shape memory coil actuator by coupling it to a carefully designed, mono-stable curve beam with a snap-through buckling behavior. Such SMA-curved beam assembly is integrated with an entirely mechanical circuit featuring a slider mechanism. This circuit can automatically cut off and supply current to the SMA according to its deformation status, generating a self-sustained rhythmic deformation cycle using a simple DC power supply. Finally, this study presents a new strategy to coordinate (synchronize) two rhythmic deformation cycles from two robotic modules to achieve efficient crawling locomotion but still use a single DC power. This work represents a significant step towards fully autonomous, electronics-free SMA-based locomotion robots with fully onboard actuation and control.
Abstract:In this paper, we experimentally examine the cognitive capability of a simple, paper-based Miura-ori -- using the physical reservoir computing framework -- to achieve different information perception tasks. The body dynamics of Miura-ori (aka. its vertices displacements), which is excited by a simple harmonic base excitation, can be exploited as the reservoir computing resource. By recording these dynamics with a high-resolution camera and image processing program and then using linear regression for training, we show that the origami reservoir has sufficient computing capacity to estimate the weight and position of a payload. It can also recognize the input frequency and magnitude patterns. Furthermore, multitasking is achievable by simultaneously applying two targeted functions to the same reservoir state matrix. Therefore, we demonstrate that Miura-ori can assess the dynamic interactions between its body and ambient environment to extract meaningful information -- an intelligent behavior in the mechanical domain. Given that Miura-ori has been widely used to construct deployable structures, lightweight materials, and compliant robots, enabling such information perception tasks can add a new dimension to the functionality of such a versatile structure.
Abstract:A new paradigm called physical reservoir computing has recently emerged, where the nonlinear dynamics of high-dimensional and fixed physical systems are harnessed as a computational resource to achieve complex tasks. Via extensive simulations based on a dynamic truss-frame model, this study shows that an origami structure can perform as a dynamic reservoir with sufficient computing power to emulate high-order nonlinear systems, generate stable limit cycles, and modulate outputs according to dynamic inputs. This study also uncovers the linkages between the origami reservoir's physical designs and its computing power, offering a guideline to optimize the computing performance. Comprehensive parametric studies show that selecting optimal feedback crease distribution and fine-tuning the underlying origami folding designs are the most effective approach to improve computing performance. Furthermore, this study shows how origami's physical reservoir computing power can apply to soft robotic control problems by a case study of earthworm-like peristaltic crawling without traditional controllers. These results can pave the way for origami-based robots with embodied mechanical intelligence.