Abstract:Industrial robotic manipulation demands reliable long-horizon execution across embodiments, tasks, and changing object distributions. While Vision-Language-Action models have demonstrated strong generalization, they remain fundamentally reactive. By optimizing the next action given the current observation without evaluating potential futures, they are brittle to the compounding failure modes of long-horizon tasks. Cortex 2.0 shifts from reactive control to plan-and-act by generating candidate future trajectories in visual latent space, scoring them for expected success and efficiency, then committing only to the highest-scoring candidate. We evaluate Cortex 2.0 on a single-arm and dual-arm manipulation platform across four tasks of increasing complexity: pick and place, item and trash sorting, screw sorting, and shoebox unpacking. Cortex 2.0 consistently outperforms state-of-the-art Vision-Language-Action baselines, achieving the best results across all tasks. The system remains reliable in unstructured environments characterized by heavy clutter, frequent occlusions, and contact-rich manipulation, where reactive policies fail. These results demonstrate that world-model-based planning can operate reliably in complex industrial environments.




Abstract:Swarm robotics systems have the potential to transform warfighting in urban environments, but until now have not seen large-scale field testing. We present the Rapid Integration Swarming Ecosystem (RISE), a platform for future multi-agent research and deployment. RISE enables rapid integration of third-party swarm tactics and behaviors, which was demonstrated using both physical and simulated swarms. Our physical testbed is composed of more than 250 networked heterogeneous agents and has been extensively tested in mock warfare scenarios at five urban combat training ranges. RISE implements live, virtual, constructive simulation capabilities to allow the use of both virtual and physical agents simultaneously, while our "fluid fidelity" simulation enables adaptive scaling between low and high fidelity simulation levels based on dynamic runtime requirements. Both virtual and physical agents are controlled with a unified gesture-based interface that enables a greater than 150:1 agent-to-operator ratio. Through this interface, we enable efficient swarm-based mission execution. RISE translates mission needs to robot actuation with rapid tactic integration, a reliable testbed, and efficient operation.