Robot systems in education can leverage Large language models' (LLMs) natural language understanding capabilities to provide assistance and facilitate learning. This paper proposes a multimodal interactive robot (PhysicsAssistant) built on YOLOv8 object detection, cameras, speech recognition, and chatbot using LLM to provide assistance to students' physics labs. We conduct a user study on ten 8th-grade students to empirically evaluate the performance of PhysicsAssistant with a human expert. The Expert rates the assistants' responses to student queries on a 0-4 scale based on Bloom's taxonomy to provide educational support. We have compared the performance of PhysicsAssistant (YOLOv8+GPT-3.5-turbo) with GPT-4 and found that the human expert rating of both systems for factual understanding is the same. However, the rating of GPT-4 for conceptual and procedural knowledge (3 and 3.2 vs 2.2 and 2.6, respectively) is significantly higher than PhysicsAssistant (p < 0.05). However, the response time of GPT-4 is significantly higher than PhysicsAssistant (3.54 vs 1.64 sec, p < 0.05). Hence, despite the relatively lower response quality of PhysicsAssistant than GPT-4, it has shown potential for being used as a real-time lab assistant to provide timely responses and can offload teachers' labor to assist with repetitive tasks. To the best of our knowledge, this is the first attempt to build such an interactive multimodal robotic assistant for K-12 science (physics) education.
To tackle the "reality gap" encountered in Sim-to-Real transfer, this study proposes a diffusion-based framework that minimizes inconsistencies in grasping actions between the simulation settings and realistic environments. The process begins by training an adversarial supervision layout-to-image diffusion model(ALDM). Then, leverage the ALDM approach to enhance the simulation environment, rendering it with photorealistic fidelity, thereby optimizing robotic grasp task training. Experimental results indicate this framework outperforms existing models in both success rates and adaptability to new environments through improvements in the accuracy and reliability of visual grasping actions under a variety of conditions. Specifically, it achieves a 75\% success rate in grasping tasks under plain backgrounds and maintains a 65\% success rate in more complex scenarios. This performance demonstrates this framework excels at generating controlled image content based on text descriptions, identifying object grasp points, and demonstrating zero-shot learning in complex, unseen scenarios.
Multi-agent and multi-robot systems (MRS) often rely on direct communication for information sharing. This work explores an alternative approach inspired by eavesdropping mechanisms in nature that involves casual observation of agent interactions to enhance decentralized knowledge dissemination. We achieve this through a novel IKT-BT framework tailored for a behavior-based MRS, encapsulating knowledge and control actions in Behavior Trees (BT). We present two new BT-based modalities - eavesdrop-update (EU) and eavesdrop-buffer-update (EBU) - incorporating unique eavesdropping strategies and efficient episodic memory management suited for resource-limited swarm robots. We theoretically analyze the IKT-BT framework for an MRS and validate the performance of the proposed modalities through extensive experiments simulating a search and rescue mission. Our results reveal improvements in both global mission performance outcomes and agent-level knowledge dissemination with a reduced need for direct communication.
To circumvent persistent connectivity to the cloud infrastructure, the current emphasis on computing at network edge devices in the multi-robot domain is a promising enabler for delay-sensitive jobs, yet its adoption is rife with challenges. This paper proposes a novel utility-aware dynamic task offloading strategy based on a multi-edge-robot system that takes into account computation, communication, and task execution load to minimize the overall service time for delay-sensitive applications. Prior to task offloading, continuous device, network, and task profiling are performed, and for each task assigned, an edge with maximum utility is derived using a weighted utility maximization technique, and a system reward assignment for task connectivity or sensitivity is performed. A scheduler is in charge of task assignment, whereas an executor is responsible for task offloading on edge devices. Experimental comparisons of the proposed approach with conventional offloading methods indicate better performance in terms of optimizing resource utilization and minimizing task latency.
We propose integrating the edge-computing paradigm into the multi-robot collaborative scheduling to maximize resource utilization for complex collaborative tasks, which many robots must perform together. Examples include collaborative map-merging to produce a live global map during exploration instead of traditional approaches that schedule tasks on centralized cloud-based systems to facilitate computing. Our decentralized approach to a consensus-based scheduling strategy benefits a multi-robot-edge collaboration system by adapting to dynamic computation needs and communication-changing statistics as the system tries to optimize resources while maintaining overall performance objectives. Before collaborative task offloading, continuous device, and network profiling are performed at the computing resources, and the distributed scheduling scheme then selects the resource with maximum utility derived using a utility maximization approach. Thorough evaluations with and without edge servers on simulation and real-world multi-robot systems demonstrate that a lower task latency, a large throughput gain, and better frame rate processing may be achieved compared to the conventional edge-based systems.
Urban vehicle-to-vehicle (V2V) link scheduling with shared spectrum is a challenging problem. Its main goal is to find the scheduling policy that can maximize system performance (usually the sum capacity of each link or their energy efficiency). Given that each link can experience interference from all other active links, the scheduling becomes a combinatorial integer programming problem and generally does not scale well with the number of V2V pairs. Moreover, link scheduling requires accurate channel state information (CSI), which is very difficult to estimate with good accuracy under high vehicle mobility. In this paper, we propose an end-to-end urban V2V link scheduling method called Map2Schedule, which can directly generate V2V scheduling policy from the city map and vehicle locations. Map2Schedule delivers comparable performance to the physical-model-based methods in urban settings while maintaining low computation complexity. This enhanced performance is achieved by machine learning (ML) technologies. Specifically, we first deploy the convolutional neural network (CNN) model to estimate the CSI from street layout and vehicle locations and then apply the graph embedding model for optimal scheduling policy. The results show that the proposed method can achieve high accuracy with much lower overhead and latency.
Relative localization is crucial for multi-robot systems to perform cooperative tasks, especially in GPS-denied environments. Current techniques for multi-robot relative localization rely on expensive or short-range sensors such as cameras and LIDARs. As a result, these algorithms face challenges such as high computational complexity, dependencies on well-structured environments, etc. To overcome these limitations, we propose a new distributed approach to perform relative localization using a Gaussian Processes map of the Radio Signal Strength Indicator (RSSI) values from a single wireless Access Point (AP) to which the robots are connected. Our approach, Gaussian Processes-based Relative Localization (GPRL), combines two pillars. First, the robots locate the AP w.r.t. their local reference frames using novel hierarchical inferencing that significantly reduces computational complexity. Secondly, the robots obtain relative positions of neighbor robots with an AP-oriented vector transformation. The approach readily applies to resource-constrained devices and relies only on the ubiquitously-available RSSI measurement. We extensively validate the performance of the two pillars of the proposed GRPL in Robotarium simulations. We also demonstrate the applicability of GPRL through a multi-robot rendezvous task with a team of three real-world robots. The results demonstrate that GPRL outperformed state-of-the-art approaches regarding accuracy, computation, and real-time performance.
Localizing mobile robotic nodes in indoor and GPS-denied environments is a complex problem, particularly in dynamic, unstructured scenarios where traditional cameras and LIDAR-based sensing and localization modalities may fail. Alternatively, wireless signal-based localization has been extensively studied in the literature yet primarily focuses on fingerprinting and feature-matching paradigms, requiring dedicated environment-specific offline data collection. We propose an online robot localization algorithm enabled by collaborative wireless sensor nodes to remedy these limitations. Our approach's core novelty lies in obtaining the Collaborative Direction of Arrival (CDOA) of wireless signals by exploiting the geometric features and collaboration between wireless nodes. The CDOA is combined with the Expectation Maximization (EM) and Particle Filter (PF) algorithms to calculate the Gaussian probability of the node's location with high efficiency and accuracy. The algorithm relies on RSSI-only data, making it ubiquitous to resource-constrained devices. We theoretically analyze the approach and extensively validate the proposed method's consistency, accuracy, and computational efficiency in simulations, real-world public datasets, as well as real robot demonstrations. The results validate the method's real-time computational capability and demonstrate considerably-high centimeter-level localization accuracy, outperforming relevant state-of-the-art localization approaches.
Frontier exploration and reinforcement learning have historically been used to solve the problem of enabling many mobile robots to autonomously and cooperatively explore complex surroundings. These methods need to keep an internal global map for navigation, but they do not take into consideration the high costs of communication and information sharing between robots. This study offers CQLite, a novel distributed Q-learning technique designed to minimize data communication overhead between robots while achieving rapid convergence and thorough coverage in multi-robot exploration. The proposed CQLite method uses ad hoc map merging, and selectively shares updated Q-values at recently identified frontiers to significantly reduce communication costs. The theoretical analysis of CQLite's convergence and efficiency, together with extensive numerical verification on simulated indoor maps utilizing several robots, demonstrates the method's novelty. With over 2x reductions in computation and communication alongside improved mapping performance, CQLite outperformed cutting-edge multi-robot exploration techniques like Rapidly Exploring Random Trees and Deep Reinforcement Learning.
The availability of accurate localization is critical for multi-robot exploration strategies; noisy or inconsistent localization causes failure in meeting exploration objectives. We aim to achieve high localization accuracy with contemporary exploration map belief and vice versa without needing global localization information. This paper proposes a novel simultaneous exploration and localization (SEAL) approach, which uses Gaussian Processes (GP)-based information fusion for maximum exploration while performing communication graph optimization for relative localization. Both these cross-dependent objectives were integrated through the Rao-Blackwellization technique. Distributed linearized convex hull optimization is used to select the next-best unexplored region for distributed exploration. SEAL outperformed cutting-edge methods on exploration and localization performance in extensive ROS-Gazebo simulations, illustrating the practicality of the approach in real-world applications.