Abstract: The evaluation of drag-based image editing models is unreliable due to a lack of standardized benchmarks and metrics. This ambiguity stems from inconsistent evaluation protocols and, critically, from the absence of datasets containing ground-truth target images, which makes objective comparison between competing methods difficult. To address this, we introduce \textbf{RealDrag}, the first comprehensive benchmark for point-based image editing that includes paired ground-truth target images. Our dataset contains over 400 human-annotated samples from diverse video sources, providing source/target images, handle/target points, editable-region masks, and descriptive captions for both the image and the editing action. We also propose four novel, task-specific metrics: Semantical Distance (SeD), Outer Mask Preserving Score (OMPS), Inner Patch Preserving Score (IPPS), and Directional Similarity (DiS). These metrics are designed to quantify pixel-level matching fidelity, check preservation of non-edited (out-of-mask) regions, and measure semantic alignment with the desired task. Using this benchmark, we conduct the first large-scale systematic analysis of the field, evaluating 17 state-of-the-art models. Our results reveal clear trade-offs among current approaches and establish a robust, reproducible baseline to guide future research. Our dataset and evaluation toolkit will be made publicly available.
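
The abstract does not define OMPS, so the following is only a generic illustration of what "preservation of non-edited (out-of-mask) regions" can mean, not the benchmark's actual metric. It assumes grayscale images given as nested lists with values in [0, 255] and a mask that is truthy inside the editable region; the function name `outside_mask_preservation` and the choice of mean absolute difference are hypothetical.

```python
def outside_mask_preservation(source, edited, edit_mask, max_value=255.0):
    """Return a score in [0, 1]; 1.0 means every out-of-mask pixel is unchanged.

    Illustrative stand-in only: one minus the normalized mean absolute
    difference over pixels that lie OUTSIDE the editable region.
    """
    total_difference = 0.0
    pixel_count = 0
    for source_row, edited_row, mask_row in zip(source, edited, edit_mask):
        for source_px, edited_px, inside_mask in zip(source_row, edited_row, mask_row):
            if not inside_mask:  # only pixels the edit was not allowed to touch
                total_difference += abs(source_px - edited_px)
                pixel_count += 1
    if pixel_count == 0:
        raise ValueError("the mask covers the whole image; nothing to preserve")
    return 1.0 - total_difference / (pixel_count * max_value)


# 3x3 toy image: only the centre pixel is editable.
source = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
mask = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
clean_edit = [[0, 0, 0], [0, 255, 0], [0, 0, 0]]    # change stays inside the mask
leaky_edit = [[255, 0, 0], [0, 255, 0], [0, 0, 0]]  # one outside pixel also changed

clean_score = outside_mask_preservation(source, clean_edit, mask)
leaky_score = outside_mask_preservation(source, leaky_edit, mask)
```

In this toy case an edit confined to the mask scores 1.0, while an edit that leaks outside the mask scores lower; a perceptual or feature-based distance could be substituted for the pixel difference.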




Abstract: This paper introduces a distributed, leaderless swarm formation control framework that collectively drives a swarm of robots to track a time-varying formation. The swarm's formation is captured by the trajectory of an abstract shape that circumscribes the convex hull of the robots' positions and is independent of the number of robots and their ordering in the swarm. For each robot in the swarm, given global specifications in terms of the trajectory of the abstract shape parameters, the proposed framework synthesizes a control law that steers the swarm to track the desired formation using only the information available from the robot's local neighbors. To this end, we generate, by solving the input-output linearization problem, a suitable local reference trajectory for the robot controller to track. Here, we select the swarm output to be the parameters of the abstract shape. We design a dynamic average consensus estimator to estimate these parameters, and the estimates are used as swarm state feedback to generate the robot's reference trajectory. We demonstrate the effectiveness and robustness of the proposed control framework through a simulation of coordinated collective navigation of a group of car-like robots in the presence of robot and communication link failures.
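
To make the estimator idea concrete, here is a minimal sketch of a textbook discrete-time dynamic average consensus update, in which each robot tracks the network-wide average of time-varying local signals using only its neighbors' estimates. It is not the estimator designed in the paper: the scalar signals, the ring topology, the step size, and the function name `dynamic_average_consensus` are all assumptions made for illustration.

```python
def dynamic_average_consensus(neighbors, reference_signals, step_size=0.2):
    """Track the average of per-robot signals with neighbor-only communication.

    neighbors[i]            : indices of robot i's neighbors (undirected graph)
    reference_signals[k][i] : local signal of robot i at time step k
    Returns estimates[k][i], robot i's estimate of the average at step k.
    """
    # Initializing each estimate with the robot's own signal makes the sum of
    # the estimates equal the sum of the signals at every step.
    estimates = [list(reference_signals[0])]
    for k in range(1, len(reference_signals)):
        previous = estimates[-1]
        current = []
        for i, neighbor_ids in enumerate(neighbors):
            # Consensus term: move toward the neighbors' estimates.
            disagreement = sum(previous[j] - previous[i] for j in neighbor_ids)
            # Feedforward term: inject the change in the robot's own signal.
            signal_change = reference_signals[k][i] - reference_signals[k - 1][i]
            current.append(previous[i] + step_size * disagreement + signal_change)
        estimates.append(current)
    return estimates


# Four robots on a ring; each one only talks to its two neighbors.
ring = [[1, 3], [0, 2], [1, 3], [0, 2]]

# Constant local signals: every estimate should converge to their mean, 2.0.
static_signals = [[0.0, 1.0, 2.0, 5.0] for _ in range(200)]
static_estimates = dynamic_average_consensus(ring, static_signals)

# Time-varying local signals: the sum of the estimates follows the sum of the signals.
moving_signals = [[0.1 * k, 1.0 + 0.1 * k, 2.0, 5.0 - 0.05 * k] for k in range(50)]
moving_estimates = dynamic_average_consensus(ring, moving_signals)
```

In a formation setting the local signals would be quantities computed from each robot's own position, so that their average (or a function of it) yields the shape parameters; with an undirected graph and a sufficiently small step size, the estimates converge to the true average for constant signals and track it with bounded error for slowly varying ones.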