Abstract: Recent advances in Multimodal Large Language Models (MLLMs) have enabled their use as intelligent agents for smartphone operation. However, existing methods depend on the Android Debug Bridge (ADB) for data transmission and action execution, limiting their applicability to Android devices. In this work, we introduce the novel Embodied Smartphone Operation (ESO) task and present See-Control, a framework that enables smartphone operation via direct physical interaction with a low-DoF robotic arm, offering a platform-agnostic solution. See-Control comprises three key components: (1) an ESO benchmark with 155 tasks and corresponding evaluation metrics; (2) an MLLM-based embodied agent that generates robotic control commands without requiring ADB or system back-end access; and (3) a richly annotated dataset of operation episodes, offering valuable resources for future research. By bridging the gap between digital agents and the physical world, See-Control provides a concrete step toward enabling home robots to perform smartphone-dependent tasks in realistic environments.




Abstract: Accurate and reliable localization is crucial for various wireless communication applications. Numerous studies have proposed accurate localization methods using hybrid received signal strength (RSS) and angle of arrival (AOA) measurements. However, these studies typically assume identical measurement noise distributions for all anchor nodes, an assumption that may not hold in real-world scenarios where the noise varies from node to node. In this paper, we propose a simple and efficient localization method based on hybrid RSS-AOA measurements that accounts for the differing measurement noise of different nodes. We derive a closed-form estimator for the target location based on the linear weighted least squares (LWLS) algorithm, with each LWLS equation weighted by the inverse of its residual variance. Because the variances of the LWLS equation residuals are unknown, we employ a two-stage LWLS method for estimation. The proposed method is computationally efficient, adaptable to different types of wireless communication systems and environments, and provides more accurate and reliable localization results compared to existing RSS-AOA localization techniques. Additionally, we derive the Cramér-Rao Lower Bound (CRLB) for the RSS-AOA signal sequences used in the proposed method. Simulation results demonstrate the superiority of the proposed method.
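
The two-stage idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's estimator: the log-distance path-loss model and its parameters (`p0`, `gamma`, `d0`), the AOA convention (bearing from anchor to target, measured at the anchor), the linearization into per-sample position fixes, and all function names are assumptions made for illustration. Stage 1 solves an unweighted least-squares problem, its residuals give one variance estimate per anchor, and stage 2 re-solves with weights equal to the inverse of those variances.

```python
import numpy as np

def rss_to_distance(rss, p0=-40.0, gamma=3.0, d0=1.0):
    # Assumed log-distance path-loss model: rss = p0 - 10*gamma*log10(d/d0).
    return d0 * 10.0 ** ((p0 - rss) / (10.0 * gamma))

def weighted_ls(A, b, w):
    # Closed-form weighted least squares: solve (A^T W A) x = A^T W b.
    Aw = A * w[:, None]
    return np.linalg.solve(A.T @ Aw, Aw.T @ b)

def two_stage_lwls(anchors, rss_seq, aoa_seq):
    """anchors: (n, 2); rss_seq, aoa_seq: (n, k) measurement sequences."""
    n, k = rss_seq.shape
    d = rss_to_distance(rss_seq)
    unit = np.stack([np.cos(aoa_seq), np.sin(aoa_seq)], axis=-1)   # (n, k, 2)
    # Each sample yields two linear equations: x = a_i + d_ik * u(phi_ik).
    b = (anchors[:, None, :] + d[..., None] * unit).reshape(-1)    # (2nk,)
    A = np.tile(np.eye(2), (n * k, 1))                             # (2nk, 2)

    # Stage 1: unweighted solve, used only to estimate residual variances.
    x1 = weighted_ls(A, b, np.ones(b.size))
    res = (b - A @ x1).reshape(n, -1)                              # (n, 2k)
    var = np.maximum((res ** 2).mean(axis=1), 1e-12)               # per anchor

    # Stage 2: weight every equation by the inverse of its anchor's variance.
    w = np.repeat(1.0 / var, 2 * k)
    x2 = weighted_ls(A, b, w)
    return x1, x2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    anchors = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
    target = np.array([3.0, 4.0])
    diff = target - anchors
    rss_true = -40.0 - 30.0 * np.log10(np.linalg.norm(diff, axis=1))
    aoa_true = np.arctan2(diff[:, 1], diff[:, 0])
    sigma_rss = np.array([0.5, 1.0, 2.0, 4.0])            # dB, differs per anchor
    sigma_aoa = np.deg2rad([1.0, 2.0, 4.0, 8.0])          # rad, differs per anchor
    rss_seq = rss_true[:, None] + rng.normal(size=(4, 50)) * sigma_rss[:, None]
    aoa_seq = aoa_true[:, None] + rng.normal(size=(4, 50)) * sigma_aoa[:, None]
    x1, x2 = two_stage_lwls(anchors, rss_seq, aoa_seq)
    print("stage 1 error:", np.linalg.norm(x1 - target))
    print("stage 2 error:", np.linalg.norm(x2 - target))
```

Estimating one variance per anchor from its whole sequence is a simplification chosen here so that the weights are stable; the paper may define the residual variances per equation.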
Abstract: Wireless sensor networks require accurate target localization, often achieved through received signal strength (RSS) localization based on maximum likelihood (ML) estimation. However, ML-based algorithms can suffer from issues such as low diversity, slow convergence, and local optima, which can significantly degrade localization performance. In this paper, we propose a novel localization algorithm that combines opposition-based learning (OBL) and the simulated annealing algorithm (SAA) to address these challenges. The algorithm begins by randomly generating an initial solution, which serves as the starting point for the SAA. OBL is then used to generate the opposite of this solution, providing a second, alternative starting point. The SAA is executed independently from both the original and the opposite initial solutions, optimizing each towards a potential optimum. The final solution is the better of the two SAA outcomes, which reduces the likelihood of the algorithm becoming trapped in local optima. Simulation results indicate that the proposed algorithm consistently outperforms existing algorithms in terms of localization accuracy, demonstrating the effectiveness of our approach.
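
The procedure described in the abstract above (random start, opposite start, two independent annealing runs, keep the better result) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the log-distance path-loss cost with parameters `p0` and `gamma`, the box-shaped search region `[lo, hi]`, the geometric cooling schedule, the proposal step size, and all function names are assumptions.

```python
import numpy as np

def ml_cost(x, anchors, rss, p0=-40.0, gamma=3.0):
    # Squared-error ML cost under an assumed log-distance path-loss model
    # with Gaussian noise in dB.
    d = np.maximum(np.linalg.norm(anchors - x, axis=1), 1e-9)
    pred = p0 - 10.0 * gamma * np.log10(d)
    return float(np.sum((rss - pred) ** 2))

def opposite_point(x, lo, hi):
    # Opposition-based learning: reflect x through the centre of [lo, hi].
    return lo + hi - x

def simulated_annealing(cost, x0, lo, hi, rng,
                        t0=1.0, cooling=0.995, steps=2000, step_frac=0.1):
    x = np.array(x0, dtype=float)
    fx = cost(x)
    best, fbest = x.copy(), fx
    t = t0
    for _ in range(steps):
        # Proposal width shrinks with temperature (an assumed heuristic).
        scale = step_frac * (hi - lo) * max(t / t0, 1e-3)
        cand = np.clip(x + rng.normal(0.0, scale, size=x.shape), lo, hi)
        fc = cost(cand)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if fc < fx or rng.random() < np.exp(-(fc - fx) / t):
            x, fx = cand, fc
            if fx < fbest:
                best, fbest = x.copy(), fx
        t *= cooling
    return best, fbest

def obl_sa_localize(anchors, rss, lo, hi, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(lo, dtype=float), np.asarray(hi, dtype=float)
    cost = lambda x: ml_cost(x, anchors, rss)
    x0 = rng.uniform(lo, hi)                  # random initial solution
    x0_opp = opposite_point(x0, lo, hi)       # opposing initial solution
    runs = [simulated_annealing(cost, s, lo, hi, rng) for s in (x0, x0_opp)]
    return min(runs, key=lambda r: r[1])      # keep the lower-cost outcome

if __name__ == "__main__":
    anchors = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
    target = np.array([3.0, 4.0])
    d = np.linalg.norm(target - anchors, axis=1)
    rss = -40.0 - 30.0 * np.log10(d)          # noise-free RSS for the demo
    est, f = obl_sa_localize(anchors, rss, lo=[0.0, 0.0], hi=[10.0, 10.0])
    print("estimate:", est, "cost:", f)
```

Running the annealer from two starting points that are mirror images of each other covers opposite halves of the search region at roughly twice the cost of a single run, which is the trade-off the abstract relies on to lower the chance of ending in a local optimum.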