Abstract: Vision-Language-Action (VLA) models have shown strong potential for robotic manipulation, but real-time deployment on edge hardware remains challenging. In this work, we identify VLM visual and context tokens as a major source of deployment latency: for GEMM-dominated projection operators, computation grows linearly with the number of input tokens when model dimensions are fixed. Motivated by this observation, we propose RhinoVLA, a deployment-oriented VLA model co-designed with the Huixi R1 edge SoC. RhinoVLA adopts a token-efficient Qwen3-VL backbone and a continuous Action Expert, reducing the VLM-side token and computation burden while preserving pretrained multimodal capability. To support cross-robot learning, RhinoVLA further introduces a unified interface that combines a View Registry, a 72D physical state-action slot space, and robot-instance LoRA, allowing heterogeneous robot observations and action schemas to be aligned under a shared policy. On the deployment side, RhinoVLA is optimized through hardware-aware compilation, mixed-precision execution, and parallel visual encoding. Experiments show that RhinoVLA achieves downstream performance comparable to π0.5 at a similar parameter scale, while reaching 11.69 Hz end-to-end inference on Huixi R1, meeting the 10 Hz real-time closed-loop control target. The project will be open-sourced at https://github.com/HuixiAI/RhinoVLA.
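
The linear token scaling and the control-rate target mentioned in the abstract can be illustrated with a small back-of-the-envelope sketch. The function names and the layer dimensions below are hypothetical and chosen only for illustration; they are not taken from RhinoVLA. The FLOP count uses the standard estimate of 2 operations (multiply and add) per weight per token for a dense projection.

```python
def projection_flops(num_tokens: int, d_in: int, d_out: int) -> int:
    """Approximate FLOPs of a dense projection Y = X @ W.

    X has shape (num_tokens, d_in) and W has shape (d_in, d_out).
    Each output element needs d_in multiply-add pairs, hence the factor 2.
    """
    return 2 * num_tokens * d_in * d_out


def period_ms(rate_hz: float) -> float:
    """Time available per control step, in milliseconds, at a given rate."""
    return 1000.0 / rate_hz


# Hypothetical layer dimensions (assumption, not from the paper).
D_IN, D_OUT = 2048, 2048

if __name__ == "__main__":
    # With D_IN and D_OUT fixed, doubling the token count doubles the FLOPs.
    for n in (64, 128, 256, 512):
        print(f"{n:4d} tokens: {projection_flops(n, D_IN, D_OUT):,} FLOPs")

    # A 10 Hz target leaves 100 ms per step; the reported 11.69 Hz is inside it.
    print(f"budget at 10 Hz:    {period_ms(10.0):.2f} ms")
    print(f"period at 11.69 Hz: {period_ms(11.69):.2f} ms")
```

Under these assumptions, halving the number of visual and context tokens halves the cost of every such projection, which is why token count is a natural lever for meeting a fixed latency budget.
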
Abstract: Remotely sensing the spatial distribution of exposed coal (EC) is important for understanding the footprint of mining activities. However, widely applicable methods for identifying EC surfaces are still lacking, because existing methods struggle with the diversity of EC types and backgrounds. This study therefore proposes a new Automated Coal Mapping Index (ACMI), formulated empirically through an iterative search for the parameters that maximize the separability of EC and non-EC surfaces. The performance of ACMI was tested in six study areas worldwide with different landscape types and coal types. Visual inspection showed that ACMI was more effective than existing methods at highlighting EC surfaces and suppressing non-EC surfaces. Validated against sample points obtained through direct interpretation, ACMI produced better EC mapping results than previous methods, with an F1 score of at least 0.91 and an overall accuracy (OA) of at least 93.20% across all the selected Landsat images of the study areas. In addition, ACMI was shown to have a stable optimal threshold, so that 0 can serve as its default threshold. This default threshold makes EC mapping with ACMI an automated process. The new index has the potential to support a variety of mining-related studies, such as the identification of mining disturbances and the detection of illegal mining at multiple spatial and temporal scales.
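
The evaluation described above (threshold an index at a default of 0, then score the result against interpreted sample points with F1 and OA) can be sketched as follows. The ACMI formula itself is not given in the abstract, so the index values here are made-up placeholders, and the convention that values above the threshold mean EC is an assumption made only for this example.

```python
from typing import List, Sequence, Tuple


def classify(index_values: Sequence[float], threshold: float = 0.0) -> List[bool]:
    """Label a sample as EC when its index value exceeds the threshold.

    Assumption: higher index values indicate exposed coal.
    """
    return [v > threshold for v in index_values]


def f1_and_oa(pred: Sequence[bool], truth: Sequence[bool]) -> Tuple[float, float]:
    """Return (F1 score, overall accuracy) for binary predictions."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    tn = sum(1 for p, t in zip(pred, truth) if not p and not t)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    oa = (tp + tn) / len(truth)
    return f1, oa


# Placeholder index values and reference labels (not real data).
sample_index = [0.4, 0.1, -0.2, -0.5, 0.3, -0.1]
sample_truth = [True, True, False, False, False, False]

if __name__ == "__main__":
    f1, oa = f1_and_oa(classify(sample_index), sample_truth)
    print(f"F1 = {f1:.3f}, OA = {oa:.3f}")
```

A fixed default threshold is what removes the per-scene tuning step: the same `classify` call can be applied to every image without inspecting its histogram first.
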




Abstract: The spread of the Red Palm Weevil (RPW) has become an existential threat to palm trees around the world. In the Middle East, RPW is causing widespread damage to the date palm (Phoenix dactylifera L.), with both agricultural impacts on palm production and environmental impacts. Early detection of RPW is very challenging, especially at large scale. This research proposes a novel remote sensing approach to recognize and monitor red palm weevil in date palm trees, using a combination of vegetation indices, object detection and semantic segmentation techniques. The study area consists of date palm trees in three classes: healthy palms, smallish palms and severely infected palms. The proposed method achieved a promising F1 score of 0.947 on the test data set. This work paves the way for deploying artificial intelligence approaches to monitor RPW at large scale and provides guidance for practitioners.




Abstract: Acoustic pyrometry is a non-contact measurement technology for monitoring furnace combustion reactions, diagnosing energy loss due to incomplete combustion and ensuring safe production. The accuracy of the time of flight (TOF) estimation in acoustic pyrometry directly affects the reliability of the furnace temperature measurement. This paper presents a novel TOF (i.e. time delay) estimation algorithm based on digital lock-in filtering (DLF). Digital lock-in and low-pass filtering techniques are used to establish the time-frequency relationship between the first harmonic of the acoustic signal and the moment at which the characteristic frequency is applied. An accurate TOF estimate is then obtained by extracting the time at which the characteristic frequency occurs in the received and source acoustic signals and comparing the two. The computational error analysis indicates that the proposed algorithm is more accurate than the classical generalized cross-correlation (GCC) algorithm, and that its computational effort is reduced to half of that required by GCC. The method can therefore improve furnace temperature measurement in terms of both computational effort and accuracy, which are vital parameters in furnace combustion control. It offers a new approach to time delay estimation for acoustic pyrometry in furnaces.
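
The general idea of lock-in based delay estimation can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' DLF algorithm: the signals are noise-free tone bursts, the low-pass filter is a plain moving average over a whole number of carrier periods, and the onset of the characteristic frequency is taken as the first sample where the lock-in envelope reaches a fixed fraction of its maximum. All names and parameter values (`FS`, `F0`, the onset positions, the 0.4 fraction) are hypothetical.

```python
import math
from typing import List


def tone_burst(n_samples: int, onset: int, fs: float, f0: float,
               amplitude: float = 1.0) -> List[float]:
    """Silence until `onset`, then a sine at f0 that starts with zero phase."""
    return [amplitude * math.sin(2 * math.pi * f0 * (k - onset) / fs)
            if k >= onset else 0.0
            for k in range(n_samples)]


def lockin_envelope(signal: List[float], fs: float, f0: float,
                    periods: int = 2) -> List[float]:
    """Amplitude of the f0 component over time via digital lock-in.

    The signal is mixed with cosine and sine references at f0, and each
    product is low-pass filtered with a causal moving average.
    """
    n_win = round(periods * fs / f0)

    def moving_average(x: List[float]) -> List[float]:
        out, acc = [], 0.0
        for k, v in enumerate(x):
            acc += v
            if k >= n_win:
                acc -= x[k - n_win]
            out.append(acc / n_win)
        return out

    i_lp = moving_average([s * math.cos(2 * math.pi * f0 * k / fs)
                           for k, s in enumerate(signal)])
    q_lp = moving_average([s * math.sin(2 * math.pi * f0 * k / fs)
                           for k, s in enumerate(signal)])
    return [2.0 * math.hypot(i, q) for i, q in zip(i_lp, q_lp)]


def onset_index(envelope: List[float], fraction: float = 0.4) -> int:
    """First sample where the envelope reaches `fraction` of its maximum."""
    level = fraction * max(envelope)
    return next(k for k, v in enumerate(envelope) if v >= level)


def estimate_tof(source: List[float], received: List[float],
                 fs: float, f0: float) -> float:
    """Delay in seconds between the f0 onsets of the two signals."""
    k_src = onset_index(lockin_envelope(source, fs, f0))
    k_rx = onset_index(lockin_envelope(received, fs, f0))
    return (k_rx - k_src) / fs


# Hypothetical test setup: 50 kHz sampling, 2 kHz characteristic frequency.
FS, F0 = 50_000.0, 2_000.0
TRUE_DELAY_SAMPLES = 120  # 2.4 ms
source = tone_burst(1000, onset=100, fs=FS, f0=F0)
received = tone_burst(1000, onset=100 + TRUE_DELAY_SAMPLES, fs=FS, f0=F0,
                      amplitude=0.3)  # attenuated copy

if __name__ == "__main__":
    print(f"estimated TOF: {estimate_tof(source, received, FS, F0) * 1e3:.3f} ms")
```

Because the threshold is relative to each envelope's own maximum, the estimate in this sketch is insensitive to attenuation along the acoustic path. Each signal is processed with one multiply-and-average pass, whereas a cross-correlation has to evaluate many candidate lags; a real furnace signal would additionally need noise handling that this example omits.
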