Retrieval-augmented language models (RaLM) have demonstrated the potential to solve knowledge-intensive natural language processing (NLP) tasks by combining a non-parametric knowledge base with a parametric language model. Instead of fine-tuning a fully parametric model, RaLM excels at its low-cost adaptation to the latest data and better source attribution mechanisms. Among various RaLM approaches, iterative RaLM delivers a better generation quality due to a more frequent interaction between the retriever and the language model. Despite the benefits, iterative RaLM usually encounters high overheads due to the frequent retrieval step. To this end, we propose RaLMSpec, a speculation-inspired framework that provides generic speed-up over iterative RaLM while preserving the same model outputs through speculative retrieval and batched verification. By further incorporating prefetching, optimal speculation stride scheduler, and asynchronous verification, RaLMSpec can automatically exploit the acceleration potential to the fullest. For naive iterative RaLM serving, extensive evaluations over three language models on four downstream QA datasets demonstrate that RaLMSpec can achieve a speed-up ratio of 1.75-2.39x, 1.04-1.39x, and 1.31-1.77x when the retriever is an exact dense retriever, approximate dense retriever, and sparse retriever respectively compared with the baseline. For KNN-LM serving, RaLMSpec can achieve a speed-up ratio up to 7.59x and 2.45x when the retriever is an exact dense retriever and approximate dense retriever, respectively, compared with the baseline.
Target detection is pivotal for modern urban computing applications. While image-based techniques are widely adopted, they falter under challenging environmental conditions such as adverse weather, poor lighting, and occlusion. To improve the target detection performance under complex real-world scenarios, this paper proposes an intelligent integrated optical camera and millimeter-wave (mmWave) radar system. Utilizing both physical knowledge and data-driven methods, a long-term robust radar-camera fusion algorithm is proposed to solve the heterogeneous data fusion problem for detection improvement. For the occlusion scenarios, the proposed algorithm can effectively detect occluded targets with the help of memory through performing long-term detection. For dark scenarios with low-light conditions, the proposed algorithm can effectively mark the target in the dark picture as well as provide rough stickman imaging. The above two innovative functions of the hybrid optical camera and mmWave radar system are tested in real-world scenarios. The results demonstrate the robustness and significant enhancement in the target detection performance of our integrated system.