Abstract: Recent advances in recommendation scaling laws have led to foundation models of unprecedented complexity. While these models offer superior performance, their computational demands make real-time serving impractical, often forcing practitioners to rely on knowledge distillation, which compromises serving quality for efficiency. To address this challenge, we present SOLARIS (Speculative Offloading of Latent-bAsed Representation for Inference Scaling), a novel framework inspired by speculative decoding. SOLARIS proactively precomputes user-item interaction embeddings by predicting which user-item pairs are likely to appear in future requests and asynchronously generating their foundation model representations ahead of time. This approach decouples the costly foundation model inference from the latency-critical serving path, enabling real-time knowledge transfer from models previously considered too expensive for online use. Deployed across Meta's advertising system, which serves billions of daily requests, SOLARIS achieves a 0.67% gain in revenue-driving top-line metrics, demonstrating its effectiveness at scale.
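The decoupling described in this abstract can be pictured with a small sketch. It is only an illustration under stated assumptions, not the SOLARIS implementation: the class name `SpeculativeEmbeddingCache`, the `(user_id, item_id)` key format, and the `toy_model` stand-in are all hypothetical, and the real system's pair predictor and storage layer are not described in the abstract.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Pair = Tuple[str, str]  # (user_id, item_id); hypothetical key format


class SpeculativeEmbeddingCache:
    """Toy cache that keeps expensive inference off the serving path."""

    def __init__(self, foundation_model: Callable[[Pair], List[float]]) -> None:
        self._model = foundation_model
        self._cache: Dict[Pair, List[float]] = {}

    def precompute(self, predicted_pairs: Iterable[Pair]) -> None:
        # Off the serving path: run the expensive model on predicted pairs.
        for pair in predicted_pairs:
            self._cache[pair] = self._model(pair)

    def lookup(self, pair: Pair) -> Optional[List[float]]:
        # On the serving path: a dictionary read only, never a model call.
        return self._cache.get(pair)


def toy_model(pair: Pair) -> List[float]:
    # Stand-in for a foundation model (assumption); returns a fake 2-d embedding.
    user_id, item_id = pair
    return [float(len(user_id)), float(len(item_id))]


cache = SpeculativeEmbeddingCache(toy_model)
cache.precompute([("u1", "ad42")])      # pairs predicted to appear soon
hit = cache.lookup(("u1", "ad42"))      # precomputed embedding is available
miss = cache.lookup(("u1", "ad99"))     # None: the prediction missed this pair
```

In this sketch a missed prediction simply returns `None`; how misses are handled in production (for example, with a default embedding) is not stated in the abstract.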
Abstract: Although ride-hailing services have grown significantly, most research has concentrated on the dispatching mode, in which drivers must adhere to the platform's assigned routes. The broadcasting mode, in which drivers freely choose their preferred orders from those broadcast by the platform, has received less attention. One important but challenging task in such a system is determining the optimal matching radius, which usually varies across space, time, and real-time supply and demand characteristics. This study develops a Transformer-Encoder-Based (TEB) model that predicts key system performance metrics for a range of matching radii, enabling the ride-hailing platform to select the matching radius that maximizes overall system performance given real-time supply and demand information. To maximize multiple system performance metrics simultaneously when determining the matching radius, we devise a novel multi-task learning algorithm that speeds up the convergence of each task (each corresponding to the optimization of one metric) and delivers more accurate overall predictions. We evaluate our methods in a simulation environment designed specifically for broadcasting-mode ride-hailing services. Our findings reveal that dynamically adjusting matching radii with the proposed predict-then-optimize approach significantly improves system performance, e.g., increasing platform revenue by 7.55% and the order fulfillment rate by 13% compared to benchmark algorithms.
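The predict-then-optimize step in this abstract can be sketched as a search over candidate radii. This is a minimal illustration, not the authors' method: `toy_predict` is a made-up stand-in for the TEB model, the metric names are hypothetical, and combining metrics through a weighted sum is an assumption, since the abstract does not say how "overall system performance" is defined.

```python
from typing import Callable, Dict, Optional, Sequence


def select_radius(
    predict: Callable[[float], Dict[str, float]],
    candidate_radii: Sequence[float],
    weights: Dict[str, float],
) -> Optional[float]:
    """Return the candidate radius with the highest weighted predicted score."""
    best_radius: Optional[float] = None
    best_score = float("-inf")
    for radius in candidate_radii:
        metrics = predict(radius)  # predicted metrics for this radius
        # Weighted sum of metrics (assumed objective, not from the abstract).
        score = sum(weights[name] * metrics[name] for name in weights)
        if score > best_score:
            best_radius, best_score = radius, score
    return best_radius


def toy_predict(radius_km: float) -> Dict[str, float]:
    # Made-up response curves standing in for a trained predictor.
    return {
        "revenue": 10.0 * radius_km - 2.0 * radius_km ** 2,
        "fulfillment": min(1.0, radius_km / 3.0),
    }


chosen = select_radius(
    toy_predict,
    candidate_radii=[1.0, 2.0, 3.0, 4.0],
    weights={"revenue": 1.0, "fulfillment": 5.0},
)
```

With these toy curves the 3 km candidate scores highest; in a real deployment the predictor would also take the current supply and demand features as input.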
Abstract: The rising accident rate that has accompanied the explosive growth in the number of automobiles has made research on automotive active safety systems increasingly important, and improving the accuracy of vehicle detection is central to such systems. To detect vehicles, estimate their distance, and provide safety warnings, a Distance Estimation Safety Warning System (DESWS) is proposed. It combines a new neural network model (YOLOv5s-SE), built by replacing IoU with DIoU and embedding an SE attention module, with a distance estimation method based on the principle of similar triangles. In addition, a method that uses nonparametric testing to give safety suggestions based on the estimated distance is presented. Simulation experiments verify that the mAP is improved by 5.5% and that safety suggestions can be given based on the estimated distance information.
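Two ingredients named in this abstract have compact standard forms, sketched here under assumptions. DIoU is the usual definition, IoU minus the squared centre distance divided by the squared diagonal of the smallest enclosing box. The similar-triangles estimate uses the pinhole relation distance = focal length × real size / image size. Using vehicle width as the reference size, and all parameter names and values, are assumptions; the abstract does not give the paper's exact formulation.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def diou(a: Box, b: Box) -> float:
    """Distance-IoU of two axis-aligned boxes with positive area."""
    # Intersection and union areas give the plain IoU.
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union
    # Squared distance between the two box centres.
    rho2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
         + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
       + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou - rho2 / c2


def estimate_distance(focal_length_px: float, real_width_m: float,
                      pixel_width: float) -> float:
    """Similar triangles: real_width / distance = pixel_width / focal_length."""
    return focal_length_px * real_width_m / pixel_width


# Hypothetical numbers: a 1.8 m wide car spanning 90 px, focal length 1000 px.
distance_m = estimate_distance(1000.0, 1.8, 90.0)  # about 20 m
overlap = diou((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0))
```

The penalty term in DIoU stays informative even when boxes do not overlap, which is the usual motivation for preferring it over plain IoU.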