Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jingxian Huang

SOLARIS: Speculative Offloading of Latent-bAsed Representation for Inference Scaling

Apr 13, 2026

Zikun Liu, Liang Luo, Qianru Li, Zhengyu Zhang, Wei Ling, Jingyi Shen, Zeliang Chen, Yaning Huang, Jingxian Huang, Abdallah Aboelela(+23 more)

Abstract:Recent advances in recommendation scaling laws have led to foundation models of unprecedented complexity. While these models offer superior performance, their computational demands make real-time serving impractical, often forcing practitioners to rely on knowledge distillation-compromising serving quality for efficiency. To address this challenge, we present SOLARIS (Speculative Offloading of Latent-bAsed Representation for Inference Scaling), a novel framework inspired by speculative decoding. SOLARIS proactively precomputes user-item interaction embeddings by predicting which user-item pairs are likely to appear in future requests, and asynchronously generating their foundation model representations ahead of time. This approach decouples the costly foundation model inference from the latency-critical serving path, enabling real-time knowledge transfer from models previously considered too expensive for online use. Deployed across Meta's advertising system serving billions of daily requests, SOLARIS achieves 0.67% revenue-driving top-line metrics gain, demonstrating its effectiveness at scale.

* Accepted to SIGIR 2026 Industry Track

Via

Access Paper or Ask Questions

Second-Order Semantic Dependency Parsing with End-to-End Neural Networks

Aug 12, 2019

Xinyu Wang, Jingxian Huang, Kewei Tu

Figure 1 for Second-Order Semantic Dependency Parsing with End-to-End Neural Networks

Figure 2 for Second-Order Semantic Dependency Parsing with End-to-End Neural Networks

Figure 3 for Second-Order Semantic Dependency Parsing with End-to-End Neural Networks

Figure 4 for Second-Order Semantic Dependency Parsing with End-to-End Neural Networks

Abstract:Semantic dependency parsing aims to identify semantic relationships between words in a sentence that form a graph. In this paper, we propose a second-order semantic dependency parser, which takes into consideration not only individual dependency edges but also interactions between pairs of edges. We show that second-order parsing can be approximated using mean field (MF) variational inference or loopy belief propagation (LBP). We can unfold both algorithms as recurrent layers of a neural network and therefore can train the parser in an end-to-end manner. Our experiments show that our approach achieves state-of-the-art performance.

Via

Access Paper or Ask Questions