Abstract: Semantic search with large language models (LLMs) enables retrieval by meaning rather than keyword overlap, but scaling it requires major inference efficiency advances. We present LinkedIn's LLM-based semantic search framework for AI Job Search and AI People Search, combining an LLM relevance judge, embedding-based retrieval, and a compact Small Language Model trained via multi-teacher distillation to jointly optimize relevance and engagement. A prefill-oriented inference architecture co-designed with model pruning, context compression, and text-embedding hybrid interactions boosts ranking throughput by over 75x under a fixed latency constraint while preserving near-teacher-level NDCG, enabling one of the first production LLM-based ranking systems with efficiency comparable to traditional approaches and delivering significant gains in quality and user engagement.
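
The abstract reports ranking quality as NDCG without giving the formula. As a minimal sketch, assuming the common exponential-gain form with a log2 position discount (the paper may use a different variant), NDCG for one ranked list could be computed as follows. The function names and the graded labels are illustrative and do not come from the paper.

```python
import math

def dcg(relevances, k=None):
    """Discounted cumulative gain of graded relevance labels in ranked order."""
    if k is not None:
        relevances = relevances[:k]
    # Assumption: exponential gain (2^rel - 1) with a log2(rank + 1) discount,
    # where rank is 1-based (enumerate is 0-based, hence the "+ 2").
    return sum((2 ** rel - 1) / math.log2(pos + 2)
               for pos, rel in enumerate(relevances))

def ndcg(relevances, k=None):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical graded labels for the same five documents, listed in the
# order produced by a teacher ranker and by a distilled student ranker.
teacher_order = [3, 2, 2, 1, 0]
student_order = [3, 2, 1, 2, 0]
teacher_ndcg = ndcg(teacher_order, k=5)
student_ndcg = ndcg(student_order, k=5)
```

In this toy comparison the student swaps two adjacent documents, so its NDCG falls slightly below the teacher's. This is the kind of gap that "near-teacher-level NDCG" refers to.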




Abstract: In this paper, we consider the problem of developing predictive models with limited data for energy assets such as electricity loads and PV power generation. We specifically investigate cases where the amount of historical data is not sufficient to train the prediction model effectively. We first develop an energy predictive model based on a convolutional neural network (CNN), which is well suited to capturing the intraday, daily, and weekly cyclostationary patterns, trends, and seasonalities in energy asset time series. A transfer learning strategy is then proposed to address the challenge of limited training data. We demonstrate our approach on a use case of daily electricity demand forecasting. We show that applying the transfer learning strategy to the CNN model yields a significant improvement over existing forecasting methods.
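
The abstract does not say which transfer learning strategy is used. One common variant, shown here purely as an assumption, keeps the convolutional feature extractor learned on a data-rich source asset frozen and fine-tunes only the output layer on the data-poor target asset. The sketch below is a toy, dependency-free illustration of that idea. The fixed kernels stand in for learned filters, the series are synthetic, and all names are hypothetical.

```python
import math

# Hypothetical stand-ins for convolution filters learned on the source asset;
# in a real CNN these would be trained, then frozen during transfer.
KERNELS = [[1 / 3, 1 / 3, 1 / 3], [-0.5, 0.0, 0.5]]

def conv_features(window, kernels=KERNELS):
    """1D convolution, ReLU, then global average pooling: one feature per kernel."""
    feats = []
    for kernel in kernels:
        width = len(kernel)
        acts = [max(0.0, sum(w * x for w, x in zip(kernel, window[i:i + width])))
                for i in range(len(window) - width + 1)]
        feats.append(sum(acts) / len(acts))
    return feats

def predict(head, feats):
    """Linear output layer: one weight per feature, with the bias stored last."""
    return sum(w * f for w, f in zip(head, feats)) + head[-1]

def mse(head, X, y):
    return sum((predict(head, f) - t) ** 2 for f, t in zip(X, y)) / len(y)

def train_head(X, y, head, lr=0.1, epochs=200):
    """Batch gradient descent on squared error, starting from the given head."""
    head = list(head)
    n = len(y)
    for _ in range(epochs):
        grad = [0.0] * len(head)
        for feats, target in zip(X, y):
            err = predict(head, feats) - target
            for j, f in enumerate(feats):
                grad[j] += err * f / n
            grad[-1] += err / n
        head = [h - lr * g for h, g in zip(head, grad)]
    return head

def make_dataset(series, window=7):
    """Slide a one-week window over a daily series; the target is the next day."""
    X = [conv_features(series[i:i + window]) for i in range(len(series) - window)]
    y = [series[i + window] for i in range(len(series) - window)]
    return X, y

# Synthetic daily demand with a weekly cycle (illustrative, not from the paper).
source = [0.5 + 0.4 * math.sin(2 * math.pi * t / 7) for t in range(365)]
target = [0.3 + 0.2 * math.sin(2 * math.pi * t / 7) for t in range(21)]

Xs, ys = make_dataset(source)
Xt, yt = make_dataset(target)

source_head = train_head(Xs, ys, head=[0.0, 0.0, 0.0])  # pretrain on plentiful data
tuned_head = train_head(Xt, yt, head=source_head)        # fine-tune on scarce data
```

Comparing `mse(source_head, Xt, yt)` with `mse(tuned_head, Xt, yt)` shows how far the short fine-tuning pass moves the source-trained head toward the target asset. A full implementation would train the convolutional filters as well and could unfreeze them partially.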