Krishna Kamath

LLM-based Relevance Assessment for Web-Scale Search Evaluation at Pinterest

Sep 03, 2025

Han Wang, Alex Whitworth, Pak Ming Cheung, Zhenjie Zhang, Krishna Kamath

Abstract: Relevance evaluation plays a crucial role in personalized search systems to ensure that search results align with a user's queries and intent. While human annotation is the traditional method for relevance evaluation, its high cost and long turnaround time limit its scalability. In this work, we present our approach at Pinterest Search to automate relevance evaluation for online experiments using fine-tuned LLMs. We rigorously validate the alignment between LLM-generated judgments and human annotations, demonstrating that LLMs can provide reliable relevance measurement for experiments while greatly improving the evaluation efficiency. Leveraging LLM-based labeling further unlocks the opportunities to expand the query set, optimize sampling design, and efficiently assess a wider range of search experiences at scale. This approach leads to higher-quality relevance metrics and significantly reduces the Minimum Detectable Effect (MDE) in online experiment measurements.

* RecSys 2025 EARL Workshop
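
The abstract above ties a larger labeled query set to a lower Minimum Detectable Effect. As a rough illustration only, and not the paper's own method, the sketch below uses the textbook two-sample z-test approximation, MDE = (z_{1-α/2} + z_{power}) · σ · sqrt(2/n). The function name, parameter names, and sample sizes are hypothetical and chosen for this example.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(std_dev, n_per_group, alpha=0.05, power=0.8):
    """Approximate MDE for a two-sided, two-sample comparison of means.

    Assumes equal group sizes, equal variances, and a normal approximation.
    This is a generic textbook formula, not the procedure from the paper.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    standard_error = std_dev * sqrt(2 / n_per_group)
    return (z_alpha + z_power) * standard_error


# Hypothetical sample sizes: a small human-labeled set vs. a larger LLM-labeled set.
mde_small = minimum_detectable_effect(std_dev=1.0, n_per_group=1_000)
mde_large = minimum_detectable_effect(std_dev=1.0, n_per_group=4_000)
print(f"MDE with 1,000 labels per group: {mde_small:.4f}")
print(f"MDE with 4,000 labels per group: {mde_large:.4f}")
```

Under these assumptions the MDE scales with 1/sqrt(n), so quadrupling the number of labeled items per group halves the smallest effect an experiment can reliably detect.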

Improving Pinterest Search Relevance Using Large Language Models

Oct 22, 2024

Han Wang, Mukuntha Narayanan Sundararaman, Onur Gungor, Yu Xu, Krishna Kamath, Rakesh Chalasani, Kurchi Subhra Hazra, Jinfeng Rao

Abstract: To improve relevance scoring on Pinterest Search, we integrate Large Language Models (LLMs) into our search relevance model, leveraging carefully designed text representations to predict the relevance of Pins effectively. Our approach uses search queries alongside content representations that include captions extracted from a generative visual language model. These are further enriched with link-based text data, historically high-quality engaged queries, user-curated boards, Pin titles and Pin descriptions, creating robust models for predicting search relevance. We use a semi-supervised learning approach to efficiently scale up the amount of training data, expanding beyond the expensive human labeled data available. By utilizing multilingual LLMs, our system extends training data to include unseen languages and domains, despite initial data and annotator expertise being confined to English. Furthermore, we distill from the LLM-based model into real-time servable model architectures and features. We provide comprehensive offline experimental validation for our proposed techniques and demonstrate the gains achieved through the final deployed system at scale.

* CIKM 2024 Workshop on Industrial Recommendation Systems
