Nitin Yadav

Generating Query-Relevant Document Summaries via Reinforcement Learning

Aug 11, 2025

Nitin Yadav, Changsung Kang, Hongwei Shang, Ming Sun

Abstract:E-commerce search engines often rely solely on product titles as input for ranking models with latency constraints. However, this approach can result in suboptimal relevance predictions, as product titles often lack sufficient detail to capture query intent. While product descriptions provide richer information, their verbosity and length make them unsuitable for real-time ranking, particularly for computationally expensive architectures like cross-encoder ranking models. To address this challenge, we propose ReLSum, a novel reinforcement learning framework designed to generate concise, query-relevant summaries of product descriptions optimized for search relevance. ReLSum leverages relevance scores as rewards to align the objectives of summarization and ranking, effectively overcoming limitations of prior methods, such as misaligned learning targets. The framework employs a trainable large language model (LLM) to produce summaries, which are then used as input for a cross-encoder ranking model. Experimental results demonstrate significant improvements in offline metrics, including recall and NDCG, as well as online user engagement metrics. ReLSum provides a scalable and efficient solution for enhancing search relevance in large-scale e-commerce systems.

Knowledge Distillation for Enhancing Walmart E-commerce Search Relevance Using Large Language Models

May 11, 2025

Hongwei Shang, Nguyen Vo, Nitin Yadav, Tian Zhang, Ajit Puthenputhussery, Xunfan Cai, Shuyi Chen, Prijith Chandran, Changsung Kang

Abstract:Ensuring that the products displayed in e-commerce search results are relevant to users' queries is crucial for improving the user experience. With their advanced semantic understanding, deep learning models have been widely used for relevance matching in search tasks. While large language models (LLMs) offer superior ranking capabilities, it is challenging to deploy LLMs in real-time systems due to their high latency. To leverage the ranking power of LLMs while meeting the low-latency demands of production systems, we propose a novel framework that distills a high-performing LLM into a more efficient, low-latency student model. To help the student model learn more effectively from the teacher model, we first train the teacher LLM as a classification model with soft targets. Then, we train the student model to capture the relevance margin between pairs of products for a given query using mean squared error loss. Instead of using the same training data as the teacher model, we significantly expand the student model dataset by generating unlabeled data and labeling it with the teacher model predictions. Experimental results show that the student model performance continues to improve as the size of the augmented training data increases. In fact, with enough augmented data, the student model can outperform the teacher model. The student model has been successfully deployed in production at Walmart.com with significantly positive metrics.

* The Web Conference 2025
* 9 pages, published at WWW'25
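
The margin-based objective described in this abstract can be illustrated with a minimal sketch. It assumes that, for each query, the teacher and the student each score two products `a` and `b`, and that the loss is the mean squared error between the teacher's score difference and the student's score difference. The function name, argument names, and the plain-Python formulation are hypothetical; the abstract does not specify how pairs are built or how scores are scaled.

```python
from typing import Sequence


def margin_mse_loss(
    teacher_a: Sequence[float],
    teacher_b: Sequence[float],
    student_a: Sequence[float],
    student_b: Sequence[float],
) -> float:
    """Mean squared error between teacher and student relevance margins.

    Position i in each sequence refers to the same (query, product a, product b)
    triple. This is an illustrative sketch, not the paper's implementation.
    """
    n = len(teacher_a)
    if n == 0 or not (len(teacher_b) == len(student_a) == len(student_b) == n):
        raise ValueError("all score sequences must be non-empty and equally long")

    total = 0.0
    for t_a, t_b, s_a, s_b in zip(teacher_a, teacher_b, student_a, student_b):
        teacher_margin = t_a - t_b  # how much more relevant the teacher finds a than b
        student_margin = s_a - s_b  # the student's estimate of the same gap
        total += (student_margin - teacher_margin) ** 2
    return total / n


if __name__ == "__main__":
    # One pair: the teacher margin is 0.7 and the student margin is 0.5.
    print(margin_mse_loss([0.9], [0.2], [0.6], [0.1]))
```

Note that a loss of this form only constrains score differences, so a student whose scores are shifted by a constant relative to the teacher's still incurs zero loss.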

Phase transition in the knapsack problem

Jun 26, 2018

Nitin Yadav, Carsten Murawski, Sebastian Sardina, Peter Bossaerts

Abstract:We examine the phase transition phenomenon for the Knapsack problem from both a computational and a human perspective. We first provide, via an empirical and a theoretical analysis, a characterization of the phenomenon in terms of two instance properties: normalised capacity and normalised profit. Then, we show evidence that the average time spent by human decision makers in solving an instance peaks near the phase transition. Given the ubiquity of the Knapsack problem in everyday life, a better understanding of its structure can improve our understanding not only of computational techniques but also of human behavior, including the use and development of heuristics and the occurrence of biases.
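
As a rough illustration of the two instance properties named in this abstract, the sketch below treats an instance of the Knapsack decision problem as a set of items plus a capacity and a target profit. It assumes that normalised capacity is the capacity divided by the total item weight and that normalised profit is the target profit divided by the total item profit; these definitions, and all function names, are assumptions made here, since the abstract does not spell them out. The decision itself is answered with a standard dynamic program over integer weights.

```python
from typing import List, Tuple


def normalised_parameters(
    weights: List[int], profits: List[int], capacity: int, target_profit: int
) -> Tuple[float, float]:
    """Return (normalised capacity, normalised profit) under the assumed definitions."""
    return capacity / sum(weights), target_profit / sum(profits)


def knapsack_decision(
    weights: List[int], profits: List[int], capacity: int, target_profit: int
) -> bool:
    """Is there a subset with total weight <= capacity and total profit >= target_profit?"""
    # best[c] holds the maximum profit achievable with total weight at most c.
    best = [0] * (capacity + 1)
    for w, p in zip(weights, profits):
        # Iterate capacities downwards so each item is used at most once.
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + p)
    return best[capacity] >= target_profit


if __name__ == "__main__":
    weights, profits = [2, 3, 4], [3, 4, 5]
    print(normalised_parameters(weights, profits, capacity=5, target_profit=7))
    print(knapsack_decision(weights, profits, capacity=5, target_profit=7))
    print(knapsack_decision(weights, profits, capacity=5, target_profit=8))
```

Sweeping the target profit at a fixed capacity, as in the last two calls, moves an instance from satisfiable to unsatisfiable, which is the kind of boundary a phase-transition analysis studies.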

Supervisory Control for Behavior Composition

Apr 29, 2016

Paolo Felli, Nitin Yadav, Sebastian Sardina

Abstract:We relate behavior composition, a synthesis task studied in AI, to supervisory control theory from the discrete event systems field. In particular, we show that realizing (i.e., implementing) a target behavior module (e.g., a house surveillance system) by suitably coordinating a collection of available behaviors (e.g., automatic blinds, doors, lights, cameras, etc.) amounts to imposing a supervisor onto a special discrete event system. Such a link allows us to leverage the solid foundations and extensive work on discrete event systems, including borrowing tools and ideas from that field. As evidence, we show how simple it is to introduce preferences in the mapped framework.

Reasoning about Agent Programs using ATL-like Logics

Jul 17, 2012

Nitin Yadav, Sebastian Sardina

Abstract:We propose a variant of Alternating-time Temporal Logic (ATL) grounded in the agents' operational know-how, as defined by their libraries of abstract plans. Inspired by ATLES, a variant itself of ATL, it is possible in our logic to explicitly refer to "rational" strategies for agents developed under the Belief-Desire-Intention agent programming paradigm. This allows us to express and verify properties of BDI systems using ATL-type logical frameworks.

* In Proceedings of the European Conference on Logics in Artificial Intelligence (JELIA), volume 7519 of LNCS, pages 437-449, 2012

Qualitative Approximate Behavior Composition

Jul 17, 2012

Nitin Yadav, Sebastian Sardina

Abstract:The behavior composition problem involves automatically building a controller that is able to realize a desired, but unavailable, target system (e.g., a house surveillance system) by suitably coordinating a set of available components (e.g., video cameras, blinds, lamps, a vacuum cleaner, phones, etc.). Previous work has almost exclusively aimed at bringing about the desired component in its totality, which is highly unsatisfactory for unsolvable problems. In this work, we develop an approach for approximate behavior composition without departing from the classical setting, thus making the problem applicable to a much wider range of cases. Based on the notion of simulation, we characterize what a maximal controller and the "closest" implementable target module (optimal approximation) are, and show how these can be computed using ATL model checking technology for a special case. We show the uniqueness of optimal approximations, and prove their soundness and completeness with respect to their imported controllers.

* In Proceedings of the European Conference on Logics in Artificial Intelligence (JELIA), volume 7519 of LNCS, pages 450-462, 2012
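
The notion of simulation mentioned in this abstract can be sketched for plain labelled transition systems. The example below computes the largest simulation relation by repeatedly discarding pairs that violate the simulation condition. It is the textbook fixed-point construction, not the paper's composition-specific variant or its ATL-based procedure, and the dictionary encoding and function name are hypothetical choices made for this illustration.

```python
from typing import Dict, Hashable, Set, Tuple

# A transition system maps each state to its set of (action, next_state) moves.
# Every state, including states with no moves, must appear as a key.
Transitions = Dict[Hashable, Set[Tuple[str, Hashable]]]


def largest_simulation(target: Transitions, system: Transitions) -> Set[Tuple[Hashable, Hashable]]:
    """Return all pairs (t, s) such that system state s simulates target state t."""
    # Start from every pair and remove those that break the simulation condition.
    relation = {(t, s) for t in target for s in system}
    changed = True
    while changed:
        changed = False
        for t, s in list(relation):
            for action, t_next in target[t]:
                # s must match the move with the same action, landing in a related pair.
                matched = any(
                    s_action == action and (t_next, s_next) in relation
                    for s_action, s_next in system[s]
                )
                if not matched:
                    relation.discard((t, s))
                    changed = True
                    break
    return relation


if __name__ == "__main__":
    target = {"t0": {("a", "t1")}, "t1": set()}
    system = {"s0": {("a", "s1"), ("b", "s1")}, "s1": set()}
    print(("t0", "s0") in largest_simulation(target, system))
```

In this toy instance the system offers every move the target requests, so the initial states are related; removing the `a` move from `s0` would break the relation.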
