Abstract: In search advertising, keyword matching connects user queries with relevant ads. While token-based matching increases ad coverage, it can reduce relevance due to overly permissive semantic expansion. This work extends keyword reach through document-side semantic keyword expansion, using a language model to broaden token-level matching without altering queries. We propose a solution that uses a pre-trained siamese model to generate dense vector representations of ad keywords and identifies semantically related variants through nearest-neighbor search. To maintain precision, we introduce a cluster-based thresholding mechanism that adjusts similarity cutoffs based on local semantic density. Each expanded keyword maps to a group of seller-listed items, which may only partially align with the original intent. To ensure relevance, we enhance the downstream relevance model by adapting it to the expanded keyword space using an incremental learning strategy with a lightweight decision tree ensemble. This system improves both relevance and click-through rate (CTR), offering a scalable, low-latency solution adaptable to evolving query behavior and advertising inventory.
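The expansion step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `expand_keywords`, the parameters `k`, `base`, and `alpha`, and the specific rule for turning local density into a cutoff are all assumptions made for illustration, and plain NumPy arrays stand in for the siamese model's embeddings.

```python
import numpy as np


def _unit_rows(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def expand_keywords(seed_vecs, cand_vecs, k=3, base=0.7, alpha=0.5):
    """Return, per seed keyword, the indices of candidate keywords that pass
    a density-adjusted similarity cutoff, together with the cutoffs used.

    Assumption (not from the source): local density is the mean similarity
    of a seed's k nearest candidates, and denser neighbourhoods get a
    stricter cutoff: base + alpha * (density - mean density).
    """
    seeds = _unit_rows(seed_vecs)
    cands = _unit_rows(cand_vecs)
    sims = seeds @ cands.T  # shape: (n_seeds, n_candidates)

    k = min(k, sims.shape[1])
    top_k = np.sort(sims, axis=1)[:, -k:]  # k largest similarities per seed
    density = top_k.mean(axis=1)

    thresholds = np.clip(base + alpha * (density - density.mean()), 0.0, 1.0)
    matches = [
        np.flatnonzero(sims[i] >= thresholds[i]).tolist()
        for i in range(sims.shape[0])
    ]
    return matches, thresholds
```

In this sketch, a seed keyword surrounded by many close candidates ends up with a higher cutoff than a seed in a sparse region, so a borderline candidate can be rejected for the first seed and accepted for the second. A production system would replace the brute-force similarity matrix with an approximate nearest-neighbor index.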
Abstract: Several approaches have been proposed to forecast the day-ahead locational marginal price (daLMP) in deregulated energy markets. The rise of deep learning has motivated its use in energy price forecasting, but most deep learning approaches fail to accommodate exogenous variables, which have a significant influence on the peaks and valleys of the daLMP. Accurate forecasts of the daLMP valleys are crucial for power generators, since one of the most important decisions they face is whether to sell power at a loss to avoid incurring shutdown and start-up costs, or to bid at production cost and face the risk of shutting down. In this article, we propose a deep learning model that incorporates both the history of the daLMP and the effect of exogenous variables (e.g., forecasted load, weather data). A numerical study at the PJM independent system operator (ISO) illustrates how the proposed model outperforms traditional time series techniques while supporting risk-based analysis of shutdown decisions.
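One way to picture how price history and exogenous variables can feed a single model is the input-construction sketch below. It is only an illustration under stated assumptions: the helper `build_windows`, the lag length `n_lags`, and the choice to pair each target hour with its own exogenous forecast are hypothetical and are not taken from the article, which does not describe its input layout here.

```python
import numpy as np


def build_windows(prices, exog, n_lags=24):
    """Build supervised samples for price forecasting.

    Each input row holds the previous n_lags prices followed by the
    exogenous values (e.g., forecasted load) for the target time step.
    The target is the price at that time step.
    """
    prices = np.asarray(prices, dtype=float)
    exog = np.asarray(exog, dtype=float)
    if exog.ndim == 1:
        exog = exog[:, None]  # treat a single exogenous series as one column
    if len(prices) != len(exog):
        raise ValueError("prices and exog must cover the same time steps")

    rows, targets = [], []
    for t in range(n_lags, len(prices)):
        rows.append(np.concatenate([prices[t - n_lags:t], exog[t]]))
        targets.append(prices[t])
    return np.array(rows), np.array(targets)
```

The resulting arrays could be passed to any regression model; the point of the sketch is only that lagged prices and exogenous forecasts sit side by side in each input row.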