Leveraging Large Language Models (LLMs) for recommendation has recently garnered considerable attention, where fine-tuning plays a key role in LLMs' adaptation. However, the cost of fine-tuning LLMs on rapidly expanding recommendation data limits their practical application. To address this challenge, few-shot fine-tuning offers a promising approach to quickly adapt LLMs to new recommendation data. We propose the task of data pruning for efficient LLM-based recommendation, aimed at identifying representative samples tailored for LLMs' few-shot fine-tuning. While coreset selection is closely related to the proposed task, existing coreset selection methods often rely on suboptimal heuristic metrics or entail costly optimization on large-scale recommendation data. To tackle these issues, we introduce two objectives for the data pruning task in the context of LLM-based recommendation: 1) high accuracy aims to identify the influential samples that can lead to high overall performance; and 2) high efficiency underlines the low costs of the data pruning process. To pursue the two objectives, we propose a novel data pruning method based on two scores, i.e., influence score and effort score, to efficiently identify the influential samples. Particularly, the influence score is introduced to accurately estimate the influence of sample removal on the overall performance. To achieve low costs of the data pruning process, we use a small-sized surrogate model to replace LLMs to obtain the influence score. Considering the potential gap between the surrogate model and LLMs, we further propose an effort score to prioritize some hard samples specifically for LLMs. Empirical results on three real-world datasets validate the effectiveness of our proposed method. In particular, the proposed method uses only 2% samples to surpass the full data fine-tuning, reducing time costs by 97%.
Collaborative Filtering (CF) recommender models highly depend on user-item interactions to learn CF representations, thus falling short of recommending cold-start items. To address this issue, prior studies mainly introduce item features (e.g., thumbnails) for cold-start item recommendation. They learn a feature extractor on warm-start items to align feature representations with interactions, and then leverage the feature extractor to extract the feature representations of cold-start items for interaction prediction. Unfortunately, the features of cold-start items, especially the popular ones, tend to diverge from those of warm-start ones due to temporal feature shifts, preventing the feature extractor from accurately learning feature representations of cold-start items. To alleviate the impact of temporal feature shifts, we consider using Distributionally Robust Optimization (DRO) to enhance the generation ability of the feature extractor. Nonetheless, existing DRO methods face an inconsistency issue: the worse-case warm-start items emphasized during DRO training might not align well with the cold-start item distribution. To capture the temporal feature shifts and combat this inconsistency issue, we propose a novel temporal DRO with new optimization objectives, namely, 1) to integrate a worst-case factor to improve the worst-case performance, and 2) to devise a shifting factor to capture the shifting trend of item features and enhance the optimization of the potentially popular groups in cold-start items. Substantial experiments on three real-world datasets validate the superiority of our temporal DRO in enhancing the generalization ability of cold-start recommender models. The code is available at https://github.com/Linxyhaha/TDRO/.
Large Language Models (LLMs) have garnered considerable attention in recommender systems. To achieve LLM-based recommendation, item indexing and generation grounding are two essential steps, bridging between recommendation items and natural language. Item indexing assigns a unique identifier to represent each item in natural language, and generation grounding grounds the generated token sequences to in-corpus items. However, previous works suffer from inherent limitations in the two steps. For item indexing, existing ID-based identifiers (e.g., numeric IDs) and description-based identifiers (e.g., titles) often compromise semantic richness or uniqueness. Moreover, generation grounding might inadvertently produce out-of-corpus identifiers. Worse still, autoregressive generation heavily relies on the initial token's quality. To combat these issues, we propose a novel multi-facet paradigm, namely TransRec, to bridge the LLMs to recommendation. Specifically, TransRec employs multi-facet identifiers that incorporate ID, title, and attribute, achieving both distinctiveness and semantics. Additionally, we introduce a specialized data structure for TransRec to guarantee the in-corpus identifier generation and adopt substring indexing to encourage LLMs to generate from any position. We implement TransRec on two backbone LLMs, i.e., BART-large and LLaMA-7B. Empirical results on three real-world datasets under diverse settings (e.g., full training and few-shot training with warm- and cold-start testings) attest to the superiority of TransRec.
Binary feature descriptors have been widely used in various visual measurement tasks, particularly those with limited computing resources and storage capacities. Existing binary descriptors may not perform well for long-term visual measurement tasks due to their sensitivity to illumination variations. It can be observed that when image illumination changes dramatically, the relative relationship among local patches mostly remains intact. Based on the observation, consequently, this study presents an illumination-insensitive binary (IIB) descriptor by leveraging the local inter-patch invariance exhibited in multiple spatial granularities to deal with unfavorable illumination variations. By taking advantage of integral images for local patch feature computation, a highly efficient IIB descriptor is achieved. It can encode scalable features in multiple spatial granularities, thus facilitating a computationally efficient hierarchical matching from coarse to fine. Moreover, the IIB descriptor can also apply to other types of image data, such as depth maps and semantic segmentation results, when available in some applications. Numerical experiments on both natural and synthetic datasets reveal that the proposed IIB descriptor outperforms state-of-the-art binary descriptors and some testing float descriptors. The proposed IIB descriptor has also been successfully employed in a demo system for long-term visual localization. The code of the IIB descriptor will be publicly available.
* IEEE Transactions on Instrumentation and Measurement 2023 * Accepted by IEEE Transactions on Instrumentation and Measurement
Line segment detection plays a cornerstone role in computer vision tasks. Among numerous detection methods that have been recently proposed, the ones based on edge drawing attract increasing attention owing to their excellent detection efficiency. However, the existing methods are not robust enough due to the inadequate usage of image gradients for edge drawing and line segment fitting. Based on the observation that the line segments should locate on the edge points with both consistent coordinates and level-line information, i.e., the unit vector perpendicular to the gradient orientation, this paper proposes a level-line guided edge drawing for robust line segment detection (GEDRLSD). The level-line information provides potential directions for edge tracking, which could be served as a guideline for accurate edge drawing. Additionally, the level-line information is fused in line segment fitting to improve the robustness. Numerical experiments show the superiority of the proposed GEDRLSD algorithm compared with state-of-the-art methods.
* ICASSP 2023 - 2023 IEEE International Conference on Acoustics,
Speech and Signal Processing (ICASSP) * Accepted by ICASSP 2023
Detection and description of line segments lay the basis for numerous vision tasks. Although many studies have aimed to detect and describe line segments, a comprehensive review is lacking, obstructing their progress. This study fills the gap by comprehensively reviewing related studies on detecting and describing two-dimensional image line segments to provide researchers with an overall picture and deep understanding. Based on their mechanisms, two taxonomies for line segment detection and description are presented to introduce, analyze, and summarize these studies, facilitating researchers to learn about them quickly and extensively. The key issues, core ideas, advantages and disadvantages of existing methods, and their potential applications for each category are analyzed and summarized, including previously unknown findings. The challenges in existing methods and corresponding insights for potentially solving them are also provided to inspire researchers. In addition, some state-of-the-art line segment detection and description algorithms are evaluated without bias, and the evaluation code will be publicly available. The theoretical analysis, coupled with the experimental results, can guide researchers in selecting the best method for their intended vision applications. Finally, this study provides insights for potentially interesting future research directions to attract more attention from researchers to this field.
* This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible
Generative models such as Generative Adversarial Networks (GANs) and Variational Auto-Encoders (VAEs) are widely utilized to model the generative process of user interactions. However, these generative models suffer from intrinsic limitations such as the instability of GANs and the restricted representation ability of VAEs. Such limitations hinder the accurate modeling of the complex user interaction generation procedure, such as noisy interactions caused by various interference factors. In light of the impressive advantages of Diffusion Models (DMs) over traditional generative models in image synthesis, we propose a novel Diffusion Recommender Model (named DiffRec) to learn the generative process in a denoising manner. To retain personalized information in user interactions, DiffRec reduces the added noises and avoids corrupting users' interactions into pure noises like in image synthesis. In addition, we extend traditional DMs to tackle the unique challenges in practical recommender systems: high resource costs for large-scale item prediction and temporal shifts of user preference. To this end, we propose two extensions of DiffRec: L-DiffRec clusters items for dimension compression and conducts the diffusion processes in the latent space; and T-DiffRec reweights user interactions based on the interaction timestamps to encode temporal information. We conduct extensive experiments on three datasets under multiple settings (e.g. clean training, noisy training, and temporal training). The empirical results and in-depth analysis validate the superiority of DiffRec with two extensions over competitive baselines.
* 10 pages, 7 figures, accepted for publication in SIGIR'23
Recommender systems typically retrieve items from an item corpus for personalized recommendations. However, such a retrieval-based recommender paradigm faces two limitations: 1) the human-generated items in the corpus might fail to satisfy the users' diverse information needs, and 2) users usually adjust the recommendations via passive and inefficient feedback such as clicks. Nowadays, AI-Generated Content (AIGC) has revealed significant success across various domains, offering the potential to overcome these limitations: 1) generative AI can produce personalized items to meet users' specific information needs, and 2) the newly emerged ChatGPT significantly facilitates users to express information needs more precisely via natural language instructions. In this light, the boom of AIGC points the way towards the next-generation recommender paradigm with two new objectives: 1) generating personalized content through generative AI, and 2) integrating user instructions to guide content generation. To this end, we propose a novel Generative Recommender paradigm named GeneRec, which adopts an AI generator to personalize content generation and leverages user instructions to acquire users' information needs. Specifically, we pre-process users' instructions and traditional feedback (e.g., clicks) via an instructor to output the generation guidance. Given the guidance, we instantiate the AI generator through an AI editor and an AI creator to repurpose existing items and create new items, respectively. Eventually, GeneRec can perform content retrieval, repurposing, and creation to meet users' information needs. Besides, to ensure the trustworthiness of the generated items, we emphasize various fidelity checks such as authenticity and legality checks. Lastly, we study the feasibility of implementing the AI editor and AI creator on micro-video generation, showing promising results.
Recommender systems easily face the issue of user preference shifts. User representations will become out-of-date and lead to inappropriate recommendations if user preference has shifted over time. To solve the issue, existing work focuses on learning robust representations or predicting the shifting pattern. There lacks a comprehensive view to discover the underlying reasons for user preference shifts. To understand the preference shift, we abstract a causal graph to describe the generation procedure of user interaction sequences. Assuming user preference is stable within a short period, we abstract the interaction sequence as a set of chronological environments. From the causal graph, we find that the changes of some unobserved factors (e.g., becoming pregnant) cause preference shifts between environments. Besides, the fine-grained user preference over categories sparsely affects the interactions with different items. Inspired by the causal graph, our key considerations to handle preference shifts lie in modeling the interaction generation procedure by: 1) capturing the preference shifts across environments for accurate preference prediction, and 2) disentangling the sparse influence from user preference to interactions for accurate effect estimation of preference. To this end, we propose a Causal Disentangled Recommendation (CDR) framework, which captures preference shifts via a temporal variational autoencoder and learns the sparse influence from multiple environments. Specifically, an encoder is adopted to infer the unobserved factors from user interactions while a decoder is to model the interaction generation process. Besides, we introduce two learnable matrices to disentangle the sparse influence from user preference to interactions. Lastly, we devise a multi-objective loss to optimize CDR. Extensive experiments on three datasets show the superiority of CDR.
* This paper has been accepted for publication in Transactions on