Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yu Lu

DaRec: A Disentangled Alignment Framework for Large Language Model and Recommender System

Aug 15, 2024

Xihong Yang, Heming Jing, Zixing Zhang, Jindong Wang, Huakang Niu, Shuaiqiang Wang, Yu Lu, Junfeng Wang, Dawei Yin, Xinwang Liu(+3 more)

Abstract:Benefiting from the strong reasoning capabilities, Large language models (LLMs) have demonstrated remarkable performance in recommender systems. Various efforts have been made to distill knowledge from LLMs to enhance collaborative models, employing techniques like contrastive learning for representation alignment. In this work, we prove that directly aligning the representations of LLMs and collaborative models is sub-optimal for enhancing downstream recommendation tasks performance, based on the information theorem. Consequently, the challenge of effectively aligning semantic representations between collaborative models and LLMs remains unresolved. Inspired by this viewpoint, we propose a novel plug-and-play alignment framework for LLMs and collaborative models. Specifically, we first disentangle the latent representations of both LLMs and collaborative models into specific and shared components via projection layers and representation regularization. Subsequently, we perform both global and local structure alignment on the shared representations to facilitate knowledge transfer. Additionally, we theoretically prove that the specific and shared representations contain more pertinent and less irrelevant information, which can enhance the effectiveness of downstream recommendation tasks. Extensive experimental results on benchmark datasets demonstrate that our method is superior to existing state-of-the-art algorithms.

Via

Access Paper or Ask Questions

FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention

Jul 29, 2024

Yu Lu, Yuanzhi Liang, Linchao Zhu, Yi Yang

Abstract:Video diffusion models have made substantial progress in various video generation applications. However, training models for long video generation tasks require significant computational and data resources, posing a challenge to developing long video diffusion models. This paper investigates a straightforward and training-free approach to extend an existing short video diffusion model (e.g. pre-trained on 16-frame videos) for consistent long video generation (e.g. 128 frames). Our preliminary observation has found that directly applying the short video diffusion model to generate long videos can lead to severe video quality degradation. Further investigation reveals that this degradation is primarily due to the distortion of high-frequency components in long videos, characterized by a decrease in spatial high-frequency components and an increase in temporal high-frequency components. Motivated by this, we propose a novel solution named FreeLong to balance the frequency distribution of long video features during the denoising process. FreeLong blends the low-frequency components of global video features, which encapsulate the entire video sequence, with the high-frequency components of local video features that focus on shorter subsequences of frames. This approach maintains global consistency while incorporating diverse and high-quality spatiotemporal details from local videos, enhancing both the consistency and fidelity of long video generation. We evaluated FreeLong on multiple base video diffusion models and observed significant improvements. Additionally, our method supports coherent multi-prompt generation, ensuring both visual coherence and seamless transitions between scenes.

* Project page: https://yulu.net.cn/freelong

Via

Access Paper or Ask Questions

Hyperbolic Knowledge Transfer in Cross-Domain Recommendation System

Jun 25, 2024

Xin Yang, Heng Chang, Zhijian La, Jinze Yang, Xingrun Li, Yu Lu, Shuaiqiang Wang, Dawei Yin, Erxue Min

Abstract:Cross-Domain Recommendation (CDR) seeks to utilize knowledge from different domains to alleviate the problem of data sparsity in the target recommendation domain, and it has been gaining more attention in recent years. Although there have been notable advancements in this area, most current methods represent users and items in Euclidean space, which is not ideal for handling long-tail distributed data in recommendation systems. Additionally, adding data from other domains can worsen the long-tail characteristics of the entire dataset, making it harder to train CDR models effectively. Recent studies have shown that hyperbolic methods are particularly suitable for modeling long-tail distributions, which has led us to explore hyperbolic representations for users and items in CDR scenarios. However, due to the distinct characteristics of the different domains, applying hyperbolic representation learning to CDR tasks is quite challenging. In this paper, we introduce a new framework called Hyperbolic Contrastive Learning (HCTS), designed to capture the unique features of each domain while enabling efficient knowledge transfer between domains. We achieve this by embedding users and items from each domain separately and mapping them onto distinct hyperbolic manifolds with adjustable curvatures for prediction. To improve the representations of users and items in the target domain, we develop a hyperbolic contrastive learning module for knowledge transfer. Extensive experiments on real-world datasets demonstrate that hyperbolic manifolds are a promising alternative to Euclidean space for CDR tasks.

Via

Access Paper or Ask Questions

Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs

Jun 04, 2024

Zhiwei Cao, Qian Cao, Yu Lu, Ningxin Peng, Luyang Huang, Shanbo Cheng, Jinsong Su

Figure 1 for Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs

Figure 2 for Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs

Figure 3 for Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs

Figure 4 for Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs

Abstract:The growing popularity of Large Language Models has sparked interest in context compression for Large Language Models (LLMs). However, the performance of previous methods degrades dramatically as compression ratios increase, sometimes even falling to the closed-book level. This decline can be attributed to the loss of key information during the compression process. Our preliminary study supports this hypothesis, emphasizing the significance of retaining key information to maintain model performance under high compression ratios. As a result, we introduce Query-Guided Compressor (QGC), which leverages queries to guide the context compression process, effectively preserving key information within the compressed context. Additionally, we employ a dynamic compression strategy. We validate the effectiveness of our proposed QGC on the Question Answering task, including NaturalQuestions, TriviaQA, and HotpotQA datasets. Experimental results show that QGC can consistently perform well even at high compression ratios, which also offers significant benefits in terms of inference cost and throughput.

* Accepted to ACL 2024

Via

Access Paper or Ask Questions

G-DIG: Towards Gradient-based DIverse and hiGh-quality Instruction Data Selection for Machine Translation

May 21, 2024

Xingyuan Pan, Luyang Huang, Liyan Kang, Zhicheng Liu, Yu Lu, Shanbo Cheng

Abstract:Large Language Models (LLMs) have demonstrated remarkable abilities in general scenarios. Instruction finetuning empowers them to align with humans in various tasks. Nevertheless, the Diversity and Quality of the instruction data remain two main challenges for instruction finetuning. With regard to this, in this paper, we propose a novel gradient-based method to automatically select high-quality and diverse instruction finetuning data for machine translation. Our key innovation centers around analyzing how individual training examples influence the model during training. Specifically, we select training examples that exert beneficial influences on the model as high-quality ones by means of Influence Function plus a small high-quality seed dataset. Moreover, to enhance the diversity of the training data we maximize the variety of influences they have on the model by clustering on their gradients and resampling. Extensive experiments on WMT22 and FLORES translation tasks demonstrate the superiority of our methods, and in-depth analysis further validates their effectiveness and generalization.

* Accepted to ACL 2024 main conference

Via

Access Paper or Ask Questions

Automatic Evaluation for Mental Health Counseling using LLMs

Feb 19, 2024

Anqi Li, Yu Lu, Nirui Song, Shuai Zhang, Lizhi Ma, Zhenzhong Lan

Figure 1 for Automatic Evaluation for Mental Health Counseling using LLMs

Figure 2 for Automatic Evaluation for Mental Health Counseling using LLMs

Figure 3 for Automatic Evaluation for Mental Health Counseling using LLMs

Figure 4 for Automatic Evaluation for Mental Health Counseling using LLMs

Abstract:High-quality psychological counseling is crucial for mental health worldwide, and timely evaluation is vital for ensuring its effectiveness. However, obtaining professional evaluation for each counseling session is expensive and challenging. Existing methods that rely on self or third-party manual reports to assess the quality of counseling suffer from subjective biases and limitations of time-consuming. To address above challenges, this paper proposes an innovative and efficient automatic approach using large language models (LLMs) to evaluate the working alliance in counseling conversations. We collected a comprehensive counseling dataset and conducted multiple third-party evaluations based on therapeutic relationship theory. Our LLM-based evaluation, combined with our guidelines, shows high agreement with human evaluations and provides valuable insights into counseling scripts. This highlights the potential of LLMs as supervisory tools for psychotherapists. By integrating LLMs into the evaluation process, our approach offers a cost-effective and dependable means of assessing counseling quality, enhancing overall effectiveness.

* 21 pages, 4 figures

Via

Access Paper or Ask Questions

Unveiling the Secrets of Engaging Conversations: Factors that Keep Users Hooked on Role-Playing Dialog Agents

Feb 18, 2024

Shuai Zhang, Yu Lu, Junwen Liu, Jia Yu, Huachuan Qiu, Yuming Yan, Zhenzhong Lan

Figure 1 for Unveiling the Secrets of Engaging Conversations: Factors that Keep Users Hooked on Role-Playing Dialog Agents

Figure 2 for Unveiling the Secrets of Engaging Conversations: Factors that Keep Users Hooked on Role-Playing Dialog Agents

Figure 3 for Unveiling the Secrets of Engaging Conversations: Factors that Keep Users Hooked on Role-Playing Dialog Agents

Figure 4 for Unveiling the Secrets of Engaging Conversations: Factors that Keep Users Hooked on Role-Playing Dialog Agents

Abstract:With the growing humanlike nature of dialog agents, people are now engaging in extended conversations that can stretch from brief moments to substantial periods of time. Understanding the factors that contribute to sustaining these interactions is crucial, yet existing studies primarily focusing on short-term simulations that rarely explore such prolonged and real conversations. In this paper, we investigate the factors influencing retention rates in real interactions with roleplaying models. By analyzing a large dataset of interactions between real users and thousands of characters, we systematically examine multiple factors and assess their impact on user retention rate. Surprisingly, we find that the degree to which the bot embodies the roles it plays has limited influence on retention rates, while the length of each turn it speaks significantly affects retention rates. This study sheds light on the critical aspects of user engagement with role-playing models and provides valuable insights for future improvements in the development of large language models for role-playing purposes.

Via

Access Paper or Ask Questions

FlowZero: Zero-Shot Text-to-Video Synthesis with LLM-Driven Dynamic Scene Syntax

Nov 27, 2023

Yu Lu, Linchao Zhu, Hehe Fan, Yi Yang

Abstract:Text-to-video (T2V) generation is a rapidly growing research area that aims to translate the scenes, objects, and actions within complex video text into a sequence of coherent visual frames. We present FlowZero, a novel framework that combines Large Language Models (LLMs) with image diffusion models to generate temporally-coherent videos. FlowZero uses LLMs to understand complex spatio-temporal dynamics from text, where LLMs can generate a comprehensive dynamic scene syntax (DSS) containing scene descriptions, object layouts, and background motion patterns. These elements in DSS are then used to guide the image diffusion model for video generation with smooth object motions and frame-to-frame coherence. Moreover, FlowZero incorporates an iterative self-refinement process, enhancing the alignment between the spatio-temporal layouts and the textual prompts for the videos. To enhance global coherence, we propose enriching the initial noise of each frame with motion dynamics to control the background movement and camera motion adaptively. By using spatio-temporal syntaxes to guide the diffusion process, FlowZero achieves improvement in zero-shot video synthesis, generating coherent videos with vivid motion.

* Project page: https://flowzero-video.github.io

Via

Access Paper or Ask Questions

ActiveDC: Distribution Calibration for Active Finetuning

Nov 15, 2023

Wenshuai Xu, Zhenhui Hu, Yu Lu, Jinzhou Meng, Qingjie Liu, Yunhong Wang

Abstract:The pretraining-finetuning paradigm has gained popularity in various computer vision tasks. In this paradigm, the emergence of active finetuning arises due to the abundance of large-scale data and costly annotation requirements. Active finetuning involves selecting a subset of data from an unlabeled pool for annotation, facilitating subsequent finetuning. However, the use of a limited number of training samples can lead to a biased distribution, potentially resulting in model overfitting. In this paper, we propose a new method called ActiveDC for the active finetuning tasks. Firstly, we select samples for annotation by optimizing the distribution similarity between the subset to be selected and the entire unlabeled pool in continuous space. Secondly, we calibrate the distribution of the selected samples by exploiting implicit category information in the unlabeled pool. The feature visualization provides an intuitive sense of the effectiveness of our approach to distribution calibration. We conducted extensive experiments on three image classification datasets with different sampling ratios. The results indicate that ActiveDC consistently outperforms the baseline performance in all image classification tasks. The improvement is particularly significant when the sampling ratio is low, with performance gains of up to 10%. Our code will be released.

* 10 pages, 5 figures

Via

Access Paper or Ask Questions

Data-Driven Modeling of an Unsaturated Bentonite Buffer Model Test Under High Temperatures Using an Enhanced Axisymmetric Reproducing Kernel Particle Method

Sep 24, 2023

Jonghyuk Baek, Yanran Wang, Xiaolong He, Yu Lu, John S. McCartney, J. S. Chen

Figure 1 for Data-Driven Modeling of an Unsaturated Bentonite Buffer Model Test Under High Temperatures Using an Enhanced Axisymmetric Reproducing Kernel Particle Method

Figure 2 for Data-Driven Modeling of an Unsaturated Bentonite Buffer Model Test Under High Temperatures Using an Enhanced Axisymmetric Reproducing Kernel Particle Method

Figure 3 for Data-Driven Modeling of an Unsaturated Bentonite Buffer Model Test Under High Temperatures Using an Enhanced Axisymmetric Reproducing Kernel Particle Method

Figure 4 for Data-Driven Modeling of an Unsaturated Bentonite Buffer Model Test Under High Temperatures Using an Enhanced Axisymmetric Reproducing Kernel Particle Method

Abstract:In deep geological repositories for high level nuclear waste with close canister spacings, bentonite buffers can experience temperatures higher than 100 {\deg}C. In this range of extreme temperatures, phenomenological constitutive laws face limitations in capturing the thermo-hydro-mechanical (THM) behavior of the bentonite, since the pre-defined functional constitutive laws often lack generality and flexibility to capture a wide range of complex coupling phenomena as well as the effects of stress state and path dependency. In this work, a deep neural network (DNN)-based soil-water retention curve (SWRC) of bentonite is introduced and integrated into a Reproducing Kernel Particle Method (RKPM) for conducting THM simulations of the bentonite buffer. The DNN-SWRC model incorporates temperature as an additional input variable, allowing it to learn the relationship between suction and degree of saturation under the general non-isothermal condition, which is difficult to represent using a phenomenological SWRC. For effective modeling of the tank-scale test, new axisymmetric Reproducing Kernel basis functions enriched with singular Dirichlet enforcement representing heater placement and an effective convective heat transfer coefficient representing thin-layer composite tank construction are developed. The proposed method is demonstrated through the modeling of a tank-scale experiment involving a cylindrical layer of MX-80 bentonite exposed to central heating.

* 51 pages, 19 figures

Via

Access Paper or Ask Questions