Alert button
Picture for Yu Lei

Yu Lei

Alert button

COPF: Continual Learning Human Preference through Optimal Policy Fitting

Oct 28, 2023
Han Zhang, Lin Gui, Yuanzhao Zhai, Hui Wang, Yu Lei, Ruifeng Xu

The technique of Reinforcement Learning from Human Feedback (RLHF) is a commonly employed method to improve pre-trained Language Models (LM), enhancing their ability to conform to human preferences. Nevertheless, the current RLHF-based LMs necessitate full retraining each time novel queries or feedback are introduced, which becomes a challenging task because human preferences can vary between different domains or tasks. Retraining LMs poses practical difficulties in many real-world situations due to the significant time and computational resources required, along with concerns related to data privacy. To address this limitation, we propose a new method called Continual Optimal Policy Fitting (COPF), in which we estimate a series of optimal policies using the Monte Carlo method, and then continually fit the policy sequence with the function regularization. COPF involves a single learning phase and doesn't necessitate complex reinforcement learning. Importantly, it shares the capability with RLHF to learn from unlabeled data, making it flexible for continual preference learning. Our experimental results show that COPF outperforms strong Continuous learning (CL) baselines when it comes to consistently aligning with human preferences on different tasks and domains.

Viaarxiv icon

FocalDreamer: Text-driven 3D Editing via Focal-fusion Assembly

Aug 22, 2023
Yuhan Li, Yishun Dou, Yue Shi, Yu Lei, Xuanhong Chen, Yi Zhang, Peng Zhou, Bingbing Ni

Figure 1 for FocalDreamer: Text-driven 3D Editing via Focal-fusion Assembly
Figure 2 for FocalDreamer: Text-driven 3D Editing via Focal-fusion Assembly
Figure 3 for FocalDreamer: Text-driven 3D Editing via Focal-fusion Assembly
Figure 4 for FocalDreamer: Text-driven 3D Editing via Focal-fusion Assembly

While text-3D editing has made significant strides in leveraging score distillation sampling, emerging approaches still fall short in delivering separable, precise and consistent outcomes that are vital to content creation. In response, we introduce FocalDreamer, a framework that merges base shape with editable parts according to text prompts for fine-grained editing within desired regions. Specifically, equipped with geometry union and dual-path rendering, FocalDreamer assembles independent 3D parts into a complete object, tailored for convenient instance reuse and part-wise control. We propose geometric focal loss and style consistency regularization, which encourage focal fusion and congruent overall appearance. Furthermore, FocalDreamer generates high-fidelity geometry and PBR textures which are compatible with widely-used graphics engines. Extensive experiments have highlighted the superior editing capabilities of FocalDreamer in both quantitative and qualitative evaluations.

* Project website: https://focaldreamer.github.io 
Viaarxiv icon

A Lower Bound of Hash Codes' Performance

Oct 12, 2022
Xiaosu Zhu, Jingkuan Song, Yu Lei, Lianli Gao, Heng Tao Shen

Figure 1 for A Lower Bound of Hash Codes' Performance
Figure 2 for A Lower Bound of Hash Codes' Performance
Figure 3 for A Lower Bound of Hash Codes' Performance
Figure 4 for A Lower Bound of Hash Codes' Performance

As a crucial approach for compact representation learning, hashing has achieved great success in effectiveness and efficiency. Numerous heuristic Hamming space metric learning objectives are designed to obtain high-quality hash codes. Nevertheless, a theoretical analysis of criteria for learning good hash codes remains largely unexploited. In this paper, we prove that inter-class distinctiveness and intra-class compactness among hash codes determine the lower bound of hash codes' performance. Promoting these two characteristics could lift the bound and improve hash learning. We then propose a surrogate model to fully exploit the above objective by estimating the posterior of hash codes and controlling it, which results in a low-bias optimization. Extensive experiments reveal the effectiveness of the proposed method. By testing on a series of hash-models, we obtain performance improvements among all of them, with an up to $26.5\%$ increase in mean Average Precision and an up to $20.5\%$ increase in accuracy. Our code is publicly available at \url{https://github.com/VL-Group/LBHash}.

* Accepted to NeurIPS 2022 
Viaarxiv icon

Multi-Faceted Hierarchical Multi-Task Learning for a Large Number of Tasks with Multi-dimensional Relations

Oct 26, 2021
Junning Liu, Zijie Xia, Yu Lei, Xinjian Li, Xu Wang

Figure 1 for Multi-Faceted Hierarchical Multi-Task Learning for a Large Number of Tasks with Multi-dimensional Relations
Figure 2 for Multi-Faceted Hierarchical Multi-Task Learning for a Large Number of Tasks with Multi-dimensional Relations
Figure 3 for Multi-Faceted Hierarchical Multi-Task Learning for a Large Number of Tasks with Multi-dimensional Relations
Figure 4 for Multi-Faceted Hierarchical Multi-Task Learning for a Large Number of Tasks with Multi-dimensional Relations

There has been many studies on improving the efficiency of shared learning in Multi-Task Learning(MTL). Previous work focused on the "micro" sharing perspective for a small number of tasks, while in Recommender Systems(RS) and other AI applications, there are often demands to model a large number of tasks with multi-dimensional task relations. For example, when using MTL to model various user behaviors in RS, if we differentiate new users and new items from old ones, there will be a cartesian product style increase of tasks with multi-dimensional relations. This work studies the "macro" perspective of shared learning network design and proposes a Multi-Faceted Hierarchical MTL model(MFH). MFH exploits the multi-dimension task relations with a nested hierarchical tree structure which maximizes the shared learning. We evaluate MFH and SOTA models in a large industry video platform of 10 billion samples and results show that MFH outperforms SOTA MTL models significantly in both offline and online evaluations across all user groups, especially remarkable for new users with an online increase of 9.1\% in app time per user and 1.85\% in next-day retention rate. MFH now has been deployed in a large scale online video recommender system. MFH is especially beneficial to the cold-start problems in RS where new users and new items often suffer from a "local overfitting" phenomenon. However, the idea is actually generic and widely applicable to other MTL scenarios.

* 12 pages, 8 figures 
Viaarxiv icon

Capsule Graph Neural Networks with EM Routing

Oct 18, 2021
Yu Lei, Jing Zhang

Figure 1 for Capsule Graph Neural Networks with EM Routing
Figure 2 for Capsule Graph Neural Networks with EM Routing

To effectively classify graph instances, graph neural networks need to have the capability to capture the part-whole relationship existing in a graph. A capsule is a group of neurons representing complicated properties of entities, which has shown its advantages in traditional convolutional neural networks. This paper proposed novel Capsule Graph Neural Networks that use the EM routing mechanism (CapsGNNEM) to generate high-quality graph embeddings. Experimental results on a number of real-world graph datasets demonstrate that the proposed CapsGNNEM outperforms nine state-of-the-art models in graph classification tasks.

Viaarxiv icon

Self-adaptive Multi-task Particle Swarm Optimization

Oct 09, 2021
Xiaolong Zheng, Deyun Zhou, Na Li, Yu Lei, Tao Wu, Maoguo Gong

Figure 1 for Self-adaptive Multi-task Particle Swarm Optimization
Figure 2 for Self-adaptive Multi-task Particle Swarm Optimization
Figure 3 for Self-adaptive Multi-task Particle Swarm Optimization
Figure 4 for Self-adaptive Multi-task Particle Swarm Optimization

Multi-task optimization (MTO) studies how to simultaneously solve multiple optimization problems for the purpose of obtaining better performance on each problem. Over the past few years, evolutionary MTO (EMTO) was proposed to handle MTO problems via evolutionary algorithms. So far, many EMTO algorithms have been developed and demonstrated well performance on solving real-world problems. However, there remain many works to do in adapting knowledge transfer to task relatedness in EMTO. Different from the existing works, we develop a self-adaptive multi-task particle swarm optimization (SaMTPSO) through the developed knowledge transfer adaptation strategy, the focus search strategy and the knowledge incorporation strategy. In the knowledge transfer adaptation strategy, each task has a knowledge source pool that consists of all knowledge sources. Each source (task) outputs knowledge to the task. And knowledge transfer adapts to task relatedness via individuals' choice on different sources of a pool, where the chosen probabilities for different sources are computed respectively according to task's success rate in generating improved solutions via these sources. In the focus search strategy, if there is no knowledge source benefit the optimization of a task, then all knowledge sources in the task's pool are forbidden to be utilized except the task, which helps to improve the performance of the proposed algorithm. Note that the task itself is as a knowledge source of its own. In the knowledge incorporation strategy, two different forms are developed to help the SaMTPSO explore and exploit the transferred knowledge from a chosen source, each leading to a version of the SaMTPSO. Several experiments are conducted on two test suites. The results of the SaMTPSO are comparing to that of 3 popular EMTO algorithms and a particle swarm algorithm, which demonstrates the superiority of the SaMTPSO.

Viaarxiv icon

Geom-GCN: Geometric Graph Convolutional Networks

Feb 14, 2020
Hongbin Pei, Bingzhe Wei, Kevin Chen-Chuan Chang, Yu Lei, Bo Yang

Figure 1 for Geom-GCN: Geometric Graph Convolutional Networks
Figure 2 for Geom-GCN: Geometric Graph Convolutional Networks
Figure 3 for Geom-GCN: Geometric Graph Convolutional Networks
Figure 4 for Geom-GCN: Geometric Graph Convolutional Networks

Message-passing neural networks (MPNNs) have been successfully applied to representation learning on graphs in a variety of real-world applications. However, two fundamental weaknesses of MPNNs' aggregators limit their ability to represent graph-structured data: losing the structural information of nodes in neighborhoods and lacking the ability to capture long-range dependencies in disassortative graphs. Few studies have noticed the weaknesses from different perspectives. From the observations on classical neural network and network geometry, we propose a novel geometric aggregation scheme for graph neural networks to overcome the two weaknesses. The behind basic idea is the aggregation on a graph can benefit from a continuous space underlying the graph. The proposed aggregation scheme is permutation-invariant and consists of three modules, node embedding, structural neighborhood, and bi-level aggregation. We also present an implementation of the scheme in graph convolutional networks, termed Geom-GCN (Geometric Graph Convolutional Networks), to perform transductive learning on graphs. Experimental results show the proposed Geom-GCN achieved state-of-the-art performance on a wide range of open datasets of graphs. Code is available at https://github.com/graphdml-uiuc-jlu/geom-gcn.

* Published as a conference paper at ICLR 2020 
Viaarxiv icon

When Collaborative Filtering Meets Reinforcement Learning

Apr 02, 2019
Yu Lei, Wenjie Li

Figure 1 for When Collaborative Filtering Meets Reinforcement Learning
Figure 2 for When Collaborative Filtering Meets Reinforcement Learning

In this paper, we study a multi-step interactive recommendation problem, where the item recommended at current step may affect the quality of future recommendations. To address the problem, we develop a novel and effective approach, named CFRL, which seamlessly integrates the ideas of both collaborative filtering (CF) and reinforcement learning (RL). More specifically, we first model the recommender-user interactive recommendation problem as an agent-environment RL task, which is mathematically described by a Markov decision process (MDP). Further, to achieve collaborative recommendations for the entire user community, we propose a novel CF-based MDP by encoding the states of all users into a shared latent vector space. Finally, we propose an effective Q-network learning method to learn the agent's optimal policy based on the CF-based MDP. The capability of CFRL is demonstrated by comparing its performance against a variety of existing methods on real-world datasets.

Viaarxiv icon