
Qiang Fu


PSGformer: Enhancing 3D Point Cloud Instance Segmentation via Precise Semantic Guidance

Jul 15, 2023
Lei Pan, Wuyang Luan, Yuan Zheng, Qiang Fu, Junhui Li

Figures 1–4 for PSGformer: Enhancing 3D Point Cloud Instance Segmentation via Precise Semantic Guidance

Most existing 3D instance segmentation methods are derived from 3D semantic segmentation models. However, these indirect approaches suffer from certain limitations: they fail to fully leverage global and local semantic information for accurate prediction, which hampers the overall performance of the 3D instance segmentation framework. To address these issues, this paper presents PSGformer, a novel 3D instance segmentation network. PSGformer incorporates two key advancements to enhance the performance of 3D instance segmentation. First, we propose a Multi-Level Semantic Aggregation Module, which effectively captures scene features by employing foreground point filtering and multi-radius aggregation. This module acquires more detailed semantic information from both global and local perspectives. Second, PSGformer introduces a Parallel Feature Fusion Transformer Module that independently processes super-point features and aggregated features using transformers. By fusing features that connect global and local information, the model achieves a more comprehensive feature representation. We conducted extensive experiments on the ScanNetv2 dataset. Notably, PSGformer outperforms state-of-the-art methods by 2.2% mAP on the ScanNetv2 hidden test set. Our code and models will be publicly released.


RLTF: Reinforcement Learning from Unit Test Feedback

Jul 10, 2023
Jiate Liu, Yiqin Zhu, Kaiwen Xiao, Qiang Fu, Xiao Han, Wei Yang, Deheng Ye

Figures 1–4 for RLTF: Reinforcement Learning from Unit Test Feedback

The goal of program synthesis, or code generation, is to generate executable code from given descriptions. Recently, an increasing number of studies have employed reinforcement learning (RL) to improve the performance of large language models (LLMs) for code. However, these RL methods have only used offline frameworks, limiting their exploration of new sample spaces. Additionally, current approaches that utilize unit test signals are rather simple, not accounting for specific error locations within the code. To address these issues, we propose RLTF, i.e., Reinforcement Learning from Unit Test Feedback, a novel online RL framework with multi-granularity unit test feedback for refining code LLMs. Our approach generates data in real time during training and simultaneously utilizes fine-grained feedback signals to guide the model towards producing higher-quality code. Extensive experiments show that RLTF achieves state-of-the-art performance on the APPS and MBPP benchmarks. Our code can be found at: https://github.com/Zyq-scut/RLTF.
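The multi-granularity idea can be sketched as a reward-shaping function that maps a coarse unit-test outcome to a program-level reward and, when an error line is reported, concentrates the penalty there for sharper credit assignment. The specific values and scheme below are illustrative, not the paper's exact formulation:

```python
def rltf_rewards(status, num_lines, error_line=None):
    """Map unit-test feedback to per-line rewards (hypothetical shaping).

    status: one of "pass", "failure", "error", "timeout"
    error_line: 0-based line index reported by the failing test, if any
    """
    # Coarse-grained reward for the whole program (illustrative values).
    coarse = {"pass": 1.0, "failure": -0.3, "error": -0.6, "timeout": -1.0}
    r = coarse[status]
    # Default: spread the reward uniformly over all lines.
    per_line = [r / num_lines] * num_lines
    # Fine-grained: if a specific error location is known, concentrate
    # the penalty on that line instead of diluting it everywhere.
    if error_line is not None and r < 0:
        per_line = [0.0] * num_lines
        per_line[error_line] = r
    return per_line
```

A passing program rewards every line equally, while a program with a located error penalizes only the offending line.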


Policy Space Diversity for Non-Transitive Games

Jun 29, 2023
Jian Yao, Weiming Liu, Haobo Fu, Yaodong Yang, Stephen McAleer, Qiang Fu, Wei Yang

Figures 1–4 for Policy Space Diversity for Non-Transitive Games

Policy-Space Response Oracles (PSRO) is an influential algorithmic framework for approximating a Nash Equilibrium (NE) in multi-agent non-transitive games. Many previous studies have tried to promote policy diversity in PSRO. A major weakness of existing diversity metrics is that a more diverse population (according to those metrics) does not necessarily yield a better approximation to an NE, as we prove in this paper. To alleviate this problem, we propose a new diversity metric whose improvement guarantees a better approximation to an NE. Meanwhile, we develop a practical and well-justified method to optimize our diversity metric using only state-action samples. By incorporating our diversity regularization into best-response solving in PSRO, we obtain a new PSRO variant, Policy Space Diversity PSRO (PSD-PSRO). We present the convergence property of PSD-PSRO. Empirically, extensive experiments on various games demonstrate that PSD-PSRO is more effective at producing significantly less exploitable policies than state-of-the-art PSRO variants.
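The notion of exploitability used to evaluate PSRO variants can be made concrete in the simplest setting, a two-player zero-sum matrix game: it is the total gain both players could obtain by best-responding to the current strategies, and it is zero exactly at an NE. A minimal sketch:

```python
import numpy as np

def exploitability(A, p, q):
    """Exploitability of mixed strategies (p, q) in a zero-sum matrix
    game with row-player payoff matrix A. Equals the sum of both
    players' best-response gains; zero iff (p, q) is an NE."""
    row_br = np.max(A @ q)       # row player's best-response value vs q
    col_br = np.max(-(A.T @ p))  # column player's best-response value vs p
    # (row_br - p@A@q) + (col_br + p@A@q) simplifies to row_br + col_br.
    return row_br + col_br
```

For rock-paper-scissors, the uniform strategy has exploitability 0, while pure "rock" is maximally exploitable.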


Maximum Entropy Heterogeneous-Agent Mirror Learning

Jun 19, 2023
Jiarong Liu, Yifan Zhong, Siyi Hu, Haobo Fu, Qiang Fu, Xiaojun Chang, Yaodong Yang

Figures 1–4 for Maximum Entropy Heterogeneous-Agent Mirror Learning

Multi-agent reinforcement learning (MARL) has been shown to be effective for cooperative games in recent years. However, existing state-of-the-art methods face challenges related to sample inefficiency, brittleness regarding hyperparameters, and the risk of converging to a suboptimal Nash Equilibrium. To resolve these issues, we propose a novel theoretical framework, named Maximum Entropy Heterogeneous-Agent Mirror Learning (MEHAML), that leverages the maximum entropy principle to design maximum entropy MARL actor-critic algorithms. We prove that algorithms derived from the MEHAML framework enjoy the desired properties of monotonic improvement of the joint maximum entropy objective and convergence to a quantal response equilibrium (QRE). The practicality of MEHAML is demonstrated by developing a MEHAML extension of the widely used RL algorithm, HASAC (for soft actor-critic), which shows significant improvements in exploration and robustness on three challenging benchmarks: Multi-Agent MuJoCo, StarCraft II, and Google Research Football. Our results show that HASAC outperforms strong baseline methods such as HATD3, HAPPO, QMIX, and MAPPO, thereby establishing the new state of the art. See our project page at https://sites.google.com/view/mehaml.


On Manipulating Signals of User-Item Graph: A Jacobi Polynomial-based Graph Collaborative Filtering

Jun 06, 2023
Jiayan Guo, Lun Du, Xu Chen, Xiaojun Ma, Qiang Fu, Shi Han, Dongmei Zhang, Yan Zhang

Figures 1–4 for On Manipulating Signals of User-Item Graph: A Jacobi Polynomial-based Graph Collaborative Filtering

Collaborative filtering (CF) is an important research direction in recommender systems that aims to make recommendations given information on user-item interactions. Graph CF has attracted increasing attention in recent years due to its effectiveness in leveraging high-order information in the user-item bipartite graph for better recommendations. Specifically, recent studies show that the success of graph neural networks (GNNs) for CF is attributed to their low-pass filtering effects. However, current research lacks a study of how different signal components contribute to recommendations and how to design strategies to use them properly. To this end, from the view of spectral transformation, we analyze the important factors a graph filter should consider to achieve better performance. Based on these discoveries, we design JGCF, an efficient and effective method for CF based on Jacobi polynomial bases and frequency decomposition strategies. Extensive experiments on four widely used public datasets show the effectiveness and efficiency of the proposed method, which brings up to a 27.06% performance gain on Alibaba-iFashion. Besides, the experimental results also show that JGCF is better at handling sparse datasets, showing its potential for making recommendations to cold-start users.
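The Jacobi-basis idea can be sketched as evaluating the standard three-term Jacobi recurrence on the (normalized) adjacency matrix, yielding a stack of polynomial filter bases that a model can weight. The function name, signature, and default parameters below are illustrative, not the paper's exact design:

```python
import numpy as np

def jacobi_bases(A_hat, K, a=1.0, b=1.0):
    """Jacobi polynomial bases P_k^{(a,b)}(A_hat) for k = 0..K via the
    standard three-term recurrence. A_hat is the symmetric normalized
    adjacency (eigenvalues in [-1, 1]); a filtered signal would be
    sum_k theta_k * P[k] @ X for learned weights theta_k."""
    n = A_hat.shape[0]
    I = np.eye(n)
    P = [I]
    if K >= 1:
        P.append((a - b) / 2 * I + (a + b + 2) / 2 * A_hat)
    for k in range(2, K + 1):
        c0 = 2 * k * (k + a + b) * (2 * k + a + b - 2)
        c1 = (2 * k + a + b - 1) * (2 * k + a + b) * (2 * k + a + b - 2)
        c2 = (2 * k + a + b - 1) * (a * a - b * b)
        c3 = 2 * (k + a - 1) * (k + b - 1) * (2 * k + a + b)
        P.append(((c1 * A_hat + c2 * I) @ P[k - 1] - c3 * P[k - 2]) / c0)
    return P
```

With a = b = 0 the recurrence reduces to the Legendre polynomials, which gives a quick sanity check (P_2(x) = (3x^2 - 1)/2).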


Image Quality Is Not All You Want: Task-Driven Lens Design for Image Classification

May 26, 2023
Xinge Yang, Qiang Fu, Yunfeng Nie, Wolfgang Heidrich

Figures 1–4 for Image Quality Is Not All You Want: Task-Driven Lens Design for Image Classification

In computer vision, it has long been taken for granted that high-quality images obtained through well-designed camera lenses lead to superior results. However, we find that this common perception is not a "one-size-fits-all" solution for diverse computer vision tasks. We demonstrate that task-driven and deep-learned simple optics can actually deliver better visual task performance. The task-driven lens design approach, which relies solely on a well-trained network model for supervision, proves capable of designing lenses from scratch. Experimental results demonstrate that the designed image classification lens ("TaskLens") exhibits higher accuracy than conventional imaging-driven lenses, even with fewer lens elements. Furthermore, we show that our TaskLens is compatible with various network models while maintaining enhanced classification accuracy. We propose that TaskLens holds significant potential, particularly when physical dimensions and cost are severely constrained.

* Use an image classification network to supervise the lens design from scratch. The final designs can achieve higher accuracy with fewer optical elements 

Future-conditioned Unsupervised Pretraining for Decision Transformer

May 26, 2023
Zhihui Xie, Zichuan Lin, Deheng Ye, Qiang Fu, Wei Yang, Shuai Li

Figures 1–4 for Future-conditioned Unsupervised Pretraining for Decision Transformer

Recent research in offline reinforcement learning (RL) has demonstrated that return-conditioned supervised learning is a powerful paradigm for decision-making problems. While promising, return conditioning is limited to training data labeled with rewards and therefore faces challenges in learning from unsupervised data. In this work, we aim to utilize generalized future conditioning to enable efficient unsupervised pretraining from reward-free and sub-optimal offline data. We propose Pretrained Decision Transformer (PDT), a conceptually simple approach for unsupervised RL pretraining. PDT leverages future trajectory information as a privileged context to predict actions during training. The ability to make decisions based on both present and future factors enhances PDT's capability for generalization. Moreover, this feature can be easily incorporated into a return-conditioned framework for online finetuning by assigning return values to possible futures and sampling future embeddings based on their respective values. Empirically, PDT outperforms or performs on par with its supervised pretraining counterpart, especially when dealing with sub-optimal data. Further analysis reveals that PDT can extract diverse behaviors from offline data and controllably sample high-return behaviors via online finetuning. Code is available here.
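The "sampling future embeddings based on their respective values" step can be sketched as value-weighted sampling; the softmax form, temperature, and function name below are assumptions for illustration, not the paper's exact rule:

```python
import numpy as np

def sample_future(embeddings, values, temperature=1.0, seed=0):
    """Sample a future embedding with probability proportional to
    exp(value / temperature), so higher-return futures are drawn
    more often during online finetuning (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    logits = np.asarray(values, dtype=float) / temperature
    p = np.exp(logits - logits.max())  # numerically stable softmax
    p /= p.sum()
    idx = rng.choice(len(embeddings), p=p)
    return np.asarray(embeddings)[idx]
```

With a very large value gap, the high-return future is chosen essentially always; raising the temperature flattens the distribution toward uniform exploration.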

* 17 pages, 9 figures, ICML 2023 

Skill-Based Few-Shot Selection for In-Context Learning

May 23, 2023
Shengnan An, Bo Zhou, Zeqi Lin, Qiang Fu, Bei Chen, Nanning Zheng, Weizhu Chen, Jian-Guang Lou

Figures 1–4 for Skill-Based Few-Shot Selection for In-Context Learning

In-context learning is the paradigm that adapts large language models to downstream tasks by providing a few examples. Few-shot selection -- selecting appropriate examples for each test instance separately -- is important for in-context learning. In this paper, we propose Skill-KNN, a skill-based few-shot selection method for in-context learning. The key advantages of Skill-KNN are: (1) it addresses the problem that existing methods based on pre-trained embeddings can be easily biased by surface natural language features that are unimportant for the target task; (2) it does not require training or fine-tuning of any models, making it suitable for frequently expanding or changing example banks. The key insight is to optimize the inputs fed into the embedding model rather than tuning the model itself. Technically, Skill-KNN generates skill-based representations for each test case and candidate example via a pre-processing few-shot prompting step, thus eliminating unimportant surface features. Experimental results across four cross-domain semantic parsing tasks and four backbone models show that Skill-KNN significantly outperforms existing methods.
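The retrieval step is a plain k-nearest-neighbors search over the skill-based embeddings. The sketch below assumes the skill descriptions have already been produced by the few-shot prompting step and embedded (the vectors here are stand-ins for those embeddings):

```python
import numpy as np

def knn_select(test_skill_vec, bank_skill_vecs, k=4):
    """Return indices of the k bank examples whose skill-based
    embeddings are most cosine-similar to the test case's embedding."""
    q = test_skill_vec / np.linalg.norm(test_skill_vec)
    B = bank_skill_vecs / np.linalg.norm(bank_skill_vecs, axis=1, keepdims=True)
    sims = B @ q                      # cosine similarity to each bank example
    return np.argsort(-sims)[:k].tolist()  # top-k, most similar first
```

Because selection only re-embeds short skill descriptions, adding or removing bank examples requires no retraining.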

* 18 pages, 6 figures 

Causal-Based Supervision of Attention in Graph Neural Network: A Better and Simpler Choice towards Powerful Attention

May 22, 2023
Hongjun Wang, Jiyuan Chen, Lun Du, Qiang Fu, Shi Han, Xuan Song

Figures 1–4 for Causal-Based Supervision of Attention in Graph Neural Network: A Better and Simpler Choice towards Powerful Attention

In recent years, attention mechanisms have demonstrated significant potential in the field of graph representation learning. However, while variants of attention-based GNNs set new benchmarks on numerous real-world datasets, recent works have pointed out that their induced attention is less robust and generalizable on noisy graphs due to the lack of direct supervision. In this paper, we present a new framework that uses causality to provide a powerful supervision signal for the learning of attention functions. Specifically, we estimate the direct causal effect of attention on the final prediction and then maximize this effect to guide attention toward more meaningful neighbors. Our method can serve as a plug-and-play module for any canonical attention-based GNN in an end-to-end fashion. Extensive experiments on a wide range of benchmark datasets illustrate that, by directly supervising attention with our method, the model converges faster with a clearer decision boundary and thus yields better performance.


How Do In-Context Examples Affect Compositional Generalization?

May 08, 2023
Shengnan An, Zeqi Lin, Qiang Fu, Bei Chen, Nanning Zheng, Jian-Guang Lou, Dongmei Zhang

Figures 1–4 for How Do In-Context Examples Affect Compositional Generalization?

Compositional generalization--understanding unseen combinations of seen primitives--is an essential reasoning capability in human intelligence. The AI community mainly studies this capability by fine-tuning neural networks on many training samples, while it remains unclear whether and how in-context learning--the prevailing few-shot paradigm based on large language models--exhibits compositional generalization. In this paper, we present CoFe, a test suite for investigating in-context compositional generalization. We find that compositional generalization performance can be easily affected by the selection of in-context examples, raising the research question of what makes good in-context examples for compositional generalization. We study three potential factors: similarity, diversity, and complexity. Our systematic experiments indicate that in-context examples should be structurally similar to the test case, diverse from each other, and individually simple. Furthermore, we observe two strong limitations: in-context compositional generalization on fictional words is much weaker than on commonly used ones, and it remains critical that the in-context examples cover the required linguistic structures even though the backbone model has been pre-trained on large corpora. We hope our analysis will facilitate the understanding and utilization of the in-context learning paradigm.

* ACL 2023, long paper 