This paper presents $\mu\text{KG}$, an open-source Python library for representation learning over knowledge graphs. $\mu\text{KG}$ supports joint representation learning over multi-source knowledge graphs (and also a single knowledge graph), multiple deep learning libraries (PyTorch and TensorFlow2), multiple embedding tasks (link prediction, entity alignment, entity typing, and multi-source link prediction), and multiple parallel computing modes (multi-process and multi-GPU computing). It currently implements 26 popular knowledge graph embedding models and supports 16 benchmark datasets. $\mu\text{KG}$ provides advanced implementations of embedding techniques with simplified pipelines of different tasks. It also comes with high-quality documentation for ease of use. $\mu\text{KG}$ is more comprehensive than existing knowledge graph embedding libraries. It is useful for a thorough comparison and analysis of various embedding models and tasks. We show that the jointly learned embeddings can greatly help knowledge-powered downstream tasks, such as multi-hop knowledge graph question answering. We will stay abreast of the latest developments in the related fields and incorporate them into $\mu\text{KG}$.
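To make the flavor of the implemented embedding models concrete, the classic TransE scoring function (one of the popular knowledge graph embedding models such a library covers) can be sketched in a few lines. This is an illustrative toy, with made-up names and vectors; it does not reflect $\mu\text{KG}$'s actual API.

```python
import numpy as np

def transe_score(h, r, t):
    """TransE plausibility score: negative L2 norm of h + r - t.

    Higher (less negative) scores indicate a more plausible triple.
    h, r, t are embedding vectors for head entity, relation, and tail entity.
    """
    return -np.linalg.norm(h + r - t)

# Toy embeddings: the triple (h, r, t_true) is constructed to hold exactly.
h = np.array([1.0, 0.0])
r = np.array([0.0, 1.0])
t_true = np.array([1.0, 1.0])    # h + r == t_true
t_false = np.array([3.0, -2.0])  # an implausible tail

assert transe_score(h, r, t_true) > transe_score(h, r, t_false)
```

In link prediction, such a score is computed for every candidate tail entity and the candidates are ranked by it; other models in the library differ mainly in how this scoring function is defined.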
Temporal sentence grounding (TSG) is an important yet challenging task in multimedia information retrieval. Although previous TSG methods have achieved decent performance, they tend to capture the selection biases of frequently appearing video-query pairs in the dataset rather than exhibit robust multimodal reasoning abilities, especially for rarely appearing pairs. In this paper, we study this issue of selection biases and accordingly propose a Debiasing-TSG (D-TSG) model to filter and remove the negative biases in both the vision and language modalities, enhancing the model's generalization ability. Specifically, we propose to alleviate the issue from two perspectives: 1) Feature distillation. We build a multi-modal debiasing branch to first capture the vision and language biases, and then apply a bias identification module to explicitly recognize the true negative biases and remove them from the benign multi-modal representations. 2) Contrastive sample generation. We construct two types of negative samples to force the model to accurately learn the aligned multi-modal semantics and perform complete semantic reasoning. We apply the proposed model to both commonly and rarely appearing TSG cases, and demonstrate its effectiveness by achieving state-of-the-art performance on three benchmark datasets (ActivityNet Caption, TACoS, and Charades-STA).
This paper addresses the problem of natural language video localization (NLVL). Almost all existing works follow the "only look once" framework that exploits a single model to directly capture the complex cross- and self-modal relations among video-query pairs and retrieve the relevant segment. However, we argue that these methods have overlooked two indispensable characteristics of an ideal localization method: 1) Frame-differentiable: considering the imbalance of positive/negative video frames, it is effective to highlight positive frames and weaken negative ones during the localization. 2) Boundary-precise: to predict the exact segment boundary, the model should capture more fine-grained differences between consecutive frames since their variations are often smooth. To this end, inspired by how humans perceive and localize a segment, we propose a two-step human-like framework called Skimming-Locating-Perusing (SLP). SLP consists of a Skimming-and-Locating (SL) module and a Bi-directional Perusing (BP) module. The SL module first refers to the query semantics and selects the best-matched frame from the video while filtering out irrelevant frames. Then, the BP module constructs an initial segment based on this frame, and dynamically updates it by exploring adjacent frames until no frame shares the same activity semantics. Experimental results on three challenging benchmarks show that our SLP is superior to the state-of-the-art methods and localizes more precise segment boundaries.
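The locate-then-peruse procedure described above can be sketched as a simple expansion loop. This is a minimal toy version that assumes precomputed per-frame matching scores and consecutive-frame semantic similarities as inputs; it is not the authors' implementation, and the threshold is an illustrative stand-in for the learned semantic test.

```python
def skim_locate_peruse(frame_scores, sim, threshold=0.5):
    """Toy two-step localization.

    frame_scores[i]: query-frame matching score (skimming/locating step).
    sim[i]: semantic similarity between frame i and frame i+1 (perusing step).
    Returns the (start, end) inclusive indices of the localized segment.
    """
    # Locating: pick the frame that best matches the query.
    anchor = max(range(len(frame_scores)), key=frame_scores.__getitem__)
    start = end = anchor
    # Bi-directional perusing: grow the segment while consecutive
    # frames still share the same activity semantics.
    while start > 0 and sim[start - 1] >= threshold:
        start -= 1
    while end < len(frame_scores) - 1 and sim[end] >= threshold:
        end += 1
    return start, end

# Frame 2 matches best; frames 1-3 share the activity, frames 0 and 4 do not.
print(skim_locate_peruse([0.1, 0.2, 0.9, 0.3, 0.1], [0.2, 0.8, 0.7, 0.1]))
```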
With the increasing attention to various 3D safety-critical applications, point cloud learning models have been shown to be vulnerable to adversarial attacks. Although existing 3D attack methods achieve high success rates, they operate in the data space with point-wise perturbations, which may neglect geometric characteristics. Instead, we propose point cloud attacks from a new perspective -- the graph spectral domain attack -- aiming to perturb graph transform coefficients in the spectral domain, which corresponds to varying certain geometric structures. Specifically, leveraging graph signal processing, we first adaptively transform the coordinates of points into the spectral domain via the graph Fourier transform (GFT) for compact representation. Then, we analyze the influence of different spectral bands on the geometric structure, based on which we propose to perturb the GFT coefficients via a learnable graph spectral filter. Considering that the low-frequency components mainly contribute to the rough shape of the 3D object, we further introduce a low-frequency constraint to limit perturbations to imperceptible high-frequency components. Finally, the adversarial point cloud is generated by transforming the perturbed spectral representation back to the data domain via the inverse GFT. Experimental results demonstrate the effectiveness of the proposed attack in terms of both imperceptibility and attack success rate.
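The GFT pipeline (graph construction, forward transform, constrained spectral perturbation, inverse transform) can be sketched as follows. This is a simplified illustration under stated assumptions: a k-NN graph with unit edge weights, the combinatorial Laplacian eigenbasis as the GFT, and random noise standing in for the paper's learnable graph spectral filter.

```python
import numpy as np

def gft_perturb(points, k=4, n_low=3, eps=0.05, seed=0):
    """Sketch of a spectral-domain point cloud perturbation.

    points: (N, 3) coordinates. Builds a k-NN graph, computes the graph
    Laplacian eigenbasis, transforms the coordinates via the GFT, perturbs
    only high-frequency coefficients (the first n_low low-frequency ones
    are kept intact, mimicking the low-frequency constraint), and maps
    back via the inverse GFT.
    """
    rng = np.random.default_rng(seed)
    n = len(points)
    # Symmetric k-NN adjacency from pairwise Euclidean distances.
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, 1:k + 1]  # skip self (distance 0)
    w = np.zeros((n, n))
    for i in range(n):
        w[i, idx[i]] = 1.0
    w = np.maximum(w, w.T)
    lap = np.diag(w.sum(axis=1)) - w
    # eigh returns eigenvalues in ascending order, i.e. low to high frequency.
    _, u = np.linalg.eigh(lap)
    coeff = u.T @ points                   # forward GFT of coordinates
    noise = eps * rng.standard_normal(coeff.shape)
    noise[:n_low] = 0.0                    # low-frequency constraint
    return u @ (coeff + noise)             # inverse GFT
```

Because the eigenbasis is orthonormal, setting `eps=0` recovers the input exactly, and the perturbation energy is confined to the high-frequency bands by construction.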
Entity alignment is a basic and vital technique in knowledge graph (KG) integration. Over the years, research on entity alignment has rested on the assumption that KGs are static, which neglects the growing nature of real-world KGs. As KGs grow, previous alignment results need to be revisited while new entity alignments wait to be discovered. In this paper, we propose and dive into a realistic yet unexplored setting, referred to as continual entity alignment. To avoid retraining an entire model on the whole KGs whenever new entities and triples arrive, we present a continual alignment method for this task. It reconstructs an entity's representation based on entity adjacency, enabling it to generate embeddings for new entities quickly and inductively from their existing neighbors. It selects and replays partial pre-aligned entity pairs to train only parts of the KGs while extracting trustworthy alignments for knowledge augmentation. As growing KGs inevitably contain non-matchable entities, unlike previous works, the proposed method employs bidirectional nearest-neighbor matching to find new entity alignments and update old ones. Furthermore, we construct new datasets by simulating the growth of multilingual DBpedia. Extensive experiments demonstrate that our continual alignment method is more effective than baselines based on retraining or inductive learning.
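The bidirectional (mutual) nearest-neighbor matching step can be sketched as below. This is a minimal toy under the assumption of cosine similarity over entity embeddings; the function name is illustrative and the full method involves much more than this matching rule.

```python
import numpy as np

def bidirectional_nn_match(src_emb, tgt_emb):
    """Mutual nearest-neighbor matching between two embedding sets.

    A source entity i is aligned to target entity j only when j is the
    nearest target to i AND i is the nearest source to j; entities failing
    this mutual test stay unmatched, which naturally tolerates
    non-matchable entities. Similarity is cosine.
    """
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sim = src @ tgt.T
    s2t = sim.argmax(axis=1)  # nearest target for each source entity
    t2s = sim.argmax(axis=0)  # nearest source for each target entity
    return [(i, int(j)) for i, j in enumerate(s2t) if t2s[j] == i]
```

Entities left unmatched by the mutual test (e.g. a newly added entity with no counterpart yet) simply produce no pair, rather than being forced into a spurious alignment.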
Document-level relation extraction (RE) aims to identify the relations between entities throughout an entire document. This requires complex reasoning skills to synthesize various knowledge, such as coreference and commonsense. Large-scale knowledge graphs (KGs) contain a wealth of real-world facts and can provide valuable knowledge for document-level RE. In this paper, we propose an entity knowledge injection framework to enhance current document-level RE models. Specifically, we introduce coreference distillation to inject coreference knowledge, endowing an RE model with the more general capability of coreference reasoning. We also employ representation reconciliation to inject factual knowledge and aggregate KG representations and document representations into a unified space. Experiments on two benchmark datasets validate the generalization of our entity knowledge injection framework and its consistent improvements to several document-level RE models.
This paper is the first to provide a thorough system design overview, along with the fusion method selection criteria, of a real-world cooperative autonomous driving system, named Infrastructure-Augmented Autonomous Driving (IAAD). We present an in-depth introduction of the IAAD hardware and software on both the road-side and vehicle-side computing and communication platforms. We extensively characterize the IAAD system in the context of real-world deployment scenarios and observe that fluctuating network conditions along the road are currently the main technical roadblock for cooperative autonomous driving. To address this challenge, we propose new fusion methods, dubbed "inter-frame fusion" and "planning fusion", to complement the current state-of-the-art "intra-frame fusion". We demonstrate that each fusion method has its own benefits and constraints.
3D dynamic point clouds provide a discrete representation of real-world objects or scenes in motion, and have been widely applied in immersive telepresence, autonomous driving, surveillance, etc. However, point clouds acquired from sensors are usually perturbed by noise, which affects downstream tasks such as surface reconstruction and analysis. Although many efforts have been made for static point cloud denoising, dynamic point cloud denoising remains under-explored. In this paper, we propose a novel gradient-field-based dynamic point cloud denoising method, exploiting the temporal correspondence via the estimation of gradient fields -- a fundamental problem in dynamic point cloud processing and analysis. The gradient field is the gradient of the log-probability function of the noisy point cloud, based on which we perform gradient ascent so as to converge each point to the underlying clean surface. We estimate the gradient of each surface patch and exploit the temporal correspondence, where the temporally corresponding patches are searched by leveraging rigid motion from classical mechanics. In particular, we treat each patch as a rigid object that moves in the gradient field of an adjacent frame under force until reaching a balanced state, i.e., when the sum of gradients over the patch reaches 0. Since the gradient is smaller when a point is closer to the underlying surface, the balanced patch fits the underlying surface well, thus establishing the temporal correspondence. Finally, the position of each point in the patch is updated along the direction of the gradient averaged over corresponding patches in adjacent frames. Experimental results demonstrate that the proposed model outperforms state-of-the-art methods under both synthetic noise and simulated real-world noise.
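The core gradient-ascent update (move each point along the gradient of the log-probability toward the clean surface) can be sketched with a toy analytic gradient field. Here the "clean surface" is assumed to be the plane z = 0 so the field can be written in closed form; the actual method estimates the field from the noisy cloud and averages it across temporally corresponding patches.

```python
import numpy as np

def denoise_step(points, grad_fn, step=0.1):
    """One gradient-ascent denoising step: move each point along the
    gradient of the log-probability (score) of the underlying surface."""
    return points + step * grad_fn(points)

# Toy gradient field: log-probability -z^2/2 peaks on the plane z = 0,
# so its gradient (0, 0, -z) points back toward the plane.
def plane_grad(points):
    g = np.zeros_like(points)
    g[:, 2] = -points[:, 2]
    return g

noisy = np.array([[0.0, 0.0, 0.5], [1.0, 1.0, -0.3]])
x = noisy
for _ in range(50):
    x = denoise_step(x, plane_grad, step=0.2)
# After repeated steps, the points converge toward the plane z = 0
# while their in-plane coordinates are untouched.
```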
In the past few years, there has been dramatic growth in e-manga (electronic Japanese-style comics). Faced with the booming demand for manga research and the large amount of unlabeled manga data, we raise a new task, called unsupervised manga character re-identification. However, the artistic expression and stylistic limitations of manga pose many challenges to the re-identification problem. Inspired by the idea that some content-related features may help clustering, we propose a Face-body and Spatial-temporal Associated Clustering method (FSAC). In the face-body combination module, a face-body graph is constructed to address problems such as exaggeration and deformation in artistic creation by exploiting the integrity of the image. In the spatial-temporal relationship correction module, we analyze the appearance features of characters and design a temporal-spatial-related triplet loss to fine-tune the clustering. Extensive experiments on a manga book dataset with 109 volumes validate the superiority of our method in unsupervised manga character re-identification.
Airports play an important role in both military and civilian domains. Synthetic aperture radar (SAR) based airport detection has received increasing attention in recent years. However, due to the high cost of SAR imaging and the annotation process, there is no publicly available SAR dataset for airport detection. As a result, deep learning methods have not been fully exploited for airport detection tasks. To provide a benchmark for airport detection research in SAR images, this paper introduces a large-scale SAR Airport Dataset (SAD). In order to adequately reflect the demands of real-world applications, it contains 624 SAR images from Sentinel-1B and covers 104 airfield instances with different scales, orientations, and shapes. Experiments with multiple deep learning approaches on this dataset prove its effectiveness. We hope it will support the development of state-of-the-art airport area detection algorithms and other relevant tasks.