Yaxin Li

Exploring Memorization in Fine-tuned Language Models

Oct 10, 2023
Shenglai Zeng, Yaxin Li, Jie Ren, Yiding Liu, Han Xu, Pengfei He, Yue Xing, Shuaiqiang Wang, Jiliang Tang, Dawei Yin

Large language models (LLMs) have shown great capabilities in various tasks but also exhibit memorization of training data, raising serious privacy and copyright concerns. While prior work has studied memorization during pre-training, the exploration of memorization during fine-tuning is rather limited. Compared with pre-training, fine-tuning typically involves sensitive data and diverse objectives, and thus may bring unique memorization behaviors and distinct privacy risks. In this work, we conduct the first comprehensive analysis of LMs' memorization during fine-tuning across tasks. Our studies with open-sourced and our own fine-tuned LMs across various tasks indicate that fine-tuned memorization presents a strong disparity among tasks. We provide an understanding of this task disparity via sparse coding theory and unveil a strong correlation between memorization and attention score distribution. Our investigation of its memorization behavior suggests that multi-task fine-tuning is a potential strategy to mitigate fine-tuned memorization.

3D Reconstruction of Spherical Images based on Incremental Structure from Motion

Jun 24, 2023
San Jiang, Kan You, Yaxin Li, Duojie Weng, Wu Chen

3D reconstruction plays an increasingly important role in modern photogrammetric systems. Conventional satellite or aerial-based remote sensing (RS) platforms can provide the necessary data sources for the 3D reconstruction of large-scale landforms and cities. Even with low-altitude UAVs (Unmanned Aerial Vehicles), 3D reconstruction in complicated situations, such as urban canyons and indoor scenes, is challenging due to frequent tracking failures between camera frames and high data collection costs. Recently, spherical images have been extensively exploited due to their capability of recording the surrounding environment in one camera exposure. Classical 3D reconstruction pipelines, however, cannot be used for spherical images. Besides, few software packages exist for the 3D reconstruction of spherical images. Based on the imaging geometry of spherical cameras, this study investigates the algorithms for relative orientation using spherical correspondences, absolute orientation using 3D correspondences between scene and spherical points, and the cost functions for BA (bundle adjustment) optimization. In addition, an incremental SfM (Structure from Motion) workflow is proposed for spherical images using the above-mentioned algorithms. The proposed solution is finally verified using three spherical datasets captured by both consumer-grade and professional spherical cameras. The results demonstrate that the proposed SfM workflow can achieve successful 3D reconstruction of complex scenes and provide useful clues for implementation in open-source software packages. The source code of the designed SfM workflow will be made publicly available.
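As a rough illustration of what "imaging geometry of spherical cameras" can mean in practice, the sketch below maps a pixel to a unit bearing vector on the sphere and measures the angle between two bearings, which is one common form of residual for spherical images. It assumes an equirectangular image layout and a particular axis convention (y up, z forward); the function names are hypothetical, and the abstract does not state that the paper uses this projection or this cost function.

```python
import math

def pixel_to_bearing(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a unit bearing vector.

    Assumed convention (not taken from the paper): longitude grows with u,
    latitude decreases with v, y points up and z points forward.
    """
    lon = (u / width - 0.5) * 2.0 * math.pi   # range [-pi, pi)
    lat = (0.5 - v / height) * math.pi        # range [pi/2, -pi/2]
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)

def angular_residual(b1, b2):
    """Angle in radians between two unit bearing vectors."""
    dot = sum(a * b for a, b in zip(b1, b2))
    dot = max(-1.0, min(1.0, dot))            # guard against rounding outside [-1, 1]
    return math.acos(dot)

# The image center looks straight ahead; a quarter-width to the right is 90 degrees away.
center = pixel_to_bearing(1000, 500, 2000, 1000)
right = pixel_to_bearing(1500, 500, 2000, 1000)
print(center, angular_residual(center, right))
```

Working with bearing vectors rather than pixel coordinates avoids the distortion of the equirectangular grid near the poles, which is one reason an angular residual is often preferred over a pixel-space reprojection error in this setting.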

Biologically inspired structure learning with reverse knowledge distillation for spiking neural networks

Apr 19, 2023
Qi Xu, Yaxin Li, Xuanye Fang, Jiangrong Shen, Jian K. Liu, Huajin Tang, Gang Pan

Spiking neural networks (SNNs) have superb characteristics in sensory information recognition tasks due to their biological plausibility. However, the performance of some current spiking-based models is limited by their structures: both fully connected and overly deep structures introduce too much redundancy. This redundancy in both connections and neurons is one of the key factors hindering the practical application of SNNs. Although some pruning methods have been proposed to tackle this problem, they normally ignore the fact that the neural topology in the human brain can be adjusted dynamically. Inspired by this, this paper proposes an evolutionary-based structure construction method for constructing more reasonable SNNs. By integrating knowledge distillation and connection pruning, the synaptic connections in SNNs can be optimized dynamically to reach an optimal state. As a result, the structure of SNNs can not only absorb knowledge from the teacher model but also search for a deep but sparse network topology. Experimental results on CIFAR100 and DVS-Gesture show that the proposed structure learning method achieves good performance while reducing connection redundancy. The proposed method explores a novel dynamical way of structure learning from scratch in SNNs, which could help close the gap between deep learning and bio-inspired neural dynamics.

Constructing Deep Spiking Neural Networks from Artificial Neural Networks with Knowledge Distillation

Apr 17, 2023
Qi Xu, Yaxin Li, Jiangrong Shen, Jian K Liu, Huajin Tang, Gang Pan

Spiking neural networks (SNNs) are well known as brain-inspired models with high computing efficiency, because they use spikes as information units, close to biological neural systems. Although spiking-based models are energy efficient by taking advantage of discrete spike signals, their performance is limited by current network structures and training methods. Because their signals are discrete, typical SNNs cannot apply gradient descent rules directly to parameter adjustment as artificial neural networks (ANNs) do. To address this limitation, we propose a novel method of constructing deep SNN models with knowledge distillation (KD) that uses an ANN as the teacher model and an SNN as the student model. Through an ANN-SNN joint training algorithm, the student SNN model can learn rich feature information from the teacher ANN model through the KD method, while avoiding training the SNN from scratch when communicating with non-differentiable spikes. Our method can not only build a more efficient deep spiking structure feasibly and reasonably, but also use fewer time steps to train the whole model compared to direct training or ANN-to-SNN methods. More importantly, it shows strong noise immunity for various types of artificial noise and natural signals. The proposed method provides efficient ways to improve the performance of SNNs by constructing deeper structures in a high-throughput fashion, with potential use in light and efficient brain-inspired computing for practical scenarios.
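To make the teacher-student idea concrete, here is a minimal sketch of a generic temperature-scaled distillation loss. It is the standard textbook formulation, not the paper's ANN-SNN joint training algorithm: the function names, the temperature and alpha defaults, and the assumption that the student SNN exposes real-valued output logits (for example, averaged over time steps) are all illustrative choices not stated in the abstract.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """alpha * cross-entropy(student, label) + (1 - alpha) * T^2 * KL(teacher || student).

    student_logits are assumed (hypothetically) to be real-valued outputs
    of the student network; both hyperparameters are illustrative.
    """
    # Hard-label term: ordinary cross-entropy at temperature 1.
    hard = -math.log(softmax(student_logits)[label])
    # Soft-label term: KL divergence between softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft

# A student that disagrees with the teacher is penalized more than one that matches it.
print(distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0], label=2))
print(distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0], label=2))
```

The factor of temperature squared keeps the gradient magnitude of the soft term comparable to the hard term when the temperature changes, which is the usual reason for including it.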

3D reconstruction of spherical images: A review of techniques, applications, and prospects

Feb 09, 2023
San Jiang, Yaxin Li, Duojie Weng, Kan You, Wu Chen

3D reconstruction plays an increasingly important role in modern photogrammetric systems. Conventional satellite or aerial-based remote sensing (RS) platforms can provide the necessary data sources for the 3D reconstruction of large-scale landforms and cities. Even with low-altitude UAVs (Unmanned Aerial Vehicles), 3D reconstruction in complicated situations, such as urban canyons and indoor scenes, is challenging due to frequent tracking failures between camera frames and high data collection costs. Recently, spherical images have been extensively used due to their capability of recording the surrounding environment in one camera exposure. In contrast to perspective images with limited FOV (Field of View), spherical images can cover the whole scene with full horizontal and vertical FOV and facilitate camera tracking and data acquisition in these complex scenes. With the rapid evolution and extensive use of professional and consumer-grade spherical cameras, spherical images show great potential for the 3D modeling of urban and indoor scenes. Classical 3D reconstruction pipelines, however, cannot be directly used for spherical images. Besides, few software packages are designed for the 3D reconstruction of spherical images. To this end, this research provides a thorough survey of the state of the art for 3D reconstruction of spherical images in terms of data acquisition, feature detection and matching, image orientation, and dense matching. It also presents promising applications and discusses potential prospects. We anticipate that this study offers insightful clues to direct future research.

Enhancing Adversarial Training with Feature Separability

May 02, 2022
Yaxin Li, Xiaorui Liu, Han Xu, Wentao Wang, Jiliang Tang

Deep neural networks (DNNs) are vulnerable to adversarial attacks. As a countermeasure, adversarial training aims to achieve robustness based on the min-max optimization problem, and it has been shown to be one of the most effective defense strategies. However, in this work, we found that compared with natural training, adversarial training fails to learn better feature representations for either clean or adversarial samples, which can be one reason why adversarial training tends to have severe overfitting issues and less satisfactory generalization performance. Specifically, we observe two major shortcomings of the features learned by existing adversarial training methods: (1) low intra-class feature similarity; and (2) conservative inter-class feature variance. To overcome these shortcomings, we introduce a new concept of adversarial training graph (ATG), with which the proposed adversarial training with feature separability (ATFS) is able to coherently boost the intra-class feature similarity and increase inter-class feature variance. Through comprehensive experiments, we demonstrate that the proposed ATFS framework significantly improves both clean and robust performance.

* 10 pages 

Trustworthy AI: A Computational Perspective

Aug 02, 2021
Haochen Liu, Yiqi Wang, Wenqi Fan, Xiaorui Liu, Yaxin Li, Shaili Jain, Yunhao Liu, Anil K. Jain, Jiliang Tang

In the past few decades, artificial intelligence (AI) technology has experienced swift developments, changing everyone's daily life and profoundly altering the course of human society. The intention of developing AI is to benefit humans, by reducing human labor, bringing everyday convenience to human lives, and promoting social good. However, recent research and AI applications show that AI can cause unintentional harm to humans, such as making unreliable decisions in safety-critical scenarios or undermining fairness by inadvertently discriminating against one group. Thus, trustworthy AI has attracted immense attention recently, which requires careful consideration to avoid the adverse effects that AI may bring to humans, so that humans can fully trust and live in harmony with AI technologies. Recent years have witnessed a tremendous amount of research on trustworthy AI. Here, we present a comprehensive survey of trustworthy AI from a computational perspective, to help readers understand the latest technologies for achieving trustworthy AI. Trustworthy AI is a large and complex area, involving various dimensions. In this work, we focus on six of the most crucial dimensions in achieving trustworthy AI: (i) Safety & Robustness, (ii) Non-discrimination & Fairness, (iii) Explainability, (iv) Privacy, (v) Accountability & Auditability, and (vi) Environmental Well-Being. For each dimension, we review the recent related technologies according to a taxonomy and summarize their applications in real-world systems. We also discuss the accordant and conflicting interactions among different dimensions and outline potential aspects of trustworthy AI to investigate in the future.

* 55 pages 

Wood-leaf classification of tree point cloud based on intensity and geometrical information

Aug 02, 2021
Jingqian Sun, Pei Wang, Zhiyong Gao, Zichu Liu, Yaxin Li, Xiaozheng Gan

Terrestrial laser scanning (TLS) can obtain tree point clouds with high precision and high density. Efficient classification of wood points and leaf points is essential for studying tree structural parameters and ecological characteristics. By using both intensity and spatial information, a three-step classification and verification method is proposed to achieve automated wood-leaf classification. The tree point cloud is classified into wood points and leaf points by applying an intensity threshold, neighborhood density, and voxelization successively. An experiment was carried out in Haidian Park, Beijing, where 24 trees were scanned using the RIEGL VZ-400 scanner. The tree point clouds were processed using the proposed method, and the classification results were compared with manual classification results, which served as the reference standard. To evaluate the classification accuracy, three indicators were used in the experiment: Overall Accuracy (OA), Kappa coefficient (Kappa), and Matthews correlation coefficient (MCC). The OA, Kappa, and MCC of the proposed method range from 0.9167 to 0.9872, from 0.7276 to 0.9191, and from 0.7544 to 0.9211, respectively. The average values of OA, Kappa, and MCC are 0.9550, 0.8547, and 0.8627, respectively. The time cost of wood-leaf classification was also recorded to evaluate the algorithm's efficiency. The average processing time is 1.4 seconds per million points. The results show that the proposed method performs well, automatically and quickly, on wood-leaf classification for the experimental dataset.
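For readers unfamiliar with the three indicators, the sketch below computes OA, Kappa, and MCC from a binary confusion matrix using their standard definitions. The function name, the choice of "wood" as the positive class, and the example counts are hypothetical and are not taken from the paper's data.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Return (OA, Kappa, MCC) for a two-class confusion matrix.

    tp/fn: wood points labeled wood/leaf; fp/tn: leaf points labeled wood/leaf.
    Treating wood as the positive class is an illustrative assumption.
    """
    n = tp + fp + fn + tn
    oa = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (oa - pe) / (1 - pe) if pe < 1 else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return oa, kappa, mcc

# Made-up counts for illustration only.
oa, kappa, mcc = binary_metrics(tp=40, fp=10, fn=5, tn=45)
print(round(oa, 4), round(kappa, 4), round(mcc, 4))
```

Kappa and MCC both correct for class imbalance in ways that OA does not, which is relevant for tree point clouds where one class can contain far more points than the other.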

Imbalanced Adversarial Training with Reweighting

Jul 28, 2021
Wentao Wang, Han Xu, Xiaorui Liu, Yaxin Li, Bhavani Thuraisingham, Jiliang Tang

Adversarial training has been empirically proven to be one of the most effective and reliable defense methods against adversarial attacks. However, almost all existing studies of adversarial training focus on balanced datasets, where each class has an equal number of training examples. Research on adversarial training with imbalanced training datasets is rather limited. As an initial effort to investigate this problem, we show that adversarially trained models exhibit two behaviors distinct from naturally trained models on imbalanced datasets: (1) Compared to natural training, adversarially trained models can suffer much worse performance on under-represented classes when the training dataset is extremely imbalanced. (2) Traditional reweighting strategies may lose efficacy in dealing with the imbalance issue for adversarial training. For example, upweighting the under-represented classes will drastically hurt the model's performance on well-represented classes, and as a result, finding an optimal reweighting value can be tremendously challenging. In this paper, to further understand our observations, we theoretically show that poor data separability is one key reason for this strong tension between under-represented and well-represented classes. Motivated by this finding, we propose Separable Reweighted Adversarial Training (SRAT) to facilitate adversarial training under imbalanced scenarios by learning more separable features for different classes. Extensive experiments on various datasets verify the effectiveness of the proposed framework.
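As context for what a "traditional reweighting strategy" looks like, the sketch below shows inverse-frequency class weights applied to per-example losses. This is the kind of baseline the abstract says may lose efficacy under adversarial training, not the proposed SRAT method; the function names and the specific weighting formula are assumptions chosen for illustration.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by n / (k * count), so rare classes get larger weights.

    n is the number of examples and k the number of distinct classes.
    This particular formula is an illustrative choice, not from the paper.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}

def weighted_mean_loss(losses, labels, weights):
    """Average per-example losses using the weight of each example's class."""
    total_weight = sum(weights[y] for y in labels)
    return sum(weights[y] * l for l, y in zip(losses, labels)) / total_weight

# A 90/10 imbalanced toy dataset: the minority class is upweighted nine-fold.
labels = [0] * 90 + [1] * 10
weights = inverse_frequency_weights(labels)
print(weights)
```

With these weights each class contributes the same total mass to the loss, which is exactly the upweighting of under-represented classes that the abstract reports can hurt performance on well-represented classes.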
