Wei Hu

State Key Laboratory for Novel Software Technology, Nanjing University

3DHacker: Spectrum-based Decision Boundary Generation for Hard-label 3D Point Cloud Attack

Aug 15, 2023

Yunbo Tao, Daizong Liu, Pan Zhou, Yulai Xie, Wei Du, Wei Hu

Abstract: With the maturity of depth sensors, the vulnerability of 3D point cloud models has received increasing attention in various applications such as autonomous driving and robot navigation. Previous 3D adversarial attackers either follow the white-box setting to iteratively update the coordinate perturbations based on gradients, or utilize the output model logits to estimate noisy gradients in the black-box setting. However, these attack methods are hard to deploy in real-world scenarios, since realistic 3D applications do not share any model details with users. Therefore, we explore a more challenging yet practical 3D attack setting, i.e., attacking point clouds with black-box hard labels, in which the attacker only has access to the predicted label of the input. To tackle this setting, we propose a novel 3D attack method, termed 3D Hard-label attacker (3DHacker), based on the developed decision boundary algorithm to generate adversarial samples solely with the knowledge of class labels. Specifically, to construct the class-aware model decision boundary, 3DHacker first randomly fuses two point clouds of different classes in the spectral domain to craft their intermediate sample with high imperceptibility, then projects it onto the decision boundary via binary search. To restrict the final perturbation size, 3DHacker further introduces an iterative optimization strategy to move the intermediate sample along the decision boundary, generating adversarial point clouds with the smallest trivial perturbations. Extensive evaluations show that, even in the challenging hard-label setting, 3DHacker still competitively outperforms existing 3D attacks in both attack performance and adversary quality.

* Accepted by ICCV 2023
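
The binary-search projection mentioned in the abstract can be illustrated with a generic hard-label sketch. This is not the paper's implementation: the spectral fusion step is omitted, the blend is a plain coordinate-wise interpolation, and `toy_predict`, `interpolate`, and `project_to_boundary` are hypothetical names introduced here. The only assumption about the model is that it returns a class label.

```python
from typing import Callable, List, Tuple

Cloud = List[List[float]]


def interpolate(a: Cloud, b: Cloud, t: float) -> Cloud:
    """Point-wise blend (1 - t) * a + t * b of two equally sized clouds."""
    return [[(1 - t) * pa + t * pb for pa, pb in zip(p, q)] for p, q in zip(a, b)]


def project_to_boundary(
    source: Cloud,
    other: Cloud,
    predict: Callable[[Cloud], int],
    tol: float = 1e-4,
) -> Tuple[Cloud, float]:
    """Binary search for the smallest blend weight that flips the label.

    Only hard labels from `predict` are used; no logits or gradients.
    """
    src_label = predict(source)
    if predict(other) == src_label:
        raise ValueError("the two clouds must receive different labels")
    # Invariant: the blend at `lo` keeps the source label, the blend at `hi` does not.
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if predict(interpolate(source, other, mid)) == src_label:
            lo = mid
        else:
            hi = mid
    return interpolate(source, other, hi), hi


def toy_predict(cloud: Cloud) -> int:
    """Hypothetical stand-in classifier: label 1 if the mean x-coordinate is positive."""
    mean_x = sum(p[0] for p in cloud) / len(cloud)
    return 1 if mean_x > 0 else 0


source = [[-1.0, 0.0, 0.0], [-3.0, 1.0, 0.0]]
other = [[2.0, 0.0, 0.0], [4.0, 1.0, 0.0]]
boundary_cloud, t = project_to_boundary(source, other, toy_predict)
```

In this toy setup the mean x-coordinate of the blend is -2 + 5t, so the search settles just above t = 0.4, on the adversarial side of the boundary.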

Going Beyond Linear Mode Connectivity: The Layerwise Linear Feature Connectivity

Jul 17, 2023

Zhanpeng Zhou, Yongyi Yang, Xiaojiang Yang, Junchi Yan, Wei Hu

Abstract: Recent work has revealed many intriguing empirical phenomena in neural network training, despite the poorly understood and highly complex loss landscapes and training dynamics. One of these phenomena, Linear Mode Connectivity (LMC), has gained considerable attention due to the intriguing observation that different solutions can be connected by a linear path in the parameter space while maintaining near-constant training and test losses. In this work, we introduce a stronger notion of linear connectivity, Layerwise Linear Feature Connectivity (LLFC), which says that the feature maps of every layer in different trained networks are also linearly connected. We provide comprehensive empirical evidence for LLFC across a wide range of settings, demonstrating that whenever two trained networks satisfy LMC (via either spawning or permutation methods), they also satisfy LLFC in nearly all the layers. Furthermore, we delve deeper into the underlying factors contributing to LLFC, which reveal new insights into the spawning and permutation approaches. The study of LLFC transcends and advances our understanding of LMC by adopting a feature-learning perspective.

* 25 pages, 23 figures
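
One way to make "feature maps are linearly connected" concrete is to compare the features of a weight-interpolated layer with the interpolation of the two networks' features. The sketch below is an illustrative diagnostic under assumptions of ours, not the paper's protocol: it uses a single ReLU layer, hand-picked weights, and cosine similarity as the measure, and all function names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def relu_layer(weights: Matrix, x: Vector) -> Vector:
    """Features of one bias-free ReLU layer."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]


def mix(a: Matrix, b: Matrix, alpha: float) -> Matrix:
    """Element-wise alpha * a + (1 - alpha) * b."""
    return [[alpha * u + (1 - alpha) * v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def cosine(u: Vector, v: Vector) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def llfc_cosine(w_a: Matrix, w_b: Matrix, x: Vector, alpha: float) -> float:
    """Cosine between features of the mixed weights and the mixed features."""
    feat_of_mixed_weights = relu_layer(mix(w_a, w_b, alpha), x)
    feat_a, feat_b = relu_layer(w_a, x), relu_layer(w_b, x)
    mixed_feats = [alpha * p + (1 - alpha) * q for p, q in zip(feat_a, feat_b)]
    return cosine(feat_of_mixed_weights, mixed_feats)


# Hypothetical weights chosen so that every pre-activation is positive.
w_a = [[1.0, 0.5], [0.2, 1.0]]
w_b = [[0.8, 0.7], [0.4, 0.9]]
x = [1.0, 2.0]
score = llfc_cosine(w_a, w_b, x, alpha=0.5)
```

Because every pre-activation is positive here, the ReLU acts linearly and the score should be numerically 1. For trained deep networks the value has to be measured layer by layer on real data.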

IR Design for Application-Specific Natural Language: A Case Study on Traffic Data

Jul 13, 2023

Wei Hu, Xuhong Wang, Ding Wang, Shengyue Yao, Zuqiu Mao, Li Li, Fei-Yue Wang, Yilun Lin

Abstract: In the realm of software applications in the transportation industry, Domain-Specific Languages (DSLs) have enjoyed widespread adoption due to their ease of use and various other benefits. With the ceaseless progress in computer performance and the rapid development of large-scale models, the possibility of programming using natural language in specified applications, referred to as Application-Specific Natural Language (ASNL), has emerged. ASNL exhibits greater flexibility and freedom, which, in turn, leads to an increase in computational complexity for parsing and a decrease in processing performance. To tackle this issue, our paper advances a design for an intermediate representation (IR) that caters to ASNL and can uniformly process transportation data into graph data format, improving data processing performance. Experimental comparisons reveal that in standard data query operations, our proposed IR design can achieve a speed improvement of over forty times compared to direct usage of standard XML format data.

Are Neurons Actually Collapsed? On the Fine-Grained Structure in Neural Representations

Jun 29, 2023

Yongyi Yang, Jacob Steinhardt, Wei Hu

Abstract: Recent work has observed an intriguing "Neural Collapse" phenomenon in well-trained neural networks, where the last-layer representations of training samples with the same label collapse into each other. This appears to suggest that the last-layer representations are completely determined by the labels, and do not depend on the intrinsic structure of the input distribution. We provide evidence that this is not a complete description, and that the apparent collapse hides important fine-grained structure in the representations. Specifically, even when representations apparently collapse, the small amount of remaining variation can still faithfully and accurately capture the intrinsic structure of the input distribution. As an example, if we train on CIFAR-10 using only 5 coarse-grained labels (by combining two classes into one super-class) until convergence, we can reconstruct the original 10-class labels from the learned representations via unsupervised clustering. The reconstructed labels achieve $93\%$ accuracy on the CIFAR-10 test set, nearly matching the normal CIFAR-10 accuracy for the same architecture. We also provide an initial theoretical result showing the fine-grained representation structure in a simplified synthetic setting. Our results show concretely how the structure of input data can play a significant role in determining the fine-grained structure of neural representations, going beyond what Neural Collapse predicts.

* This paper has been accepted as a conference paper at ICML 2023
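
The idea of recovering fine-grained labels from nearly collapsed representations can be shown with a toy example. The sketch below runs a minimal 2-means clustering inside one super-class. The representation values are made up for illustration, and this is not the clustering procedure used in the paper; `two_means` and `sq_dist` are hypothetical helpers.

```python
from typing import List

Vector = List[float]


def sq_dist(u: Vector, v: Vector) -> float:
    return sum((a - b) ** 2 for a, b in zip(u, v))


def two_means(points: List[Vector], iters: int = 20) -> List[int]:
    """Minimal 2-means with a deterministic start: the first point and the
    point farthest from it serve as the initial centers."""
    c0 = list(points[0])
    c1 = list(max(points, key=lambda p: sq_dist(p, c0)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [0 if sq_dist(p, c0) <= sq_dist(p, c1) else 1 for p in points]
        members0 = [p for p, lab in zip(points, labels) if lab == 0]
        members1 = [p for p, lab in zip(points, labels) if lab == 1]
        if members0:
            c0 = [sum(col) / len(members0) for col in zip(*members0)]
        if members1:
            c1 = [sum(col) / len(members1) for col in zip(*members1)]
    return labels


# Made-up last-layer representations of one super-class: nearly collapsed
# around (1, 0), with a small residual offset separating two original classes.
super_class_reps = [
    [1.00, 0.00],
    [1.01, 0.00],
    [1.00, 0.05],
    [1.01, 0.05],
]
recovered = two_means(super_class_reps)
```

Here the residual variation along the second coordinate is enough for the clustering to split the super-class back into its two original groups.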

Joint Pre-training and Local Re-training: Transferable Representation Learning on Multi-source Knowledge Graphs

Jun 05, 2023

Zequn Sun, Jiacheng Huang, Jinghao Lin, Xiaozhou Xu, Qijin Chen, Wei Hu

Abstract: In this paper, we present the "joint pre-training and local re-training" framework for learning and applying multi-source knowledge graph (KG) embeddings. We are motivated by the fact that different KGs contain complementary information to improve KG embeddings and downstream tasks. We pre-train a large teacher KG embedding model over linked multi-source KGs and distill knowledge to train a student model for a task-specific KG. To enable knowledge transfer across different KGs, we use entity alignment to build a linked subgraph for connecting the pre-trained KGs and the target KG. The linked subgraph is re-trained for three-level knowledge distillation from the teacher to the student, i.e., feature knowledge distillation, network knowledge distillation, and prediction knowledge distillation, to generate more expressive embeddings. The teacher model can be reused for different target KGs and tasks without having to train from scratch. We conduct extensive experiments to demonstrate the effectiveness and efficiency of our framework.

* Accepted in the 29th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2023)
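
The abstract does not spell out the distillation losses. As a rough illustration of what a prediction-level distillation term can look like, the sketch below computes the standard temperature-softened KL divergence between teacher and student scores. This generic formulation is an assumption of ours, not the paper's definition, and the candidate scores are made up.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax of logits / temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_kl(
    teacher_logits: List[float],
    student_logits: List[float],
    temperature: float = 2.0,
) -> float:
    """KL(teacher || student) between temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Made-up scores over three candidate entities for one query.
teacher_scores = [2.0, 1.0, 0.0]
student_scores = [1.0, 1.5, 0.2]
loss = distillation_kl(teacher_scores, student_scores)
```

The loss is zero when the student reproduces the teacher's softened distribution and grows as the two diverge.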

What Makes Entities Similar? A Similarity Flooding Perspective for Multi-sourced Knowledge Graph Embeddings

Jun 05, 2023

Zequn Sun, Jiacheng Huang, Xiaozhou Xu, Qijin Chen, Weijun Ren, Wei Hu

Abstract: Joint representation learning over multi-sourced knowledge graphs (KGs) yields transferable and expressive embeddings that improve downstream tasks. Entity alignment (EA) is a critical step in this process. Despite recent considerable research progress in embedding-based EA, how it works remains to be explored. In this paper, we provide a similarity flooding perspective to explain existing translation-based and aggregation-based EA models. We prove that the embedding learning process of these models actually seeks a fixpoint of pairwise similarities between entities. We also provide experimental evidence to support our theoretical analysis. We propose two simple but effective methods inspired by the fixpoint computation in similarity flooding, and demonstrate their effectiveness on benchmark datasets. Our work bridges the gap between recent embedding-based models and the conventional similarity flooding algorithm. It would improve our understanding of and increase our faith in embedding-based EA.

* Accepted in the 40th International Conference on Machine Learning (ICML 2023)
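
For readers unfamiliar with similarity flooding, the fixpoint computation can be sketched on two tiny graphs. This is a simplified variant written for illustration: the uniform averaging over neighbor pairs and the max-normalization are assumptions of this sketch, and it is neither the original algorithm's exact propagation scheme nor either of the methods proposed in the paper.

```python
from itertools import product
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def similarity_flooding(
    neigh1: Dict[str, List[str]],
    neigh2: Dict[str, List[str]],
    seed: Dict[Pair, float],
    tol: float = 1e-9,
    max_iters: int = 100,
) -> Dict[Pair, float]:
    """Iterate sigma <- normalize(sigma0 + flow from neighbor pairs) until
    the largest change drops below tol (or max_iters is reached)."""
    pairs = list(product(neigh1, neigh2))
    sigma0 = {p: seed.get(p, 0.0) for p in pairs}
    sigma = dict(sigma0)
    for _ in range(max_iters):
        new = {}
        for a, b in pairs:
            nbr_pairs = list(product(neigh1[a], neigh2[b]))
            flow = sum(sigma[q] for q in nbr_pairs) / len(nbr_pairs) if nbr_pairs else 0.0
            new[(a, b)] = sigma0[(a, b)] + flow
        top = max(new.values())
        if top > 0:
            new = {p: v / top for p, v in new.items()}
        delta = max(abs(new[p] - sigma[p]) for p in pairs)
        sigma = new
        if delta < tol:
            break
    return sigma


# Two one-edge graphs with a single seed alignment (a1, b1).
graph1 = {"a1": ["a2"], "a2": ["a1"]}
graph2 = {"b1": ["b2"], "b2": ["b1"]}
sigma = similarity_flooding(graph1, graph2, seed={("a1", "b1"): 1.0})
```

In this toy case similarity flows from the seed pair to the pair of its neighbors, so (a2, b2) ends up with a positive score while the crossed pair (a2, b1) stays at zero.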

The Law of Parsimony in Gradient Descent for Learning Deep Linear Networks

Jun 01, 2023

Can Yaras, Peng Wang, Wei Hu, Zhihui Zhu, Laura Balzano, Qing Qu

Abstract: Over the past few years, an extensively studied phenomenon in training deep networks is the implicit bias of gradient descent towards parsimonious solutions. In this work, we investigate this phenomenon by narrowing our focus to deep linear networks. Through our analysis, we reveal a surprising "law of parsimony" in the learning dynamics when the data possesses low-dimensional structures. Specifically, we show that the evolution of gradient descent starting from orthogonal initialization only affects a minimal portion of singular vector spaces across all weight matrices. In other words, the learning process happens only within a small invariant subspace of each weight matrix, despite the fact that all weight parameters are updated throughout training. This simplicity in learning dynamics could have significant implications for both efficient training and a better understanding of deep networks. First, the analysis enables us to considerably improve training efficiency by taking advantage of the low-dimensional structure in learning dynamics. We can construct smaller, equivalent deep linear networks without sacrificing the benefits associated with the wider counterparts. Second, it allows us to better understand deep representation learning by elucidating the linear progressive separation and concentration of representations from shallow to deep layers. We also conduct numerical experiments to support our theoretical results. The code for our experiments can be found at https://github.com/cjyaras/lawofparsimony.

* The first two authors contributed to this work equally; 32 pages, 12 figures

Robust Sparse Mean Estimation via Incremental Learning

May 24, 2023

Jianhao Ma, Rui Ray Chen, Yinghui He, Salar Fattahi, Wei Hu

Abstract: In this paper, we study the problem of robust sparse mean estimation, where the goal is to estimate a $k$-sparse mean from a collection of partially corrupted samples drawn from a heavy-tailed distribution. Existing estimators face two critical challenges in this setting. First, they are limited by a conjectured computational-statistical tradeoff, implying that any computationally efficient algorithm needs $\tilde\Omega(k^2)$ samples, while its statistically-optimal counterpart only requires $\tilde O(k)$ samples. Second, the existing estimators fall short of practical use as they scale poorly with the ambient dimension. This paper presents a simple mean estimator that overcomes both challenges under moderate conditions: it runs in near-linear time and memory (both with respect to the ambient dimension) while requiring only $\tilde O(k)$ samples to recover the true mean. At the core of our method lies an incremental learning phenomenon: we introduce a simple nonconvex framework that can incrementally learn the top-$k$ nonzero elements of the mean while keeping the zero elements arbitrarily small. Unlike existing estimators, our method does not need any prior knowledge of the sparsity level $k$. We prove the optimality of our estimator by providing a matching information-theoretic lower bound. Finally, we conduct a series of simulations to corroborate our theoretical findings. Our code is available at https://github.com/huihui0902/Robust_mean_estimation.

Serial Contrastive Knowledge Distillation for Continual Few-shot Relation Extraction

May 11, 2023

Xinyi Wang, Zitao Wang, Wei Hu

Abstract: Continual few-shot relation extraction (RE) aims to continuously train a model for new relations with few labeled training data, of which the major challenges are the catastrophic forgetting of old relations and the overfitting caused by data sparsity. In this paper, we propose a new model, namely SCKD, to accomplish the continual few-shot RE task. Specifically, we design serial knowledge distillation to preserve the prior knowledge from previous models and conduct contrastive learning with pseudo samples to keep the representations of samples in different relations sufficiently distinguishable. Our experiments on two benchmark datasets validate the effectiveness of SCKD for continual few-shot RE and its superiority in knowledge transfer and memory utilization over state-of-the-art models.

* Accepted in the Findings of ACL 2023

Using a Bayesian-Inference Approach to Calibrating Models for Simulation in Robotics

May 11, 2023

Huzaifa Mustafa Unjhawala, Ruochun Zhang, Wei Hu, Jinlong Wu, Radu Serban, Dan Negrut

Abstract: In robotics, simulation has the potential to reduce design time and costs, and lead to a more robust engineered solution and a safer development process. However, the use of simulators is predicated on the availability of good models. This contribution is concerned with improving the quality of these models via calibration, which is cast herein in a Bayesian framework. First, we discuss the Bayesian machinery involved in model calibration. Then, we demonstrate it in one example: calibration of a vehicle dynamics model that has a low degree-of-freedom count and can be used for state estimation, model predictive control, or path planning. A high fidelity simulator is used to emulate the "experiments" and generate the data for the calibration. The merit of this work is not tied to a new Bayesian methodology for calibration, but to the demonstration of how the Bayesian machinery can establish connections among models in computational dynamics, even when the data in use is noisy. The software used to generate the results reported herein is available in a public repository for unfettered use and distribution.

* 061004-18 / Vol. 18, JUNE 2023
* 19 pages, 42 figures
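
To make the Bayesian calibration idea concrete, here is a deliberately small sketch: a one-parameter linear model calibrated against noisy observations by evaluating the posterior on a grid. The model `y = theta * x`, the data, the flat prior, and the Gaussian noise level are all assumptions chosen for illustration. The paper calibrates a vehicle dynamics model, and its inference procedure is not described in the abstract.

```python
import math
from typing import List


def grid_posterior(
    xs: List[float],
    ys: List[float],
    thetas: List[float],
    noise_std: float,
) -> List[float]:
    """Posterior over theta on a grid for y = theta * x + Gaussian noise,
    under a flat prior over the grid values."""
    log_post = []
    for theta in thetas:
        sse = sum((y - theta * x) ** 2 for x, y in zip(xs, ys))
        # A flat prior only adds a constant, so the log-likelihood suffices.
        log_post.append(-0.5 * sse / noise_std ** 2)
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]  # shifted for stability
    total = sum(weights)
    return [w / total for w in weights]


# Made-up noisy "experiments" generated by a model with theta close to 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
thetas = [i * 0.01 for i in range(401)]  # grid from 0.00 to 4.00
posterior = grid_posterior(xs, ys, thetas, noise_std=0.2)
map_theta = thetas[max(range(len(thetas)), key=lambda i: posterior[i])]
```

With a flat prior the posterior mode coincides with the least-squares estimate restricted to the grid. A grid is only practical for one or two parameters; higher-dimensional calibration typically relies on sampling methods.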
