Xuefeng Du

Feed Two Birds with One Scone: Exploiting Wild Data for Both Out-of-Distribution Generalization and Detection

Jun 15, 2023

Haoyue Bai, Gregory Canal, Xuefeng Du, Jeongyeol Kwon, Robert Nowak, Yixuan Li

Abstract:Modern machine learning models deployed in the wild can encounter both covariate and semantic shifts, giving rise to the problems of out-of-distribution (OOD) generalization and OOD detection respectively. While both problems have received significant research attention lately, they have been pursued independently. This may not be surprising, since the two tasks have seemingly conflicting goals. This paper provides a new unified approach that is capable of simultaneously generalizing to covariate shifts while robustly detecting semantic shifts. We propose a margin-based learning framework that exploits freely available unlabeled data in the wild that captures the environmental test-time OOD distributions under both covariate and semantic shifts. We show both empirically and theoretically that the proposed margin constraint is the key to achieving both OOD generalization and detection. Extensive experiments show the superiority of our framework, outperforming competitive baselines that specialize in either OOD generalization or OOD detection. Code is publicly available at https://github.com/deeplearning-wisc/scone.

* ICML 2023

Non-Parametric Outlier Synthesis

Mar 06, 2023

Leitian Tao, Xuefeng Du, Xiaojin Zhu, Yixuan Li

Abstract:Out-of-distribution (OOD) detection is indispensable for safely deploying machine learning models in the wild. One of the key challenges is that models lack supervision signals from unknown data, and as a result, can produce overconfident predictions on OOD data. Recent work on outlier synthesis modeled the feature space as parametric Gaussian distribution, a strong and restrictive assumption that might not hold in reality. In this paper, we propose a novel framework, Non-Parametric Outlier Synthesis (NPOS), which generates artificial OOD training data and facilitates learning a reliable decision boundary between ID and OOD data. Importantly, our proposed synthesis approach does not make any distributional assumption on the ID embeddings, thereby offering strong flexibility and generality. We show that our synthesis approach can be mathematically interpreted as a rejection sampling framework. Extensive experiments show that NPOS can achieve superior OOD detection performance, outperforming the competitive rivals by a significant margin. Code is publicly available at https://github.com/deeplearning-wisc/npos.

* ICLR 2023

OpenOOD: Benchmarking Generalized Out-of-Distribution Detection

Oct 13, 2022

Jingkang Yang, Pengyun Wang, Dejian Zou, Zitang Zhou, Kunyuan Ding, Wenxuan Peng, Haoqi Wang, Guangyao Chen, Bo Li, Yiyou Sun (+6 more)

Abstract:Out-of-distribution (OOD) detection is vital to safety-critical machine learning applications and has thus been extensively studied, with a plethora of methods developed in the literature. However, the field currently lacks a unified, strictly formulated, and comprehensive benchmark, which often results in unfair comparisons and inconclusive results. From the problem setting perspective, OOD detection is closely related to neighboring fields including anomaly detection (AD), open set recognition (OSR), and model uncertainty, since methods developed for one domain are often applicable to each other. To help the community to improve the evaluation and advance, we build a unified, well-structured codebase called OpenOOD, which implements over 30 methods developed in relevant fields and provides a comprehensive benchmark under the recently proposed generalized OOD detection framework. With a comprehensive comparison of these methods, we are gratified that the field has progressed significantly over the past few years, where both preprocessing methods and the orthogonal post-hoc methods show strong potential.

* Accepted by NeurIPS 2022 Datasets and Benchmarks Track. Codebase: https://github.com/Jingkang50/OpenOOD

Unknown-Aware Object Detection: Learning What You Don't Know from Videos in the Wild

Mar 08, 2022

Xuefeng Du, Xin Wang, Gabriel Gozum, Yixuan Li

Abstract:Building reliable object detectors that can detect out-of-distribution (OOD) objects is critical yet underexplored. One of the key challenges is that models lack supervision signals from unknown data, producing overconfident predictions on OOD objects. We propose a new unknown-aware object detection framework through Spatial-Temporal Unknown Distillation (STUD), which distills unknown objects from videos in the wild and meaningfully regularizes the model's decision boundary. STUD first identifies the unknown candidate object proposals in the spatial dimension, and then aggregates the candidates across multiple video frames to form a diverse set of unknown objects near the decision boundary. Alongside, we employ an energy-based uncertainty regularization loss, which contrastively shapes the uncertainty space between the in-distribution and distilled unknown objects. STUD establishes the state-of-the-art performance on OOD detection tasks for object detection, reducing the FPR95 score by over 10% compared to the previous best method. Code is available at https://github.com/deeplearning-wisc/stud.

* CVPR 2022
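
The abstract above reports results in terms of FPR95. As a rough illustration only, the following sketch computes that metric under the common convention that in-distribution (ID) samples are the positive class and that a higher score means "more in-distribution". The function name `fpr_at_95_tpr` and the use of a simple percentile threshold are assumptions made for this example, not code from the STUD repository.

```python
import numpy as np

def fpr_at_95_tpr(id_scores, ood_scores):
    """False positive rate on OOD data when 95% of ID data is accepted.

    Assumption: higher score = more in-distribution.
    """
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    # Threshold at the 5th percentile of ID scores, so about 95% of ID
    # samples score at or above it (true positive rate of roughly 95%).
    threshold = np.percentile(id_scores, 5)
    # OOD samples at or above the threshold are wrongly accepted as ID.
    return float(np.mean(ood_scores >= threshold))
```

Lower values are better: a detector that separates ID from OOD scores perfectly yields an FPR95 of 0.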

VOS: Learning What You Don't Know by Virtual Outlier Synthesis

Feb 04, 2022

Xuefeng Du, Zhaoning Wang, Mu Cai, Yixuan Li

Abstract:Out-of-distribution (OOD) detection has received much attention lately due to its importance in the safe deployment of neural networks. One of the key challenges is that models lack supervision signals from unknown data, and as a result, can produce overconfident predictions on OOD data. Previous approaches rely on real outlier datasets for model regularization, which can be costly and sometimes infeasible to obtain in practice. In this paper, we present VOS, a novel framework for OOD detection by adaptively synthesizing virtual outliers that can meaningfully regularize the model's decision boundary during training. Specifically, VOS samples virtual outliers from the low-likelihood region of the class-conditional distribution estimated in the feature space. Alongside, we introduce a novel unknown-aware training objective, which contrastively shapes the uncertainty space between the ID data and synthesized outlier data. VOS achieves state-of-the-art performance on both object detection and image classification models, reducing the FPR95 by up to 7.87% compared to the previous best method. Code is available at https://github.com/deeplearning-wisc/vos.

* ICLR 2022
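
To make the idea of sampling from the low-likelihood region of a class-conditional distribution concrete, here is a minimal sketch for a single class. It assumes the class features are modeled as one Gaussian, draws many candidates from it, and keeps the least likely ones. The function name `sample_virtual_outliers`, the parameters `n_candidates` and `n_keep`, and the ridge term are hypothetical choices for illustration; the authors' implementation may differ.

```python
import numpy as np

def sample_virtual_outliers(features, n_candidates=10000, n_keep=10, seed=0):
    """Return the n_keep least likely samples from a Gaussian fit to `features`.

    `features` is an (n_samples, dim) array for ONE class, with dim >= 2.
    """
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    # Small ridge term (an assumption) keeps the covariance invertible.
    cov = np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    candidates = rng.multivariate_normal(mean, cov, size=n_candidates)
    # Squared Mahalanobis distance: larger means lower Gaussian likelihood.
    diff = candidates - mean
    maha = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    # Keep the candidates that sit farthest out in the tail.
    keep = np.argsort(maha)[-n_keep:]
    return candidates[keep]
```

In a multi-class setting this would be called once per class, and the returned points would serve as synthetic outliers for the regularization term described in the abstract.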

PI-GNN: A Novel Perspective on Semi-Supervised Node Classification against Noisy Labels

Jun 14, 2021

Xuefeng Du, Tian Bian, Yu Rong, Bo Han, Tongliang Liu, Tingyang Xu, Wenbing Huang, Junzhou Huang

Abstract:Semi-supervised node classification, as a fundamental problem in graph learning, leverages unlabeled nodes along with a small portion of labeled nodes for training. Existing methods rely heavily on high-quality labels, which, however, are expensive to obtain in real-world applications since some noise is inevitably introduced during the labeling process. This poses an unavoidable challenge for the learning algorithm to generalize well. In this paper, we propose a novel robust learning objective dubbed pairwise interactions (PI) for models such as Graph Neural Networks (GNNs) to combat noisy labels. Unlike classic robust training approaches that operate on the pointwise interactions between node and class label pairs, PI explicitly forces the embeddings for node pairs that hold a positive PI label to be close to each other, which can be applied to both labeled and unlabeled nodes. We design several instantiations for PI labels based on the graph structure and the node class labels, and further propose a new uncertainty-aware training technique to mitigate the negative effect of the sub-optimal PI labels. Extensive experiments on different datasets and GNN architectures demonstrate the effectiveness of PI, yielding a promising improvement over the state-of-the-art methods.

* 16 pages, 3 figures

Active Learning to Classify Macromolecular Structures in situ for Less Supervision in Cryo-Electron Tomography

Feb 24, 2021

Xuefeng Du, Haohan Wang, Zhenxi Zhu, Xiangrui Zeng, Yi-Wei Chang, Jing Zhang, Eric Xing, Min Xu

Abstract:Motivation: Cryo-Electron Tomography (cryo-ET) is a 3D bioimaging tool that visualizes the structural and spatial organization of macromolecules at a near-native state in single cells, which has broad applications in life science. However, the systematic structural recognition and recovery of macromolecules captured by cryo-ET are difficult due to high structural complexity and imaging limits. Deep learning based subtomogram classification has played a critical role in such tasks. As supervised approaches, however, their performance relies on sufficient and laborious annotation on a large training dataset. Results: To alleviate this major labeling burden, we proposed a Hybrid Active Learning (HAL) framework for querying subtomograms for labeling from a large unlabeled subtomogram pool. Firstly, HAL adopts uncertainty sampling to select the subtomograms that have the most uncertain predictions. Moreover, to mitigate the sampling bias caused by such a strategy, a discriminator is introduced to judge if a certain subtomogram is labeled or unlabeled, and subsequently the model queries the subtomograms that have higher probabilities to be unlabeled. Additionally, HAL introduces a subset sampling strategy to improve the diversity of the query set, so that the information overlap is decreased between the queried batches and the algorithmic efficiency is improved. Our experiments on subtomogram classification tasks using both simulated and real data demonstrate that we can achieve comparable testing performance (on average only 3% accuracy drop) by using less than 30% of the labeled subtomograms, which shows a very promising result for subtomogram classification tasks with limited labeling resources.

* 14 pages
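
The uncertainty sampling step mentioned in the abstract can be sketched as follows. The abstract does not say which uncertainty measure HAL uses, so predictive entropy is an assumption here, as are the function name `select_most_uncertain` and its parameters. The discriminator and subset sampling components are not covered by this sketch.

```python
import numpy as np

def select_most_uncertain(probs, k):
    """Return indices of the k samples with the highest predictive entropy.

    `probs` is an (n_samples, n_classes) array of class probabilities.
    """
    probs = np.asarray(probs, dtype=float)
    eps = 1e-12  # avoids log(0) for hard zero probabilities
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    # Sort by descending entropy and keep the first k indices.
    return np.argsort(-entropy)[:k]
```

The returned indices point at the unlabeled samples that would be sent to an annotator in the next query round.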

Learning Diverse-Structured Networks for Adversarial Robustness

Feb 08, 2021

Xuefeng Du, Jingfeng Zhang, Bo Han, Tongliang Liu, Yu Rong, Gang Niu, Junzhou Huang, Masashi Sugiyama

Abstract:In adversarial training (AT), the main focus has been the objective and optimizer while the model has been less studied, so that the models being used are still those classic ones in standard training (ST). Classic network architectures (NAs) are generally worse than searched NAs in ST, which should be the same in AT. In this paper, we argue that NA and AT cannot be handled independently, since given a dataset, the optimal NA in ST would be no longer optimal in AT. That being said, AT is time-consuming itself; if we directly search NAs in AT over large search spaces, the computation will be practically infeasible. Thus, we propose a diverse-structured network (DS-Net), to significantly reduce the size of the search space: instead of low-level operations, we only consider predefined atomic blocks, where an atomic block is a time-tested building block like the residual block. There are only a few atomic blocks and thus we can weight all atomic blocks rather than find the best one in a searched block of DS-Net, which is an essential trade-off between exploring diverse structures and exploiting the best structures. Empirical results demonstrate the advantages of DS-Net, i.e., weighting the atomic blocks.

* 26 pages, 8 figures
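
One way to picture "weighting all atomic blocks rather than finding the best one" is a weighted sum of block outputs. The sketch below assumes softmax-normalized learnable scores, which the abstract does not specify; `weighted_block`, `blocks`, and `alphas` are hypothetical names, and the toy blocks stand in for real building blocks such as residual blocks.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array of scores."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

def weighted_block(x, blocks, alphas):
    """Combine the outputs of all candidate blocks using softmax weights.

    `blocks` is a list of callables mapping an array to an array of the
    same shape; `alphas` holds one score per block (assumed learnable).
    """
    weights = softmax(alphas)
    return sum(w * block(x) for w, block in zip(weights, blocks))
```

With equal scores every block contributes equally; training the scores would shift weight toward the blocks that help most, without discarding the others.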

Small-Group Learning, with Application to Neural Architecture Search

Dec 23, 2020

Xuefeng Du, Pengtao Xie

Abstract:Small-group learning is a broadly used methodology in human learning and shows great effectiveness in improving learning outcomes: a small group of students work together towards the same learning objective, where they express their understanding of a topic to their peers, compare their ideas, and help each other to trouble-shoot problems. We are interested in investigating whether this powerful learning technique can be borrowed from humans to improve the learning abilities of machines. We propose a novel learning approach called small-group learning (SGL). In our approach, each learner uses its intermediately trained model to generate a pseudo-labeled dataset and re-trains its model using pseudo-labeled datasets generated by other learners. We propose a multi-level optimization framework to formulate SGL which involves three learning stages: learners train their network weights independently; learners train their network weights collaboratively via mutual pseudo-labeling; learners improve their architectures by minimizing validation losses. We develop an efficient algorithm to solve the SGL problem. We apply our approach to neural architecture search and achieve significant improvement on CIFAR-100, CIFAR-10, and ImageNet.

* arXiv admin note: substantial text overlap with arXiv:2011.15102, arXiv:2012.04863

Skillearn: Machine Learning Inspired by Humans' Learning Skills

Dec 09, 2020

Pengtao Xie, Xuefeng Du, Hao Ban

Abstract:Humans, as the most powerful learners on the planet, have accumulated a lot of learning skills, such as learning through tests, interleaving learning, self-explanation, active recalling, to name a few. These learning skills and methodologies enable humans to learn new topics more effectively and efficiently. We are interested in investigating whether humans' learning skills can be borrowed to help machines to learn better. Specifically, we aim to formalize these skills and leverage them to train better machine learning (ML) models. To achieve this goal, we develop a general framework -- Skillearn, which provides a principled way to represent humans' learning skills mathematically and use the formally-represented skills to improve the training of ML models. In two case studies, we apply Skillearn to formalize two learning skills of humans: learning by passing tests and interleaving learning, and use the formalized skills to improve neural architecture search. Experiments on various datasets show that trained using the skills formalized by Skillearn, ML models achieve significantly better performance.

* arXiv admin note: substantial text overlap with arXiv:2011.15102
