Yang Liu

Performative Prediction with Bandit Feedback: Learning through Reparameterization

May 01, 2023
Yatong Chen, Wei Tang, Chien-Ju Ho, Yang Liu

Performative prediction, as introduced by Perdomo et al. (2020), is a framework for studying social prediction in which the data distribution itself changes in response to the deployment of a model. Existing work on optimizing accuracy in this setting hinges on two assumptions that are easily violated in practice: that the performative risk is convex over the deployed model, and that the mapping from the model to the data distribution is known to the model designer in advance. In this paper, we initiate the study of tractable performative prediction problems that do not require these assumptions. To tackle this more challenging setting, we develop a two-level zeroth-order optimization algorithm, where one level aims to compute the distribution map, and the other level reparameterizes the performative prediction objective as a function of the induced data distribution. Under mild conditions, this reparameterization allows us to transform the non-convex objective into a convex one and achieve provable regret guarantees. In particular, we provide a regret bound that is sublinear in the total number of performative samples taken and only polynomial in the dimension of the model parameter.
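
The abstract does not spell out the two-level algorithm, so the following is only a generic illustration of what "zeroth-order" means: optimizing from function values alone, as one must under bandit feedback. It is a minimal sketch of a standard two-point gradient estimator on a toy quadratic. `toy_risk`, the step sizes, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import random

def two_point_gradient(f, x, delta, rng):
    """Estimate the gradient of f at x from two function evaluations only."""
    d = len(x)
    # Sample a random direction u uniformly on the unit sphere.
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = sum(v * v for v in g) ** 0.5
    u = [v / norm for v in g]
    x_plus = [xi + delta * ui for xi, ui in zip(x, u)]
    x_minus = [xi - delta * ui for xi, ui in zip(x, u)]
    # Finite difference along u, scaled by the dimension d.
    scale = d * (f(x_plus) - f(x_minus)) / (2.0 * delta)
    return [scale * ui for ui in u]

def zeroth_order_descent(f, x0, steps=2000, lr=0.05, delta=0.01, seed=0):
    """Gradient descent that only ever queries f, never its derivative."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        grad = two_point_gradient(f, x, delta, rng)
        x = [xi - lr * gi for xi, gi in zip(x, grad)]
    return x

def toy_risk(theta):
    # Hypothetical stand-in for a risk that can be queried but not differentiated.
    return (theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2

theta_hat = zeroth_order_descent(toy_risk, [0.0, 0.0])
```

On this convex toy objective the iterates approach the minimizer `(1, -2)`. The paper's contribution, per the abstract, is a reparameterization that makes such convex-style guarantees available for an otherwise non-convex performative risk; that step is not shown here.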

An EEG Channel Selection Framework for Driver Drowsiness Detection via Interpretability Guidance

Apr 26, 2023
Xinliang Zhou, Dan Lin, Ziyu Jia, Jiaping Xiao, Chenyu Liu, Liming Zhai, Yang Liu

Drowsy driving has a crucial influence on driving safety, creating an urgent demand for driver drowsiness detection. Electroencephalogram (EEG) signals can accurately reflect the mental fatigue state and thus have been widely studied in drowsiness monitoring. However, raw EEG data is inherently noisy and redundant, a fact neglected by existing works that use either single-channel or full-head channel EEG data for model training, which limits the performance of driver drowsiness detection. In this paper, we are the first to propose an Interpretability-guided Channel Selection (ICS) framework for the driver drowsiness detection task. Specifically, we design a two-stage training strategy to progressively select the key contributing channels with the guidance of interpretability. In the first stage, we train a teacher network using full-head channel EEG data. We then apply class activation mapping (CAM) to the trained teacher model to highlight the high-contributing EEG channels, and further propose a channel voting scheme to select the top N contributing EEG channels. Finally, in the second stage, we train a student network on the selected channels of EEG data for driver drowsiness detection. Experiments on a public dataset demonstrate that our method is highly applicable and can significantly improve the performance of cross-subject driver drowsiness detection.
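
The abstract names a channel voting scheme but gives no details, so the sketch below shows only one plausible reading: each sample votes for its `k` highest-scoring channels, and the `n` most-voted channels are kept. The voting rule, the tie-breaking, the channel names, and the score values are all assumptions for illustration, not the authors' procedure.

```python
from collections import Counter

def vote_top_channels(scores_per_sample, channel_names, k, n):
    """Each sample votes for its k highest-scoring channels; return the n most-voted."""
    votes = Counter()
    for scores in scores_per_sample:
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        for i in ranked[:k]:
            votes[channel_names[i]] += 1
    # Sort by vote count (descending), then by name so ties resolve deterministically.
    ordered = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ordered[:n]]

# Hypothetical per-sample channel contribution scores (e.g. derived from CAM).
channels = ["Fp1", "Fz", "Cz", "Pz", "Oz"]
cam_scores = [
    [0.1, 0.7, 0.9, 0.2, 0.3],
    [0.2, 0.8, 0.6, 0.1, 0.4],
    [0.1, 0.3, 0.9, 0.2, 0.8],
]
selected = vote_top_channels(cam_scores, channels, k=2, n=2)
print(selected)  # ['Cz', 'Fz']
```

Voting over per-sample rankings, rather than averaging raw scores, keeps a few samples with very large activations from dominating the selection.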

Fully Sparse Fusion for 3D Object Detection

Apr 25, 2023
Yingyan Li, Lue Fan, Yang Liu, Zehao Huang, Yuntao Chen, Naiyan Wang, Zhaoxiang Zhang, Tieniu Tan

Currently prevalent multimodal 3D detection methods are built upon LiDAR-based detectors that usually use dense Bird's-Eye-View (BEV) feature maps. However, the cost of such BEV feature maps is quadratic in the detection range, making them unsuitable for long-range detection. Fully sparse architectures are gaining attention as they are highly efficient in long-range perception. In this paper, we study how to effectively leverage the image modality in the emerging fully sparse architecture. In particular, utilizing instance queries, our framework integrates the well-studied 2D instance segmentation into the LiDAR side, in parallel with the 3D instance segmentation part of the fully sparse detector. This design achieves a uniform query-based fusion framework on both the 2D and 3D sides while maintaining the fully sparse characteristic. Extensive experiments showcase state-of-the-art results on the widely used nuScenes dataset and the long-range Argoverse 2 dataset. Notably, the inference speed of the proposed method under the long-range LiDAR perception setting is 2.7× faster than that of other state-of-the-art multimodal 3D detection methods. Code will be released at https://github.com/BraveGroup/FullySparseFusion.
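
The quadratic cost of a dense BEV grid can be seen with a short calculation. The sketch below counts the cells of a square grid centered on the sensor; the ranges (50 m and 200 m) and the 0.5 m cell size are illustrative assumptions, not settings reported in the paper.

```python
def dense_bev_cells(detection_range_m, cell_size_m):
    """Number of cells in a square BEV grid covering [-range, range] on both axes."""
    cells_per_side = round(2 * detection_range_m / cell_size_m)
    return cells_per_side * cells_per_side

short_range = dense_bev_cells(50.0, 0.5)    # 200 x 200 grid
long_range = dense_bev_cells(200.0, 0.5)    # 800 x 800 grid
ratio = long_range / short_range
print(short_range, long_range, ratio)  # 40000 640000 16.0
```

Quadrupling the range multiplies the dense grid size by 16, whereas a sparse representation grows with the number of occupied locations rather than with the covered area.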

Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding

Apr 24, 2023
Yu-Qi Yang, Yu-Xiao Guo, Jian-Yu Xiong, Yang Liu, Hao Pan, Peng-Shuai Wang, Xin Tong, Baining Guo

Pretrained backbones with fine-tuning have been widely adopted in 2D vision and natural language processing tasks and have demonstrated significant advantages over task-specific networks. In this paper, we present a pretrained 3D backbone, named Swin3D, which for the first time outperforms all state-of-the-art methods in downstream 3D indoor scene understanding tasks. Our backbone network is based on a 3D Swin transformer and is carefully designed to efficiently conduct self-attention on sparse voxels with linear memory complexity and to capture the irregularity of point signals via generalized contextual relative positional embedding. Based on this backbone design, we pretrained a large Swin3D model on the synthetic Structured3D dataset, which is 10 times larger than the ScanNet dataset, and fine-tuned the pretrained model on various downstream real-world indoor scene understanding tasks. The results demonstrate that our model pretrained on the synthetic dataset not only exhibits good generality in both downstream segmentation and detection on real 3D point datasets, but also surpasses the state-of-the-art methods on downstream tasks after fine-tuning, with +2.3 mIoU and +2.2 mIoU on S3DIS Area5 and 6-fold semantic segmentation, +2.1 mIoU on ScanNet segmentation (val), +1.9 mAP@0.5 on ScanNet detection, and +8.1 mAP@0.5 on S3DIS detection. Our method demonstrates the great potential of pretrained 3D backbones with fine-tuning for 3D understanding tasks. The code and models are available at https://github.com/microsoft/Swin3D.

* Project page: https://yukichiii.github.io/project/swin3D/swin3D.html

Breaching FedMD: Image Recovery via Paired-Logits Inversion Attack

Apr 22, 2023
Hideaki Takahashi, Jingjing Liu, Yang Liu

Federated Learning with Model Distillation (FedMD) is a nascent collaborative learning paradigm in which only output logits on public datasets are transmitted as distilled knowledge, instead of private model parameters that are susceptible to gradient inversion attacks, a known privacy risk in federated learning. In this paper, we find that even though sharing output logits on public datasets is safer than directly sharing gradients, there still exists a substantial risk of data exposure caused by carefully designed malicious attacks. Our study shows that a malicious server can launch a PLI (Paired-Logits Inversion) attack against FedMD and its variants by training an inversion neural network that exploits the confidence gap between the server and client models. Experiments on multiple facial recognition datasets validate that under FedMD-like schemes, by using only paired server-client logits on public datasets, the malicious server is able to reconstruct private images on all tested benchmarks with a high success rate.

Interpretable and Robust AI in EEG Systems: A Survey

Apr 21, 2023
Xinliang Zhou, Chenyu Liu, Liming Zhai, Ziyu Jia, Cuntai Guan, Yang Liu

The close coupling of artificial intelligence (AI) and electroencephalography (EEG) has substantially advanced human-computer interaction (HCI) technologies in the AI era. Unlike in traditional EEG systems, the interpretability and robustness of AI-based EEG systems are becoming particularly crucial. Interpretability clarifies the inner working mechanisms of AI models and thus can gain the trust of users. Robustness reflects the AI's reliability against attacks and perturbations, which is essential for sensitive and fragile EEG signals. The interpretability and robustness of AI in EEG systems have therefore attracted increasing attention, and research on them has made great progress recently. However, there is still no survey covering recent advances in this field. In this paper, we present the first comprehensive survey and summarize the interpretable and robust AI techniques for EEG systems. Specifically, we first propose a taxonomy of interpretability that divides it into three types: backpropagation, perturbation, and inherently interpretable methods. Then we classify the robustness mechanisms into four classes: noise and artifacts, human variability, data acquisition instability, and adversarial attacks. Finally, we identify several critical and unresolved challenges for interpretable and robust AI in EEG systems and further discuss their future directions.

AutoTaskFormer: Searching Vision Transformers for Multi-task Learning

Apr 20, 2023
Yang Liu, Shen Yan, Yuge Zhang, Kan Ren, Quanlu Zhang, Zebin Ren, Deng Cai, Mi Zhang

Vision Transformers have shown great performance in single tasks such as classification and segmentation. However, real-world problems are not isolated, which calls for vision transformers that can perform multiple tasks concurrently. Existing multi-task vision transformers are handcrafted and heavily rely on human expertise. In this work, we propose a novel one-shot neural architecture search framework, dubbed AutoTaskFormer (Automated Multi-Task Vision TransFormer), to automate this process. AutoTaskFormer not only identifies the weights to share across multiple tasks automatically, but also provides thousands of well-trained vision transformers with a wide range of parameters (e.g., number of heads and network depth) for deployment under various resource constraints. Experiments on both small-scale (2-task Cityscapes and 3-task NYUv2) and large-scale (16-task Taskonomy) datasets show that AutoTaskFormer outperforms state-of-the-art handcrafted vision transformers in multi-task learning. The entire code and models will be open-sourced.

* 15 pages

Decadal Temperature Prediction via Chaotic Behavior Tracking

Apr 19, 2023
Jinfu Ren, Yang Liu, Jiming Liu

Decadal temperature prediction provides crucial information for quantifying the expected effects of future climate changes and thus informs strategic planning and decision-making in various domains. However, such long-term predictions are extremely challenging, due to the chaotic nature of temperature variations. Moreover, the usefulness of existing simulation-based and machine learning-based methods for this task is limited because initial simulation or prediction errors increase exponentially over time. To address this challenging task, we devise a novel prediction method involving an information tracking mechanism that aims to track and adapt to changes in temperature dynamics during the prediction phase by providing probabilistic feedback on the prediction error of the next step based on the current prediction. We integrate this information tracking mechanism, which can be considered as a model calibrator, into the objective function of our method to obtain the corrections needed to avoid error accumulation. Our results show the ability of our method to accurately predict global land-surface temperatures over a decadal range. Furthermore, we demonstrate that our results are meaningful in a real-world context: the temperatures predicted using our method are consistent with and can be used to explain the well-known teleconnections within and between different continents.

Long-Term Fairness with Unknown Dynamics

Apr 19, 2023
Tongxin Yin, Reilly Raab, Mingyan Liu, Yang Liu

While machine learning can myopically reinforce social inequalities, it may also be used to dynamically seek equitable outcomes. In this paper, we formalize long-term fairness in the context of online reinforcement learning. This formulation can accommodate dynamical control objectives, such as driving equity inherent in the state of a population, that cannot be incorporated into static formulations of fairness. We demonstrate that this framing allows an algorithm to adapt to unknown dynamics by sacrificing short-term incentives to drive a classifier-population system towards more desirable equilibria. For the proposed setting, we develop an algorithm that adapts recent work in online learning. We prove that this algorithm achieves simultaneous probabilistic bounds on cumulative loss and cumulative violations of fairness (as statistical regularities between demographic groups). We compare our proposed algorithm to the repeated retraining of myopic classifiers, as a baseline, and to a deep reinforcement learning algorithm that lacks safety guarantees. Our experiments model human populations according to evolutionary game theory and integrate real-world datasets.
