Xing Xie

Microsoft Research Asia

Foundation Model-oriented Robustness: Robust Image Model Evaluation with Pretrained Models

Aug 23, 2023

Peiyan Zhang, Haoyang Liu, Chaozhuo Li, Xing Xie, Sunghun Kim, Haohan Wang

Abstract: Machine learning has demonstrated remarkable performance over finite datasets, yet whether scores on fixed benchmarks sufficiently indicate a model's performance in the real world is still under discussion. In reality, an ideal robust model will probably behave similarly to the oracle (e.g., the human users), so a good evaluation protocol should probably compare the models' behaviors with the oracle's. In this paper, we introduce a new robustness measurement that directly measures the image classification model's performance compared with a surrogate oracle (i.e., a foundation model). Besides, we design a simple method that can accomplish the evaluation beyond the scope of the benchmarks. Our method extends the image datasets with new samples that are sufficiently perturbed to be distinct from the ones in the original sets, but are still bounded within the same image-label structure the original test image represents, constrained by a foundation model pretrained with a large number of samples. As a result, our new method offers a new way to evaluate the models' robustness performance, free of the limitations of fixed benchmarks or constrained perturbations, although scoped by the power of the oracle. In addition to the evaluation results, we also leverage our generated data to understand the behaviors of the model and our new evaluation strategies.
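
The evaluation idea in the abstract (perturb test samples, keep only those the surrogate oracle still assigns to the original label, then score the model on what remains) can be illustrated with a toy sketch. This is not the authors' pipeline: `perturb`, `oracle`, and `model` are hypothetical callables, the samples are plain numbers rather than images, and the agreement rate is only one assumed way to compare a model with the oracle.

```python
def oracle_bounded_samples(samples, perturb, oracle):
    """Perturb each sample and keep it only if the oracle still returns the original label."""
    kept = []
    for x, label in samples:
        x_new = perturb(x)
        if oracle(x_new) == label:
            kept.append((x_new, label))
    return kept


def agreement_rate(model, kept):
    """Fraction of oracle-validated samples on which the model predicts the kept label."""
    if not kept:
        return 0.0
    return sum(model(x) == label for x, label in kept) / len(kept)


# Toy stand-ins (assumptions): numbers instead of images, thresholds instead of networks.
samples = [(2.0, 1), (0.4, 1), (1.2, 1), (-1.0, 0)]
perturb = lambda x: x - 0.5          # hypothetical perturbation
oracle = lambda x: int(x > 0)        # hypothetical surrogate oracle
model = lambda x: int(x > 1)         # hypothetical model under evaluation

kept = oracle_bounded_samples(samples, perturb, oracle)
rate = agreement_rate(model, kept)
```

In this toy setting the oracle rejects the perturbed sample whose label would change, and the model is scored only on the samples that stay within the original label.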

Towards Attack-tolerant Federated Learning via Critical Parameter Analysis

Aug 18, 2023

Sungwon Han, Sungwon Park, Fangzhao Wu, Sundong Kim, Bin Zhu, Xing Xie, Meeyoung Cha

Abstract: Federated learning is used to train a shared model in a decentralized way without clients sharing private data with each other. Federated learning systems are susceptible to poisoning attacks when malicious clients send false updates to the central server. Existing defense strategies are ineffective under non-IID data settings. This paper proposes a new defense strategy, FedCPA (Federated learning with Critical Parameter Analysis). Our attack-tolerant aggregation method is based on the observation that benign local models have similar sets of top-k and bottom-k critical parameters, whereas poisoned local models do not. Experiments with different attack scenarios on multiple datasets demonstrate that our model outperforms existing defense strategies in defending against poisoning attacks.

* ICCV'23 Accepted
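
The observation in the abstract, that benign local models share similar top-k and bottom-k critical parameters, can be illustrated with a minimal sketch. Everything concrete here is an assumption rather than the paper's method: the per-parameter importance scores are taken as given, the overlap is measured with Jaccard similarity, and the names `critical_sets` and `critical_similarity` are hypothetical.

```python
def critical_sets(importance, k):
    """Return the index sets of the k largest and k smallest importance scores."""
    order = sorted(range(len(importance)), key=lambda i: importance[i])
    return set(order[-k:]), set(order[:k])


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def critical_similarity(imp_a, imp_b, k):
    """Average overlap of the top-k and bottom-k index sets of two local models."""
    top_a, bottom_a = critical_sets(imp_a, k)
    top_b, bottom_b = critical_sets(imp_b, k)
    return 0.5 * (jaccard(top_a, top_b) + jaccard(bottom_a, bottom_b))


# Made-up importance scores for three local models (illustrative only).
benign_1 = [0.90, 0.80, 0.10, 0.05, 0.50, 0.40]
benign_2 = [0.85, 0.90, 0.05, 0.10, 0.45, 0.50]
poisoned = [0.05, 0.10, 0.90, 0.80, 0.50, 0.40]

sim_benign = critical_similarity(benign_1, benign_2, k=2)
sim_poisoned = critical_similarity(benign_1, poisoned, k=2)
```

A server could use such a score to down-weight updates whose critical sets disagree with the majority; how FedCPA actually aggregates is described in the paper, not here.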

ConvFormer: Revisiting Transformer for Sequential User Modeling

Aug 05, 2023

Hao Wang, Jianxun Lian, Mingqi Wu, Haoxuan Li, Jiajun Fan, Wanyue Xu, Chaozhuo Li, Xing Xie

Abstract: Sequential user modeling, a critical task in personalized recommender systems, focuses on predicting the next item a user would prefer, requiring a deep understanding of user behavior sequences. Despite the remarkable success of Transformer-based models across various domains, their full potential in comprehending user behavior remains untapped. In this paper, we re-examine Transformer-like architectures aiming to advance state-of-the-art performance. We start by revisiting the core building blocks of Transformer-based methods, analyzing the effectiveness of the item-to-item mechanism within the context of sequential user modeling. After conducting a thorough experimental analysis, we identify three essential criteria for devising efficient sequential user models, which we hope will serve as practical guidelines to inspire and shape future designs. Following this, we introduce ConvFormer, a simple but powerful modification to the Transformer architecture that meets these criteria, yielding state-of-the-art results. Additionally, we present an acceleration technique to minimize the complexity associated with processing extremely long sequences. Experiments on four public datasets showcase ConvFormer's superiority and confirm the validity of our proposed criteria.

DIVERSIFY: A General Framework for Time Series Out-of-distribution Detection and Generalization

Aug 04, 2023

Wang Lu, Jindong Wang, Xinwei Sun, Yiqiang Chen, Xiangyang Ji, Qiang Yang, Xing Xie

Abstract: Time series remains one of the most challenging modalities in machine learning research. Out-of-distribution (OOD) detection and generalization on time series tend to suffer because of their non-stationary property, i.e., the distribution changes over time. The dynamic distributions inside time series pose great challenges for existing algorithms in identifying invariant distributions, since these algorithms mainly focus on the scenario where the domain information is given as prior knowledge. In this paper, we attempt to exploit subdomains within a whole dataset to counteract issues induced by non-stationarity for generalized representation learning. We propose DIVERSIFY, a general framework for OOD detection and generalization on dynamic distributions of time series. DIVERSIFY takes an iterative process: it first obtains the "worst-case" latent distribution scenario via adversarial training, then reduces the gap between these latent distributions. We implement DIVERSIFY by combining existing OOD detection methods, according to either extracted features or model outputs, for detection, while we directly utilize outputs for classification. In addition, theoretical insights support DIVERSIFY. Extensive experiments are conducted on seven datasets with different OOD settings across gesture recognition, speech commands recognition, wearable stress and affect detection, and sensor-based human activity recognition. Qualitative and quantitative results demonstrate that DIVERSIFY learns more generalized features and significantly outperforms other baselines.

* Journal version of arXiv:2209.07027; 17 pages

Frustratingly Easy Model Generalization by Dummy Risk Minimization

Aug 04, 2023

Juncheng Wang, Jindong Wang, Xixu Hu, Shujun Wang, Xing Xie

Abstract: Empirical risk minimization (ERM) is a fundamental machine learning paradigm. However, its generalization ability is limited in various tasks. In this paper, we devise Dummy Risk Minimization (DuRM), a frustratingly easy and general technique to improve the generalization of ERM. DuRM is extremely simple to implement: just enlarge the dimension of the output logits and then optimize using standard gradient descent. Moreover, we validate the efficacy of DuRM through both theoretical and empirical analysis. Theoretically, we show that DuRM derives greater variance of the gradient, which facilitates model generalization by observing better flat local minima. Empirically, we conduct evaluations of DuRM across different datasets, modalities, and network architectures on diverse tasks, including conventional classification, semantic segmentation, out-of-distribution generalization, adversarial training, and long-tailed recognition. Results demonstrate that DuRM could consistently improve the performance under all tasks in an almost free-lunch manner. Furthermore, we show that DuRM is compatible with existing generalization techniques and we discuss possible limitations. We hope that DuRM could trigger new interest in the fundamental research on risk minimization.

* Technical report; 22 pages
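
The recipe stated in the abstract (enlarge the output logits, then train as usual) can be sketched in a few lines. This is an illustration under assumptions, not the authors' code: the toy linear head, the choice of one dummy class, and the decision to ignore dummy logits at prediction time are all assumptions, and `linear_head`, `cross_entropy`, and `predict` are hypothetical names.

```python
import math


def linear_head(features, weights, biases):
    """Toy output layer: one logit per weight row."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, biases)]


def cross_entropy(logits, label):
    """Softmax cross-entropy computed with a stable log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def predict(logits, num_classes):
    """Assumed inference rule: argmax over the real classes only."""
    return max(range(num_classes), key=lambda i: logits[i])


NUM_CLASSES = 3   # real classes
NUM_DUMMY = 1     # extra "dummy" outputs (assumed value)

# The head has NUM_CLASSES + NUM_DUMMY rows; labels still lie in [0, NUM_CLASSES).
weights = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [1.0, 1.0]]
biases = [0.0, 0.0, 0.0, 0.0]

logits = linear_head([1.0, 2.0], weights, biases)
loss_enlarged = cross_entropy(logits, label=1)               # loss over all 4 logits
loss_standard = cross_entropy(logits[:NUM_CLASSES], label=1)  # plain ERM loss
prediction = predict(logits, NUM_CLASSES)
```

The only change relative to plain ERM in this sketch is the extra output row; the loss function and the optimizer stay the same.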

EmotionPrompt: Leveraging Psychology for Large Language Models Enhancement via Emotional Stimulus

Aug 01, 2023

Cheng Li, Jindong Wang, Kaijie Zhu, Yixuan Zhang, Wenxin Hou, Jianxun Lian, Xing Xie

Abstract: Large language models (LLMs) have achieved significant performance in many fields such as reasoning, language understanding, and math problem-solving, and are regarded as a crucial step to artificial general intelligence (AGI). However, the sensitivity of LLMs to prompts remains a major bottleneck for their daily adoption. In this paper, we take inspiration from psychology and propose EmotionPrompt to explore emotional intelligence to enhance the performance of LLMs. EmotionPrompt operates on a remarkably straightforward principle: the incorporation of emotional stimulus into prompts. Experimental results demonstrate that our EmotionPrompt, using the same single prompt templates, significantly outperforms the original zero-shot prompt and Zero-shot-CoT on 8 tasks with diverse models: ChatGPT, Vicuna-13b, Bloom, and T5. Further, EmotionPrompt was observed to improve both truthfulness and informativeness. We believe that EmotionPrompt heralds a novel avenue for exploring interdisciplinary knowledge for human-LLM interaction.

* Work in progress; 9 pages
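
The principle described in the abstract, adding an emotional stimulus to a prompt, amounts to simple string composition. The sketch below assumes the stimulus is appended after the task prompt; the helper name `emotion_prompt` and the example sentences are illustrative placeholders and are not taken from the abstract.

```python
def emotion_prompt(prompt, stimulus):
    """Append an emotional stimulus sentence to a task prompt."""
    return f"{prompt.rstrip()} {stimulus.strip()}"


base = "Determine whether a movie review is positive or negative."
stimulus = "This is very important to my career."  # illustrative stimulus

prompt_with_emotion = emotion_prompt(base, stimulus)
```

The resulting string would be sent to the model in place of the original prompt, with the rest of the template unchanged.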

Improving Generalization of Adversarial Training via Robust Critical Fine-Tuning

Aug 01, 2023

Kaijie Zhu, Jindong Wang, Xixu Hu, Xing Xie, Ge Yang

Abstract: Deep neural networks are susceptible to adversarial examples, posing a significant security risk in critical applications. Adversarial Training (AT) is a well-established technique to enhance adversarial robustness, but it often comes at the cost of decreased generalization ability. This paper proposes Robustness Critical Fine-Tuning (RiFT), a novel approach to enhance generalization without compromising adversarial robustness. The core idea of RiFT is to exploit the redundant capacity for robustness by fine-tuning the adversarially trained model on its non-robust-critical module. To do so, we introduce module robust criticality (MRC), a measure that evaluates the significance of a given module to model robustness under worst-case weight perturbations. Using this measure, we identify the module with the lowest MRC value as the non-robust-critical module and fine-tune its weights to obtain fine-tuned weights. Subsequently, we linearly interpolate between the adversarially trained weights and fine-tuned weights to derive the optimal fine-tuned model weights. We demonstrate the efficacy of RiFT on ResNet18, ResNet34, and WideResNet34-10 models trained on CIFAR10, CIFAR100, and Tiny-ImageNet datasets. Our experiments show that RiFT can significantly improve both generalization and out-of-distribution robustness by around 1.5% while maintaining or even slightly enhancing adversarial robustness. Code is available at https://github.com/microsoft/robustlearn.

* Accepted by International Conference on Computer Vision (ICCV) 2023; code is at https://github.com/microsoft/robustlearn
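
The last step in the abstract, linear interpolation between the adversarially trained weights and the fine-tuned weights, can be sketched as follows. This is not the code from the linked repository: weights are plain dictionaries of float lists, and picking the interpolation coefficient by a grid search over a user-supplied `score_fn` is an assumption, as are the names `interpolate_weights` and `best_interpolation`.

```python
def interpolate_weights(w_at, w_ft, alpha):
    """Return (1 - alpha) * w_at + alpha * w_ft, parameter by parameter."""
    if w_at.keys() != w_ft.keys():
        raise ValueError("both weight dictionaries must have the same parameter names")
    return {
        name: [(1 - alpha) * a + alpha * f for a, f in zip(w_at[name], w_ft[name])]
        for name in w_at
    }


def best_interpolation(w_at, w_ft, score_fn, steps=10):
    """Grid-search alpha in [0, 1] and return the value with the highest score."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates,
               key=lambda a: score_fn(interpolate_weights(w_at, w_ft, a)))


# Toy weights: only "layer2" was fine-tuned, so "layer1" is identical in both.
w_at = {"layer1": [0.5, -0.5], "layer2": [0.0, 2.0]}
w_ft = {"layer1": [0.5, -0.5], "layer2": [1.0, 4.0]}

halfway = interpolate_weights(w_at, w_ft, 0.5)

# Hypothetical validation score that peaks when layer2[0] equals 0.3.
score_fn = lambda w: -(w["layer2"][0] - 0.3) ** 2
alpha_star = best_interpolation(w_at, w_ft, score_fn)
```

Because only the non-robust-critical module is fine-tuned, every other module has equal weights on both sides and is left unchanged by the interpolation.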

A Survey on Evaluation of Large Language Models

Jul 18, 2023

Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu, Kaijie Zhu, Hao Chen, Linyi Yang, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang (+6 more)

Abstract: Large language models (LLMs) are gaining increasing popularity in both academia and industry, owing to their unprecedented performance in various applications. As LLMs continue to play a vital role in both research and daily use, their evaluation becomes increasingly critical, not only at the task level, but also at the society level for better understanding of their potential risks. Over the past years, significant efforts have been made to examine LLMs from various perspectives. This paper presents a comprehensive review of these evaluation methods for LLMs, focusing on three key dimensions: what to evaluate, where to evaluate, and how to evaluate. Firstly, we provide an overview from the perspective of evaluation tasks, encompassing general natural language processing tasks, reasoning, medical usage, ethics, education, natural and social sciences, agent applications, and other areas. Secondly, we answer the 'where' and 'how' questions by diving into the evaluation methods and benchmarks, which serve as crucial components in assessing the performance of LLMs. Then, we summarize the success and failure cases of LLMs in different tasks. Finally, we shed light on several future challenges that lie ahead in LLM evaluation. Our aim is to offer invaluable insights to researchers in the realm of LLM evaluation, thereby aiding the development of more proficient LLMs. Our key point is that evaluation should be treated as an essential discipline to better assist the development of LLMs. We consistently maintain the related open-source materials at: https://github.com/MLGroupJLU/LLM-eval-survey.

* 25 pages; more work is at: https://llm-eval.github.io/

FedDefender: Client-Side Attack-Tolerant Federated Learning

Jul 18, 2023

Sungwon Park, Sungwon Han, Fangzhao Wu, Sundong Kim, Bin Zhu, Xing Xie, Meeyoung Cha

Abstract: Federated learning enables learning from decentralized data sources without compromising privacy, which makes it a crucial technique. However, it is vulnerable to model poisoning attacks, where malicious clients interfere with the training process. Previous defense mechanisms have focused on the server-side by using careful model aggregation, but this may not be effective when the data is not identically distributed or when attackers can access the information of benign clients. In this paper, we propose a new defense mechanism that focuses on the client-side, called FedDefender, to help benign clients train robust local models and avoid the adverse impact of malicious model updates from attackers, even when a server-side defense cannot identify or remove adversaries. Our method consists of two main components: (1) attack-tolerant local meta update and (2) attack-tolerant global knowledge distillation. These components are used to find noise-resilient model parameters while accurately extracting knowledge from a potentially corrupted global model. Our client-side defense strategy has a flexible structure and can work in conjunction with any existing server-side strategies. Evaluations of real-world scenarios across multiple datasets show that the proposed method enhances the robustness of federated learning against model poisoning attacks.

* KDD'23 research track accepted

FedSampling: A Better Sampling Strategy for Federated Learning

Jun 25, 2023

Tao Qi, Fangzhao Wu, Lingjuan Lyu, Yongfeng Huang, Xing Xie

Abstract: Federated learning (FL) is an important technique for learning models from decentralized data in a privacy-preserving way. Existing FL methods usually uniformly sample clients for local model learning in each round. However, different clients may have significantly different data sizes, and clients with more data do not get more opportunities to contribute to model training, which may lead to inferior performance. In this paper, instead of uniform client sampling, we propose a novel uniform data sampling strategy for federated learning (FedSampling), which can effectively improve the performance of federated learning, especially when the data size distribution is highly imbalanced across clients. In each federated learning round, local data on each client is randomly sampled for local model learning according to a probability based on the server's desired sample size and the total sample size on all available clients. Since the data size on each client is privacy-sensitive, we propose a privacy-preserving way to estimate the total sample size with a differential privacy guarantee. Experiments on four benchmark datasets show that FedSampling can effectively improve the performance of federated learning.

* IJCAI 2023
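
The per-round sampling step described in the abstract can be sketched as below. The abstract only says the probability is based on the server's desired sample size and the total sample size, so the specific ratio used here (desired size divided by total size, capped at 1) is an assumption, the differentially private estimate of the total size is omitted, and `sampling_probability` and `sample_local_data` are hypothetical names.

```python
import random


def sampling_probability(desired_size, total_size):
    """Assumed form: probability that any single local sample joins this round."""
    if total_size <= 0:
        raise ValueError("total_size must be positive")
    return min(1.0, desired_size / total_size)


def sample_local_data(local_data, prob, rng):
    """Keep each local sample independently with probability prob."""
    return [x for x in local_data if rng.random() < prob]


# Two clients with very different data sizes (illustrative numbers).
client_a = list(range(900))
client_b = list(range(100))
total = len(client_a) + len(client_b)

prob = sampling_probability(desired_size=100, total_size=total)
rng = random.Random(0)
round_a = sample_local_data(client_a, prob, rng)
round_b = sample_local_data(client_b, prob, rng)
```

With a shared probability, every data point has the same chance of being used regardless of which client holds it, so a client's expected contribution grows with its data size.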
