Peng Xia

LMPT: Prompt Tuning with Class-Specific Embedding Loss for Long-tailed Multi-Label Visual Recognition

May 08, 2023
Peng Xia, Di Xu, Lie Ju, Ming Hu, Jun Chen, Zongyuan Ge

Figures 1–4 for LMPT: Prompt Tuning with Class-Specific Embedding Loss for Long-tailed Multi-Label Visual Recognition

Long-tailed multi-label visual recognition (LTML) is a highly challenging task due to label co-occurrence and imbalanced data distributions. In this work, we propose a unified framework for LTML, namely prompt tuning with class-specific embedding loss (LMPT), which captures semantic feature interactions between categories by combining text and image modalities and improves performance on both head and tail classes simultaneously. Specifically, LMPT introduces an embedding loss function with class-aware soft margins and re-weighting to learn class-specific contexts with the benefit of textual descriptions (captions), which helps establish semantic relationships between classes, especially between head and tail classes. Furthermore, to account for class imbalance, the distribution-balanced loss is adopted as the classification loss to further improve performance on tail classes without compromising head classes. Extensive experiments on the VOC-LT and COCO-LT datasets demonstrate that the proposed method significantly surpasses previous state-of-the-art methods and zero-shot CLIP on LTML. Our code is fully available at \url{https://github.com/richard-peng-xia/LMPT}.
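The core idea of a class-aware soft margin with re-weighting can be sketched as follows. This is a minimal, illustrative toy in plain NumPy — the margin schedule, weighting, and all names are assumptions, not the exact formulation from the LMPT paper:

```python
import numpy as np

def class_aware_margin_loss(img_emb, text_embs, labels, class_freqs,
                            base_margin=0.2, gamma=0.25):
    """Hinge-style embedding loss with a class-aware soft margin and
    inverse-frequency re-weighting. Tail (rare) classes get a larger
    margin and a larger weight, so their text and image embeddings are
    aligned more aggressively. Illustrative only."""
    img = np.asarray(img_emb, dtype=float)
    img = img / np.linalg.norm(img)
    total = 0.0
    for c, is_positive in enumerate(labels):
        txt = np.asarray(text_embs[c], dtype=float)
        txt = txt / np.linalg.norm(txt)
        sim = float(img @ txt)                           # cosine similarity
        margin = base_margin * class_freqs[c] ** -gamma  # larger for tail classes
        weight = 1.0 / class_freqs[c]                    # inverse-frequency re-weighting
        if is_positive:
            total += weight * max(0.0, margin - sim)          # pull positives together
        else:
            total += weight * max(0.0, sim - (1.0 - margin))  # push negatives apart
    return total
```

With a perfectly aligned positive embedding and a well-separated negative, the loss vanishes; misaligned tail-class positives incur the largest penalty.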


Chinese grammatical error correction based on knowledge distillation

Aug 05, 2022
Peng Xia, Yuechi Zhou, Ziyan Zhang, Zecheng Tang, Juntao Li

Figures 1–4 for Chinese grammatical error correction based on knowledge distillation

Existing Chinese grammatical error correction (GEC) models are not robust on adversarial (attack) test sets and have large parameter counts. This paper therefore uses knowledge distillation to compress the model and improve its resistance to attacks. On the data side, an attack test set is constructed by injecting perturbations into the standard evaluation set, and model robustness is measured on it. Experimental results show that the distilled small model preserves performance and trains faster despite having fewer parameters, achieves the best results on the attack test set, and is significantly more robust.
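The standard knowledge-distillation objective underlying this kind of compression can be sketched as below. This is the generic formulation (Hinton et al., 2015) in plain NumPy; the paper's GEC-specific seq2seq setup will differ:

```python
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, true_label, T=2.0, alpha=0.5):
    """Classic KD loss: cross-entropy on the gold label, plus KL divergence
    between the teacher's and student's temperature-softened distributions.
    Generic sketch, not the paper's exact objective."""
    p_teacher = softmax(np.asarray(teacher_logits, dtype=float) / T)
    p_student = softmax(np.asarray(student_logits, dtype=float) / T)
    kl = float(np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student))))
    hard = -float(np.log(softmax(student_logits)[true_label]))
    # T**2 rescales the soft-target gradients, as in the original KD paper
    return alpha * hard + (1.0 - alpha) * T**2 * kl
```

Setting `alpha=1.0` recovers plain cross-entropy training; lowering it shifts weight onto matching the teacher's softened output distribution.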

* This paper needs to be withdrawn at my advisor's request. We will submit a new version after revising it and translating it into English so that it can reach a wider audience. 

Latency-Aware Neural Architecture Search with Multi-Objective Bayesian Optimization

Jun 25, 2021
David Eriksson, Pierce I-Jen Chuang, Samuel Daulton, Peng Xia, Akshat Shrivastava, Arun Babu, Shicong Zhao, Ahmed Aly, Ganesh Venkatesh, Maximilian Balandat

Figures 1–4 for Latency-Aware Neural Architecture Search with Multi-Objective Bayesian Optimization

When tuning the architecture and hyperparameters of large machine learning models for on-device deployment, it is desirable to understand the optimal trade-offs between on-device latency and model accuracy. In this work, we leverage recent methodological advances in Bayesian optimization over high-dimensional search spaces and multi-objective Bayesian optimization to efficiently explore these trade-offs for a production-scale on-device natural language understanding model at Facebook.
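The latency–accuracy trade-offs the abstract refers to form a Pareto frontier. Multi-objective Bayesian optimization (e.g., via acquisition functions implemented in libraries such as BoTorch) searches for this frontier sample-efficiently; the sketch below merely extracts it from an already-evaluated set of configurations, with made-up numbers for illustration:

```python
def pareto_front(evals):
    """Return the non-dominated set over (latency_ms, error) pairs,
    both to be minimized. A config is dominated if some other config
    is at least as good on both objectives and strictly better on one."""
    front = []
    for i, (lat_i, err_i) in enumerate(evals):
        dominated = any(
            lat_j <= lat_i and err_j <= err_i and (lat_j, err_j) != (lat_i, err_i)
            for j, (lat_j, err_j) in enumerate(evals) if j != i
        )
        if not dominated:
            front.append((lat_i, err_i))
    return front

# Hypothetical (latency in ms, top-1 error) results for four architectures:
configs = [(10, 0.05), (20, 0.03), (15, 0.06), (25, 0.02)]
print(pareto_front(configs))  # (15, 0.06) is dominated by (10, 0.05)
```

The point of the BO machinery is to find such frontiers with far fewer model trainings and on-device latency measurements than exhaustive evaluation would require.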

* To Appear at the 8th ICML Workshop on Automated Machine Learning, ICML 2021 