
Black-Box Tuning for Language-Model-as-a-Service

Jan 10, 2022
Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu




Extremely large pre-trained language models (PTMs) such as GPT-3 are usually released as a service, allowing users to design task-specific prompts and query the PTMs through black-box APIs. In such a scenario, which we call Language-Model-as-a-Service (LMaaS), gradients of the PTMs are usually not available. Can we optimize the task prompts by accessing only the model inference APIs? Based on recent observations that large PTMs have a very low intrinsic dimensionality, this work proposes Black-Box Tuning, which optimizes prompts for PTMs with derivative-free algorithms. In particular, we use CMA-ES to optimize the continuous prompt prepended to the input text by iteratively calling PTM inference APIs. Our experimental results demonstrate that black-box tuning with RoBERTa on a few labeled samples not only significantly outperforms manual prompts and GPT-3's in-context learning, but also surpasses its gradient-based counterparts, namely prompt tuning and full model tuning.
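
The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: instead of CMA-ES it uses a much simpler evolution strategy (isotropic Gaussian sampling, no covariance adaptation), and `black_box_loss` is a hypothetical stand-in for a real inference API that would return a loss on a few labeled samples. The fixed random projection `A`, the dimensions, and all hyperparameters are assumptions chosen only to show the idea of searching a low-dimensional vector `z` without gradients.

```python
import random


def make_projection(prompt_dim, low_dim, rng):
    """Fixed random matrix A (prompt_dim x low_dim); it is never trained."""
    scale = 1.0 / low_dim ** 0.5
    return [[rng.gauss(0.0, scale) for _ in range(low_dim)]
            for _ in range(prompt_dim)]


def project(A, z):
    """Map the low-dimensional vector z to a flat prompt embedding A @ z."""
    return [sum(a * x for a, x in zip(row, z)) for row in A]


def black_box_loss(prompt):
    """Hypothetical stand-in for one inference API call.

    A real service would prepend the prompt to the input text and return
    a loss computed from the model's predictions; no gradients are exposed.
    """
    return sum((p - 0.5) ** 2 for p in prompt)


def tune(loss_fn, prompt_dim=16, low_dim=4, popsize=20, n_parents=5,
         sigma=0.3, generations=30, seed=0):
    """Derivative-free search over z using only loss_fn evaluations."""
    rng = random.Random(seed)
    A = make_projection(prompt_dim, low_dim, rng)
    mean = [0.0] * low_dim
    best_z, best_loss = list(mean), loss_fn(project(A, mean))
    for _ in range(generations):
        # Sample candidate z vectors around the current mean.
        candidates = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
                      for _ in range(popsize)]
        # Each evaluation corresponds to one (simulated) API query.
        scored = sorted(((loss_fn(project(A, z)), z) for z in candidates),
                        key=lambda pair: pair[0])
        if scored[0][0] < best_loss:
            best_loss, best_z = scored[0]
        # Move the mean to the average of the best candidates.
        parents = [z for _, z in scored[:n_parents]]
        mean = [sum(col) / n_parents for col in zip(*parents)]
    return A, best_z, best_loss


A, best_z, best_loss = tune(black_box_loss)
print(len(best_z), best_loss)
```

In a real setting, `loss_fn` would wrap the service's inference endpoint, and a full CMA-ES implementation would also adapt the step size and covariance of the sampling distribution.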

* Work in progress 




