Task-oriented dialogue systems use four connected modules: Natural Language Understanding (NLU), Dialogue State Tracker (DST), Dialogue Policy (DP), and Natural Language Generator (NLG). A research challenge is to learn each module from as few samples as possible (i.e., few-shot learning), given the high cost of data collection. The most common and effective technique for this problem is transfer learning, where large language models, pre-trained on either general text or task-specific data, are fine-tuned on the few available samples. These methods require fine-tuning steps and a separate set of parameters for each task. In contrast, language models such as GPT-2 (Radford et al., 2019) and GPT-3 (Brown et al., 2020) allow few-shot learning by priming the model with a few examples, without any parameter updates. In this paper, we evaluate the few-shot ability of language models such as GPT-2 when primed for the NLU, DST, DP, and NLG tasks. Importantly, we highlight the current limitations of this approach and discuss the possible implications for future work.
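To make the notion of priming concrete, the following minimal sketch shows how a few labelled examples and an unlabelled query might be concatenated into a single prompt that a language model then completes. The function name `build_priming_prompt`, the `Utterance`/`Intent` template, and the example utterances and labels are all illustrative assumptions, not the prompt format used in this paper.

```python
from typing import List, Tuple


def build_priming_prompt(
    examples: List[Tuple[str, str]],
    query: str,
    input_tag: str = "Utterance",   # hypothetical template field
    output_tag: str = "Intent",     # hypothetical template field
) -> str:
    """Concatenate labelled examples and an unlabelled query into one prompt."""
    shots = [f"{input_tag}: {x}\n{output_tag}: {y}" for x, y in examples]
    # The final block leaves the output slot empty for the model to complete.
    shots.append(f"{input_tag}: {query}\n{output_tag}:")
    return "\n\n".join(shots)


# Made-up few-shot examples for an NLU-style intent task (illustrative only).
few_shots = [
    ("book a table for two tonight", "restaurant_booking"),
    ("what's the weather in Paris", "weather_query"),
]
prompt = build_priming_prompt(few_shots, "reserve a taxi to the airport")
print(prompt)
```

The resulting string would be passed as the input prefix to a language model such as GPT-2, and the generated continuation read off as the prediction. No model parameters are updated, so the same model can in principle serve all four modules with different prompts.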