
Pchatbot: A Large-Scale Dataset for Personalized Chatbot

Sep 28, 2020
Xiaohe Li, Hanxun Zhong, Yu Guo, Yueyuan Ma, Hongjin Qian, Zhanliang Liu, Zhicheng Dou, Ji-Rong Wen




Natural language dialogue systems have attracted great attention recently. As many dialogue models are data-driven, high-quality datasets are essential to these systems. In this paper, we introduce Pchatbot, a large-scale dialogue dataset that contains two subsets collected from Weibo and judicial forums, respectively. Unlike existing datasets, which contain only post-response pairs, Pchatbot includes anonymized user IDs as well as timestamps. This enables the development of personalized dialogue models, which depend on the availability of users' historical conversations. Furthermore, Pchatbot is significantly larger than existing datasets, which might benefit data-driven models. Our preliminary experimental study shows that a personalized chatbot model trained on Pchatbot outperforms the corresponding ad-hoc chatbot models. We also demonstrate that using a larger dataset improves the quality of dialogue models.
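
To illustrate why user IDs and timestamps matter, the following is a minimal sketch of how per-user conversation histories could be assembled from such records. It is not the dataset's actual schema or the authors' code: the `Record` type, its field names, the integer timestamp format, and both helper functions are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Record:
    """One post-response pair (hypothetical layout, not the real schema)."""
    post: str
    response: str
    user_id: str    # anonymized ID of the responding user (assumed field)
    timestamp: int  # assumed sortable integer, e.g. Unix seconds


def build_histories(records: List[Record]) -> Dict[str, List[Record]]:
    """Group records by user, ordered from oldest to newest."""
    histories: Dict[str, List[Record]] = defaultdict(list)
    for record in sorted(records, key=lambda r: r.timestamp):
        histories[record.user_id].append(record)
    return dict(histories)


def history_before(
    histories: Dict[str, List[Record]], user_id: str, timestamp: int
) -> List[Record]:
    """Return the user's records strictly earlier than the given time.

    Restricting to earlier records keeps a personalized model from
    seeing responses the user had not yet written.
    """
    return [r for r in histories.get(user_id, []) if r.timestamp < timestamp]
```

A personalized model could then condition on `history_before(histories, user_id, t)` when generating a response at time `t`, whereas a dataset with only post-response pairs offers no such history.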

* 10 pages 




