
TikTalk: A Multi-Modal Dialogue Dataset for Real-World Chitchat

Jan 14, 2023
Hongpeng Lin, Ludan Ruan, Wenke Xia, Peiyu Liu, Jingyuan Wen, Yixin Xu, Di Hu, Ruihua Song, Wayne Xin Zhao, Qin Jin, Zhiwu Lu

We present a novel multi-modal chitchat dialogue dataset, TikTalk, aimed at facilitating research on intelligent chatbots. It consists of videos and the corresponding dialogues that users generate on video social applications. In contrast to existing multi-modal dialogue datasets, we construct dialogue corpora from video comment-reply pairs, which more closely resemble chitchat in real-world dialogue scenarios. Our dialogue context spans three modalities: text, vision, and audio. Compared with previous image-based dialogue datasets, the richer sources of context in TikTalk lead to a greater diversity of conversations. TikTalk contains over 38K videos and 367K dialogues. Data analysis shows that responses in TikTalk correlate with various contexts and with external knowledge, which poses a great challenge for deep multi-modal understanding and response generation. We evaluate several baselines with three types of automatic metrics and conduct case studies. Experimental results demonstrate that there is still large room for improvement on TikTalk. Our dataset is available at https://github.com/RUC-AIMind/TikTalk.
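As a rough illustration of how such a dataset might be organized, the sketch below models one dialogue instance with its three context modalities (the video carries vision and audio, the caption carries text) and a comment-reply turn chain. All field and class names here are illustrative assumptions, not the dataset's actual schema; consult the GitHub repository above for the released format.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical record layout for one TikTalk-style dialogue instance.
# Names are assumptions for illustration only; the real schema is defined
# by the release at https://github.com/RUC-AIMind/TikTalk.

@dataclass
class Turn:
    role: str   # "comment" or "reply" in the comment-reply chain
    text: str   # the utterance text

@dataclass
class Dialogue:
    video_id: str                    # identifier of the source short video
    video_path: str                  # local path to the video (visual + audio context)
    caption: str                     # textual description posted with the video
    turns: List[Turn] = field(default_factory=list)  # dialogue built from comment-reply pairs

def example_dialogue() -> Dialogue:
    """Build a toy instance showing how the multi-modal context and turns fit together."""
    return Dialogue(
        video_id="0001",
        video_path="videos/0001.mp4",
        caption="Cooking a family dinner",
        turns=[
            Turn(role="comment", text="The dumplings look delicious!"),
            Turn(role="reply", text="Thanks, it's my grandmother's recipe."),
        ],
    )

if __name__ == "__main__":
    d = example_dialogue()
    print(f"{d.video_id}: {len(d.turns)} turns, video at {d.video_path}")
```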
