Abstract:Recently, the advent of large language models (LLMs) has revolutionized generative agents. Among them, Role-Playing Conversational Agents (RPCAs) attract considerable attention due to their ability to emotionally engage users. However, the absence of a comprehensive benchmark impedes progress in this field. To bridge this gap, we introduce CharacterEval, a Chinese benchmark for comprehensive RPCA assessment, complemented by a tailored high-quality dataset. The dataset comprises 1,785 multi-turn role-playing dialogues, encompassing 23,020 examples and featuring 77 characters derived from Chinese novels and scripts. It was carefully constructed, beginning with initial dialogue extraction via GPT-4, followed by rigorous human-led quality control, and enhanced with in-depth character profiles sourced from Baidu Baike. CharacterEval employs a multifaceted evaluation approach, encompassing thirteen targeted metrics on four dimensions. Comprehensive experiments on CharacterEval demonstrate that Chinese LLMs exhibit more promising capabilities than GPT-4 in Chinese role-playing conversation. Source code, data source and reward model will be publicly accessible at https://github.com/morecry/CharacterEval.
Abstract:The advent of deep learning (DL)-based models has significantly advanced Channel State Information (CSI) feedback mechanisms in wireless communication systems. However, traditional approaches often suffer from high communication overhead and potential privacy risks due to the centralized nature of CSI data processing. To address these challenges, we design a CSI feedback training framework called Dig-CSI, in which the dataset for training the CSI feedback model is produced by the distributed generators uploaded by each user equipment (UE), but not through local data upload. Each UE trains an autoencoder, where the decoder is considered as the distributed generator, with local data to gain reconstruction accuracy and the ability to generate. Experimental results show that Dig-CSI can train a global CSI feedback model with comparable performance to the model trained with classical centralized learning with a much lighter communication overhead.