Cheng Yang

A Stochastic Online Forecast-and-Optimize Framework for Real-Time Energy Dispatch in Virtual Power Plants under Uncertainty

Sep 15, 2023
Wei Jiang, Zhongkai Yi, Li Wang, Hanwei Zhang, Jihai Zhang, Fangquan Lin, Cheng Yang

Aggregating distributed energy resources in power systems significantly increases uncertainty, in particular due to the fluctuation of renewable energy generation. This issue has driven the need to widely exploit advanced predictive control techniques under uncertainty to ensure long-term economics and decarbonization. In this paper, we propose a real-time uncertainty-aware energy dispatch framework composed of two key elements: (i) a hybrid forecast-and-optimize sequential task, integrating deep learning-based forecasting and stochastic optimization, with the two stages connected by uncertainty estimation at multiple temporal resolutions; and (ii) an efficient online data augmentation scheme, jointly involving model pre-training and online fine-tuning stages. In this way, the proposed framework is able to rapidly adapt to the real-time data distribution, to target the uncertainties caused by data drift, model discrepancy, and environment perturbations in the control process, and ultimately to realize an optimal and robust dispatch solution. The proposed framework won the championship in the CityLearn Challenge 2022, which provided an influential opportunity to investigate the potential of AI applications in the energy domain. In addition, comprehensive experiments are conducted to demonstrate its effectiveness in the real-life scenario of smart building energy management.
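
To make the forecast-then-optimize loop concrete, here is a minimal sketch of the pattern the abstract describes: a probabilistic forecaster feeds scenario samples into a stochastic dispatch step, and the forecaster is continually refit on incoming observations. This is our own illustration, not the authors' implementation; the cost model, the battery action grid, and the windowed refitting scheme are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def forecast(history, n_scenarios=50):
    """Toy probabilistic forecast: recent mean plus Gaussian noise whose
    spread (the uncertainty estimate) reflects recent volatility."""
    window = history[-24:]
    return rng.normal(window.mean(), window.std() + 1e-6, size=n_scenarios)

def dispatch(scenarios, actions=np.linspace(-5.0, 5.0, 21)):
    """Stochastic optimization: choose the battery action minimizing the
    expected cost over the sampled net-load scenarios (toy quadratic cost)."""
    costs = [(np.maximum(scenarios + a, 0.0) ** 2).mean() for a in actions]
    return actions[int(np.argmin(costs))]

history = rng.normal(3.0, 1.0, size=200)        # offline pre-training data
for t in range(24):                             # online control loop
    action = dispatch(forecast(history))
    observed = rng.normal(3.0 + 0.1 * t, 1.0)   # environment with data drift
    history = np.append(history, observed)      # online adaptation: refit on
    print(f"t={t:02d} action={action:+.2f}")    # the freshest observations
```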

* Preprint. Accepted by CIKM 2023 

AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors in Agents

Aug 21, 2023
Weize Chen, Yusheng Su, Jingwei Zuo, Cheng Yang, Chenfei Yuan, Chen Qian, Chi-Min Chan, Yujia Qin, Yaxi Lu, Ruobing Xie, Zhiyuan Liu, Maosong Sun, Jie Zhou

Autonomous agents empowered by Large Language Models (LLMs) have undergone significant improvements, enabling them to generalize across a broad spectrum of tasks. However, in real-world scenarios, cooperation among individuals is often required to enhance the efficiency and effectiveness of task accomplishment. Hence, inspired by human group dynamics, we propose AgentVerse, a multi-agent framework that can collaboratively and dynamically adjust its composition as a greater-than-the-sum-of-its-parts system. Our experiments demonstrate that AgentVerse can effectively deploy multi-agent groups that outperform a single agent. Furthermore, we delve into the emergence of social behaviors among individual agents within a group during collaborative task accomplishment. In view of these behaviors, we discuss strategies for leveraging positive behaviors and mitigating negative ones to improve the collaborative potential of multi-agent groups. Our code for AgentVerse will soon be released at https://github.com/OpenBMB/AgentVerse.
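
As a rough illustration of the dynamic group composition the abstract mentions, the sketch below re-forms a team of role-playing agents each round and accumulates their proposals into a shared context. All names (Agent, recruit, collaborate) are hypothetical and the LLM call is stubbed out; the actual framework is in the linked repository.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    role: str

    def propose(self, task: str, context: list[str]) -> str:
        # Placeholder for an LLM call; a real agent would prompt a model
        # with its role description, the task, and the discussion so far.
        return f"[{self.role}] proposal for: {task}"

def recruit(task: str) -> list[Agent]:
    """Dynamically adjust group composition: choose roles per task."""
    roles = ["planner", "critic"] + (["coder"] if "code" in task else [])
    return [Agent(r) for r in roles]

def collaborate(task: str, rounds: int = 2) -> list[str]:
    context: list[str] = []
    for _ in range(rounds):
        agents = recruit(task)  # re-form the team each round
        context += [a.propose(task, context) for a in agents]
    return context

print("\n".join(collaborate("write code for a calculator")))
```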

* Work in progress 

Deep Learning-Based Knowledge Injection for Metaphor Detection: A Comprehensive Review

Aug 15, 2023
Cheng Yang, Wenye Zhao, Zhiyue Liu, Qingbao Huang

The history of metaphor research also marks the evolution of knowledge-infusion research. With the continued advancement of deep learning techniques in recent years, the natural language processing community has shown great interest in applying external knowledge to metaphor recognition tasks, with successful results. Although the number of approaches involving knowledge injection in metaphor recognition has gradually increased, a complete review of knowledge injection-based approaches is still lacking. Therefore, the goal of this paper is to provide a comprehensive review of research advances in applying deep learning-based knowledge injection to metaphor recognition tasks. We systematically summarize the mainstream knowledge types and knowledge injection principles, and review the datasets, evaluation metrics, and benchmark models used in metaphor recognition tasks. Finally, we explore the issues currently facing knowledge injection methods and provide an outlook on future research directions.

* 15 pages 

Does Correction Remain A Problem For Large Language Models?

Aug 14, 2023
Xiaowu Zhang, Xiaotian Zhang, Cheng Yang, Hang Yan, Xipeng Qiu

As large language models (LLMs) such as GPT continue to advance the capabilities of natural language processing (NLP), the question arises: does the problem of correction still persist? This paper investigates the role of correction in the context of large language models through two experiments. The first experiment treats correction as a standalone task, employing few-shot learning techniques with GPT-like models for error correction. The second experiment explores correction as a preparatory task for other NLP tasks, examining whether large language models can tolerate and perform adequately on texts containing certain levels of noise or errors. Through these experiments, we aim to shed light on the significance of correction in the era of large language models and its implications for various NLP applications.
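
As an illustration of the first experiment's setup, the sketch below assembles a few-shot prompt for error correction. The prompt format, the example pairs, and the complete() callback are our own assumptions; any GPT-like text-completion model could be plugged in.

```python
# Minimal few-shot prompted error correction, assuming a text-completion
# function `complete()` (hypothetical stand-in for a GPT-like model).
FEW_SHOT = [
    ("She go to school yesterday.", "She went to school yesterday."),
    ("He have two cat.", "He has two cats."),
]

def build_prompt(sentence: str) -> str:
    shots = "\n".join(f"Input: {src}\nCorrected: {tgt}" for src, tgt in FEW_SHOT)
    return f"Correct the grammatical errors.\n{shots}\nInput: {sentence}\nCorrected:"

def correct(sentence: str, complete) -> str:
    return complete(build_prompt(sentence)).strip()

# Example with a dummy "model" that merely echoes the last input sentence:
echo = lambda p: p.rsplit("Input: ", 1)[1].split("\n")[0]
print(correct("They was happy.", complete=echo))
```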


AutoConv: Automatically Generating Information-seeking Conversations with Large Language Models

Aug 12, 2023
Siheng Li, Cheng Yang, Yichun Yin, Xinyu Zhu, Zesen Cheng, Lifeng Shang, Xin Jiang, Qun Liu, Yujiu Yang

Information-seeking conversation, which aims to help users gather information through dialogue, has achieved great progress in recent years. However, research is still stymied by the scarcity of training data. To alleviate this problem, we propose AutoConv for synthetic conversation generation, which takes advantage of the few-shot learning ability and generation capacity of large language models (LLMs). Specifically, we formulate conversation generation as a language modeling task, fine-tune an LLM on a few human conversations to capture the characteristics of the information-seeking process, and then use it to generate high-quality synthetic conversations. Experimental results on two frequently used datasets verify that AutoConv achieves substantial improvements over strong baselines and alleviates the dependence on human annotation. In addition, we provide several analyses to promote future research.
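
A minimal sketch of the data preparation this implies: each seed human conversation is serialized into a single training string so that conversation generation becomes ordinary language modeling. The serialization format and tags here are illustrative assumptions, not the paper's.

```python
def serialize(conversation: list[dict]) -> str:
    """Flatten one conversation into a language-modeling training string."""
    lines = [f"{turn['speaker']}: {turn['text']}" for turn in conversation]
    return "<conv>\n" + "\n".join(lines) + "\n</conv>"

seed_conversations = [
    [{"speaker": "User", "text": "Who wrote The Old Man and the Sea?"},
     {"speaker": "System", "text": "Ernest Hemingway; it was published in 1952."}],
]

training_texts = [serialize(c) for c in seed_conversations]
print(training_texts[0])
# After fine-tuning an LLM on such strings, sampling continuations of
# "<conv>\nUser:" yields synthetic information-seeking conversations.
```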

* Accepted to ACL 2023 Main Conference (Short) 

NewsDialogues: Towards Proactive News Grounded Conversation

Aug 12, 2023
Siheng Li, Yichun Yin, Cheng Yang, Wangjie Jiang, Yiwei Li, Zesen Cheng, Lifeng Shang, Xin Jiang, Qun Liu, Yujiu Yang

Hot news is one of the most popular topics in daily conversations. However, news-grounded conversation has long been stymied by the lack of a well-designed task definition and by scarce data. In this paper, we propose a novel task, Proactive News Grounded Conversation, in which a dialogue system proactively leads the conversation based on key topics of the news. In addition, both information-seeking and chit-chat scenarios are included realistically: the user may ask a series of questions about the news details, or may express opinions and simply want to chat. To further develop this novel task, we collect a human-to-human Chinese dialogue dataset, NewsDialogues, which includes 1K conversations with a total of 14.6K utterances and detailed annotations for target topics and knowledge spans. Furthermore, we propose a method named Predict-Generate-Rank, consisting of a generator for grounded knowledge prediction and response generation, and a ranker that ranks multiple candidate responses to alleviate exposure bias. We conduct comprehensive experiments to demonstrate the effectiveness of the proposed method, and we present several key findings and challenges to prompt future research.
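
To make the Predict-Generate-Rank structure concrete, here is a schematic pipeline with trivial stand-ins where the paper uses trained models; every function body below is a placeholder assumption, not the proposed method.

```python
def predict_knowledge(news: str, history: list[str]) -> str:
    # A trained generator would predict the knowledge span to ground on;
    # here we naively take the first sentence of the article.
    return news.split(". ")[0]

def generate_candidates(knowledge: str, history: list[str], k: int = 3) -> list[str]:
    # A trained generator would produce k diverse grounded responses.
    return [f"Did you know that {knowledge.lower()}? (candidate {i})" for i in range(k)]

def rank(candidates: list[str], history: list[str]) -> str:
    # A trained ranker would score candidates against the dialogue context;
    # we approximate with a trivial length heuristic.
    return max(candidates, key=len)

news = "The spacecraft landed safely. Crews recovered all samples."
history = ["User: Any space news today?"]
knowledge = predict_knowledge(news, history)
print(rank(generate_candidates(knowledge, history), history))
```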

* Accepted to ACL 2023 Conference (Long Paper; Findings) 

Communicative Agents for Software Development

Jul 18, 2023
Chen Qian, Xin Cong, Cheng Yang, Weize Chen, Yusheng Su, Juyuan Xu, Zhiyuan Liu, Maosong Sun

Software engineering is a domain characterized by intricate decision-making processes, often relying on nuanced intuition and consultation. Recent advancements in deep learning have started to revolutionize software engineering practices through elaborate designs implemented at various stages of software development. In this paper, we present an innovative paradigm that leverages large language models (LLMs) throughout the entire software development process, streamlining and unifying key processes through natural language communication and thereby eliminating the need for specialized models at each phase. At the core of this paradigm lies ChatDev, a virtual chat-powered software development company that mirrors the established waterfall model, meticulously dividing the development process into four distinct chronological stages: designing, coding, testing, and documenting. Each stage engages a team of agents, such as programmers, code reviewers, and test engineers, fostering collaborative dialogue and facilitating a seamless workflow. The chat chain acts as a facilitator, breaking each stage down into atomic subtasks; within each subtask, two roles propose and validate solutions through context-aware communication, leading to its efficient resolution. Our analysis of ChatDev highlights its remarkable efficacy in software generation, enabling the completion of the entire software development process in under seven minutes at a cost of less than one dollar. It not only identifies and alleviates potential vulnerabilities but also rectifies potential hallucinations while maintaining commendable efficiency and cost-effectiveness. The potential of ChatDev unveils fresh possibilities for integrating LLMs into the realm of software development.
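
A schematic of the chat-chain control flow described above, under the assumption that each stage pairs a proposing role with a validating role; the talk() stub stands in for the LLM-driven dialogue, and the exact role pairings are our guess from the abstract, not ChatDev's actual configuration.

```python
# Stage -> (proposing role, validating role); pairings are illustrative.
STAGES = {
    "designing":   ("CEO", "CTO"),
    "coding":      ("programmer", "code reviewer"),
    "testing":     ("programmer", "test engineer"),
    "documenting": ("CTO", "programmer"),
}

def talk(proposer: str, validator: str, subtask: str) -> str:
    # Placeholder: a real implementation alternates LLM turns until the
    # validator accepts the proposer's solution for this atomic subtask.
    return f"{proposer} -> {validator}: resolved '{subtask}'"

def run_chat_chain(task: str) -> list[str]:
    log = []
    for stage, (proposer, validator) in STAGES.items():
        subtask = f"{stage} for {task}"   # each stage becomes an atomic subtask
        log.append(talk(proposer, validator, subtask))
    return log

print("\n".join(run_chat_chain("a gomoku game")))
```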

* 25 pages, 9 figures, 2 tables 

An Efficient Virtual Data Generation Method for Reducing Communication in Federated Learning

Jun 29, 2023
Cheng Yang, Xue Yang, Dongxian Wu, Xiaohu Tang

Communication overhead is one of the major challenges in Federated Learning (FL). A few classical schemes assume the server can extract auxiliary information about the participants' training data from the local models in order to construct a central dummy dataset. The server then uses the dummy dataset to fine-tune the aggregated global model, reaching the target test accuracy in fewer communication rounds. In this paper, we summarize these solutions into a data-based communication-efficient FL framework. The key to the proposed framework is designing an efficient extraction module (EM) that ensures the dummy dataset has a positive effect on fine-tuning the aggregated global model. Unlike existing methods that use a generator to build the EM, our proposed method, FedINIBoost, borrows the idea of gradient matching to construct the EM. Specifically, FedINIBoost builds a proxy dataset of the real dataset in two steps for each participant at each communication round. The server then aggregates all the proxy datasets to form a central dummy dataset, which is used to fine-tune the aggregated global model. Extensive experiments verify the superiority of our method over the classical methods FedAvg, FedProx, MOON, and FedFTG. Moreover, FedINIBoost plays a significant role in fine-tuning the performance of the aggregated global model at the initial stage of FL.
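
Since FedINIBoost's extraction module is built on gradient matching, the sketch below shows the core idea on a toy linear-regression model: optimize a small synthetic batch until its gradient at the current global model matches the gradient computed on the real local data. The model, loss, and update rule are simplifying assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=3)                                   # current global model

X_real = rng.normal(size=(100, 3))                       # a participant's private data
y_real = X_real @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)
g_real = X_real.T @ (X_real @ w - y_real) / len(X_real)  # true local gradient

X_syn = rng.normal(size=(10, 3))                         # small proxy dataset
y_syn = rng.normal(size=10)
n, lr = len(X_syn), 0.05
for _ in range(2000):                                    # descend on the matching loss
    r = X_syn @ w - y_syn
    d = X_syn.T @ r / n - g_real                         # gradient mismatch
    gX = (np.outer(r, d) + np.outer(X_syn @ d, w)) / n   # dL/dX_syn
    gy = -(X_syn @ d) / n                                # dL/dy_syn
    X_syn -= lr * gX
    y_syn -= lr * gy

print("mismatch:", np.linalg.norm(X_syn.T @ (X_syn @ w - y_syn) / n - g_real))
# The server would aggregate such proxy sets from all participants into a
# central dummy dataset and use it to fine-tune the aggregated model.
```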

* There are errors in the experimental settings in our paper 