Title: Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Large Language Models

May 26, 2023

Yao Yao, Zuchao Li, Hai Zhao

Abstract: With the widespread use of large language models (LLMs) in NLP tasks, researchers have discovered the potential of Chain-of-Thought (CoT) to assist LLMs in accomplishing complex reasoning tasks by generating intermediate steps. However, human thought processes are often non-linear rather than simple sequential chains of thoughts. Therefore, we propose Graph-of-Thought (GoT) reasoning, which models human thought processes not only as a chain but also as a graph. By representing thought units as nodes and the connections between them as edges, our approach captures the non-sequential nature of human thinking and allows for more realistic modeling of thought processes. Similar to Multimodal-CoT, we model GoT reasoning as a two-stage framework that generates rationales first and then produces the final answer. Specifically, we employ an additional graph-of-thoughts encoder for GoT representation learning and fuse the GoT representation with the original input representation through a gated fusion mechanism. We implement a GoT reasoning model on the T5 pre-trained model and evaluate its performance on a text-only reasoning task (GSM8K) and a multimodal reasoning task (ScienceQA). Our model achieves significant improvements over the strong CoT baseline of 3.41% and 5.08% on the GSM8K test set with the T5-base and T5-large architectures, respectively. Additionally, our model boosts accuracy over the state-of-the-art Multimodal-CoT on the ScienceQA test set from 84.91% to 91.54% using the T5-base model and from 91.68% to 92.77% using the T5-large model. Experiments show that GoT, with fewer than 250M backbone model parameters, achieves results comparable to Multimodal-CoT(large), which has over 700M parameters, demonstrating the effectiveness of GoT.
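
The abstract names two ingredients without giving formulas: a graph whose nodes are thought units, and a gated fusion of the graph representation with the original input representation. The following is a minimal sketch of how such pieces could fit together, not the authors' implementation. Everything in it is an assumption made for illustration: the helper names (`aggregate_neighbors`, `gated_fusion`), the mean-over-neighbors step standing in for the graph-of-thoughts encoder, and the gate form `sigmoid(W_text·h_text + W_graph·h_graph)` with a convex blend of the two vectors.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def aggregate_neighbors(features, edges):
    """One round of mean aggregation over each node and its neighbors.

    Hypothetical stand-in for a graph encoder: edges are treated as
    undirected and each node averages its own features with its neighbors'.
    """
    neighbors = {node: {node} for node in features}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    dim = len(next(iter(features.values())))
    return {
        node: [sum(features[m][i] for m in nbrs) / len(nbrs) for i in range(dim)]
        for node, nbrs in neighbors.items()
    }


def gated_fusion(h_text, h_graph, W_text, W_graph):
    """Blend two same-sized vectors with an element-wise gate (assumed form)."""
    scores = [a + b for a, b in zip(matvec(W_text, h_text), matvec(W_graph, h_graph))]
    gate = [sigmoid(s) for s in scores]
    # gate near 0 keeps the text vector, gate near 1 keeps the graph vector
    fused = [(1 - g) * t + g * s for g, t, s in zip(gate, h_text, h_graph)]
    return fused, gate


# Toy thought graph: thoughts t1 and t2 both connect to t3, so t3 has two
# incoming thoughts instead of a single predecessor.
features = {"t1": [1.0, 0.0], "t2": [0.0, 1.0], "t3": [1.0, 1.0]}
edges = [("t1", "t3"), ("t2", "t3")]
graph_repr = aggregate_neighbors(features, edges)

# Untrained (all-zero) gate weights: the gate is 0.5 in every dimension.
zeros = [[0.0, 0.0], [0.0, 0.0]]
fused, gate = gated_fusion(features["t3"], graph_repr["t3"], zeros, zeros)
print(fused, gate)
```

With all-zero gate weights the gate is 0.5 in every dimension, so the fused vector is the plain average of the two inputs. In a trained model the weights would let the gate lean toward the text or the graph representation per dimension.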
