Roy Eisenstadt

Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs

Jun 08, 2025

Roy Eisenstadt, Itamar Zimerman, Lior Wolf

Abstract: Recently, techniques such as explicit structured reasoning have demonstrated strong test-time scaling behavior by enforcing a separation between the model's internal "thinking" process and the final response. A key factor influencing answer quality in this setting is the length of the thinking stage. When the reasoning is too short, the model may fail to capture the complexity of the task. Conversely, when it is too long, the model may overthink, leading to unnecessary computation and degraded performance. This paper explores and exploits the underlying mechanisms by which LLMs understand and regulate the length of their reasoning during explicit thought processes. First, we show that LLMs encode their progress through the reasoning process and introduce an interactive progress bar visualization, which is then used to reveal insights on the model's planning dynamics. Second, we manipulate the internal progress encoding during inference to reduce unnecessary steps and generate a more concise and decisive chain of thoughts. Our empirical results demonstrate that this "overclocking" method mitigates overthinking, improves answer accuracy, and reduces inference latency. Our code is publicly available.

An End-to-End Dialogue Summarization System for Sales Calls

Apr 28, 2022

Abedelkadir Asi, Song Wang, Roy Eisenstadt, Dean Geckt, Yarin Kuper, Yi Mao, Royi Ronen

Abstract: Summarizing sales calls is a routine task performed manually by salespeople. We present a production system that combines generative models fine-tuned for the customer-agent setting with a human-in-the-loop user experience for interactive summary curation. We address challenging aspects of the dialogue summarization task in a real-world setting, including long input dialogues, content validation, lack of labeled data, and quality evaluation. We show how GPT-3 can be leveraged as an offline data labeler to handle training data scarcity and accommodate privacy constraints in an industrial setting. Experiments show significant improvements by our models in tackling the summarization and content validation tasks on public datasets.

* To be published in NAACL 2022
