Abstract: Training Large Language Models (LLMs) to generate explicit reasoning before producing an answer has been shown to improve their performance across tasks such as mathematics and coding. However, fine-tuning LLMs on multi-turn reasoning datasets presents a unique challenge: the LLM must generate reasoning tokens that are excluded from subsequent inputs to the model. This discrepancy prevents us from processing an entire conversation in a single forward pass, an optimization readily available when fine-tuning on a multi-turn non-reasoning dataset. This paper proposes a novel approach that overcomes this limitation through response token duplication and a custom attention mask that enforces appropriate visibility constraints. Our approach significantly reduces training time and enables efficient fine-tuning on multi-turn reasoning datasets.
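To make the masking idea concrete, here is a minimal sketch of how such a mask could be built; it is not the paper's implementation, and the segment tags (`user`, `reason`, `ans_dup`, `ans_ctx`) and the `build_mask` helper are illustrative assumptions. Each answer appears twice in the flattened conversation: a duplicated copy that follows its reasoning (used for the training loss), and a "clean" copy without reasoning that serves as context for later turns.

```python
# Minimal sketch (not the authors' code) of the masking idea: each token of a
# flattened multi-turn conversation carries a (kind, turn) tag; answer tokens
# appear twice, and later turns may only attend to the reasoning-free copies.
import torch

# Hypothetical segment kinds per token, in token order:
#   "user"    -> user message
#   "reason"  -> reasoning tokens, hidden from all future turns
#   "ans_dup" -> duplicated answer, trained with reasoning visible, hidden later
#   "ans_ctx" -> clean answer copy, visible to future turns, no loss computed
def build_mask(segments):
    """segments: list of (kind, turn) per token. Returns a bool mask where
    mask[i, j] == True means token i may attend to token j."""
    n = len(segments)
    mask = torch.zeros(n, n, dtype=torch.bool)
    for i, (kind_i, turn_i) in enumerate(segments):
        for j, (kind_j, turn_j) in enumerate(segments):
            if j > i:
                continue  # causal: never attend to future tokens
            if turn_j < turn_i:
                # Earlier turns: only user text and clean answer copies are
                # visible; reasoning and duplicated answers stay hidden.
                mask[i, j] = kind_j in ("user", "ans_ctx")
            else:
                # Same turn: reasoning and the duplicated answer see all
                # tokens so far; the clean copy must not see this turn's
                # reasoning or duplicated answer.
                if kind_i == "ans_ctx":
                    mask[i, j] = kind_j in ("user", "ans_ctx")
                else:
                    mask[i, j] = True
    return mask
```

With a mask of this shape, the whole conversation can be packed into one forward pass while each token still sees exactly the context it would have seen at inference time.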
Abstract: We present an active learning strategy for training parametric models of distance metrics, given triplet-based similarity assessments: object $x_i$ is more similar to object $x_j$ than to $x_k$. In contrast to prior work on class-based learning, where the fundamental goal is classification and any implicit or explicit metric is binary, we focus on {\em perceptual} metrics that express the {\em degree} of (dis)similarity between objects. We find that standard active learning approaches degrade when annotations are requested for {\em batches} of triplets at a time: our studies suggest that correlation among triplets is responsible. In this work, we propose a novel method to {\em decorrelate} batches of triplets that jointly balances informativeness and diversity while decoupling the choice of heuristic for each criterion. Experiments indicate that our method is general, adaptable, and outperforms the state of the art.
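The following toy sketch illustrates one way such a decoupled selection could look; it is our reading of the idea, not the paper's algorithm. The informativeness scores (`info`), the triplet `similarity` function, and the trade-off weight `lam` are all illustrative assumptions, and each heuristic can be swapped independently, mirroring the decoupling described above.

```python
# Toy sketch of greedy batch selection that trades informativeness against
# correlation with triplets already chosen for the batch.
import numpy as np

def select_batch(candidates, info, similarity, batch_size, lam=0.5):
    """candidates: list of triplets (i, j, k); info[t] is any informativeness
    heuristic (e.g. model uncertainty on triplet t); similarity(a, b) measures
    correlation between two triplets. lam weights diversity vs. informativeness."""
    chosen, remaining = [], list(range(len(candidates)))
    while remaining and len(chosen) < batch_size:
        def score(t):
            # Penalize the candidate's worst-case redundancy with the batch.
            redund = max((similarity(candidates[t], candidates[s])
                          for s in chosen), default=0.0)
            return info[t] - lam * redund
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return [candidates[t] for t in chosen]

# Example with a crude similarity: fraction of objects shared by two triplets.
def shared_objects(a, b):
    return len(set(a) & set(b)) / 3.0

rng = np.random.default_rng(0)
cands = [tuple(rng.choice(50, size=3, replace=False)) for _ in range(200)]
uncertainty = rng.random(200)  # stand-in for a real informativeness heuristic
batch = select_batch(cands, uncertainty, shared_objects, batch_size=10)
print(batch)
```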