Ashita Saxena

Robustness and Reasoning Fidelity of Large Language Models in Long-Context Code Question Answering

Feb 19, 2026

Kishan Maharaj, Nandakishore Menon, Ashita Saxena, Srikanth Tamilselvam

Abstract: Large language models (LLMs) increasingly assist software engineering tasks that require reasoning over long code contexts, yet their robustness under varying input conditions remains unclear. We conduct a systematic study of long-context code question answering using controlled ablations that test sensitivity to answer format, distractors, and context scale. Extending the LongCodeBench Python dataset with new COBOL and Java question-answer sets, we evaluate state-of-the-art models under three settings: (i) shuffled multiple-choice options, (ii) open-ended questions, and (iii) needle-in-a-haystack contexts containing relevant and adversarially irrelevant information. Results show substantial performance drops in both shuffled multiple-choice options and open-ended questions, and brittle behavior in the presence of irrelevant cues. Our findings highlight limitations of current long-context evaluations and provide a broader benchmark for assessing code reasoning in both legacy and modern systems.

* 11 pages, 4 Figures, 5 Tables, Work in Progress
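
The shuffled multiple-choice setting mentioned in the abstract above could be set up roughly as follows. This is only a minimal sketch, not the authors' evaluation code: the function names `shuffle_options` and `format_question`, the seed handling, and the letter labels are all assumptions made for illustration.

```python
import random
import string


def shuffle_options(options, answer_index, seed):
    """Shuffle answer options reproducibly and track where the correct one lands.

    Hypothetical helper: returns (shuffled_options, new_answer_index).
    """
    rng = random.Random(seed)           # local RNG so results are repeatable per seed
    order = list(range(len(options)))   # order[k] = original index shown at position k
    rng.shuffle(order)
    shuffled = [options[i] for i in order]
    return shuffled, order.index(answer_index)


def format_question(question, options):
    """Render a question with lettered options (A, B, C, ...)."""
    lines = [question]
    for label, text in zip(string.ascii_uppercase, options):
        lines.append(f"{label}. {text}")
    return "\n".join(lines)
```

Scoring a model on several seeds of the same item and comparing against the unshuffled accuracy would then indicate how much the model depends on option position rather than option content.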

Mental Disorder Classification via Temporal Representation of Text

Jun 15, 2024

Raja Kumar, Kishan Maharaj, Ashita Saxena, Pushpak Bhattacharyya

Abstract: Mental disorders pose a global challenge, aggravated by the shortage of qualified mental health professionals. Mental disorder prediction from social media posts by current LLMs is challenging due to the complexities of sequential text data and the limited context length of language models. Current language model-based approaches split a single data instance into multiple chunks to compensate for limited context size. The predictive model is then applied to each chunk individually, and the most-voted output is selected as the final prediction. This results in the loss of inter-post dependencies and important time-variant information, leading to poor performance. We propose a novel framework that first compresses the large sequence of chronologically ordered social media posts into a series of numbers. We then use this time-variant representation for mental disorder classification. We demonstrate the generalization capabilities of our framework by outperforming the current SOTA in three different mental conditions: depression, self-harm, and anorexia, with an absolute improvement of 5% in the F1 score. We investigate the situation where current data instances fall within the context length of language models and present empirical results highlighting the importance of temporal properties of textual data. Furthermore, we utilize the proposed framework for a cross-domain study, exploring commonalities across disorders and the possibility of inter-domain data usage.

* RK and KM contributed equally to this work, 15 pages, 5 figures, 9 tables
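
The chunk-and-vote baseline that the abstract above criticizes can be sketched as below. This illustrates only that baseline, not the proposed temporal framework. The names `chunk_posts`, `majority_vote_predict`, and `toy_classifier` are hypothetical, and the keyword-based classifier is a stand-in for a real language model.

```python
from collections import Counter


def chunk_posts(posts, chunk_size):
    """Split a chronologically ordered list of posts into fixed-size chunks."""
    return [posts[i:i + chunk_size] for i in range(0, len(posts), chunk_size)]


def majority_vote_predict(posts, classify_chunk, chunk_size):
    """Classify each chunk independently and return the most frequent label.

    Each chunk is scored in isolation, so any dependency between posts
    in different chunks is invisible to the classifier.
    """
    if not posts:
        raise ValueError("need at least one post")
    votes = [classify_chunk(chunk) for chunk in chunk_posts(posts, chunk_size)]
    return Counter(votes).most_common(1)[0][0]


def toy_classifier(chunk):
    """Stand-in classifier (assumption): label 1 if any post mentions 'sad'."""
    return int(any("sad" in post for post in chunk))
```

Because the vote only counts labels, the order in which the chunks occurred has no effect on the result, which is the loss of time-variant information the abstract describes.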
