Maisha Maliha

Hessian-Enhanced Token Attribution (HETA): Interpreting Autoregressive LLMs

Apr 14, 2026

Vishal Pramanik, Maisha Maliha, Nathaniel D. Bastian, Sumit Kumar Jha

Abstract: Attribution methods seek to explain language model predictions by quantifying the contribution of input tokens to generated outputs. However, most existing techniques are designed for encoder-based architectures and rely on linear approximations that fail to capture the causal and semantic complexities of autoregressive generation in decoder-only models. To address these limitations, we propose Hessian-Enhanced Token Attribution (HETA), a novel attribution framework tailored for decoder-only language models. HETA combines three complementary components: a semantic transition vector that captures token-to-token influence across layers, Hessian-based sensitivity scores that model second-order effects, and KL divergence to measure information loss when tokens are masked. This unified design produces context-aware, causally faithful, and semantically grounded attributions. Additionally, we introduce a curated benchmark dataset for systematically evaluating attribution quality in generative settings. Empirical evaluations across multiple models and datasets demonstrate that HETA consistently outperforms existing methods in attribution faithfulness and alignment with human annotations, establishing a new standard for interpretability in autoregressive language models.

* Accepted at ICLR 2026
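
The abstract above mentions using KL divergence to measure information loss when tokens are masked. A minimal sketch of that one idea follows; it is not the authors' implementation and omits the semantic transition vector and Hessian terms entirely. The names `masking_scores`, `toy_logits`, and the `[MASK]` placeholder are hypothetical, and `toy_logits` is a stand-in for a real model's next-token logits.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def masking_scores(tokens, next_token_logits, mask_token="[MASK]"):
    """Score each input token by how much masking it shifts the
    next-token distribution, measured with KL divergence.

    `next_token_logits` is an assumed callable: list of tokens -> logits.
    """
    base = softmax(next_token_logits(tokens))
    scores = []
    for i in range(len(tokens)):
        masked = tokens[:i] + [mask_token] + tokens[i + 1:]
        scores.append(kl_divergence(base, softmax(next_token_logits(masked))))
    return scores

def toy_logits(tokens):
    """Hypothetical stand-in for a model: the first logit depends
    only on whether the token "not" is present in the input."""
    return [2.0 if "not" in tokens else 0.0, 1.0, 0.5]

scores = masking_scores(["this", "is", "not", "good"], toy_logits)
```

In this toy setup only the token "not" changes the output distribution, so it is the only position that receives a nonzero score.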

Jailbreaking the Matrix: Nullspace Steering for Controlled Model Subversion

Apr 11, 2026

Vishal Pramanik, Maisha Maliha, Susmit Jha, Sumit Kumar Jha

Abstract: Large language models remain vulnerable to jailbreak attacks -- inputs designed to bypass safety mechanisms and elicit harmful responses -- despite advances in alignment and instruction tuning. We propose Head-Masked Nullspace Steering (HMNS), a circuit-level intervention that (i) identifies attention heads most causally responsible for a model's default behavior, (ii) suppresses their write paths via targeted column masking, and (iii) injects a perturbation constrained to the orthogonal complement of the muted subspace. HMNS operates in a closed-loop detection-intervention cycle, re-identifying causal heads and reapplying interventions across multiple decoding attempts. Across multiple jailbreak benchmarks, strong safety defenses, and widely used language models, HMNS attains state-of-the-art attack success rates with fewer queries than prior methods. Ablations confirm that nullspace-constrained injection, residual norm scaling, and iterative re-identification are key to its effectiveness. To our knowledge, this is the first jailbreak method to leverage geometry-aware, interpretability-informed interventions, highlighting a new paradigm for controlled model steering and adversarial safety circumvention.

Hey AI Can You Grade My Essay?: Automatic Essay Grading

Oct 12, 2024

Maisha Maliha, Vishal Pramanik

Abstract: Automatic essay grading (AEG) has attracted the attention of the NLP community because of its uses in several educational applications, such as scoring essays and short answers. AEG systems can save significant time and money when grading essays. In existing work, a single network is responsible for the whole grading process, which may be ineffective because a single network may not be able to learn all the features of a human-written essay. In this work, we introduce a new model that outperforms the state-of-the-art models in the field of AEG. We use collaborative and transfer learning: one network checks the grammatical and structural features of the sentences of an essay, while another network scores the overall idea present in the essay. What these networks learn is then transferred to a third network that scores the essay. We also compare the performance of the different models mentioned in our work, and our proposed model shows the highest accuracy, 85.50%.

* Accepted in ICAAAIML (4th International Conference on Advances and Applications of Artificial Intelligence and Machine Learning) 2023

A Survey on Congestion Control and Scheduling for Multipath TCP: Machine Learning vs Classical Approaches

Sep 17, 2023

Maisha Maliha, Golnaz Habibi, Mohammed Atiquzzaman

Abstract: Multipath TCP (MPTCP) has been widely used as an efficient way for communication in many applications. Data centers, smartphones, and network operators use MPTCP to balance the traffic in a network efficiently. MPTCP is an extension of TCP (Transmission Control Protocol) that provides multiple paths, leading to higher throughput and lower latency. Although MPTCP has shown better performance than TCP in many applications, it has its own challenges. The network can become congested due to heavy traffic in the multiple paths (subflows) if the subflow rates are not determined correctly. Moreover, communication latency can increase if the packets are not scheduled correctly between the subflows. This paper reviews techniques to solve these problems based on two main approaches: non-data-driven (classical) and data-driven (machine learning) approaches. This paper compares these two approaches and highlights their strengths and weaknesses with a view to motivating future researchers in this exciting area of machine learning for communications. This paper also provides details on the simulation of MPTCP and its implementations in real environments.

* 13 pages, 7 figures
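
To make the scheduling problem in the abstract above concrete, here is a minimal sketch of one classical policy: send each packet on the available subflow with the lowest smoothed round-trip time. The abstract does not name this policy; it is chosen here only as an illustration. The `Subflow` fields and the functions `pick_subflow` and `schedule` are hypothetical and greatly simplify a real MPTCP stack (no acknowledgements, retransmissions, or congestion-window updates).

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Subflow:
    name: str
    srtt_ms: float   # smoothed round-trip time estimate
    cwnd: int        # congestion window, in packets
    in_flight: int = 0

    def has_space(self) -> bool:
        """A subflow can accept a packet while its window is not full."""
        return self.in_flight < self.cwnd

def pick_subflow(subflows: List[Subflow]) -> Optional[Subflow]:
    """Return the lowest-RTT subflow that still has window space."""
    available = [s for s in subflows if s.has_space()]
    if not available:
        return None
    return min(available, key=lambda s: s.srtt_ms)

def schedule(num_packets: int, subflows: List[Subflow]) -> List[str]:
    """Assign packets one by one; stop early if every window is full."""
    assignment = []
    for _ in range(num_packets):
        chosen = pick_subflow(subflows)
        if chosen is None:
            break
        chosen.in_flight += 1
        assignment.append(chosen.name)
    return assignment

# Two illustrative subflows with made-up RTTs and window sizes.
paths = [Subflow("wifi", srtt_ms=20.0, cwnd=2), Subflow("lte", srtt_ms=50.0, cwnd=3)]
plan = schedule(4, paths)
```

With these made-up values the faster subflow is filled first and the remaining packets spill over to the slower one, which is the behavior a poorly chosen scheduler can get wrong when path delays differ widely.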
