Hammad Rizwan

Multi-Granular Node Pruning for Circuit Discovery

Dec 11, 2025

Muhammad Umair Haider, Hammad Rizwan, Hassan Sajjad, A. B. Siddique

Abstract: Circuit discovery aims to identify minimal subnetworks that are responsible for specific behaviors in large language models (LLMs). Existing approaches primarily rely on iterative edge pruning, which is computationally expensive and limited to coarse-grained units such as attention heads or MLP blocks, overlooking finer structures like individual neurons. We propose a node-level pruning framework for circuit discovery that addresses both scalability and granularity limitations. Our method introduces learnable masks across multiple levels of granularity, from entire blocks to individual neurons, within a unified optimization objective. Granularity-specific sparsity penalties guide the pruning process, allowing comprehensive compression in a single fine-tuning run. Empirically, our approach identifies circuits that contain fewer nodes than those discovered by prior methods; moreover, we demonstrate that many neurons deemed important by coarse methods are actually irrelevant and can be removed while still maintaining task performance. Furthermore, our method has a significantly lower memory footprint (5-10x), as it does not require keeping intermediate activations in memory.
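As a rough illustration of the idea of masks at several granularities with separate sparsity penalties, the sketch below gates each neuron by its own soft mask multiplied by the soft mask of its enclosing block. This is not the paper's implementation: the sigmoid parameterization, the function names, and the penalty weights `lam_block` and `lam_neuron` are assumptions chosen for illustration only.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def effective_neuron_masks(block_logit, neuron_logits):
    """Gate each neuron by its own soft mask times the soft mask of its block."""
    block_gate = sigmoid(block_logit)
    return [block_gate * sigmoid(z) for z in neuron_logits]

def sparsity_penalty(block_logit, neuron_logits, lam_block=1.0, lam_neuron=0.1):
    """Sum of granularity-specific penalties on the soft mask values.

    lam_block and lam_neuron are hypothetical weights, one per granularity.
    """
    block_term = lam_block * sigmoid(block_logit)
    neuron_term = lam_neuron * sum(sigmoid(z) for z in neuron_logits)
    return block_term + neuron_term

# Toy example: one block containing three neurons.
neuron_logits = [0.0, 10.0, -10.0]
masks = effective_neuron_masks(block_logit=0.0, neuron_logits=neuron_logits)
penalty = sparsity_penalty(block_logit=0.0, neuron_logits=neuron_logits)
```

In a real training loop the logits would be learnable parameters, and the penalty would be added to a task loss so that both granularities are pruned in the same run; a neuron with a strongly negative logit (the third one here) ends up with a mask near zero regardless of its block.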

Resolving Lexical Bias in Edit Scoping with Projector Editor Networks

Aug 19, 2024

Hammad Rizwan, Domenic Rosati, Ga Wu, Hassan Sajjad

Abstract: Weight-preserving model editing techniques rely heavily on the scoping mechanism that decides when to apply an edit to the base model. These scoping mechanisms use distance functions in the representation space to determine the scope of the edit. In this work, we show that distance-based scoping functions suffer from lexical biases, leading to issues such as misfires on irrelevant prompts that share similar lexical characteristics. To address this problem, we introduce Projector Editor Networks for Model Editing (PENME), a model editing approach that employs a compact adapter with a projection network trained via a contrastive learning objective. We demonstrate the efficacy of PENME in achieving superior results while being compute efficient and flexible enough to adapt across model architectures.
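The distance-based scoping described above can be sketched as a nearest-key lookup with a threshold: an edit fires only if the prompt's representation lies close enough to a stored edit key. This is a minimal sketch, not PENME itself; the two-dimensional vectors, the Euclidean metric, the function names, and the threshold value are all assumptions made for illustration.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def in_scope(query, edit_keys, threshold):
    """Return the index of the nearest stored edit key if it is within
    the threshold, otherwise None (the base model answers unchanged)."""
    best_idx, best_dist = None, float("inf")
    for i, key in enumerate(edit_keys):
        d = euclidean(query, key)
        if d < best_dist:
            best_idx, best_dist = i, d
    return best_idx if best_dist <= threshold else None

# Hypothetical representations: two stored edits and two incoming prompts.
edit_keys = [[1.0, 0.0], [0.0, 1.0]]
paraphrase = [0.9, 0.1]    # close to the first edit, so it should fire
unrelated = [-1.0, -1.0]   # far from both edits, so no edit applies

hit = in_scope(paraphrase, edit_keys, threshold=0.5)
miss = in_scope(unrelated, edit_keys, threshold=0.5)
```

A misfire in this picture is an irrelevant prompt whose representation happens to land inside the threshold of some key, for example because it shares many words with the edited prompt. A learned projection, as the abstract proposes, would be applied to `query` and the keys before the distance is computed.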
