Lauren Hannah

Duke University

MemoryLLM: Plug-n-Play Interpretable Feed-Forward Memory for Transformers

Jan 30, 2026

Ajay Jaiswal, Lauren Hannah, Han-Byul Kim, Duc Hoang, Arnav Kundu, Mehrdad Farajtabar, Minsik Cho

Abstract: Understanding how transformer components operate in LLMs is important, as it is at the core of recent technological advances in artificial intelligence. In this work, we revisit the challenges associated with interpretability of feed-forward modules (FFNs) and propose MemoryLLM, which aims to decouple FFNs from self-attention and enables us to study the decoupled FFNs as context-free token-wise neural retrieval memory. In detail, we investigate how input tokens access memory locations within FFN parameters and the importance of FFN memory across different downstream tasks. MemoryLLM achieves context-free FFNs by training them in isolation from self-attention directly using the token embeddings. This approach allows FFNs to be pre-computed as token-wise lookups (ToLs), enabling on-demand transfer between VRAM and storage, additionally enhancing inference efficiency. We also introduce Flex-MemoryLLM, positioning it between a conventional transformer design and MemoryLLM. This architecture bridges the performance gap caused by training FFNs with context-free token-wise embeddings.
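The token-wise lookup idea in this abstract can be illustrated with a minimal sketch. It assumes a toy two-layer ReLU FFN whose only input is the token embedding; the weights, sizes, and function names (`ffn`, `build_lookup`, `ffn_for_sequence`) are hypothetical and are not taken from the paper's implementation. Because the FFN output then depends only on the token id, it can be computed once per vocabulary entry and read back from a table at inference time.

```python
# Toy illustration only: weights, sizes, and names are hypothetical.

def ffn(x, w_in, w_out):
    """Two-layer feed-forward block with ReLU: w_out @ relu(w_in @ x)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w_in]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]

# A three-token vocabulary with 2-dimensional embeddings.
embeddings = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, -1.0]}
w_in = [[1.0, 2.0], [-1.0, 0.5], [0.5, 0.5]]  # hidden (3) x model (2)
w_out = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]   # model (2) x hidden (3)

def build_lookup(embeddings, w_in, w_out):
    """Pre-compute the FFN output once for every token in the vocabulary."""
    return {tok: ffn(emb, w_in, w_out) for tok, emb in embeddings.items()}

def ffn_for_sequence(token_ids, lookup):
    """At inference time the FFN reduces to a table read per token."""
    return [lookup[t] for t in token_ids]

lookup = build_lookup(embeddings, w_in, w_out)
outputs = ffn_for_sequence([1, 0, 1], lookup)
```

In this sketch the table could live on disk and be paged in per token, which is the kind of on-demand transfer between VRAM and storage the abstract mentions.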

Ensemble Methods for Convex Regression with Applications to Geometric Programming Based Circuit Design

Jun 18, 2012

Lauren Hannah, David Dunson

Abstract: Convex regression is a promising area for bridging statistical estimation and deterministic convex optimization. New piecewise linear convex regression methods are fast and scalable, but can have instability when used to approximate constraints or objective functions for optimization. Ensemble methods, like bagging, smearing and random partitioning, can alleviate this problem and maintain the theoretical properties of the underlying estimator. We empirically examine the performance of ensemble methods for prediction and optimization, and then apply them to device modeling and constraint approximation for geometric programming based circuit design.
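One reason averaging-style ensembles fit convex regression is that the mean of convex functions is itself convex. The sketch below is an illustration under stated assumptions, not the paper's estimator: each ensemble member is a max-affine (piecewise linear convex) function given as hand-picked `(slope, intercept)` pairs standing in for fits on different resamples, and all names are hypothetical.

```python
# Illustrative only: the hyperplanes below are made up, not fitted to data.

def max_affine(x, planes):
    """Piecewise linear convex function: max over (slope, intercept) pairs."""
    return max(a * x + b for a, b in planes)

def ensemble_predict(x, members):
    """Average the member predictions; a mean of convex functions is convex."""
    return sum(max_affine(x, planes) for planes in members) / len(members)

def is_midpoint_convex(f, xs, tol=1e-12):
    """Check f((u + v) / 2) <= (f(u) + f(v)) / 2 on every pair of grid points."""
    return all(f((u + v) / 2) <= (f(u) + f(v)) / 2 + tol for u in xs for v in xs)

# Three hypothetical members, e.g. one per bootstrap resample.
members = [
    [(-1.0, 0.0), (1.0, 0.0)],
    [(-2.0, -1.0), (0.0, 0.0), (2.0, -1.0)],
    [(-1.0, 0.5), (1.0, -0.5)],
]

grid = [i / 4 for i in range(-12, 13)]
convex_ok = is_midpoint_convex(lambda x: ensemble_predict(x, members), grid)
```

The averaged predictor has more linear pieces than any single member, so it changes more gradually between them while remaining usable as a convex objective or constraint.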

* ICML 2012

Beta-Negative Binomial Process and Poisson Factor Analysis

Feb 04, 2012

Mingyuan Zhou, Lauren Hannah, David Dunson, Lawrence Carin

Abstract: A beta-negative binomial (BNB) process is proposed, leading to a beta-gamma-Poisson process, which may be viewed as a "multi-scoop" generalization of the beta-Bernoulli process. The BNB process is augmented into a beta-gamma-gamma-Poisson hierarchical structure, and applied as a nonparametric Bayesian prior for an infinite Poisson factor analysis model. A finite approximation for the beta process Lévy random measure is constructed for convenient implementation. Efficient MCMC computations are performed with data augmentation and marginalization techniques. Encouraging results are shown on document count matrix factorization.
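The link between the negative binomial and the gamma-Poisson construction mentioned in this abstract rests on a standard identity, sketched here for a single count. The symbols $r$, $p$, $\lambda$, and $n$ and the shape-scale parameterization are assumptions chosen for illustration and may differ from the paper's notation.

```latex
% Assumed parameterization: Gamma(shape r, scale p/(1-p)); n is one count.
\begin{align*}
\lambda &\sim \mathrm{Gamma}\!\left(r,\ \tfrac{p}{1-p}\right), \qquad
n \mid \lambda \sim \mathrm{Poisson}(\lambda), \\
P(n) &= \int_0^\infty \frac{\lambda^{n} e^{-\lambda}}{n!}\,
        \frac{\lambda^{r-1} e^{-\lambda (1-p)/p}}{\Gamma(r)\,\bigl(\tfrac{p}{1-p}\bigr)^{r}}\, d\lambda \\
     &= \frac{\Gamma(n+r)}{n!\,\Gamma(r)}\,
        \Bigl(\tfrac{1-p}{p}\Bigr)^{r} p^{\,n+r}
      = \frac{\Gamma(n+r)}{n!\,\Gamma(r)}\, p^{n} (1-p)^{r}.
\end{align*}
```

Marginalizing the gamma-distributed rate therefore gives a negative binomial count with parameters $r$ and $p$, which is why a negative binomial layer can be rewritten as a gamma-Poisson hierarchy.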

* Appearing in AISTATS 2012 (submitted on Oct. 2011)
