Sparse Autoencoder


Sparse-Autoencoder-Guided Internal Representation Unlearning for Large Language Models

Sep 19, 2025
Via arXiv

The Anatomy of Alignment: Decomposing Preference Optimization by Steering Sparse Features

Sep 16, 2025
Via arXiv

Learning Mechanistic Subtypes of Neurodegeneration with a Physics-Informed Variational Autoencoder Mixture Model

Sep 18, 2025
Via arXiv

Random Matrix Theory-guided sparse PCA for single-cell RNA-seq data

Sep 18, 2025
Via arXiv

Learning Minimal Representations of Many-Body Physics from Snapshots of a Quantum Simulator

Sep 17, 2025
Via arXiv

A Graph Machine Learning Approach for Detecting Topological Patterns in Transactional Graphs

Sep 16, 2025
Via arXiv

Towards Interpretable Deep Neural Networks for Tabular Data

Sep 10, 2025
Via arXiv

From Noise to Narrative: Tracing the Origins of Hallucinations in Transformers

Sep 08, 2025
Via arXiv

Sparse Autoencoder Neural Operators: Model Recovery in Function Spaces

Sep 03, 2025
Via arXiv

AdaptiveK Sparse Autoencoders: Dynamic Sparsity Allocation for Interpretable LLM Representations

Aug 24, 2025
Via arXiv