
Mengnan Du

Improving LLM Reasoning through Interpretable Role-Playing Steering

Jun 09, 2025
Via arXiv

Fine-Grained Interpretation of Political Opinions in Large Language Models

Jun 05, 2025
Via arXiv

SAE-SSV: Supervised Steering in Sparse Representation Spaces for Reliable Control of Language Models

May 22, 2025
Via arXiv

Denoising Concept Vectors with Sparse Autoencoders for Improved Language Model Steering

May 21, 2025
Via arXiv

Feature Extraction and Steering for Enhanced Chain-of-Thought Reasoning in Language Models

May 21, 2025
Via arXiv

SAE-FiRE: Enhancing Earnings Surprise Predictions Through Sparse Autoencoder Feature Selection

May 20, 2025
Via arXiv

Beyond Input Activations: Identifying Influential Latents by Gradient Sparse Autoencoders

May 12, 2025
Via arXiv

A Comprehensive Survey of Synthetic Tabular Data Generation

Apr 23, 2025
Via arXiv

A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models

Mar 07, 2025
Via arXiv

DBR: Divergence-Based Regularization for Debiasing Natural Language Understanding Models

Feb 25, 2025
Via arXiv