Haiyan Zhao

Denoising Concept Vectors with Sparse Autoencoders for Improved Language Model Steering

May 21, 2025
Via arXiv

Beyond Input Activations: Identifying Influential Latents by Gradient Sparse Autoencoders

May 12, 2025
Via arXiv

PROPHET: An Inferable Future Forecasting Benchmark with Causal Intervened Likelihood Estimation

Apr 02, 2025
Via arXiv

Enhancing LLM Generation with Knowledge Hypergraph for Evidence-Based Medicine

Mar 18, 2025
Via arXiv

RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing

Mar 13, 2025
Via arXiv

A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models

Mar 07, 2025
Via arXiv

SAIF: A Sparse Autoencoder Framework for Interpreting and Steering Instruction Following of Language Models

Feb 17, 2025
Via arXiv

Large Vision-Language Model Alignment and Misalignment: A Survey Through the Lens of Explainability

Jan 02, 2025
Via arXiv

MALoRA: Mixture of Asymmetric Low-Rank Adaptation for Enhanced Multi-Task Learning

Oct 30, 2024
Via arXiv

Exploring LLM-based Data Annotation Strategies for Medical Dialogue Preference Alignment

Oct 05, 2024
Via arXiv