Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Xiangchen Song

Position: Mechanistic Interpretability Should Prioritize Feature Consistency in SAEs

May 26, 2025

Xiangchen Song, Aashiq Muhamed, Yujia Zheng, Lingjing Kong, Zeyu Tang, Mona T. Diab, Virginia Smith, Kun Zhang

Abstract:Sparse Autoencoders (SAEs) are a prominent tool in mechanistic interpretability (MI) for decomposing neural network activations into interpretable features. However, the aspiration to identify a canonical set of features is challenged by the observed inconsistency of learned SAE features across different training runs, undermining the reliability and efficiency of MI research. This position paper argues that mechanistic interpretability should prioritize feature consistency in SAEs -- the reliable convergence to equivalent feature sets across independent runs. We propose using the Pairwise Dictionary Mean Correlation Coefficient (PW-MCC) as a practical metric to operationalize consistency and demonstrate that high levels are achievable (0.80 for TopK SAEs on LLM activations) with appropriate architectural choices. Our contributions include detailing the benefits of prioritizing consistency; providing theoretical grounding and synthetic validation using a model organism, which verifies PW-MCC as a reliable proxy for ground-truth recovery; and extending these findings to real-world LLM data, where high feature consistency strongly correlates with the semantic similarity of learned feature explanations. We call for a community-wide shift towards systematically measuring feature consistency to foster robust cumulative progress in MI.

Via

Access Paper or Ask Questions

Reflection-Window Decoding: Text Generation with Selective Refinement

Feb 05, 2025

Zeyu Tang, Zhenhao Chen, Loka Li, Xiangchen Song, Yunlong Deng, Yifan Shen, Guangyi Chen, Peter Spirtes, Kun Zhang

Figure 1 for Reflection-Window Decoding: Text Generation with Selective Refinement

Figure 2 for Reflection-Window Decoding: Text Generation with Selective Refinement

Figure 3 for Reflection-Window Decoding: Text Generation with Selective Refinement

Figure 4 for Reflection-Window Decoding: Text Generation with Selective Refinement

Abstract:The autoregressive decoding for text generation in large language models (LLMs), while widely used, is inherently suboptimal due to the lack of a built-in mechanism to perform refinement and/or correction of the generated content. In this paper, we consider optimality in terms of the joint probability over the generated response, when jointly considering all tokens at the same time. We theoretically characterize the potential deviation of the autoregressively generated response from its globally optimal counterpart that is of the same length. Our analysis suggests that we need to be cautious when noticeable uncertainty arises during text generation, which may signal the sub-optimality of the generation history. To address the pitfall of autoregressive decoding for text generation, we propose an approach that incorporates a sliding reflection window and a pausing criterion, such that refinement and generation can be carried out interchangeably as the decoding proceeds. Our selective refinement framework strikes a balance between efficiency and optimality, and our extensive experimental results demonstrate the effectiveness of our approach.

Via

Access Paper or Ask Questions

Causal Temporal Representation Learning with Nonstationary Sparse Transition

Sep 05, 2024

Xiangchen Song, Zijian Li, Guangyi Chen, Yujia Zheng, Yewen Fan, Xinshuai Dong, Kun Zhang

Figure 1 for Causal Temporal Representation Learning with Nonstationary Sparse Transition

Figure 2 for Causal Temporal Representation Learning with Nonstationary Sparse Transition

Figure 3 for Causal Temporal Representation Learning with Nonstationary Sparse Transition

Figure 4 for Causal Temporal Representation Learning with Nonstationary Sparse Transition

Abstract:Causal Temporal Representation Learning (Ctrl) methods aim to identify the temporal causal dynamics of complex nonstationary temporal sequences. Despite the success of existing Ctrl methods, they require either directly observing the domain variables or assuming a Markov prior on them. Such requirements limit the application of these methods in real-world scenarios when we do not have such prior knowledge of the domain variables. To address this problem, this work adopts a sparse transition assumption, aligned with intuitive human understanding, and presents identifiability results from a theoretical perspective. In particular, we explore under what conditions on the significance of the variability of the transitions we can build a model to identify the distribution shifts. Based on the theoretical result, we introduce a novel framework, Causal Temporal Representation Learning with Nonstationary Sparse Transition (CtrlNS), designed to leverage the constraints on transition sparsity and conditional independence to reliably identify both distribution shifts and latent factors. Our experimental evaluations on synthetic and real-world datasets demonstrate significant improvements over existing baselines, highlighting the effectiveness of our approach.

Via

Access Paper or Ask Questions

On the Identification of Temporally Causal Representation with Instantaneous Dependence

May 24, 2024

Zijian Li, Yifan Shen, Kaitao Zheng, Ruichu Cai, Xiangchen Song, Mingming Gong, Zhifeng Hao, Zhengmao Zhu, Guangyi Chen, Kun Zhang

Figure 1 for On the Identification of Temporally Causal Representation with Instantaneous Dependence

Figure 2 for On the Identification of Temporally Causal Representation with Instantaneous Dependence

Figure 3 for On the Identification of Temporally Causal Representation with Instantaneous Dependence

Abstract:Temporally causal representation learning aims to identify the latent causal process from time series observations, but most methods require the assumption that the latent causal processes do not have instantaneous relations. Although some recent methods achieve identifiability in the instantaneous causality case, they require either interventions on the latent variables or grouping of the observations, which are in general difficult to obtain in real-world scenarios. To fill this gap, we propose an \textbf{ID}entification framework for instantane\textbf{O}us \textbf{L}atent dynamics (\textbf{IDOL}) by imposing a sparse influence constraint that the latent causal processes have sparse time-delayed and instantaneous relations. Specifically, we establish identifiability results of the latent causal process based on sufficient variability and the sparse influence constraint by employing contextual information of time series data. Based on these theories, we incorporate a temporally variational inference architecture to estimate the latent variables and a gradient-based sparsity regularization to identify the latent causal process. Experimental results on simulation datasets illustrate that our method can identify the latent causal process. Furthermore, evaluations on multiple human motion forecasting benchmarks with instantaneous dependencies indicate the effectiveness of our method in real-world settings.

Via

Access Paper or Ask Questions

When and How: Learning Identifiable Latent States for Nonstationary Time Series Forecasting

Feb 20, 2024

Zijian Li, Ruichu Cai, Zhenhui Yang, Haiqin Huang, Guangyi Chen, Yifan Shen, Zhengming Chen, Xiangchen Song, Zhifeng Hao, Kun Zhang

Figure 1 for When and How: Learning Identifiable Latent States for Nonstationary Time Series Forecasting

Figure 2 for When and How: Learning Identifiable Latent States for Nonstationary Time Series Forecasting

Figure 3 for When and How: Learning Identifiable Latent States for Nonstationary Time Series Forecasting

Figure 4 for When and How: Learning Identifiable Latent States for Nonstationary Time Series Forecasting

Abstract:Temporal distribution shifts are ubiquitous in time series data. One of the most popular methods assumes that the temporal distribution shift occurs uniformly to disentangle the stationary and nonstationary dependencies. But this assumption is difficult to meet, as we do not know when the distribution shifts occur. To solve this problem, we propose to learn IDentifiable latEnt stAtes (IDEA) to detect when the distribution shifts occur. Beyond that, we further disentangle the stationary and nonstationary latent states via sufficient observation assumption to learn how the latent states change. Specifically, we formalize the causal process with environment-irrelated stationary and environment-related nonstationary variables. Under mild conditions, we show that latent environments and stationary/nonstationary variables are identifiable. Based on these theories, we devise the IDEA model, which incorporates an autoregressive hidden Markov model to estimate latent environments and modular prior networks to identify latent states. The IDEA model outperforms several latest nonstationary forecasting methods on various benchmark datasets, highlighting its advantages in real-world scenarios.

Via

Access Paper or Ask Questions

Calibration-then-Calculation: A Variance Reduced Metric Framework in Deep Click-Through Rate Prediction Models

Jan 30, 2024

Yewen Fan, Nian Si, Xiangchen Song, Kun Zhang

Figure 1 for Calibration-then-Calculation: A Variance Reduced Metric Framework in Deep Click-Through Rate Prediction Models

Figure 2 for Calibration-then-Calculation: A Variance Reduced Metric Framework in Deep Click-Through Rate Prediction Models

Figure 3 for Calibration-then-Calculation: A Variance Reduced Metric Framework in Deep Click-Through Rate Prediction Models

Figure 4 for Calibration-then-Calculation: A Variance Reduced Metric Framework in Deep Click-Through Rate Prediction Models

Abstract:Deep learning has been widely adopted across various fields, but there has been little focus on evaluating the performance of deep learning pipelines. With the increased use of large datasets and complex models, it has become common to run the training process only once and compare the result to previous benchmarks. However, this procedure can lead to imprecise comparisons due to the variance in neural network evaluation metrics. The metric variance comes from the randomness inherent in the training process of deep learning pipelines. Traditional solutions such as running the training process multiple times are usually not feasible in deep learning due to computational limitations. In this paper, we propose a new metric framework, Calibrated Loss Metric, that addresses this issue by reducing the variance in its vanilla counterpart. As a result, the new metric has a higher accuracy to detect effective modeling improvement. Our approach is supported by theoretical justifications and extensive experimental validations in the context of Deep Click-Through Rate Prediction Models.

Via

Access Paper or Ask Questions

CaRiNG: Learning Temporal Causal Representation under Non-Invertible Generation Process

Jan 25, 2024

Guangyi Chen, Yifan Shen, Zhenhao Chen, Xiangchen Song, Yuewen Sun, Weiran Yao, Xiao Liu, Kun Zhang

Figure 1 for CaRiNG: Learning Temporal Causal Representation under Non-Invertible Generation Process

Figure 2 for CaRiNG: Learning Temporal Causal Representation under Non-Invertible Generation Process

Figure 3 for CaRiNG: Learning Temporal Causal Representation under Non-Invertible Generation Process

Figure 4 for CaRiNG: Learning Temporal Causal Representation under Non-Invertible Generation Process

Abstract:Identifying the underlying time-delayed latent causal processes in sequential data is vital for grasping temporal dynamics and making downstream reasoning. While some recent methods can robustly identify these latent causal variables, they rely on strict assumptions about the invertible generation process from latent variables to observed data. However, these assumptions are often hard to satisfy in real-world applications containing information loss. For instance, the visual perception process translates a 3D space into 2D images, or the phenomenon of persistence of vision incorporates historical data into current perceptions. To address this challenge, we establish an identifiability theory that allows for the recovery of independent latent components even when they come from a nonlinear and non-invertible mix. Using this theory as a foundation, we propose a principled approach, CaRiNG, to learn the CAusal RepresentatIon of Non-invertible Generative temporal data with identifiability guarantees. Specifically, we utilize temporal context to recover lost latent information and apply the conditions in our theory to guide the training process. Through experiments conducted on synthetic datasets, we validate that our CaRiNG method reliably identifies the causal process, even when the generation process is non-invertible. Moreover, we demonstrate that our approach considerably improves temporal understanding and reasoning in practical applications.

* 22 pages, preprint

Via

Access Paper or Ask Questions

A Versatile Causal Discovery Framework to Allow Causally-Related Hidden Variables

Dec 18, 2023

Xinshuai Dong, Biwei Huang, Ignavier Ng, Xiangchen Song, Yujia Zheng, Songyao Jin, Roberto Legaspi, Peter Spirtes, Kun Zhang

Figure 1 for A Versatile Causal Discovery Framework to Allow Causally-Related Hidden Variables

Figure 2 for A Versatile Causal Discovery Framework to Allow Causally-Related Hidden Variables

Figure 3 for A Versatile Causal Discovery Framework to Allow Causally-Related Hidden Variables

Figure 4 for A Versatile Causal Discovery Framework to Allow Causally-Related Hidden Variables

Abstract:Most existing causal discovery methods rely on the assumption of no latent confounders, limiting their applicability in solving real-life problems. In this paper, we introduce a novel, versatile framework for causal discovery that accommodates the presence of causally-related hidden variables almost everywhere in the causal network (for instance, they can be effects of observed variables), based on rank information of covariance matrix over observed variables. We start by investigating the efficacy of rank in comparison to conditional independence and, theoretically, establish necessary and sufficient conditions for the identifiability of certain latent structural patterns. Furthermore, we develop a Rank-based Latent Causal Discovery algorithm, RLCD, that can efficiently locate hidden variables, determine their cardinalities, and discover the entire causal structure over both measured and hidden ones. We also show that, under certain graphical conditions, RLCD correctly identifies the Markov Equivalence Class of the whole latent causal graph asymptotically. Experimental results on both synthetic and real-world personality data sets demonstrate the efficacy of the proposed approach in finite-sample cases.

Via

Access Paper or Ask Questions

Temporally Disentangled Representation Learning under Unknown Nonstationarity

Oct 28, 2023

Xiangchen Song, Weiran Yao, Yewen Fan, Xinshuai Dong, Guangyi Chen, Juan Carlos Niebles, Eric Xing, Kun Zhang

Figure 1 for Temporally Disentangled Representation Learning under Unknown Nonstationarity

Figure 2 for Temporally Disentangled Representation Learning under Unknown Nonstationarity

Figure 3 for Temporally Disentangled Representation Learning under Unknown Nonstationarity

Figure 4 for Temporally Disentangled Representation Learning under Unknown Nonstationarity

Abstract:In unsupervised causal representation learning for sequential data with time-delayed latent causal influences, strong identifiability results for the disentanglement of causally-related latent variables have been established in stationary settings by leveraging temporal structure. However, in nonstationary setting, existing work only partially addressed the problem by either utilizing observed auxiliary variables (e.g., class labels and/or domain indexes) as side information or assuming simplified latent causal dynamics. Both constrain the method to a limited range of scenarios. In this study, we further explored the Markov Assumption under time-delayed causally related process in nonstationary setting and showed that under mild conditions, the independent latent components can be recovered from their nonlinear mixture up to a permutation and a component-wise transformation, without the observation of auxiliary variables. We then introduce NCTRL, a principled estimation framework, to reconstruct time-delayed latent causal variables and identify their relations from measured sequential data only. Empirical evaluations demonstrated the reliable identification of time-delayed latent causal influences, with our methodology substantially outperforming existing baselines that fail to exploit the nonstationarity adequately and then, consequently, cannot distinguish distribution shifts.

* NeurIPS 2023

Via

Access Paper or Ask Questions

Estimating Treatment Effects from Irregular Time Series Observations with Hidden Confounders

Mar 04, 2023

Defu Cao, James Enouen, Yujing Wang, Xiangchen Song, Chuizheng Meng, Hao Niu, Yan Liu

Abstract:Causal analysis for time series data, in particular estimating individualized treatment effect (ITE), is a key task in many real-world applications, such as finance, retail, healthcare, etc. Real-world time series can include large-scale, irregular, and intermittent time series observations, raising significant challenges to existing work attempting to estimate treatment effects. Specifically, the existence of hidden confounders can lead to biased treatment estimates and complicate the causal inference process. In particular, anomaly hidden confounders which exceed the typical range can lead to high variance estimates. Moreover, in continuous time settings with irregular samples, it is challenging to directly handle the dynamics of causality. In this paper, we leverage recent advances in Lipschitz regularization and neural controlled differential equations (CDE) to develop an effective and scalable solution, namely LipCDE, to address the above challenges. LipCDE can directly model the dynamic causal relationships between historical data and outcomes with irregular samples by considering the boundary of hidden confounders given by Lipschitz-constrained neural networks. Furthermore, we conduct extensive experiments on both synthetic and real-world datasets to demonstrate the effectiveness and scalability of LipCDE.

* Accepted by AAAI 2023

Via

Access Paper or Ask Questions