Zhiying Deng

Are We Evaluating the Edit Locality of LLM Model Editing Properly?

Jan 24, 2026
Via arXiv

Is Model Editing Built on Sand? Revealing Its Illusory Success and Fragile Foundation

Oct 01, 2025
Via arXiv

Adversarial Cooperative Rationalization: The Risk of Spurious Correlations in Even Clean Datasets

May 04, 2025
Via arXiv

Breaking Free from MMI: A New Frontier in Rationalization by Probing Input Utilization

Mar 08, 2025
Via arXiv

Is the MMI Criterion Necessary for Interpretability? Degenerating Non-causal Features to Plain Noise for Self-Rationalization

Oct 08, 2024
Via arXiv

Enhancing the Rationale-Input Alignment for Self-explaining Rationalization

Dec 15, 2023
Via arXiv

D-Separation for Causal Self-Explanation

Sep 23, 2023
Via arXiv