Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Samira Hajizadeh

Retroactive Chain-of-Thought (RetroCoT): Forensic Reconstruction Prompts as a Safety Diagnostic Across Model Generations

Jul 06, 2026

Samira Hajizadeh

Abstract:Safety alignment in large language models is typically evaluated against direct, imperative harmful requests. We show that this alignment is highly conditioned on pragmatic register: models that refuse a direct request frequently comply when the same underlying objective is expressed through a different communicative stance. This suggests that current alignment policies are not invariant to semantic equivalence, but remain sensitive to how a request is pragmatically framed. We introduce Retroactive Chain-of-Thought (RetroCoT), a single-turn attack that reframes harmful requests as forensic reconstruction tasks. Rather than requesting harmful instructions directly, RetroCoT presupposes that the harmful outcome has already occurred and asks the model, acting as a forensic analyst, to reconstruct in reverse the causal chain that produced it. On AdvBench (n=50), RetroCoT achieves attach success rate of 58% on gpt-4o and 52% on gpt-4o-mini, compared with direct-request baselines of 0% and 4%, respectively. We further identify a pronounced generation gap: GPT-5-family models refuse RetroCoT entirely, explicitly identifying the reconstruction premise in their refusal rationales, consistent with explicit coverage of this reconstruction register. However, this robustness does not generalize across pragmatic forms. A single adversarial feedback turn presenting an existing forensic reconstruction response alongside evaluator critique raises ASR from 0% to 48% on GPT-5.4-mini and from 58% to 94% on GPT-4o; a control condition omitting the fabricated low score achieves 85% on GPT-5.4-mini, indicating that the operative element is pragmatic continuation within the established forensic frame rather than score manipulation. These results suggest that frontier-model alignment remains conditioned on pragmatic framing rather than semantic intent, and that new pragmatic registers can continue to expose a...

Via

Access Paper or Ask Questions

EffiPair: Improving the Efficiency of LLM-generated Code with Relative Contrastive Feedback

Apr 06, 2026

Samira Hajizadeh, Suman Jana

Abstract:Large language models (LLMs) often generate code that is functionally correct but inefficient in runtime and memory. Prior approaches to improving code efficiency typically rely on absolute execution feedback, such as profiling a single program's runtime or memory usage, which is costly and provides weak guidance for refinement. We propose Relative Contrastive Feedback (RCF), an inference-time feedback mechanism that requires no model fine-tuning or parameter updates. RCF compares two structurally similar programs for the same task and highlights the differences associated with better efficiency. Building on this idea, we introduce EffiPair, an inference-time iterative refinement framework that operates entirely at test time by generating multiple candidate solutions, identifying informative program pairs with large efficiency gaps, summarizing their execution differences into lightweight feedback, and using this signal to produce more efficient solutions. By replacing isolated scalar feedback with pairwise contrastive comparisons, EffiPair provides more direct guidance while reducing profiling and prompting overhead. Experiments on code-efficiency benchmarks show that EffiPair consistently improves efficiency while preserving correctness. For instance, with DeepSeek-Chat V3.2, EffiPair achieves up to 1.5x speedup over generation without performance feedback, while reducing token usage by more than 90% compared to prior work.

Via

Access Paper or Ask Questions

Sense and Sensitivity: Examining the Influence of Semantic Recall on Long Context Code Reasoning

May 19, 2025

Adam Štorek, Mukur Gupta, Samira Hajizadeh, Prashast Srivastava, Suman Jana

Figure 1 for Sense and Sensitivity: Examining the Influence of Semantic Recall on Long Context Code Reasoning

Figure 2 for Sense and Sensitivity: Examining the Influence of Semantic Recall on Long Context Code Reasoning

Figure 3 for Sense and Sensitivity: Examining the Influence of Semantic Recall on Long Context Code Reasoning

Figure 4 for Sense and Sensitivity: Examining the Influence of Semantic Recall on Long Context Code Reasoning

Abstract:Although modern Large Language Models (LLMs) support extremely large contexts, their effectiveness in utilizing long context for code reasoning remains unclear. This paper investigates LLM reasoning ability over code snippets within large repositories and how it relates to their recall ability. Specifically, we differentiate between lexical code recall (verbatim retrieval) and semantic code recall (remembering what the code does). To measure semantic recall, we propose SemTrace, a code reasoning technique where the impact of specific statements on output is attributable and unpredictable. We also present a method to quantify semantic recall sensitivity in existing benchmarks. Our evaluation of state-of-the-art LLMs reveals a significant drop in code reasoning accuracy as a code snippet approaches the middle of the input context, particularly with techniques requiring high semantic recall like SemTrace. Moreover, we find that lexical recall varies by granularity, with models excelling at function retrieval but struggling with line-by-line recall. Notably, a disconnect exists between lexical and semantic recall, suggesting different underlying mechanisms. Finally, our findings indicate that current code reasoning benchmarks may exhibit low semantic recall sensitivity, potentially underestimating LLM challenges in leveraging in-context information.

Via

Access Paper or Ask Questions