Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Rohan Pandey

Failure Modes of Deep Multi-Agent RL in Asynchronous Pricing: Reproducible Triggers, Trace Diagnostics, and a Partial Fix

Jun 03, 2026

Shree Murthy, Rohan Pandey

Abstract:We study two reproducible failure modes of deep multi-agent reinforcement learning in continuous-time pricing markets: (i) tacit cartel formation between competing DDPG agents, and (ii) actor--critic instability at high event rates. We instantiate both inside a single CT-MARL benchmark (Poisson-clocked price updates, observation latency $δ$, interior-optimum logit demand), show that synchronous DDPG agents reliably trigger Failure Mode 1 with collusion index $Δ= 0.69 \pm 0.11$, and quantify a partial microstructure fix: asynchrony alone cuts collusion by 48\% and adding latency drives it to a minimum of $Δ= 0.28$. The fix has clearly documented costs: it is partial ($Δ$ remains supra-Bertrand), it is non-monotone in $δ$, and it does not survive Failure Mode 2, which emerges as DDPG critic divergence at $λ= 5$ and corrupts the phase-diagram cell at $(λ{=}5, δ{=}1)$. We accompany the scalar collusion index with trajectory-level trace diagnostics that expose the within-episode signalling collapse and the post-shock non-recovery.

Via

Access Paper or Ask Questions

ThinkSwitch: Context Distillation with LoRA and Weight Interpolation for Specific-Purpose Reasoning Tasks

May 31, 2026

Dhruv Saini, Rohan Pandey

Abstract:Large language models often improve on difficult tasks by spending inference-time compute on a reasoning trace before producing the final answer. That extra computation can be useful, but it also raises latency, token cost, and deployment complexity. We introduce \textbf{ThinkSwitch}, a low-compute procedure for co-training paired instruct and thinking checkpoints. Starting from compatible Qwen3-4B instruct and thinking models, each iteration asks the thinking checkpoint to generate answers, removes the reasoning trace, distills the answer-only pairs into the instruct checkpoint with QLoRA, and reconstructs a thinking checkpoint with spherical weight interpolation. The only human-supplied inputs are task prompts; the labels are generated by the model itself. On a 30-question AIME 2026 evaluation, ThinkSwitch improves the instruct checkpoint from 10/30 to 20/30 and the thinking checkpoint from 14/30 to 22/30. On a 30-question PubMedQA subset, it improves the instruct checkpoint from 13/30 to 18/30 and the thinking checkpoint from 18/30 to 25/30. The complete experiment uses 15 training prompts per domain and costs \$2.86 on a single cloud RTX 3070. The results are small-scale, but they indicate that targeted distillation loops can move part of the benefit of explicit reasoning into weights while preserving a separate thinking mode.

Via

Access Paper or Ask Questions

Poisoning the Watchtower: Prompt Injection Attacks Against LLM-Augmented Security Operations Through Adversarial Log Content

May 23, 2026

Rohan Pandey, Archit Bhujang

Abstract:Large language models (LLMs) are increasingly used as analyst assistants in security operations centers (SOCs), where they ingest log and alert data to produce triage labels, incident summaries, or remediation advice. We study a structural failure mode of this design: many log fields are attacker controlled. User agents, URLs, payloads, DNS queries, and attempted usernames can therefore carry instructions to the model alongside evidence of the intrusion. We call this setting \emph{log-substrate prompt injection}. We introduce a four-class taxonomy of log-substrate attacks: direct override (S1), persona hijack (S2), context manipulation (S3), and obfuscated payloads (S4). We evaluate 48 strategy-defense-task combinations using \texttt{gpt-4o-mini} as the analyst. Three findings stand out. First, direct overrides are ineffective in our setting: all S1 classification attacks achieve 0\% suppression. In contrast, persona hijacks suppress 68\% of malicious logs under a naive classifier and remain effective under stronger defenses. Second, summarization is the highest-risk task: context manipulation reaches 96\% injection success without defenses and 38\% even with constrained output. Third, defenses reduce but do not eliminate the attack surface: average injection success falls from 26.6\% under naive prompting to 11.8\% under our strongest defense. We also compare empirical results to a deterministic mock analyst and find that simulation substantially mispredicts current model behavior, especially for direct overrides. These results suggest that SOC copilots should treat raw log content as adversarial input rather than ordinary analyst context.

* 10 pages

Via

Access Paper or Ask Questions

Beyond the Answer: Decoding the Behavior of LLMs as Scientific Reasoners

Mar 30, 2026

Rohan Pandey, Eric Ye, Michael Li

Abstract:As Large Language Models (LLMs) achieve increasingly sophisticated performance on complex reasoning tasks, current architectures serve as critical proxies for the internal heuristics of frontier models. Characterizing emergent reasoning is vital for long-term interpretability and safety. Furthermore, understanding how prompting modulates these processes is essential, as natural language will likely be the primary interface for interacting with AGI systems. In this work, we use a custom variant of Genetic Pareto (GEPA) to systematically optimize prompts for scientific reasoning tasks, and analyze how prompting can affect reasoning behavior. We investigate the structural patterns and logical heuristics inherent in GEPA-optimized prompts, and evaluate their transferability and brittleness. Our findings reveal that gains in scientific reasoning often correspond to model-specific heuristics that fail to generalize across systems, which we call "local" logic. By framing prompt optimization as a tool for model interpretability, we argue that mapping these preferred reasoning structures for LLMs is an important prerequisite for effectively collaborating with superhuman intelligence.

* Accepted at the Post-AGI Science and Society Workshop at ICLR 2026

Via

Access Paper or Ask Questions

CircuitBuilder: From Polynomials to Circuits via Reinforcement Learning

Mar 17, 2026

Weikun K. Zhang, Rohan Pandey, Bhaumik Mehta, Kaijie Jin, Naomi Morato, Archit Ganapule, Michael Ruofan Zeng, Jarod Alper

Abstract:Motivated by auto-proof generation and Valiant's VP vs. VNP conjecture, we study the problem of discovering efficient arithmetic circuits to compute polynomials, using addition and multiplication gates. We formulate this problem as a single-player game, where an RL agent attempts to build the circuit within a fixed number of operations. We implement an AlphaZero-style training loop and compare two approaches: Proximal Policy Optimization with Monte Carlo Tree Search (PPO+MCTS) and Soft Actor-Critic (SAC). SAC achieves the highest success rates on two-variable targets, while PPO+MCTS scales to three variables and demonstrates steady improvement on harder instances. These results suggest that polynomial circuit synthesis is a compact, verifiable setting for studying self-improving search policies.

* ICLR 2026 Workshop on AI with Recursive Self-Improvement

Via

Access Paper or Ask Questions

Predicting first-episode homelessness among US Veterans using longitudinal EHR data: time-varying models and social risk factors

Feb 02, 2026

Rohan Pandey, Haijuan Yan, Hong Yu, Jack Tsai

Abstract:Homelessness among US veterans remains a critical public health challenge, yet risk prediction offers a pathway for proactive intervention. In this retrospective prognostic study, we analyzed electronic health record (EHR) data from 4,276,403 Veterans Affairs patients during a 2016 observation period to predict first-episode homelessness occurring 3-12 months later in 2017 (prevalence: 0.32-1.19%). We constructed static and time-varying EHR representations, utilizing clinician-informed logic to model the persistence of clinical conditions and social risks over time. We then compared the performance of classical machine learning, transformer-based masked language models, and fine-tuned large language models (LLMs). We demonstrate that incorporating social and behavioral factors into longitudinal models improved precision-recall area under the curve (PR-AUC) by 15-30%. In the top 1% risk tier, models yielded positive predictive values ranging from 3.93-4.72% at 3 months, 7.39-8.30% at 6 months, 9.84-11.41% at 9 months, and 11.65-13.80% at 12 months across model architectures. Large language models underperformed encoder-based models on discrimination but showed smaller performance disparities across racial groups. These results demonstrate that longitudinal, socially informed EHR modeling concentrates homelessness risk into actionable strata, enabling targeted and data-informed prevention strategies for at-risk veterans.

Via

Access Paper or Ask Questions

Amortized Inference for Model Rocket Aerodynamics: Learning to Estimate Physical Parameters from Simulation

Dec 24, 2025

Rohit Pandey, Rohan Pandey

Abstract:Accurate prediction of model rocket flight performance requires estimating aerodynamic parameters that are difficult to measure directly. Traditional approaches rely on computational fluid dynamics or empirical correlations, while data-driven methods require extensive real flight data that is expensive and time-consuming to collect. We present a simulation-based amortized inference approach that trains a neural network on synthetic flight data generated from a physics simulator, then applies the learned model to real flights without any fine-tuning. Our method learns to invert the forward physics model, directly predicting drag coefficient and thrust correction factor from a single apogee measurement combined with motor and configuration features. In this proof-of-concept study, we train on 10,000 synthetic flights and evaluate on 8 real flights, achieving a mean absolute error of 12.3 m in apogee prediction - demonstrating promising sim-to-real transfer with zero real training examples. Analysis reveals a systematic positive bias in predictions, providing quantitative insight into the gap between idealized physics and real-world flight conditions. We additionally compare against OpenRocket baseline predictions, showing that our learned approach reduces apogee prediction error. Our implementation is publicly available to support reproducibility and adoption in the amateur rocketry community.

* 5 pages, 2 figures, 2 tables

Via

Access Paper or Ask Questions

gzip Predicts Data-dependent Scaling Laws

May 26, 2024

Rohan Pandey

Abstract:Past work has established scaling laws that predict the performance of a neural language model (LM) as a function of its parameter count and the number of tokens it's trained on, enabling optimal allocation of a fixed compute budget. Are these scaling laws agnostic to training data as some prior work suggests? We generate training datasets of varying complexities by modulating the syntactic properties of a PCFG, finding that 1) scaling laws are sensitive to differences in data complexity and that 2) gzip, a compression algorithm, is an effective predictor of how data complexity impacts scaling properties. We propose a new data-dependent scaling law for LM's that accounts for the training data's gzip-compressibility; its compute-optimal frontier increases in dataset size preference (over parameter count preference) as training data becomes harder to compress.

* 9 pages, 9 figures

Via

Access Paper or Ask Questions

Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP

Aug 27, 2023

Vedant Palit, Rohan Pandey, Aryaman Arora, Paul Pu Liang

Figure 1 for Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP

Figure 2 for Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP

Figure 3 for Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP

Figure 4 for Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP

Abstract:Mechanistic interpretability seeks to understand the neural mechanisms that enable specific behaviors in Large Language Models (LLMs) by leveraging causality-based methods. While these approaches have identified neural circuits that copy spans of text, capture factual knowledge, and more, they remain unusable for multimodal models since adapting these tools to the vision-language domain requires considerable architectural changes. In this work, we adapt a unimodal causal tracing tool to BLIP to enable the study of the neural mechanisms underlying image-conditioned text generation. We demonstrate our approach on a visual question answering dataset, highlighting the causal relevance of later layer representations for all tokens. Furthermore, we release our BLIP causal tracing tool as open source to enable further experimentation in vision-language mechanistic interpretability by the community. Our code is available at https://github.com/vedantpalit/Towards-Vision-Language-Mechanistic-Interpretability.

* Final version for 5th Workshop on Closing the Loop Between Vision and Language (CLVL) @ ICCV 2023. 4 pages, 5 figures

Via

Access Paper or Ask Questions

Athena 2.0: Discourse and User Modeling in Open Domain Dialogue

Aug 03, 2023

Omkar Patil, Lena Reed, Kevin K. Bowden, Juraj Juraska, Wen Cui, Vrindavan Harrison, Rishi Rajasekaran, Angela Ramirez, Cecilia Li, Eduardo Zamora(+5 more)

Figure 1 for Athena 2.0: Discourse and User Modeling in Open Domain Dialogue

Figure 2 for Athena 2.0: Discourse and User Modeling in Open Domain Dialogue

Figure 3 for Athena 2.0: Discourse and User Modeling in Open Domain Dialogue

Figure 4 for Athena 2.0: Discourse and User Modeling in Open Domain Dialogue

Abstract:Conversational agents are consistently growing in popularity and many people interact with them every day. While many conversational agents act as personal assistants, they can have many different goals. Some are task-oriented, such as providing customer support for a bank or making a reservation. Others are designed to be empathetic and to form emotional connections with the user. The Alexa Prize Challenge aims to create a socialbot, which allows the user to engage in coherent conversations, on a range of popular topics that will interest the user. Here we describe Athena 2.0, UCSC's conversational agent for Amazon's Socialbot Grand Challenge 4. Athena 2.0 utilizes a novel knowledge-grounded discourse model that tracks the entity links that Athena introduces into the dialogue, and uses them to constrain named-entity recognition and linking, and coreference resolution. Athena 2.0 also relies on a user model to personalize topic selection and other aspects of the conversation to individual users.

* Alexa Prize Proceedings, 2021. Socialbot Grand Challenge 4

Via

Access Paper or Ask Questions