Abstract: Adaptive prompting mechanisms have been proposed to enhance vision-language models by dynamically tailoring prompts to individual inputs. However, in frozen few-shot prompt learning with CLIP-style backbones, we systematically observe that adaptive gates and prompt-selection modules often collapse: they produce nearly constant outputs, contribute negligible gradient signal, and frequently fail to outperform fixed prompts. To uncover the underlying causes and conditions of this failure, we present a diagnostic study with controlled experiments across datasets and multiple prompt-learning architectures, identifying two recurring failure modes: gradient magnitude imbalance and gate degradation. Our findings invite a re-examination of indiscriminately adding architectural complexity in parameter-efficient learning, and clarify when prompt-level adaptive gating is, and is not, effective in this regime.
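The gate-collapse symptoms named above (nearly constant outputs, negligible gradient signal) can be illustrated with a minimal sketch. This is an assumption-laden toy, not the paper's setup: it models a prompt gate as a single sigmoid over pooled features whose bias has saturated, and checks the two diagnostics numerically.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen image features (batch of 64, dim 32) standing in
# for a CLIP-style backbone's pooled outputs.
feats = rng.normal(size=(64, 32))

def gate(x, w, b):
    """A scalar sigmoid gate over pooled features."""
    z = x @ w + b
    return 1.0 / (1.0 + np.exp(-z))

# A "collapsed" gate: a large bias dominates near-zero weights,
# so the input barely influences the gate value.
w_collapsed = rng.normal(scale=1e-4, size=32)
g = gate(feats, w_collapsed, b=4.0)

# Diagnostic 1: nearly constant outputs (std ~ 0, mean saturated).
print(f"gate mean={g.mean():.3f} std={g.std():.5f}")

# Diagnostic 2: vanishing gradient through the saturated sigmoid,
# since d(sigmoid)/dz = g * (1 - g) -> 0 as g -> 1.
grad_scale = (g * (1.0 - g)).mean()
print(f"mean d(gate)/dz = {grad_scale:.4f}")
```

Once the gate saturates, both the gate's variance across inputs and its gradient scale shrink together, which is why such a module can silently reduce to a fixed prompt.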
Abstract: Hardware-aware training (HAT) is widely used to improve the robustness of neural networks on non-ideal AI accelerators, such as analog in-memory computing (IMC) systems. However, not all hardware-induced distortions are equally compensable by training. This paper presents a diagnostic framework that models hardware non-idealities as structured perturbations of the forward operator and evaluates their compatibility with gradient-based optimization. We analyze six representative perturbation classes (read noise, variability, drift, stuck-at faults, IR-drop, and ADC discretization) and identify three key diagnostics: gradient expectation consistency, bounded gradient variance, and non-degenerate sensitivity. Our results show a clear separation between perturbations that HAT can compensate and those that consistently break optimization. This provides practical guidance for hardware-software co-design, clarifying which non-idealities can be addressed at the training level and which require circuit-, architecture-, or calibration-level mitigation. This study should be read as a controlled empirical analysis under vanilla forward-perturbation HAT, rather than as a universal theory of hardware-aware training.
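The separation claimed above can be sketched on a toy scalar model. This is an illustrative assumption, not the paper's framework: for additive zero-mean read noise, the expected perturbed gradient matches the clean gradient (gradient expectation consistency), whereas a stuck-at fault pins the weight and yields zero sensitivity, which no amount of training can recover.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy scalar model y = w * x with squared loss L = (y - t)^2.
# Clean gradient: dL/dw = 2 * (w * x - t) * x.
w, x, t = 0.5, 1.0, 1.0
clean_grad = 2 * (w * x - t) * x

# Read noise: zero-mean additive perturbation of the effective weight.
# Averaged over noise draws, the gradient stays consistent with clean.
noise = rng.normal(scale=0.05, size=100_000)
noisy_grads = 2 * ((w + noise) * x - t) * x
gap = abs(noisy_grads.mean() - clean_grad)
print(f"mean perturbed grad vs clean: gap = {gap:.4f}")

# Stuck-at fault: the stored weight is pinned to a constant, so the
# output no longer depends on w at all -- degenerate sensitivity.
def forward_stuck(w, x, w_stuck=0.0):
    return w_stuck * x  # ignores w entirely

eps = 1e-3
sensitivity = (forward_stuck(w + eps, x) - forward_stuck(w, x)) / eps
print(f"stuck-at sensitivity dy/dw = {sensitivity}")
```

Under this toy view, read noise satisfies the gradient-expectation diagnostic while a stuck-at fault fails the non-degenerate-sensitivity diagnostic, mirroring the compensable/non-compensable split the abstract describes.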