This note introduces Projected Microbatch Accumulation (PROMA), a proximal policy method that modifies gradient accumulation across microbatches rather than relying on likelihood ratios relative to a reference policy. During accumulation, PROMA projects the partially accumulated gradient to be orthogonal to the sequence-wise gradients of the current microbatch. This projection is applied layer-wise during the backward pass, enabling efficient implementation. Empirically, PROMA achieves proximal updates without entropy collapse while providing tighter local KL control than GRPO.