Optimizing discrete phase shifts in large-scale reconfigurable intelligent surfaces (RISs) is challenging due to their non-convex and non-linear nature. In this letter, we propose a heuristic-integrated deep reinforcement learning (DRL) framework that (1) leverages accumulated actions over multiple steps in the double deep Q-network (DDQN) for RIS column-wise control and (2) integrates a greedy algorithm (GA) into each DRL step to refine the state via fine-grained, element-wise optimization of RIS configurations. By learning from GA-included states, the proposed approach effectively addresses RIS optimization within a small DRL action space, demonstrating its capability to optimize phase-shift configurations of large-scale RISs.