This study explores a next-generation multiple access (NGMA) framework for cell-free massive MIMO (CF-mMIMO) systems enhanced by stacked intelligent metasurfaces (SIMs), aiming to improve simultaneous wireless information and power transfer (SWIPT) performance. A fundamental challenge lies in optimally selecting the operating modes of access points (APs) to jointly maximize the received energy and satisfy spectral efficiency (SE) quality-of-service constraints. Practical system impairments, including a non-linear harvested energy model, pilot contamination (PC), channel estimation errors, and reliance on long-term statistical channel state information (CSI), are considered. We derive closed-form expressions for both the achievable SE and the average sum harvested energy (sum-HE). A mixed-integer non-convex optimization problem is formulated to jointly optimize the SIM phase shifts, APs mode selection, and power allocation to maximize average sum-HE under SE and average harvested energy constraints. To solve this problem, we propose a centralized training, decentralized execution (CTDE) framework based on deep reinforcement learning (DRL), which efficiently handles high-dimensional decision spaces. A Markovian environment and a normalized joint reward function are introduced to enhance the training stability across on-policy and off-policy DRL algorithms. Additionally, we provide a two-phase convex-based solution as a theoretical robust performance. Numerical results demonstrate that the proposed DRL-based CTDE framework achieves SWIPT performance comparable to convexification-based solution, while significantly outperforming baselines.